Model Interpretability through the Lens of Computational Complexity
Document type:
Conference paper with proceedings
Title:
Model Interpretability through the Lens of Computational Complexity
Author(s):
Barceló, Pablo [Author]
Millennium Institute for Foundational Research on Data [IMFD]
Monet, Mikaël [Author]
Linking Dynamic Data [LINKS]
Perez, Jorge [Author]
Millennium Institute for Foundational Research on Data [IMFD]
Department of Computer Science [Universidad de Chile]
Subercaseaux, Bernardo [Author]
Millennium Institute for Foundational Research on Data [IMFD]
Department of Computer Science [Universidad de Chile]
Conference title:
NeurIPS 2020
City:
Held online
Country:
United States of America
Conference start date:
2020-12-07
HAL discipline(s):
Computer Science [cs]
Computer Science [cs]/Artificial Intelligence [cs.AI]
Computer Science [cs]/Computational Complexity [cs.CC]
English abstract: [en]
In spite of several claims stating that some models are more interpretable than others -- e.g., "linear models are more interpretable than deep neural networks" -- we still lack a principled notion of interpretability to formally compare among different classes of models. We make a step towards such a notion by studying whether folklore interpretability claims have a correlate in terms of computational complexity theory. We focus on local post-hoc explainability queries that, intuitively, attempt to answer why individual inputs are classified in a certain way by a given model. In a nutshell, we say that a class $\mathcal{C}_1$ of models is more interpretable than another class $\mathcal{C}_2$, if the computational complexity of answering post-hoc queries for models in $\mathcal{C}_2$ is higher than for those in $\mathcal{C}_1$. We prove that this notion provides a good theoretical counterpart to current beliefs on the interpretability of models; in particular, we show that under our definition and assuming standard complexity-theoretical assumptions (such as P$\neq$NP), both linear and tree-based models are strictly more interpretable than neural networks. Our complexity analysis, however, does not provide a clear-cut difference between linear and tree-based models, as we obtain different results depending on the particular post-hoc explanations considered. Finally, by applying a finer complexity analysis based on parameterized complexity, we are able to prove a theoretical result suggesting that shallow neural networks are more interpretable than deeper ones.
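As a reading aid, the ordering described in the abstract can be written schematically as follows; this is an illustrative sketch, not notation taken from the paper, and the query set $\mathcal{Q}$ and the map $\mathrm{Compl}$ are assumed names:
\[
\mathcal{C}_1 \preceq \mathcal{C}_2
\quad\Longleftrightarrow\quad
\text{for every post-hoc query } Q \in \mathcal{Q},\;
\mathrm{Compl}(Q, \mathcal{C}_1) \le \mathrm{Compl}(Q, \mathcal{C}_2),
\]
where $\mathrm{Compl}(Q, \mathcal{C})$ denotes the computational complexity of answering $Q$ for models in class $\mathcal{C}$, and $\le$ is read as "no higher complexity". On this reading, $\mathcal{C}_1$ is strictly more interpretable than $\mathcal{C}_2$ when some query is provably harder over $\mathcal{C}_2$ (e.g., NP-hard) than over $\mathcal{C}_1$ (e.g., solvable in polynomial time), under standard assumptions such as P$\neq$NP.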
Language:
English
Peer reviewed:
Yes
Audience:
International
Popular science:
No
Comment:
36 pages, including 9 pages of main text. This is the author version of the NeurIPS 2020 paper. Apart from minor differences that may be introduced by the publisher, the only difference should be the addition of the appendix, which contains all the proofs that do not appear in the main text.
Files
- https://hal.inria.fr/hal-03052508/document
- Open access
- Access the document
- http://arxiv.org/pdf/2010.12265
- Open access
- Access the document