Explaining Neural Networks: Post-hoc and Natural Language Explanations
- 👤 Speaker: Oana Camburu, University of Oxford
- 📅 Date & Time: Friday 25 October 2019, 11:00 - 12:00
- 📍 Venue: Engineering Department, CBL Room BE-438.
Abstract
In this talk, we discuss two paradigms of explainability for neural networks: post-hoc explanations and neural networks that generate natural language explanations for their own decisions.

For the first paradigm, we present two issues with existing post-hoc explanatory methods. The first issue is that two prevalent perspectives on explanations, feature-additivity and feature-selection, lead to fundamentally different instance-wise explanations; yet, in the literature, explainers from the two perspectives are compared directly, despite their distinct explanation goals. The second issue is that current post-hoc explainers have only been thoroughly validated on simple models, such as linear regression, and, when applied to real-world neural networks, they are commonly evaluated under the assumption that the learned models behave reasonably. However, neural networks often rely on unreasonable correlations, even when producing correct decisions. We introduce a verification framework for explanatory methods under the feature-selection perspective. Our framework is, to our knowledge, the first evaluation test based on a non-trivial real-world neural network for which we can provide guarantees on its inner workings. We show several failure modes of current explainers, such as LIME, SHAP, and L2X. (based on https://arxiv.org/abs/1910.02065)

For the second paradigm, neural networks that explain their own decisions in natural language, we introduce a large dataset of human-annotated explanations for the ground-truth relations of SNLI, which we call e-SNLI. The corpus contains 570K instances and is, to our knowledge, the largest dataset of free-form natural language explanations. We present a series of models trained on e-SNLI. (based on https://papers.nips.cc/paper/8163-e-snli-natural-language-inference-with-natural-language-explanations.pdf)

Finally, we show that this class of models is prone to outputting inconsistent explanations, such as “A dog is an animal” and “A dog is not an animal”, which are likely to decrease users’ trust in these systems. To detect such inconsistencies, we introduce a simple but effective adversarial framework for generating a complete target sequence, a scenario that has not been addressed so far. Applying our framework to the best model trained on e-SNLI, we show that this model generates a significant number of inconsistencies. (based on https://arxiv.org/abs/1910.03065)
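For readers unfamiliar with the post-hoc explainers named in the abstract, below is a minimal, illustrative sketch of how a perturbation-based explainer such as LIME is typically queried for per-feature importance scores. The toy classifier, example sentence, and class names are placeholders and are not taken from the talk or the associated papers.

```python
# Minimal sketch: post-hoc feature-importance explanation with LIME
# (feature-additivity perspective). The classifier below is a toy
# stand-in for a real neural network.
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy sentiment classifier trained on placeholder data.
texts = ["a great film", "wonderful acting", "a terrible film", "boring plot"]
labels = [1, 1, 0, 0]
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# LIME perturbs the input, queries the model on the perturbations, and fits
# a local surrogate whose coefficients serve as per-word importance weights.
explainer = LimeTextExplainer(class_names=["negative", "positive"])
explanation = explainer.explain_instance(
    "a great but boring film", model.predict_proba, num_features=4)
print(explanation.as_list())  # [(word, weight), ...]
```

SHAP and L2X produce analogous per-feature outputs; the verification framework in the first paper above evaluates exactly this kind of output against a model whose relevant features are known by construction.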
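The e-SNLI corpus mentioned above is publicly released; one hosted copy is available through the Hugging Face `datasets` library under the id "esnli". The dataset id and field names in this sketch are assumptions about that hosted version, not something stated in the talk.

```python
# Sketch of loading e-SNLI via Hugging Face `datasets`; the dataset id and
# field names ("premise", "hypothesis", "label", "explanation_1") are assumed.
from datasets import load_dataset

esnli = load_dataset("esnli")   # splits: train / validation / test
ex = esnli["train"][0]
print(ex["premise"], "|", ex["hypothesis"])
print(ex["label"])              # 0 = entailment, 1 = neutral, 2 = contradiction
print(ex["explanation_1"])      # free-form, human-written explanation
```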
Series
This talk is part of the Machine Learning @ CUED series.
Included in Lists
- All Talks (aka the CURE list)
- Biology
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge Neuroscience Seminars
- Cambridge talks
- CBL important
- Chris Davis' list
- Creating transparent intact animal organs for high-resolution 3D deep-tissue imaging
- dh539
- Engineering Department, CBL Room BE-438.
- Featured lists
- Guy Emerson's list
- Hanchen DaDaDash
- Inference Group Summary
- Information Engineering Division seminar list
- Interested Talks
- Joint Machine Learning Seminars
- Life Science
- Life Sciences
- Machine Learning @ CUED
- Machine Learning Summary
- ML
- ndk22's list
- Neuroscience
- Neuroscience Seminars
- ob366-ai4er
- Required lists for MLG
- rp587
- Seminar
- Simon Baker's List
- Stem Cells & Regenerative Medicine
- Trust & Technology Initiative - interesting events
- yk373's list
- yk449
Note: Ex-directory lists are not shown.