2020 TheExplanationGameTowardsPredic

(Treviso & Martins, 2020) ⇒ Marcos V Treviso, and André F.T. Martins. (2020). “The Explanation Game: Towards Prediction Explainability through Sparse Communication.” In: arXiv preprint arXiv:2004.13876. doi:10.48550/arXiv.2004.13876

Subject Headings: Prediction Explainability.

Notes

Cited By

http://scholar.google.com/scholar?q=%222020%22+The+Explanation+Game%3A+Towards+Prediction+Explainability+through+Sparse+Communication

Quotes

Abstract

Explainability is a topic of growing importance in NLP. In this work, we provide a unified perspective of explainability as a communication problem between an explainer and a layperson about a classifier's decision. We use this framework to compare several prior approaches for extracting explanations, including gradient methods, representation erasure, and attention mechanisms, in terms of their communication success. In addition, we reinterpret these methods at the light of classical feature selection, and we use this as inspiration to propose new embedded methods for explainability, through the use of selective, sparse attention. Experiments in text classification, natural language entailment, and machine translation, using different configurations of explainers and laypeople (including both machines and humans), reveal an advantage of attention-based explainers over gradient and erasure methods. Furthermore, human evaluation experiments show promising results with post-hoc explainers trained to optimize communication success and faithfulness.

References

	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2020 TheExplanationGameTowardsPredic	Marcos V Treviso, André F.T. Martins			The Explanation Game: Towards Prediction Explainability through Sparse Communication				10.48550/arXiv.2004.13876		2020