Explaining black-box neural ranking models using sentence impact analysis

L. Holdijk, M. Verbrugge, E. Gerritse, A. de Vries

Abstract

Deep neural models are often criticized for operating as black boxes. This lack of interpretability makes a model's decision-making process hard to follow and obscures potential flaws. In this work we propose a simple yet effective method to partially alleviate this issue for neural ranking models. By analyzing which sentences have the highest impact on a retrieved document's final ranking, we obtain an explanation of the model's decision-making process that is easy to interpret. We argue that this explanation can be a useful guide for the end users of an information retrieval system. By applying the proposed sentence impact analysis to BM25 and DRMM, we show that the obtained explanation can be used to uncover issues with the ranking models. In particular, we find that the removal of a single sentence can result in a completely different ranking for a document, for DRMM as well as for BM25.
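The abstract describes a leave-one-out analysis: remove one sentence at a time from a document, re-score the collection, and record how far the document's rank shifts. The Python sketch below illustrates one plausible way to implement this; the names `score_fn`, `ranked_docs`, and `target` are hypothetical, and `score_fn` is a stand-in for any retrieval model's query-document scoring function (e.g., BM25 or DRMM), not the authors' exact procedure.

```python
from typing import Callable, List

def sentence_impact(
    query: str,
    ranked_docs: List[List[str]],           # each document as a list of sentences
    score_fn: Callable[[str, str], float],  # hypothetical: scores a (query, doc_text) pair
    target: int,                            # index of the document to explain
) -> List[float]:
    """For each sentence in the target document, measure how far the
    document's rank shifts when that sentence is removed (leave-one-out)."""

    def rank_of_target(doc_texts: List[str]) -> int:
        scores = [score_fn(query, text) for text in doc_texts]
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return order.index(target)

    full_texts = [" ".join(sents) for sents in ranked_docs]
    base_rank = rank_of_target(full_texts)

    impacts = []
    for i in range(len(ranked_docs[target])):
        # Rebuild the target document with sentence i removed.
        ablated = " ".join(s for j, s in enumerate(ranked_docs[target]) if j != i)
        texts = full_texts[:target] + [ablated] + full_texts[target + 1:]
        # Positive impact: removing the sentence pushes the document down the ranking.
        impacts.append(rank_of_target(texts) - base_rank)
    return impacts
```

Under this reading, the sentences with the largest absolute impact would constitute the explanation presented to the end user, and a single sentence whose removal causes a large rank shift is exactly the kind of model behaviour the abstract reports for both DRMM and BM25.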