Assessing the Local Interpretability of Machine Learning Models

Dylan Slack, Sorelle A. Friedler, Carlos Scheidegger, Chitradeep Dutta Roy

Research output: Contribution to journal › Article › peer-review


The increasing adoption of machine learning tools has led to calls for accountability via model interpretability. But what does it mean for a machine learning model to be interpretable by humans, and how can this be assessed? We focus on two definitions of interpretability that have been introduced in the machine learning literature: simulatability (a user's ability to run a model on a given input) and "what if" local explainability (a user's ability to correctly determine a model's prediction under local changes to the input, given knowledge of the model's original prediction). Through a user study with 1000 participants, we test whether humans perform well on tasks that mimic the definitions of simulatability and "what if" local explainability on models that are typically considered locally interpretable. To track the relative interpretability of models, we employ a simple metric, the runtime operation count on the simulatability task. We find evidence that as the number of operations increases, participant accuracy on the local interpretability tasks decreases. In addition, this evidence is consistent with the common intuition that decision trees and logistic regression models are interpretable and are more interpretable than neural networks.
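As an illustration of the runtime operation count mentioned above, the following is a minimal sketch in Python. The abstract does not say how operations are tallied, so the counting rule here (one count per multiplication, addition, and comparison) is an assumption, and the function names and the dictionary-based tree layout are hypothetical, not taken from the article.

```python
def logistic_regression_ops(weights, bias, x):
    """Score a logistic regression by hand; return (prediction, op_count).

    Assumed counting rule: each multiplication, addition, and comparison
    costs one operation.
    """
    ops = 0
    total = bias
    for w, xi in zip(weights, x):
        total += w * xi
        ops += 2  # one multiplication, one addition
    prediction = 1 if total > 0 else 0
    ops += 1  # one comparison against the decision threshold
    return prediction, ops


def decision_tree_ops(node, x):
    """Walk a decision tree by hand; return (prediction, op_count).

    Hypothetical layout: internal nodes are dicts with "feature",
    "threshold", "left", "right"; leaves are dicts with "label".
    Assumed counting rule: one comparison per internal node visited.
    """
    ops = 0
    while "label" not in node:
        ops += 1  # one threshold comparison at this node
        if x[node["feature"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["label"], ops


if __name__ == "__main__":
    x = [1.0, 2.0, 0.5]
    tree = {
        "feature": 0, "threshold": 0.5,
        "left": {"label": 0},
        "right": {
            "feature": 2, "threshold": 1.0,
            "left": {"label": 1},
            "right": {"label": 0},
        },
    }
    print(logistic_regression_ops([0.5, -1.0, 2.0], 0.1, x))
    print(decision_tree_ops(tree, x))
```

Under this assumed rule, a logistic regression over n features always costs 2n + 1 operations, while a decision tree costs one operation per level traversed, so the count depends on the input as well as on the model.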

Original language: English (US)
Journal: Unknown Journal
State: Published - Feb 9 2019

ASJC Scopus subject areas

  • General
