PROV provenance standard

Moreau, L. & Missier, P. (2013). PROV-DM: The PROV Data Model. W3C Recommendation.
URL: https://www.w3.org/TR/2013/REC-prov-dm-20130430/

Huynh, T. D., Stalla-Bourdillon, S. & Moreau, L. (2019). Provenance-based Explanations for Automated Decisions: Final IAA Project Report. URL: https://kclpure.kcl.ac.uk/portal/en/publications/provenancebased-explanations-forautomated-decisions(5b1426ce-d253-49fa-8390-4bb3abe65f54).html

Resources for exploring algorithm types

General

Hastie, T., Tibshirani, R., Friedman, J., & Franklin, J. (2005). The elements of statistical learning: data mining, inference and prediction. The Mathematical Intelligencer, 27(2), 83-85. http://thuvien.thanglong.edu.vn:8081/dspace/bitstream/DHTL_123456789/4053/1/%5BSpringer%20Series%20in%20Statistics-1.pdf

Molnar, C. (2019). Interpretable machine learning: A guide for making black box models explainable. https://christophm.github.io/interpretable-ml-book/

Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5), 206. https://www.nature.com/articles/s42256-019-0048-x

Regularised regression (LASSO and Ridge)

Gaines, B. R., & Zhou, H. (2016). Algorithms for fitting the constrained lasso. Journal of Computational and Graphical Statistics, 27(4), 861-871. https://arxiv.org/pdf/1611.01511.pdf

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288. http://beehive.cs.princeton.edu/course/read/tibshirani-jrssb-1996.pdf

Generalised linear model (GLM)

https://CRAN.R-project.org/package=glmnet

Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1-22. http://www.jstatsoft.org/v33/i01/

Simon, N., Friedman, J., Hastie, T., & Tibshirani, R. (2011). Regularization paths for Cox's proportional hazards model via coordinate descent. Journal of Statistical Software, 39(5), 1-13. http://www.jstatsoft.org/v39/i05/

Generalised additive model (GAM)

https://CRAN.R-project.org/package=gam

Lou, Y., Caruana, R., & Gehrke, J. (2012). Intelligible models for classification and regression. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 150-158). ACM. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.433.8241&rep=rep1&type=pdf

Wood, S. N. (2006). Generalized additive models: An introduction with R. CRC Press.

Decision tree (DT)

Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Classification and Regression Trees. CRC Press.

Rule/decision lists and sets

Angelino, E., Larus-Stone, N., Alabi, D., Seltzer, M., & Rudin, C. (2017). Learning certifiably optimal rule lists for categorical data. The Journal of Machine Learning Research, 18(1), 8753-8830. http://www.jmlr.org/papers/volume18/17-716/17-716.pdf

Lakkaraju, H., Bach, S. H., & Leskovec, J. (2016, August). Interpretable decision sets: A joint framework for description and prediction. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1675-1684). ACM. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5108651/

Letham, B., Rudin, C., McCormick, T. H., & Madigan, D. (2015). Interpretable classifiers using rules and bayesian analysis: Building a better stroke prediction model. The Annals of Applied Statistics, 9(3), 1350-1371. https://projecteuclid.org/download/pdfview_1/euclid.aoas/1446488742

Wang, F., & Rudin, C. (2015). Falling rule lists. In Artificial Intelligence and Statistics (pp. 1013-1022). http://proceedings.mlr.press/v38/wang15a.pdf

Case-based reasoning (CBR) / Prototype and criticism

Aamodt, A. (1991). A knowledge-intensive, integrated approach to problem solving and sustained learning. Knowledge Engineering and Image Processing Group. University of Trondheim, 27-85. http://www.dphu.org/uploads/attachements/books/books_4200_0.pdf

Aamodt, A., & Plaza, E. (1994). Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI communications, 7(1), 39-59. https://www.idi.ntnu.no/emner/tdt4171/papers/AamodtPlaza94.pdf

Bichindaritz, I., & Marling, C. (2006). Case-based reasoning in the health sciences: What's next? Artificial intelligence in medicine, 36(2), 127-135. http://cs.oswego.edu/~bichinda/isc471-hci571/AIM2006.pdf

Bien, J., & Tibshirani, R. (2011). Prototype selection for interpretable classification. The Annals of Applied Statistics, 5(4), 2403-2424. https://projecteuclid.org/download/pdfview_1/euclid.aoas/1324399600

Kim, B., Khanna, R., & Koyejo, O. O. (2016). Examples are not enough, learn to criticize! criticism for interpretability. In Advances in Neural Information Processing Systems (pp. 2280-2288). http://papers.nips.cc/paper/6300-examples-are-not-enough-learn-to-criticize-criticism-for-interpretability.pdf

MMD-critic in python: https://github.com/BeenKim/MMD-critic

Kim, B., Rudin, C., & Shah, J. A. (2014). The bayesian case model: A generative approach for case-based reasoning and prototype classification. In Advances in Neural Information Processing Systems (pp. 1952-1960). http://papers.nips.cc/paper/5313-the-bayesian-case-model-a-generative-approach-for-case-based-reasoning-and-prototype-classification.pdf

Supersparse linear integer model (SLIM)

Jung, J., Concannon, C., Shroff, R., Goel, S., & Goldstein, D. G. (2017). Simple rules for complex decisions. Available at SSRN 2919024. https://arxiv.org/pdf/1702.04690.pdf

Rudin, C., & Ustun, B. (2018). Optimized scoring systems: toward trust in machine learning for healthcare and criminal justice. Interfaces, 48(5), 449-466. https://pdfs.semanticscholar.org/b3d8/8871ae5432c84b76bf53f7316cf5f95a3938.pdf

Ustun, B., & Rudin, C. (2016). Supersparse linear integer models for optimized medical scoring systems. Machine Learning, 102(3), 349-391. https://link.springer.com/article/10.1007/s10994-015-5528-6

Optimized scoring systems for classification problems in python: https://github.com/ustunb/slim-python

Simple customizable risk scores in python: https://github.com/ustunb/risk-slim

Resources for exploring supplementary explanation strategies

Surrogate models (SM)

Bastani, O., Kim, C., & Bastani, H. (2017). Interpretability via model extraction. arXiv preprint arXiv:1706.09773. https://obastani.github.io/docs/fatml17.pdf

Craven, M., & Shavlik, J. W. (1996). Extracting tree-structured representations of trained networks. In Advances in neural information processing systems (pp. 24-30). http://papers.nips.cc/paper/1152-extracting-tree-structured-representations-of-trained-networks.pdf

Van Assche, A., & Blockeel, H. (2007). Seeing the forest through the trees: Learning a comprehensible model from an ensemble. In European Conference on Machine Learning (pp. 418-429). Springer, Berlin, Heidelberg. https://link.springer.com/content/pdf/10.1007/978-3-540-74958-5_39.pdf

Valdes, G., Luna, J. M., Eaton, E., Simone II, C. B., Ungar, L. H., & Solberg, T. D. (2016). MediBoost: a patient stratification tool for interpretable decision making in the era of precision medicine. Scientific reports, 6, 37854. https://www.nature.com/articles/srep37854

Partial Dependence Plot (PDP)

Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Annals of statistics, 1189-1232. https://projecteuclid.org/download/pdf_1/euclid.aos/1013203451

Greenwell, B. M. (2017). pdp: an R Package for constructing partial dependence plots. The R Journal, 9(1), 421-436. https://pdfs.semanticscholar.org/cdfb/164f55e74d7b116ac63fc6c1c9e9cfd01cd8.pdf

For the software in R: https://cran.r-project.org/web/packages/pdp/index.html

Individual Conditional Expectations Plot (ICE)

Goldstein, A., Kapelner, A., Bleich, J., & Pitkin, E. (2015). Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation. Journal of Computational and Graphical Statistics, 24(1), 44-65. https://arxiv.org/pdf/1309.6392.pdf

For the software in R see:
https://cran.r-project.org/web/packages/ICEbox/index.html
https://cran.r-project.org/web/packages/ICEbox/ICEbox.pdf

Accumulated Local Effects Plots (ALE)

Apley, D. W., & Zhu, J. (2019). Visualizing the effects of predictor variables in black box supervised learning models. arXiv preprint arXiv:1612.08468. https://arxiv.org/pdf/1612.08468

https://cran.r-project.org/web/packages/ALEPlot/index.html

Global variable importance

Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32. https://link.springer.com/content/pdf/10.1023/A:1010933404324.pdf

Casalicchio, G., Molnar, C., & Bischl, B. (2018, September). Visualizing the feature importance for black box models. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (pp. 655-670). Springer, Cham. https://arxiv.org/pdf/1804.06620.pdf

Fisher, A., Rudin, C., & Dominici, F. (2018). All models are wrong, but many are useful: Learning a variable’s importance by studying an entire class of prediction models simultaneously. arXiv:1801.01489

Fisher, A., Rudin, C., & Dominici, F. (2018). Model class reliance: Variable importance measures for any machine learning model class, from the “Rashomon” perspective. arXiv preprint arXiv:1801.01489. https://arxiv.org/abs/1801.01489v2

Hooker, G., & Mentch, L. (2019). Please Stop Permuting Features: An Explanation and Alternatives. arXiv preprint arXiv:1905.03151. https://arxiv.org/pdf/1905.03151.pdf

Zhou, Z., & Hooker, G. (2019). Unbiased Measurement of Feature Importance in Tree-Based Methods. arXiv preprint arXiv:1903.05179. https://arxiv.org/pdf/1903.05179.pdf

Global variable interaction

Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3), 916-954. https://projecteuclid.org/download/pdfview_1/euclid.aoas/1223908046

Greenwell, B. M., Boehmke, B. C., & McCarthy, A. J. (2018). A simple and effective model-based variable importance measure. arXiv preprint arXiv:1805.04755. https://arxiv.org/pdf/1805.04755.pdf

Hooker, G. (2004, August). Discovering additive structure in black box functions. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 575-580). ACM. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.7500&rep=rep1&type=pdf

Local Interpretable Model-Agnostic Explanation (LIME)

Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). Why should I trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1135-1144). ACM. https://arxiv.org/pdf/1602.04938.pdf

LIME in python: https://github.com/marcotcr/lime
LIME experiments in python: https://github.com/marcotcr/lime-experiments

Ribeiro, M. T., Singh, S., & Guestrin, C. (2018). Anchors: High-precision model-agnostic explanations. In Thirty-Second AAAI Conference on Artificial Intelligence.

Anchors in python: https://github.com/marcotcr/anchor
Anchors experiments in python: https://github.com/marcotcr/anchor-experiments

Shapley Additive ExPlanations (SHAP)

Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems (pp. 4765-4774). http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf

Software for SHAP and its extensions in python: https://github.com/slundberg/shap
R wrapper for SHAP: https://modeloriented.github.io/shapper/

Shapley, L. S. (1953). A value for n-person games. Contributions to the Theory of Games, 2(28), 307-317. http://www.library.fa.ru/files/Roth2.pdf#page=39

Counterfactual explanation

Wachter, S., Mittelstadt, B., & Russell, C. (2017). Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. JL & Tech., 31, 841. https://jolt.law.harvard.edu/assets/articlePDFs/v31/Counterfactual-Explanations-without-Opening-the-Black-Box-Sandra-Wachter-et-al.pdf

Ustun, B., Spangher, A., & Liu, Y. (2019). Actionable recourse in linear classification. In Proceedings of the Conference on Fairness, Accountability, and Transparency (pp. 10-19). ACM. https://arxiv.org/pdf/1809.06514.pdf

Evaluate recourse in linear classification models in python: https://github.com/ustunb/actionable-recourse

Secondary explainers and attention-based systems

Li, O., Liu, H., Chen, C., & Rudin, C. (2018). Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions. In Thirty-Second AAAI Conference on Artificial Intelligence. https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/17082/16552

Park, D. H., Hendricks, L. A., Akata, Z., Schiele, B., Darrell, T., & Rohrbach, M. (2016). Attentive explanations: Justifying decisions and pointing to the evidence. arXiv preprint arXiv:1612.04757. https://arxiv.org/pdf/1612.04757

Other resources for supplementary explanation

IBM’s Explainability 360: http://aix360.mybluemix.net

Biecek, P., & Burzykowski, T. (2019). Predictive Models: Explore, Explain, and Debug. Human-Centered Interpretable Machine Learning. Retrieved from https://pbiecek.github.io/PM_VEE/

Accompanying software, DALEX (Descriptive mAchine Learning EXplanations): https://github.com/ModelOriented/DALEX

Przemysław Biecek, Interesting resources related to XAI: https://github.com/pbiecek/xai_resources

Christoph Molnar, iml: Interpretable machine learning: https://cran.r-project.org/web/packages/iml/index.html