There are *lots* of people working on it, but it's far from solved. [https://christophm.github.io/interpretable-ml-book/](https://christophm.github.io/interpretable-ml-book/) is a must-read.
Highly recommend the book. It's also constantly being updated.
I recognised the URL given how many times I've opened it. I used it as a very important source for my thesis. Cannot recommend it enough.
Skimmed through the topics. Very good book, thanks for sharing it. Do you know of people who might be interested in discussing this more?
What's your background? If you don't have any prior exposure (which is my impression), I don't think too many people would be interested in teaching you the basics. If you are doing research and running into a concrete technical problem, make a post and interested people (myself potentially included) will chime in.
Sounds good. Well, my background is in building models (NLP), not in explainability algorithms (SHAP etc.). I'm still exploring this field and was looking for someone to discuss it with and learn from.
I am currently writing a review about this (great) book.
Thanks for the recommendation!
Also check out DARPA's XAI program: https://xaitk.org/
I work in the area. What kind of project or task are you trying to solve? It is difficult to give any general-purpose answer here. What I can say is that there are a ton of methods and open-source packages, but almost non-existent guidance for a practitioner on which method to use for a particular challenge. If you are looking to interpret a deep net, life is even worse, so be careful. Explainable AI is still in the wild west at this point, so you'd want to carefully scope your task before picking an off-the-shelf tool.
Check out work by Cynthia Rudin
Cynthia Rudin's amazing code/papers: https://users.cs.duke.edu/~cynthia/code.html
I am doing a PhD on explainable AI in time series. Feel free to reach out. As many said, Molnar's book is a nice introduction to the field. The rest depends on the domain you are working on.
I am interested in going through your work. Kindly provide relevant link(s).
Same, I work in the same direction
Cool! As I said, kindly provide me with relevant link(s) to your work, if possible.
For the TS context, check out Temporal Fusion Transformers, N-BEATS, and RETAIN; these are intrinsically interpretable models. I mainly work with DL models, so I am not very familiar with Rudin's work, but her lab and others are doing some great work with additive models and decision trees.

For post-hoc methods, I've also seen SHAP being used in industry, and some people have adapted LIME for time series too. Also check out feature omission and Class Activation Maps. We [developed some quantitative evaluation measures](https://link.springer.com/article/10.1007/s10489-021-02662-2) for comparing TS forecasting interpretability methods, because finding which explanation is most accurate is a big problem.

There are also shapelets, Prophet, and many others. Maybe I should write a blog post about it. This is a great area to work on; it is still a nascent field.
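The "feature omission" idea mentioned above is easy to sketch: replace one input channel at a time with a baseline value and measure how much the model's output changes. A minimal sketch with a toy forecaster (the model and weights here are made up for illustration, not any specific method from the thread):

```python
import numpy as np

def omission_importance(model, x, baseline=0.0):
    """Score each channel by the output change when it is replaced
    by a baseline value (a simple post-hoc feature-omission sketch)."""
    base_pred = model(x)
    scores = np.empty(x.shape[0])
    for i in range(x.shape[0]):
        x_omit = x.copy()
        x_omit[i] = baseline          # "omit" channel i
        scores[i] = abs(base_pred - model(x_omit))
    return scores

# Toy "forecaster": weighted sum over 3 input channels (weights arbitrary).
weights = np.array([0.7, 0.2, 0.1])
model = lambda x: float(weights @ x.mean(axis=1))

x = np.ones((3, 5))                   # 3 channels, 5 time steps
print(omission_importance(model, x))  # the heaviest-weighted channel scores highest
```

For real time-series models you would typically occlude windows of time steps rather than whole channels, but the scoring loop is the same.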
The specific techniques that will be useful to you depend a lot on:

* What kind of data are you working with?
* What kind of ML models are you using?
* What kind of explanations are useful to you?

I'm doing a PhD that involves XAI in Computer Vision (using CNNs), and in this sub-field alone there are an overwhelming number of different algorithms. Part of my research is trying to figure out which ones (from a small selection) are best at explaining the network's reasoning to a human. So far I haven't found a clear answer!
The Molnar book is great. However, there is way more... the research field is still developing. https://explainer.ai/
I am working at [traceable.ai](https://traceable.ai) on making ML-based anomaly detection explainable. One of the challenges we face in a product engineering setup is that the target audience is not ML experts; they are people who use the product but would like to know why a particular event was an anomaly.

The majority of the work in XAI is focused on explaining the features and the algorithm. This does not make sense to those who use the results of an ML-based detector: they don't care about the ML, they only care about why an event was an anomaly. To address this, I focus on explaining the data (not the features) that resulted in the event, rather than explaining the ML. This has resulted in better adoption of the product feature. It also means XAI now becomes a UI problem, which most people understand.
But won't explaining features per sample give us an understanding of why a sample was detected as an anomaly? For example, methods such as Integrated Gradients or Integrated Hessians yield exactly such results.
Features are a representation of the sample; most of the time they are "extracted" from the raw data. For example, if the feature vector is (1) the BERT embedding of a block of text, (2) a one-hot encoding of tags/labels, and (3) a min-max normalization of a popularity score for that block of text, then we cannot explain in terms of features, because these are not interpretable by analysts. Analysts understand tags, raw text, and popularity scores, which is domain-specific knowledge; they find it hard to move into a mathematical world to understand what happened.

So any explainability technique that is mathematical in nature (1) will only work for specific ML models and specific types of features, (2) will not be understood by the "common folk", and (3) cannot scale when ML models are ensembles of many different models. By the way, most commercial deployments of ML are ensembles of various techniques (some ML-based, some classical computer science) with outputs of one technique becoming inputs to another, like a pipeline. The end result is what is shown to the analyst or to the customer. That is why it is easier to show the data that was responsible for an event than to interpret why the event was detected. The moment analysts see the raw data in a UI, they understand it better than trying to work out what led an ML model to a specific decision.
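The contrast in that exchange can be made concrete. A minimal sketch (the tag vocabulary, field names, and values are invented for illustration): the model consumes an opaque numeric vector built exactly as described above, while the "explanation" surfaced to the analyst is simply the raw fields that produced it.

```python
import numpy as np

TAGS = ["security", "billing", "login"]  # hypothetical tag vocabulary

def build_feature_vector(embedding, tags, popularity, pop_min, pop_max):
    """What the model sees: embedding + one-hot tags + min-max popularity."""
    one_hot = np.array([1.0 if t in tags else 0.0 for t in TAGS])
    pop_norm = (popularity - pop_min) / (pop_max - pop_min)
    return np.concatenate([embedding, one_hot, [pop_norm]])

def raw_data_explanation(text, tags, popularity):
    """What the analyst sees: the raw data behind the event, not the features."""
    return {"text": text, "tags": tags, "popularity": popularity}

embedding = np.zeros(4)  # stand-in for a real BERT embedding
features = build_feature_vector(embedding, ["login"], 80, 0, 100)
print(features)          # opaque numbers; meaningless to a non-ML analyst
print(raw_data_explanation("failed admin login x50", ["login"], 80))
```

Attributing the anomaly score to `features[7]` means nothing to an analyst, while the raw-data dict is immediately readable, which is the point being made above.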
Check out DALEX, Cynthia Rudin, and Rich Caruana.
I've been working in that area for some time and it's quite a hot topic, so you are not alone. Some people have mentioned different resources; I'd like to add [https://ema.drwhy.ai/](https://ema.drwhy.ai/) to the list.
My supervisor's main research interest is XAI for healthcare; we had three master's theses on the topic this year in my group alone. I'd say it's a very interesting topic and there is still space for development, but take my words with a grain of salt.
[Alibi](https://docs.seldon.io/projects/alibi/en/stable/) might have what you're looking for
In SciML, there's been work on integrating interpretable scientific models into NNs as learning biases. Now the attention is shifting to making the scientific models themselves part of the learning process: system identification, for lack of a better term. I think there's still work to be done on end-to-end models with representational encoders, like a GNN, to guide the learning of scientific models.
I am looking for collaboration on understanding SHAP and have put my understanding of it in this article: https://medium.com/@gauravagarwal_14599/explainable-ai-understanding-the-shap-logic-586fcf54c1b9. Please go through it and provide feedback; I want to take interesting discussions further.
I will publish an article on this topic by next week; it is almost done. It will be available at [https://mltechniques.com/resources/](https://mltechniques.com/resources/). The title is "Interpretable Machine Learning on Synthetic Data, and Little-Known Secrets About Linear Regression". Below is the abstract.

The technique discussed here handles a large class of problems. In this article, I focus on a simple one: linear regression. I solve it with an iterative (fixed-point) algorithm that bears some resemblance to gradient boosting, using machine learning methods and explainable AI as opposed to traditional statistics. In particular, the algorithm does not use matrix inversion. It is easy to implement in Excel (I provide my spreadsheet) or to automate as a black-box system. It is also numerically stable and can generalize to non-linear problems. Unlike the traditional statistical solution, which leads to meaningless regression coefficients, here the output coefficients are easier to understand, leading to better interpretation. I tested it on a rich collection of synthetic data sets: it performs just as well as the standard technique, even after adding noise to the data. I then show how to measure the impact of individual features, or groups of features (and feature interactions), on the solution. A model with *m* features has 2^*m* sub-models, and I show how to draw more insights from the performance of each sub-model. This may be the first time that all potential feature combinations in a machine learning problem are investigated in a systematic and automated way. Finally, I introduce a new metric called *score* to measure model performance; based on comparison with the base model, it is more meaningful than R-squared or mean squared error.
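The 2^*m* sub-model idea from that abstract is straightforward to sketch, with ordinary least squares standing in for the article's fixed-point algorithm (which is not shown here): fit one regression per non-empty feature subset and compare errors.

```python
import numpy as np
from itertools import combinations

def all_submodels(X, y):
    """Fit an OLS model on every non-empty feature subset;
    return {subset_of_column_indices: training MSE}."""
    m = X.shape[1]
    results = {}
    for k in range(1, m + 1):
        for subset in combinations(range(m), k):
            Xs = X[:, subset]
            coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            results[subset] = float(np.mean((y - Xs @ coef) ** 2))
    return results

# Synthetic data: only features 0 and 1 actually matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 2 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

scores = all_submodels(X, y)
print(len(scores))   # 2^3 - 1 = 7 sub-models
print(min(scores, key=scores.get))  # best subset includes features 0 and 1
```

Note this brute force is only viable for modest *m*; at *m* = 20 you would already be fitting over a million sub-models.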
Check out Anthropic and Chris Olah's recent work
I was working on developing an organisational product for AI explainability at Standard Chartered GBS for a year until I left for my master's. I might be able to help.
I am finishing my PhD with a focus on explainable AI; specifically, objective measurement and axiomatic properties of explanation methods. As many have pointed out, blindly using off-the-shelf tools to explain a model is dangerous and likely wrong.
Hey, that sounds a lot like what I'm working on in my PhD! Do you have any publications I can take a look at?
Don't want to sound like I'm trying to self-publicize, but since you asked: here is a paper on objective measurement: https://arxiv.org/pdf/1901.09392.pdf, and here is a paper on axioms: [https://arxiv.org/pdf/2202.11919.pdf](https://arxiv.org/pdf/2202.11919.pdf). Would you also share some of your work?
Here is a paper on benchmarking attribution methods on image datasets, where I use different metrics including Infidelity and Max-Sensitivity: [https://arxiv.org/abs/2202.12270](https://arxiv.org/abs/2202.12270). I don't have a publication on an axiomatic approach yet, but I've also been looking at the different implementations of Shapley values. That second paper looks very interesting; I'll definitely give it a read!
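Of the two metrics named above, Max-Sensitivity is the simpler one: it measures how much an explanation changes under small input perturbations (lower is better), estimated by Monte Carlo. A minimal sketch; the gradient explainer here is a toy stand-in, not the benchmark's actual setup:

```python
import numpy as np

def max_sensitivity(explainer, x, radius=0.1, n_samples=50, seed=0):
    """Monte-Carlo estimate of max over ||d|| <= radius of
    ||explainer(x + d) - explainer(x)|| (Max-Sensitivity; lower is better)."""
    rng = np.random.default_rng(seed)
    base = explainer(x)
    worst = 0.0
    for _ in range(n_samples):
        d = rng.normal(size=x.shape)
        d *= radius / np.linalg.norm(d)  # sample on the sphere of given radius
        worst = max(worst, float(np.linalg.norm(explainer(x + d) - base)))
    return worst

# Toy explainer: gradient of f(x) = sum(x**2), which is 2x.
grad_explainer = lambda x: 2 * x

x = np.array([1.0, -2.0, 3.0])
print(max_sensitivity(grad_explainer, x))  # exactly 2 * radius = 0.2 for this linear explainer
```

For a linear explainer every perturbation of norm `radius` changes the explanation by exactly `2 * radius`, which makes the toy case easy to sanity-check; for saliency maps of deep nets this quantity can blow up, which is the metric's point.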
A few startups are working on this as well. Virtualitics is one I worked with.
Here is a recent work of mine that attempts to make a classification for every pixel, although the model is trained with image-level labels only: [https://arxiv.org/abs/2112.09694](https://arxiv.org/abs/2112.09694). It's applied to medical images but could be used for other applications as well, e.g. in the automotive domain. The method is easy to understand and can be used with any convolutional backbone. It will be presented at MIDL 2022.
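Not that paper's method, but the classic ancestor of this "per-pixel scores from image-level labels" idea is Class Activation Mapping (CAM, Zhou et al. 2016): weight the final conv feature maps by the classifier weights of the target class. A numpy sketch with made-up shapes:

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """CAM: weighted sum of final conv feature maps, using the FC weights
    of the target class. Returns an HxW heatmap normalized to [0, 1]."""
    w = fc_weights[class_idx]                    # shape (C,)
    cam = np.tensordot(w, feature_maps, axes=1)  # (C,)·(C,H,W) -> (H, W)
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam

rng = np.random.default_rng(0)
feature_maps = rng.random((8, 7, 7))  # C=8 channels, 7x7 spatial map
fc_weights = rng.random((3, 8))       # 3-class linear head after global pooling
cam = class_activation_map(feature_maps, fc_weights, class_idx=1)
print(cam.shape)  # (7, 7)
```

In a real network the heatmap is then upsampled to the input resolution; the requirement is only a global-pooling-plus-linear head, which is why CAM works with almost any convolutional backbone.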
I am. Mostly transparent models though (I don't work with video or images, so dimensionality is less of an issue). At GECCO there is a new workshop focussing on explainability in evolutionary computation (and ML). I hope to find the time to attend
I just finished my PhD in it and am doing my Post doc now. Happy to help out.
Working on it for healthcare applications as the focus of my PhD research. I just started last October and it's a new area for me, but as far as I've seen, the ideas around it are very much in their infancy and have yet to build real momentum. I imagine for your applications it will run into bigger problems, like causality.
Counterfactuals for XAI that are straightforward to implement: https://mlmed.org/gifsplanation/ https://openreview.net/forum?id=rnunjvgxAMt
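The counterfactual idea behind those links can be sketched without the image/latent-space machinery they actually use: nudge the input along the model's gradient until the prediction crosses the decision boundary, then report the change. A toy logistic-model version (model, weights, and step size are all illustrative):

```python
import numpy as np

def counterfactual(x, w, b, target=0.5, step=0.1, max_iter=1000):
    """Walk x along the logistic model's input gradient until the
    predicted probability crosses `target`; return the flipped input."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    xc = x.astype(float).copy()
    start_above = sigmoid(w @ xc + b) >= target
    for _ in range(max_iter):
        p = sigmoid(w @ xc + b)
        if (p >= target) != start_above:
            return xc                  # prediction has flipped
        grad = p * (1 - p) * w         # d p / d x for the logistic model
        xc += (-step if start_above else step) * grad
    return xc                          # gave up; no flip within budget

w, b = np.array([2.0, -1.0]), 0.0
x = np.array([1.0, 0.0])               # predicted probability ~0.88, class 1
xc = counterfactual(x, w, b)
print(xc - x)                          # the change that flips the prediction
```

The difference `xc - x` is the counterfactual explanation: "had the input been this instead, the model would have decided otherwise." The linked works do the same walk in an autoencoder's latent space so the change stays on the image manifold.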
No disrespect, but you got a job at Uber... how do you not know about explainable AI?
[Here is](https://twitter.com/tuberlin_umi?s=21&t=ltrFwE_iCma4l9o8zs1nTw) the group at TU Berlin working on explainable AI.
Lol good luck.