Machine learning (ML) and artificial intelligence (AI) are emerging technologies with applications in nearly every segment of society, including healthcare, agriculture, finance, marketing, research, automotive, and manufacturing.

ML models are deployed to make important predictions and decisions that impact lives. For example, Tesla’s cars use AI algorithms for autonomous driving. Although remarkable, such cars can be dangerous if those algorithms fail to deliver accurate results.

Thus, it’s important for data scientists to understand a model and be able to explain why and how it reached its conclusions. Machine learning interpretability is as crucial as developing the models themselves.
In this article, we’ll look at various machine learning interpretability techniques, weigh their advantages and disadvantages, and compare them to understand which is appropriate in which situation.
1. Trust: Whenever we have to make important decisions, we usually reach out to people we trust. Similarly, AI models are deployed where accountability matters, such as in decision-making processes, self-driving cars, and virtual assistants.

Interpretable machine learning models help build trust in, and accountability for, the predictions they make. However, interpretability doesn’t reveal how accurate a model is at predicting outcomes; it only reveals why the model makes the predictions it does.
2. Transferability of the learning environment: We can generalize about things once we understand a particular aspect of them. Is the same possible for machine learning models? Yes, although it requires fine-tuning of parameters, a large dataset, and an interpretable machine learning model.

Interpretability helps data scientists understand how a model will behave if the training environment is altered or if the model is deployed in a different environment than the one it was trained in.
3. Informativeness: One of the major benefits of machine learning interpretability is that it allows data to be explored more closely and precisely. Data scientists sometimes miss important features of the data; interpretable machine learning models can highlight them and provide useful information about a dataset.
There are two important properties of interpretable machine learning models. The first deals with understanding how the algorithm used in the model works, known as transparency. The second is about visualizing and explaining the model’s outcomes, i.e., how it learned the features, known as post-hoc explanations.

Consider predictions made by a deep learning model consisting of a large number of neural network layers. The architecture itself doesn’t provide any explanation of the results and isn’t easy to interpret: it’s not feasible for a human to trace all the calculations of a network comprising hundreds of neurons.

However, post-hoc explanations can surface the logic and evidence behind the results, which is useful for judging whether those results are fair. Thus, both properties are necessary for machine learning interpretability.
The global surrogate method seeks to explain the predictions of a black box model that cannot be explained otherwise. It is easy to implement. As the first step, predictions are obtained from the black box model. Then, an interpretable model is chosen and trained on those predictions. It’s important to note that this interpretable model is not trained on the real target values but on the black box’s outputs.

Once trained, the model can be used as a surrogate for the original black box model, and the black box’s predictions can be interpreted through it. Such surrogates are usually simple models like linear regression or decision trees, which can be interpreted easily.
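To make the process concrete, here is a minimal sketch in Python using scikit-learn. The gradient-boosting classifier playing the role of the black box, the breast cancer dataset, and the depth-3 decision tree surrogate are all illustrative assumptions rather than part of the method itself:

```python
# A rough sketch of the global surrogate method with scikit-learn.
# The gradient-boosting "black box", the breast cancer dataset, and the
# depth-3 decision tree surrogate are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 1: train the black box model and collect its predictions.
black_box = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
bb_train_preds = black_box.predict(X_train)

# Step 2: train an interpretable surrogate on the black box's predictions,
# not on the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, bb_train_preds)

# Step 3: measure fidelity (how closely the surrogate mimics the black box)
# and read off the surrogate's rules as the explanation.
fidelity = surrogate.score(X_test, black_box.predict(X_test))
print(f"Surrogate fidelity on held-out data: {fidelity:.2f}")
print(export_text(surrogate, feature_names=list(X.columns)))
```

The fidelity score is worth reporting alongside the explanation: a surrogate that poorly mimics the black box tells us little about it, however interpretable it is on its own.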
LIME stands for Local Interpretable Model-Agnostic Explanations. It gives post-hoc explanations for black box models. It is similar to the global surrogate method in that it fits an interpretable model, but it differs greatly in what that model explains.

LIME employs local surrogate models to explain single predictions. Instead of approximating the black box globally, it fits an interpretable model around one instance at a time to explain each individual prediction made by the black box model.
The following are the basic steps in its implementation:

1. Select the instance whose prediction you want to explain.
2. Generate perturbed samples in the neighborhood of that instance.
3. Obtain the black box model’s predictions for the perturbed samples.
4. Weight the perturbed samples by their proximity to the instance being explained.
5. Train an interpretable model on the weighted samples and their black box predictions, and use it to explain the original prediction.
The interpretable ML model can vary from linear regression to Lasso. The number of features used in the explanation is decided manually by the engineer; a higher number of features results in a more faithful local model, at the cost of a less compact explanation.
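Here is a minimal sketch of these steps using the lime package (assuming it is installed via pip install lime); the random forest black box and the breast cancer dataset are illustrative choices:

```python
# A rough sketch of LIME on tabular data, assuming the lime package is
# installed (pip install lime). The random forest "black box" and the
# breast cancer dataset are illustrative assumptions.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0
)
black_box = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# The explainer perturbs samples around an instance and fits a weighted
# local surrogate to the black box's predictions on those samples.
explainer = LimeTabularExplainer(
    X_train,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

# Explain a single prediction; num_features controls how many features
# appear in the local explanation.
explanation = explainer.explain_instance(
    X_test[0], black_box.predict_proba, num_features=5
)
print(explanation.as_list())
```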
This section underlines some important differences between LIME and the global surrogate method:

1. Scope: LIME explains one prediction at a time with a local surrogate, while the global surrogate method approximates the black box model’s behavior across the entire dataset.
2. Training data: a LIME surrogate is trained on perturbed samples weighted by their proximity to the instance being explained, whereas a global surrogate is trained on the black box’s predictions for the original data.
3. Output: LIME yields per-instance feature contributions, while a global surrogate yields a single interpretable model (such as a decision tree) describing the black box as a whole.
SHAP stands for SHapley Additive exPlanations, a technique for ML model interpretability introduced by Lundberg and Lee in 2017. Based on cooperative game theory, it treats each feature in a dataset as a “player” in a game and calculates each feature’s contribution to the prediction.

For example, consider a model trained to predict the price of a car. To understand each feature’s contribution to a prediction, one naive solution is to multiply a feature’s weight by its value, but this only makes sense for linear models, not for complex ones.

SHAP provides a solution to this problem. It calculates the importance of each feature, or “player”, by adding and removing it across all possible subsets of players: the car price is predicted for each combination, and a feature’s average marginal contribution across these subsets is its Shapley value. The disadvantage of this technique is that the number of subsets, and hence the computation time, grows exponentially with the number of features, making exact computation very expensive.
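To illustrate, here is a minimal sketch using the shap package’s TreeExplainer, which avoids the exponential cost for tree-based models; the random forest regressor and the California housing dataset are stand-ins for the car-price example above:

```python
# A rough sketch of SHAP using the shap package's TreeExplainer, assuming
# shap is installed (pip install shap). The random forest regressor and the
# California housing dataset stand in for the article's car-price example.
import shap
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor

X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles,
# sidestepping the exponential enumeration of feature subsets.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:100])

# Per-feature contributions for the first prediction: each value is how far
# that feature pushes the prediction away from the baseline (expected value).
for name, contribution in zip(X.columns, shap_values[0]):
    print(f"{name}: {contribution:+.3f}")
```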
In this article, we’ve seen the importance of machine learning interpretability. It’s necessary for building trustworthy models that can be used where decision-making is crucial.

Model interpretability also provides useful insights about the data that may otherwise go unnoticed. There are several techniques for interpreting ML models, including global surrogates, LIME, and SHAP, each with its own advantages and disadvantages. For instance, LIME explains individual predictions rather than the model as a whole, which is why it is often preferred over the global surrogate method when single decisions must be justified.

Thus, the explainability of ML models makes it possible to trust AI systems and use them in high-stakes settings.