As AI becomes increasingly prevalent across industries and applications, the demand for highly accurate AI models continues to grow. AI models can often be more accurate than the traditional methods they replace, yet this can come at a price: how is this complex model making decisions, and how can we, as engineers, verify that it is working as we expect?
Enter explainable AI: a set of tools and techniques that help us understand model decisions and uncover problems with black-box models, such as bias or susceptibility to adversarial attacks. Explainability helps those working with AI understand how models arrive at their predictions. This can be as simple as identifying which features drive a model's decisions, but it becomes harder as the models themselves grow more complex.
Why Explainability?
Practitioners seek model explainability primarily for three reasons:
- Debugging: Understanding where and why predictions go wrong, and running “what-if” scenarios, can improve model robustness and help identify and reduce bias.
- Confidence: Different stakeholders want to be able to explain a model, depending on their role in and interaction with the application. For example:
  - A business owner wants to trust and understand how the AI model works.
  - A customer wants to feel confident that the application will work as expected in all scenarios and that the system's behavior is fair, rational, and transparent.
  - An engineer wants insight into model behavior, and into how accuracy can be improved, by understanding why the model makes certain decisions.
- Regulations: There is an increasing desire to use AI models in safety-critical applications, which may carry internal and external regulatory requirements. Although each industry has its own specific requirements, providing evidence of training robustness, fairness, and trustworthiness will be important.
Current Explainability Methods
Engineers have a variety of methods to choose from, and it helps to group them into categories to understand which method will be of most value to the project and the engineer. Explainability methods can be broadly divided into global and local methods:
- Global Methods: Those that provide an overview of the most influential variables in the model based on input data and predicted output.
- Local Methods: Those that provide an explanation of a single prediction result (contrasted with a global method in the sketch after this list).
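To make the distinction concrete, here is a minimal sketch contrasting the two: permutation feature importance as a global method and LIME as a local method. It assumes a scikit-learn classifier and the open-source lime package; the dataset and model are illustrative stand-ins, not specific recommendations.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from lime.lime_tabular import LimeTabularExplainer

# Stand-in data and model; any tabular classifier could be used instead.
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Global method: permutation importance ranks features by how much
# shuffling each one degrades accuracy over the whole test set.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1][:5]:
    print(f"{data.feature_names[i]}: {result.importances_mean[i]:.3f}")

# Local method: LIME fits a simple surrogate model around one sample
# to explain that single prediction.
explainer = LimeTabularExplainer(X_train,
                                 feature_names=list(data.feature_names),
                                 class_names=list(data.target_names),
                                 mode="classification")
explanation = explainer.explain_instance(X_test[0], model.predict_proba,
                                         num_features=5)
print(explanation.as_list())
```

The global ranking describes the model's overall behavior, while the LIME output lists the feature rules that mattered most for one specific prediction.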
Choosing a Method for Interpretability
The figure below provides an overview of inherently explainable machine learning, various (model-agnostic) interpretability methods, and guidance on when to apply them.
Figure 1. Choosing a Method for Interpretability © 1984–2021 The MathWorks, Inc.
Each of these approaches has its own limitations, and it is important to be aware of them as you fit these algorithms to your use cases. For example, the finance industry commonly uses Shapley values because the method meets the regulatory requirement of providing “complete” explanations of predictions, but it requires much more computation than LIME.
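As a rough illustration of the Shapley approach, the sketch below uses the open-source shap package to explain one prediction of a tree-based regressor. The per-feature contributions plus a baseline add up to the model's output for that query point, which is the “completeness” property referenced above. The dataset and model here are stand-ins.

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Stand-in data and model; Shapley values apply to any model, but
# TreeExplainer computes them efficiently for tree ensembles.
data = load_diabetes()
model = RandomForestRegressor(random_state=0).fit(data.data, data.target)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(data.data[:1])  # explain one prediction

# Each value is a feature's signed contribution to this prediction;
# the contributions plus the baseline (expected value) add up to the
# model's output for the query point.
print(dict(zip(data.feature_names, shap_values[0])))
print("baseline:", explainer.expected_value)
```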
Established approaches for deep learning provide local explanations and are well suited to debugging and justifying specific predictions. By contrast, traditional machine learning models can be explained with both local and global approaches. Regardless of the specific algorithm used, the goal is the same: to give engineers a deeper understanding of the data and the model.
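As one example of a local explanation for deep learning, the sketch below computes a vanilla gradient saliency map in PyTorch, highlighting which input pixels most influence the top-scoring class. The untrained ResNet-18 and the random input are placeholders for a trained network and a real preprocessed image.

```python
import torch
from torchvision import models

# Placeholder network and input; in practice, load your trained model
# and a real preprocessed image.
model = models.resnet18(weights=None).eval()
image = torch.rand(1, 3, 224, 224, requires_grad=True)

# Forward pass, then backpropagate the top class score to the input pixels.
scores = model(image)
top_class = scores.argmax(dim=1).item()
scores[0, top_class].backward()

# Saliency map: per-pixel influence, taken as the largest absolute
# gradient across the color channels.
saliency = image.grad.abs().max(dim=1).values.squeeze()  # shape (224, 224)
print(saliency.shape)
```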
The author, Dr. Amod Anandkumar, Application Engineering Manager at MathWorks, leads a team of application engineers helping clients across industries successfully adopt and implement technologies such as AI, automated driving, and wireless communications. You can reach him at aanandku@mathworks.com.