Using Counterfactuals to Understand Machine Learning Models
While machine learning (ML) models have become integral to many drug discovery efforts, most of these models are "black boxes" that don't explain their predictions. There are several reasons we would like to be able to explain a prediction:

- Providing scientific insights that will guide the design of new compounds.
- Instilling confidence among team members. As I've said before, a computational chemist only has two jobs: to convince someone to do an experiment and to convince someone not to do an experiment. These jobs are much easier when you can explain the "why" behind a prediction.
- Debugging and improving models. Improving a model is easier if you can understand the rationale behind a prediction.

As I wrote in a previous post, identifying and highlighting the molecular features that drive an ML prediction can be difficult. One recent promising approach is the counterfactuals method published by Andrew White's group at the University of Rochester.