Showing posts from February, 2019

Some Thoughts on Evaluating Predictive Models

I'd like to use this post to provide a few suggestions for those writing papers that report the performance of predictive models.  This isn't meant to be a definitive checklist, just a few techniques that can make it easier for the reader to assess the performance of a method or model.

As is often the case, this post was motivated by a number of papers in the recent literature.  I'll use one of these papers to demonstrate a few things that I believe should be included, as well as a few that I believe should be avoided.  My intent is not to malign the authors or the work; I simply want to illustrate a few points that I consider important.  As usual, all of the code I used to create this analysis is on GitHub.

For this example, I'll use a recent paper on ChemRxiv that compares the performance of two methods for carrying out free energy calculations.
1. Avoid putting multiple datasets on the same plot, especially if the combined performance is not relevant.  In t…