Here’s the first part of my review of some interesting machine learning (ML) papers I read in 2023. As with the previous editions, this shouldn’t be considered a comprehensive review. The papers covered here reflect my research interests and biases, and I’ve certainly overlooked areas that others consider vital. This post is pretty long, so I’ve split it into three parts, with parts II and III to be posted in the next couple of weeks.
I. Docking, protein structure prediction, and benchmarking
II. Large Language Models, active learning, federated learning, generative models, and explainable AI
III. Review articles
2023 was a bit of a mixed bag for AI in drug discovery. Several groups reported that the deep learning methods for protein-ligand docking weren’t quite what they were initially cracked up to be. AlphaFold2 became pervasive, and people started to investigate, with mixed success, the utility of predicted protein structures. There were reports of significant advanc
I was taken aback by a recent CNBC article entitled “Generative AI will be designing new drugs all on its own in the near future”. I should know better than to pay attention to AI articles in the popular press, but I feel that even scientists working in drug discovery may have a skewed perception of what generative AI can and can’t do. To understand exactly what’s involved, it might be instructive to walk through a typical generative molecular design workflow and point out a few things. First, these programs are far from autonomous. Even when presented with a well-defined problem, generative algorithms produce a tremendous amount of nonsense. Second, domain expertise is essential when sifting through the molecules produced by a generative algorithm. Without a significant medicinal chemistry background, one can’t make sense of the results. Third, while a few nuggets exist in the generative modeling output, a lot of work and good old-fashioned cheminformatics are required to ext
Picking up where we left off in Part I, this post covers several other ML in drug discovery topics that interested me in 2023. Some areas, like large language models, are new, and most of the work is at the proof-of-concept stage. Others, like active learning, are more mature, and several groups are starting to explore nuances of the methods. Here’s the structure of Part II.
4. Large Language Models
5. Active Learning
6. Federated Learning
7. Generative Models
8. Explainable AI
9. Other Stuff
4. Large Language Models
The emergence of GPT-4 and ChatGPT brought considerable attention to large language models (LLMs) in 2023. In November and December, several large pharmas held “AI Day” presentations featuring LLM applications for clinical trial data analysis. Many of these groups demonstrated the ability of LLMs to ingest large bodies of unstructured clinical data and subsequently generate tables and reports based on natural language queries. Aside from some very brief demos on co