Examining the Data From the ChEMBL SARS-CoV-2 Drug Repurposing Screens

One interesting dataset in the ChEMBL 27 release is a compilation of several drug repurposing screens for SARS-CoV-2.  Given recent comments around the lack of consistency in these screens, I was eager to take a look at the data.  I thought it might also be interesting to share some of the techniques I used to explore the screening results.  It's my hope that readers will find some of the ways I manipulate data useful for their analyses.  As usual, all of the code associated with this post is in a Jupyter notebook on GitHub.

1. Getting the data from ChEMBL

As a first step, we need to construct an appropriate SQL query to extract the data we want to examine.  I'm using MySQL to access the ChEMBL data, not because it's the world's greatest database, but because I've been using it for a long time, and I have all of the necessary commands memorized.  I'm not going to use a fancy Object Relational Mapper (ORM); this is a one-off analysis, so plain old SQL is just fi…

Wicked Fast Cheminformatics with NVIDIA RAPIDS

Graphics Processing Units (GPUs) have revolutionized scientific computing.  Scientists have been using GPUs to achieve significant speed-ups in fields ranging from molecular dynamics to machine learning.  Unfortunately, programming GPUs is a rather painful process that requires considerable expertise.  Fortunately for those of us who'd prefer to forgo the travails of CUDA programming, NVIDIA has released the RAPIDS library, which makes it easy to perform a wide array of data science operations on a GPU.  In this post, I'll present a few examples of how we can use RAPIDS to speed up tasks that we commonly perform in cheminformatics.  As usual, a Jupyter notebook containing all of the code associated with this post is available on GitHub.

2020-06-23: I made a couple of changes to the code that slightly changed the runtimes and the trustworthiness values for t-SNE.  The conclusions are the same: RAPIDS ROCKS!

I've been following RAPIDS since its initial publi…