Building on the Fragments From the Diamond/XChem SARS-CoV-2 Main Protease (MPro) Fragment Screen (Part I)

Like many in the community, I've been spending time looking at the results of a fragment screen of the SARS-CoV-2 main protease (MPro) by the group at the Diamond Light Source. I thought it might be useful to share some of the techniques and the code that I use when working with the results of fragment screens. One of the first things I like to do is to virtually expand the fragment by adding a set of substituents and looking for ways that the fragment can be expanded to make new interactions and increase potency.

Once I've added the substituents, I'll generate 3D conformers for the resulting expanded fragments. The conformers are constrained so that the coordinates of the fragment are fixed. Since we've fixed the coordinates of the core fragment, we can trivially place these expanded fragments in the original binding site and evaluate their interactions. This approach is similar to the one we described in a 2004 paper in J. Med. Chem. In this post, I'll describe one approach to computationally expanding a fragment hit. In a subsequent post, I'll detail the method for generating the 3D conformations and evaluating these conformations in the binding site.

I recently found this preprint and the accompanying GitHub repo from Pavel Polishchuk describing a software tool called CReM (Chemically Reasonable Mutations). The CReM software is like a Swiss army knife for molecular modification and provides several useful capabilities for growing, mutating, and linking molecules. In this post, we'll take a look at a few ways that CReM can be used to modify one of the fragments from the Diamond COVID-19 structures. All of the code and data accompanying this post is available in a Jupyter notebook on GitHub.

CReM takes a simple but effective approach to modifying molecules. The workflow begins by creating a database of matched molecular pairs based on a set of input molecules. The CReM distribution contains a couple of databases compiled from the ChEMBL database. Each matched molecular pair contains a context that specifies an atom environment around which the matched molecular pair was found. When replacing or attaching functional groups, CReM matches the context around the attachment point and uses this context to identify relevant functional groups. The user can specify a radius for the context. Larger values for this radius specify a more specific context (fewer replacements) while small values specify a less specific environment (more replacements). As mentioned above, CReM provides the ability to mutate, grow, and link molecules. In this post, we'll focus on the "grow" capability, since this provides an excellent way to explore around fragment hits.

We'll start out with this fragment, labeled x0161, in the Diamond set.

To begin, we will create an RDKit molecule from the fragment and generate 2D coordinates.

frag_smi = "COC(=O)C=1C=CC(=CC1)S(=O)(=O)N"
frag_mol = Chem.MolFromSmiles(frag_smi)
AllChem.Compute2DCoords(frag_mol)

Next, we have to specify the location of the database that CReM will use.

crem_db_file = '/Users/pwalters/crem/replacements_sc2.5.db'

I found that some of the substituents that CReM uses contained undesirable (reactive, potentially toxic) functional groups. In order to remove these, I defined a REOS (Rapid Elimination Of Swill) class that uses some of the code and ideas I laid out in my post on Filtering Chemical Libraries. The REOS class is initialized with a set of filtering rules.

reos = REOS("bms_filters.csv")

Now that our setup is complete, we can add substituents to our fragment.

grow_list = list(grow_mol(frag_mol, db_name=crem_db_file,max_atoms=8,return_mol=True))
mol_list = [RemoveHs(x[1]) for x in grow_list]
_ = [AllChem.GenerateDepictionMatching2DStructure(m,frag_mol) for m in mol_list]

Note that I performed two operations on the resulting molecules to improve the aesthetics and make it easier to view the molecules.

Remove explicit hydrogens
Align the molecules to the fragment so that they're all oriented the same way

Let's grab a random sample of the molecules we generated and look at them.

random.shuffle(mol_list)

MolsToGridImage(mol_list[:25],molsPerRow=5)

There are several interesting ideas here, but this also points to the fact that we probably need to add a couple of additional functional group filters. In particular, we need to add a filter for long alkyl chains. We may also be able to get at this by adding a filter for rotatable bonds.

One of the other cool features in CReM is the ability to exclude certain atoms from expansion or to only expand from a predefined set of atoms. For instance, let's assume that the sulfonamide NH2 in the fragment is making a hydrogen bond, and we don't want to substitute that part of the molecule. The grow_mol function allows us to specify the atom numbers for the atoms that should not be substituted. In order to use this, we need to specify the atom numbers for the atoms that should not be substituted. To make this easier to use, we'll define a function to return atom numbers matching a SMARTS pattern.

def get_matching_atoms(mol_in,smarts):

pat = Chem.MolFromSmarts(smarts)

return [x[0] for x in mol_in.GetSubstructMatches(pat)]

We can now use this function to get the atom number for the sulfonamide nitrogen.

sulfonamide_N = get_matching_atoms(frag_mol,"[NH2]")

This time we'll add substituents to the molecule, but avoid the sulfonamide nitrogen.

grow_list = list(grow_mol(frag_mol, db_name=crem_db_file,max_atoms=8,protected_ids=sulfonamide_N,return_mol=True))

mol_list = [x for x in mol_list if reos.eval_mol(x)]

mol_list = [RemoveHs(x[1]) for x in grow_list]

_ = [AllChem.GenerateDepictionMatching2DStructure(m,core) for m in mol_list]

MolsToGridImage(mol_list[:25],molsPerRow=5)

Note that in this case, the sulfonamide is not substituted.

Finally, we can also specify only the parts of the molecule that will be substituted. Let's say that we only want to substitute the aromatic carbons on the fragment.

aromatic_atoms = get_matching_atoms(frag_mol,"[cH]")

grow_list = list(grow_mol(frag_mol, db_name=crem_db_file,max_atoms=5,replace_ids=aromatic_atoms,return_mol=True))

mol_list = [x for x in mol_list if reos.eval_mol(x)]

mol_list = [RemoveHs(x[1]) for x in grow_list]

_ = [AllChem.GenerateDepictionMatching2DStructure(m,core) for m in mol_list]

MolsToGridImage(mol_list[:25],molsPerRow=5)

This is just a quick overview of how an Open Source package can be used to generate ideas from hits coming out of a fragment screen. Hopefully someone will find this useful. In the next post I'll detail some of the tools that can be used to assess these expanded fragments.

Search This Blog

Practical Cheminformatics

Building on the Fragments From the Diamond/XChem SARS-CoV-2 Main Protease (MPro) Fragment Screen (Part I)

Comments

Post a Comment

Popular posts from this blog

We Need Better Benchmarks for Machine Learning in Drug Discovery

Some Thoughts on Splitting Chemical Datasets

Comparing Classification Models - You’re Probably Doing It Wrong