Posts

Showing posts from October, 2018

Self-Organizing Maps - 90s Fad or Useful Tool? (Part 1)

In this post, I will explain how self-organizing maps (SOMs) work. This first part covers the technical underpinnings of the technique. If you're impatient and just want to get to the implementation, skip to part 2.

A few years ago, I was having a discussion with a computational chemistry colleague and the topic of self-organizing maps (SOMs) came up. My colleague remarked, "Weren't SOMs one of those 90s fads, kind of like Furbys?" While there were a lot of publications on SOMs in the early 1990s, I would argue that SOMs continue to be a useful and somewhat underappreciated technique.

What Problem Are We Trying to Solve?

In many situations in drug discovery, we want to be able to arrange a set of molecules in some sort of logical order.  This can be useful in a number of cases.
Clustering.  Sometimes we want to be able to put a set of molecules into groups and select representatives from each group.  This may be the case when we only h…

Self-Organizing Maps - The Code (Part 2)

In this post, we will look at examples of how two different open source Python libraries can be used to generate self-organizing maps.  The MiniSom library is great for building SOMs for smaller sets with fewer than 10K molecules.   The Somoclu library can use either a GPU or multiple CPU cores to generate a SOM, so it's well suited to larger libraries.  While Somoclu is a lot faster than MiniSom, installation on non-Linux platforms can require a bit of extra work.

I've provided example use cases for both libraries as Jupyter notebooks.  Hopefully, this will make it easier for readers to experiment with these methods.
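
For readers who want a feel for the general idea behind a SOM before opening the notebooks, here is a minimal sketch of a training loop in plain Python. This is not the code from MiniSom or Somoclu. The function names (`train_som`, `best_matching_unit`), the linear decay schedule, and the toy descriptor vectors are all assumptions chosen for illustration; real molecular fingerprints would be much longer vectors.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid cell whose weight vector is closest (squared Euclidean) to x."""
    return min(
        weights,
        key=lambda cell: sum((wi - xi) ** 2 for wi, xi in zip(weights[cell], x)),
    )


def train_som(data, rows, cols, n_iter=500, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns a dict mapping (row, col) -> weights."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0
    # Start each cell from a copy of a randomly chosen input vector.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks
        x = rng.choice(data)
        bmu = best_matching_unit(weights, x)
        for (r, c), w in weights.items():
            # Cells near the winner on the grid are pulled toward x more strongly.
            grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
            for i in range(dim):
                w[i] += lr * h * (x[i] - w[i])
    return weights


# Made-up 3-component descriptors standing in for molecular fingerprints.
toy_data = [
    [0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.05, 0.05, 0.0],
    [0.9, 1.0, 0.9], [1.0, 0.9, 1.0], [0.95, 0.95, 1.0],
]
som = train_som(toy_data, rows=3, cols=3)
for vec in toy_data:
    print(vec, "->", best_matching_unit(som, vec))
```

Each input is assigned to the grid cell with the closest weight vector, so similar inputs tend to end up in the same or neighboring cells. The libraries discussed here follow the same general idea with far more efficient implementations.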

MiniSom

The MiniSom library is great for generating SOMs for smaller datasets consisting of thousands to tens of thousands of molecules. I found the MiniSom library easy to install on a Mac or a Linux platform. The MiniSom example notebook can be found here on GitHub.

Here's some benchmarking data using MiniSom.  In the plot below we compare the time require…

My Science/Programming Journey

Note: This post is purely self-indulgent and probably won't be interesting to anyone who is not me.  You have been warned.  

A few recent tweets on programming languages got me thinking about my scientific/programming journey.  A long time ago in a galaxy far away ...

Phase I Varian/Analytichem 1984-1990

When I graduated from college in 1984, I got my first full-time job in science. I was hired as the head of manufacturing chemistry at a small company called Analytichem International in Harbor City, California. Prior to this, while I was an undergrad at UCSB, I worked part-time for a company called Petrarch Systems synthesizing siloxane polymers for, among other things, medical devices and gas chromatography columns. This experience doing siloxane chemistry was the reason I got the job at Analytichem, where they were doing similar chemistry on surfaces. I started work at Analytichem with the intent of having a career as a polymer chemist. I'd had very little experience with…