Binding-Pocket Empty Space

Whether a molecule binds to a protein pocket depends not only on its interactions with the protein, but also on the empty space surrounding it. We need an algorithm for mapping out the empty space in a protein pocket. Here’s an outline of the Python script we need:

  1. Load the coordinates of the protein and the coordinates of the ligand (Dr. Durrant will provide numpy arrays of these 3D coordinates).
  2. Calculate the convex hull of the alpha carbons of the protein. (Dr. Durrant will provide a numpy array of the alpha-carbon coordinates.) I believe scipy provides functions for this purpose (see scipy.spatial).
  3. Place “seed points” near the locations of the ligand atoms (coordinates snapped to the nearest 0.5 or 0.25 Å, as set by a user parameter).
  4. Recursively consider adjacent points on a 3D grid, spreading out from the seed points in both directions along each of the three coordinate axes. The recursion stops when a point falls outside the convex hull or comes too close to one of the protein atoms.
  5. The function returns the 3D points that fill the negative space, as a numpy array.
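
As a rough illustration of steps 2 through 5, here is a minimal sketch, not a finished implementation. It makes several assumptions that are not part of the outline above: the function name fill_pocket and the parameters spacing and min_dist are hypothetical, the recursion is replaced by an iterative breadth-first flood fill (deep recursion would quickly hit Python's recursion limit), and the inside-the-hull test uses scipy.spatial.Delaunay, whose triangulation covers the same convex hull as the alpha carbons.

```python
from collections import deque

import numpy as np
from scipy.spatial import Delaunay, cKDTree


def fill_pocket(protein_xyz, alpha_carbon_xyz, ligand_xyz, spacing=0.5, min_dist=3.0):
    """Return grid points (N x 3 array) filling the empty space around the ligand.

    spacing and min_dist (both in angstroms) are hypothetical parameter names.
    """
    hull = Delaunay(alpha_carbon_xyz)  # find_simplex() >= 0 means "inside the hull"
    atoms = cKDTree(protein_xyz)       # fast nearest-protein-atom lookups

    def is_open(index):
        point = np.asarray(index) * spacing
        if hull.find_simplex(point) < 0:
            return False               # outside the convex hull
        nearest, _ = atoms.query(point)
        return nearest >= min_dist     # far enough from every protein atom

    # Step 3: snap ligand atoms to integer grid indices (index * spacing = coordinate).
    seeds = {tuple(int(v) for v in row) for row in np.rint(ligand_xyz / spacing)}
    seen = set(seeds)
    filled = [s for s in seeds if is_open(s)]
    queue = deque(filled)

    # Step 4: spread to the six axis-aligned neighbors of each accepted point.
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        i, j, k = queue.popleft()
        for di, dj, dk in steps:
            neighbor = (i + di, j + dj, k + dk)
            if neighbor in seen:
                continue
            seen.add(neighbor)
            if is_open(neighbor):
                filled.append(neighbor)
                queue.append(neighbor)

    # Step 5: convert grid indices back to coordinates.
    return np.array(sorted(filled), dtype=float).reshape(-1, 3) * spacing
```

Storing integer grid indices rather than floating-point coordinates keeps the visited-set lookups exact. Every candidate point is tested at most once, so the run time grows with the number of grid points inside the hull.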

Multi-Processor Python

In our lab, we often need to run independent Python functions on multiple processors. By independent, I mean the functions don’t need to communicate with each other (embarrassingly parallel). Our current scripts divide the individual “tasks” evenly in the beginning and then run each “chunk” (i.e., subset of all tasks) on a separate processor. But occasionally one of these chunks will take much longer to finish than the others, so the processors aren’t all used optimally.

  1. There needs to be periodic “load balancing.”
  2. We need a solution that works on Windows if possible.
  3. We’re looking for something written in pure Python, using only the standard library, that can be easily incorporated into our projects without installing any third-party modules.
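
One possible starting point, offered as a sketch rather than a finished solution, is the standard-library multiprocessing module. Instead of dividing the tasks into fixed chunks up front, the pool below hands out one task at a time, so a worker that finishes early simply picks up the next task. The helper name run_balanced is hypothetical, and math.factorial stands in for a real task whose run time varies.

```python
import math
import multiprocessing


def run_balanced(func, tasks, processes=None):
    """Apply func to every task, giving each idle worker one task at a time.

    Results come back in the same order as the tasks.
    """
    with multiprocessing.Pool(processes=processes) as pool:
        # chunksize=1: a worker only receives its next task after finishing the
        # previous one, so a single slow task cannot hold up a whole chunk.
        return list(pool.imap(func, tasks, chunksize=1))


if __name__ == "__main__":
    # This guard is required on Windows, where each worker re-imports the script.
    sizes = [10, 20000, 50, 30000, 5]  # tasks with very different run times
    results = run_balanced(math.factorial, sizes, processes=2)
    print([r.bit_length() for r in results])
```

The function passed to the pool has to be importable by the worker processes (defined at module level, not a lambda), which matters most on Windows. Handing out tasks one at a time adds some communication overhead, so for very large numbers of tiny tasks a somewhat larger chunksize may be a better compromise.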

Pretty Grapher

Our lab often has to generate graphs for scientific publications. It can be very time consuming to get them looking just right. This project involves an automated graph-generating Python script. It will accept a JSON file as input. That file includes the following: the data to plot, the type of graph, whether color or black and white should be used, the graph title, and the axis labels.
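
The exact JSON layout has not been decided. Purely as an illustration, the sketch below assumes one possible layout and checks that an input contains every field listed above. All key names (data, graph_type, use_color, title, x_label, y_label) and the function name parse_graph_spec are hypothetical.

```python
import json

# Hypothetical key names; the real input format is still to be defined.
REQUIRED_KEYS = {"data", "graph_type", "use_color", "title", "x_label", "y_label"}
GRAPH_TYPES = {"jointplot", "distplot", "kdeplot", "regplot", "boxplot", "barplot"}

EXAMPLE_SPEC = """
{
  "data": {"x": [1, 2, 3, 4], "y": [1.2, 1.9, 3.1, 4.2]},
  "graph_type": "regplot",
  "use_color": false,
  "title": "Example fit",
  "x_label": "Concentration",
  "y_label": "Response"
}
"""


def parse_graph_spec(text):
    """Parse a JSON graph description and check that nothing is missing."""
    spec = json.loads(text)
    missing = REQUIRED_KEYS - spec.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if spec["graph_type"] not in GRAPH_TYPES:
        raise ValueError(f"unsupported graph type: {spec['graph_type']!r}")
    return spec


if __name__ == "__main__":
    spec = parse_graph_spec(EXAMPLE_SPEC)
    print(spec["graph_type"], spec["title"])
```

Validating the input before any plotting starts means a malformed file produces a short, readable error instead of a failure deep inside the plotting library.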

After loading the JSON data, the script will create the appropriate graph using the seaborn library. For the font, we prefer Arial Bold. The graph will then be saved in a vector format (preferably SVG).
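
To make the plotting step concrete, here is a minimal sketch for just one of the graph types (a regression plot). The function name save_regplot and its parameters are assumptions made for illustration. If Arial is not installed, matplotlib falls back to the next font in the list and prints a warning.

```python
import matplotlib

matplotlib.use("Agg")  # render to files without needing a display

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def save_regplot(x, y, title, x_label, y_label, out_path, use_color=True):
    """Draw a scatter plot with a regression line and save it as SVG."""
    sns.set_style("ticks")
    # Set fonts after set_style(), which would otherwise overwrite them.
    plt.rcParams.update({
        "font.family": "sans-serif",
        "font.sans-serif": ["Arial", "DejaVu Sans"],
        "font.weight": "bold",
        "axes.labelweight": "bold",
        "axes.titleweight": "bold",
    })

    fig, ax = plt.subplots(figsize=(4, 3))
    color = "tab:blue" if use_color else "black"
    sns.regplot(x=np.asarray(x, dtype=float), y=np.asarray(y, dtype=float),
                color=color, ax=ax)
    ax.set_title(title)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    sns.despine(ax=ax)

    fig.tight_layout()
    fig.savefig(out_path, format="svg")
    plt.close(fig)  # free the figure so repeated calls do not accumulate memory


if __name__ == "__main__":
    save_regplot([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1],
                 "Example fit", "Concentration", "Response", "example_regplot.svg")
```

The other graph types could follow the same pattern, with one small function per type and the shared font and style settings factored out.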

Here are the kinds of graphs we’re interested in:

  1. https://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.jointplot.html#seaborn.jointplot
    • Search: “Add regression and kernel density fits”
    • Search: “Replace the scatterplots and histograms with density estimates and align the marginal Axes tightly with the joint Axes”
  2. https://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.distplot.html#seaborn.distplot
    • Search: “Show a default plot with a kernel density estimate and histogram with bin size determined automatically with a reference rule.”
  3. https://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.kdeplot.html#seaborn.kdeplot
    • Search: “Use filled contours”
  4. https://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.regplot.html#seaborn.regplot
    • Search: “Plot the relationship between two variables in a DataFrame”
  5. https://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.boxplot.html#seaborn.boxplot
    • Search: “Use swarmplot() to show the datapoints on top of the boxes”
  6. https://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.barplot.html#seaborn.barplot
    • Search: “Add “caps” to the error bars”