ProteinVR is a web-based application that allows users to view protein/ligand structures in virtual reality (VR) from their mobile, desktop, or VR-headset-based web browsers. Molecular structures are displayed within 3D environments that give useful biological context and allow users to situate themselves in 3D space.

Gypsum-DL: A New Program that Prepares Molecular Libraries for Virtual Screening

Happy to announce that the Durrant Lab recently released Gypsum-DL, a program that prepares molecular libraries for virtual screening.

Structure-based virtual screening requires carefully prepared 3D models of potential small-molecule ligands. Existing commercial programs for virtual-library preparation have restrictive and/or expensive licenses. In contrast, Gypsum-DL is open source. It accepts virtual compound libraries in SMILES or flat SDF formats as input. For each molecule in the library, it enumerates appropriate ionization, tautomeric, chiral, cis/trans isomeric, and ring-conformational forms. As output, it produces an SDF file with 3D coordinates assigned. A copy is available free of charge under the terms of the Apache License, Version 2.0.
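To give a feel for the workflow, here is a hedged sketch. The script name and flags (run_gypsum_dl.py, --source, --output_folder) are assumptions based on the Gypsum-DL README; check `python run_gypsum_dl.py --help` against your copy.

```shell
# Create a tiny SMILES library (one "SMILES name" pair per line).
cat > my_library.smi <<'EOF'
CC(=O)Oc1ccccc1C(=O)O aspirin
CN1C=NC2=C1C(=O)N(C)C(=O)N2C caffeine
EOF

# Enumerate ionization/tautomeric/chiral/cis-trans/ring states and write
# 3D SDF models (commented out; flag names are assumptions as noted above):
# python run_gypsum_dl.py --source my_library.smi --output_folder ./gypsum_out
```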

Happy docking!

Gypsum-DL: an open-source program for preparing small-molecule libraries for structure-based virtual screening. Patrick J. Ropp, Jacob O. Spiegel, Jennifer L. Walker, Harrison Green, Guillermo A. Morales, Katherine A. Milliken, John J. Ringe & Jacob D. Durrant. Journal of Cheminformatics. Volume 11, Article number: 34 (2019)




PCAViz is an open-source Python/JavaScript toolkit for sharing and visualizing MD trajectories via a web browser. To encourage use, an easy-to-install PCAViz-powered WordPress plugin enables 'plug-and-play' trajectory visualization.




Gypsum-DL is a free, open-source program that converts 1D and 2D small-molecule representations (SMILES strings or flat SDF files) into 3D models. It outputs models with alternate ionization, tautomeric, chiral, cis/trans isomeric, and ring-conformational states.




Dimorphite-DL adds hydrogen atoms to molecular representations, as appropriate for a user-specified pH range. It is a fast, accurate, accessible, and modular open-source program for enumerating small-molecule ionization states.
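A hedged usage sketch follows. The script name and flags (--smiles_file, --min_ph, --max_ph, --output_file) are assumptions based on the Dimorphite-DL README; verify them with `--help` on your copy.

```shell
# A one-molecule input file ("SMILES name" per line).
echo "CC(=O)O acetic_acid" > to_protonate.smi

# Enumerate protonation states appropriate for pH 6.4-8.4 (commented out;
# flag names are assumptions as noted above):
# python dimorphite_dl.py --smiles_file to_protonate.smi \
#     --min_ph 6.4 --max_ph 8.4 --output_file protonated.smi
```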

Tips for Grad School

I recently got an email from a thoughtful undergraduate who’s going to grad school. He asked for some general advice. Here are my thoughts, for what they’re worth.

Whose Lab Should I Work In?

  1. When picking a lab, pick based on the lab’s principal investigator (PI) more than the specific project. Projects change. Look for a PI who will provide the supportive environment you need to grow as a scientist.
  2. Different PIs have different managerial styles. Some are micromanagers. Others are so hands off you might not even see them that often. Similarly, some students thrive when micromanaged, and others thrive when given complete independence. There’s not just one right way to manage a lab. But do be sure to find the right kind of PI for you.
  3. Labs with more senior PIs (tenured, for example) often have well established projects and hierarchies. A lot of students benefit from that. Senior PIs also tend to be better known in the community, which can be helpful when job-application time comes around.
  4. Labs with more junior PIs often (but not always!) afford students more opportunities to craft their own projects. Working for a younger professor often provides fun opportunities to help “grow the lab.” Younger PIs also tend to work more closely with grad students, though there are certainly exceptions. Excellent graduate students can also play important roles in helping a younger PI establish his or her lab. Letters of recommendation that cite concrete examples like that can be very powerful!

How Many Hours Should I Work as a Grad Student?

If you’re a graduate student, your career has begun! That means this is a full-time job. You should work at least 60 productive hours a week. Working more may lead to more career-advancing publications, but few can manage 80 hours a week. Don’t burn out!

There is a certain amount of random luck when it comes to getting a dream job after graduation. But there definitely is a correlation (even if it isn’t perfect) between how hard you work as a grad student and how well you do professionally afterwards. Even if your PI isn’t counting the minutes you’re in the lab, don’t cheat yourself by working less than you should.

Try to Get Your Own Funding

Even if your lab is well funded, try to get a fellowship of your own. Even if you don’t get it, it’s good writing practice. And, regardless of your career plans, that kind of thing is impressive on a graduate C.V.

Be Open to Jobs Outside of Academia

I worry that not enough graduate students consider careers outside of academia. I worry we don’t prepare them well enough for careers in industry. While I obviously love academia, there are many advantages to working elsewhere. Consider doing a summer internship with a biotech company during your PhD if your PI can spare you.

Work on Multiple Projects

Given that projects often fail, it’s good to work on multiple projects as a graduate student. Here’s some good advice from my graduate PI, J. Andrew McCammon:

Your main project at any given time should be one that is well-defined and that is solvable within 6 months. As you become established you may want to add a second project of medium difficulty, and then one of greater difficulty that would have a very great impact if completed… As soon as your main, straightforward project is underway, start on your second, slightly more ambitious project. If you run into problems with one project, you can set it aside for a day or two while you work on the other one. Then you can take a fresh look at the troublesome project.

At the same time, don’t take on so many projects that you can’t finish any of them. An unfinished project has the same career-advancing value as no project.

Spend Time Both Reading and Writing

It’s easy to focus so much on doing experiments that you don’t spend time reading about others’ science. Writing down your own research plans also keeps you organized and makes it easier to write manuscripts when the time comes. Don’t neglect your scientific reading and writing!

Manage Your Time Carefully

Again, from Dr. McCammon:

Set daily, weekly and long-term goals. Review your goals and progress frequently. If you are consistently falling short of your goals, try keeping track of how you actually spend your time for a few days. Use this inventory to help plan adjustments. Know when to be meticulous and when you can be sloppier. If you’re banging your head against an obstacle in your research, step back, rethink the problem and try to go around the obstacle instead of through it.

Don’t Get Discouraged!

Grad school can be very difficult. There certainly are days and weeks where you will feel like you’re not getting anything done. But if you’re persistent, things will move forward eventually. Don’t get discouraged!

I hope these ideas help!




BlendMol is a Blender plugin that can easily import VMD 'Visualization State' and PyMOL 'Session' files. BlendMol empowers scientific researchers and artists by marrying molecular visualization and industry-standard rendering techniques. The plugin works seamlessly with popular analysis programs (i.e., VMD/PyMOL). Users can import into Blender the very molecular representations they set up in VMD/PyMOL.

MD Simulations: Analysis and Ideas

The purpose of this document is to briefly describe how to analyze an MD simulation in layman’s terms. It also offers general simulation ideas.

Analysis Techniques

Root Mean Square Deviation

How similar is the shape of each simulated protein conformation to some reference conformation? The conformation of the first simulation frame is a good choice for the reference. The RMSD should start to stabilize after you’ve been simulating for a while, indicating that the system is properly equilibrated. (Link)
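For reference, with N aligned atoms, the usual definition compares each atom’s position in frame t to its position in the reference structure:

```latex
% RMSD of frame t relative to the reference, after structural alignment:
\mathrm{RMSD}(t) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left\lVert \mathbf{r}_i(t) - \mathbf{r}_i^{\mathrm{ref}} \right\rVert^2}
```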

Root Mean Square Fluctuation

How wiggly is each protein residue (e.g., amino acid) over the course of the simulation? (Link)
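Concretely, the RMSF of atom (or residue) i is its fluctuation about its own trajectory-averaged position:

```latex
% RMSF: time-averaged deviation of atom i from its mean position
\mathrm{RMSF}_i = \sqrt{\left\langle \left\lVert \mathbf{r}_i(t) - \langle \mathbf{r}_i \rangle \right\rVert^2 \right\rangle_t}
```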

Clustering

By eliminating protein conformations that are very similar, clustering generates an ensemble of particularly distinct conformations. (Link)

FTMAP Hotspot Analysis

You can use FTMAP to identify druggable hotspots. One approach is to apply FTMAP to representative structures identified through clustering, to account for protein flexibility.

Relaxed-Complex Scheme

Docking small molecules into various conformations extracted from the simulation via clustering is also a good approach for identifying novel chemical probes. This kind of virtual screen is called the “relaxed complex scheme.”

Ensemble Electrostatics

The electrostatic potentials surrounding the protein can determine how some small molecules bind. You can also calculate ensemble-averaged versions of those potentials that may be more realistic. APBS and Delphi are two programs for calculating potentials.

Principal Component Analysis

Protein motions are very complex. PCA presents a simplified representation of these motions. Some of the minor motions are lost, but the larger-scale motions are still represented. You can project simulated conformations onto the first two principal components (2D graph of the complex motions), or you can morph a model of the protein itself according to the principal components. (Link)
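In math terms (a sketch of the standard construction): PCA diagonalizes the covariance matrix of the flattened atomic coordinates, and projecting each frame onto the top eigenvectors gives the axes of that 2D graph:

```latex
% Covariance of atomic positions over the trajectory (x_t = frame t's
% flattened coordinates); principal components are its eigenvectors v_k:
C = \left\langle (\mathbf{x}_t - \bar{\mathbf{x}})(\mathbf{x}_t - \bar{\mathbf{x}})^{T} \right\rangle_t,
\qquad C\,\mathbf{v}_k = \lambda_k \mathbf{v}_k

% Projection of frame t onto component k (one axis of the 2D plot):
a_k(t) = \mathbf{v}_k \cdot (\mathbf{x}_t - \bar{\mathbf{x}})
```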

Distance-Based Measurements

It can be helpful to measure the distance between two atoms over the course of a simulation. For example, you can monitor the distance between carboxylate and amine groups to see hydrogen bonds forming and breaking.
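The arithmetic itself is just the Euclidean distance between the two atoms in each frame. A minimal sketch (the coordinates here are made up; in practice you’d pull them from trajectory frames):

```shell
# Distance between two atoms given their xyz coordinates (x1 y1 z1 x2 y2 z2):
echo "1.0 2.0 3.0  4.0 6.0 3.0" | awk '{
  dx = $1 - $4; dy = $2 - $5; dz = $3 - $6
  printf "%.2f\n", sqrt(dx*dx + dy*dy + dz*dz)
}'
# prints: 5.00
```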

Measuring Pocket Volumes

Using the POVME algorithm, you can measure the volume of a given pocket over the course of the trajectory. The conformations with the largest pocket volumes are sometimes the most useful for drug-discovery projects. POVME is also good at identifying cryptic pockets.

Analyze Hydrogen-Bond Networks

What somewhat-distant residues might influence the motions of the binding pocket through hydrogen-bond networks? HBonanza is a tool for measuring just that.

Pathways of Correlated Motions

You can also analyze the pathways of correlated motions that might connect distant residues. This can sometimes reveal allosteric mechanisms. WISP is the tool to use.

Kinds of MD

There are endless flavors of MD. I’m going to put a list here, in case you want to research further. (If you do, please feel free to send brief summaries of each method so I can paste them here!)

  • Regular (vanilla) MD.
  • Accelerated MD (McCammon group)
  • Coarse-grained MD (awesome, though I haven’t done much with it)
  • Metadynamics
  • Replica exchange MD
  • Umbrella sampling
  • Markov-State-Model guided MD (amazing)
  • WESTPA (would like to start using)
  • Implicit-solvent MD
  • Brownian dynamics (not really MD, but relevant)

Please send additional methods and/or descriptions if you get a chance. Thanks!

Reasons to Simulate

“Could there be cryptic binding pockets in my protein that aren’t evident in any crystal structure?”

“I’d like to use multiple protein conformations for drug discovery, but all the crystal structures look the same. I’ll simulate to get more conformations!”

“I want to use something better than a docking score to predict ligand binding. Why not use an alchemical method like free-energy perturbation or thermodynamic integration?”

“Could understanding the dynamics of a certain region of the protein reveal its molecular mechanism?”

“I’ve got two similar proteins, but they do different things. Maybe they are evolutionarily related, or one is a mutant of the other. Can dynamics reveal why they behave differently?”

“My protein has some crazy allosteric mechanism. Might MD simulations reveal the subtle shifts in correlated residue motions that transmit the allosteric signal?”

“I want to engineer a protein to do something new. Can I predict how mutations will change its function before I make the actual protein and test it experimentally?”

“I think I know how a small molecule binds to my protein, but I’m not sure. If I simulate it, will it slip in the binding pocket (probably the wrong pose!), or will its pose remain stable?”

Please send more ideas! I want this page to help with future brainstorming.

Reasons not to Simulate

“I want to see some large conformational changes.” It ain’t going to happen on MD timescales.

“I want to know something for certain.” You always need experimental validation. You can test something from the literature (already experimentally demonstrated), or you can get a collaborator for prospective validation.

“I want to simulate some huge system.” Probably not going to happen, unless you can get a PRAC.

Running Calculations on CRC Resources


This brief tutorial shows how to run calculations on Center for Research Computing (CRC) resources.

1. Copy Your Files to the CRC

CRC resources run on Unix. If you’re also running some form of Unix (e.g., Ubuntu, macOS), it’s easiest to use the rsync command. rsync copies files between computers intelligently. For example, it only copies files if they don’t already exist on the remote computer. Here are some examples of use:

rsync -vrz /local/directory/with/files my_user_name@<crc-url>:/crc/destination/directory/

Let’s parse that command line.

  1. rsync: The Unix command.
  2. -vrz: Tell me everything you’re doing verbosely, recurse into subdirectories, and compress files before you transfer them. Note: If you’re transferring files over a fast network, compressing them may actually take more time than just transferring them uncompressed.
  3. /local/directory/with/files: The path to the directory on your computer that you want to copy to the CRC.
  4. my_user_name: Your username on the CRC system.
  5. <crc-url>: The URL that points to the CRC (a placeholder here; substitute your center’s actual address).
  6. /crc/destination/directory/: The remote CRC directory where you want to copy your directory/files.

So the above command will copy the /local/directory/with/files directory and all of its contents to /crc/destination/directory/files/ on the CRC.

You can also copy single files. In that case, the -r flag isn’t needed:

rsync -vz /local/file.ext my_user_name@<crc-url>:/crc/destination/directory/

You can also use the open-source GUI program Filezilla to copy files to the CRC. You’ll need to use SFTP (SSH File Transfer Protocol). But Filezilla doesn’t use rsync’s intelligent transfer strategy, so transfers will often take longer.

2. Log into the CRC

With your files copied, you are ready to log into the CRC. Logging in will allow you to use CRC’s Unix as if you were sitting at one of their terminals. We use the ssh command:

ssh my_user_name@<crc-url>
Once you enter your password, you’ll be able to type Unix commands that will be run on CRC’s system, not your local computer.

3. Change to the CRC Directory Where You will Run your Calculations

It’s just Unix, so you’ve got this:

cd /crc/destination/directory/files/

4. Make a Submission Script

Supercomputers and computer clusters are organized differently than laptops and desktops. The CRC computer you log into is called the login node. It’s used for file copying/moving, limited file processing, and telling the CRC system how to run your heavy-duty calculations. But don’t actually run those calculations on the login node!

Other CRC users are logged into the same node, though you can’t see them. If you run heavy calculations on the login node, it will slow everyone down. They might think to themselves, “Who is the jerk hogging the login node?” You might even get an angry email from the system administrator. Of course this has never happened to me.

Fortunately, there are many compute nodes connected to the login node. That’s where you want to actually run your calculations. But you need to tell the login node how to farm your calculations out to the compute nodes. That’s what a submission script is for.

There are several programs that manage compute jobs on supercomputers. Common ones include SLURM and PBS. CRC happens to use SLURM. Let’s parse a SLURM submission script for running a NAMD molecular dynamics simulation.

NAMD Submission Script

#!/bin/bash
#SBATCH --job-name=shroom2
#SBATCH --output=shroom2.out
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=28
#SBATCH --time=72:00:00
#SBATCH --cluster=mpi
#SBATCH --partition=opa

module purge
module load intel/2017.1.132 intel-mpi/2017.1.132
module load namd/2.12


mpirun -n $SLURM_NTASKS namd2 prod19.conf > prod19.conf.out

The SLURM Header

  1. #!/bin/bash: Tells CRC we’ll be using bash commands.
  2. #SBATCH --job-name=shroom2: The name of our job will be “shroom2”.
  3. #SBATCH --output=shroom2.out: Write the job’s screen output (including any errors) to a file called shroom2.out.
  4. #SBATCH --nodes=8: Ask for 8 compute nodes. Note that many jobs can only run on a single node (e.g., AutoDock Vina). MPI NAMD happens to be able to spread its calculations over multiple nodes, if it’s compiled as an MPI-compatible executable (details not so important).
  5. #SBATCH --ntasks-per-node=28: Ask for 28 processors on each node. It’s kind of like how our computer bob has 56 processors. Each compute node has multiple processors too.
  6. #SBATCH --time=72:00:00: Let the job run for 72 hours. The system won’t let you be a total compute hog. You can’t just let jobs run forever. Try to be as accurate as you can in guessing how long your calculation will take, and then add a little time. Shorter jobs start running faster, so there’s an advantage to accurately estimating how long your calculations will take.
  7. #SBATCH --cluster=mpi: CRC has several clusters, or “compute-node clubs.” Each cluster is made of compute nodes designed to accommodate different kinds of jobs. The mpi cluster is good for jobs that spread over multiple compute nodes, like MPI NAMD jobs.
  8. #SBATCH --partition=opa: Not sure… good to check CRC documentation.

Loading Modules

Most people don’t use NAMD, so it doesn’t make sense to load it for every user. CRC uses a module system so users can load only the programs they need.

  1. module purge: Get rid of any currently loaded modules.
  2. module load intel/2017.1.132 intel-mpi/2017.1.132: Load modules NAMD needs.
  3. module load namd/2.12: Load the NAMD module itself.

Running your Program

Now you type the commands you want to run on the compute node, to actually perform your calculation. MPI NAMD needs some environment variables set.


The MPI NAMD command is a bit complicated:

mpirun -n $SLURM_NTASKS namd2 my_namd_conf_file.conf > my_namd_conf_file.conf.out

  1. mpirun: An executable program for running other executable programs such that they can take advantage of multiple nodes. NAMD happens to be able to do just that.
  2. -n $SLURM_NTASKS: The number of processors to use. SLURM itself provides this information via the $SLURM_NTASKS variable, which is set based on the parameters in the submission-script header.
  3. namd2: The NAMD executable, available because you loaded the namd/2.12 module.
  4. my_namd_conf_file.conf: The NAMD config file.
  5. >: A Unix redirection operator that says “put whatever you were going to print to the screen in a text file instead.”
  6. my_namd_conf_file.conf.out: The text file where the output will be written.
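One sanity check on point 2: with the header above, $SLURM_NTASKS works out to 8 nodes × 28 tasks per node = 224. The variable only exists inside a running job, so if you poke at it from the login node, use a default expansion:

```shell
# Inside a job, SLURM sets SLURM_NTASKS to nodes * ntasks-per-node
# (8 * 28 = 224 for the header above). Outside a job it is unset:
echo "ntasks: ${SLURM_NTASKS:-unset}"
```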

Other Submission-Script Examples

Here we’ll put examples of other submission scripts in the future.
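In the meantime, here is a rough sketch of what a single-node script (say, for AutoDock Vina, which can’t span nodes) might look like. The module name and resource values are placeholders, not tested CRC settings; check CRC’s documentation and `module avail` before using anything like this.

```shell
#!/bin/bash
#SBATCH --job-name=vina_screen
#SBATCH --output=vina_screen.out
#SBATCH --nodes=1                 # Vina runs on a single node
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=16        # Vina uses threads within that node
#SBATCH --time=24:00:00

module purge
module load vina                  # placeholder module name; check `module avail`

vina --config dock.conf --cpu $SLURM_CPUS_PER_TASK > dock.out
```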

Submitting the Job

You’ve created the submission script, so now tell SLURM to use it. The sbatch command does just that:

sbatch <your-submission-script>
Your job heads off to the SLURM scheduler, which figures out when to start running it based on the parameters in your header and the needs of other CRC users.

How Goes My Job?

You can check on the progress of your job using CRC’s wrapper command, which formats SLURM’s own squeue command. Thanks, CRC uber nerds! Here’s the output of that command:

  JOBID PAR                                NAME ST         TIME  NODES CPUS     NODELIST(REASON)
  31629 opa                             shroom2  R         0:05      8  224         opa-n[60-67]
Take a look at the ST (state) column. “R” means our job is running on the compute nodes. I think “PD” means it’s pending (waiting to run), and “C” means it’s been canceled. If you don’t see your job listed, it must have already finished: either it completed, it ran into an error, or it ran out of its allotted time.
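If you ever want to grab a job’s state programmatically (say, in a monitoring script), you can parse squeue-style output with awk. A sketch using a saved snapshot of the listing above:

```shell
# Save a squeue-style snapshot, then pull the ST column for job 31629.
cat > squeue_snapshot.txt <<'EOF'
  JOBID PAR                                NAME ST         TIME  NODES CPUS     NODELIST(REASON)
  31629 opa                             shroom2  R         0:05      8  224         opa-n[60-67]
EOF

awk '$1 == "31629" { print $4 }' squeue_snapshot.txt   # prints: R
```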

What if I want to Cancel my Job?

SLURM’s scancel command is good at that. The CRC uber nerds also made their own wrapper, which works even better on their system.

Copying Output Files Back to My Local Computer

Once your calculation is done, you can copy all the files back to your local computer using the rsync command. Let’s say you’re using your local computer’s Unix command line and want to copy the files from the CRC to your computer:

rsync -vrz my_user_name@<crc-url>:/crc/destination/directory/ /local/directory/

FYI: In a perfect world, if you were using CRC’s login node, you could also copy files to your local computer:

rsync -vrz /crc/destination/directory my_user_name@<your-computer-address>:/local/directory/

But, if you’re using bob the computer, good luck getting past the firewall! It’s much easier to use bob‘s command line to copy files from the CRC, rather than using the CRC’s command line to copy files to bob.

Reminder: You can also use Filezilla, but I don’t recommend it.


The Yeast-Beast Pipeline

We here in the Durrant Lab are excited to be adding an experimental component to our work! Graduate student Jennifer Walker, a.k.a. the “yeast whisperer,” has been setting up yeast-evolution experiments. Our goal is to force yeast to evolve resistance to anti-cancer and anti-parasitic drugs.

Resistance-conferring mutations often affect the drug-binding protein (i.e., the drug “target”). To identify the target, we’ll sequence the genomes of the resistant yeast to discover which protein has changed. Identifying a drug’s target in yeast will teach us about the same target in cancer or parasite cells.

We will leverage this information to improve our computational techniques. The goal is to better understand drug mechanisms of action and to identify new small molecules that are perhaps even more potent.

Jen took some neat pictures of her ongoing work. Hope you enjoy!

