Gypsum-DL forum: a few problems…
- August 30, 2020 at 5:47 am #15144
Christian Le Gouill
Guest
Hello,
I tried Gypsum-DL with 3 different libraries.
One of 5000 cmpds from ChemBridge. It did go through, but these structures were not converted:
CC1=CC=CN2N=CC(C(=O)NC34CC5CC(C3)CC(C5)(C4)N3C=NC=N3)=C12
CCC12CC3CC(O)(C1)CC(C3)(C2)C(=O)N1CCN(CC1)C1CCOCC1
CCC12CC3CC(O)(C1)CC(C3)(C2)C(=O)N1CCN(CC1)C1=NC=CN=C1
CCC12CC3CC(O)(C1)CC(C3)(C2)C(=O)N1CCC(CC1)NS(C)(=O)=O
One library from SPECS of 37000 cmpds, prepared in DataWarrior, with the .smi file prepared with Instant JChem (ChemAxon). The conversion of this library with Gypsum hangs at one point and does not go through, with no error message to indicate what the problem is.
One library of 250k. If I try to convert it whole, only one of the 12 available threads (6 cores with 2 threads each) is used, so I had to stop it, as it would have taken a month to go through. So I split it into chunks of 50k, and then all threads were used; however, at one point I get the message "Killed". I split it into chunks of 25k, same problem.
During the conversion of the 37k and 250k libraries, I also get these messages:
Detected unusual substructure: C=C([O-])[OH]
Detected unusual substructure: C(=[CH2])[OH]
Detected unusual substructure: [C-]
Here is the command line I use:
python run_gypsum_dl.py --source Smi-Specs_Divsersity_35000-cmpds.smi --min_ph 7.4 --max_ph 7.4 --pka_precision 1 --job_manager multiprocessing --num_processors -1 --use_durrant_lab_filters
Best,
Christian
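The chunking step mentioned in this post (splitting the 250k library into 50k or 25k pieces) could be scripted roughly as follows. This is only a minimal sketch using the Python standard library and is not part of Gypsum-DL: the function name `split_smiles_file` and the `_partNNN.smi` naming scheme are made up for illustration, and it assumes the input holds one compound per line.

```python
from pathlib import Path


def split_smiles_file(source, chunk_size, out_dir):
    """Split a SMILES file into numbered chunks of at most chunk_size lines.

    Hypothetical helper for illustration; assumes one compound per line.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    source = Path(source)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Drop blank lines so they do not become empty "compounds" in a chunk.
    lines = [ln for ln in source.read_text().splitlines() if ln.strip()]

    chunk_paths = []
    for start in range(0, len(lines), chunk_size):
        chunk = lines[start:start + chunk_size]
        # Name chunks <stem>_part001.smi, <stem>_part002.smi, ...
        path = out_dir / f"{source.stem}_part{len(chunk_paths) + 1:03d}.smi"
        path.write_text("\n".join(chunk) + "\n")
        chunk_paths.append(path)
    return chunk_paths
```

Each resulting file can then be given to `--source` in a separate run. Smaller chunks reduce the work per run, but they do not by themselves guarantee that a run fits in the available memory.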
- August 31, 2020 at 2:51 am #15531
Christian Le Gouill
Guest
Hello,
I was not allocating enough RAM to Gypsum-DL. Now I no longer have any problem with the program quitting suddenly with the message "Killed".
I found 2 structures that were creating problems in one of my libraries (see other post). I removed them and everything is fine now.
Best Regards,
Christian
- September 13, 2020 at 12:58 am #15885
Jacob Durrant
Keymaster
Hi Christian. Sorry for my delay in getting back to you, and many thanks for bringing these issues to my attention. Excellent that you were able to debug the memory issue. It seems likely that others will run into this problem too, so I added a line to the README.md file (to be included in future versions): https://git.durrantlab.pitt.edu/jdurrant/gypsum_dl/-/blob/master/README.md#memory-considerations
Regarding the four ChemBridge compounds, I’m not surprised that Gypsum struggled to process them. They all contain adamantane substructures. I think what’s happening is that RDKit often fails to generate acceptable 3D coordinates for these constrained-ring structures, so Gypsum ends up throwing them out. In some cases, none of the generated structures are acceptable. To test this theory, I processed the following SMILES strings:
CC1=CC=CN2N=CC(C(=O)NC34CC5CC(C3)CC(C5)(C4)N3C=NC=N3)=C12
CCC12CC3CC(O)(C1)CC(C3)(C2)C(=O)N1CCN(CC1)C1CCOCC1
CCC12CC3CC(O)(C1)CC(C3)(C2)C(=O)N1CCN(CC1)C1=NC=CN=C1
CCC12CC3CC(O)(C1)CC(C3)(C2)C(=O)N1CCC(CC1)NS(C)(=O)=O
With these parameters:
{
    "source": "t2.smi",
    "separate_output_files": true,
    "job_manager": "multiprocessing",
    "output_folder": "gypsum_dl_test_output_test2_mult/",
    "add_pdb_output": true,
    "num_processors": -1,
    "min_ph": 7.4,
    "max_ph": 7.4,
    "pka_precision": 1,
    "use_durrant_lab_filters": true,
    "thoroughness": 3,
    "max_variants_per_compound": 5
}
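For anyone who wants to reproduce this test, a parameter set like the one above can be generated from a script instead of being edited by hand. The following is a minimal sketch, not code from Gypsum-DL: the helper names `write_params` and `with_overrides` are made up for illustration, and the values are simply copied from the parameters listed above.

```python
import json
from pathlib import Path

# Parameter values copied from the post above.
GYPSUM_PARAMS = {
    "source": "t2.smi",
    "separate_output_files": True,
    "job_manager": "multiprocessing",
    "output_folder": "gypsum_dl_test_output_test2_mult/",
    "add_pdb_output": True,
    "num_processors": -1,
    "min_ph": 7.4,
    "max_ph": 7.4,
    "pka_precision": 1,
    "use_durrant_lab_filters": True,
    "thoroughness": 3,
    "max_variants_per_compound": 5,
}


def write_params(params, path):
    """Serialize a parameter dict to a JSON file and return its path."""
    path = Path(path)
    path.write_text(json.dumps(params, indent=4) + "\n")
    return path


def with_overrides(params, **overrides):
    """Return a copy of params with selected values replaced.

    Rejects keys that are not already present, to catch typos early.
    """
    unknown = set(overrides) - set(params)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    return {**params, **overrides}
```

Calling `write_params(GYPSUM_PARAMS, "params.json")` produces a file that, as far as I understand the README, can be passed to the program with `python run_gypsum_dl.py --json params.json`; check the option name against the version you have installed.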
Three of the four compounds failed.
But when I increased the thoroughness to 6, even when max_variants_per_compound was 1 to speed things up, only one compound failed, presumably because with these settings Gypsum had more tries to get it right. I added some notes here: https://git.durrantlab.pitt.edu/jdurrant/gypsum_dl/-/blob/master/README.md#highly-constrained-ring-systems
Regarding the unusual substructures, these refer to MolVS-generated tautomers that Gypsum will discard. MolVS sometimes generates implausible tautomers, and throwing out inappropriate ones after the fact seemed like the best approach. Other, better-behaved tautomers generated from the parent compound could well be retained, though.
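To get a feel for how often each discarded tautomer pattern turns up in a large run, the "Detected unusual substructure" messages can be tallied from captured console output. This is a minimal sketch under two assumptions: that the output was saved to a text file, and that the messages have exactly the format quoted in the first post. `tally_unusual_substructures` is a made-up helper, not a Gypsum-DL function.

```python
import re
from collections import Counter

# Matches lines such as "Detected unusual substructure: [C-]" and
# captures the pattern that follows the colon.
UNUSUAL_RE = re.compile(r"Detected unusual substructure:\s*(\S+)")


def tally_unusual_substructures(log_text):
    """Count how many times each reported substructure appears in log_text."""
    return Counter(UNUSUAL_RE.findall(log_text))
```

For example, `tally_unusual_substructures(open("gypsum_run.log").read()).most_common()` would list the reported patterns from most to least frequent, where `gypsum_run.log` is a hypothetical file name for the saved output.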
Thanks for all your help with this. It’s very helpful to have user feedback to improve the program.
Take care,
Jacob
- September 15, 2020 at 3:12 pm #15949
Christian Le Gouill
Guest
Hello,
Increasing thoroughness works well with most of the rejected cmpds. Thank you for your help and for creating such a useful tool.
Best,
Christian
- September 27, 2020 at 3:13 am #16247
Jacob Durrant
Keymaster
Hi Christian. Re. the memory issue, you might want to try the latest version of Gypsum. I made some updates in version 1.1.7 that should reduce the amount of memory required. Take care.