Phosphate issues in Autogrow


    • #14954
      Ashim
      Guest

      Hi,
      Great tool, thanks for making this.
      I have been trying to generate nucleotide analogues in AutoGrow4.
      In the trial run, I used SMILES strings formatted by the PubChem standardisation web server:


      C1=C(C2=C(N1)C(=NC=N2)N)C3C(C(C(N3)COP(=O)(O)OP(=O)(O)OP(=O)(O)O)O)O GalidesivirTP
      C1=C2C(=NC=NN2C(=C1)[C@@]3(C@HO)C#N)N RemdesivirTP
      C1=NC2=C(N1COCCOP(=O)(O)OP(=O)(O)OP(=O)(O)O)N=C(NC2=O)N AcyclovirTP
      C1=NC(=C2C(=N1)N(C=N2)CCOCCP(=O)(O)OP(=O)(O)OP(=O)(O)O)N AdefovirTP
      CC@@HOCCP(=O)(O)OP(=O)(O)OP(=O)(O)O TenofovirTP

      In the generated molecules, however, a considerable number of the phosphate groups have a phosphorus with a valency of 7. For example, this SMILES: Nc1nc2c([nH+]cn2COCCO[P@](=O)(O)O[PH](=O)(=O)OP(=O)([O-])[O-])c(=O)[n-]1
      (Note the [PH] in the SMILES. ChemSketch also flags that PH as an issue.)

      I think there is some ligand-conversion issue I’m missing here. The parameters (vars file) are as follows:
      {
      "nn1_script": "/mnt/d/Projects/DDH2020/Software/autogrow4/autogrow/docking/scoring/nn_score_exe/nnscore1/NNScore.py",
      "nn2_script": "/mnt/d/Projects/DDH2020/Software/autogrow4/autogrow/docking/scoring/nn_score_exe/nnscore2/NNScore2.py",
      "conversion_choice": "MGLToolsConversion",
      "obabel_path": "obabel",
      "custom_conversion_script": "",
      "prepare_ligand4.py": "/mnt/d/Projects/DDH2020/Software/mgltools_x86_64Linux2_1.5.6/MGLToolsPckgs/AutoDockTools/Utilities24/prepare_ligand4.py",
      "prepare_receptor4.py": "/mnt/d/Projects/DDH2020/Software/mgltools_x86_64Linux2_1.5.6/MGLToolsPckgs/AutoDockTools/Utilities24/prepare_receptor4.py",
      "mgl_python": "/mnt/d/Projects/DDH2020/Software/mgltools_x86_64Linux2_1.5.6/bin/pythonsh",
      "start_a_new_run": true,
      "max_time_mcs_prescreen": 1,
      "max_time_mcs_thorough": 1,
      "min_atom_match_mcs": 4,
      "protanate_step": false,
      "rxn_library": "all_rxns",
      "rxn_library_file": "",
      "function_group_library": "",
      "complementary_mol_directory": "",
      "number_of_processors": -1,
      "multithread_mode": "multithreading",
      "selector_choice": "Rank_Selector",
      "tourn_size": 0.1,
      "top_mols_to_seed_next_generation_first_generation": 5,
      "top_mols_to_seed_next_generation": 5,
      "diversity_mols_to_seed_first_generation": 2,
      "diversity_seed_depreciation_per_gen": 1,
      "filter_source_compounds": false,
      "use_docked_source_compounds": true,
      "num_generations": 5,
      "number_of_crossovers_first_generation": 5,
      "number_of_mutants_first_generation": 5,
      "number_of_crossovers": 10,
      "number_of_mutants": 10,
      "number_elitism_advance_from_previous_gen": 2,
      "number_elitism_advance_from_previous_gen_first_generation": 2,
      "redock_elite_from_previous_gen": false,
      "LipinskiStrictFilter": false,
      "LipinskiLenientFilter": true,
      "GhoseFilter": false,
      "GhoseModifiedFilter": false,
      "MozziconacciFilter": false,
      "VandeWaterbeemdFilter": false,
      "PAINSFilter": false,
      "NIHFilter": false,
      "BRENKFilter": false,
      "No_Filters": false,
      "alternative_filter": null,
      "dock_choice": "QuickVina2Docking",
      "docking_executable": null,
      "docking_exhaustiveness": 15,
      "docking_num_modes": null,
      "docking_timeout_limit": 600,
      "custom_docking_script": "",
      "scoring_choice": "VINA",
      "rescore_lig_efficiency": false,
      "custom_scoring_script": "",
      "max_variants_per_compound": 2,
      "gypsum_thoroughness": 3,
      "min_ph": 6.4,
      "max_ph": 8.4,
      "pka_precision": 1.0,
      "gypsum_timeout_limit": 60,
      "debug_mode": false,
      "reduce_files_sizes": false,
      "generate_plot": true,
      "timeout_vs_gtimeout": "timeout",
      "filename_of_receptor": "/mnt/d/Projects/DDH2020/NucleotideLib/ligand.pdb",
      "center_x": 91.268,
      "center_y": 92.146,
      "center_z": 103.829,
      "size_x": 15.0,
      "size_y": 15.0,
      "size_z": 15.0,
      "source_compound_file": "/mnt/d/Projects/DDH2020/NucleotideLib/Nts.smi",
      "root_output_folder": "/mnt/d/Projects/DDH2020/NucleotideLib/Out1/",
      "mgltools_directory": "/mnt/d/Projects/DDH2020/Software/mgltools_x86_64Linux2_1.5.6/",
      "chosen_ligand_filters": [
      "LipinskiLenientFilter"
      ],
      "output_directory": "/mnt/d/Projects/DDH2020/NucleotideLib/Out1/Run_0/"
      }

      Kindly let me know of any way to handle this, as otherwise I’ll have to manually curate the data generated.
      Regards
      Ashim

    • #15137
      Jacob Durrant
      Keymaster

      Hi Ashim. Many thanks for your post. I did some investigating and found the source of the problem. AutoGrow4 uses Gypsum-DL to generate 3D molecules from SMILES strings, and Gypsum-DL in turn uses MolVS to identify alternative tautomeric forms. MolVS generally does a good job, but it occasionally creates inappropriate tautomers. That’s exactly what happened in this case.

      Gypsum-DL includes a parameter, --use_durrant_lab_filters, that allows users to discard SMILES with bad tautomers from a pre-defined list. I apparently ran into the same problem myself some time ago, because I found this line in the Gypsum-DL code:

      "O=[PH](=O)([#8])([#8])", # molvs does odd tautomer: OP(O)(O)=O => O=[PH](=O)(O)O

      That’s the exact substructure you’re seeing.
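
      For output that has already been generated, a rough text screen can flag the affected SMILES. The sketch below is only a heuristic and is not how Gypsum-DL applies its filter: it assumes the bad phosphorus is always written as a bracket atom such as [PH], and the function names are made up for illustration. A proper SMARTS match with a cheminformatics toolkit would be more reliable, since a plain [PH] can also occur in legitimate compounds.

```python
import re

# Assumption: the suspicious phosphorus appears as a bracket atom "[PH]",
# optionally with a chirality mark ("[P@H]" or "[P@@H]"). This is a text
# heuristic, not a true substructure search.
SUSPECT_P = re.compile(r"\[P@{0,2}H\]")


def has_suspect_phosphorus(smiles: str) -> bool:
    """Return True if the SMILES text contains a [PH]-style bracket atom."""
    return SUSPECT_P.search(smiles) is not None


def split_smiles(lines):
    """Partition .smi-style lines ("SMILES name") into (kept, flagged)."""
    kept, flagged = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        smiles = line.split()[0]  # first column is the SMILES string
        if has_suspect_phosphorus(smiles):
            flagged.append(line)
        else:
            kept.append(line)
    return kept, flagged


if __name__ == "__main__":
    examples = [
        "Nc1nc2c([nH+]cn2COCCO[P@](=O)(O)O[PH](=O)(=O)OP(=O)([O-])[O-])c(=O)[n-]1 bad_variant",
        "C1=NC2=C(N1COCCOP(=O)(O)OP(=O)(O)OP(=O)(O)O)N=C(NC2=O)N AcyclovirTP",
    ]
    kept, flagged = split_smiles(examples)
    print("kept:", kept)
    print("flagged:", flagged)
```

      The first example line is the SMILES from the original post, and the second is the AcyclovirTP input. Only the first is flagged.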

      So I think if you add "use_durrant_lab_filters": true to your AutoGrow4 JSON file, that will eliminate these bad tautomers.

      In testing Gypsum-DL with your "max_variants_per_compound": 2 setting, though, I realized you might run into a second problem. Often both of the variants will contain the bad substructure, so use_durrant_lab_filters will eliminate both. To improve the chances that the program will identify variants of Acyclovir that survive the durrant-lab filters, I recommend increasing max_variants_per_compound. When I used a value of 5, I got several good SMILES strings.
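
      The two changes to the vars file can also be applied with a short script. This is a minimal sketch, assuming the vars file is plain JSON like the one posted above. The helper name patch_vars is hypothetical and is not part of AutoGrow4; only the two parameter names come from this thread.

```python
import json
from pathlib import Path


def patch_vars(vars_path, max_variants=5):
    """Enable the durrant-lab filters and raise the variant count in a vars file.

    patch_vars is a hypothetical helper for illustration, not an AutoGrow4 function.
    """
    path = Path(vars_path)
    params = json.loads(path.read_text())

    # Discard variants that match the pre-defined list of bad substructures.
    params["use_durrant_lab_filters"] = True
    # Generate more variants so that some survive the filter.
    params["max_variants_per_compound"] = max_variants

    # Overwrite the file in place; all other parameters are left unchanged.
    path.write_text(json.dumps(params, indent=4))
    return params
```

      Calling patch_vars with the path to your own vars file overwrites that file in place, so keep a copy of the original if you want to compare runs.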

      We’ve considered turning the durrant-lab filters on by default, but decided against it in order to maintain backwards compatibility. It might be time for us to reconsider that decision, though, to avoid problems like these!

      Hope this helps. Take care.

      ~Jacob

    • #15146
      Ashim
      Guest

      Thank you for this detailed response.

      It seems that my copy of AutoGrow’s durrant-lab filter file did not have this particular substructure in its list of substructures to remove, so I added it manually. Doing that, and changing max_variants_per_compound to 5, gives me structures well suited to my task.

      Thanks again
      Regards
      Ashim

    • #15468
      Jacob Durrant
      Keymaster

      Many thanks for posting your solution, Ashim. This substructure was added to AutoGrow4 about two months ago, with the 4.0.2 update. So if your copy was older than that, it wouldn’t have had this feature yet. Take care.

      https://git.durrantlab.pitt.edu/jdurrant/autogrow4/-/blob/4.0.2/autogrow/operators/convert_files/gypsum_dl/gypsum_dl/Steps/SMILES/DurrantLabFilter.py#L42
