Chem-Ant Introduction
Chem-Ant selects material candidates for producing molecules similar to a target molecule, using MCTS Solver and Genetic Programming.
similarity_ant.py is based on the code of deap/examples/gp/ant.py.
Requirements
On Ubuntu:
sudo -H -s
apt install python3-pip
pip3 install -r requirements.txt
exit
Or:
pip3 install deap
pip3 install mcts
pip3 install rdkit
pip3 install global_chem_extensions
pip3 install mcts-solver
Or:
pip3 install chem-ant
If you want to use chem-ant with chem-classification:
pip3 install simpletransformers
Or:
pip3 install chem-classification
General Usage
By default, the list of molecules is read from smiles.csv and the target is Nirmatrelvir. From that list, the best material for the fragments is selected. The output CSV file also contains the molecules created while MCTS was running. To reuse the output CSV file as a SMILES list, add the --select option. If you have not installed the packages, run the scripts directly instead, for example python3 similarity_mcts.py --help in place of similarity-mcts --help:
similarity-mcts --help
similarity-mcts -i -l1 -e3 -r10 -b500 -p train_smiles
similarity-mcts -i -l1 -e3 -r10 -b500 -p eval_smiles
To specify the target explicitly, pass its SMILES string with -t:
similarity-mcts -i -l1 -e3 -r10 -b500 -p train_smiles -t "CC(C)(C)C(NC(=O)C(F)(F)F)C(=O)N1CC2C(C1C1CCNC1=O)C2(C)C"
similarity-mcts -i -l1 -e3 -r10 -b500 -p eval_smiles -t "CC(C)(C)C(NC(=O)C(F)(F)F)C(=O)N1CC2C(C1C1CCNC1=O)C2(C)C"
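Because the train and eval runs must use the same target, it can help to build both invocations from a single SMILES string. The following is a minimal sketch that uses only the flags shown above. The helper name build_command is hypothetical and not part of chem-ant. The script only prints the commands; it does not run them.

```python
import shlex

# Target SMILES copied from the commands above.
TARGET = "CC(C)(C)C(NC(=O)C(F)(F)F)C(=O)N1CC2C(C1C1CCNC1=O)C2(C)C"


def build_command(path: str, target: str) -> list[str]:
    """Return the similarity-mcts argument list for one dataset directory.

    Hypothetical helper: it only repeats the flags used in this section.
    """
    return [
        "similarity-mcts", "-i", "-l1", "-e3", "-r10", "-b500",
        "-p", path,
        "-t", target,
    ]


commands = [build_command(path, TARGET) for path in ("train_smiles", "eval_smiles")]

for cmd in commands:
    # shlex.join quotes the SMILES string so the shell does not
    # interpret its parentheses.
    print(shlex.join(cmd))
```

To execute a command rather than print it, each list can be passed to subprocess.run(cmd, check=True), provided similarity-mcts is installed and on the PATH.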
similarity-mcts selects, from the SMILES list, the candidates that can serve as material for the fragments, and outputs them. If you only want to output target-like molecules from the SMILES list without running MCTS:
similarity-genMols --help
similarity-genMols -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -m "CC1=CC=CC=C1C(C)C" "Cc1ccccc1CC(C#N)NC1CCNC1=O" -f "gen2.csv"
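To illustrate what "target-like" means, the sketch below ranks the two candidate molecules from the command above by a Tanimoto coefficient against the target. This is not how similarity-genMols scores molecules: as a loud simplification, it uses character bigrams of the SMILES string in place of real molecular fingerprints (which a toolkit such as RDKit would provide), and the function names are hypothetical.

```python
from __future__ import annotations


def bigrams(smiles: str) -> set[str]:
    """Return the set of adjacent character pairs in a SMILES string.

    Assumption: a crude stand-in for a molecular fingerprint.
    """
    return {smiles[i:i + 2] for i in range(len(smiles) - 1)}


def tanimoto(a: set[str], b: set[str]) -> float:
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_candidates(target: str, candidates: list[str]) -> list[tuple[str, float]]:
    """Score each candidate against the target, most similar first."""
    target_features = bigrams(target)
    scored = [(c, tanimoto(target_features, bigrams(c))) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Target and candidates copied from the similarity-genMols command above.
target = "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C"
candidates = ["CC1=CC=CC=C1C(C)C", "Cc1ccccc1CC(C#N)NC1CCNC1=O"]

ranked = rank_candidates(target, candidates)
for smiles, score in ranked:
    print(f"{score:.3f}  {smiles}")
```

String bigrams depend on how the SMILES happens to be written, so the scores are only indicative; a chemistry-aware fingerprint would give a different and more meaningful ranking.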
Chem-Classification
Output the dataset in JSON format for chem-classification:
importSmiles -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -p "train_smiles"
importSmiles -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -p "eval_smiles"
To output the dataset for the regression model, add the -r option:
importSmiles -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -p "train_smiles" -r
importSmiles -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -p "eval_smiles" -r
Train the classification model and predict the similarity between Nirmatrelvir and YH-53:
from chem_classification.similarity_classification import SimilarityClassification
s = SimilarityClassification()
s.train_and_eval("train_smiles/smiles.json", "eval_smiles/smiles.json")
s.predict_smiles_pair(["CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C", "CC(C)CC(C(=O)NC(CC1CCNC1=O)C(=O)C2=NC3=CC=CC=C3S2)NC(=O)C4=CC5=C(N4)C=CC=C5OC"])
To load a locally saved model instead:
s = SimilarityClassification("local-path/your-outputs")
Train the regression model and predict the similarity between Nirmatrelvir and YH-53:
from chem_classification.similarity_classification import SimilarityRegression
s = SimilarityRegression()
s.train_and_eval("train_smiles/smiles.json", "eval_smiles/smiles.json")
s.predict_smiles_pair(["CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C", "CC(C)CC(C(=O)NC(CC1CCNC1=O)C(=O)C2=NC3=CC=CC=C3S2)NC(=O)C4=CC5=C(N4)C=CC=C5OC"])
A second regression model, trained on the JSON files output by similarity-mcts, can predict the similarity to the target molecule from the material candidates and can work together with similarity-ant:
similarity-mcts -i -l2 -e3 -r10 -b100 -p "train_smiles" -f "smiles.json" -j
similarity-mcts -i -l2 -e3 -r10 -b100 -p "eval_smiles" -f "smiles.json" -j
Note
As of chem-ant 0.0.7, datasets are created with molecular fragments as tokens, so the two regression models no longer differ.
Cooperation between chem-classification and similarity-ant (currently not working):
similarity-ant -n20 -g5 -b 1 -p gen_smiles -d -o "local-path/your-outputs"
Cooperation between the regression model of chem-classification and similarity-ant:
similarity-ant -n20 -g5 -b 1 -p gen_smiles -r -o "local-path/your-outputs"