Chem-Ant

Introduction

Chem-Ant selects material candidates for generating molecules similar to a target molecule, using MCTS Solver and genetic programming.

similarity_ant.py is based on the code of deap/examples/gp/ant.py.
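
This README does not state which similarity measure chem-ant uses. As a rough illustration of what "similar to the target molecule" can mean, the sketch below computes the Tanimoto (Jaccard) coefficient between two sets of fragment labels in plain Python. The function name tanimoto and the fragment labels are hypothetical and not part of chem-ant; in practice such sets would more likely be derived from molecular fingerprints or fragments.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient of two collections, in [0, 1]."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


# Hypothetical fragment labels, for illustration only.
target = {"nitrile", "lactam", "amide", "trifluoromethyl"}
candidates = {
    "cand_a": {"nitrile", "lactam", "phenyl"},
    "cand_b": {"phenyl", "ether"},
}

# Rank candidate names from most to least similar to the target.
ranked = sorted(
    candidates,
    key=lambda name: tanimoto(candidates[name], target),
    reverse=True,
)
print(ranked)
```

A set-based score like this ignores how often a fragment occurs, which is acceptable for a first ranking but loses information for repetitive molecules.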

Requirements

On Ubuntu:

sudo -H -s
apt install python3-pip
pip3 install -r requirements.txt
exit

Or:

pip3 install deap
pip3 install mcts
pip3 install rdkit
pip3 install global_chem_extensions
pip3 install mcts-solver

Or:

pip3 install chem-ant

If you want to use chem-ant with chem-classification:

pip3 install simpletransformers

Or:

pip3 install chem-classification

General Usage

By default, the list of molecules is read from smiles.csv and the target is Nirmatrelvir. From that list, the best material for the fragments is selected. The output CSV file also contains molecules created while MCTS runs. To reuse the CSV file as a SMILES list, add the --select option. To run the commands directly without installing the package, call the scripts with Python, for example python3 similarity_mcts.py --help. With the package installed:

similarity-mcts --help
similarity-mcts -i -l1 -e3 -r10 -b500 -p train_smiles
similarity-mcts -i -l1 -e3 -r10 -b500 -p eval_smiles

To run with a specific target:

similarity-mcts -i -l1 -e3 -r10 -b500 -p train_smiles -t "CC(C)(C)C(NC(=O)C(F)(F)F)C(=O)N1CC2C(C1C1CCNC1=O)C2(C)C"
similarity-mcts -i -l1 -e3 -r10 -b500 -p eval_smiles -t "CC(C)(C)C(NC(=O)C(F)(F)F)C(=O)N1CC2C(C1C1CCNC1=O)C2(C)C"
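
For readers unfamiliar with MCTS, the sketch below shows the UCB1 rule that many Monte Carlo tree search implementations use to decide which child node to explore next. It is a generic illustration, not the code of similarity-mcts or of the mcts-solver package; the names ucb1 and select_child, the exploration constant, and the reward values are all assumptions.

```python
import math


def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children, parent_visits):
    """Pick the child name with the highest UCB1 score.

    children maps a name to a (total_reward, visits) pair.
    """
    return max(children, key=lambda name: ucb1(*children[name], parent_visits))


# Hypothetical statistics: rewards could be similarity scores in [0, 1].
stats = {"frag_a": (4.0, 5), "frag_b": (1.0, 5)}
print(select_child(stats, parent_visits=10))
```

With equal visit counts the exploration bonus is identical for both children, so the one with the higher average reward is chosen; a child with zero visits would be selected regardless of the others.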

similarity-mcts selects, from the SMILES list, the candidates that can serve as material for the fragments, and outputs them. To output target-like molecules from the SMILES list without running MCTS:

similarity-genMols --help
similarity-genMols -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -m "CC1=CC=CC=C1C(C)C" "Cc1ccccc1CC(C#N)NC1CCNC1=O" -f "gen2.csv"

Chem-Classification

Output a dataset in JSON format for chem-classification:

importSmiles -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -p "train_smiles"
importSmiles -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -p "eval_smiles"

To output the dataset for a regression model:

importSmiles -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -p "train_smiles" -r
importSmiles -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -p "eval_smiles" -r

Train the classification model and predict the similarity between Nirmatrelvir and YH-53:

from chem_classification.similarity_classification import SimilarityClassification
s = SimilarityClassification()
s.train_and_eval("train_smiles/smiles.json", "eval_smiles/smiles.json")
s.predict_smiles_pair(["CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C", "CC(C)CC(C(=O)NC(CC1CCNC1=O)C(=O)C2=NC3=CC=CC=C3S2)NC(=O)C4=CC5=C(N4)C=CC=C5OC"])

Loading a local save:

s = SimilarityClassification("local-path/your-outputs")

Train a regression model to predict the similarity between Nirmatrelvir and YH-53:

from chem_classification.similarity_classification import SimilarityRegression
s = SimilarityRegression()
s.train_and_eval("train_smiles/smiles.json", "eval_smiles/smiles.json")
s.predict_smiles_pair(["CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C", "CC(C)CC(C(=O)NC(CC1CCNC1=O)C(=O)C2=NC3=CC=CC=C3S2)NC(=O)C4=CC5=C(N4)C=CC=C5OC"])

A second regression model, trained on JSON files output by similarity-mcts, predicts the similarity to the target molecule from the material candidates and can be used together with similarity-ant:

similarity-mcts -i -l2 -e3 -r10 -b100 -p "train_smiles" -f "smiles.json" -j
similarity-mcts -i -l2 -e3 -r10 -b100 -p "eval_smiles" -f "smiles.json" -j

Note

Since chem-ant 0.0.7, datasets are created with molecular fragments as tokens, so the two regression models no longer differ.

Using chem-classification together with similarity-ant (currently not working):

similarity-ant -n20 -g5 -b 1 -p gen_smiles -d -o "local-path/your-outputs"

Using the regression model of chem-classification together with similarity-ant:

similarity-ant -n20 -g5 -b 1 -p gen_smiles -r -o "local-path/your-outputs"
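
To give a feel for the evolutionary side, here is a minimal sketch of an elitist evolutionary loop in plain Python. It is not similarity-ant's implementation and does not use DEAP's tree-based genetic programming; it is a simpler fixed-length variant. The fragment labels, the fitness function (which in the setup above would be replaced by a trained model's predicted similarity), and all names are hypothetical.

```python
import random

# Hypothetical fragment vocabulary and target, for illustration only.
POOL = ["amide", "nitrile", "lactam", "CF3", "phenyl", "ether", "thiazole"]
TARGET = frozenset({"amide", "nitrile", "lactam", "CF3"})


def fitness(individual):
    """Number of distinct target fragments present in the individual."""
    return len(set(individual) & TARGET)


def mutate(individual, rng):
    """Return a copy with one randomly chosen position replaced."""
    out = list(individual)
    out[rng.randrange(len(out))] = rng.choice(POOL)
    return tuple(out)


def evolve(population, generations, rng):
    """Keep the better half each generation, refill with mutated survivors."""
    history = []
    for _ in range(generations):
        population = sorted(population, key=fitness, reverse=True)
        survivors = population[: max(1, len(population) // 2)]
        n_children = len(population) - len(survivors)
        children = [mutate(rng.choice(survivors), rng) for _ in range(n_children)]
        population = survivors + children
        history.append(max(fitness(ind) for ind in population))
    return max(population, key=fitness), history


rng = random.Random(0)  # seeded so that runs are reproducible
start = [tuple(rng.choice(POOL) for _ in range(4)) for _ in range(12)]
best, history = evolve(start, generations=8, rng=rng)
print(best, history)
```

Because the survivors are carried over unchanged, the best fitness recorded in history never decreases from one generation to the next.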