=====================
Chem-Ant Introduction
=====================

Chem-Ant selects material candidates and outputs molecules similar to a target molecule, using MCTS Solver and Genetic Programming.
:file:`similarity_ant.py` is based on the code of ``deap/examples/gp/ant.py``.

Requirements
============

On Ubuntu:

.. code-block:: bash

   sudo -H -s
   apt install python3-pip
   pip3 install -r requirements.txt
   exit

Or:

.. code-block:: bash

   pip3 install deap
   pip3 install mcts
   pip3 install rdkit
   pip3 install global_chem_extensions
   pip3 install mcts-solver

Or:

.. code-block:: bash

   pip3 install chem-ant

If you want to use :mod:`chem-ant` with :mod:`chem-classification`:

.. code-block:: bash

   pip3 install simpletransformers

Or:

.. code-block:: bash

   pip3 install chem-classification

- DEAP
- MCTS
- RDKit
- rdkit-pypi
- Global-Chem
- chem-ant
- chem-classification
- mcts-solver

.. note::

   The new package :mod:`rdkit` supports Python versions 3.8 through 3.12, whereas :mod:`rdkit-pypi` only supports Python versions 3.7 through 3.11.
   :mod:`chem-ant` depends on :mod:`global-chem-extensions`, and both depend on :mod:`rdkit-pypi`.
   :mod:`chem-ant` version 0.1.0 will depend on :mod:`rdkit`, but :mod:`global-chem-extensions` will only support :mod:`rdkit` from v2.0.
   Therefore, to install :mod:`chem-ant` on Python 3.12, follow these steps:

   1. Get the git repository of global-chem
   2. Manually edit :file:`global-chem/global_chem_extensions/requirements.txt`
   3. Build and install it

   .. code-block:: bash

      git clone git@github.com:akuroiwa/global-chem.git
      cd global-chem/global_chem_extensions/

   Replace ``rdkit-pypi`` with ``rdkit`` in :file:`requirements.txt` (done here with :command:`sed`), then build and install:

   .. code-block:: bash

      sed -i 's/rdkit-pypi/rdkit/g' requirements.txt
      pip install .

.. seealso::

   - Global-Chem Pull Request #309
   - Global-Chem requirements.txt

General Usage
=============

By default, the list of molecules is read from :file:`smiles.csv` and the target is Nirmatrelvir.
From that list, the best material for the fragments is selected.
The output CSV file also contains the molecules created during the MCTS run.
To reuse the CSV file as a SMILES list, add the :command:`--select` option.

To run a command directly without installing the packages, call the script itself, as in :command:`python3 similarity_mcts.py --help`.
The installed commands are used as follows:

.. code-block:: bash

   similarity-mcts --help
   similarity-mcts -i -l1 -e3 -r10 -b500 -p train_smiles
   similarity-mcts -i -l1 -e3 -r10 -b500 -p eval_smiles

To specify a target:

.. code-block:: bash

   similarity-mcts -i -l1 -e3 -r10 -b500 -p train_smiles -t "CC(C)(C)C(NC(=O)C(F)(F)F)C(=O)N1CC2C(C1C1CCNC1=O)C2(C)C"
   similarity-mcts -i -l1 -e3 -r10 -b500 -p eval_smiles -t "CC(C)(C)C(NC(=O)C(F)(F)F)C(=O)N1CC2C(C1C1CCNC1=O)C2(C)C"

:command:`similarity-mcts` selects, from the SMILES list, the candidates that can serve as material for the fragments, and outputs them.

To output target-like molecules from the SMILES list without running MCTS:

.. code-block:: bash

   similarity-genMols --help
   similarity-genMols -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -m "CC1=CC=CC=C1C(C)C" "Cc1ccccc1CC(C#N)NC1CCNC1=O" -f "gen2.csv"

The StopIteration problem has been fixed since :mod:`chem-ant` 0.1.0, so the :command:`similarity-ant` command now runs without stopping.
I plan to keep improving this fix.
In addition, a new :command:`--GlobalChem` option has been added.
It gets SMILES from the :mod:`global-chem` database as the material for fragments.

.. code-block:: bash

   similarity-ant -n20 -g10 -b 1 -p train_smiles -e1 -c electrophilic_warheads_for_kinases

Chem-Classification
====================

Output a dataset in JSON format for :mod:`chem-classification`:

.. code-block:: bash

   importSmiles -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -p "train_smiles"
   importSmiles -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -p "eval_smiles"

To output the dataset for a regression model:

.. code-block:: bash

   importSmiles -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -p "train_smiles" -r
   importSmiles -t "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C" -p "eval_smiles" -r

Train the classification model and predict the similarity between Nirmatrelvir and YH-53:

.. code-block:: python

   from chem_classification.similarity_classification import SimilarityClassification

   s = SimilarityClassification()
   s.train_and_eval("train_smiles/smiles.json", "eval_smiles/smiles.json")
   s.predict_smiles_pair([
       # Nirmatrelvir
       "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C",
       # YH-53
       "CC(C)CC(C(=O)NC(CC1CCNC1=O)C(=O)C2=NC3=CC=CC=C3S2)NC(=O)C4=CC5=C(N4)C=CC=C5OC",
   ])

Loading a local save:

.. code-block:: python

   s = SimilarityClassification("local-path/your-outputs")

Train the regression model to predict the similarity between Nirmatrelvir and YH-53:

.. code-block:: python

   from chem_classification.similarity_classification import SimilarityRegression

   s = SimilarityRegression()
   s.train_and_eval("train_smiles/smiles.json", "eval_smiles/smiles.json")
   s.predict_smiles_pair([
       # Nirmatrelvir
       "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C",
       # YH-53
       "CC(C)CC(C(=O)NC(CC1CCNC1=O)C(=O)C2=NC3=CC=CC=C3S2)NC(=O)C4=CC5=C(N4)C=CC=C5OC",
   ])

Another regression model, trained on the JSON files output by :command:`similarity-mcts`, can predict the similarity between the material candidates and the target molecule, and can cooperate with :command:`similarity-ant`:

.. code-block:: bash

   similarity-mcts -i -l2 -e3 -r10 -b100 -p "train_smiles" -f "smiles.json" -j
   similarity-mcts -i -l2 -e3 -r10 -b100 -p "eval_smiles" -f "smiles.json" -j

.. note::

   Since :mod:`chem-ant` 0.0.7, datasets are created with molecular fragments as tokens, so the two regression models no longer differ.

Cooperation between :mod:`chem-classification` and :command:`similarity-ant` (currently not working):

.. code-block:: bash

   similarity-ant -n20 -g5 -b 1 -p gen_smiles -d -o "local-path/your-outputs"

Cooperation between the regression model of :mod:`chem-classification` and :command:`similarity-ant`:

.. code-block:: bash

   similarity-ant -n20 -g5 -b 1 -p gen_smiles -r -o "local-path/your-outputs"
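
The two :command:`similarity-ant` invocations above differ only in the ``-d`` and ``-r`` flags.
As a minimal sketch, assuming you want to script them from Python, the helper below rebuilds either command line.
The function name ``build_similarity_ant_command`` and its ``mode`` parameter are hypothetical and not part of :mod:`chem-ant`; the flag values are copied verbatim from the examples above, and nothing is executed unless you pass the result to :func:`subprocess.run`.

.. code-block:: python

   import shlex


   def build_similarity_ant_command(mode, model_dir, out_dir="gen_smiles"):
       """Return the argv list for one of the two invocations shown above.

       ``mode`` is a hypothetical label: "classification" selects ``-d`` and
       "regression" selects ``-r``, matching the two documented examples.
       """
       mode_flags = {"classification": "-d", "regression": "-r"}
       if mode not in mode_flags:
           raise ValueError(f"unknown mode: {mode!r}")
       # Every other flag value is copied from the documented examples.
       return [
           "similarity-ant", "-n20", "-g5", "-b", "1",
           "-p", out_dir, mode_flags[mode], "-o", model_dir,
       ]


   cmd = build_similarity_ant_command("regression", "local-path/your-outputs")
   print(shlex.join(cmd))
   # To actually run it (requires chem-ant to be installed):
   # import subprocess; subprocess.run(cmd, check=True)

Building the argument list instead of a shell string avoids quoting problems when the model path contains spaces.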