Chess-Ant Article
- author: Akihiro Kuroiwa
- date: 2021/12/25
- abstract:
Like AlphaFold developed by DeepMind, MCTS can be applied to bioinformatics and cheminformatics. Chess-Ant is influenced by AlphaZero but tries a different method. I would like to show that the combination of MCTS Solver and Genetic Programming can be put to practical use.
Influence from AlphaZero
A variant of the Polynomial Upper Confidence Tree (PUCT) introduced in AlphaGo Zero and AlphaZero complements the exploration rate \(C(s)\) with the prior probability \(P(s,a)\). Chess-Ant instead uses the conventional Upper Confidence Tree (UCT) and replaces the manual tuning of the constant \(C\) with Genetic Programming (GP) [1]. GP chooses the value of the constant \(C\) from \(1/\sqrt{1}\) to \(1/\sqrt{9}\) according to the conditions. Depending on the value of the constant \(C\), chess_ant.py will search either actively (more exploration) or passively (more exploitation).
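The effect of \(C\) can be illustrated with a toy calculation (a sketch with made-up numbers; the `uct` helper below is illustrative, not chess-ant's implementation):

```python
import math

def uct(mean_reward, parent_visits, child_visits, c):
    """Conventional UCT score: exploitation plus a C-weighted exploration bonus."""
    return mean_reward + c * math.sqrt(2 * math.log(parent_visits) / child_visits)

# A rarely visited child with a worse mean vs. a well-tried child with a better mean.
# With C = 1/sqrt(1) the exploration bonus dominates and the rare child wins;
# with C = 1/sqrt(9) the search turns passive and sticks to the better mean.
active = uct(0.4, 100, 10, 1 / math.sqrt(1)) > uct(0.6, 100, 50, 1 / math.sqrt(1))
passive = uct(0.4, 100, 10, 1 / math.sqrt(9)) > uct(0.6, 100, 50, 1 / math.sqrt(9))
print(active, passive)  # True False
```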
AlphaZero’s PUCT [2] [3] [4] [5]:

\[a_t = \underset{a}{\operatorname{argmax}} \left( Q(s_t, a) + U(s_t, a) \right), \qquad U(s, a) = C(s) P(s, a) \frac{\sqrt{N(s)}}{1 + N(s, a)}\]

A deep neural network \((p, v) = f_{\theta}(s)\) with parameters \(\theta\) evaluates the leaf node \(s_{L}\):

Initialize:

\[N(s_{L}, a) = 0, \quad W(s_{L}, a) = 0, \quad Q(s_{L}, a) = 0, \quad P(s_{L}, a) = p_{a}\]

Update:

\[N(s_t, a_t) = N(s_t, a_t) + 1, \quad W(s_t, a_t) = W(s_t, a_t) + v, \quad Q(s_t, a_t) = \frac{W(s_t, a_t)}{N(s_t, a_t)}\]
| Symbol | Details |
|---|---|
| \((s, a)\) | Each state-action pair. |
| \(N(s)\) | The parent visit count. |
| \(N(s, a)\) | The visit count. |
| \(W(s, a)\) | The total action-value. |
| \(Q(s, a)\) | The mean action-value. |
| \(P(s, a)\) | The prior probability of selecting \(a\) in \(s\). |
| \(C(s)\) | The exploration rate. |
| \(p\) | A vector of move probabilities (predicted by the deep neural network). |
| \(v\) | A scalar value (predicted by the deep neural network). |
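The select and update rules above can be sketched in plain Python (the names are illustrative, not AlphaZero's code, and a constant `c` stands in for \(C(s)\)):

```python
import math

class Edge:
    """Statistics for one state-action pair (s, a), as in the table above."""
    def __init__(self, prior):
        self.N = 0      # visit count N(s, a)
        self.W = 0.0    # total action-value W(s, a)
        self.Q = 0.0    # mean action-value Q(s, a)
        self.P = prior  # prior probability P(s, a)

def puct_select(edges, c):
    """Pick the action maximizing Q(s,a) + c * P(s,a) * sqrt(N(s)) / (1 + N(s,a))."""
    n_parent = sum(e.N for e in edges.values())
    def score(a):
        e = edges[a]
        return e.Q + c * e.P * math.sqrt(n_parent) / (1 + e.N)
    return max(edges, key=score)

def puct_update(edge, v):
    """Backup: N += 1, W += v, Q = W / N."""
    edge.N += 1
    edge.W += v
    edge.Q = edge.W / edge.N

edges = {"e4": Edge(0.6), "d4": Edge(0.4)}
puct_update(edges["e4"], v=-0.2)    # e4 was visited once with a poor value
a = puct_select(edges, c=1.25)      # the unvisited d4's prior bonus now wins
puct_update(edges[a], v=0.5)
```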
Chess-Ant’s UCT [6] [7] [8] [9] [10]:

\[UCT = \bar{X}_j + C \sqrt{\frac{2 \ln n}{n_j}}\]

where \(\bar{X}_j\) is the mean reward of child \(j\), \(n\) is the visit count of the parent, and \(n_j\) is the visit count of child \(j\).

The judgment conditions are as follows:
| Conditions | Details |
|---|---|
| if_improvement | Whether UCT has increased since the previous iteration. |
| if_shortcut | Whether move_stack is shorter than last time. |
| if_result | Whether the last rollout was a win, a loss, or a draw. |
| if_pruning | Whether UCT was infinity, -infinity, or something else. |
| if_is_check | Whether the board of the node selected last time is in check. |
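A hedged sketch of how such judgment conditions can act as branching GP primitives (in the style of DEAP's `if_then_else`; these helpers are illustrative, not chess-ant's actual terminals):

```python
import math

def if_improvement(prev_uct, cur_uct, out_true, out_false):
    """Branch on whether UCT increased since the previous iteration."""
    return out_true if cur_uct > prev_uct else out_false

def if_result(result, out_win, out_loss, out_draw):
    """Three-way branch on the previous rollout's outcome (+1 / -1 / 0)."""
    if result > 0:
        return out_win
    if result < 0:
        return out_loss
    return out_draw

# GP composes such primitives into a tree whose leaves are candidate
# constants; evaluating the tree yields the C used for the next search.
c = if_improvement(0.41, 0.57,
                   if_result(1, 1 / math.sqrt(1), 1 / math.sqrt(9), 1 / math.sqrt(4)),
                   1 / math.sqrt(2))
```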
Also, since GP can be rerun in each generation, it can prevent the UCT from being evaluated too highly at the beginning of the search, when the number of visits is still low. This may achieve an effect similar to that of [11].
MCTS Solver
With the introduction of MCTS Solver [12] [13], chess-ant prunes branches to speed up the search and reduce wrong answers.
The pseudocode from the paper had to be changed significantly because the package whose classes are inherited structures its code differently.
In addition, Python has no goto statement, so some extra code is needed. In the pseudocode everything is completed inside the MCTS Solver, but in chess_ant.py it is split out into _executeRound(): mctsSolver() executes the rollout, performs processing such as pruning, outputs a reward just as a rollout would, and passes it to backpropogate().
Like negamax, the MCTS Solver is a recursive function and requires a stop condition.
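The solver's pruning idea can be sketched on a toy game tree (negamax convention: values are from the side to move, and ±infinity marks proven wins and losses; this is a simplification of the paper's idea, not chess_ant.py itself):

```python
INF = float("inf")

def solve(node):
    """Toy MCTS-Solver-style proof backup on a fully expanded game tree.
    A leaf is a number: +INF means a proven win and -INF a proven loss for
    the side to move there; anything else is a heuristic rollout value."""
    if not isinstance(node, list):
        return node
    values = [-solve(child) for child in node]  # negamax: flip perspective
    best = max(values)
    if best == INF:
        return INF    # some child is a proven loss for the opponent: proven win
    if all(v == -INF for v in values):
        return -INF   # every move leads to a proven loss
    return best       # unproven: propagate the heuristic value as usual

proven_win = solve([-INF, 0.3])   # a winning reply exists, so prune the rest
proven_loss = solve([INF, INF])   # every reply wins for the opponent
unproven = solve([0.2, -0.1])     # plain negamax over heuristic values
```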
Parallelization
In mcts-solver 0.0.5, with the help of OpenAI’s ChatGPT and Google Bard, I modified the code to allow parallel processing [14] [15].
Tree parallelization is more difficult than root parallelization because it requires the use of locks.
I am not sure whether these changes work correctly, so they may change significantly in the future.
Context managers are useful, but they can get you into trouble if you use them incorrectly.
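As a rough illustration of why tree parallelization needs locks (a generic sketch, not the actual mcts-solver code): every worker mutates the same node statistics, so updates are wrapped in a per-node lock used as a context manager.

```python
import threading

class Node:
    def __init__(self):
        self.numVisits = 0
        self.totalReward = 0.0
        self.lock = threading.Lock()  # per-node lock for tree parallelization

    def backpropagate(self, reward):
        # The context manager releases the lock even if an exception occurs --
        # convenient, but deadlock-prone if locks are nested in the wrong order.
        with self.lock:
            self.numVisits += 1
            self.totalReward += reward

root = Node()
threads = [threading.Thread(target=lambda: [root.backpropagate(1.0) for _ in range(1000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Without the lock, concurrent updates could be lost; with it, all 4000 arrive.
```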
Change History
Since both chess-classification and chem-classification use the same algorithm, I made similar changes to both at the same time.
In chess-classification 0.0.4, the FEN is separated into tokens to create a dataset. The FEN is separated by columns instead of letter by letter.
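One possible reading of that tokenization (a sketch; the actual splitting in the project may differ) is to cut a FEN into its space-separated fields and its slash-separated board ranks, rather than into single characters:

```python
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

def fen_tokens(fen):
    """Split a FEN into board ranks plus the remaining fields."""
    board, *fields = fen.split(" ")
    return board.split("/") + fields

tokens = fen_tokens(START_FEN)
# 8 ranks + side to move, castling, en passant, halfmove, fullmove = 13 tokens
```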
The chess-ant manual up to version 0.0.5 had an error in the UCB1 algorithm, which was corrected by adding math.sqrt(2).
It seems that the author of mcts initially set explorationConstant = 1 / math.sqrt(2) in mcts.py to cancel out the math.sqrt(2).
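The cancellation is easy to verify numerically: with \(C = 1/\sqrt{2}\), the corrected UCB1 term \(C\sqrt{2 \ln n / n_j}\) collapses to \(\sqrt{\ln n / n_j}\) (toy numbers, not chess-ant code):

```python
import math

n, n_j = 100, 10
c = 1 / math.sqrt(2)                        # mcts.py's default explorationConstant
corrected = c * math.sqrt(2 * math.log(n) / n_j)
simplified = math.sqrt(math.log(n) / n_j)
print(math.isclose(corrected, simplified))  # True
```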
I added a new terminal called selectNodeEphemeralConstant
to replace the ephemeral constant in chess-ant
0.0.6.
According to Issues #658, DEAP does not work with Python 3.10, but it does with Python 3.11.
In ~/.bashrc
or ~/.bash_aliases
:
# Python version
alias python3='/usr/bin/python3.11'
and execute the source command:
source ~/.bash_aliases
Another method is update-alternatives.
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 2
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 1
update-alternatives --config python3
python3 --version
To revert to the original version that was set as top priority:
update-alternatives --auto python3
Please note that as of December 2022, gnome-terminal does not start with Python 3.11 on Ubuntu 22.04.1 LTS.
A more common method is to use venv:
sudo apt install python3.11-venv
python3.11 -m venv ~/.venv3.11
source ~/.venv3.11/bin/activate
which pip3
To finish:
deactivate
In chess-classification 0.0.5, you can load a local save.
To shorten the training time [16], the conventional model was changed to the google/electra-small-discriminator checkpoint [17] of the ELECTRA model.
My PGN files are more like tactics [18] [19] than chess problems, so it is more efficient to create a dataset from tactics. To increase the accuracy rate, more than 300 rows of PGN data are required for model training and model evaluation.
Development Plan
Todo
As another project, I will introduce deep learning [20] [21] with Natural Language Processing (NLP) for FEN win/loss evaluation.
genPgn.py automatically plays against Stockfish and outputs PGN files. By the way, version 0.0.1 of genPgn.py contains the walrus operator, so it only works with Python 3.8 or higher. importPgn.py creates a dataset from PGN files. chess_classification.py generates a trained model with Simple Transformers. I plan to use this trained model to replace the rollout of chess_ant.py, like AlphaZero.
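The Python 3.8 requirement comes from the walrus operator; a minimal illustration of the pattern (not genPgn.py's actual loop, which talks to Stockfish):

```python
# Assign-and-test in one expression: the loop keeps consuming moves from a
# (hypothetical) source until it runs out. Requires Python >= 3.8.
moves = iter(["e4", "e5", "Nf3", "Nc6"])
played = []
while (move := next(moves, None)) is not None:
    played.append(move)
```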
References
Bibliography
Yao, Yao. (2018). API Python Chess: Distribution of Chess Wins based on random moves
Stöckl, Andreas. (2019). An incremental evaluation function and a test-suite for computer chess
Fiekas, Niklas. (2015). An implementation of the Bratko-Kopec Test using python-chess
Hart, Alex. (2011). Alpha Beta pruning on a Minimax tree in Python
Shrott, Ryan. (2017). Genetic Programming applied to AI Heuristic Optimization
Hartikka, Lauri. (2017). A step-by-step guide to building a simple chess AI
Brynjulfsson, Erlingur. A Genetic Minimax Game-Playing Strategy
Yu, Tina and Christopher D. Clack. “Recursion, Lambda Abstractions and Genetic Programming.”