Hello Hivers and Chess Community!
While browsing YouTube, I came across a project that really caught my attention: LLM Chess Arena.
It combines two of the things I really like: Chess and Programming.
A perfect combo for me! The project also uses Python, a language I am improving through a course, and LLMs (Large Language Models), the new hype in the technology world.
So I went straight to check it out. Shall we?
Let's Build
I cloned the repository and, after running a few games, I started thinking about how I could contribute. Ideas kept coming up, not only for this project but also for using it in other applications.
My first impression was that the project is very well structured. In a simple way, the developer managed to do something that seems super complex: make two AIs play chess against each other, with a judge validating the moves.
But while testing, I realized that the model wasn't "seeing" the board very well. It was making moves as if it were attacking pieces that, in fact, weren't even there! 🤔 This made me investigate how the moves were being described to the model.
The project was using SAN (Standard Algebraic Notation), and although I understood the basics, I couldn't read it very quickly. That's when the first idea came to me: why not add a visualization of the board to follow the moves in real time?
Viewing the Board
One of the first improvements was to turn the terminal output into something more visual, with the pieces displayed on a board at each turn.
This way, I was able to follow the games in "live action". I also saved each turn as an image to make later analysis easier.
This showed me that the model sometimes made decisions without taking the real position of the pieces into account, such as attacking a non-existent pawn on the d4 square.
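To give an idea of what a per-turn board view can look like, here is a minimal, standard-library-only sketch that turns the piece-placement field of a FEN string into a text diagram. It is not the code from the project: the names `GLYPHS`, `render_fen`, and `PLACEMENT_AFTER_3_D5` are made up for this illustration, and saving images is left out.

```python
# Standard Unicode chess glyphs; uppercase FEN letters are White pieces.
GLYPHS = {
    "K": "♔", "Q": "♕", "R": "♖", "B": "♗", "N": "♘", "P": "♙",
    "k": "♚", "q": "♛", "r": "♜", "b": "♝", "n": "♞", "p": "♟",
}


def render_fen(fen: str) -> str:
    """Return an 8-line text diagram (rank 8 on top) from a FEN or its placement field."""
    placement = fen.split()[0]  # the first FEN field describes where the pieces are
    rows = []
    for rank in placement.split("/"):
        cells = []
        for ch in rank:
            if ch.isdigit():
                cells.extend("." * int(ch))  # a digit means that many empty squares
            else:
                cells.append(GLYPHS[ch])
        rows.append(" ".join(cells))
    return "\n".join(rows)


# Piece placement after 1. e4 e5 2. Nc3 Nf6 3. d4 d5 in the game shown in this post.
PLACEMENT_AFTER_3_D5 = "rnbqkb1r/ppp2ppp/5n2/3pp3/3PP3/2N5/PPP2PPP/R1BQKBNR"

print(render_fen(PLACEMENT_AFTER_3_D5))
```

Printing a diagram like this after every move makes it easy to spot when the model talks about a piece that is not on the square it names.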
Changing from SAN to LAN
That's when I remembered that chess has other notations, and I decided to test LAN (Long Algebraic Notation), which spells out both the origin and the destination square of every move (for example, Nb1-c3 instead of Nc3). After digging deeper into the code and making some changes, I realized that, with LAN, the model began to "see" the board better.
The moves seemed more coherent, errors dropped, and the games began to flow better, without repeated invalid moves or infinite loops.
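The difference between the two notations is easy to show in code. The sketch below is only an illustration, not the project's implementation: `to_lan` is a hypothetical helper that formats a move in long algebraic notation from an explicit origin and destination square, using the common hyphen/`x` convention. The example moves come from the game record further down.

```python
def to_lan(piece: str, origin: str, target: str, capture: bool = False) -> str:
    """Format a move in long algebraic notation, e.g. 'Nb1-c3' or 'e4xd5'."""
    letter = "" if piece.upper() == "P" else piece.upper()  # pawn moves carry no letter
    separator = "x" if capture else "-"
    return f"{letter}{origin}{separator}{target}"


# Pairs of (short notation from the game record, long notation with the origin square).
examples = [
    ("Nc3", to_lan("N", "b1", "c3")),
    ("exd5", to_lan("P", "e4", "d5", capture=True)),
    ("Rxh7", to_lan("R", "h4", "h7", capture=True)),
]

for san, lan in examples:
    print(f"{san:>5}  ->  {lan}")
```

Because the origin square is always written out, each move restates where the piece currently stands, which is one plausible reason the model tracked the position better.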
To avoid these infinite loops, I implemented a rule: after five invalid moves, the model "gives up" on the game. This stops it from insisting on impossible moves.
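The give-up rule can be sketched roughly as follows. This is an assumption about the shape of such a loop, not the project's actual code: `request_move`, `ask_model`, and `is_legal` are hypothetical names, and the small stand-in functions replace the real LLM call and the judge so the example runs on its own.

```python
from typing import Callable, Optional

MAX_INVALID_MOVES = 5  # the limit mentioned in the post


def request_move(
    ask_model: Callable[[int], str],
    is_legal: Callable[[str], bool],
    max_invalid: int = MAX_INVALID_MOVES,
) -> Optional[str]:
    """Ask for a move until one is legal; return None (resign) after max_invalid failures."""
    for attempt in range(1, max_invalid + 1):
        move = ask_model(attempt)  # the attempt number lets the caller adjust its prompt
        if is_legal(move):
            return move
    return None  # the model "gives up" on the game


# Stand-ins so the sketch runs without an LLM or a chess engine.
LEGAL_MOVES = {"e2-e4", "Ng1-f3"}


def judge(move: str) -> bool:
    return move in LEGAL_MOVES


def stubborn_model(attempt: int) -> str:
    return "d4xe5"  # never legal in this toy position


def recovering_model(attempt: int) -> str:
    return "d4xe5" if attempt < 3 else "e2-e4"  # legal on the third try


print(request_move(stubborn_model, judge))    # None
print(request_move(recovering_model, judge))  # e2-e4
```

Returning `None` instead of raising an exception keeps the game loop simple: the caller can treat it as a resignation and record the result.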
After these improvements, I realized that the games were still taking too long. That's when I thought: "What if, instead of comparing only LLMs, I tested different prompts?"
I remembered a friend who always says:
"If the LLM doesn't give you the result you want, the problem may be in your prompt."
So I decided to create two new prompts: a more Aggressive style and a more Strategic style.
The Battle of the Prompts: Aggressive vs. Strategic
To my surprise, the change was significant! Where I previously had boring games with over 100 turns, I now saw games unfold in a much more interesting way. In one of them, the aggressive prompt won in just 23 turns (checkmate on White's 12th move)!
This suggests that prompts can really influence the behavior of models in games.
More Experiments: Endgame and Defensive
Excited by the results, I created two more prompts: one focused on the endgame and another more defensive. Now, with this variety of styles, I'm curious to see how the different prompts will perform in future tests.
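One simple way to organize several playing styles is a dictionary of style instructions plus a function that builds the final prompt. The sketch below is purely illustrative: the style texts in `STYLE_PROMPTS` and the `build_prompt` helper are hypothetical and are not the prompts used in these experiments.

```python
# Hypothetical style instructions; the wording is illustrative only.
STYLE_PROMPTS = {
    "aggressive": "Prefer attacking moves, open lines toward the enemy king, and accept reasonable risks.",
    "strategic": "Prefer sound development, central control, and long-term positional advantages.",
    "endgame": "Simplify when ahead, activate the king late in the game, and push passed pawns.",
    "defensive": "Prioritize king safety, keep pieces protected, and avoid unnecessary risks.",
}


def build_prompt(style: str, move_history: list[str], color: str) -> str:
    """Combine a style instruction with the game state into one prompt string."""
    if style not in STYLE_PROMPTS:
        raise ValueError(f"unknown style: {style!r}")
    history = " ".join(move_history) if move_history else "(no moves yet)"
    return (
        f"You are playing chess as {color}. {STYLE_PROMPTS[style]}\n"
        f"Moves so far (long algebraic notation): {history}\n"
        "Reply with exactly one legal move."
    )


print(build_prompt("aggressive", ["e2-e4", "e7-e5"], "White"))
```

Keeping the styles in one place makes it easy to pit any two of them against each other with the same model.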
What about you? What would be your prompt for this project? Do you have any ideas?
If you want to check out the original project and see my improvements, here are the links:
- Link to the project on GitHub
- GPT-4o vs Gemini-Pro games
- Gemini-Pro vs GPT-4o games
- llama3 Aggressive vs llama3 Strategic game
After my own analysis, I asked ChatGPT to analyze the game, and this was the result:
ChatGPT Analysis:
This game shows a highly tactical battle between two LLaMA3 models, with White emerging victorious. Here are the key moments:
Opening Phase:
Both players opted for a standard e4-e5 opening, leading to central tension with pawn moves and early knight development.
Black chose to contest the center with 3...d5, but White's quick exchanges on d5 gave them a slight edge in controlling the center.
Midgame Tactics:
White's aggressive 7. h4 took advantage of Black's premature kingside expansion with 6...g5.
White opened the h-file and created attacking chances by sacrificing material, notably the rook on h7, seizing control over Black's exposed kingside.
Final Attack:
White's 11. Qh5+ placed Black in a tough spot, forcing defensive moves.
The decisive blow came with 12. Qxg6#, delivering checkmate as Black's king was trapped with no escape routes.
White capitalized on Black's weaknesses, especially after 6...g5, and successfully launched a direct kingside attack to secure the win.
[Event "LLM Chess Arena"]
[Site "Cloud"]
[Date "2024.09.25"]
[Round "1"]
[White "llama3-70b-8192 Aggressive"]
[Black "llama3-70b-8192 Strategic"]
[Result "1-0"]
1. e4 e5 { My move: "e5"
By playing e5, I'm challenging White's central control and preparing to develop other pieces.
This move also opens up the d5 square for my knight to potentially jump to, attacking White's pawn on e4
and putting pressure on the center. } 2. Nc3 { My move: "Nb1c3"
I'm developing my knight to a strong central square, attacking the pawn on e5 and preparing to control the d5 square,
which is a strategic location in the center of the board. This move also helps to support a potential pawn storm on the
kingside and prepares to develop other pieces. } 2... Nf6 { My move: "Nf6"
This move develops my knight, attacks the opponent's pawn on e4, and prepares to control the central squares.
It also supports the potential development of other pieces, such as the bishop on c8 and the queen on d8,
which can put pressure on the opponent's position. } 3. d4 { My move: "d4"
I'm playing d4 to challenge Black's control of the center and create space for my pieces to develop.
This move also prepares to potentially push the pawn to d5, which could put pressure on Black's position and create weaknesses.
Additionally, it opens up the d-file for my rook to potentially use in the future. } 3... d5 { My move: "d5"
By playing d5, I'm challenging my opponent's central control and preparing to develop my dark-squared bishop.
This move also opens up the possibility of a potential pawn exchange on d4, which could weaken my opponent's pawn
structure and create opportunities for counterplay. } 4. exd5 { My move: "exd5"
I'm playing exd5 to immediately challenge Black's control of the center and open up the d-file for my rook to potentially
use in the future. This move also gains me a pawn and puts pressure on Black's position, making it harder for them to
develop their pieces harmoniously. } 4... Nxd5 { My move: "Nxd5"
I'm playing Nxd5 to immediately challenge my opponent's central pawn and prepare to develop other pieces. This move also
opens up the possibility of a potential fork on the knight and bishop, which could lead to a tactical advantage.
} 5. Nxd5 { My move: "Nxd5"
I'm playing Nxd5 to immediately recapture the pawn and maintain control of the center of the board.
This move also opens up the possibility of developing other pieces and putting pressure on Black's position.
} 5... Be6 { My move: "e6"
This move develops my dark-squared bishop, attacks the knight, and prepares to develop other pieces. It
also aims to challenge the central control and create potential pawn breaks on the queenside, which can lead
to a strong initiative. } 6. Nf4 { My move: "Nf4"
I'm playing Nf4 to attack the bishop on e6 and prepare to develop other pieces. This move also puts pressure on the
kingside and opens up the possibility of a potential pin on the knight on f6. } 6... g5 { My move: "g5"
This move aims to challenge the knight on f4 and prepare to develop other pieces, particularly the dark-squared
bishop. By playing g5, I'm also creating the potential for a pawn storm on the kingside, which could lead to a
weakening of White's position. Additionally, this move helps to support the development of my queenside
pieces and prepares to potentially open lines for my rook and queen. } 7. h4 { My move: "h4"
This move aims to take advantage of Black's weakened kingside pawn structure and prepares to launch a powerful
attack on the black position. By playing h4, I'm also eyeing the possibility of opening up the h-file for my rook to
potentially use in the future. } 7... gxh4 { My move: "gxh4"
By playing gxh4, I'm attacking the pawn on h4 and preparing to develop other pieces. This move also weakens White's
pawn structure on the kingside, which can be exploited later in the game. } 8. Rxh4 { My move: "Rxh4"
I'm playing Rxh4 to immediately take advantage of the pawn exchange on the previous move, gaining control of the
open file and putting pressure on Black's position. This move also develops my rook and prepares to potentially
launch a kingside attack. } 8... Qe7 { My move: "Qe7"
I'm playing Qe7 to develop my queen, put pressure on the f4-pawn, and prepare to castle kingside. This move also
eyes the kingside and gets ready to potentially attack the white king, while also supporting the pawn on e5.
} 9. Rxh7 { My move: "Rxh7"
I'm playing Rxh7 to take advantage of the weakened kingside and put immediate pressure on Black's position.
This move also opens up the possibility of a potential fork on the king and queen, which could lead to a
significant material gain. } 9... Rg8 { My move: "Rg8"
This move develops my rook and attacks the opponent's rook, which is currently attacking my king.
By playing Rg8, I'm preparing to potentially fork the opponent's rook and king, while also eyeing
the f7-pawn as a potential weakness to exploit in the future. } 10. Rxf7 { My move: "Rxf7"
I'm playing Rxf7 to take advantage of the weakened kingside and create a strong attacking opportunity.
This move also puts pressure on Black's position, forcing them to respond to the threat of Rxf7, which can
lead to further weaknesses. } 10... Kxf7 { My move: "Kxf7"
This move aims to immediately address the threat posed by the rook on f7, while also preparing to develop
other pieces and potentially create counterplay on the kingside. By capturing the rook, I'm also opening
up the f-file for my rook to potentially use in the future. } 11. Qh5+ { My move: "Qh5"
I'm taking advantage of the weakened kingside by attacking the black king directly. This move also puts
pressure on the f7-pawn, which is now isolated and vulnerable, and prepares to potentially open up the f-file
for my rook to attack the king. } 11... Rg6 { My move: "g6"
This move attacks the opponent's queen and prepares to develop other pieces, particularly the
dark-squared bishop, which can potentially create a strong attacking force on the kingside. By playing g6, I'm
also supporting the pawn on f7 and getting ready to castle kingside, which would bring my king to a safer
position and connect the rooks. } 12. Qxg6# { My move: "Qxg6#"
This move is a direct attack on the opponent's king, aiming for an immediate checkmate. By capturing the
rook on g6, I am putting the king in a position where it cannot escape the check, and the game is
essentially over. This aggressive move is a result of the tactical opportunities created by the previous
moves, and it's time to capitalize on them. } 1-0
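A record like the one above mixes moves with the models' explanations inside `{ ... }` comments. As a rough, standard-library-only sketch (the helper `extract_moves` is a name invented here, and the comments in `MOVETEXT` are shortened stand-ins), the moves can be pulled out and counted like this:

```python
import re

RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}


def extract_moves(movetext: str) -> list[str]:
    """Return the moves from PGN movetext, ignoring comments, move numbers, and the result."""
    without_comments = re.sub(r"\{[^}]*\}", " ", movetext)      # drop { ... } comments
    without_numbers = re.sub(r"\d+\.+", " ", without_comments)  # drop "1." and "2..."
    return [token for token in without_numbers.split() if token not in RESULTS]


# The moves of the game above, with the long comments shortened for this example.
MOVETEXT = """
1. e4 e5 { My move: "e5" } 2. Nc3 { My move: "Nb1c3" } 2... Nf6 3. d4 d5
4. exd5 Nxd5 5. Nxd5 Be6 6. Nf4 g5 7. h4 gxh4 8. Rxh4 Qe7 9. Rxh7 Rg8
10. Rxf7 Kxf7 11. Qh5+ Rg6 12. Qxg6# 1-0
"""

moves = extract_moves(MOVETEXT)
print(len(moves))  # 23
print(moves[-1])   # Qxg6#
```

The count of 23 individual turns matches the 12-move game: twelve moves by White and eleven by Black.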
# _______________
# ♖ ♘ ♗ ♕ ♔ ♗ . ♖
# ♙ ♙ ♙ . . ♙ ♙ ♙
# . . . . . ♘ . .
# . . . ♙ ♙ . . .
# . . . ♟ ♟ . . .
# . . ♞ . . . . .
# ♟ ♟ ♟ . . ♟ ♟ ♟
# ♜ . ♝ ♛ ♚ ♝ ♞ ♜
#----------------