- Updated Monday, April 4th 2016 @ 20:22:51
The score in this game, as in other games, depends heavily on who starts. It would be nice to play all matches twice, with each bot starting once. My current version started first in only 2 of its 13 matches, which makes it difficult to see its real strength.
Another nice thing would be to tell the challenges apart from regular matches.
Edit: Should be zugzwang, not zugschwang.
- Created Monday, April 4th 2016 @ 20:50:17
I like this idea. When my bot plays against itself, player 1 wins almost 60% of the games. I'm curious, is that similar to what you (or others) are seeing?
- Created Monday, April 4th 2016 @ 22:17:15
I've never tested against my own bot, but being player 1 should absolutely help.
At the Go competition they are looking at adding komi, which I think is no more than fair and well tested for this game. Considering the size of the board and the complexity of Go, I'm guessing that the difference between bots can quickly exceed the komi value no matter which bot starts.
Ultimate Tic Tac Toe is very different in this respect. Beating a top 10 bot when playing second is already pretty tricky.
I've requested the same thing before at the Heads Up Omaha competition, but it wasn't added. I understand that the Elo rating will balance out over time anyway, but this would help it balance out much faster.
(I'm seeing that my bot made the finals in Heads Up Omaha, but was always second to play. Nice...)
- Created Tuesday, April 5th 2016 @ 03:47:02
Yes, it only evens up over time if they run a truly vast number of games. The unfairness could be solved easily by running two games per matchup, alternating who plays first. I'm very much in favour of the proposed change.
- Created Tuesday, April 5th 2016 @ 14:21:07
It would be nice if at least the probability of starting first were weighted against how often a bot started first in its previous games.
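A minimal sketch of what such a weighting could look like, in Python. The rule here is only an assumption for illustration (the competition engine may work quite differently), and `first_player_probability` and `pick_starter` are hypothetical names: the more often a bot has already started first, the less likely it is to start the next game.

```python
import random

def first_player_probability(starts_first: int, games_played: int) -> float:
    """Weight for starting the next game (hypothetical rule).

    A bot that has started first often gets a low weight; a bot with no
    history gets 0.5.
    """
    if games_played == 0:
        return 0.5
    return 1.0 - starts_first / games_played

def pick_starter(a_stats, b_stats, rng=random):
    """Pick "a" or "b" to move first.

    a_stats and b_stats are (starts_first, games_played) tuples.
    """
    weight_a = first_player_probability(*a_stats)
    weight_b = first_player_probability(*b_stats)
    total = weight_a + weight_b
    if total == 0:
        # Both bots have always started first: fall back to a coin flip.
        return "a" if rng.random() < 0.5 else "b"
    return "a" if rng.random() < weight_a / total else "b"
```

With this rule, a bot that started first in 3 of 22 games gets a weight of about 0.86, so it would very likely start its next game against an opponent with a balanced history.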
- Created Tuesday, April 5th 2016 @ 17:33:10
Weighting is trickier than just playing mirrored matches.
- Updated Tuesday, April 5th 2016 @ 19:05:26
Right, mirrored matches would also be useful for developers to assess their bot against an opponent.
EDIT: my bot always started first in its last 10 games!
- Created Tuesday, April 5th 2016 @ 21:37:38
True, a single challenge only ever gives 50% of the information at most.
Ah, that's why mine started second 13 times in a row ;-) I'm on 3 times first and 19 second out of the 22 games my current version played.
- Created Tuesday, April 19th 2016 @ 12:46:37
The advantage of playing first is worth more than 100 Elo for my bot, and seemingly more than 200 Elo for many other bots... I think balancing the number of "as first" and "as second" games is becoming critical.
Even more so when you consider that on the leaderboard there are often many bots within a handful of Elo points of each other.
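To put those numbers in perspective, here is a small Python sketch using the standard logistic Elo formula (400-point scale). It assumes draws count as half a point; the function names are made up for this example, and the site's actual rating system may differ.

```python
import math

def expected_score(elo_diff: float) -> float:
    """Expected score (wins plus half the draws) for a given Elo advantage."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_diff_from_score(score: float) -> float:
    """Elo advantage implied by an average score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

if __name__ == "__main__":
    for diff in (100, 200):
        print(f"{diff} Elo -> expected score {expected_score(diff):.3f}")
    for score in (0.60, 0.62, 0.64):
        print(f"score {score:.2f} -> {elo_diff_from_score(score):.0f} Elo")
```

Under this formula a 100 Elo edge is an expected score of about 64%, and 200 Elo is about 76%. Going the other way, a 60% score corresponds to roughly 70 Elo.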
- Created Tuesday, August 23rd 2016 @ 00:10:18
+1 for mirrored matches
- Created Wednesday, August 31st 2016 @ 09:42:21
I scored 62/64 percent against my own bot, so I agree that to level this out, you need two games at once.
- Updated Wednesday, August 31st 2016 @ 10:50:57
I also think playing 2 matches each time with reversed colors is the fairest and simplest solution. This is what I do when I want to compare two versions of my bot locally, and I suspect this is what most of us do.
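For anyone who hasn't set this up locally yet, a rough Python sketch of the idea. `play_game` is a hypothetical function you would supply yourself: it runs one game and returns the score of the bot that moved first (1.0 win, 0.5 draw, 0.0 loss).

```python
from typing import Callable, Tuple, TypeVar

Bot = TypeVar("Bot")

def play_mirrored_pair(
    bot_a: Bot,
    bot_b: Bot,
    play_game: Callable[[Bot, Bot], float],
) -> Tuple[float, float]:
    """Play two games with the starting order swapped.

    Returns the total score (out of 2.0) for bot_a and for bot_b.
    """
    a_first = play_game(bot_a, bot_b)  # bot_a moves first
    b_first = play_game(bot_b, bot_a)  # bot_b moves first
    score_a = a_first + (1.0 - b_first)
    score_b = (1.0 - a_first) + b_first
    return score_a, score_b
```

If whoever moves first always wins, the pair comes out 1.0 to 1.0, so the first-move advantage cancels and only a real difference in strength shows up in the totals.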