- Created Monday, November 3rd 2014 @ 02:59:29
I'm quite interested in knowing the format of the final competition because it can affect how we should write our bots.
The Elo ranking currently ranks players by how well they fare against a large number of opponents on average. Just because the rank 1 player exploits average opponents the best does not mean he can win more than 50% of the time against the rank 2 player.
From what I know about other poker AI competitions, one common format is to repeatedly run a giant round robin over all the bots, and after every round eliminate the bot that performed the worst.
For example, if you start off with 20 bots in the finals, everyone plays against every other bot for a statistically significant number of matches, then the bot that won the fewest games is eliminated. Any games involving that 20th-place bot are ignored in the calculation for the next round with the remaining 19 players. This means that the bot that won the first round with all 20 players (the one that exploits average opponents the best) does not necessarily win the tournament, but the bot that wins the tournament should be one that can win more than 50% of the time against any other opponent.
It would be great if the developers give some insight on this.
- Developer: Updated Monday, November 3rd 2014 @ 16:29:10
Regarding your comments on the ELO system, your assumption that each bot plays against all other bots is incorrect. In order to increase the amount of meaningful matches, we opted to have each bot only play against bots with similar (plus and minus 10) rank. In other words, the 1st ranked bot only plays against bots ranked 2 to 11, while the 50th ranked bot plays against bots ranked 40 to 60.
As for the finals, the format will be the same as in the previous Warlight challenge:
- We'll disable bot submissions and reset elo scores a week before the start of the semi finals;
- At the end of the week, the best 24 or 48 bots will move on to the semi finals;
- The semi finals consist of a seeded best of 5 double elimination tournament.
- The 5 best players from the winners bracket and 3 best players from the losers bracket move on to the finals;
- The finals consist of a best of 3 single elimination tournament (played live on the home page).
- Updated Monday, November 3rd 2014 @ 21:17:22
Thanks for the quick response.
I do have to admit I'm very surprised at the format of the semi finals and finals. Omaha poker is a game with extremely high variance. If player1 can consistently beat player2 about 55% of the time, then it is far superior. However, with the current format, player1 will only win a best of 5 about 59.3% of the time and a best of 3 about 57.5% of the time. I can understand time constraints if this were a normal tournament with human players, but for a rather academic competition like this I would strongly urge running a statistically significant number of matches (at least best 11 out of 21, or 21 out of 41). Otherwise the result is like a lottery draw in which much better players only have a slight advantage. I'm sure most competitors here would rather allow a full day to run the semi finals and finals and find the true best players than have a short lottery for the prizes.
*To add some credibility for myself: I have participated in, and provided input for planning the finals of, a Texas hold'em poker AI competition at MIT, where each pair of players played 10000 to 100000 hands against each other in order to find out who is better with over 99.9% confidence.
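For anyone who wants to check the best-of-N figures, here is a minimal sketch. It assumes every game is independent, has no ties, and that the per-game win rate `p` stays fixed. The function name `best_of_n_win_prob` is just made up for this illustration.

```python
from math import comb

def best_of_n_win_prob(p: float, n: int) -> float:
    """Chance that a player with per-game win rate p wins a best-of-n match (n odd)."""
    need = n // 2 + 1
    # Same as playing all n games and winning at least `need` of them.
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))

for n in (3, 5, 21, 41):
    print(n, round(best_of_n_win_prob(0.55, n), 3))
```

With `p = 0.55` the first two lines come out as 0.575 and 0.593, so a 55% favourite is barely ahead of a coin flip in such short matches. Longer matches push the number up, but only slowly.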
- Updated Tuesday, November 4th 2014 @ 03:25:51
If you agree and would like everyone to find the true rank of the players with very high confidence in 1-2 days for the semi finals and finals, I can recommend a format such that every elimination is backed by at least 100 matches with other players.
- Created Tuesday, November 4th 2014 @ 03:34:25
The format that I would strongly recommend is something like below using, for example, 24 players in the semi finals.
1) All players play against all other 23 players in a giant round robin for 4 games each.
2) The player with the least number of wins is eliminated as 24th.
3) All games involving the eliminated player are removed, and the calculation in step 2 is repeated for the 23rd player. Repeat until 12 players are left.
4) Giant round robin of 12 players for 8 games each.
5) Eliminate 12th-9th the same way as in steps 2-3.
6) Giant round robin of 8 players for 16 games each.
7) Eliminate 8th-5th.
8) Round robin of 4 players for 32 games each.
9) Eliminate 4th-3rd.
10) 64 games for the final.
Data from earlier round robins is combined with data from later ones; only the games involving eliminated players are removed each time. For example, the number of head-to-head games used in determining the final winner is 64 + 32 + 16 + 8 + 4 = 124.
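The elimination rule in steps 2-3 can be sketched roughly like this. It is only an illustration under my own assumptions: results are stored as a dict of `(winner, loser) -> games won`, ties for last place are broken alphabetically (the proposal does not say how), and `runoff_order` is a made-up name.

```python
def runoff_order(results, players, stop_at):
    """Eliminate players one at a time until `stop_at` remain.

    results maps (winner, loser) -> number of games `winner` won against `loser`.
    Returns the eliminated players, worst first.
    """
    alive = set(players)
    eliminated = []
    while len(alive) > stop_at:
        wins = {p: 0 for p in alive}
        for (winner, loser), n in results.items():
            # Games involving an already eliminated player no longer count.
            if winner in alive and loser in alive:
                wins[winner] += n
        # Assumption: alphabetical order breaks ties for the fewest wins.
        worst = min(sorted(alive), key=lambda p: wins[p])
        alive.remove(worst)
        eliminated.append(worst)
    return eliminated

# Toy data: A crushes C, but loses head-to-head to B.
results = {
    ("A", "B"): 1, ("B", "A"): 3,
    ("A", "C"): 4, ("C", "A"): 0,
    ("B", "C"): 1, ("C", "B"): 3,
}
print(runoff_order(results, ["A", "B", "C"], stop_at=1))  # ['C', 'A']
```

In the toy data A has the most wins overall (5), yet once C's games are dropped only the A vs B games count, and B wins those 3 to 1. That is the behaviour the format is designed to have.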
The total number of games played is 2336, which is around 35 hours of playing time at the rate games currently finish in the game log.
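The game counts can be reproduced with a few lines. This is just a sketch of the arithmetic, assuming "N games each" means N games per pairing in that round robin; the variable names are mine.

```python
from math import comb

# (players in the round robin, games per pairing) for each stage of the proposal
stages = [(24, 4), (12, 8), (8, 16), (4, 32), (2, 64)]

games_per_stage = [comb(n, 2) * g for n, g in stages]
total_games = sum(games_per_stage)
# Games between the two finalists that survive every elimination.
head_to_head_in_final = sum(g for _, g in stages)

print(games_per_stage)        # [1104, 528, 448, 192, 64]
print(total_games)            # 2336
print(head_to_head_in_final)  # 124
```

At the quoted 35 hours, that works out to roughly 54 seconds per game.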
Personally I think the number of games in my proposed format is still low, but it is still far more accurate than double elimination with best of 3 or best of 5.
- Created Tuesday, November 4th 2014 @ 13:27:07
@chezedude Your format is not perfect either. Consider bot1 winning 45% of its games against bot2 and 70% against the other bots, and bot2 winning 55% against all other bots. With your format bot2 has a high chance of winning, but I would say bot1 is better overall.
I think a double elimination tournament is a good idea, but maybe it would be better to have a higher number of games in each match (7 or more?). Reduce the luck factor but don't eliminate it completely - it's part of the fun too.
- Updated Tuesday, November 4th 2014 @ 14:27:04
@pomyk, bot2 will not have a higher chance of placing first than bot1, because "the games involving eliminated players are removed every time", meaning that when it's down to the last 2 bots, only the games between bot1 and bot2 matter.
The format I mentioned is used by many serious poker AI competitions in academia. It is called "bankroll instant run-off"; a simple description can be found here: http://www.computerpokercompetition.org/index.php/component/content/article/81-background/82-winner-determination-bankroll-instant-run-off
- Created Tuesday, November 4th 2014 @ 14:45:16
@chezedude bot2 will have higher chance because it's winning 55% of games against bot1.
- Updated Tuesday, November 4th 2014 @ 15:04:11
Ah, I misread it. The part about bot2 winning 55% against all "other" bots made me think it loses against bot1, but it's really winning 55% against all bots, which means bot2 is indeed superior. bot1 is better at exploiting worse bots, but bot2 should be the winner.
You could argue that bot1 is superior because it wins more overall, but I would still say that bot2 is superior. This goes back to the original question of what the goal of this tournament is. As you can see in my link, there are two ways to measure how good a bot is. One is to measure whether a bot can beat (win >50% against) any opponent, in which case bot2 wins. Another, less common measurement is how well a bot exploits a wide range of opponents, in which case bot1 wins. In my opinion the former is more important, and that's what the bankroll instant run-off format supports.
- Updated Wednesday, November 5th 2014 @ 22:37:24
I'd like to add that in some poker competitions they even add "exploit bots" in order to stimulate exploiting rather than just trying to play game theoretically optimal. This means that the bot that wins the most overall (imagine one bot that can completely destroy a single top ranked bot, wouldn't that be cool?) would be declared the best bot. To me, that actually seems like more fun than to conclude with a winner after running 100k games where one bot wins a statistically significant 0.5% more games against the 2nd best bot.
- Updated Wednesday, November 5th 2014 @ 23:18:39
@dualinity Yes, some poker competitions do include "exploit bots". That is basically the "total bankroll" format in my previous link. However, it usually does not mean that the winner destroys top ranked bots. Rather, it destroys low ranked bots and loses to or breaks even with other top ranked bots, and it wins the competition because the amount it won from low ranked bots is more than what the others won from them, which makes more of a difference. Destroying top ranked bots is nearly impossible, because the higher the rank, the closer to Nash equilibrium one needs to play.
The point though is that creating a bot that exploits a wide range of opponents the best and creating a bot that has the highest chance of winning >50% against anyone else are very, very different things. The first generally focuses on opponent modeling, and will itself play an exploitable strategy in order to try to win more against low ranked bots. The latter will play a close to Nash equilibrium strategy.
To give a very simple example of why these two are different, I can use rock paper scissors. Say player10 is a bad player that does little opponent modeling and plays rock 40%, paper 30%, scissors 30%. An exploitive player's optimal strategy against player10 would be to play paper 100% of the time, but you should only do so if you have enough data to show, or are willing to simply assume, that player10 does not adjust his strategy to yours. Clearly the exploitive player's counterstrategy is extremely exploitable as well. A close to Nash equilibrium player would instead counter player10 with a strategy very close to Nash, for example rock 33%, paper 35%, scissors 32%. While the Nash player will win less against player10 than the exploitive player will, it plays a safe strategy that prevents any opponent from exploiting it much, because who knows whether player10 is actually a very advanced AI that is simply baiting you into playing paper more frequently than you should.
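The rock paper scissors numbers can be worked out with a small sketch. It assumes the usual scoring of +1 for a win, 0 for a tie and -1 for a loss; the helper names `expected_value` and `worst_case` are made up for this example.

```python
# Payoff to the row player: +1 win, 0 tie, -1 loss. Order: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]

def expected_value(me, opp):
    """Average payoff per round for mixed strategy `me` against mixed strategy `opp`."""
    return sum(me[i] * opp[j] * PAYOFF[i][j] for i in range(3) for j in range(3))

def worst_case(me):
    """Average payoff when the opponent picks the best pure counter to `me`."""
    return min(sum(me[i] * PAYOFF[i][j] for i in range(3)) for j in range(3))

player10 = [0.40, 0.30, 0.30]
exploiter = [0.0, 1.0, 0.0]     # always paper
near_nash = [0.33, 0.35, 0.32]

for name, strat in (("exploiter", exploiter), ("near_nash", near_nash)):
    print(name, round(expected_value(strat, player10), 3), round(worst_case(strat), 3))
```

Always playing paper earns 0.1 per round against player10 but loses a full point per round to an opponent who switches to scissors. The near-Nash mix earns only 0.003 per round against player10, but can never lose more than 0.02 per round to anyone.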
As for poker, I will use playing against the AllinBots as an example. If you want to maximize your chance of winning against an AllinBot and achieve a >80% chance of beating him, you have to give up hands where you have a 60% or 70% advantage (assuming you strongly suspect the opponent is an AllinBot after a number of hands). However, giving up 60% or 70% odds is an exploitive strategy that can itself be exploited by another bot that pretends to be an AllinBot by being overly aggressive early in the game; after getting a small lead from the opponent's unnecessary folds with 60-70% hands, it can play normally again. (I know it's not as simple as I made it sound, it's just the concept that I'm trying to point out.)
- Updated Wednesday, November 5th 2014 @ 23:20:24
The latter 2 paragraphs (pre-your edit) are exactly the explanation for the point I'm trying to make: exploitation will make this a more fun competition than the other format. It will also work better with the lower number of matches played.
- Updated Wednesday, November 5th 2014 @ 23:24:17
An exploiting competition would be about competing to win the most from those allin bots and low ranked bots rather than seeing who can beat the top players, because the amount you net (wins minus losses) from bad players way overshadows the amount from top players. To me that is very boring.
If I'm the rank 10 tennis player in the world, my goal is not to beat the rank 100 player as close to 100% of the time as possible. My goal would be to beat the players in the top 9 more often than I lose to them.
- Updated Wednesday, November 5th 2014 @ 23:32:28
The point is that the top X get selected, and I dare to bet you there will be no AllInBots there.
But yeah, I get you. More matches would certainly be good against extreme luck. Still, I'd personally prefer the winner to be the bot that is the most exploitative among the top bots rather than one that wins 50.5% against all.
- Updated Wednesday, November 5th 2014 @ 23:43:52
I'm fine with competing either for exploitive play or for >50% against all. However, the current format does not support either one. I believe that if we went through with it there is a >20% chance of some allin bot getting into the top 8, considering there are so many of them.
I understand they used the same format for Warlight, but if you simply look at the Elo numbers for the top players on each leaderboard, you will see this is not feasible for poker.
Taking from Wikipedia: "A player whose rating is 100 points greater than their opponent's is expected to win 64% of the time; if the difference is 200 points, then the expected win proportion for the stronger player is 76%."
For poker the difference between the top ranked bot and the default rating is only 400. In Warlight it's over 1200. That means the win ratios between Warlight players are much further from 50%. For poker, however, even a bot as bad as one that always goes all in still easily has a >30% chance of winning against the best bots, which means even an allin bot is not more than 200 Elo away from the best bot. The only thing preventing it currently is that players only play opponents within 9 ranks of them, and there are enough players on the board to shield the top players from allin bots. Otherwise we'd see allin bots with much higher Elo.
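The Elo numbers quoted above follow from the standard Elo expected-score formula with a 400-point scale. Here is a short sketch, assuming the site uses that standard formula (I don't know its exact parameters) and that games have no draws; the function names are my own.

```python
from math import log10

def elo_expected(diff: float) -> float:
    """Expected win rate of the player rated `diff` points higher (standard 400-point scale)."""
    return 1 / (1 + 10 ** (-diff / 400))

def elo_diff_for(win_rate: float) -> float:
    """Rating gap that corresponds to a given win rate against an opponent."""
    return 400 * log10(win_rate / (1 - win_rate))

for diff in (100, 200, 400, 1200):
    print(diff, round(elo_expected(diff), 3))

# Gap implied by the best bot winning about 70% against an allin bot.
print(round(elo_diff_for(0.70)))
```

A 100 point gap gives about 64% and a 200 point gap about 76%, matching the Wikipedia quote. A 70/30 split corresponds to a gap of roughly 147 points, which is consistent with the "not more than 200 Elo away" estimate.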