Showing posts with label Claudico. Show all posts

Sunday, May 10, 2015

Solving heads-up limit hold’em poker

A game of Texas hold 'em. Image from Wikipedia.
In one of my previous posts, I talked about the poker-playing computer Claudico as it played against human opponents for nearly two weeks. The results are in: the four human players won $732,713 from the computer out of the $170 million that was bet during play. However, that margin is small compared to the $170 million wagered, so the scientists behind Claudico are considering the result a possible tie. Humans are still better than computers at imperfect-information games, but not by much. It may only take a little longer for computers to master them.

On January 9th, 2015, "Heads-up limit hold’em poker is solved" was published in the journal Science as the first article to show that a "nontrivial imperfect-information game played competitively by humans" can be solved by a computer (Bowling, Burch, Johanson, & Tammelin, 2015). The difference between Claudico and the research done here is the type of poker being played. Claudico is programmed to play heads-up no-limit hold’em poker, where bets and raises have no limit. This means Claudico has to deal with 10^161 information sets, the distinct decision situations a player can face. Heads-up limit hold’em poker limits how much can be bet, which reduces the number of information sets to 3.19 × 10^14. Even though this number seems trivial compared to what Claudico faces, it's a first step toward solving imperfect-information games.

The paper uses a method called counterfactual regret minimization (CFR). It approximates a Nash equilibrium, a strategy that cannot be beaten in the long run, by pitting two regret-minimizing algorithms against each other. The program plays itself and calculates how much it lost by not picking the best option at each decision. It tries to minimize these losses, called regrets, in order to converge on an equilibrium strategy. The problem was storing the regret values for all of the information sets, which would require 262 TB of storage. To fix this problem, Bowling et al. truncated the decimal values to integers and applied compression to bring the required storage down to a workable 17 TB.
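To make the idea of minimizing regret through self-play concrete, here is a minimal sketch of regret matching, the update rule that CFR applies at each decision point. It is not the implementation from Bowling et al.: it runs on rock-paper-scissors, a one-shot game, rather than on a poker game tree, and the function names and the small starting bias in player 0's regrets are my own assumptions for illustration.

```python
# Payoff to player 0 (rows) in rock-paper-scissors.
# Actions: 0 = rock, 1 = paper, 2 = scissors. Player 1 receives the negative.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to the uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)


def self_play(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Run regret matching for both players and return their average strategies.

    initial_regrets is a hypothetical starting bias for player 0, used only so
    that the run does not begin exactly at the equilibrium.
    """
    n = 3
    regrets = [list(initial_regrets), [0.0] * n]
    strategy_sums = [[0.0] * n, [0.0] * n]
    for _ in range(iterations):
        s0 = strategy_from_regrets(regrets[0])
        s1 = strategy_from_regrets(regrets[1])
        # Expected payoff of each pure action against the opponent's current mix.
        u0 = [sum(PAYOFF[a][b] * s1[b] for b in range(n)) for a in range(n)]
        u1 = [sum(-PAYOFF[a][b] * s0[a] for a in range(n)) for b in range(n)]
        # Expected payoff of the mixed strategy each player actually used.
        ev0 = sum(s0[a] * u0[a] for a in range(n))
        ev1 = sum(s1[b] * u1[b] for b in range(n))
        for a in range(n):
            # Regret: how much better action a would have done than what was played.
            regrets[0][a] += u0[a] - ev0
            regrets[1][a] += u1[a] - ev1
            strategy_sums[0][a] += s0[a]
            strategy_sums[1][a] += s1[a]
    return [[x / iterations for x in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, avg in enumerate(self_play(50000)):
        print(player, [round(p, 3) for p in avg])
```

The average strategies, not the final ones, are what approach the equilibrium; for rock-paper-scissors that equilibrium is playing each action about one third of the time.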

References

Bowling, M., Burch, N., Johanson, M., & Tammelin, O. (2015). Heads-up limit hold'em poker is solved. Science, 347(6218), 145-149. doi:http://dx.doi.org/10.1126/science.1259433
 
Maynard, J. (2015, May 10). CMU AI Claudico Is Good At Poker But Not Good Enough For World's Best Human Players. Retrieved May 11, 2015, from http://www.techtimes.com/articles/51946/20150510/cmu-ai-claudico-good-poker-enough-worlds-best-human-players.htm

Walters, K., & Watts, E. (2015, April 24). Brains Vs. Artificial Intelligence. Retrieved May 11, 2015, from http://www.cmu.edu/news/stories/archives/2015/april/computer-faces-poker-pros.htm


Sunday, April 26, 2015

For the first time, artificial intelligence will play poker against top human players

To the A.I., a simple game of poker is a tree of all possible events. Image from Heads-up Limit Hold’em Poker is Solved by Michael Bowling, Neil Burch, Michael Johanson, and Oskari Tammelin.
Claudico, an artificial intelligence program developed by Professor Tuomas Sandholm at Carnegie Mellon, will be playing heads-up no-limit Texas hold’em against four top human players. 80,000 hands will be played over the course of fourteen days, from April 24th to May 8th, and the matches will be streamed every day on Twitch.tv starting at 11:30 am. The difference between poker and games like chess or checkers is the amount of information available to the players. Poker is a game of imperfect information, where the opponents' hands are unknown and the cards dealt are based on chance. Chess, by contrast, is a game of perfect information: both players know every single action the opponent has taken against them. Poker has become the benchmark for imperfect-information games because it already has players across a broad range of skill levels for computers to play against, with the added layer of complexity of interpreting other players' actions while simultaneously hiding your own.
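A small sketch can show what hidden information does to a player's view of the game. The toy setup below is my own assumption, not anything from the tournament: a three-card deck (J, Q, K) with one card dealt to each of two players. Deals that a player cannot tell apart are grouped together, and each group is what game theorists call an information set.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical three-card toy deck, chosen only to keep the example tiny.
CARDS = ["J", "Q", "K"]


def information_sets(player):
    """Group every possible deal by the only thing `player` can see: their own card."""
    groups = defaultdict(list)
    for deal in permutations(CARDS, 2):  # deal[0] goes to player 0, deal[1] to player 1
        groups[deal[player]].append(deal)
    return dict(groups)


if __name__ == "__main__":
    for own_card, deals in information_sets(0).items():
        print(own_card, deals)
```

Player 0 holding a J cannot tell whether the opponent holds a Q or a K, so both deals land in the same group and must be played the same way. In a perfect-information game like chess, every group would contain exactly one state.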

A solution concept for a game like poker is called the Nash equilibrium. The definition is simple: in a non-cooperative game, the strategy each player has chosen is the best strategy they can employ, assuming they know the strategies of the others. Players receive no benefit from changing their strategies, because the strategy they have already chosen is the best one for them. Finding the Nash equilibrium, the best possible way of playing the game, using linear programming takes an exponential amount of time, O(2^(nk)). This is a lower time complexity than brute-forcing a traveling salesman problem (TSP), but it still takes an incredible amount of time because it's not polynomial. However, finding the Nash equilibrium does not fall under NP-complete. Nash's theorem guarantees that a solution exists, while it's unclear whether NP-complete problems can have a polynomial solution at all. So instead of being grouped under NP-complete, finding a Nash equilibrium falls under another subclass of NP called PPAD, where solving the problem may be difficult but a solution is guaranteed to exist. This applies to finding the Nash equilibrium of non-cooperative games with k players, provided k is a finite integer greater than or equal to two.
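The "no benefit from changing strategies" condition can be checked directly on a small game. The sketch below is only an illustration under my own assumptions: it looks at pure strategies in a two-player game given as payoff tables, and uses the prisoner's dilemma with made-up payoff values as the example. It is a brute-force check, not the linear-programming approach mentioned above, and it would not scale to anything like poker.

```python
def pure_nash_equilibria(row_payoffs, col_payoffs):
    """Return every (row, column) pair where neither player gains by deviating alone."""
    n_rows = len(row_payoffs)
    n_cols = len(row_payoffs[0])
    equilibria = []
    for r in range(n_rows):
        for c in range(n_cols):
            # The row player cannot do better by switching rows while the column stays fixed.
            row_is_best = all(
                row_payoffs[r][c] >= row_payoffs[alt][c] for alt in range(n_rows)
            )
            # The column player cannot do better by switching columns while the row stays fixed.
            col_is_best = all(
                col_payoffs[r][c] >= col_payoffs[r][alt] for alt in range(n_cols)
            )
            if row_is_best and col_is_best:
                equilibria.append((r, c))
    return equilibria


# Prisoner's dilemma with illustrative payoffs: action 0 = cooperate, 1 = defect.
ROW_PAYOFFS = [[-1, -3], [0, -2]]
COL_PAYOFFS = [[-1, 0], [-3, -2]]

if __name__ == "__main__":
    print(pure_nash_equilibria(ROW_PAYOFFS, COL_PAYOFFS))  # [(1, 1)]
```

Both players defecting is the only equilibrium here, even though both would be better off cooperating. Many games, rock-paper-scissors and poker among them, have no equilibrium in pure strategies, which is why equilibrium strategies there randomize between actions.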

An A.I. plays the game of love. Image courtesy of https://xkcd.com/601/.
The first algorithm to solve imperfect-information games in polynomial time was developed in 2003 and applied to poker in 2005. As of January 2015, heads-up limit Texas hold 'em poker has been declared weakly solved by Bowling et al. (2015). Their algorithm can now win or draw against any sequence of moves the opponent makes from the very beginning of the game. It does so by computing a very good approximation of the Nash equilibrium strategy. With the tournament just beginning, will Claudico follow in the footsteps of Deep Blue and Watson and mark the next milestone in artificial intelligence?

Edit: I mistook exponential for factorial and it has been corrected.

References

The official website for Brains vs Artificial Intelligence 

Bowling, M., Burch, N., Johanson, M., & Tammelin, O. (2015). Heads-up limit hold'em poker is solved. Science, 347(6218), 145-149. doi:http://dx.doi.org/10.1126/science.1259433

Daskalakis, C., Goldberg, P. W., & Papadimitriou, C. H. (2009). The complexity of computing a Nash equilibrium. Communications of the ACM, 52(2), 89. Retrieved from http://search.proquest.com/docview/237061886?accountid=14541

Fortnow, L. (2005, December 15). Computational Complexity. Retrieved April 26, 2015, from http://blog.computationalcomplexity.org/2005/12/what-is-ppad.html 

Osborne, M., & Rubinstein, A. (1994). A course in game theory. Cambridge, Mass.: MIT Press.

Rubin, J., & Watson, I. (2011). Computer poker: A review. Artificial Intelligence, 175(5-6), 958-987. doi:http://dx.doi.org/10.1016/j.artint.2010.12.005

Sandholm, T. (2010). The state of solving large incomplete-information games, and application to poker. AI Magazine, 31(4), 13-32. Retrieved from http://search.proquest.com/docview/847665029?accountid=14541 

von Stengel, B. (2010). Computation of Nash equilibria in finite games: Introduction to the symposium. Economic Theory, 42(1), 1-7. doi:http://dx.doi.org/10.1007/s00199-009-0452-2