Sunday, April 26, 2015

For the first time, artificial intelligence will play poker against top human players

To the A.I., a simple game of poker is a tree of all possible events. Image from Heads-up Limit Hold’em Poker is Solved by Michael Bowling, Neil Burch, Michael Johanson, and Oskari Tammelin.
Claudico, an artificial intelligence program developed by Professor Tuomas Sandholm at Carnegie Mellon, will be playing heads-up no-limit Texas hold’em against four top human players. 80,000 hands will be played over the course of fourteen days, from April 24th to May 8th, and the matches will be streamed every day on Twitch.tv starting at 11:30am in the morning. The difference between poker and games like chess or checkers is the amount of information available to the players. Poker is a game of imperfect information, where the knowledge of the opponents' hand is unknown and the cards dealt are based on chance. As opposed to chess, a game of perfect information, both players know every single action the opponent has taken against them. Poker has become the basis for imperfect information games because there are already broad skill levels for computers to play against, with the added layer of complexity for interpreting other players' actions while simultaneously hiding your own.

A solution concept for a game like poker is called the Nash equilibrium. The definition is simple, for non-cooperative games, the strategy each player has chosen to play the game is the best strategy they can employ assuming they know the strategies of others. Players will receive no benefit from changing their strategies as the strategy they already chosen is the best strategy for them. Finding the Nash equilibrium, the best possible way of playing the game, using linear programming takes an exponential amount of time, O(2nk). This is a simpler time complexity than brute forcing a traveling salesman problem (TSP) but still takes an incredible amount of time because it's not polynomial. However, finding the Nash equilibrium does not fall under NP-complete. Nash's theorem concludes that a solution is guaranteed to exist, while it's unclear whether NP-complete problems can have a polynomial solution at all. So instead of being grouped under NP-complete, Nash equilibrium falls under another subclass of NP call PPAD, where solving the problem may be difficult but the solution is guaranteed to exist. This applies for finding the Nash equilibrium on non-cooperative games with k-players, provided k is a finite integer greater or equal to two.

An A.I. plays the game of love. Image courtesy of https://xkcd.com/601/.
The first algorithm to solve imperfect-information games in polynomial time was developed in 2003 and applied to poker in 2005. As of January 2015, heads-up limit Texas hold 'em poker has been declared weakly solved by Bowling et al. (2015). The algorithm can now win or draw against all possible moves the opponent makes from the very beginning. It does so by computing a very good approximation of the Nash equilibrium strategy. With the tournament just beginning, will Claudico follow the footsteps of Deep Blue and Watson and mark the next milestone in artificial intelligence? 

Edit: I mistook exponential as factorial and it has been corrected.

References

The official website for Brains vs Artificial Intelligence 

Bowling, M., Burch, N., Johanson, M., & Tammelin, O. (2015). Heads-up limit hold'em poker is solved. Science, 347(6218), 145-149. doi:http://dx.doi.org/10.1126/science.1259433

Daskalakis, C., Goldberg, P. W., & Papadimitriou, C. H. (2009). The complexity of computing a nash equilibrium. Association for Computing Machinery.Communications of the ACM, 52(2), 89. Retrieved from http://search.proquest.com/docview/237061886?accountid=14541 

Fortnow, L. (2005, December 15). Computational Complexity. Retrieved April 26, 2015, from http://blog.computationalcomplexity.org/2005/12/what-is-ppad.html 

Osborne, M., & Rubinstein, A. (1994). A course in game theory. Cambridge, Mass.: MIT Press.

Rubin, J., & Watson, I. (2011). Computer poker: A review. Artificial Intelligence, 175(5-6), 958-987. doi:http://dx.doi.org/10.1016/j.artint.2010.12.005

Sandholm, T. (2010). The state of solving large incomplete-information games, and application to poker. AI Magazine, 31(4), 13-32. Retrieved from http://search.proquest.com/docview/847665029?accountid=14541 

von Stengel, B. (2010). Computation of nash equilibria in finite games: Introduction to the symposium. Economic Theory, 42(1), 1-7. doi:http://dx.doi.org/10.1007/s00199-009-0452-2

Monday, April 20, 2015

An introduction to the traveling salesman problem

There's always a relevant XKCD comic. From https://xkcd.com/399/
What's the most efficient way of going through a set of cities once and coming back to where you started? As simple as it sounds, iterating through every possible combination would be next to impossible if the list of cities is big enough. If there's 100 cities and you pick one as the starting point, there's still 99 more cities to choose from. The number of combinations would be 100(100-1)(100-2)... until there's only one choice left. The amount of possible solutions increases at a staggering rate as the amount of cities increase. Researchers have worked on trying to find a way to solve it as fast as possible. To understand the traveling salesman problem (TSP) in terms of computer science, we must first understand the concepts of time complexity.

Given an input n, the time complexity is determined by how much of the given input, n, we have to go through. For example, if I were to look through 100 boxes one by one, the worst case scenario would be finding what I was looking for after going through all 100 boxes. The time complexity of the example would be written as O(n) in the Big O notation because given the input, n = 100, and my search method, I need to look through all of them. Although I can find whatever it is in the first box I look into, time complexity is usually determined by the worst case scenario.

These time complexities are grouped into classes and the two important classes regarding TSP are classes P and NP. Problems in class P are solvable in polynomial time with a deterministic algorithm where one input can only have one output. Polynomial time meaning time complexities such as n, nlog(n), n10, etc. Problems in class NP, non-deterministic polynomial time, cannot be solved in polynomial time with a deterministic algorithm, but solutions can be checked in polynomial time. TSP falls under the class NP because while solving it is difficult we can verify the solution by checking if the solution goes through every city just once which takes a polynomial amount of time. The goal now is to find the algorithm that can solve TSP within polynomial time.

Why is this important? TSP is just one small droplet in the overarching problem of P versus NP. If a solution can be found for the decision version of TSP, this would prove P = NP meaning all NP-complete problems can be solved within a minimal amount of time. NP-complete problems are problems with no known shortcuts. This breakthrough will change the works of medical sciences, mathematics, engineering, transportation, and computer science. A lot of fundamental problems are NP-complete such as searching through DNA sequences, optimal protein folding, and shortening mathematical proofs. But even after decades of research, P = NP still hasn't been proven and it's becoming more of a fantasy. For more information on P and NP problems or an overview on P versus NP in general, Lance Fortnow wrote a great article called "The status of the P versus NP problem" in the Communications of the ACM magazine. He also has a much better blog on dealing with computational complexity at http://blog.computationalcomplexity.org/.

References

Lecture by Erik Demaine on Computational Complexity

Fortnow, L. (2009). The status of the P versus NP problem. Association for Computing Machinery.Communications of the ACM, 52(9), 78. doi:http://dx.doi.org/10.1145/1562164.1562186

Hamburger, H. & Richards, D. (2002). Logic and language models for computer science. Upper Saddle River, N.J: Prentice Hall.