Hello Reddit! We are Noam Brown, a PhD student, and Tuomas Sandholm, a professor, in the Computer Science Department at Carnegie Mellon University. We do research on developing AIs that can reason about hidden information, which is widespread in real-world strategic interactions. Earlier this year we built Libratus, the first and only AI to defeat top humans in no-limit poker. We played four of the world's best pros in a 120,000-hand, 20-day Brains vs. AI match of heads-up no-limit Texas hold'em, with a prize pool of $200,000. The AI won the match decisively, winning a combined $1.8 million (at $50/$100 blinds). The victory was statistically significant, with a p-value of 0.0002. The details of the bot were just published in Science!
We're here to talk about Libratus, the competition, what this means for the future of AI, and any other questions you might have.
We'll be back at 9 am to answer your questions. Ask us anything!
EDIT: We're closing the AMA. Thanks for the questions everyone!
Thank you for doing this AMA!
What are some of the difficulties of poker that make it unique compared to other games that have been beaten by AI, such as Go and chess?
Where does the AI gain advantages over human players? Is it better at recognising bluffs? Does it bluff better itself? Does it follow statistics better?
Poker has hidden information. That makes the game fundamentally different from past games that AI has conquered such as checkers, chess, and Go, and requires completely different techniques.
The main reason for this is that in poker it is no longer possible to determine the optimal strategy for a situation in isolation. If I show you an endgame in chess, you don't have to know anything about how you got there, or what other situations could have come up. The only thing that matters is that board and the states that can be reached from that point on. In a game like poker, that's not true. The optimal strategy in one situation depends on the optimal strategy for the entire game, so you always have to consider the strategy for the whole game. We give a pretty detailed example of this in Section 2 of this paper.
The AI is better at a few things. In particular:
1) It's way better at determining optimal probability distributions. Humans can only manage to balance across 1 or 2 bet sizes. The AI can carefully balance across 10.
2) The AI is way better at determining which bet size to use. Humans typically bet between 0.5x and 1x pot. The AI can bet between 0.1x and 20x pot, and will use extreme sizes where humans would have never used them before.
The AI actually doesn't look at statistics of its opponent. It never looked at their cards, even when they were revealed.
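Balancing across many bet sizes just means playing a mixed strategy: a probability distribution over actions rather than a single deterministic choice. A minimal sketch of sampling from such a distribution is below; the sizes and probabilities are made up for illustration and are not Libratus's actual output.

```python
import random

# Hypothetical mixed strategy over bet sizes (as multiples of the pot).
# These probabilities are illustrative only, not Libratus's actual strategy.
bet_size_strategy = {
    0.25: 0.10,
    0.50: 0.20,
    0.75: 0.25,
    1.00: 0.20,
    2.00: 0.15,
    5.00: 0.10,
}

def sample_bet_size(strategy):
    """Sample a bet size according to the mixed-strategy probabilities."""
    sizes = list(strategy)
    weights = list(strategy.values())
    return random.choices(sizes, weights=weights, k=1)[0]

pot = 1000
bet = sample_bet_size(bet_size_strategy) * pot
```

The point of mixing is that an opponent who sees only the chosen action cannot tell whether a given bet size signals strength or a bluff, because every size is used with both.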
Hi, congratulations on getting Science on your CV. My question is what language did you use for programming and how computationally intensive is the code (time and memory - wise). I am interested in using your paper for making my own code for different application of course.
Thanks! We used C++ for the most important components. We used a lot of resources: about 196 * 24 cores and about 196 * 128 GB of memory. The core idea is pretty elegant (I think, anyway) and easy to understand. We provide pseudocode in the paper. But there are a lot of poker-specific implementation details that make it run orders of magnitude faster in that domain. Those optimizations take a while to understand and implement correctly. That said, if you are looking at other domains, then those improvements are probably not relevant.
What are other potential applications for Libratus? In poker you have known unknowns. How well will Libratus's performance translate to scenarios where you may have unknown unknowns?
There are a lot! I really see this research as fundamental to addressing the problem of hidden information in strategic settings. Prior to this line of research, there really weren't any good answers to this problem.
Most real-world strategic interactions involve hidden information (e.g., negotiations, security situations, auctions, military scenarios, financial markets, etc.). I think this research will eventually be applied to all of these domains, though the timeline isn't clear. It could be 10 years or it could be 50 for some of those domains.
As you point out, the major challenge is converting these real-world domains into well-defined models with clearly defined actions and payoffs. That is going to be a challenge, but one that I think can eventually be overcome.
To add to u/daburndo's question, I wanted to ask whether you got into facial recognition and the psychological/physiological effects of lying, as well as creating facial expressions and simulating different tones of voice, to bring the AI's ability into the more social, less mathematical aspect of poker.
The AI does not look for any "tells" in its opponent or try to exploit them. Instead, it is trying to approximate a perfect strategy (called a "Nash equilibrium"), because if it plays the Nash equilibrium, it cannot lose in expectation. Nash equilibria are proven to exist in all games, but they can be very difficult to find.
I think there's a lot of interesting work to be done in looking at facial expressions, etc., but it's not the kind of research we've been focused on. It also would probably not be helpful in beating top humans, because they are very good at hiding their tells. For example, they could just wear a bag over their head, or take the exact same amount of time to make every decision. The far more important thing is to ensure the bot doesn't give off tells that the humans could figure out (e.g., the bot never bluffing).
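Libratus's equilibrium finding is based on a form of counterfactual regret minimization. A much simpler member of that family, regret matching, can be sketched in rock-paper-scissors: each player plays actions in proportion to their accumulated positive regret, and in self-play the average strategies converge toward the Nash equilibrium of (1/3, 1/3, 1/3). This is a toy illustration of the idea, not Libratus's algorithm.

```python
import random

ACTIONS = 3  # rock, paper, scissors

# Payoff for the row action vs. the column action: +1 win, 0 tie, -1 loss.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def current_strategy(regret_sum):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regret_sum]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS  # uniform when no positive regret

def train(iterations=100_000):
    regret = [[0.0] * ACTIONS for _ in range(2)]
    strategy_sum = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [current_strategy(regret[p]) for p in range(2)]
        actions = [random.choices(range(ACTIONS), weights=strategies[p])[0]
                   for p in range(2)]
        for p in range(2):
            opp = actions[1 - p]
            # Regret of each alternative action vs. what was actually played.
            for a in range(ACTIONS):
                regret[p][a] += PAYOFF[a][opp] - PAYOFF[actions[p]][opp]
            for a in range(ACTIONS):
                strategy_sum[p][a] += strategies[p][a]
    # The *average* strategy over all iterations converges to equilibrium.
    return [[s / iterations for s in strategy_sum[p]] for p in range(2)]
```

The key property, as the answer above notes, is that a player following the equilibrium strategy cannot lose in expectation no matter what the opponent does.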
In online gambling, do you foresee a way to distinguish good human players from AI? If not, is this the end of online poker?
I think there are ways to distinguish humans from AIs, and I've heard from poker pros that the online poker sites are quite active in trying to detect and prevent AIs.
Keep in mind, though, the sites don't have to be 100% successful in their bot detection. They just have to detect bots with a high enough probability that it is no longer profitable to run bots online. That's a far lower bar, and probably achievable, considering that a bot's entire bankroll is confiscated when it is caught.
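The argument above is a simple expected-value calculation. A back-of-envelope sketch, with all numbers made up for illustration: because detection costs the entire bankroll, even a small detection probability can make botting unprofitable.

```python
def bot_expected_profit(expected_winnings, detection_prob, bankroll):
    """Expected profit of running a bot for one session when detection
    confiscates the entire bankroll. All inputs are hypothetical."""
    return (1 - detection_prob) * expected_winnings - detection_prob * bankroll

# With a $200,000 bankroll and $1,000 in expected winnings, even a 1%
# chance of being caught makes the bot lose money in expectation.
ev = bot_expected_profit(1_000, 0.01, 200_000)  # negative
```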
What I would like to see is an AI capable of defeating players in games like Rocket League. How far away do you think we are from this capability?
I think this line of research is going to be extremely relevant to making AIs for all sorts of games involving hidden information. I think some of the next steps might be games like Dota 2 and StarCraft, as well as card games like Hearthstone. But I think in the long run we'll see it being used in all imperfect-information games, and probably pretty quickly. I'd say 10-15 years.
How does the AI handle nonverbal/non-game information cues? Or does it? If not, where does it best make up for that disadvantage?
We specifically agreed with the humans that Libratus would not use any such information, not even timing tells (that is, how long the human thinks in any given situation). We did allow the humans to use any such information -- in particular any timing tells they may pick up on -- during the match.
Using such cues is a form of exploitation. Game-theoretic play does not use such cues, and exact game-theoretic play would be unbeatable despite not using them. Libratus is based on approximating game-theoretic play (it does not solve for an exact Nash equilibrium because the game is too large for an exact solution to be computed; the game has 10^161 different situations that a player can face).
What are the inputs that the AI handles? Does it remember cards from previous hands? Does it attempt to count cards?
The AI takes as input the cards it has, the cards on the board (and the order they were revealed), and all public actions that have occurred in that hand.
Counting cards is for Blackjack, not poker, but it does implicitly consider what cards are left in the deck.
It doesn't look at cards from previous hands. That's not necessary to find an optimal strategy in a game like poker.
Is the AI aggressive or passive? Does it avoid large loss possibilities or does it make strategic moves for large pot exchanges?
The AI is risk-neutral. It always attempts to maximize its expected value. That said, it appears to be more passive than humans in some spots, and far more aggressive in others. In particular, it will frequently use bet sizes that are far larger than what is typical among humans.
Does it adapt to a losing run, or is it set "emotion-less"?
It's emotionless. It will always try to maximize its expected value.
What's your favorite lunch spot in Oakland?
I would love to see your AI play the Canadian poker AI. Any plans on making this happen? What would happen?! Any plans on adding to the number of players in the game?
For the purposes of this answer, we will assume that "the Canadian poker AI" refers to DeepStack.
No results have been reported on DeepStack beating any prior AIs. In particular, results of it playing against our Baby Tartanian8, which won the Annual Computer Poker Competition in early 2016, and which was available for benchmarking when DeepStack played against humans in November/December 2016, have not been published. In December 2017, we again offered to have a match between DeepStack and Baby Tartanian8, but the DeepStack team refused. So, while there has been a lot of PR about DeepStack, there is no indication of it being even the second-best AI for the game.
Libratus beats Baby Tartanian8 -- the best prior AI -- by a very large margin (63 mbb/game), see http://science.sciencemag.org/content/early/2017/12/15/science.aao1733. Organizing any further AI-to-AI matches with Libratus would be quite expensive in terms of the computational resources, and we don't see the point since we have already beaten the vetted best prior AI and top human specialist professionals.
Where do you see AI 100 years from now?