EDMONTON — Press “enter,” dealer — scientists have taught a computer how to play unbeatable poker.
While the news may sadden the hearts of rec-room card sharps everywhere, the winners in this game are programmers trying to do everything from improving public security to helping doctors treat patients with diabetes.
“We should be able to use these algorithms in any well-defined problem,” said Michael Bowling, the University of Alberta computer scientist who co-authored a paper in the journal Science that details how the program for two-handed, fixed-bet Texas Hold ‘Em can’t do worse than break even.
Scientists in the field of game theory long ago taught computers to play games such as checkers and chess. But poker has remained elusive because it’s a so-called “imperfect information” game. A player has to make decisions without knowing all the data, such as what the other player is holding.
“This game has been, historically, an important challenge problem,” Bowling said. “Poker is one of the games that really motivated the whole founding of the field of game theory back in the ’20s.”
Bowling’s team made its breakthrough by refining a previously developed technique called counterfactual regret minimization that allows a computer to look back at previous hands and learn from its mistakes. Although that sounds similar to how humans improve, the computer used here became a one-player Las Vegas.
“It spent two months playing billions and billions of hands of poker against itself to find the perfect strategy,” said Bowling. “The strategy is 1,000 times larger than all the English-language Wikipedia.”
It’s unlikely to be of much use at anyone’s Saturday night game.
“You have to memorize a 10-terabyte table of probabilities.”
A terabyte is a trillion bytes, or a one followed by 12 zeros.
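For readers who want to see the look-back-and-learn idea in miniature, the following is a rough sketch of plain counterfactual regret minimization applied to Kuhn poker, a three-card toy game often used for teaching. It is not the team's refined method and not Texas Hold ’Em; the game choice, the names (InfoSet, cfr, train) and the iteration count are all assumptions made for illustration. The program plays both seats, tallies how much better each action would have done than the mix it actually used, and shifts future play toward the actions it regrets not taking.

```python
from itertools import permutations

ACTIONS = "pb"      # p = pass (check or fold), b = bet (or call)
CARDS = (1, 2, 3)   # Jack, Queen, King in Kuhn poker (assumed toy game)


class InfoSet:
    """Everything a player knows at a decision: own card plus betting so far."""

    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]
        self.strategy = [0.5, 0.5]

    def refresh(self):
        # Regret matching: play each action in proportion to its positive regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        self.strategy = [p / total for p in positive] if total > 0 else [0.5, 0.5]

    def average_strategy(self):
        # The average over all iterations is what approaches equilibrium.
        total = sum(self.strategy_sum)
        return [s / total for s in self.strategy_sum] if total > 0 else [0.5, 0.5]


def payoff(cards, history):
    """Payoff to the player about to act if the hand is over, else None."""
    if len(history) < 2:
        return None
    player = len(history) % 2
    wins = cards[player] > cards[1 - player]
    last_two = history[-2:]
    if last_two == "pp":   # both checked: showdown for the antes
        return 1 if wins else -1
    if last_two == "bb":   # bet and call: showdown for antes plus bets
        return 2 if wins else -2
    if last_two == "bp":   # bet and fold: the bettor takes the ante
        return 1
    return None


def cfr(nodes, cards, history, reach):
    """Walk the hand, returning the expected payoff to the player to act."""
    result = payoff(cards, history)
    if result is not None:
        return result
    player = len(history) % 2
    node = nodes.setdefault(f"{cards[player]}{history}", InfoSet())
    utils = [0.0, 0.0]
    for a in range(2):
        next_reach = list(reach)
        next_reach[player] *= node.strategy[a]
        # The game is zero-sum, so the opponent's payoff is negated.
        utils[a] = -cfr(nodes, cards, history + ACTIONS[a], next_reach)
    node_util = sum(p * u for p, u in zip(node.strategy, utils))
    for a in range(2):
        # Regret is weighted by how likely the opponent was to get here.
        node.regret_sum[a] += reach[1 - player] * (utils[a] - node_util)
        node.strategy_sum[a] += reach[player] * node.strategy[a]
    return node_util


def train(iterations):
    nodes = {}
    deals = list(permutations(CARDS, 2))
    for _ in range(iterations):
        # Every deal is equally likely, so the constant 1/6 chance factor
        # is left out; it scales all regrets by the same amount.
        for cards in deals:
            cfr(nodes, cards, "", [1.0, 1.0])
        for node in nodes.values():
            node.refresh()
    return nodes


if __name__ == "__main__":
    trained = train(5000)
    for key in sorted(trained):
        check, bet = trained[key].average_strategy()
        print(f"{key:>4}  pass={check:.2f}  bet={bet:.2f}")
```

Each printed row is keyed by the card held and the betting so far, so "1p" is the second player holding a Jack after a check. In the known solution to Kuhn poker that player bluffs about a third of the time, and the sketch should drift toward that figure as iterations grow. The toy table has 12 rows; the real one, per Bowling, runs to 10 terabytes.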
But the point was never to become an unbeatable online poker star. The same process that taught the computer when to hold ’em and when to fold ’em can be transferred to any problem with well-defined rules and outcomes, many options and imperfect information – security against terrorism, for example.
“We run patrols, we do searches — we have these tools at our disposal, but how do we deploy them? We want to find a strategy that’s unbeatable.
“What we’ve done is shown that we can do these game theoretic analyses at a scale that hasn’t been done before – at a really enormous complexity. That means that we can start looking at problems in that security sphere.”
Game theory is already being used to help schedule air marshals on commercial flights in the United States.
Bowling’s team is also working with diabetes researchers to see if the computer poker work can help manage the disease.
Doctors and patients typically come up with a plan to adjust insulin intake to food consumption, exercise and other variables. But those variables can change. Nor do doctors have any guarantee of how well the patient will follow the treatment plan.
“Building a policy that is robust to those uncertainties is not that different from building a poker policy that’s robust to not knowing what cards the opponent has,” Bowling said.
“If we could have a decision support system that could maybe help the patient tweak their formula on their own, or even assist the doctor to do it faster, then we could improve the effectiveness of these treatment policies.”
Despite its larger ambitions, Bowling’s paper holds lessons for the casual player, although they will already be familiar to the experienced.
— Avoid simply calling bets. If you’re in, you’re probably best to raise.
— Don’t make the maximum allowed bet in the first round.
— Hang in there. The computer routinely played weaker hands than most human players.
“What the poker programs have always suggested is that the human players are too conservative at this game.”