Article Science

An artificial intelligence program developed by Carnegie Mellon University in collaboration with Facebook AI has defeated leading professionals in six-player no-limit Texas hold'em poker, the world's most popular form of poker.

The AI, called Pluribus, defeated poker professional Darren Elias, who holds the record for most World Poker Tour titles, and Chris "Jesus" Ferguson, winner of six World Series of Poker events. Each pro separately played 5,000 hands of poker against five copies of Pluribus.

In another experiment involving 13 pros, all of whom have won more than $1 million playing poker, Pluribus played five pros at a time for a total of 10,000 hands and again emerged victorious.

"Pluribus achieved superhuman performance at multi-player poker, which is a recognized milestone in artificial intelligence and in game theory that has been open for decades," said Tuomas Sandholm, Angel Jordan Professor of Computer Science, who developed Pluribus with Noam Brown, who is finishing his Ph.D. in Carnegie Mellon's Computer Science Department as a research scientist at Facebook AI. "Thus far, superhuman AI milestones in strategic reasoning have been limited to two-party competition. The ability to beat five other players in such a complicated game opens up new opportunities to use AI to solve a wide variety of real-world problems."

"Playing a six-player game rather than head-to-head requires fundamental changes in how the AI develops its playing strategy," said Brown, who joined Facebook AI last year. "We're elated with its performance and believe some of Pluribus' playing strategies might even change the way pros play the game."

Pluribus' algorithms produced some surprising features in its strategy. For instance, most human players avoid "donk betting" -- that is, ending one round with a call but then starting the next round with a bet. It's seen as a weak move that usually doesn't make strategic sense. But Pluribus placed donk bets far more often than the professionals it defeated.

"Its major strength is its ability to use mixed strategies," Elias said last week as he prepared for the 2019 World Series of Poker main event. "That's the same thing that humans try to do. It's a matter of execution for humans -- to do this in a perfectly random way and to do so consistently. Most people just can't."

Pluribus registered a solid win with statistical significance, which is particularly impressive given its opposition, Elias said. "The bot wasn't just playing against some middle-of-the-road pros. It was playing some of the best players in the world."
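The "mixed strategies" Elias describes amount to choosing among actions at random with fixed probabilities, so that the same situation does not always produce the same move. The Python sketch below illustrates only that idea. The action names, the probabilities, and the helper functions are hypothetical and are not taken from Pluribus.

```python
import random
from collections import Counter

# Hypothetical mixed strategy for a single decision point. The actions and
# probabilities are illustrative assumptions, not values used by Pluribus.
MIXED_STRATEGY = {"check": 0.5, "small_bet": 0.3, "large_bet": 0.2}


def sample_action(strategy, rng):
    """Draw one action at random according to the strategy's probabilities."""
    actions = list(strategy)
    weights = [strategy[action] for action in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def empirical_frequencies(strategy, n_hands, seed=0):
    """Play the same spot n_hands times and report how often each action was chosen."""
    rng = random.Random(seed)
    counts = Counter(sample_action(strategy, rng) for _ in range(n_hands))
    return {action: counts[action] / n_hands for action in strategy}


if __name__ == "__main__":
    observed = empirical_frequencies(MIXED_STRATEGY, 10_000)
    for action, freq in observed.items():
        print(f"{action}: target {MIXED_STRATEGY[action]:.2f}, observed {freq:.3f}")
```

Over many hands a random number generator holds the observed frequencies close to the targets. That consistency is the part Elias says most human players cannot manage.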
Michael "Gags" Gagliano, who has nearly $2 million in career earnings, also competed against Pluribus.

"It was incredibly fascinating getting to play against the poker bot and seeing some of the strategies it chose," said Gagliano. "There were several plays that humans simply are not making at all, especially relating to its bet sizing. Bots/AI are an important part in the evolution of poker, and it was amazing to have first-hand experience in this large step toward the future."

All of the AIs that displayed superhuman skills at two-player games did so by approximating what's called a Nash equilibrium. Named for the late Carnegie Mellon alumnus and Nobel laureate John Forbes Nash Jr., a Nash equilibrium is a pair of strategies (one per player) where neither player can benefit from changing strategy as long as the other player's strategy remains the same. Although the AI's strategy guarantees only a result no worse than a tie, the AI emerges victorious if its opponent makes miscalculations and can't maintain the equilibrium.

In a game with more than two players, playing a Nash equilibrium can be a losing strategy. So Pluribus dispenses with theoretical guarantees of success and develops strategies that nevertheless enable it to consistently outplay opponents.

Pluribus first computes a "blueprint" strategy by playing six copies of itself, which is sufficient for the first round of betting. From that point on, Pluribus does a more detailed search of possible moves in a finer-grained abstraction of the game. It looks ahead several moves as it does so, but does not need to look all the way to the end of the game, which would be computationally prohibitive. Limited-lookahead search is a standard approach in perfect-information games, but is extremely challenging in imperfect-information games. A new limited-lookahead search algorithm is the main breakthrough that enabled Pluribus to achieve superhuman performance in multi-player poker.

Specifically, the search is an imperfect-information-game solve of a limited-lookahead subgame. At the leaves of that subgame, the AI considers five possible continuation strategies that each opponent and the AI itself might adopt for the rest of the game. The number of possible continuation strategies is far larger, but the researchers found that their algorithm only needs to consider five continuation strategies per player at each leaf to compute a strong, balanced overall strategy.

Pluribus also seeks to be unpredictable. For instance, betting would make sense if the AI held the best possible hand, but if the AI bets only when it has the best hand, opponents will quickly catch on. So Pluribus calculates how it would act with every possible hand it could hold and then computes a strategy that is balanced across all of those possibilities.

Though poker is an incredibly complicated game, Pluribus made efficient use of computation. AIs that have achieved recent milestones in games have used large numbers of servers and/or farms of GPUs; Libratus used around 15 million core hours to develop its strategies and, during live game play, used 1,400 CPU cores. Pluribus computed its blueprint strategy in eight days using only 12,400 core hours and used just 28 cores during live play.

https://www.sciencedaily.com/releases/2019/07/190711141343.htm

COMMENT:

Poker remains one of the hardest games people can play; it requires strategy and perfect decision making. Then there's this article showing an artificial intelligence playing poker. The article says that the artificial intelligence, called Pluribus, was able to defeat a number of professional poker players who hold a number of titles.

Though poker is an incredibly complicated game, Pluribus made efficient use of computation. AIs that have achieved recent milestones in games have used large numbers of servers and/or farms of GPUs; Libratus used around 15 million core hours to develop its strategies and, during live game play, used 1,400 CPU cores. Pluribus computed its blueprint strategy in eight days using only 12,400 core hours and used just 28 cores during live play.

I myself play poker with my friends but never expected that an artificial intelligence could conquer the world of poker. Pluribus is different because, more or less, it is analyzing the effect of bluffing (that is, betting with a weak hand) rather than selling competitors on the strength of what it's holding. "The bot doesn't view it as deceptive or lying in any way, it just views it as 'This is the action that's going to make me the most money in this situation,'" Brown said.
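To put the computation figures quoted above in proportion, the short Python sketch below divides one set of numbers by the other. It uses only the figures stated in the text. The average-core estimate additionally assumes that the eight days were continuous wall-clock time, which the text does not state.

```python
# Figures as quoted in the text.
LIBRATUS_TRAINING_CORE_HOURS = 15_000_000  # "around 15 million core hours"
LIBRATUS_LIVE_CORES = 1_400
PLURIBUS_TRAINING_CORE_HOURS = 12_400
PLURIBUS_LIVE_CORES = 28
PLURIBUS_TRAINING_DAYS = 8

# How many times more computation Libratus used than Pluribus.
training_ratio = LIBRATUS_TRAINING_CORE_HOURS / PLURIBUS_TRAINING_CORE_HOURS
live_ratio = LIBRATUS_LIVE_CORES / PLURIBUS_LIVE_CORES

# Assumption: the eight days were continuous wall-clock time, so the average
# number of cores kept busy is core hours divided by elapsed hours.
average_training_cores = PLURIBUS_TRAINING_CORE_HOURS / (PLURIBUS_TRAINING_DAYS * 24)

if __name__ == "__main__":
    print(f"Training: Libratus used about {training_ratio:,.0f} times the core hours")
    print(f"Live play: Libratus used {live_ratio:.0f} times as many cores")
    print(f"Implied average cores for the blueprint: about {average_training_cores:.0f}")
```

On these figures, Libratus used roughly 1,200 times the core hours for strategy development and 50 times as many cores during live play.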
