Original Title: Modeling Poker -3part

Uploaded by Zlatomir_1992

Creating poker models

Attribution Non-Commercial (BY-NC)

Introduction

This is Part 3 of the article series "Modeling Poker" where we'll study poker theory analytically using mathematical modeling and simplified poker games (so-called toy games).

In Part 1 and Part 2 we solved the following version of the AKQ game over 1/2 street with a fixed-limit betting structure:

Rules for the AKQ game:
- We have two players: Alice (out of position) and Bob (in position)
- At the start of the game both players put a 1 bb ante into the pot
- Both players are dealt one random card from the AKQ deck
- Alice checks "in the dark"
- Bob can now check and see a showdown, or he can bet 1 bb
- If Bob bets, Alice can fold, or she can call and see a showdown
- When the betting round is over and nobody has folded, the highest card wins at showdown

We then solved the game under the assumption that both players are playing perfectly. We deduced the following optimal strategies for Alice and Bob:

Alice
- Always check-calls A
- Check-calls K 1/3 of the time
- Always check-folds Q

Bob
- Always bets A for value
- Always checks behind with K
- Bluffs Q 1/3 of the time

In Part 3 we'll continue to work on this game. This time we'll experiment with the solution found previously and see how Alice and Bob can increase their EV if they play against an opponent that makes mistakes. If one of them deviates from optimal play, the other can increase EV by also deviating from optimal play. We then move from optimal play to exploitive play.

In Part 1 and Part 2 we found optimal strategies for both Alice and Bob, and we calculated the value of the game for each of them. We found that Bob made +1/18 bb per game, which is 100/18 = 5.56 bb/100 using standard poker jargon. Since the game is heads-up with no rake, it's a zero sum game. Alice's win rate is then the negative of Bob's win rate, namely -5.56 bb/100.

Both Alice and Bob are playing optimally. An optimal strategy is what we get when we're trying to play perfectly against an opponent who is also trying to play perfectly against us. We can imagine an evolutionary process where each player continually adjusts to the other player to exploit his or her mistakes maximally. At the same time each player is trying to defend against the other player's attempts to exploit. The process converges towards a state where neither player can do anything to increase his or her EV for the game. Any attempt to do so will create an opportunity for the other player to increase his or her EV even more.

We then end up with an optimal strategy pair. The pair is made up of two optimal strategies, one for each player. If one player deviates from his optimal strategy, the other player can increase EV by adapting to this deviation. Thus, neither player has any incentive to deviate from optimal play, since:
1. They can't profit from it if their opponent continues to play optimally.
2. They risk losing, since their opponent has the option to adjust to exploit them.

Therefore, deviating from optimal strategy for no reason is a "negative freeroll". In the best case scenario our EV stays the same (if our opponent continues to play optimally). In the worst case our EV drops (if he deviates from optimal play to exploit our deviation from optimal play).

Now we'll let Alice and Bob make various mistakes in the AKQ game, and then we'll see how the other player can increase EV by adjusting to exploit the mistakes. To measure the effects of the adjustments, we'll look at Bob's EV in the game. We'll use an Excel spreadsheet to calculate Bob's EV as a function of the strategies employed by him and Alice:

Link for downloading the spreadsheet (right click and choose "Save as .."): AKQ-game.xls

We have used the following abbreviations:
- c/c = check-call
- c/f = check-fold

We now simply fill in the percentages for Alice's and Bob's folding, checking, calling and betting with each hand (the top part of the spreadsheet). Then we get Bob's EV calculated for us (bottom part). We also get the individual EV contributions from each of the 6 possible scenarios that can occur (Alice has A/Bob has K, Alice has A/Bob has Q, etc). The percentages that are most interesting for us to tweak (how often Bob bluffs with Q and how often Alice calls with K) are marked with red.

In the picture above we have plugged in the optimal strategies for both players. As expected, we find Bob's EV to be +5.56 bb/100.

Now we'll introduce systematic mistakes (= leaks) for both Alice and Bob, where we define a mistake as any deviation from optimal play. Then we'll let the other player adjust to exploit the mistakes.
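For readers without Excel, the same calculation can be sketched in a few lines of Python. This is not the author's spreadsheet, only a minimal reconstruction from the rules above; the function name `bob_ev` and its two parameters are made up for illustration, and the parts of the strategies that we never tweak (Bob always bets A and checks K, Alice always calls with A and folds Q) are hard-coded as an assumption.

```python
from fractions import Fraction

def bob_ev(bluff_q, call_k):
    """Return (total EV, per-deal contributions) for Bob, in bb per game.

    bluff_q: how often Bob bets Q; call_k: how often Alice calls with K.
    Assumption: everything else is fixed as in the article (Bob always
    bets A and checks K; Alice always calls with A and always folds Q).
    """
    b, c = Fraction(bluff_q), Fraction(call_k)
    # Bob's net result in bb for each (Alice's card, Bob's card) deal.
    net = {
        ("A", "K"): Fraction(-1),   # Bob checks behind and loses his ante
        ("A", "Q"): -1 - b,         # a bluff always gets called and loses 1 bb more
        ("K", "A"): 1 + c,          # the value bet wins 1 bb more when called
        ("K", "Q"): b * (-2 * c + (1 - c)) - (1 - b),  # bluff called / bluff works / check and lose
        ("Q", "A"): Fraction(1),    # Alice folds Q to the bet
        ("Q", "K"): Fraction(1),    # Bob checks behind and wins
    }
    # The 6 deals are equally likely, so each contributes 1/6 of its result.
    parts = {deal: value / 6 for deal, value in net.items()}
    return sum(parts.values()), parts

# Optimal strategies for both players: bluff Q 1/3, call K 1/3.
total, parts = bob_ev(Fraction(1, 3), Fraction(1, 3))
print(f"Bob's EV: {float(total * 100):.2f} bb/100")
for (alice, bob), value in parts.items():
    print(f"Alice {alice} / Bob {bob}: {float(value * 100):+.2f} bb/100")
```

Exact fractions are used so that 1/3 frequencies don't pick up rounding noise; pass `Fraction(2, 5)` rather than `0.4` for the same reason.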

Assume Alice is a calling station. She of course folds her pure "air" (the worthless hand Q), but she now always calls with the bluffcatcher K. Let's first see what happens when Bob continues to play optimally:

Alice always calls with K, but Bob's EV stays the same. Is this unexpected? No, since the definition of an optimal strategy is that our opponent can not change our EV by changing his or her strategy. Alice now attempts to increase her EV by calling more. But since Bob is valuebetting and bluffing with perfect balance, she achieves nothing and the EV does not change.

Alice is now making a mistake. However, Bob does not profit from this mistake if he continues to play optimally. But Alice has created a leak in her game that Bob can attack. So what should he do? In real poker there are two intuitively obvious adjustments to make against an unbluffable calling station:
- Bet more hands for value
- Bluff less

Here Bob doesn't have any more value hands to use (the only hand he can bet profitably is A), so his adjustment must be to decrease his bluffing percentage with Q. Since Alice always calls Bob's bluffs, it's obviously correct for Bob to never bluff at all. He is getting 2 : 1 in pot-odds on a bluff, but he always gets called (Alice can have A or K when Bob has Q, and she always calls with both of them). So bluffing becomes pointless, and we decrease Bob's bet% with Q to 0%.
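Both observations can be checked by hand. The closed form below is not stated in the article; it is a sketch obtained by adding up Bob's net result over the six equally likely deals (in the order Alice A/Bob K, A/Q, K/A, K/Q, Q/A, Q/K) under the rules given earlier. The symbols b (how often Bob bluffs with Q) and c (how often Alice calls with K) are introduced here for illustration only.

```latex
\begin{align*}
EV_{\text{Bob}}(b, c) &= \tfrac{1}{6}\Bigl[(-1) + (-1 - b) + (1 + c) + \bigl(b(2 - 3c) - 1\bigr) + 1 + 1\Bigr] = \frac{b + c - 3bc}{6} \\
EV_{\text{Bob}}\bigl(\tfrac{1}{3}, c\bigr) &= \frac{\tfrac{1}{3} + c - c}{6} = \frac{1}{18} \quad \text{for every } c \\
EV_{\text{Bob}}(b, 1) &= \frac{1 - 2b}{6}, \quad \text{largest at } b = 0:\ \frac{1}{6} \approx 16.67\ \text{bb/100}
\end{align*}
```

The second line is the indifference property: as long as Bob bluffs exactly 1/3 of the time, Alice's calling frequency drops out. The third line shows that against an Alice who always calls, every bluff Bob adds lowers his EV.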

Bob's EV now increases from 5.56 bb/100 to 16.67 bb/100. This is an increase of 200%: his win rate triples! This illustrates an important principle for play against loose opponents in all forms of poker: You can make a lot of money playing opponents with loose calling standards. But you have to be willing to "gear down" and drop most (if not all) of your bluffs. A lot of the extra profit you make against these players comes from the fact that you don't have to bluff them to get your strong hands paid off. Bluffing will just cost you money.

You can also profit from betting more hands for value in real poker. The same principle goes for these hands: You'll get paid when you bet them for value, even if you never bluff. Bob here bets the same hands (A) as before, since he doesn't have additional value hands to use. But he gets paid more every time, and he can drop bluffing from his strategy, since Alice is unbluffable. When Alice is willing to pay off Bob's valuebets every time she has a bluffcatcher, Bob's EV triples, even if his only adjustment is to stop bluffing.

However, against a loose player who is observant and capable of adjusting, Bob's adjustment opens him up to getting counter-exploited. If Alice realizes that Bob now only bets A for value and never bluffs, she can exploit him back by never calling with K. Bob then loses EV relative to optimal play, and this is the next scenario we'll look at:

Let's go one step further with the scenario from the previous example and assume that Alice has adjusted to Bob's lack of bluffing, and she now folds K every time.

Bob's EV now drops to 0. He only bets A and checks down K and Q. Since Alice never calls when Bob bets, Bob could just as well have checked A also. The net result of both players' adjustments is that no money changes hands as a result of betting, and the outcome becomes similar to both players checking to see who wins. And then the game becomes symmetrical, and neither player can win money.

We remember that this process started with Bob dropping his bluffs to exploit Alice's always-call strategy with her bluffcatchers. Then Alice adjusted to this adjustment by always folding her bluffcatchers. Now Bob is the one who gets exploited, since his EV drops from 5.56 bb/100 to 0.

So which player is making the mistake here? If we use game theory optimal thought processes, this question becomes unimportant. What matters is which player is getting exploited at the moment. It began with Alice making a mistake, and then Bob adjusted. Alice then adjusted to Bob's adjustment, and his adjustment ended up costing him money. Both players are now making mistakes (since we define a mistake as any deviation from optimal play), but in the end Bob is the one getting exploited.

Note that Alice's adjustment to Bob's lack of bluffing transforms her from a calling station to a nit. A nit plays too tight, and we can exploit a nit by bluffing a lot. When Alice changes from a calling station to a nit, Bob can make another adjustment and begin bluffing again. Since Alice always folds her bluffcatchers, it's obvious that Bob maximizes his EV by bluffing at every opportunity.

And Bob is back at +16.67 bb/100 again, which is what he had when he never bluffed against the calling station-version of Alice. We have another situation where both players are making mistakes, but this time it's Alice who ends up getting exploited.

We see that the process of exploitation/counter-exploitation sends both players on a journey of "strategic ping-pong" with sudden and extreme strategy shifts. When two observant players are trying to exploit each other aggressively, this can happen. Both players are using reads and previous history to predict how their opponent will play right now. Then they are both trying to stay one step ahead of the other.

When you are playing against a weak opponent with systematic leaks, you should adjust to exploit this. For example, Bob can triple his win rate by never bluffing against a calling station in the AKQ game, or by always bluffing against a nit. If your opponent is unaware of what you are doing, you don't have to worry about getting counter-exploited. But against an observant opponent, your exploitive adjustment opens you up for counter-exploitation. Bob therefore has a decision to make in the AKQ game when he wants to take advantage of Alice's leaks:
- He can continue to play optimally and take his guaranteed profit of 5.56 bb/100
- He can deviate from optimal play himself and hope Alice is not aware enough to punish him for it
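The "strategic ping-pong" can be simulated by letting each player in turn switch to the maximally exploitive reply. The sketch below is an illustration, not part of the article's spreadsheet: it assumes the closed form EV = (b + c - 3bc)/6 for Bob (b = Bob's bluffing frequency with Q, c = Alice's calling frequency with K), which follows from summing the six equally likely deals, and the helper names are hypothetical.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)

def bob_ev(b, c):
    # Assumed closed form for Bob's EV in bb per game.
    return (b + c - 3 * b * c) / 6

def bob_best_response(c):
    # Bob's EV is linear in b with slope (1 - 3c)/6, so he goes to an extreme.
    if c > THIRD:
        return Fraction(0)   # Alice calls too much: never bluff
    if c < THIRD:
        return Fraction(1)   # Alice folds too much: always bluff
    return THIRD             # indifferent, stay optimal

def alice_best_response(b):
    # Alice wants Bob's EV as low as possible; its slope in c is (1 - 3b)/6.
    if b < THIRD:
        return Fraction(0)   # Bob under-bluffs: always fold K
    if b > THIRD:
        return Fraction(1)   # Bob over-bluffs: always call with K
    return THIRD

b, c = THIRD, Fraction(1)    # start: Bob optimal, Alice a calling station
history = []
for step in range(6):
    if step % 2 == 0:
        b = bob_best_response(c)     # Bob adjusts on even steps
    else:
        c = alice_best_response(b)   # Alice adjusts on odd steps
    history.append((b, c, bob_ev(b, c)))

for b_i, c_i, ev in history:
    print(f"bluff Q {float(b_i):.0%}, call K {float(c_i):.0%}: {float(ev * 100):+.2f} bb/100")
```

The first three steps reproduce the article's sequence (16.67, 0 and 16.67 bb/100). If Alice then answers the always-bluff strategy by always calling, the same formula puts Bob at -16.67 bb/100, after which the cycle starts over; neither player ever settles unless someone returns to the 1/3 frequencies.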

In practice, when we play real poker we're often making moderate adjustments to hide the fact that we're exploiting a leak. Let's say you're playing NLHE 6-max. You're on the button with two very tight players in the blinds. They are so tight that you believe you can open any two cards profitably. But if you think they will catch on and fight back more, you might be better off in the long run showing some moderation.

Your task is then to find a sweet spot that balances your desire to steal a lot with your desire to keep your opponents in a tight state where you can continue to steal a lot. So in the long run it might be better for you to steal something like 80%, and fold the absolute garbage hands like 83o, 62o, etc. They might be profitable opening hands in isolation, here and now, but if you open 100% of your hands, your profit from stealing on the button might decrease in the long run. That said, be aware that if we want to exploit a non-optimal strategy maximally, we have to make extreme adjustments, even if the mistake we're exploiting isn't extreme. We'll now illustrate this graphically:

Now we'll illustrate some important concepts:
- Even if our opponent's leak isn't extreme, we can always profit by deviating from optimal play
- When we want to exploit an opponent's mistake maximally, we have to use an extreme adjustment
- However, against an observant opponent, it might be more profitable in the long run to make a moderate adjustment that hides our strategy change from our opponent

By "extreme" we here mean a maximum deviation from optimal play in one direction or the other. We have seen examples of this previously when we let Alice play like a calling station and always call with K. Bob then made an extreme strategy change by dropping all bluffs. This was obviously his most profitable adjustment, since his bluffs had zero chance of success. Now we'll let Alice have smaller leaks and we'll show that Bob's most profitable adjustment is still the most extreme adjustment:

Now we let Alice have calling station tendencies, but not to the extreme. In the optimal strategy she is supposed to call 33% with K. Let's now assume she plays just a tad looser than that, calling 40%. This is per our definition a leak, but not an extreme leak. How should Bob adjust to exploit this moderate leak? Beginners sometimes misunderstand this problem and believe that the best adjustment to exploit a moderate leak is to make a moderate adjustment. This is wrong. The best way to exploit any leak, big or small, is to move as far away from optimal play as possible. This of course presupposes that our leaky opponent (or other players at the table) doesn't quickly adjust to counter-exploit us.

We'll show this graphically. The strategy component Bob tweaks against an Alice who calls a bit too much is his bluffing percentage with Q. We start with 0%, gradually adjust up to 100% bluffing, and plot the resulting EV graph:

Bob's EV decreases linearly as a function of his bluffing percentage. If he plays optimally (marked on the graph) and bluffs 33% he makes 5.56 bb/100. If he bluffs 0% he makes 6.67 bb/100 (we can calculate this by plugging Alice's call% = 40% and Bob's bluff% = 0% into the spreadsheet we used earlier). And if he increases his bluffing to 100%, his EV drops to 3.33 bb/100. Bob's best strategy against an Alice with moderate calling station tendencies is therefore to deviate maximally from optimal play and stop bluffing altogether.

We can also argue mathematically for this adjustment. Bob bluffs 1 bb into a 2 bb pot (he's getting pot odds 2 : 1), so a bluff only makes money if Alice folds more than 1/(2 + 1) = 1/3 of the time, that is, if she calls less than 2/3 of the time. Since Alice calls A every time and K 40% of the time, her total calling percentage with A and K combined is 0.5 x 100% (A) + 0.5 x 40% (K) = 70%. So the odds against her folding are 70 : 30 = 2.33 : 1, which is worse than Bob's pot-odds 2 : 1.

But as we have said earlier, Bob might not want to drop all bluffs in practice, if he suspects Alice is observant and capable of adjusting. Bob then has 3 alternatives:
- Continue to play optimally and take his guaranteed 5.56 bb/100 profit
- Adjust to 0% bluffing and go for 6.67 bb/100 profit, hoping Alice won't adjust and exploit him instead (we saw earlier that Alice can now reduce his EV to 0 bb/100 if she stops calling with K altogether)
- Make a moderate adjustment (say, bluffing 20% instead of the optimal 33% or the maximally exploitive 0%) to make it less obvious for Alice that she's being exploited
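The EV line and the pot-odds argument above can both be reproduced numerically. The following is a minimal sketch, not the article's spreadsheet: it assumes the closed form EV = (b + c - 3bc)/6 for Bob's EV in bb per game (b = Bob's bluffing frequency with Q, c = Alice's calling frequency with K), which comes from summing the six equally likely deals, and all variable names are illustrative.

```python
from fractions import Fraction

def bob_ev(b, c):
    # Assumed closed form for Bob's EV in bb per game.
    return (b + c - 3 * b * c) / 6

call_k = Fraction(2, 5)  # Alice calls with K 40% of the time

# Bob's EV for bluffing frequencies 0%, 20%, ..., 100%.
line = {Fraction(pct, 100): bob_ev(Fraction(pct, 100), call_k)
        for pct in range(0, 101, 20)}
for bluff, ev in line.items():
    print(f"bluff Q {float(bluff):.0%}: {float(ev * 100):.2f} bb/100")

# Gain from one bluff with Q compared with checking (a check always loses
# the showdown). Alice holds A or K with equal probability when Bob holds Q.
call_freq = Fraction(1, 2) * 1 + Fraction(1, 2) * call_k   # A always calls, K calls 40%
bluff_gain = (1 - call_freq) * 2 - call_freq * 1            # win the 2 bb pot, or lose the 1 bb bet
print(f"Alice calls {float(call_freq):.0%}; each bluff gains {float(bluff_gain):+.2f} bb")
```

Each 20-point step in bluffing frequency costs Bob the same amount, which is what "decreases linearly" means here, and the negative gain per bluff is the reason the slope points downward: Alice's 70% calling frequency is above the 2/3 break-even point.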

If Bob chooses the last alternative, he is trying to find a sweet spot that maximizes his profit overall in the long run. He wants to exploit Alice, but at the same time he doesn't want her to adjust. Let's say that Bob after some consideration lands on 20% bluffing. Then he increases his EV from the optimal 5.56 bb/100 to 6.00 bb/100, as shown below. This is pretty good.

Alice can now exploit this by dropping all calling with K, if she realizes what Bob is doing. The same logic applies. Bob's deviation from optimal play is a moderate one, but Alice's best response is an extreme one. If she drops all calls with K, Bob's EV becomes 3.33 bb/100:
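Alice's counter-adjustment can be verified with a short calculation. The closed form on the first line is not given in the article; it is a sketch derived by summing Bob's net result over the six equally likely deals, with b denoting how often Bob bluffs with Q and c how often Alice calls with K (both symbols introduced here for illustration).

```latex
\begin{align*}
EV_{\text{Bob}}(b, c) &= \frac{b + c - 3bc}{6} \\
EV_{\text{Bob}}\bigl(\tfrac{1}{5}, c\bigr) &= \frac{\tfrac{1}{5} + \tfrac{2}{5}c}{6} = \frac{1 + 2c}{30} \\
c = \tfrac{2}{5}:\ \frac{1.8}{30} &= 6.00\ \text{bb/100}, \qquad c = 0:\ \frac{1}{30} \approx 3.33\ \text{bb/100}
\end{align*}
```

With Bob bluffing only 20%, his EV rises with every extra call Alice makes, so her best reply is the extreme one: c = 0.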

And we're back to the extreme exploit/counter-exploit game we discussed earlier in the article. If Bob doesn't want to play this way, he can simply stick with optimal play. But if he believes he adjusts better and quicker than Alice, he can increase his profits by trying to exploit her, prepared to adjust to her adjustments to his adjustments, ad nauseam.

3. Summary

In this article we have experimented with strategies in the AKQ game over 1/2 street with fixed-limit betting. We made an Excel spreadsheet that calculated Bob's EV in the game as a function of his and Alice's strategies. Then we let Alice make various mistakes that Bob tried to exploit.

If Alice has leaks, and if she never tries to adjust to Bob's adjustments, the game is simple for Bob. He moves as far away from optimal play as he can get, in order to exploit Alice's mistakes maximally. This is his most profitable play regardless of the size of Alice's mistake. The same applies to Alice if she wants to adjust to Bob's mistakes.

In practice the most profitable adjustment in the long run might be a moderate one. This disguises our attempt to exploit and makes our opponent less likely to counter-exploit us.

In Part 4 we'll look at another variation of the AKQ game. There we'll let the pot be of arbitrary size P, and we'll find the general solution for the AKQ game over 1/2 street with fixed-limit betting. We'll also discuss some of the effects of pot size for fixed-limit play in general.
