Modeling Games with Markov Chains

Exploring Probabilistic Modeling using “Shut the Box”

Introduction

From playing cards with friends to winning money at the roulette table, the joy of a great game is irresistible to most. But no matter how fun, after a few losses even the most optimistic of players can’t help but ask, “Is the game rigged?” For those of us with some knowledge of probability, leaving this question unanswered can feel unsatisfying, to say the least. In this post we will explore a type of probabilistic model for answering questions of this kind. Specifically, I will show how Markov Chains can be applied to the game “Shut the Box,” in hopes of inspiring you to use probability to answer your own game-related questions.

What is a Markov Chain and How Does it Relate to Games?

Markov Chains are a kind of probabilistic model that are simple, well-studied and enable the modeling of many real-world stochastic processes. Specifically, the goal of a Markov Chain is to model a probabilistic sequence of events, where the events are taken from a set known as the states. In order to achieve this goal, the Markov Chain stores the probability of transitioning between any two states in a matrix known as the transition matrix. In addition to the states and the transition matrix, which together specify the behavior of the stochastic process, the defining characteristic of a Markov Chain is the Markov property. Intuitively, this property asserts that the current state must hold all the information necessary to determine the probability of transitioning to any next state, irrespective of which states have previously occurred in the sequence. As such, any stochastic process where this property holds can be modeled using a Markov Chain, making it a very powerful tool. To get a better sense of these concepts, and Markov Chains in general, let’s look at an example of a simple game:

Let’s say you were going to gamble on a sequence of fair coin flips. At the beginning of the game you start with $5 and before each coin flip you predict the outcome (heads or tails). For each coin flip if you guess correctly you gain a dollar and if you guess incorrectly you lose a dollar. The game ends when either you reach $0 or $10.

If we let the states of the game be defined by the player’s cash balance after each coin flip, the Markov property holds, and luckily we can model this game using a Markov Chain! Graphically, we can represent the states and transition probabilities of this Markov Chain as follows:

In the diagram above the orange circles represent the game states and the arrows represent the probability of transitioning between states. Note that all transition probabilities are 0.5 as changing states requires correctly guessing the outcome of a fair coin flip. Additionally, states 0 and 10 have an arrow pointing to themselves labeled with a probability 1 as they are the end states of the game.

As you will see later in this article by modeling a game in this fashion we can answer many interesting questions. For instance,

- What is the win/loss probability?
- How many steps (coin flips) will it take to win/lose the game on average?
- After t steps (coin flips), what is the probability distribution over the game states?

After reading the rest of this post, I urge you to return to this section, and consider how to answer these questions for the simple game presented above.
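As a head start on those questions, here is a minimal sketch (using NumPy; the variable names are my own) that answers the first one for the coin-flip game by evolving the state distribution until it converges:

```python
import numpy as np

# States 0..10 are the player's cash balance; 0 and 10 are absorbing end states.
n = 11
T = np.zeros((n, n))
T[0, 0] = 1.0    # $0: game over (loss)
T[10, 10] = 1.0  # $10: game over (win)
for s in range(1, 10):
    T[s, s - 1] = 0.5  # incorrect guess: lose a dollar
    T[s, s + 1] = 0.5  # correct guess: gain a dollar

# Start with $5 and repeatedly apply the transition matrix
pi = np.zeros((1, n))
pi[0, 5] = 1.0
for _ in range(10_000):
    pi = pi @ T

print("Win probability:", pi[0, 10])   # ≈ 0.5 by symmetry
print("Loss probability:", pi[0, 0])   # ≈ 0.5
```

Starting from $5, the chain is symmetric around the middle state, so the win and loss probabilities both converge to 0.5.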

“Shut the Box” Game Explained

To truly explore some of the complexities related to probabilistic modeling with Markov Chains, we will focus our efforts on the game “Shut the Box.” I stumbled upon this game while scrolling Instagram Reels and with a vague understanding of the rules, I set out to determine how hard it was to win.

Roland Scheicher / Roland Scheicher at German Wikipedia, Public domain, via Wikimedia Commons

Shut the Box is a multiplayer game played using a board (shown above) consisting of nine tiles (labeled 1 through 9) and two six-sided dice. On each player’s turn, all of the tiles start in the upright position. Then, the two dice are rolled, and the player is allowed to flip down any tiles whose sum equals the combined value of the two dice. The player then repeats this process of rolling the dice and flipping down tiles until there exists no subset of upright tiles that sums to the combined value of the two dice. At this point, the player’s turn is over, and their score is the sum of the tiles that are still upright on the board. In this version of the game, after all players have finished their turns, the player with the lowest score is deemed the winner. However, if a player’s turn ends because they were able to flip down all of the tiles (i.e. “shut the box”), they automatically win the game. It was this rule in particular that caught my attention, and thus it will be the focus of this exploration. Specifically, I wanted to answer the question: how hard is it to “shut the box?”

Modeling “Shut the Box” using a Markov Chain

From the previous sections, it may be clear that we should now model “Shut the Box” as a Markov Chain. While this may seem like an intuitive next step, let’s ensure this is possible by checking whether the Markov property holds. To do this, we will define the unique states of this game as all subsets of the set {1, 2, 3, 4, 5, 6, 7, 8, 9}. This is because any moment during a player’s turn can be uniquely characterized by which tiles are flipped up and down. Numerically, we can think of these states as 9-digit binary numbers (an upright tile is a 1), of which there are 2⁹ = 512 unique states. Under this definition of states, the Markov property holds, as only the probabilistic characteristics of the dice, the current state and the game rules determine the probability of transitioning between states. Most importantly, given that we know the current state, the previous states reached during a player’s turn have no effect on the probability of reaching other states in the future.
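To make this binary encoding concrete, here is a small sketch (the convention that the leftmost bit corresponds to tile 1 is an assumption, chosen to match the code later in this post):

```python
import numpy as np

num_tiles = 9

def upright_tiles(state):
    # Decode a state integer into the list of upright tile labels.
    # Convention (assumed): the leftmost bit of the 9-digit binary
    # string corresponds to tile 1, the rightmost to tile 9.
    bits = np.binary_repr(state, width=num_tiles)
    return [idx + 1 for idx, b in enumerate(bits) if b == '1']

print(upright_tiles(511))  # '111111111' -> [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(upright_tiles(0))    # '000000000' -> [] (the box is shut)
```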

Now that the states are well defined and we have confirmed that “Shut the Box” can indeed be modeled as a Markov Chain, the only element left to define is the transition matrix T. Specifically, Tᵢₖ (entry at ith row and kth column) represents the probability of transitioning from state i to state k. It is in determining these probabilities that the interesting complexities of modeling this game emerge.

To figure out the probability of transitioning from one state to another we must answer the question “what actions must occur in the game to cause the state to change?” For our game of interest, “Shut the Box,” there are two probabilistic actions that take place to determine the next state: a dice roll and a selection of tiles to flip down. Let’s start by examining the role of the dice in transitioning between two states, i and k. We will first let Sᵢ and Sₖ be the sets containing the upright numbers in states i and k respectively. For the transition from state i to state k to have a non-zero probability it is clear that Sₖ must be a proper subset of Sᵢ. This is because the number of upright tiles must decrease when transitioning between states, and tiles cannot be flipped up once they are down. Under this state representation and the assumption that Sₖ ⊂ Sᵢ , the numbers flipped down in the process of transitioning between states i and k make up the set D = Sᵢ – Sₖ (elements in Sᵢ that are not in Sₖ).

As such, one necessary condition for this transition to occur is that the sum of the rolled dice must be equal to the sum of the elements in D. Formally, the following equation must hold:

X₁ + X₂ = ∑ d (summing over all d ∈ D)

where X₁ and X₂ are discrete random variables representing the values of the two dice. If we let z be the sum of the elements in D, the probability that the above equation holds with two six-sided dice can be derived by counting the favorable outcomes among the 36 equally likely rolls:

P(X₁ + X₂ = z) = (6 − |z − 7|) / 36, for 2 ≤ z ≤ 12

In the case that z < 2 or z > 12, the probability of transitioning from state i to state k is zero as there does not exist a roll of two regular dice that sums to z.
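This piecewise result is easy to sanity-check by brute force over all 36 equally likely outcomes; a short sketch:

```python
from itertools import product

def dice_sum_prob(z):
    # P(X1 + X2 = z) for two fair six-sided dice, by enumerating
    # all 36 equally likely outcomes
    outcomes = [a + b for a, b in product(range(1, 7), repeat=2)]
    return outcomes.count(z) / 36

print(dice_sum_prob(7))   # 6/36: the most likely sum
print(dice_sum_prob(2))   # 1/36: only (1, 1)
print(dice_sum_prob(13))  # 0.0: impossible with two dice
```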

While it may seem as if this is the only probabilistic computation needed to fill in the entries of the transition matrix, a state transition also requires the player to choose from a set of valid moves after the dice roll. Thus, to transition from state i to state k not only does the combined value of the dice need to be equal to z, but the player must also choose to flip down the tiles in set D instead of the possibly many other options consistent with the dice roll.

To model this element of human strategy, we will assume the player chooses from the possible sets of tiles to flip down uniformly at random. Under this assumption, the probability that the player chooses state k is 1/Nᵢₖ, where Nᵢₖ is the number of non-empty subsets of Sᵢ that sum to z.
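Computing Nᵢₖ amounts to a small subset-sum enumeration; a sketch (the helper name is my own):

```python
from itertools import combinations

def count_moves(upright, z):
    # N: number of non-empty subsets of the upright tiles that sum to z
    return sum(1 for r in range(1, len(upright) + 1)
               for subset in combinations(upright, r) if sum(subset) == z)

# With all nine tiles upright, a roll summing to 9 can be matched 8 ways:
# {9}, {1,8}, {2,7}, {3,6}, {4,5}, {1,2,6}, {1,3,5}, {2,3,4}
print(count_moves(range(1, 10), 9))  # 8
```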

With the information above we can now formally define the entries of the transition matrix. Each entry in the transition matrix represents the probability of two events occurring:

1. The sum of the two dice being equal to z (event 1)
2. The player randomly choosing to flip down the specific set of tiles that changes the state from state i to state k (event 2)

Thus, the entries of the transition matrix can be defined as follows:

Tᵢₖ = P(X₁ + X₂ = z) · (1/Nᵢₖ)

It is important to note that the above definition of the transition matrix does not consider the probability of a player’s turn ending. This can happen when either the player “shuts the box,” or no tiles can be flipped down after observing the dice roll value. In both of these cases, 1/Nᵢₖ is undefined and as such they must be handled independently.

Under the aforementioned binary representation of states, state i = 0 suffices to represent “shutting the box” as all tiles are flipped down. Since this is the “win” state of the game, we can modify the transition matrix so the probability of staying in state 0 is 1 (i.e. T₀₀ = 1).

Finally, we will add a “loss” state (L) to represent a player’s turn finishing due to an “unlucky” dice roll. Specifically, to incorporate this state into the transition matrix we need to know the probability that none of the subsets of Sᵢ sum to X₁ + X₂ (the sum of the two dice). While computing this quantity explicitly may be possible, we can define it relative to the other transition matrix values as follows:

TᵢL = 1 − ∑ₖ Tᵢₖ (summing over all states k ≠ L)

because each row in the transition matrix represents a probability distribution over the states, which must sum to one. Furthermore, since L is a final state of the game, the probability of leaving it is 0.

With the above results for the exact transition probabilities we are now able to utilize some of the great properties of Markov Chains to answer the question, how hard is it to “shut the box?” Specifically, we can answer the question: what is the probability that a player “shuts the box” using a completely random strategy?

Computing the Win Probability of “Shut the Box” using Python

In trying to compute the “win/loss” probability of this game, it is important to understand the utility of the transition matrix. For a Markov Chain, the transition matrix enables us to explore how a probability distribution over the states evolves after a single transition using only one matrix-vector multiplication. Mathematically, we can write this simply as follows:

πₜ₊₁ = πₜT

where T is the transition matrix and πₜ is a row vector representing the probability distribution over all the states after t transitions. Thus, given that we know the probability of being in any state currently, we can answer the question, “what is the probability of being in any state after one dice roll, given we choose tiles to flip down at random?” Furthermore, using our knowledge of the deterministic initial state of the game (all tiles flipped up), we can easily define π₀ and recursively multiply it by T to determine the distribution over the states after any arbitrary number of transitions (dice roll + tile flips). This recursion can be rewritten as the following closed-form expression for πₜ:

πₜ = π₀Tᵗ

The Markov Chain that models “Shut the Box” has two final states, win and loss, that once entered cannot be left (a.k.a. absorbing states). Thus, we can be sure that the distribution over all states will converge to a distribution over these two states after some finite number of transitions. Intuitively, for “Shut the Box,” this statement highlights the fact that a player’s turn must end with them either “shutting the box” or failing to do so, and as such there is a finite limit on the number of moves a player can make in one turn.

To find this upper limit, note that the longest turn occurs when the player flips down one tile after every dice roll until only the tile labeled “1” is upright; since two dice always sum to at least 2, the turn must end on the following roll. This sequence of actions constitutes 9 transitions in total to reach the final state (8 single-tile flips plus the final roll that ends the turn). As such, solving for the win/loss probability is as simple as setting t ≥ 9 and computing πₜ. After computing πₜ, the probability of a player “shutting the box” using a random strategy is the first entry in πₜ, as it corresponds to the state with all tiles flipped down (S₀). Alternatively, the recursive process of evolving the state distribution can be repeated starting at t = 0 until the distribution converges. Additionally, there are faster methods for computing πₜ in this case that I will not cover in this post. They leverage the existence of absorbing states and a special definition of the transition matrix. Learn more here: https://en.wikipedia.org/wiki/Absorbing_Markov_chain

To compute the win/loss probability in Python, we will rely solely on the scientific computing library NumPy. First, we define the number of tiles, the number of dice and the number of states in the game as 9, 2 and 513 respectively.

import numpy as np

num_tiles = 9
num_dice = 2
num_states = (2**num_tiles) + 1  # +1 for the game over/lose state

Using these simple parameters we can generate the set representation, Sᵢ, for every state using its binary string representation as shown below.

# Generate representation of all possible states
tile_nums = [i for i in range(1, num_tiles + 1)]
states = []
for i in range(0, 2**num_tiles):
    binary_rep = np.binary_repr(i, width=num_tiles)
    states.append([tile_nums[idx] for idx, a in enumerate(binary_rep) if a == '1'])

Here the NumPy binary_repr function is very helpful, as it generates the binary representation of the state from the integer i. As such, by finding the positions of the ones in this string we can determine which tiles are upright in state i.

With the state representations in hand, we then generate the transition matrix using the following code:

# Generate transition matrix
transition_matrix = np.zeros((num_states, num_states))
for i in range(num_states):
    for j in range(num_states):
        transition_matrix[i][j] = compute_transition_probability(i, j)
    transition_matrix[i][num_states-1] = 1 - transition_matrix[i][:num_states-1].sum()
    assert np.allclose(transition_matrix[i].sum(), 1), "Transition matrix is not stochastic"
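The compute_transition_probability helper is not shown in the snippet above; a sketch consistent with the definitions from the previous section might look like the following (the handling of the win and loss states is an assumption, chosen to match the loop above, which fills the loss column separately):

```python
import numpy as np
from itertools import combinations

num_tiles = 9
num_states = (2**num_tiles) + 1  # matches the earlier definitions; last index is the loss state

def upright(state):
    # Set of upright tile labels in a (non-loss) state
    bits = np.binary_repr(state, width=num_tiles)
    return {idx + 1 for idx, b in enumerate(bits) if b == '1'}

def dice_sum_prob(z):
    # P(X1 + X2 = z) for two fair six-sided dice
    return (6 - abs(z - 7)) / 36 if 2 <= z <= 12 else 0.0

def compute_transition_probability(i, k):
    if i == num_states - 1 or k == num_states - 1:
        return 0.0  # loss row/column is filled in by the loop above
    if i == 0:
        return 1.0 if k == 0 else 0.0  # "shut the box": absorbing win state
    S_i, S_k = upright(i), upright(k)
    if not S_k < S_i:
        return 0.0  # S_k must be a proper subset of S_i
    z = sum(S_i - S_k)  # sum of the tiles flipped down, D = S_i - S_k
    # N_ik: number of subsets of S_i summing to z (valid moves for this roll)
    n_ik = sum(1 for r in range(1, len(S_i) + 1)
               for c in combinations(S_i, r) if sum(c) == z)
    return dice_sum_prob(z) / n_ik if n_ik else 0.0
```

As a sanity check, from the all-tiles-upright state every dice sum from 2 to 12 can be matched, so that row sums to 1 before the loss column is filled in (the first roll can never end a turn).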

Finally, using the transition matrix we answer the core question of this exploration, what is the probability that a player “shuts the box” using a completely random strategy? To do this we must define the initial distribution, π₀, over all states as follows:

# Define the initial state distribution
init_state_dist = np.zeros((1, num_states))
init_state_dist[:, num_states-2] = 1

Since the game always starts with all of the tiles upright, the distribution over states is a row vector of length 513 with zeros at all entries except for the 511th entry (zero-based indexing) where a one is placed. This is because the binary representation of 511 is the string ‘111111111’ which corresponds to the state with all tiles upright.

Once the initial distribution is defined the probability of “shutting the box” can be determined using the formula πₜ = π₀Tᵗ with t = 9 in order to find the converged distribution over states. Again, we can set t = 9 because π₉T = π₉ and as such using a value greater than nine for t leads to unnecessary computation. The code that accomplishes this is as follows:

# Compute and print the win and lose probabilities
t = 9
multiple_transition_matrix = np.linalg.matrix_power(transition_matrix, t)
final_dist = np.matmul(init_state_dist, multiple_transition_matrix)

win_prob = final_dist[0, 0]
lose_prob = final_dist[0, num_states-1]

# Print results to 4 decimal places
print("Win Probability: {:.4f}".format(win_prob))
print("Lose Probability: {:.4f}".format(lose_prob))

In this code snippet we make use of the NumPy functions matrix_power and matmul (matrix multiplication) to compute T⁹ and π₉ respectively. Using these results, the probability of “shutting the box” is simply stored as the first element of π₉, which corresponds to the state with no upright tiles (‘000000000’ in binary). Using this insight, we finally know that it is very hard to “shut the box” (~2% chance) when using a completely random strategy! (exact probability values are reported below).

The code and model formulation above can, with some modifications, be extended to support variants of “Shut the Box” with any number of tiles and any number of dice. As such, I have visualized below a graph of the win probability as the number of dice and the number of tiles change:
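The main change such variants require is generalizing the dice-sum distribution to any number of dice, which can be done by repeated convolution of the single-die distribution. A sketch, assuming fair dice (the function name is my own):

```python
import numpy as np

def dice_sum_dist(num_dice, sides=6):
    # Distribution over the sum of `num_dice` fair dice, via repeated
    # convolution of the single-die distribution.
    # Returns (sums, probs) where probs[i] = P(sum == sums[i]).
    one_die = np.ones(sides) / sides
    dist = one_die
    for _ in range(num_dice - 1):
        dist = np.convolve(dist, one_die)
    sums = np.arange(num_dice, num_dice * sides + 1)
    return sums, dist

sums, probs = dice_sum_dist(2)
print(dict(zip(sums.tolist(), probs.round(4).tolist())))  # P(7) = 6/36
```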

Conclusion

Through this post we have explored Markov Chains and their specific application for answering questions about the game “Shut the Box.” Nonetheless, the techniques I have highlighted, with some critical thought and modifications, can be used to model a multitude of games. As such, while your chances of “shutting the box” may be 2 in 100, I am certain you will find more success using probabilistic modeling to answer your favorite game-related questions.

Unless otherwise noted, all images are by the author

Modeling Games with Markov Chains was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
