Consider the finitely repeated version of the symmetric two-player stage game below. The first number in a cell is player 1’s single-period payoff. Suppose the stage game is repeated three times and each player’s payoff is the sum of the single-period payoffs. Assume that past actions are common knowledge.
Based on this game, is the following strategy profile a Subgame Perfect Nash Equilibrium?
- Player 1: In period 1, choose c. In period 2, choose c if the outcome in period 1 was (c, x) and choose b otherwise. In period 3, choose d if the outcome in periods 1 and 2 was (c, x) and choose b otherwise.
- Player 2: In period 1, choose x. In period 2, choose x if the outcome in period 1 was (c, x) and choose w otherwise. In period 3, choose y if the outcome in periods 1 and 2 was (c, x) and choose w otherwise.
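The stage game has four Nash equilibria: (b, w), (a, z), (d, y), and (e, v). Because this is a finitely repeated game, play in the last period must be a stage-game Nash equilibrium after every history. So in this scenario, both the punishment for not cooperating, (b, w), and the reward for cooperating, (d, y), have to be Nash equilibria.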

So what would be the payoff from cooperating?
- In the third period, we would play (d, y), which gives payoffs of (6, 6). This is one of our Nash equilibria.
- In the second period, we would play (c, x), which gives payoffs of (7, 7).
- In the first period, we would play (c, x), which again gives (7, 7).
So the total payoff for each player would be 6 + 7 + 7, which is equal to 20.
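The sum above can be checked with a minimal sketch. It uses only the Player 1 payoffs quoted in the text; the name `P1_PAYOFF` is just an illustrative label, and the rest of the payoff matrix is not reproduced here.

```python
# Player 1's stage payoffs as quoted in the text (partial table, not the full matrix).
P1_PAYOFF = {("c", "x"): 7, ("d", "y"): 6}

# Outcomes prescribed on the equilibrium path, periods 1 to 3.
cooperative_path = [("c", "x"), ("c", "x"), ("d", "y")]

cooperation_total = sum(P1_PAYOFF[outcome] for outcome in cooperative_path)
print(cooperation_total)  # 20
```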
Deviating: Second Round
But what if we decided to deviate? For a deviation to be worth it, its total payoff must be greater than the total payoff from cooperating. So what if we decided to deviate in the second round of the game?
Note: deviating in the final round cannot pay, because the play prescribed for the final round is a stage-game Nash equilibrium, so neither player can gain by changing their action there on their own.
So we would still get a payoff of 7 in that first round from playing (c, x). But what if we decided to go for a higher payoff in the second round and cheat? As Player 1, we would deviate to b, making the outcome (b, x), because the 8 from playing b is a higher payoff than the 7 that we could have gotten from playing c.

So we would get a payoff of 8 in that second round. But in the third round we would only get a payoff of 3, because now Player 2 is going to punish us and will play w, which puts us at (b, w) with payoffs of (3, 3).
So our total payoff is now equal to 7 + 8 + 3, which totals 18. That is less than the payoff that we would have gotten from cooperating, so it doesn’t make sense for us to deviate at this point.
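The punishment logic can be played out mechanically. The sketch below encodes Player 2's strategy as stated in the problem and walks through Player 1's action sequence. The helper names `player2_action` and `total_payoff` are made up for illustration, and the payoff table holds only the cells quoted in the text.

```python
# Player 1's stage payoffs as quoted in the text (partial table, not the full matrix).
P1_PAYOFF = {("c", "x"): 7, ("d", "y"): 6, ("b", "x"): 8, ("b", "w"): 3}


def player2_action(history):
    """Player 2's prescribed action given the list of past outcomes."""
    on_path = all(outcome == ("c", "x") for outcome in history)
    if not on_path:
        return "w"  # punish any departure from (c, x)
    return "y" if len(history) == 2 else "x"


def total_payoff(p1_actions):
    """Sum Player 1's payoffs when Player 2 follows the prescribed strategy."""
    history = []
    total = 0
    for a1 in p1_actions:
        outcome = (a1, player2_action(history))
        total += P1_PAYOFF[outcome]
        history.append(outcome)
    return total


print(total_payoff(["c", "b", "b"]))  # 18: cheat in period 2, punished in period 3
print(total_payoff(["c", "c", "d"]))  # 20: stay on the equilibrium path
```

Passing `["b", "b", "b"]` to the same function plays out a deviation in the very first period.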
Deviating: First Round
But what if we decided to deviate right away? Go right into the game, throw our cards onto the table, and pull out the fists and square up?
Here, we would once again deviate to b, making the first-round outcome (b, x), because 8 is the highest possible payoff we can get as Player 1. But for the two subsequent rounds, we only get a payoff of 3 each, because Player 2 will punish us by playing w, and our best response to w is b. That puts us at (b, w) and (b, w).
Thus, we only get a total payoff of 8 + 3 + 3, which is equal to 14. We would NOT want to deviate in the first round, because that is the lowest total payoff, especially compared to the total of 20 that we could have gotten from simply cooperating. No squaring up.
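To put the three paths side by side, here is a small sketch that lists Player 1's per-period payoffs from the text and computes the gain from each deviation relative to cooperating. The dictionary labels are illustrative only.

```python
# Player 1's per-period payoffs on each path, taken from the text.
paths = {
    "cooperate": [7, 7, 6],
    "deviate in period 2": [7, 8, 3],
    "deviate in period 1": [8, 3, 3],
}

totals = {name: sum(payoffs) for name, payoffs in paths.items()}

# Gain from each deviation, measured against the cooperative total.
gains = {
    name: total - totals["cooperate"]
    for name, total in totals.items()
    if name != "cooperate"
}

for name, gain in gains.items():
    print(f"{name}: total {totals[name]}, gain {gain}")
```

A negative gain means the deviation leaves Player 1 worse off than staying on the prescribed path.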
Conclusion
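The best we can do by deviating is 18, from cheating in the second round, so the total advantage that we could gain from deviating is 18 - 20 = -2. Deviating in the first round is even worse, at 14 - 20 = -6. So we would obviously choose to stay on track and cooperate. And once someone has deviated, the strategies call for (b, w) in every remaining round, which is a stage-game Nash equilibrium, so nobody gains by deviating from the punishment either. Given these payoffs, the profile is a Subgame Perfect Nash Equilibrium. So inducing people to cooperate involves incentivizing them to not deviate from the cooperating strategy: the payoff from staying the course must be higher than the payoff from any deviation, and then people (rationally) should stay the course.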