The prisoner's dilemma is an example of a non-zero-sum game that demonstrates a conflict between rational individual behavior and the benefits of cooperation in certain situations.
The classical Prisoner's Dilemma, in one common formulation, is as follows: two suspects are arrested and held in separate cells, unable to communicate. The police lack the evidence for a full conviction, so each prisoner is offered the same deal: if he testifies against his partner (defects) while the partner remains silent (cooperates), he goes free and the partner serves the full ten-year sentence. If both remain silent, each serves only six months on a minor charge. If each betrays the other, each serves five years. Reasoning individually, each prisoner finds that betrayal gives a better outcome whatever the other does, so both betray and serve five years, even though mutual silence would have cost each only six months.
This illustrates that if the two had been able to communicate and cooperate, they would have been able to achieve a result better than what each could achieve alone. There are many situations in the social sciences that have this same property, so understanding these situations and how to resolve them is important. In particular, it demonstrates that there are situations where naïve reasoning in one's own best interest does not always lead to the best result.
In political science, the Prisoner's Dilemma is often used to illustrate the problem of two states getting into an arms race. Both will reason that they have two options, either to increase military expenditure or to make an agreement to reduce weapons. Neither state can be certain that the other one will keep to such an agreement; therefore, they both incline towards military expansion. The irony is that both states seem to act rationally, but the result is completely irrational.

The iterated prisoner's dilemma
In his book The Evolution of Cooperation (1984), Robert Axelrod explored a variant scenario he called the "iterated prisoner's dilemma", where the participants have to choose their strategies again and again, and have memory of their previous encounters. The example he used was two people meeting and exchanging closed bags, with the understanding that one of them contains money, and the other contains an item being bought. Either player can choose to honor the deal by putting into his bag what he agreed, or he can defect by handing over an empty bag. This mirrors the prisoner's dilemma: for each participant individually, it is better to defect, though both would be better off if both cooperated.
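The one-round payoff structure can be sketched in Python. The numeric values below are illustrative assumptions; any values satisfying the standard ordering temptation > reward > punishment > sucker's payoff give the same dilemma:

```python
# Illustrative one-round payoffs (an assumption; only the ordering
# temptation > reward > punishment > sucker's payoff matters).
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),  # reward for mutual cooperation
    ("cooperate", "defect"):    (0, 5),  # sucker's payoff / temptation
    ("defect",    "cooperate"): (5, 0),  # temptation / sucker's payoff
    ("defect",    "defect"):    (1, 1),  # punishment for mutual defection
}

def play_round(move_a, move_b):
    """Return the payoffs (to player A, to player B) for one exchange."""
    return PAYOFFS[(move_a, move_b)]
```

Whatever B does, A scores more by defecting (5 > 3 against a cooperator, 1 > 0 against a defector), yet mutual cooperation (3, 3) beats mutual defection (1, 1).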
Axelrod discovered that when these encounters were repeated over a long period of time with many players, each with different strategies, "greedy" strategies tended to do very poorly in the long run while more "altruistic" strategies did better, as judged purely by self-interest. He used this to show a possible mechanism to explain what had previously been a difficult hole in Darwinian theory: how can apparently altruistic behavior evolve from the purely selfish mechanisms of natural selection?
The best deterministic strategy was found to be "Tit for Tat". It is very simple: cooperate on the first iteration of the game; after that, do what your opponent did on the previous move. A slightly better strategy is "Tit for Tat with forgiveness": when your opponent defects, on the next move you sometimes cooperate anyway with small probability (around 1% to 5%). This allows for occasional recovery from getting trapped in a cycle of mutual defections. The exact probability depends on the lineup of opponents. "Tit for Tat with forgiveness" performs best when miscommunication is introduced to the game, that is, when a move is sometimes incorrectly reported to the opponent: you cooperate, but your opponent hears that you defected.
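Both strategies are simple enough to state in a few lines of Python. This is a sketch; the function names and the 5% forgiveness rate (one assumed value from the range above) are choices made here, not part of Axelrod's formulation:

```python
import random

def tit_for_tat(my_history, opp_history):
    """Cooperate on the first move; thereafter copy the opponent's last move."""
    if not opp_history:
        return "cooperate"
    return opp_history[-1]

def tit_for_tat_forgiving(my_history, opp_history, forgive=0.05):
    """Like Tit for Tat, but after an opponent defection, cooperate
    anyway with small probability (5% here, an assumed value)."""
    if not opp_history:
        return "cooperate"
    if opp_history[-1] == "defect" and random.random() < forgive:
        return "cooperate"
    return opp_history[-1]
```

Passing each strategy its own and its opponent's full move histories keeps it stateless, which makes tournaments between many strategies easy to run.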
For the iterated Prisoner's Dilemma, it is not always correct to say that any given strategy is the best. For example, consider a population where everyone defects every time, except for a single individual following the Tit-for-Tat strategy. That individual is at a slight disadvantage because of the loss on the first turn. In such a population, the optimal strategy for that individual is to defect every time. In a population with a certain percentage of always-defectors and the rest being Tit-for-Tat players, the optimal strategy for an individual depends on the percentage, and on the length of the game. Simulations of populations have been done, where individuals with low scores die off, and those with high scores reproduce. The mix of algorithms in the final population generally depends on the mix in the initial population.
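The disadvantage of a lone Tit-for-Tat player among constant defectors can be checked with a small simulation. The payoff values and the ten-round match length are illustrative assumptions:

```python
# Illustrative payoffs to (row, column); "C" = cooperate, "D" = defect.
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def always_defect(my_hist, opp_hist):
    return "D"

def tit_for_tat(my_hist, opp_hist):
    return "C" if not opp_hist else opp_hist[-1]

def match(strat_a, strat_b, rounds=10):
    """Play an iterated match and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(a, b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b
```

Over ten rounds, a Tit-for-Tat player scores 9 against a constant defector (losing only the first round, then defecting along), while two constant defectors score 10 each; so a lone Tit-for-Tat player in a defecting population does slightly worse than its neighbours, as described above. Two Tit-for-Tat players, by contrast, score 30 each.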
If an iterated Prisoner's Dilemma is going to be played exactly N times, for some known constant N, the Nash equilibrium is to defect every time. This is easily proved by induction: you might as well defect on the last turn, since your opponent will have no chance to punish you. Therefore, both of you will defect on the last turn. Given that, you might as well defect on the second-to-last turn, since your opponent will defect on the last turn no matter what you do, and so on. For cooperation to remain appealing, then, the future must be indeterminate for both players.
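The base case of this induction, that defection is a dominant one-round strategy, can be checked mechanically. The payoff values here are illustrative assumptions:

```python
# Illustrative one-round payoffs to the row player; only the standard
# ordering 5 > 3 > 1 > 0 matters.
MY_PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def best_response(opp_move):
    """The move maximising my one-round payoff for a fixed opponent move."""
    return max("CD", key=lambda move: MY_PAYOFF[(move, opp_move)])

# Defection is the best reply to either move. On round N there is no
# future to protect, so both defect; that makes round N-1 effectively
# the last round, and the argument unravels back to round 1.
assert best_response("C") == "D"
assert best_response("D") == "D"
```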
Another odd case is the "play forever" Prisoner's Dilemma: the game is repeated an infinite number of times, and your score is the average payoff per round (suitably computed, for example as the limit of the mean payoff as the number of rounds grows).
The Prisoner's Dilemma game is fundamental to certain theories of human cooperation and trust. On the assumption that transactions between two people requiring trust can be modelled by the Prisoner's Dilemma, cooperative behavior in populations may be modelled by a multiplayer, iterated, version of the game. It has, consequently, fascinated many, many scholars over the years. A not entirely up-to-date estimate (Grofman and Pool 1975) puts the count of scholarly articles devoted to it at over 2,000.
There are also some variants of the game, with subtle but important differences in the payoff matrices, which are listed below:
Chicken
Another important non-zero-sum game type is called "Chicken". Unlike the Prisoner's Dilemma, Chicken has no dominant strategy: it is in a player's interest to defect if the other cooperates, but to cooperate if the other defects. In Chicken, mutual defection is the worst possible outcome for both players, so it is not an equilibrium; in the Prisoner's Dilemma, the worst possible outcome for a player is cooperating while the other defects, and mutual defection is a stable equilibrium. In both games, mutual cooperation is unstable, since either player can gain by defecting unilaterally.
A typical payoff matrix (with illustrative values, written as your payoff, opponent's payoff) would read:

                     Opponent cooperates   Opponent defects
    You cooperate          0, 0                -1, +1
    You defect            +1, -1              -10, -10
Chicken is named after the car racing game. Two cars drive towards each other for an apparent head-on collision. Each player can swerve to avoid the crash (cooperate) or keep going (defect). Another example often given is that of two farmers who use the same irrigation system for their fields. The system can be adequately maintained by one person, but both farmers gain equal benefit from it. If one farmer does not do his share of maintenance, it is still in the other farmer's interests to do so, because he will be benefiting whatever the other one does. Therefore, if one farmer can establish himself as the dominant defector (i.e., if the habit becomes ingrained that the other one does all the maintenance work), he will be likely to continue to do so.
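The contrast with the Prisoner's Dilemma can be made concrete. With illustrative Chicken payoffs (the specific numbers are assumptions), the best reply depends on the opponent's move, so neither strategy dominates:

```python
# Illustrative Chicken payoffs to the row player: both swerve (0),
# lone swerver (-1) vs. lone defector (+1), head-on crash (-10).
MY_PAYOFF = {("C", "C"): 0, ("C", "D"): -1, ("D", "C"): 1, ("D", "D"): -10}

def best_response(opp_move):
    """The move maximising my payoff for a fixed opponent move."""
    return max("CD", key=lambda move: MY_PAYOFF[(move, opp_move)])
```

Here best_response("C") is "D" and best_response("D") is "C": defect against a swerver, swerve against a defector, which is why the stable outcomes of Chicken are the asymmetric ones.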
Assurance Game
An Assurance Game has a similar structure to the Prisoner's Dilemma, except that the rewards for mutual cooperation are higher than the temptation to defect. A typical payoff matrix (with illustrative values, written as your payoff, opponent's payoff) would read:

                     Opponent cooperates   Opponent defects
    You cooperate          4, 4                 1, 3
    You defect             3, 1                 2, 2
The Assurance Game is potentially very stable because it always gives the highest rewards to players who establish a habit of mutual cooperation. However, there is still the problem that the players might not realise that it is in their interests to cooperate. They might, for example, mistakenly believe that they are playing a Prisoner's Dilemma or Chicken game, and arrange their strategies accordingly.
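With illustrative Assurance Game payoffs (the specific numbers are assumptions), a quick check shows the best reply is simply to match the opponent, which is why both mutual cooperation and mutual defection are self-reinforcing:

```python
# Illustrative Assurance Game payoffs to the row player: mutual
# cooperation (4) beats defecting against a cooperator (3), and mutual
# defection (2) beats cooperating alone (1).
MY_PAYOFF = {("C", "C"): 4, ("C", "D"): 1, ("D", "C"): 3, ("D", "D"): 2}

def best_response(opp_move):
    """The move maximising my payoff for a fixed opponent move."""
    return max("CD", key=lambda move: MY_PAYOFF[(move, opp_move)])
```

Which of the two stable outcomes emerges depends on what each player expects the other to do, which is exactly the coordination problem described above.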
Friend or Foe is a game show currently airing on the Game Show Network. It is a real-life example of the Prisoner's Dilemma game. On the game show, three pairs of people compete. As each pair is eliminated, they play a game of Prisoner's Dilemma to determine how their winnings are split. If they both cooperate ("Friend"), they share the winnings 50-50. If one cooperates and the other defects ("Foe"), the defector gets all the winnings and the cooperator gets nothing. If both defect, both leave with nothing. Notice that the payoff matrix is slightly different from the standard one, as the payouts for the "both defect" and the "I cooperate and opponent defects" cases are identical. This makes "both defect" a neutral equilibrium, compared with being a stable equilibrium in the standard Prisoner's Dilemma. If you know your opponent is going to vote "Foe", then your choice does not affect your winnings. In a certain sense, "Friend or Foe" lies between "Prisoner's Dilemma" and "Chicken".
The payoff matrix, in shares of the jackpot (your payoff, opponent's payoff), is:

                     Opponent: Friend      Opponent: Foe
    You: Friend           1/2, 1/2              0, 1
    You: Foe               1, 0                 0, 0
Friend or Foe would be useful for someone who wanted to do a real-life analysis of the Prisoner's Dilemma. Since each pair plays only once, the issues surrounding repeated play, such as the "tit for tat" strategy, do not arise.