
The prisoner’s dilemma is a classic concept in game theory that illustrates a common problem faced in scenarios involving decision-making and cooperation. Much early examination of strategic decision-making centered on zero-sum games. Basically, a zero-sum game is one where one person’s gain equals a corresponding loss for the other person. However, real-life decisions are often more complex than simply choosing what benefits one person at a cost to another.
Game theory was founded in the 1940s by the polymath John von Neumann. At the dawn of the nuclear race, strategic decisions carried nasty consequences if miscalculated. Basically, it is better to bomb than be bombed. Von Neumann’s work influenced decisions in economics, diplomacy, and warfare. In collaboration with Oskar Morgenstern, he explored situations involving more than one stakeholder and decisions that were not zero-sum games (Dufwenberg, 2011).
The prisoner’s dilemma is a game that includes the option of cooperation, and behavioral scientists have drawn on research using it to explore critical decision-making.
Key Definition:
The prisoner’s dilemma is a classic concept in game theory, pitting two prisoners against each other, illustrating the basic tension between self-interest (defecting) and mutual cooperation (loyalty) in decision-making, and the consequential payoffs of each.
The Prisoner’s Dilemma
The backdrop of the prisoner’s dilemma is that two prisoners are charged with the same crime. Held separately, officials interrogate each, hoping for a confession. For lack of evidence, the courts can only convict them of the larger crime if one of the prisoners confesses, implicating the other. If both prisoners confess, both are charged with the higher crime. If neither confesses, both are charged with the lesser crime. And finally, if only one confesses, he is set free for having turned state’s evidence and is given a reward. The courts convict the other prisoner on the strength of his partner’s testimony, and he receives a more severe sentence than if he had confessed.
Key Description:
Two members of a gang, A and B, are arrested. Prosecutors lack evidence to convict them of a major crime but can get them on a lesser charge, for which they’ll serve a year in prison. A and B can’t communicate with each other. Prosecutors offer each a deal—inform on the other and your sentence is reduced. There are four possible outcomes: Both A and B refuse to inform on each other: each serves one year. Both A and B inform on each other: each serves two years. A informs on B, who remains silent: A walks free and B serves three years. B informs on A, who remains silent: B walks and A serves three years (Sapolsky, 2018, Kindle Location: 5,480).
The dilemma is that, individually, it is in the interest of each to confess no matter what the other does; however, it is in their collective interest for both to hold out. There is no satisfactory solution to the paradox of this game. The simplicity of the prisoner’s dilemma is misleading because what seems rational from our own point of view turns out to be detrimental in the end. Steven Pinker wrote, “game theorists have shown that the best decision for each player individually is sometimes the worst decision for both collectively” (2011, Kindle location 7,209).
The Dilemma
In this dilemma, both prisoners have two choices: cooperate with each other by remaining loyal and refusing to confess, or betray one another by confessing. The three potential outcomes of their decisions are (a code sketch of the payoffs follows this list):
- If both prisoners choose to cooperate, meaning they remain loyal to each other and stay silent, they will both receive a moderate sentence for a lesser offense. This outcome is often considered the most beneficial for both parties involved, as it minimizes their overall punishment.
- If one prisoner remains loyal while the other betrays, the betrayer will receive a significantly reduced sentence or even go free, while the cooperating prisoner faces a harsher penalty. This scenario reflects the advantage a person may gain by exploiting the trust of their counterpart.
- If both prisoners choose to betray each other and confess, they will both receive a severe punishment. This outcome is considered the worst-case scenario since both individuals end up worse off than if they had remained loyal.
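To make the payoff structure concrete, here is a minimal Python sketch. The sentence lengths are the illustrative figures used in the Mathematical Analysis section below; the dictionary layout itself is my own assumption, not part of the classic formulation.

```python
# Payoff table for the prisoner's dilemma, using the example
# sentences from the Mathematical Analysis section below.
# Each key is (my_choice, partner_choice); each value is the
# number of years I serve. "C" = cooperate (stay loyal/silent),
# "D" = defect (confess and betray).
YEARS = {
    ("C", "C"): 1,  # both loyal: lesser charge for each
    ("C", "D"): 4,  # I stay silent, partner confesses: harshest sentence
    ("D", "C"): 0,  # I confess, partner stays silent: I walk free
    ("D", "D"): 3,  # both confess: severe sentence for each
}

for (my_choice, partner_choice), years in YEARS.items():
    print(f"I play {my_choice}, partner plays {partner_choice}: "
          f"I serve {years} year(s)")
```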
Mathematical Analysis
For evaluating data, researchers often assign a numerical value to each potential outcome. For example, consider the punishment for outcome 1 as each prisoner receiving a sentence of one year. Combined (1+1), the punishment is two years of incarceration. For outcome 2, one prisoner is set free and the other receives the maximum of four years. Combined (0+4), the punishment is four years of incarceration. For outcome 3, each prisoner receives three years. Combined (3+3), the punishment is six years of incarceration.
The possible outcomes for the group, however, differ from the possible outcomes for the individual.
For the Group:
- Group Outcome 1= (both remain loyal) two years (-2)
- Group Outcome 2= (one remains loyal) four years (-4)
- Group Outcome 3= (both defect) six years (-6)
For the Individual:
- Individual Outcome 1= (defect and confess to crime but partner doesn’t) Set free (0)
- Individual Outcome 2= (both remain loyal) one year (-1)
- Individual Outcome 3= (both defect and confess) three years (-3)
- Individual Outcome 4= (remain loyal but partner defects and confesses) four years (-4)
The prisoner’s dilemma highlights a conflict between self-interest and cooperation. Each prisoner faces the choice between maximizing their own benefit (individual outcome 1) and risking the greatest potential harm to themselves (individual outcome 4) in pursuit of a mutually beneficial outcome (group outcome 1). This dilemma is often used to analyze various social, economic, and political situations where the actions of individuals impact the collective as a whole.
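The arithmetic above can be verified with a short sketch. It shows that whatever the partner chooses, confessing always means fewer years for the individual, yet mutual loyalty minimizes the group total; the YEARS table simply restates the example sentences from this section.

```python
# A minimal check of the dilemma's logic with the example sentences
# above: whatever the partner does, confessing ("D") means fewer
# years for me, yet mutual loyalty ("C", "C") minimizes total years.
YEARS = {("C", "C"): 1, ("C", "D"): 4, ("D", "C"): 0, ("D", "D"): 3}

# Individual view: defecting is the better response to either choice.
for partner in ("C", "D"):
    stay_loyal = YEARS[("C", partner)]
    confess = YEARS[("D", partner)]
    print(f"Partner plays {partner}: loyal costs {stay_loyal}y, "
          f"confessing costs {confess}y -> confessing is better for me")

# Group view: combined years served for each pair of choices.
for mine in ("C", "D"):
    for partner in ("C", "D"):
        total = YEARS[(mine, partner)] + YEARS[(partner, mine)]
        print(f"({mine}, {partner}): {total} combined years")
# The output shows (C, C) gives 2 combined years -- the best group
# outcome -- even though "D" is each player's dominant choice.
```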
The Prisoner’s Dilemma Tournaments
To research decision-making over multiple rounds of play, social scientists organized prisoner’s dilemma tournaments where contestants submitted a strategy (always defect, always remain loyal, respond in kind, etc.). Robert Axelrod wrote about the multiple-round experiment: “this changed the game from a one-move Prisoner’s Dilemma in which defection is the dominant choice, to an iterated Prisoner’s Dilemma in which conditional strategies are possible” (2006, Kindle location 1,107). Interestingly, the multiple-round game significantly changed the long-term benefits of cooperation (remaining loyal).
Robert Trivers wrote, “cooperation has been well modeled as a simple prisoner’s dilemma. Cooperation by both parties benefits each, while defections hurt both, but each is better off if he defects while the other cooperates. Cheating is favored in single encounters, but cooperation may emerge much of the time, if players are permitted to respond to their partner’s previous moves” (2011).
The multiple rounds created a new dynamic: players remember how you responded in previous rounds. If you repeatedly defected, they cannot trust you to be loyal, and therefore they are more likely to defect in future rounds. As Axelrod explains, “this basic problem occurs when the pursuit of self-interest by each leads to a poor outcome for all” (2006, Kindle location 231).
The Tit-for-Tat Strategy
The best-performing strategy in several iterated prisoner’s dilemma tournaments was tit-for-tat. In the tit-for-tat strategy, the player begins with cooperation and thereafter responds in kind to the other player’s last move. Since this strategy encourages cooperation, it performs better over multiple rounds without being a sucker, as the always-cooperate strategy can be.
What makes the tit-for-tat strategy successful is the possibility of meeting again in later rounds. The choice to defect or cooperate, then, is a strategy to influence future encounters. “This possibility means that the choices made today not only determine the outcome of this move, but can also influence the later choices of the players. The future can therefore cast a shadow back upon the present and thereby affect the current strategic situation” (Axelrod, 2006, Kindle location: 296).
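Here is a minimal sketch of the tit-for-tat rule and one iterated match, scoring in negative years using the example sentences from earlier. The function names, the ten-round length, and the scoring convention are my illustrative assumptions, not Axelrod’s actual tournament code.

```python
# Sketch of an iterated match. Scores are negative years served,
# so higher (closer to zero) is better. Strategy functions receive
# the opponent's move history and return "C" or "D".
YEARS = {("C", "C"): 1, ("C", "D"): 4, ("D", "C"): 0, ("D", "D"): 3}

def tit_for_tat(opponent_history):
    # Cooperate first, then mirror the opponent's last move.
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return "D"

def play_match(strategy_a, strategy_b, rounds=10):
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)  # each sees the *other's* past moves
        move_b = strategy_b(history_a)
        score_a -= YEARS[(move_a, move_b)]
        score_b -= YEARS[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b

print(play_match(tit_for_tat, always_defect))  # (-31, -27): one betrayal, then mutual defection
print(play_match(tit_for_tat, tit_for_tat))    # (-10, -10): steady cooperation
```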
The Tit-for-Tat Strategy Can Never Win an Individual Matchup
Sapolsky explains that the tit-for-tat strategy can never win individual matchups: “best case is a draw, if playing against another person using Tit for Tat or someone using an “always cooperate” strategy. Otherwise it loses by a small margin. Every other strategy would always beat Tit for Tat by a small margin. However, other strategies playing against each other can produce catastrophic losses. And when everything is summed, Tit for Tat wins” (2018, Kindle Location: 5,518).
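This is easy to reproduce in a toy round-robin. The four entrant strategies and the ten-round matches below are my own illustrative assumptions, not the historical tournament field; in this tiny pool, tit-for-tat draws or narrowly loses every head-to-head yet ties for the best aggregate total.

```python
# A toy round-robin in the spirit of Axelrod's tournaments
# (the entrants and match length are illustrative, not historical).
YEARS = {("C", "C"): 1, ("C", "D"): 4, ("D", "C"): 0, ("D", "D"): 3}

def tit_for_tat(opp):      return "C" if not opp else opp[-1]
def always_defect(opp):    return "D"
def always_cooperate(opp): return "C"
def grudger(opp):          return "D" if "D" in opp else "C"  # defects forever once betrayed

def play_match(a, b, rounds=10):
    ha, hb, sa, sb = [], [], 0, 0
    for _ in range(rounds):
        ma, mb = a(hb), b(ha)
        sa -= YEARS[(ma, mb)]
        sb -= YEARS[(mb, ma)]
        ha.append(ma)
        hb.append(mb)
    return sa, sb

entrants = [tit_for_tat, always_defect, always_cooperate, grudger]
totals = {s.__name__: 0 for s in entrants}
for i, a in enumerate(entrants):
    for b in entrants[i + 1:]:
        sa, sb = play_match(a, b)
        totals[a.__name__] += sa
        totals[b.__name__] += sb
        print(f"{a.__name__} vs {b.__name__}: {sa} vs {sb}")

print(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))
# tit_for_tat never beats an opponent head-to-head (it draws or loses
# narrowly), yet it ties grudger for the best total in this small pool.
```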
However, knowledge of the start and end rounds impacts the choice. Since tit-for-tat depends on future rounds, if the last round is known, it is always advantageous to defect in it (Rapoport and Dale, 1966). Axelrod adds, “such a line of reasoning implies that the game will unravel all the way back to mutual defection on the first move of any sequence of plays that is of known finite length” (Axelrod, 2006).
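A short sketch of this end effect, under the same assumed payoffs: a player who mirrors tit-for-tat but defects on the known final round comes out ahead, because the final defection can no longer be punished.

```python
# Sketch of the "end effect": if the final round is known, defecting
# on it cannot be punished. end_defector plays tit-for-tat except on
# the announced last round, where it defects.
YEARS = {("C", "C"): 1, ("C", "D"): 4, ("D", "C"): 0, ("D", "D"): 3}
ROUNDS = 10  # both players know the match length in advance

def tit_for_tat(opp, round_no):
    return "C" if not opp else opp[-1]

def end_defector(opp, round_no):
    if round_no == ROUNDS - 1:          # known last round: defect
        return "D"
    return "C" if not opp else opp[-1]  # otherwise mirror like tit-for-tat

ha, hb, sa, sb = [], [], 0, 0
for r in range(ROUNDS):
    ma, mb = end_defector(hb, r), tit_for_tat(ha, r)
    sa -= YEARS[(ma, mb)]
    sb -= YEARS[(mb, ma)]
    ha.append(ma)
    hb.append(mb)

print(sa, sb)  # -9 vs -13: the last-round defector comes out ahead
```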
After extensive research on cooperation using the prisoner’s dilemma, Anatol Rapoport concluded, “it is reasonable to suppose that the subjects by and large learned that, in the long run, cooperation pays, whereas defection, although momentarily advantageous, does not pay in the long run” (1995).
Real Life Applications, Considerations, and Complications
By examining the prisoner’s dilemma, researchers and theorists have gained insights into the complexities of strategic decision-making, trust, and cooperation in various settings. It serves as a thought-provoking example that challenges our understanding of human behavior and the dynamics of relationships. However, a game with set rules does not match the complexity of the world in which we live.
An important point to remember is that the prisoner’s dilemma is not about prisoners. It is about individual choice and relationships. Behavioral science is less concerned with individual variations and more concerned with overall impact. As individuals applying these lessons to our unique relationships, we must use extreme caution.
Rapoport concluded, “the lesson to be drawn from the prisoner’s dilemma paradox is that the time honored principles of individual rationality may get us in trouble when applied in circumstances which call for collective rationality–which is not necessarily a sum of individual rationalities” (1975).
Tit-for-Tat in Intimate Relationships
Trudy Govier, in her hard-to-find book on trust, wrote that the tit-for-tat strategy “for personal relationships…is rarely sensible.” She explains, “Tit for tat is not appropriate for personal relationships for many good reasons, prominent among these being the fact that these relationships lack key features of the Prisoner’s Dilemma. Unlike the prisoners in the dilemma, people in relationships can communicate. And unlike the situation faced by those prisoners, what counts as cooperation, as defection, and as pay-off is unclear” (1998).
A tit-for-tat approach can destroy a relationship as it descends deeper and deeper into vindictive retribution. To avoid being the sucker of an unscrupulous partner, the ultimate response to all their ‘tats’ is to move on, creating space for a better relationship.
Perception
The prisoner’s dilemma is based on clear rules and immediate knowledge of the other person’s choice to cooperate or defect. Life isn’t like this. Trivers explains this complication: “If you lie and I believe you, I suffer. If you lie and I disbelieve you, you are likely to suffer. By contrast, in the prisoner’s dilemma, each individual knows after each reciprocal play how the other played (cooperate or defect), and a simple reciprocal rule can operate under the humblest of conditions—cooperate initially, then do what your partner did on the previous move (tit for tat). But with deception, there is no obvious reciprocal logic. If you lie to me, this does not mean my best strategy is to lie back to you—it usually means that my best strategy is to distance myself from you or punish you.”
Trivers remarks that “of course it is better to begin with very simple games and only add complexity as we learn more about the underlying dynamics” (2011, Kindle location: 1,014).
References:
Axelrod, Robert (2006). The Evolution of Cooperation. Basic Books; Revised edition.
Dufwenberg, Martin (2011). Game theory. Wiley Interdisciplinary Reviews: Cognitive Science, 2(2), 167-173. https://doi.org/10.1002/wcs.119.
Govier, Trudy (1998). Dilemmas of Trust. McGill-Queen’s University Press; First Edition.
Pinker, Steven (2011). The Better Angels of Our Nature: Why Violence Has Declined. Penguin Books.
Rapoport, Anatol (1995). Prisoner’s Dilemma: Reflections and Recollections. Simulation & Gaming: An Interdisciplinary Journal of Theory, Practice and Research, 26(4), 489-503. https://doi.org/10.1177/1046878195264010.
Rapoport, Anatol (1975). Some comments on “Prisoner’s Dilemma: Metagames and other solutions”. Systems Research & Behavioral Science, 20(3), 206-208. https://doi.org/10.1002/bs.3830200309.
Rapoport, Anatol & Dale, Phillip S. (1966). The “end” and “start” effects in iterated Prisoner’s Dilemma. Journal of Conflict Resolution, 10(3), 363-366. https://doi.org/10.1177/002200276601000308
Sapolsky, Robert (2018). Behave: The Biology of Humans at Our Best and Worst. Penguin Books; Illustrated edition.
Trivers, Robert (2011). The Folly of Fools: The Logic of Deceit and Self-Deception in Human Life. Basic Books; 1st edition.
Psychology Fanatic Book References:
Throughout the vast selection of articles found at Psychology Fanatic, you will find a host of book references. I proudly boast that these referenced books are not just quotes I found in other articles but are books that I have actually read. Please visit the Psychology Fanatic database of books.