Prisoner’s Dilemma

| T. Franklin Murphy


The Prisoner’s Dilemma: Examples, Payoff Matrix, and the Psychology of Trust

The prisoner’s dilemma is a classic concept in game theory that illustrates a common problem faced in various scenarios involving decision-making and cooperation. Much early examination of strategic decision-making centered on zero-sum games. Basically, a zero-sum game is one where one person’s gain equals a corresponding loss to the other person. However, real-life decisions are often more complex than choosing what benefits one person at a cost to another.

In the 1940s, the polymath John von Neumann founded game theory. At the beginning of the nuclear race, strategic decisions had nasty consequences if miscalculated. Basically, it is better to bomb than be bombed. Von Neumann’s work influenced decisions in economics, diplomacy, and warfare. In collaboration with Oskar Morgenstern, he explored situations with more than one stakeholder and decisions that were not zero-sum games (Dufwenberg, 2011).

See Game Theory for more on this topic

What is the Prisoner’s Dilemma? A Game Theory Definition

The prisoner’s dilemma game is a game that includes the option of cooperation. Behavioral scientists have explored critical decision-making using research built around the prisoner’s dilemma game.

The backdrop of the prisoner’s dilemma is that two prisoners are charged with the same crime. Held separately, officials interrogate each, hoping for a confession. For lack of evidence, the courts can only convict them of the larger crime if one of the prisoners confesses, implicating the other. If both prisoners confess, both will be charged with the higher crime. If neither confesses, both will be charged with the lesser crime.

And finally, if only one confesses, he is set free for having turned state’s evidence and is given a reward. The courts convict the other prisoner on the strength of that testimony, and he receives a more severe sentence than if he had also confessed.

A Simple Modern Portrayal of the Prisoner’s Dilemma

Imagine you and a coworker are both working on a big project. If you both work hard (Cooperate), you both get a small bonus. If you slack off while the other works (Defect), you get to rest while they do all the work, and you still get the credit. If you both slack off, the project fails and you both get fired. Even though working together is better for the group, the fear of being the only one working hard often leads both people to slack off.

Key Description:

Two members of a gang, A and B, are arrested. Prosecutors lack evidence to convict them of a major crime but can get them on a lesser charge, for which they’ll serve a year in prison. A and B can’t communicate with each other. Prosecutors offer each a deal: inform on the other and your sentence is reduced. There are four possible outcomes:

  • Both A and B refuse to inform on each other: each serves one year.
  • Both A and B inform on each other: each serves two years.
  • A informs on B, who remains silent: A walks free and B serves three years.
  • B informs on A, who remains silent: B walks and A serves three years (Sapolsky, 2018).

The Dilemma

The dilemma is that individually it is in the interest of each to confess no matter what the other does; however, it is in their collective interest for both to hold out. There is no satisfactory solution to the paradox of this game. The simplicity of the prisoner’s dilemma is misleading because what seems rational from our own point of view turns out to be detrimental in the end. Steven Pinker wrote, “Game theorists have shown that the best decision for each player individually is sometimes the worst decision for both collectively” (Pinker, 2011).

In this dilemma, both prisoners have two choices: cooperate with each other by remaining loyal and refusing to confess to the crime or betray one another. The three potential outcomes of their decisions are:

  1. If both prisoners choose to cooperate, meaning they remain loyal to each other and stay silent, they will both receive a moderate sentence for a lesser offense. This outcome is often considered the most beneficial for both parties involved, as it minimizes their overall punishment.
  2. If one prisoner remains loyal while the other betrays, the betrayer will receive a significantly reduced sentence or even go free, while the cooperating prisoner faces a harsher penalty. This scenario reflects the advantage a person may gain by exploiting the trust of their counterpart.
  3. If both prisoners choose to betray each other and confess, they will both receive a severe punishment. This outcome is considered the worst-case scenario since both individuals end up worse off than if they had remained loyal.

The Payoff Matrix: Understanding the Rewards and Risks

In order to evaluate data, researchers often assign a mathematical value to each potential outcome. For example, consider the punishment for outcome 1 as each prisoner receiving a sentence of one year. Combined (1+1), the punishment is two years of incarceration. In outcome 2, one prisoner is set free and the other receives the maximum of four years. Combined (0+4), the punishment is four years of incarceration. For outcome 3, each prisoner receives three years. Combined (3+3), the punishment is six years of incarceration.

The possible outcomes for the group, however, differ from the possible outcomes for the individual.

For the Group:

  • Group Outcome 1= (both remain loyal) two years (-2)
  • Group Outcome 2= (one remains loyal) four years (-4)
  • Group Outcome 3= (both defect) six years (-6)

For the Individual:

  • Individual Outcome 1= (defect and confess to crime but partner doesn’t) Set free (0)
  • Individual Outcome 2= (both remain loyal) one year (-1)
  • Individual Outcome 3= (both defect and confess) three years (-3)
  • Individual Outcome 4= (remain loyal but partner defects and confesses) four years (-4)

The prisoner’s dilemma highlights a conflict between self-interest and cooperation. Each prisoner faces the choice between maximizing their own benefit (individual outcome 1) or risking the greatest potential harm to themselves (individual outcome 4) in order to achieve a mutually beneficial outcome (group outcome 1). This dilemma is often used to analyze various social, economic, and political situations where the actions of individuals impact the collective as a whole.

The 4 Outcomes of the Prisoner’s Dilemma:

  1. Both Cooperate (Reward): Both get a moderate benefit (e.g., 1 year in prison).
  2. Both Defect (Punishment): Both get a moderate penalty (e.g., 3 years in prison).
  3. You Cooperate, They Defect (The Sucker): You get the maximum penalty; they go free.
  4. You Defect, They Cooperate (The Temptation): You go free; they get the maximum penalty.
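The payoff structure above can be written down directly. A minimal sketch in Python, using the sentence lengths from the outcomes listed above (negative numbers represent years served, so larger is better):

```python
# Payoff matrix for the prisoner's dilemma described above.
# "C" = cooperate (remain loyal/silent), "D" = defect (confess).
# Each value is (my payoff, partner's payoff) in negative years served.
PAYOFFS = {
    ("C", "C"): (-1, -1),  # Reward: both loyal, one year each
    ("C", "D"): (-4, 0),   # Sucker: I stay loyal, partner confesses
    ("D", "C"): (0, -4),   # Temptation: I confess, partner stays loyal
    ("D", "D"): (-3, -3),  # Punishment: both confess, three years each
}

def group_total(move_a, move_b):
    """Combined years served by both prisoners (as a negative payoff)."""
    a, b = PAYOFFS[(move_a, move_b)]
    return a + b

# Mutual loyalty minimizes total punishment for the group...
assert group_total("C", "C") == -2   # two years combined
assert group_total("D", "D") == -6   # six years combined
# ...but each individual is tempted by the free ride (0 > -1).
assert PAYOFFS[("D", "C")][0] > PAYOFFS[("C", "C")][0]
```

The assertions make the tension explicit: the group’s best cell and the individual’s best cell are different cells of the same matrix.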

Nash Equilibrium

The Nash Equilibrium describes a stable state where neither player has an incentive to change their strategy, given the action of the other player (Nowak & Sigmund, 2005). Although both participants would technically be better off if they agreed to cooperate, standard game theory predicts they will settle on mutual defection because it acts as a “dominant” strategy, meaning it provides a better individual payoff regardless of whether the opponent decides to stay silent or confess (Batson, 2011; Stanovich & West, 1998; Kenrick & Griskevicius, 2013).

Driven by the rational desire to avoid the “sucker’s payoff” (cooperating while the other person defects), both players choose to betray one another, locking themselves into a suboptimal outcome where they maximize individual security at the cost of the greater collective good (Sapolsky, 2018; Kenrick & Griskevicius, 2013).

Researchers often study prisoner’s dilemmas in the laboratory, offering people different amounts of money for cooperating versus defecting. For example, if you and the other person in an experiment both choose to cooperate, you each get $5. But if you both defect on one another, you walk away with only $2. That might make it seem like cooperation is the best strategy, but it isn’t. If you choose to defect but your partner cooperates, you win $8, while the other guy gets $0. Of course, if the reverse happens (you cooperate but your partner defects), you’re the one who ends up with zilch; this is known as the “sucker’s payoff.”

~Douglas T. Kenrick & Vladas Griskevicius
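The quoted dollar amounts make the dominant-strategy logic easy to check mechanically. A small sketch using those experimental payoffs ($5 mutual cooperation, $2 mutual defection, $8 temptation, $0 sucker’s payoff):

```python
# payoff[(my_move, their_move)] -> my winnings in dollars,
# using the amounts from the experiment quoted above.
payoff = {
    ("C", "C"): 5, ("C", "D"): 0,   # cooperate: $5 together, $0 if suckered
    ("D", "C"): 8, ("D", "D"): 2,   # defect: $8 temptation, $2 together
}

# Defection pays more no matter what the partner does (dominant strategy)...
for their_move in ("C", "D"):
    assert payoff[("D", their_move)] > payoff[("C", their_move)]

# ...yet mutual defection leaves both players worse off than mutual
# cooperation: the equilibrium is not the collective optimum.
assert payoff[("D", "D")] < payoff[("C", "C")]
```

Because the inner assertion holds for both of the partner’s possible moves, no unilateral switch away from defection can help either player, which is exactly the stability the Nash Equilibrium describes.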

The Prisoner’s Dilemma Tournaments

To research decision-making using the prisoner’s dilemma over multiple rounds of play, social scientists organized tournaments where contestants submitted a strategy (always defect, always stay loyal, respond in kind, etc.).

Robert Axelrod wrote this about making the experiment multiple rounds:

“This changed the game from a one-move Prisoner’s Dilemma in which defection is the dominant choice, to an iterated Prisoner’s Dilemma in which conditional strategies are possible” (Axelrod, 2006).

Interestingly, the multiple-round game significantly changed the long-term benefits of cooperation (remaining loyal).

Robert Trivers wrote:

“Cooperation has been well modeled as a simple prisoner’s dilemma. Cooperation by both parties benefits each, while defections hurt both, but each is better off if he defects while the other cooperates. Cheating is favored in single encounters, but cooperation may emerge much of the time, if players are permitted to respond to their partner’s previous moves” (Trivers, 2011).

The multiple rounds created a new dynamic. Players remember how you responded on previous rounds. If you repeatedly defected, they cannot trust you to be loyal, therefore they are more likely to defect in future rounds. As Axelrod explains, this basic problem occurs “when the pursuit of self-interest by each leads to a poor outcome for all” (Axelrod, 2006).

The Iterated Prisoner’s Dilemma: Why ‘Tit-for-Tat’ Wins

The best-performing strategy in several iterated prisoner’s dilemma tournaments was tit-for-tat. In the tit-for-tat strategy, the player begins with cooperation and thereafter responds in kind to the other player’s last move. Since this strategy encourages cooperation, it performs better in tournaments of multiple rounds without being a sucker, as in the strategy to always cooperate.

What makes the tit-for-tat strategy successful is the possibility of meeting again in later rounds. The choice to defect or cooperate then is a strategy to influence future round meetings. “This possibility means that the choices made today not only determine the outcome of this move, but can also influence the later choices of the players. The future can therefore cast a shadow back upon the present and thereby affect the current strategic situation” (Axelrod, 2006).
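The rule itself is only a few lines. A minimal sketch of an iterated game, reusing the dollar payoffs quoted earlier ($5 mutual cooperation, $2 mutual defection, $8 temptation, $0 sucker); the ten-round length and the always-defect opponent are illustrative assumptions, not from the tournaments themselves:

```python
# PAYOFF[(my_move, their_move)] -> (my score, their score) in dollars.
PAYOFF = {("C", "C"): (5, 5), ("C", "D"): (0, 8),
          ("D", "C"): (8, 0), ("D", "D"): (2, 2)}

def tit_for_tat(their_history):
    """Cooperate on the first round, then echo the partner's last move."""
    return "C" if not their_history else their_history[-1]

def always_defect(their_history):
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Total score for each strategy over an iterated game."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = strategy_a(hist_b), strategy_b(hist_a)
        gain_a, gain_b = PAYOFF[(a, b)]
        score_a += gain_a
        score_b += gain_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

# Tit-for-tat is exploited only once, then retaliates every round:
print(play(tit_for_tat, always_defect))  # (18, 26): loses, but narrowly
# Against itself, cooperation locks in from the first round:
print(play(tit_for_tat, tit_for_tat))    # (50, 50): a draw
```

Note the pattern in the output: tit-for-tat never outscores its direct opponent; at best it draws, and against a defector it loses by a small, bounded margin while denying the defector any further exploitation.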

Tit for Tat Strategy Can Never Win Individual Matchups

Sapolsky explains that the tit-for-tat strategy can never win individual matchups: “best case is a draw, if playing against another person using Tit for Tat or someone using an ‘always cooperate’ strategy. Otherwise it loses by a small margin. Every other strategy would always beat Tit for Tat by a small margin. However, other strategies playing against each other can produce catastrophic losses. And when everything is summed, Tit for Tat wins” (Sapolsky, 2018).

However, knowledge of start and end rounds impacts the choice. Since tit-for-tat is based on future rounds, if the last round is known, it is always advantageous to defect (Rapoport & Dale, 1966). Axelrod adds, “such a line of reasoning implies that the game will unravel all the way back to mutual defection on the first move of any sequence of plays that is of known finite length” (Axelrod, 2006).
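This unraveling can be seen directly: a variant that plays tit-for-tat but defects on an announced final round already outprofits plain tit-for-tat head-to-head, and the same logic then applies to the second-to-last round, and so on. A minimal sketch (the round count, strategy names, and dollar payoffs are illustrative assumptions):

```python
# PAYOFF[(my_move, their_move)] -> (my score, their score) in dollars.
PAYOFF = {("C", "C"): (5, 5), ("C", "D"): (0, 8),
          ("D", "C"): (8, 0), ("D", "D"): (2, 2)}
ROUNDS = 10  # publicly known, finite game length

def tit_for_tat(their_history, round_number):
    return "C" if not their_history else their_history[-1]

def endgame_defector(their_history, round_number):
    """Tit-for-tat, except always defect on the known final round."""
    if round_number == ROUNDS - 1:
        return "D"
    return tit_for_tat(their_history, round_number)

def play(strategy_a, strategy_b):
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for r in range(ROUNDS):
        a, b = strategy_a(hist_b, r), strategy_b(hist_a, r)
        gain_a, gain_b = PAYOFF[(a, b)]
        score_a += gain_a
        score_b += gain_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

# Nine rounds of mutual cooperation, then a final-round betrayal:
print(play(tit_for_tat, endgame_defector))  # (45, 53)
```

Because there is no eleventh round in which to retaliate, the last-round defection is pure profit, which is why a known endpoint removes the “shadow of the future” that makes conditional cooperation work.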

After extensive research in cooperation, using the prisoner’s dilemma, Anatol Rapoport concluded, “It is reasonable to suppose that the subjects by and large learned that, in the long run, cooperation pays, whereas defection, although momentarily advantageous, does not pay in the long run” (Rapoport, 1995).

Reciprocal Altruism and the Prisoner’s Dilemma

Reciprocal altruism is essentially the evolutionary version of the maxim “you scratch my back, and I’ll scratch yours” (Nowak & Sigmund, 2005). It occurs when an individual helps another, incurring a small immediate cost to provide a relatively large benefit to the recipient, under the assumption that the favor will be repaid in the future.

While natural selection typically favors self-interest, this mechanism allows cooperation to evolve even among individuals who are not related to one another (Trivers, 1971). For this exchange to remain stable, the participants must interact repeatedly and possess the cognitive ability to remember past interactions, ensuring they can distinguish between reliable partners and those who take without giving (Pinker, 2003; Stevens & Hauser, 2004).

In the specific context of the Prisoner’s Dilemma, reciprocal altruism provides a solution to the puzzle of why rational agents would ever choose to cooperate rather than defect. While betraying a partner might be the best move in a single, isolated encounter, real-life interactions are often repeated, creating an “iterated” Prisoner’s Dilemma where players can respond to previous behaviors (Trivers, 1971; Stevens & Hauser, 2004). This repetition allows for conditional strategies like “Tit-for-Tat,” where an individual starts by cooperating and subsequently mirrors their partner’s actions (Carter, 2024; Axelrod, 2006).

By making their helpfulness contingent on the other person’s behavior, individuals can enjoy the long-term rewards of mutual cooperation while simultaneously punishing “cheaters” who fail to reciprocate (Trivers, 1971).

See Reciprocal Altruism for more on this concept

Prisoner’s Dilemma Examples: From The Cold War to Business

By examining the prisoner’s dilemma, researchers and theorists have gained insights into the complexities of strategic decision-making, trust, and cooperation in various settings. It serves as a thought-provoking example that challenges our understanding of human behavior and the dynamics of relationships. However, a game with set rules does not match the complexity of the world in which we live.

An important point to remember is that the prisoner’s dilemma is not about prisoners. It is about individual choice and relationships. Behavioral science is less concerned with individual variations and more concerned with overall impact. As individuals applying these lessons to our unique relationships, we must use extreme caution.

Rapoport concluded:

“The lesson to be drawn from the prisoner’s dilemma paradox is that the time-honored principles of individual rationality may get us in trouble when applied in circumstances which call for collective rationality–which is not necessarily a sum of individual rationalities” (Rapoport, 1975).

The Cold War and the Security Dilemma

The Cold War provides a stark historical illustration of the Prisoner’s Dilemma on a global scale, specifically through the “security dilemma” where nations seek safety by means that inevitably threaten the security of others (Axelrod, 2006). In this scenario, superpowers faced a choice between disarmament (cooperation) and an arms buildup (defection); while mutual disarmament yields the best collective safety and economic outcome, the dominant strategy for each individual side is to arm themselves to avoid the catastrophic “sucker’s payoff” of being unarmed against an armed opponent (Sapolsky, 2018; Axelrod, 2006).

This dynamic is exacerbated by loss aversion, where each side views its own concessions as perilous losses while viewing the opponent’s concessions as minor gains, hindering the ability to reach agreements on missile reduction (Quattrone & Tversky, 2000; Kahneman & Tversky, 2000). The danger of relying on this dominant strategy in an ongoing relationship is that it traps both parties in a high-cost, high-risk equilibrium where resources are wasted on weapons that ultimately leave neither side safer (Axelrod, 2006).

Business and the Trap of Short-Term Profit

In the business world, the Prisoner’s Dilemma appears when companies or individuals prioritize short-term market maximization over long-term stability, such as in price wars or the exploitation of supply chains. While “market pricing” and ruthless competition might be rational when dealing with strangers in one-off transactions, these tactics often backfire in ongoing business relationships where trust is essential (Kenrick & Griskevicius, 2013).

Unless you’re involved in a one-time negotiation with a total stranger about how many pesos to pay for a pound of pomegranates, market economics typically makes for bad business. Cold, hard rational self-interest might make sense in dealing with potentially hostile strangers you’ll never see or hear from again, but it doesn’t work too well when you’re dealing with people with whom you’ll be doing business for any length of time, be they coworkers, clients, or simply repeat customers.

~Douglas T. Kenrick & Vladas Griskevicius (2013, p. 68-69)

If a business adopts the dominant strategy of squeezing every penny from a partner, effectively “defecting” to maximize immediate profit, it risks creating an environment of mutual recrimination and distrust (Axelrod, 2006; Kenrick & Griskevicius, 2013). Because businesses usually operate in an “iterated” version of the game where players meet again, the use of a cutthroat dominant strategy is dangerous; it can destroy the cooperative environment necessary for mutual profit, leading to retaliation and the loss of valuable partners (Axelrod, 2006).

Political Polarization and the Zero-Sum Game

These concepts also apply to the intense polarization in modern American politics, where parties often view interactions as zero-sum games rather than opportunities for mutual governance (Sapolsky, 2018). Partisan “silos” and “tribalism” encourage a dominant strategy of uncompromising obstruction and the demonization of the opposition, as cooperation is frequently viewed as a betrayal of one’s own group (Kakutani, 2018).

Nationalism, tribalism, dislocation, fears of social change, and the hatred of outsiders are on the rise again as people, locked in their partisan silos and filter bubbles, are losing a sense of shared reality and the ability to communicate across social and sectarian lines.

~Michiko Kakutani (2018)

When political actors treat their counterparts as enemies rather than colleagues, moral debates tend to escalate hostilities rather than resolve them, and the refusal to compromise, the political equivalent of defection, becomes the norm (Pinker, 2003). The danger of applying this dominant strategy to the ongoing governance of a nation is that it erodes shared reality and democratic institutions; instead of checking one another’s power, factions become locked in a cycle of “truth decay” and retaliation that paralyzes the system and prevents the resolution of shared problems (Kakutani, 2018).

See The Minimal Group Paradigm (MGP) for more information on this topic

Social psychologists have found that with divisive moral issues, especially those on which liberals and conservatives disagree, all combatants are intuitively certain they are correct and that their opponents have ugly ulterior motives. They argue out of respect for the social convention that one should always provide reasons for one’s opinions, but when an argument is refuted, they don’t change their minds but work harder to find a replacement argument.

~Steven Pinker (2003)

Tit-for-Tat in Intimate Relationships

Trudy Govier, in her hard-to-find book on trust, wrote that the tit-for-tat strategy for personal relationships is “rarely sensible.”

She explains:

“Tit for tat is not appropriate for personal relationships for many good reasons, prominent among these being the fact that these relationships lack key features of the Prisoner’s Dilemma. Unlike the prisoners in the dilemma, people in relationships can communicate. And unlike the situation faced by those prisoners, what counts as cooperation, as defection, and as pay-off is unclear” (Govier, 1998).

A tit-for-tat approach can destroy a relationship as it descends deeper and deeper into vindictive retribution. However, to avoid being the sucker of an unscrupulous partner, the ultimate response to all their ‘tats’ is to move on, creating space for a better relationship.

Perception

The prisoner’s dilemma is based on clear rules and immediate knowledge of the other person’s choice to cooperate or defect. Life isn’t like this. Trivers explains this complication: “If you lie and I believe you, I suffer. If you lie and I disbelieve you, you are likely to suffer.

By contrast, in the prisoner’s dilemma, each individual knows after each reciprocal play how the other played (cooperate or defect), and a simple reciprocal rule can operate under the humblest of conditions: cooperate initially, then do what your partner did on the previous move (tit for tat). But with deception, there is no obvious reciprocal logic. If you lie to me, this does not mean my best strategy is to lie back to you; it usually means that my best strategy is to distance myself from you or punish you.”

Trivers remarks that, “Of course it is better to begin with very simple games and only add complexity as we learn more about the underlying dynamics” (Trivers, 2011).

Associated Concepts

  • Game Theory: The Prisonerโ€™s Dilemma is a classic example used in game theory, which studies strategic interactions where the outcome for each participant depends on the actions of all.
  • The Primary Dilemma: This dilemma refers to the primary human conflict between satisfying individual needs and the social rules for connection.
  • Theory of Reasoned Action: According to this theory, there is a relationship between attitudes and behaviors. This theory posits that an individualโ€™s behavior is determined by their intention to perform the behavior, which is influenced by their attitude toward the behavior and subjective norms.
  • Prospect Theory: A behavioral economic theory that describes how people choose between probabilistic alternatives that involve risk, where individuals know the probabilities of outcomes. Neuroeconomics often employs prospect theory to interpret neural data related to decision-making under risk.
  • Neuroeconomics: This field of study combines methods and theories from neuroscience, psychology, and economics to understand how individuals make decisions. By exploring the neural mechanisms underlying economic decision-making processes, neuroeconomics aims to shed light on topics such as risk, reward, and social interactions.
  • Value Theory: This theory is a branch of philosophy that examines the nature, origin, and evaluation of human values and moral principles. It explores questions about what constitutes intrinsic value, the source of value, and how value influences human behavior and decision-making.
  • Rational Choice Theory: This theory provides a framework exploring how individuals make decisions by weighing the costs and benefits of different options. It assumes that people are rational actors who seek to maximize their self-interest.

A Few Words by Psychology Fanatic

In conclusion, the Prisoner’s Dilemma is more than just a theoretical conundrum; it is a reflection of the complex interplay between individual interests and collective welfare. It challenges us to consider the implications of our actions, not just in isolation, but as part of a broader network of social interactions. Whether we are conscious of it or not, we encounter versions of this dilemma in various aspects of life: from business negotiations and political decisions to personal relationships and environmental issues.

As we navigate through these decisions, the principles derived from the Prisoner’s Dilemma can guide us towards understanding the value of long-term cooperation over short-term gain. It teaches us that sometimes, the best way to advance our own interests is by considering the well-being of others. Ultimately, the enduring relevance of the Prisoner’s Dilemma lies in its ability to illuminate the intricate dance between competition and cooperation that defines the human experience.

Last Update: February 6, 2026

References:

Axelrod, Robert (2006). The Evolution of Cooperation. Basic Books; Revised edition. ISBN-13: 9781541606845

Batson, Charles D. (2011). Altruism in Humans. Oxford University Press. ISBN: 9780195341065

Carter, G. (2024). Reciprocity versus pseudo-reciprocity: A false dichotomy. Ethology, 130(4). DOI: 10.1111/eth.13431

Dufwenberg, Martin (2011). Game theory. Wiley Interdisciplinary Reviews: Cognitive Science, 2(2), 167-173. DOI: 10.1002/wcs.119

Govier, Trudy (1998). Dilemmas of Trust. McGill-Queen’s University Press; First edition. ISBN-10: 0773517979

Kahneman, Daniel; Tversky, Amos (2000). Conflict resolution: A cognitive perspective. In: Kahneman, D., & Tversky, A. (Eds.). Choices, Values, and Frames. pp. 473-488. Cambridge University Press. ISBN: 9780521627498

Kakutani, Michiko (2018). The Death of Truth: Notes on Falsehood in the Age of Trump. Random House. ISBN: 978-0525574828

Kenrick, Douglas T.; Griskevicius, Vladas (2013). The Rational Animal: How Evolution Made Us Smarter Than We Think. Basic Books. ISBN: 9780465032426

Nowak, Martin; Sigmund, Karl (2005). Evolution of indirect reciprocity. Nature. DOI: 10.1038/nature04131

Pinker, Steven (2003). The Blank Slate: The Modern Denial of Human Nature. Penguin Books; Reprint edition. ISBN-10: 0142003344

Quattrone, George A.; Tversky, Amos (2000). Contrasting rational and psychological analyses of political choice. In: Kahneman, D., & Tversky, A. (Eds.). Choices, Values, and Frames. pp. 451-472. Cambridge University Press. ISBN: 9780521627498

Rapoport, Anatol & Dale, Phillip S. (1966). The “end” and “start” effects in iterated Prisoner’s Dilemma. Journal of Conflict Resolution, 10(3), 363-366. DOI: 10.1177/002200276601000308

Rapoport, Anatol (1975). Some comments on “Prisoner’s Dilemma: Metagames and other solutions”. Behavioral Science, 20(3), 206-208. DOI: 10.1002/bs.3830200309

Rapoport, Anatol (1995). Prisoner’s Dilemma: Reflections and recollections. Simulation & Gaming, 26(4), 489-503. DOI: 10.1177/1046878195264010

Sapolsky, Robert (2018). Behave: The Biology of Humans at Our Best and Worst. Penguin Books; Illustrated edition. ISBN-10: 1594205078

Stanovich, Keith E.; West, Richard (1998). Discrepancies between normative and descriptive models of decision making and the understanding/acceptance principle. Cognitive Psychology, 38(3), 349-385. DOI: 10.1006/cogp.1998.0700

Stevens, Jeffrey R.; Hauser, Marc D. (2004). Why be nice? Psychological constraints on the evolution of cooperation. Trends in Cognitive Sciences, 8(2), 60-65. DOI: 10.1016/j.tics.2003.12.003

Trivers, Robert (1971). The evolution of reciprocal altruism. The Quarterly Review of Biology, 46(1), 35-57. DOI: 10.1086/406755

Trivers, Robert (2011). The Folly of Fools: The Logic of Deceit and Self-Deception in Human Life. Basic Books; 1st edition. ISBN-10: 0465085970
