Game Theory · Lesson 08

How the future rescues cooperation

A one-shot prisoner's dilemma traps you; repeated play can set you free.

~14 min · one sitting · Skill: repeated games, tit-for-tat, shadow of the future · Builds on: prisoner's dilemma (L01), Nash, credibility

First — 20-second recall from Lesson 07

Without scrolling back: a credible commitment wins by…?

01 · THE TRAP, REVISITED · You already know the one-shot answer

Back in Lesson 1 you met the prisoner's dilemma in the form of a price war or ad arms race. Both players have a dominant strategy: defect (discount, overbid, undercut). Both end up worse than if they had cooperated. That's the trap — rational individual choices produce a collectively bad outcome.

The standard matrix for two clinics deciding whether to hold their consultation price or slash it:

Payoffs: (Your clinic, Rival clinic) · higher = better
	Rival holds	Rival discounts
You hold	(3, 3)	(0, 5)
You discount	(5, 0)	(1, 1)

Discount is dominant for both. The Nash equilibrium is (discount, discount), with payoffs (1, 1). You each earn 1 when you could both earn 3. Classic dilemma.
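If you want to check the matrix rather than take it on trust, here is a minimal sketch. It hard-codes the clinic payoffs from the table; the function names are illustrative, not from any library.

```python
# Payoffs from the table, indexed as PAYOFF[(your_move, rival_move)] = (you, rival).
HOLD, DISCOUNT = "hold", "discount"
MOVES = (HOLD, DISCOUNT)
PAYOFF = {
    (HOLD, HOLD): (3, 3),
    (HOLD, DISCOUNT): (0, 5),
    (DISCOUNT, HOLD): (5, 0),
    (DISCOUNT, DISCOUNT): (1, 1),
}

def strictly_dominant_move():
    """Return your move that beats every alternative against every rival move, or None."""
    for mine in MOVES:
        others = [m for m in MOVES if m != mine]
        if all(PAYOFF[(mine, r)][0] > PAYOFF[(o, r)][0] for o in others for r in MOVES):
            return mine
    return None

def nash_equilibria():
    """Return every pure-strategy profile where neither clinic gains by deviating alone."""
    found = []
    for you in MOVES:
        for rival in MOVES:
            you_ok = all(PAYOFF[(you, rival)][0] >= PAYOFF[(alt, rival)][0] for alt in MOVES)
            rival_ok = all(PAYOFF[(you, rival)][1] >= PAYOFF[(you, alt)][1] for alt in MOVES)
            if you_ok and rival_ok:
                found.append((you, rival))
    return found

print(strictly_dominant_move())  # discount
print(nash_equilibria())         # [('discount', 'discount')]
```

The game is symmetric, so the same dominance check holds from the rival's chair.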

But here is the thing: most of your real games repeat. The clinic two blocks away isn't disappearing after this month. The referring neurosurgeon you're networking with will see you again at the next conference. The vendor supplying your OR equipment will be on the phone next quarter. The skull-base course you run competes with a rival symposium year after year.

When the same players meet again and again, the game changes — and so does the rational move.

One-shot PD equilibrium and the cooperation problem are covered in SEP — Prisoner's Dilemma and Open Yale ECON 159 (Polak).

02 · THE SHADOW OF THE FUTURE · Defect today, pay for it every round ahead

Here is the core insight of repeated games. If you defect this month — slash your consultation price to grab patients — you get a one-time windfall. But you've also just signalled to the rival: this is how we play. They retaliate next month. Now you're both stuck at (1, 1) for many rounds instead of (3, 3).

The question becomes purely arithmetic: is the one-time grab worth more than the long-run loss? When players care enough about future rounds — when the future casts a long shadow — the answer is no. Cooperation becomes individually rational, not out of altruism, but because the math works out.

Economists formalise this with a discount factor (usually written δ, "delta") — a number between 0 and 1 representing how much you value next round's payoff relative to this round's. But you don't need the algebra. Plain-language version:

The shadow of the future

Cooperate when the future you'd burn by defecting is worth more than the one-time cheat. The longer the relationship, the more patient the players, the more credible the punishment — the stronger the shadow.

Levers that strengthen the shadow: ongoing/uncertain horizon (no known final round), patient players (low time preference), observable defection (the other side will know), credible punishment (they'll actually retaliate). Weaken any one of these and cooperation gets harder to sustain.
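For readers who do want the algebra, a minimal sketch follows. It assumes the clinic payoffs from section 01 and a rival who answers a single discount by discounting forever. That is the harshest possible punishment, chosen here only to keep the arithmetic short; the function names are illustrative.

```python
TEMPTATION, REWARD, PUNISHMENT = 5, 3, 1  # payoffs from the clinic table

def value_of_holding(delta):
    """Discounted value of earning 3 in every round: 3 + 3*delta + 3*delta**2 + ..."""
    return REWARD / (1 - delta)

def value_of_cheating(delta):
    """Earn 5 once, then 1 in every later round under permanent retaliation."""
    return TEMPTATION + delta * PUNISHMENT / (1 - delta)

def patience_threshold():
    """Smallest delta at which holding is at least as good as cheating."""
    return (TEMPTATION - REWARD) / (TEMPTATION - PUNISHMENT)

# Compare the two paths for an impatient, a borderline, and a patient clinic.
for delta in (0.3, 0.5, 0.8):
    print(delta, round(value_of_holding(delta), 2), round(value_of_cheating(delta), 2))

print(patience_threshold())  # 0.5
```

Under these assumptions the threshold is 0.5: a clinic that values next month at least half as much as this month does better by holding. A softer punishment would need more patience to sustain the same cooperation.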

The discount factor and shadow of the future are standard in SEP — Game Theory and developed extensively in Open Yale ECON 159, lectures 14–17.

03 · TIT-FOR-TAT · The rule that won Axelrod's tournament

In the early 1980s, political scientist Robert Axelrod ran a computer tournament. He invited game theorists to submit strategies for a repeated prisoner's dilemma. Hundreds of rounds, dozens of strategies, ranging from sophisticated to ruthless. The winner was the simplest entry submitted: tit-for-tat.

The rule: start by cooperating. In every subsequent round, do exactly what the other player did last round. If they cooperated, cooperate. If they defected, defect in the next round, then return to cooperating as soon as they do.

What made it win? Four properties, each doing real work:

Nice. It never defects first. It enters every relationship in good faith — which is why it scores well against other cooperative strategies and wastes no payoffs on preemptive hostility.
Retaliatory. It punishes defection immediately, in the very next round. Defectors don't get to freeload — they get mirrored back.
Forgiving. After one retaliation, it returns to cooperation if the other player does. It doesn't hold grudges, which limits spirals of mutual punishment that burn both sides.
Clear. Its pattern is obvious after two rounds. The other player quickly learns what to expect. That predictability is itself a strategic asset: it makes the implicit deal legible.

A related strategy — grim trigger — is harsher: cooperate until the first defection, then defect forever. It enforces cooperation more aggressively (the punishment is severe and permanent), but it's brittle: one mistake triggers endless mutual punishment. Tit-for-tat's forgiveness makes it more robust in noisy, imperfect-information environments — like, say, a competitive market where misreads happen.
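Both rules fit in a line or two of code. The sketch below is an illustration, not Axelrod's tournament code: it assumes the clinic payoffs from section 01, and `always_defect` and `defect_once` are hypothetical opponents made up for the demonstration.

```python
C, D = "C", "D"  # cooperate (hold), defect (discount)
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return C if not their_history else their_history[-1]

def grim_trigger(my_history, their_history):
    """Cooperate until the opponent defects once, then defect forever."""
    return D if D in their_history else C

def always_defect(my_history, their_history):
    """Hypothetical opponent: defects every round."""
    return D

def defect_once(my_history, their_history):
    """Hypothetical opponent: defects in round 3 only."""
    return D if len(my_history) == 2 else C

def play(strategy_a, strategy_b, rounds=8):
    """Run a match; return both move strings and both total scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        hist_a.append(a)
        hist_b.append(b)
        score_a += pay_a
        score_b += pay_b
    return "".join(hist_a), "".join(hist_b), score_a, score_b

print(play(tit_for_tat, always_defect))  # ('CDDDDDDD', 'DDDDDDDD', 7, 12)
print(play(tit_for_tat, defect_once))    # ('CCCDCCCC', 'CCDCCCCC', 23, 23)
print(play(grim_trigger, defect_once))   # ('CCCDDDDD', 'CCDCCCCC', 31, 11)
```

Read the move strings, not just the scores. Tit-for-tat is exploited exactly once by the permanent defector, and it punishes the one-time defector for a single round before returning to cooperation. Grim trigger never returns.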

Robert Axelrod, The Evolution of Cooperation (1984). The tournament results and tit-for-tat's four properties are described in detail there.

04 · PLAY IT OUT · One defection, many consequences

Frame the decision concretely. You and a rival clinic have been holding prices for months — both earning (3, 3) each round. Your rival, you have learned, plays tit-for-tat: they cooperate by default, but if you discount this month, they'll discount next month. Then they'll forgive if you return. You're tempted: a one-month discount would push you to (5, 0) — your best outcome this round.

But count forward. You earn 5 this round instead of 3, a gain of 2. Next round they retaliate. If you return to holding in that round, you hold while they discount, so you earn 0 instead of 3, a loss of 3. They then forgive and you're back to (3, 3). Net result of the defection: one round of 5, one round of 0, then back to 3. You "won" 2 now and "lost" 3 next round, so even at equal time value you come out 1 behind. If you keep discounting instead, you sit at (1, 1) for as long as you do, and the round in which you finally return to holding still costs you the 0. On top of that, you've burned trust and risked a longer spiral.

Before you see any options, type it: across many future rounds, what happens if you defect (discount) once against a tit-for-tat rival?

Model answer: you gain once (5 instead of 3), but tit-for-tat retaliates next round. Returning to holding in that round means holding while they discount, so you earn 0 instead of 3; after that they forgive and you're back to (3, 3). Net across the two rounds: 5 versus 6, a small loss on payoff, plus burned trust and spiral risk. Holding stays the higher-value path over many rounds.
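One way to check the count is to play it out. This is a minimal sketch, assuming the rival follows tit-for-tat exactly (opens by holding, then copies your previous move) and using the clinic payoffs from section 01; the function name is illustrative. Under that assumption, the round in which you return to holding is the round in which the rival is still retaliating, so it pays you 0.

```python
C, D = "C", "D"  # C = hold, D = discount
MY_PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}  # keyed by (my move, rival move)

def my_scores_against_tit_for_tat(my_moves):
    """Per-round payoffs when the rival opens with C and then copies my previous move."""
    scores, rival_move = [], C
    for mine in my_moves:
        scores.append(MY_PAYOFF[(mine, rival_move)])
        rival_move = mine  # tit-for-tat plays this move back at me next round
    return scores

always_hold = my_scores_against_tit_for_tat("CCCC")
cheat_once = my_scores_against_tit_for_tat("DCCC")
cheat_twice = my_scores_against_tit_for_tat("DDCC")

print(always_hold, sum(always_hold))  # [3, 3, 3, 3] 12
print(cheat_once, sum(cheat_once))    # [5, 0, 3, 3] 11
print(cheat_twice, sum(cheat_twice))  # [5, 1, 0, 3] 9
```

Every path back to (3, 3) runs through one round of 0, which is why a single discount never pays for itself at equal time value.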

This is also why visible pricing and market transparency matter strategically. If the rival can't easily observe that you discounted — if defection is hard to detect — the punishment can't land promptly, which weakens the tit-for-tat threat. In markets where prices are public (insurance schedules, published rates, transparent ads), defection is observable and punishment is swift. That's part of why those markets can sustain implicit cooperation even without explicit collusion.

A repeated game you already play

A long-run co-management relationship with a referring colleague: reputation compounds across many rounds, so a one-round "win" that burns their trust is actually a loss. The shadow of the future is what keeps both of you honest.

05 · THE LIMIT · Known endings unravel cooperation

Here is the important caveat — the thing that can collapse everything you just built. It's called backward induction, and it only applies when players know exactly when the game ends.

Suppose you and the rival both know — with certainty — that you will compete for exactly 12 more months and then the rival clinic closes. Think about month 12: there is no future after it. The shadow of the future has length zero. With no future payoffs to protect, month 12 is a one-shot game — and the dominant strategy is to defect. Both clinics slash prices in month 12.

But now think about month 11. Both players know month 12 will end in defection regardless of what either does in month 11. So month 11's outcome no longer affects what happens in month 12 — and with no future cooperation to preserve, both defect in month 11 too. The logic cascades backward, round by round, all the way to month 1. The repeated game unravels into a series of one-shot defections.
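The cascade can be written as a short loop that starts at the final month and walks backward. This is a sketch under stated assumptions: the clinic payoffs from section 01, a known 12-month horizon, and no discounting. The function names are illustrative. The key step is that later months are already pinned down, so their value is the same whatever you do now.

```python
HOLD, DISCOUNT = "hold", "discount"
MOVES = (HOLD, DISCOUNT)
MY_PAYOFF = {(HOLD, HOLD): 3, (HOLD, DISCOUNT): 0, (DISCOUNT, HOLD): 5, (DISCOUNT, DISCOUNT): 1}

def best_reply(rival_move, continuation_value):
    """My best move this month when later months are worth the same whatever I do now."""
    return max(MOVES, key=lambda mine: MY_PAYOFF[(mine, rival_move)] + continuation_value)

def solve_backward(months):
    """Work from the final month to the first; return the plan and its total value."""
    plan, continuation_value = {}, 0  # nothing comes after the final month
    for month in range(months, 0, -1):
        replies = {best_reply(rival, continuation_value) for rival in MOVES}
        assert len(replies) == 1  # same reply to either rival move: a dominant move
        move = replies.pop()
        plan[month] = move
        # The game is symmetric, so the rival reasons identically and plays `move` too.
        continuation_value += MY_PAYOFF[(move, move)]
    return plan, continuation_value

plan, total = solve_backward(12)
print(plan[12], plan[1], total)  # discount discount 12
```

Twelve months of mutual discounting are worth 12 to each clinic; twelve months of mutual holding would have been worth 36.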

A known, fixed last round is cooperation's kryptonite — the shadow of the future disappears exactly when you need it most.

If you and the rival KNOW this is the final month you will ever compete, what does each rationally do?

Real-world fixes for the unraveling problem: keep the horizon open or uncertain (no announced exit date — this is partly why acquisitions and retirements are kept quiet), build reputation that extends beyond this relationship, and make the relationship ongoing by adding new dimensions (referrals, co-teaching, shared vendors) that extend the shadow.

Backward induction in finite repeated games is covered in SEP — Game Theory and Open Yale ECON 159, lecture 15. The folk theorem (cooperation can be sustained in infinitely or indefinitely repeated games) is the counterpart result for the case where endings are unknown.

06 · KNOW THE LIMITS · When cooperation won't hold

Repetition and a long shadow don't guarantee cooperation on their own. Watch for the conditions that break the mechanism before you lean on it:

Contraindications — don't expect tit-for-tat to hold here

A known, finite end unravels cooperation by backward induction — the last round has no future to protect, so it collapses backward.
One-shot or anonymous interactions have no shadow of the future — don't expect tit-for-tat to sustain cooperation.
Tit-for-tat can lock two players into an endless retaliation spiral after a single mistake or misread.

Before you count on "the future will keep them honest," check that a future actually exists, that it's visible to both sides, and that one misstep won't cascade into permanent mutual punishment.
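The single-mistake risk is easy to see in a small simulation. This is a sketch, assuming both players use the same rule and that one move by player A is flipped to a defection by accident in round 3; `play_with_one_slip` is a name made up for the illustration.

```python
C, D = "C", "D"  # cooperate, defect

def tit_for_tat(their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return C if not their_history else their_history[-1]

def grim_trigger(their_history):
    """Cooperate until the opponent defects once, then defect forever."""
    return D if D in their_history else C

def play_with_one_slip(strategy, rounds=8, slip_round=3):
    """Both players use `strategy`; player A's move is flipped to D once, by mistake."""
    hist_a, hist_b = [], []
    for round_number in range(1, rounds + 1):
        a = strategy(hist_b)
        b = strategy(hist_a)
        if round_number == slip_round:
            a = D  # the single unintended (or misread) defection
        hist_a.append(a)
        hist_b.append(b)
    return "".join(hist_a), "".join(hist_b)

print(play_with_one_slip(tit_for_tat))   # ('CCDCDCDC', 'CCCDCDCD')
print(play_with_one_slip(grim_trigger))  # ('CCDCDDDD', 'CCCDDDDD')
```

Two tit-for-tat players echo the mistake back and forth, taking turns punishing each other, until one of them absorbs a defection without retaliating. Two grim-trigger players never recover at all.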

07 · LOCK IT IN · Bring it back to your real decisions

Repetition + a valued future + credible retaliation = sustainable cooperation. Three levers. Weaken any one and the equilibrium softens.
Tit-for-tat operationalises it. Nice (cooperate first), retaliatory (punish immediately), forgiving (return after they do), clear (legible pattern). Simple beats clever.
Known fixed endings break it. Backward induction unravels cooperation from the last round backward. The fix: open or uncertain horizon, reputation, relationship breadth.
Observability matters. Punishment only works if defection is detectable. Transparent markets sustain cooperation that opaque ones can't.

Bring it back to me. Identify an ongoing relationship of yours that is secretly a repeated prisoner's dilemma — a rival on ads or price, a referring partner, a vendor, the skull-base course vs a competing symposium.

① Are you playing one-shot when you should be playing the long game? ② Could a visible tit-for-tat stance — cooperate openly, but retaliate once if crossed, then forgive — stabilize it? ③ Is the horizon open or do you have an announced exit that's eroding the shadow? Name the relationship and I'll help you map the game.

Don't invent one — you already have a live row. Open DECISIONS.md and look at D3: a price/ads truce with the rival clinic on shared keywords — a repeated game where tit-for-tat and the shadow of the future apply directly.

Copy learning-records/REP-TEMPLATE.md and fill Phase 1: the four outcomes, your ranking, what the rival clinic actually optimises, their ranking from that chair, your predicted outcome, the one payoff you're least sure of, and the contraindication check from §06 above.

The gate: this lesson isn't "done" when you finish reading — it's done when one REP-*.md exists with Phase 1 filled. Delivered ≠ learned. One honest rep beats reading the next three lessons.

Primary sources: SEP — Prisoner's Dilemma · SEP — Game Theory · Open Yale ECON 159 (Polak). Robert Axelrod, The Evolution of Cooperation (1984). Avinash Dixit & Barry Nalebuff, The Art of Strategy (Norton, 2008).