Simulating individual selection with evolutionary game theory
“Universal Bits”, uh? I named this blog thinking about articles like this one, applying computer science bits to other fields.
# Post outline
# Introduction
Themes beyond my circle of competence attract my curiosity and fascinate the “explorer” within me. Better, they help me create more effective mental models of the world.
With this in mind, last spring, I finished reading The Selfish Gene, by Richard Dawkins^{1}.
Let’s be frank. I studied computer science. Even after reading a couple books about biology, I know very little about it.
But perhaps biology is not that different from computer science. Pick gene sequence, encoding expressions that mold ourselves and our environment. Now compare them to computer code snippets, encoding instructions for machines, to teach them what to do. Viruses are both biological and computational, invading environments, altering individual behaviour, and replicating. Vaccines and software patches mitigate them. And gene editing feels like reverse engineering! But that is a story for another post.
If they’re not that far, then maybe we can apply the familiar lens of computer science to learn biology. Through programs and simulations, replicating what nature does best.
This post focuses on one field in particular, bridging biology, computer science, and even economics: game theory. It explains equally well how peers contribute to a decentralized system (for instance, in Web3) and how investors behave in a market. Today, we look at it from an evolutionary perspective.
# Hawks and Doves
Chapter 5 of the Selfish Gene introduces evolutionary game theory to demonstrate how different individuals in a population adopt various strategies, resulting in different payoffs, that affect their chances of survival and reproduction.
Let’s break this down.
Imagine observing a full ecosystem, isolated under a glass bell, where there only live two species: hawks and doves^{2}. When looking at a pair of individuals, you can’t tell their species unless they engage in a fight.
Hawks fight aggressively and only retreat after a serious injury. Doves instead threaten the opponent, but cause no harm^{3}. When a hawk encounters a dove, the dove flees and no one gets hurt. When a hawk meets a hawk, they’ll fight fiercely until one of them will be injured (or dead). In the case of two doves, they threaten each other until one tires and backs down. In summary:
vs  Hawk  Dove 

Hawk  Fight fiercely until one is seriously injured  The dove flees when attacked 
Dove  See Hawk vs Dove (top right quadrant)  Both threaten until one backs down 
These hypothetical fights represent fierce competition for limited resources. The winner gains access to the resource and thereby develops a better position to spread its genes. A species’ strategy defines their behaviour in fights – in other words, how they interact with the surrounding environment and react to external stimuli.
Assigning numerical scores to the result of fights allows us to compute payoffs:
 +50 points for a win
 0 points for a loss
 100 points for serious injury
 10 points for time lost
For instance, a hawk defeating a dove earns 50 points for winning, while the dove gains 0 (for losing). In a hawk vs hawk fight, one gets injured and backs down. That’s 100 points, +0 for losing. The other gains 50 for winning. The expected payoff for such fight is 25 ((100 + 50) / 2). When two doves fight, one will eventually back down. The winner will earn +50, but they’ll both get 10 for losing time. Their expected payoff is 15 ((10 + 50  10) / 2).
We can summarize all this through a payoff matrix:
Expected Payoff  Hawk  Dove 

Hawk  25  50 
Dove  0  15 
The top right cell indicates that when a hawk fights a dove, the hawk’s expected payoff is 50, while the bottom right cell shows that when two doves face off, their expected payoff is 15.
# Expected payoffs
Now, how does this connect to computer science? Through code. We will use Python to represent the strategies and then probabilistically compute their expected payoff.
To start, we enumerate strategies and points. Then we create a dataclass
to
hold the results of simulated fights.
import dataclasses
import random
from enum import Enum, IntEnum, auto
class Strategy(Enum):
HAWK = auto()
DOVE = auto()
# make prettier prints
def __repr__(self) > str:
return self._name_
class Points(IntEnum):
WIN = 50
LOSS = 0
SERIOUS_INJURY = 100
WASTE_OF_TIME = 10
# make prettier prints
def __repr__(self) > str:
return f"{self.value}"
@dataclasses.dataclass
class FightResult:
a_payoff: int
b_payoff: int
def reverse(self):
self.a_payoff, self.b_payoff = self.b_payoff, self.a_payoff
return self
def shuffle(self):
if random.random() < 0.5:
self.reverse()
return self
FightResult.shuffle
allows us to randomly select the winner and loser (through
a fair coin toss) when two individuals of the same species fight, like dove vs.
dove.
We can now add a fight
function takes two strategies and returns the result.
It uses a touch of recursion to break the symmetry in outcomes (dove vs hawk has
the FightResult.reverse
result of hawk vs dove).
def fight(a: Strategy, b: Strategy) > FightResult:
match (a, b):
case (Strategy.DOVE, Strategy.DOVE):
return FightResult(
Points.WIN + Points.WASTE_OF_TIME, Points.WASTE_OF_TIME
).shuffle()
case (Strategy.DOVE, Strategy.HAWK):
return FightResult(Points.LOSS, Points.WIN)
case (Strategy.HAWK, Strategy.DOVE):
return fight(b, a).reverse()
case (Strategy.HAWK, Strategy.HAWK):
return FightResult(Points.SERIOUS_INJURY, Points.WIN).shuffle()
case _:
assert False
Lastly, we conclude with a simulate
function to:
 Sample two individuals from the population (without replacement), simulate a fight, and record the result.
 Produce the expected payoff matrix by repeating the sampling and fighting process one thousand times.
from itertools import groupby
def mean(x):
... # [...], does what you'd expect
def simulate(population, n_fights=1000) > dict[Strategy, dict[Strategy, float]]:
payoffs_by_species = collections.defaultdict(list)
for _ in range(n_fights):
a, b = random.sample(population, k=2) # without replacement
result = fight(a, b)
payoffs_by_species[a].append((b, result.a_payoff))
payoffs_by_species[b].append((a, result.b_payoff))
def key(t):
species, _ = t
return species.value
payoffs_matrix = {}
for species, payoffs in payoffs_by_species.items():
sorted_against_species = sorted(payoffs, key=key)
grouped = groupby(
sorted_against_species, key=key
) # needs elements sorted by group key
payoffs_matrix[species] = {
Strategy(s): mean(payoff for (_, payoff) in g) for s, g in grouped
} # the matrix row for `species`
return payoffs_matrix
Let’s test it.
import pandas as pd
population_size = 100
population = [Strategy.DOVE] * (population_size // 2) + [Strategy.HAWK] * (
population_size // 2
)
assert len(population) == population_size
split = collections.Counter(population)
print(split)
# >>> Counter({DOVE: 50, HAWK: 50})
payoffs_matrix = simulate(population)
payoffs_matrix = pd.DataFrame.from_dict(payoffs_matrix, orient="index")
print(payoffs_matrix)
# >>> Strategy.HAWK Strategy.DOVE
# >>> Strategy.DOVE 0.0 15.0
# >>> Strategy.HAWK 25.0 50.0
If you prefer to look at the code, you can find the full source here.
Sweet! We could have arrived to the same results through a mathematical approach (in fact, we did that above 👆), but we are here to simulate things. Plus, we are about to make this more interesting by adding simulations for evolution and individual selection.
# Towards evolution: Weighted Average Payoff
Let’s state our assumptions:
 The payoff of fights represents access to (scarce) resources.
 More abundant resources improve the chances for an individual to reproduce. More resources correspond to a higher number of offspring.
 The organisms within our glass bell reproduce asexually and no mutations occur.
We can now introduce the concept of an individual’s fitness to represent their “reproductive success” or, in other words, their chances of passing their genes to a new generation.
It doesn’t make sense to measure fitness in isolation, as we have done so far, by looking at individual fights. Instead, we need to consider the individual’s surroundings. A single hawk in a population of doves will have an easy life, winning all fights with maximum payoff. Intuitively, this translates to a high fitness.
In a few paragraphs we will see how to compute the fitness starting from the payoffs. But for now, let’s begin by examining the weighted average payoffs of fights, which takes into account the opponent’s frequency in the population.
population_size = 100
n_hawks = 1
population = [Strategy.HAWK] * n_hawks + [Strategy.DOVE] * (population_size  n_hawks)
assert len(population) == population_size
split = collections.Counter(population)
payoffs_matrix = simulate(population, n_fights=10000)
print(f"Population by species: {split}")
# >>> Population by species: Counter({DOVE: 99, HAWK: 1})
weighted_avg_payoffs = {}
for species, averaged_payoffs_by_species in payoffs_matrix.items():
# 👇 For each species:
#  Take the expected payoff per fight (against each other species)
#  Weight it by the adversary's _frequency_ in the population
weighted_avg_payoffs[species] = mean(
avg * split[s] / len(population)
for s, avg in averaged_payoffs_by_species.items()
)
# 👇
# >>> Weighted average payoff: {DOVE: 7.425, HAWK: 49.5}
As expected, a solitary hawk thrives among all doves.
However, as the number of hawks increases, their chances of fighting each other and suffering serious injuries increases too. Their payoff (and their fitness) decline.
# Make this a function to reuse it easily
def weighted_average_payoffs(payoffs_matrix, split) > dict[Strategy, float]:
w_avg_payoffs = {}
# [...]
return w_avg_payoffs
population_size = 100
# 👇
n_hawks = 25
population = [Strategy.HAWK] * n_hawks + [Strategy.DOVE] * (population_size  n_hawks)
assert len(population) == population_size
split = collections.Counter(population)
print(f"Population by species: {split}")
# >>> Population by species: Counter({DOVE: 75, HAWK: 25})
payoffs_matrix = simulate(population, n_fights=10000)
w_avg_payoffs = weighted_average_payoffs(payoffs_matrix, split)
print(f"Weighted average payoff: {weighted_average_payoffs}")
# 👇
# >>> Weighted average payoff: {DOVE: 5.625, HAWK: 15.625}
In this scenario the weighted average payoff for hawks decreases to 15.625 (from 49.5).
How does it change according to the frequency of hawks in the population?
Frequency of hawks and doves in the population against weighted average payoff.
The weighted average payoffs of the two species intersect! And as it often happens, cool things occur at the intersection.
# Evolutionarily Stable Strategies (ESS)
In this case, the intersection represents a point of equilibrium. When the population consists of ~58.3% (more precisely, 7/12) hawks and 5/12 doves^{4}, the average weighted payoff for doves matches that of hawks. What does this mean? It means we have found an “Evolutionarily stable strategy” (ESS).
In informal terms (remember, I am a computer scientist stepping into biology 🧑🔬), an ESS means that once a population adopts this (mix of) strategy(ies), it will resist invasions from other (sets of) strategies.
We have now mentioned the word “evolutionary” a few times, so let’s bring the big guns in. So far, we haven’t considered the role of “evolution”. Instead, we have just enumerated the different hawktodove ratios in the population and found an ESS through brute force. What’s truly fascinating is that individual selection will lead us to the ESS.
Here is how it works. Imagine that the population under our glass bell is 50% doves and 50% hawks. Referring to the plot above, we know that the weighted average payoff for hawks, at 50% split, is higher that doves’. In other words, hawks will have better access to resources and better chances of reproduction. Consequently, the next generation under our bell will count more hawks than doves (for instance, 55% vs 45%). We are descending along the slope of the orange line.
Now consider another scenario, where there are too many hawks. Competition is fierce and fights among hawks end with severe injuries. In contrast, doves back off from fights against hawks and enjoy a better expected payoff. This translates to improved chances of reproduction, leading to the next generation having more doves and fewer hawks (ascending along the blue line).
# Weighted Average Payoff → Fitness
At this point we have almost all we need to do the evolution. The final ingredient involves converting the weighted average payoff into fitness, to express the likelihood of reproducing. Since a payoff can be negative, we can’t use it directly as a probability, nor as the expected number of offspring – what would it mean to have 3 children?
However, we can use a little imagination. Our goal is for species with a positive weighted average payoff to have better reproduction chances than those with a negative value. I approached this as follows (but there might be better solutions):
 Separate positive and negative payoffs.
 Apply minmax to the negative payoffs, normalizing them between 0 and 1.
 Apply the same normalization to positive payoffs, but add +1 to the result, normalizing them between 1 and 2.
Intuitively, this method works because we have centered the results around 1 (the neutral element of multiplication). We will use them as follows. Imagine there were initially 43% hawks. Their average weighted payoff was positive, and so their fitness is greater than 1. By multiplying 43% by their fitness (>1), we obtain a higher frequency of hawks in the new generation. A fitness below 1 would, instead, shrink the number of hawks in the population.
The following snippet implements this logic.
def scale(d, min_payoff, max_payoff):
# negative numbers to 0..1
# positive numbers to 1..2
assert min_payoff < 0
if min_payoff < 0 and abs(min_payoff) > max_payoff:
max_payoff = abs(min_payoff)
# Made very verbose for explainability.
r = {k: (v  min_payoff) / (0  min_payoff) for k, v in d.items() if v < 0}  {
k: (1 + v  0) / (max_payoff  0) for k, v in d.items() if v > 0
}
assert all(0 <= v <= 2 for v in r.values())
return r
# Simulating individual selection
We can finally simulate individual selection. We begin with a population equally divided between hawks and doves. Then, we evolve it gradually by relying on individual fitness (remember, fitness depends on the population split!).
population_size = 10000
n_generations = 125
results = []
population = [Strategy.HAWK] * (population_size // 2) + [Strategy.DOVE] * (
population_size // 2
)
payoffs_matrix = simulate(population)
min_payoff = min(p.value for p in Points)
max_payoff = max(p.value for p in Points)
for i in range(n_generations):
split = collections.Counter(population)
results.append(split)
w_avg_payoffs = weighted_average_payoffs(payoffs_matrix, split)
# scale to 0...1...2
scaled_w_avg_payoffs = scale(w_avg_payoffs, min_payoff, max_payoff)
# multiply by frequency to compute fitness
absolute_fitness = {
k: split[k] * v / population_size for k, v in scaled_w_avg_payoffs.items() }
# evolve to the next generation
population = random.choices(
tuple(absolute_fitness.keys()),
weights=tuple(absolute_fitness.values()),
k=population_size,
)
Here is the plot resulting from a run of the simulation.
Frequency of hawks and doves in each generation, as they evolve.
This particular simulation took less than 10 generations to reach the ESS, highlighted by the red line at 58.3% (the 7/12 ratio we have previously discovered through brute force). Our handcrafted fitness function provides a good indicator for individual selection, leading the proportion of hawks to doves to quickly reach the equilibrium.
Once at the equilibrium, the strategy resists invasion. Evolution continues and slightly changes the percentage of hawks and doves in the population, but their frequency reverts to the ESS. Again, informed by our fitness function.
Our experimental results show the definition of ESS being applied in practice. From above:
ESS means that once a population adopts this (mix of) strategy(ies), it will resist invasions from other (sets of) strategies.
The ESS resists changes to the population split that would result in less favorable payoffs. Notably, one can think of a “population split” as a strategy where the individual privately tosses a biased coin before each fight to decide whether to behave like a hawk or a dove. In these terms, our ESS:
 Resists invasion from strategies bias differently from 7:12, in favor of hawks.
 Performs well against itself. In fact, this is an alternative definition for ESS: a strategy that “does well” against copies of itself.
# Conclusions
We’ve covered a lot of ground, starting from the highlevel definition of fitness to arrive at simulating individual selection.
Armed with these tools there’s so much more we can experiment about! Other interesting strategies, for example. How about someone who retaliates, fighting back only if attacked? Check the references for a sneak peek.
👋 That’s it for today! Thank you for reading so far.
One last thing: I can’t leave you without pointing you to the soundtrack I had
in mind all along.
🎧 Do the Evolution  Pearl Jam
# References
 Smith, J., Price, G. The Logic of Animal Conflict. Nature 246, 15–18 (1973). https://doi.org/10.1038/246015a0
 Gale, J. S., & Eaves, L. J. (1975). Logic of animal conflict. Nature, 254(5499), 463–464. https://doi.org/10.1038/254463b0
The paper by Smith and Price, cited by Dawkins, originally introduced evolutionary game theory.
It formulates a slightly different version of our problem by swapping “dove” with “mouse”. Instead of describing fights only through their outcome, the authors describe the steps leading to the outcome (“provoke”, “attack”, and so on). Lastly, they introduce additional strategies for a “bully”, a “retaliator”, and a “proberretaliator”. “Retaliator” is supposed to be an ESS.
In 1975, Gale and Eaves showed through computer simulation (!) that “retaliator” is in fact not an ESS. A “dove” in a population of “retaliator” behaves like a “retaliator” and, therefore, can invade it.
# Footnotes

40th Anniversary edition, published by Oxford Landmark Science. ↩

Unrelated from Neil Young’s album. Still a great record though. ↩

This dove is different from the animal we know, who is in fact quite an aggressive species. ↩

The numbers 5/12 and 7/12 are not “magic”, nor general. They specifically depend on the scores we attribute to the fight outcomes. ↩