Memetic drift: when a population of LLM agents agrees by chance

May 18, 2026 · Hidenori Tanaka · 5 min read

A population of LLM agents can rapidly converge on the same label even when no single agent has a prior preference. A minimal model — Quantized Simplex Gossip — traces this to mutual in-context learning, and predicts when the resulting consensus carries information and when it is, in effect, a coin flip.

When a group of language-model agents converges on the same answer, what should we conclude? That their reasoning aligns? That a hidden bias is shared between them? Or — uncomfortably — that we are watching a coin flip play out at scale?

The question is no longer hypothetical. Multi-agent LLM systems are being deployed in places where their collective outputs influence consequential decisions, directly or indirectly. If the agreement we observe is a noisy reflection of arbitrary symmetry-breaking rather than aggregated reasoning, then the apparent consensus is misleading at best and dangerous at worst.

Figure 1. Each panel shows the final consensus label reached by one independent run of an agent population. Under drift, with no shared preference, different runs land on different labels — the consensus is, statistically, a lottery. Under selection, even a weak shared bias is amplified across runs and the same label dominates. Schematic; not a reproduction of paper figures.

The puzzle: spontaneous consensus in naming games

A clean version of the puzzle comes from naming games. A population of agents is asked to label something, and the setup is deliberately arranged so that no individual agent favors any label a priori. The starting condition is symmetric: each label is, in principle, equally likely.

What happens, empirically, is that the symmetry breaks. The population rapidly converges on a single label. The system is doing something — but it isn’t doing what one would naively call “reasoning toward the right answer,” because there is no right answer baked in. Where does the agreement come from?

A minimal model: Quantized Simplex Gossip

To get at the microscopic mechanism, we introduce a minimal model — Quantized Simplex Gossip (QSG). Each agent maintains an internal belief over labels (a point on the probability simplex), but interacts with others only through samples drawn from that belief. One agent’s sampled output becomes another agent’s in-context evidence; the receiving agent updates its belief; and the cycle continues.

Figure 2. One step of the mutual in-context learning loop in QSG. Agent A samples a token from her belief over labels; Agent B observes that sample and shifts her belief toward the observed label (dashed lines mark the pre-update levels). Because the channel is samples, not beliefs, a single arbitrary draw can compound through the population. Schematic.

The bookkeeping is simple, but the dynamics are subtle. Because agents learn from each other’s samples and not from each other’s underlying beliefs, one agent’s arbitrary draw can become a second agent’s evidence, then a third agent’s, and so on. A small fluctuation can compound. The system, in effect, performs inference about its own noise.
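The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's exact update rule: the convex-combination update, the adaptation rate `alpha`, and the function names `sample_label`, `update_belief`, and `gossip_step` are all assumptions introduced here.

```python
import random

def sample_label(belief, rng):
    """Draw one label index from a belief (probabilities that sum to 1)."""
    labels = list(range(len(belief)))
    return rng.choices(labels, weights=belief, k=1)[0]

def update_belief(belief, observed, alpha):
    """Assumed rule: move a fraction alpha toward the one-hot vector of the observed label."""
    return [(1 - alpha) * p + (alpha if k == observed else 0.0)
            for k, p in enumerate(belief)]

def gossip_step(beliefs, alpha, rng):
    """One interaction: a random speaker emits a sample, a random listener updates on it."""
    speaker, listener = rng.sample(range(len(beliefs)), 2)
    token = sample_label(beliefs[speaker], rng)  # the listener sees only this sample
    beliefs[listener] = update_belief(beliefs[listener], token, alpha)
    return speaker, listener, token
```

The listener never sees the speaker's belief vector, only the sampled token, which is what lets a single arbitrary draw propagate.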

[Interactive demo. Default settings: population N = 16, bias toward "B" = 0.00, adaptation rate = 0.60. Twelve run panels (Run 1 to Run 12), with a readout of the share of trials that converged on "B".]

Figure 2.5 (interactive). A simplified, illustrative QSG-style voter dynamics with twelve independent populations running side by side. Slide the bias toward “B” up from zero to move the system from drift (different runs land on different labels) into selection (most runs converge on “B”). The adaptation rate sets how strongly each agent updates from a peer per tick. The population size N sets how quickly each run converges and how sharply it polarizes. Illustrative; not a reproduction of the paper’s model.

Memetic drift, by analogy with neutral evolution

The right reference class for this dynamic turns out to be population genetics, not learning theory. In neutral evolution, the frequency of an allele in a finite population drifts to fixation by sampling alone, with no selection pressure required. Random sampling in a finite population is enough to drive the system to a single state given enough time, and which state it lands on is, in the neutral regime, a matter of chance.
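For readers less familiar with the genetics side, here is a small sketch of neutral drift in the textbook Wright-Fisher model with two alleles. It illustrates the analogy only and is not part of QSG; the function name and the `max_generations` cap are assumptions made for this example.

```python
import random

def wright_fisher_fixation(n, initial_count, rng, max_generations=100_000):
    """Resample a two-allele population of size n until one allele is fixed.

    Returns (allele_a_fixed, generations_elapsed).
    """
    count = initial_count  # current number of copies of allele A
    for generation in range(max_generations):
        if count == 0 or count == n:
            return count == n, generation
        freq = count / n
        # Neutral resampling: each offspring copies a uniformly random parent.
        count = sum(1 for _ in range(n) if rng.random() < freq)
    raise RuntimeError("no fixation within max_generations")
```

Repeating the call from a 50/50 start with different seeds should yield allele A in roughly half the runs, since under neutrality the fixation probability equals the starting frequency.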

We call the analogous regime in LLM populations memetic drift. A label can fix by drift alone, with no agent preferring it. This gives a clean dichotomy:

In the drift-dominated regime, the consensus is, statistically, a lottery. The population converges, but the converged-on label carries no information about the world.
In the selection regime, even weak biases — encoded in the model weights, the prompt, or the data — get amplified into the eventual consensus. Here, the convergence is meaningful, but the meaning lives in the bias, not in some emergent collective intelligence.
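The dichotomy can be probed with a two-label toy simulation. Everything here is an illustrative assumption rather than the paper's model: the bias enters as a tilt of the odds of emitting "B" by `exp(bias)`, each run stops after a fixed number of interactions, and the outcome is read off from the final mean belief.

```python
import math
import random

def emit_b_probability(p_b, bias):
    """Tilt the odds of emitting "B" by exp(bias); bias = 0 leaves p_b unchanged."""
    tilted = p_b * math.exp(bias)
    return tilted / (tilted + (1.0 - p_b))

def run_population(n_agents, alpha, bias, steps, rng):
    """Run one population and return the label favored by the final mean belief."""
    beliefs = [0.5] * n_agents  # symmetric start: P("B") = 0.5 for every agent
    for _ in range(steps):
        speaker, listener = rng.sample(range(n_agents), 2)
        said_b = rng.random() < emit_b_probability(beliefs[speaker], bias)
        target = 1.0 if said_b else 0.0
        beliefs[listener] = (1 - alpha) * beliefs[listener] + alpha * target
    return "B" if sum(beliefs) / n_agents > 0.5 else "A"

def fraction_b(bias, runs=60, seed=0):
    """Fraction of independent runs whose final majority label is "B"."""
    rng = random.Random(seed)
    outcomes = [run_population(16, 0.6, bias, 2000, rng) for _ in range(runs)]
    return outcomes.count("B") / runs
```

With `bias=0.0` the runs should split between the two labels, while a positive bias should push most runs to "B". Comparing `fraction_b(0.0)` against `fraction_b(2.0)` is a quick way to see both regimes in this toy.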

Scaling laws

QSG yields quantitative scaling laws for drift-induced polarization. The amount of polarization the population produces depends on four interpretable knobs:

Population size — how many agents are in the conversation;
Communication bandwidth — how much information passes between agents per interaction;
In-context adaptation rate — how strongly each agent updates from what it sees;
Internal uncertainty — how sharply concentrated each agent’s prior belief is.

These four parameters together determine where the system sits on the drift-vs-selection phase diagram, and the scaling laws make sharp predictions about the magnitude of polarization in each regime.
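To get a feel for how these knobs interact, one can sweep them in a toy simulation. The sketch below makes several assumptions that are not from the paper: bandwidth is modeled as the number of independent samples per message, polarization is scored as the run-averaged value of (2m - 1)^2 for the final mean belief m, and internal uncertainty is not modeled at all (every agent starts at 0.5).

```python
import random

def final_mean_belief(n_agents, bandwidth, alpha, rounds, rng):
    """Run one two-label population and return its final mean belief in "B"."""
    beliefs = [0.5] * n_agents
    for _ in range(rounds * n_agents):  # about `rounds` interactions per agent
        speaker, listener = rng.sample(range(n_agents), 2)
        # Assumed channel: `bandwidth` independent samples from the speaker's belief.
        hits = sum(rng.random() < beliefs[speaker] for _ in range(bandwidth))
        observed = hits / bandwidth
        beliefs[listener] = (1 - alpha) * beliefs[listener] + alpha * observed
    return sum(beliefs) / n_agents

def polarization(n_agents, bandwidth, alpha, rounds=10, runs=100, seed=0):
    """Average of (2m - 1)^2 over runs: 0 means balanced, 1 means full consensus."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        m = final_mean_belief(n_agents, bandwidth, alpha, rounds, rng)
        total += (2 * m - 1) ** 2
    return total / runs
```

In this toy, widening the bandwidth or lowering the adaptation rate should reduce the score at a fixed number of interactions, for example when comparing `polarization(16, 1, 0.6)` with `polarization(16, 32, 0.6)`. It is a way to build intuition, not a check of the paper's scaling laws.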

Figure 3. The two regimes are separated by a crossover that depends on population size, communication bandwidth, and in-context adaptation rate. Strong shared bias and low sampling noise put the system in the selection regime, where consensus reliably reflects that bias. Weak bias and high sampling noise put it in the drift regime, where the consensus is effectively random. Schematic, not to scale.

From the toy model to real LLM populations

We validate the predictions in two complementary ways. First, in QSG simulations themselves: the minimal model is tractable enough that theory can be compared directly with simulated statistics. Second, in naming-game experiments with populations of LLM agents, which are substantially more complex systems, with all the messy structure of a real language model.

The same scaling laws describe both. That alignment is the key claim: the mechanism we identify in the toy model is not a toy-model artifact, but a genuine feature of how language-model agents interact when they communicate through sampled outputs.

Why this matters

The practical consequence is uncomfortable but actionable. When a multi-agent LLM system reaches consensus, the amount of consensus is not, by itself, evidence that the consensus is meaningful. Strong agreement can mean strong shared signal — or it can mean strong drift. To tell the difference, you have to know which regime you are in.

The good news is that the regime is controllable. Increase the population size, increase the communication bandwidth, decrease the adaptation rate, or sharpen the agents’ priors — and the system shifts away from drift and toward selection by genuine signal. The same levers, used in reverse, let you preserve diversity in a system that would otherwise collapse.
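One practical way to check which regime a system is in is to rerun it several times from independent seeds and see how the final labels are distributed across runs. The helper below is a sketch of that diagnostic; the function name and the use of normalized Shannon entropy are choices made here, not a method taken from the paper.

```python
import math
from collections import Counter

def normalized_label_entropy(consensus_labels, num_labels):
    """Shannon entropy of final labels across independent runs, scaled to [0, 1]."""
    if num_labels < 2:
        raise ValueError("need at least two possible labels")
    if not consensus_labels:
        raise ValueError("need at least one run")
    counts = Counter(consensus_labels)
    total = len(consensus_labels)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(num_labels)
```

A value near 1 means the runs scatter across labels, as expected under drift. A value near 0 means the same label wins almost every time, which points to a shared bias being amplified.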

Where this fits

We see memetic drift as one piece of a larger program: bringing the tools of statistical physics to bear on the emergent collective behavior of LLM populations. The aim is not just to describe what these systems do, but to build multi-agent systems whose social outcomes are predictable, controllable, and aligned with what we actually want from collective intelligence.
