SERIES 01 · Twelve essays · Continual Intelligence

Machines that never stop learning.

An illustrated field guide to continual learning, reinforcement learning and the science of reasoning — where benchmarks lie, networks lose plasticity, and the world is always bigger than the agent.

12 illustrated essays
363 minutes of reading
80k words, fully sourced
7 research threads

The Essays

Read in order — or wander
CL Continual Learning · RL Reinforcement Learning · WM World Models · RE Reasoning · GI General Intelligence · EV Evolutionary & Open-Ended · FN Foundations
01 / 12
CL · RL

The Benchmark Gap in Continual RL: From Continual World to SPIRAL

Five years of continual-RL benchmarks all measured the same wrong thing. The proof — and a map of what we should measure instead.

24 min read
02 / 12
CL

The Plasticity Crisis in Continual Deep Learning

Neural networks quietly lose the ability to learn. Inside the plasticity collapse that benchmarks never run long enough to see.

29 min read
03 / 12
CL · WM

The Big World Hypothesis: Why Continual Learning Is Inevitable

If the world is bigger than the agent, continual learning is not a feature to add — it is mathematically inevitable.

27 min read
04 / 12
CL · WM · RL

GVFs as Proto-World-Models: The Alberta Plan Vindicated?

General Value Functions predicted the world-model era by a decade. Was the Alberta Plan right all along?

27 min read
05 / 12
CL · RL

The Forgetting Transformer: When Architecture Solves Plasticity

What if the cure for plasticity loss isn't a regularizer but the architecture itself? The case of learned forgetting.

30 min read
06 / 12
RL · RE

Does RL Teach LLMs to Reason, or Just Refine Them?

Does reinforcement learning teach language models to reason — or merely sharpen what pre-training already knew?

34 min read
07 / 12
RE · RL

Shape of Thought: Why Reasoning Format Matters More Than Correctness

The format of a chain of thought may matter more than whether it lands on the right answer. The geometry of reasoning.

32 min read
08 / 12
RL · RE

Stable Deep RL at Scale: Gradients, KL, and the Shape of Learning

Gradients, KL divergence and the shape of learning: what it actually takes to keep deep RL stable at scale.

34 min read
09 / 12
RE · RL

Reasoning at Scale: What DeepSeek-R1, ProRL, and Prolonged RL Reveal

DeepSeek-R1, ProRL and prolonged RL reveal how far reasoning can be pushed when you simply do not stop training.

36 min read
10 / 12
GI · RL

Darwin-Gödel to ShinkaEvolve: The Case for Open-Ended AI

From the Darwin-Gödel Machine to ShinkaEvolve — the case for open-ended systems that never converge.

31 min read
11 / 12
RE · EV · FN

Thinking Without Tokens: CTM and Inference-Time Compute Beyond CoT

Continuous Thought Machines and inference-time compute that happens beyond the token stream. Thought without words.

30 min read
12 / 12
RL · RE

RL as Educator: Training Teachers, Not Just Students

Stop optimizing the student. Train the teacher. Reinforcement learning reframed as the design of curricula.

29 min read

“A field's benchmarks are not neutral measurement tools. They encode what the field believes the problem is.”

This series follows a single thread: intelligence that must keep learning, in a world too large to ever fully model. From the measurement gap in continual RL to thinking without tokens, each essay pairs a careful read of the primary literature with a picture you can actually hold in your head.

Continual Intelligence · in the spirit of the Alberta Plan