SERIES 01 · Twelve essays · Continual Intelligence

Machines that
never stop
learning.

An illustrated field guide to continual learning, reinforcement learning and the science of reasoning — where benchmarks lie, networks lose plasticity, and the world is always bigger than the agent.

12 illustrated essays · 349 minutes of reading · 77k words, fully sourced · 7 research threads

The Essays

Read in order — or wander

CL Continual Learning · RL Reinforcement Learning · WM World Models · RE Reasoning · GI General Intelligence · EV Evolutionary & Open-Ended · FN Foundations

01 / 12

CL · RL

The Benchmark Gap in Continual RL: From Continual World to SPIRAL

Five years of continual-RL benchmarks all measured the same wrong thing. The proof — and a map of what we should measure instead.

23 min read

02 / 12

The Plasticity Crisis in Continual Deep Learning

Neural networks quietly lose the ability to learn. Inside the plasticity collapse that benchmarks never run long enough to see.

28 min read

03 / 12

CL · WM

The Big World Hypothesis: Why Continual Learning Is Inevitable

If the world is bigger than the agent, continual learning is not a feature to add — it is mathematically inevitable.

26 min read

04 / 12

CL · WM · RL

GVFs as Proto-World-Models: The Alberta Plan Vindicated?

General Value Functions predicted the world-model era by a decade. Was the Alberta Plan right all along?

26 min read

05 / 12

CL · RL

The Forgetting Transformer: When Architecture Solves Plasticity

What if the cure for plasticity loss isn't a regularizer but the architecture itself? The case of learned forgetting.

29 min read

06 / 12

RL · RE

Does RL Teach LLMs to Reason, or Just Refine Them?

Does reinforcement learning teach language models to reason — or merely sharpen what pre-training already knew?

32 min read

07 / 12

RE · RL

Shape of Thought: Why Reasoning Format Matters More Than Correctness

The format of a chain of thought may matter more than whether it lands on the right answer. The geometry of reasoning.

31 min read

08 / 12

RL · RE

Stable Deep RL at Scale: Gradients, KL, and the Shape of Learning

Gradients, KL divergence and the shape of learning: what it actually takes to keep deep RL stable at scale.

32 min read

09 / 12

RE · RL

Reasoning at Scale: What DeepSeek-R1, ProRL, and Prolonged RL Reveal

DeepSeek-R1, ProRL and prolonged RL reveal how far reasoning can be pushed when you simply do not stop training.

35 min read

10 / 12

GI · RL

Darwin-Gödel to ShinkaEvolve: The Case for Open-Ended AI

From the Darwin-Gödel Machine to ShinkaEvolve — the case for open-ended systems that never converge.

30 min read

11 / 12

RE · EV · FN

Thinking Without Tokens: CTM and Inference-Time Compute Beyond CoT

Continuous Thought Machines and inference-time compute that happens beyond the token stream. Thought without words.

29 min read

12 / 12

RL · RE

RL as Educator: Training Teachers, Not Just Students

Stop optimizing the student. Train the teacher. Reinforcement learning reframed as the design of curricula.

28 min read

“A field's benchmarks are not neutral measurement tools. They encode what the field believes the problem is.”

This series follows a single thread: intelligence that must keep learning, in a world too large to ever fully model. From the measurement gap in continual RL to thinking without tokens, each essay pairs a careful read of the primary literature with a picture you can actually hold in your head.

Continual Intelligence · in the spirit of the Alberta Plan
