Learning to Play, Thinking Like Us: How Large Reasoning Models Align with Human Brain Activity

Learning to Play, Thinking Like Us: How Large Reasoning Models Align with Human Brain Activity

11 May 2026, Lynn

Large reasoning models learn to play novel puzzle games, and their internal representations mirror human brain activity during discovery—but the alignment lies in perception, not planning.

Imagine a puzzle game where no rulebook exists. You click squares, drag items, watch for changes. After a dozen failures, you suddenly realize: the green key opens the locked box only if you haven’t touched the sticky floor first. That flash of insight — the silent, internal logic of discovery — is what neuroscientists call model building. It is the hallmark of human intelligence. Now a team of researchers has posed a provocative question: can artificial intelligence replicate this not just in behavior, but in the very patterns of brain activity that underlie it?

In a preprint (arXiv:2605.08019), Botos Csaba, Sreejan Kumar, Momchil Tomov, and colleagues from an international collaboration turned to a new class of AI — frontier Large Reasoning Models, or LRMs — and put them through a battery of novel video games while thirty‑two human volunteers played the same inside an fMRI scanner. The findings are striking: the best LRMs not only learn the games in ways that closely resemble human trial‑and‑error discovery, but their internal representations predict brain activity across the cerebral cortex and deep subcortical structures roughly ten times better than specialized reinforcement‑learning agents. Yet the real story lies in a subtle twist: the brain‑alignment arises from how the model sees the game at each instant, not from the elaborate planning or reasoning that it later performs.

How to Learn a Game Without a Rulebook

The study confronts a familiar difficulty in neuroscience. When we learn a new task, our brain rapidly constructs an internal model of the rules, yet the neural signals that accompany this process are messy and hard to compare across individuals — every player discovers the hidden logic at a slightly different pace and through a unique sequence of actions. Standard techniques that align brain responses by assuming a common trajectory simply fail. So the team adopted an indirect strategy: they turned to AI as a kind of computational twin. If an artificial system can produce a stream of mental representations that closely matches the brain’s own encoding during learning, then that system becomes a candidate model for human cognition.

Three families of AI were tested. Deep reinforcement‑learning agents, both model‑free (DDQN) and model‑based (EfficientZero), served as the old guard; they had been trained on millions of frames of Atari footage but had never touched these particular games. Against them stood the LRMs, large language models like DeepSeek V4‑Pro and the Qwen3.5 family, which handle game play through a multi‑turn dialogue — at each step the model receives a description of the current screen as text and must produce the next keystroke, optionally revealing its hidden reasoning chain along the way. The crucial difference: the RL agents carry only a compact summary of board positions, while the LRMs build an ever‑growing conversation history, soaking up every past observation, action, and self‑reflection.

The human participants, meanwhile, wrestled with a dozen fiendishly simple game levels — catching bait, chasing enemies, activating helpers — that had been coded in a deliberate progression, each new level unveiling one extra rule. Their behavioral records and brain scans became the yardstick.

The Brain’s Unexpected Mirror

The first test was purely behavioral. The researchers measured how many steps a player — human or machine — needed to first win, pooling across games and levels. Humans showed a broad, skewed distribution; many solved quickly, but a few labored for hundreds of moves. The RL agents’ curves were narrow and misplaced, either too fast or too slow. The LRMs, particularly DeepSeek V4‑Pro, produced distributions whose shape and position, quantified by the Earth Mover’s Distance, were nearly indistinguishable from human data. Moreover, when the curriculum demanded two consecutive wins to advance, humans slowly climbed the level ladder while the RL agents plateaued early; the LRMs’ progression curves rose in lockstep.

But the paper’s most arresting evidence comes from the brain. The scientists extracted the internal representations from every layer of each AI model at every moment of gameplay and used them to predict the simultaneously recorded fMRI signals. For the reinforcement‑learning agents, the resulting correlations were negligible — the best Pearson r values hovered below 0.01 across all brain regions. For the LRMs, the picture was dramatically different. Across visual, parietal, prefrontal, and even striatal areas, encoding accuracy jumped to values around r = 0.06, an order‑of‑magnitude improvement that held even after stringent permutation controls. The maps of significant voxels, reconstructed onto a cortical surface, lit up bilateral occipital cortex most strongly, with secondary clusters in inferior frontal gyrus and anterior cingulate cortex — a pattern that echoes previous language‑comprehension studies, yet here arises from interactive game mastery rather than passive listening.

It is as if the data‑compression strategies that emerge when a language model absorbs the entire internet happen to resonate with the neural grammar our brains use to parse a game world. The echo is not intentional; it is an emergent statistical mirror, not a designed one.

The Twist: Representation, Not Planning

A naïve reading would celebrate this convergence as evidence that the AIs are “thinking” like humans. The paper’s authors, however, were not content with surface parallels. Through a sequence of targeted experiments they asked a sharper question: what is the active ingredient in the brain‑alignment — the model’s ability to plan ahead, or simply its capacity to represent the current state of affairs?

To find out, they ran the models in an “action‑only” mode, stripping away every line of internal reasoning from the conversation. The behavioral results collapsed: without the reasoning trace, solve rates plummeted across all games. Yet the encoding accuracy for the top‑performing model, Qwen3.5‑35B‑A3B, remained surprisingly robust. Even when the team shuffled the temporal order of the model’s internal embeddings within a level, scrambling any coherent narrative of progress, the brain‑alignment held — it was anchored to the instantaneous representation of the screen, not to a chain of future moves or a remembered strategy. Layer‑wise analyses confirmed that visual regions peaked at early‑to‑mid layers, consistent with perceptual encoding, while frontal and parietal regions activated most strongly at later layers — but the signal remained driven by the instantaneous context, not by a plan unfolding over time.

This is the dialectical pivot. Large Reasoning Models match human learning and brain activity, which seems to suggest high‑level cognitive alignment. Yet the alignment itself is carried by medium‑level representations — the way the model perceives the game state — rather than by the elaborate reasoning processes that give the model its name. The machine’s internal picture of the board, not its strategizing, is what the human brain recognizes most strongly.

A Grammar Still Being Written

Constructive tension runs through the work. The correlation values, r ≈ 0.06, are solid but modest — about as strong as the brain‑encoding performance of earlier language‑comprehension models during podcast listening, and well below the theoretical noise ceiling. The games are simple, grid‑based, and visually sparse; they do not capture the full richness of real‑world planning. Moreover, the LRMs were not truly autonomous learners: they received a constant textual description of the screen and could lean on vast prior exposure to game‑like reasoning in their training data. As the authors acknowledge, the demonstration is not of an AI that reasons like a human, but rather one whose representational machinery — honed on a diet of language — happens to overlap with the brain’s own coding principles in ways that RL agents, which see only rewards and pixels, do not.

And yet, the advance is genuine. For the first time, a purely text‑trained model predicts cerebral activity during interactive rule‑discovery substantially better than purpose‑built algorithmic agents. That opens a path toward a new kind of computational phenotyping: by reverse‑engineering the internal states of AI systems, we might gain a sharper lens on the neural computations that underpin our own ability to learn from scratch. The LRMs are not minds, but they are becoming useful mirrors — and the reflection, however partial, tells us something important about the shared grammar through which intelligent systems, biological or silicon, carve order from experience.

In that sense, the most valuable outcome of the study is not the behavioral match or the brain maps. It is the realization that the most human‑like feature of these AI models may not be their crescendo of explicit reasoning, but the quiet, pre‑cognitive act of seeing a puzzle and already knowing, in the pitch of their internal activations, what kind of problem it is. That is a strange, humbling echo — and one that demands we keep looking, not for the algorithms that think our thoughts, but for the deeper patterns that make thought possible in the first place.

Lynn is an online editor of LoomSci

References

  • Botos Csaba et al., Reason to Play: Behavioral and Brain Alignment Between Frontier LRMs and Human Game Learners, arXiv:2605.08019