Why AI Can't Change Its Mind (And Why That's a Problem)

Current AI models follow a single reasoning path and can't recover from mistakes. What if the solution is making them more human—by letting them be a little random?

A Simple Question

What's 2 + 2?

When you read that, something interesting happened in your brain. Maybe you thought "4." Maybe you thought "four." Maybe, for a split second, some other number flickered through your mind before you settled on the answer.

Humans don't compute like calculators. Our thinking has inherent variability. Sometimes we make mistakes—we might accidentally say "5"—but we also quickly catch ourselves. That variability isn't a bug. It's what allows us to reconsider, reframe, and self-correct.

Current AI models can't do this. And I think that's why they fail on hard problems.

The Reasoning Collapse Problem

Recent research has revealed something troubling about large language models. When problems get hard enough, these models don't just struggle—they collapse.

A paper from Apple researchers (Shojaee et al., 2025) tested reasoning models on puzzles like the Tower of Hanoi. What they found was striking: past a critical complexity threshold, accuracy doesn't degrade gracefully. It collapses to near zero.

But here's the really strange part: on the hardest problems, the models used fewer reasoning steps, not more. They gave up early. They "fixated quickly on an incorrect guess and terminated reasoning."

It's as if the model commits to a wrong path and can't backtrack.

The Single-Path Problem

Here's what's happening under the hood. When you give a prompt to an LLM, the model computes a sequence of internal representations—hidden states that flow from layer to layer. For a fixed input, this sequence is completely deterministic. The same prompt always produces the same internal "reasoning trajectory."

The only randomness comes at the very end, when the model samples tokens to generate text. You might get "four" instead of "4"—surface-level variation—but the underlying reasoning path is identical.
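To make this concrete, here is a toy sketch of the split between deterministic internal computation and output-time sampling. The "model" below is an assumption-laden stand-in (a seeded recurrent update and a softmax head), not a real transformer, but the structure is the same: the hidden trajectory is a pure function of the input, and randomness enters only when tokens are drawn.

```python
import numpy as np

def hidden_states(prompt_ids, W):
    """Toy deterministic 'forward pass': the same input always yields
    the same hidden state (a stand-in for a transformer's layers)."""
    h = np.zeros(W.shape[0])
    for t in prompt_ids:
        h = np.tanh(W @ h + t)  # deterministic update, no randomness
    return h

def sample_token(h, vocab_W, rng, temperature=1.0):
    """Randomness enters only here, at the output distribution."""
    logits = vocab_W @ h
    p = np.exp((logits - logits.max()) / temperature)
    p /= p.sum()
    return int(rng.choice(len(p), p=p))

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8)) * 0.3
vocab_W = rng.normal(size=(5, 8))

prompt = [1, 2, 3]
h1 = hidden_states(prompt, W)
h2 = hidden_states(prompt, W)
assert np.allclose(h1, h2)  # identical internal trajectory, every run

# Surface-level variation across samples, all from one hidden state:
tokens = {sample_token(h1, vocab_W, np.random.default_rng(s)) for s in range(20)}
```

However many times you sample, every draw comes from the same frozen internal trajectory; nothing upstream of the output head ever varies.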

How LLMs work

One deterministic path through the problem. If the model commits to a wrong approach early, it's stuck.

How humans work

Multiple candidate approaches held in mind. Wrong paths get abandoned. New framings emerge.

Think about how you solve hard problems. You try one approach, hit a wall, step back, reframe the problem, try again. You hold multiple possibilities in mind. You have internal variability.

LLMs can't do any of this. They're locked into a single trajectory from the first layer to the last.

The Naive Solution Doesn't Work

The obvious fix is to add randomness to the model's internal computations. Just inject some noise into the hidden states, right?

Unfortunately, this fails. Random noise pushes the model's representations into regions of the space it has never seen during training. The result is incoherent outputs—not creative exploration.

The problem is that the model's internal representations aren't uniformly distributed in space. They form a specific, structured geometry. Research has shown that transformer hidden states are anisotropic—they cluster in a narrow cone, not a sphere. Random perturbations almost always move you off the learned manifold.
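A small synthetic demonstration of why this matters. The "hidden states" below are simulated (a shared dominant direction plus small spread, mimicking the narrow-cone geometry reported by Ethayarajh, 2019); the point is that an isotropic perturbation of realistic magnitude knocks a state far off the axis the states actually occupy.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256

# Simulated anisotropic hidden states: a shared "cone axis" plus
# small in-cone spread (a stand-in for real transformer states).
cone_axis = rng.normal(size=d)
cone_axis /= np.linalg.norm(cone_axis)
states = cone_axis + 0.02 * rng.normal(size=(1000, d))

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# The states are tightly aligned with the cone axis...
mean_alignment = np.mean([cos(s, cone_axis) for s in states])

# ...but an isotropic perturbation of comparable norm is not:
h = states[0]
noise = rng.normal(size=d)
noise *= np.linalg.norm(h) / np.linalg.norm(noise)  # same magnitude as h
perturbed = h + noise

print(f"alignment of states with cone axis: {mean_alignment:.2f}")
print(f"alignment after isotropic noise:    {cos(perturbed, cone_axis):.2f}")
```

In high dimensions a random direction is almost certainly nearly orthogonal to the cone, so isotropic noise drags the state into regions the model never produced during training.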

Analogy: Walking in a City

Imagine you're navigating a city. The "learned manifold" is like the street grid—paths the model knows how to traverse. Random noise is like teleporting to a random point, which might drop you in the middle of a building or off a cliff. What you actually want is to explore different streets—structured alternatives that stay on valid paths.

Structured Stochasticity: A Different Kind of Randomness

What if we could add randomness that respects the model's internal geometry?

This is the core idea I've been exploring: cluster-aware perturbations. Instead of injecting random noise, we inject perturbations that are aligned with the structure of the model's representation space.

The key insights: the model's hidden states aren't scattered arbitrarily; they cluster, and those clusters reflect different framings of the input. A perturbation aligned with that cluster structure can change the reasoning trajectory while staying on the learned manifold.

Concretely, the procedure is:

  1. Sample a direction from the discovered cluster structure
  2. Orthogonalize it against the current representation (so you're moving to a different framing, not reinforcing the current one)
  3. Scale it to stay within a geometric bound (so you don't leave the manifold)
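The three steps above can be sketched in a few lines of numpy. The cluster directions here are assumed to be precomputed (e.g. by something like k-means over collected hidden states), and `epsilon` is a hypothetical tuning knob; this is a sketch of the idea, not a tested implementation.

```python
import numpy as np

def cluster_aware_perturbation(h, centroids, epsilon, rng):
    """Perturb hidden state h along the learned cluster structure.

    h         : current hidden state, shape (d,)
    centroids : cluster directions from the representation space, (k, d),
                assumed precomputed (e.g. k-means over hidden states)
    epsilon   : geometric bound on the relative perturbation size
    """
    # 1. Sample a direction from the discovered cluster structure.
    c = centroids[rng.integers(len(centroids))]

    # 2. Orthogonalize it against h, so the push moves toward a
    #    different framing instead of reinforcing the current one.
    c_orth = c - (c @ h) / (h @ h) * h

    # 3. Scale to stay within a geometric bound relative to ||h||.
    norm = np.linalg.norm(c_orth)
    if norm < 1e-8:  # direction was parallel to h; skip the perturbation
        return h
    return h + epsilon * np.linalg.norm(h) / norm * c_orth

rng = np.random.default_rng(0)
d, k = 64, 8
centroids = rng.normal(size=(k, d))
h = rng.normal(size=d)

h_new = cluster_aware_perturbation(h, centroids, epsilon=0.1, rng=rng)
# The perturbation is orthogonal to h and bounded in size:
assert abs((h_new - h) @ h) < 1e-6 * np.linalg.norm(h) ** 2
assert np.isclose(np.linalg.norm(h_new - h), 0.1 * np.linalg.norm(h))
```

Sampling a different centroid on each pass gives a different on-manifold nudge, which is what produces distinct internal trajectories for the same input.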

The result: different internal reasoning trajectories for the same input, each exploring a different approach to the problem.

Why This Might Matter

If this works, it would suggest something important: the reasoning collapse we see in LLMs isn't a fundamental limitation of the architecture. It's an artifact of deterministic single-path inference.

The model might already "know" how to solve hard problems—it just can't explore the space of approaches.

This aligns with existing research. The Coconut paper (Hao et al., 2024) showed that allowing models to reason in a continuous latent space—rather than generating tokens step by step—dramatically improves performance on problems requiring backtracking. The hidden state space supports meaningful exploration; we just need to unlock it.

The Human Connection

This is how human cognition works. We don't follow a single deterministic reasoning chain. We have persistent internal variability, implicit hypothesis branching, and the ability to reframe problems mid-thought. Convergence happens across multiple cognitive trajectories, not along a single path.

Making AI more "random" in a structured way might actually make it more human.

What This Doesn't Solve

To be clear: this is theoretical work. I haven't run the experiments yet. There are open questions: how should the cluster structure be discovered, how strong can the perturbations be before coherence breaks down, and does exploring latent trajectories actually translate into better answers on hard problems?

But the theoretical foundation is solid. We know single-trajectory inference fails. We know the representation space has exploitable structure. We know Bayesian deep learning treats noise injection as principled approximate inference (Gal & Ghahramani, 2016). The question is whether cluster-aware perturbations are the right synthesis.

The Bigger Picture

I think there's a broader lesson here. We've been scaling language models by making them larger and training them on more data. But maybe the next frontier isn't scale—it's inference-time computation.

The model's weights encode knowledge. But how we use those weights at inference time determines what the model can actually do. Single-path deterministic inference might be leaving enormous capability on the table.

Structured stochasticity is one way to unlock it. There might be others.

The question isn't just "how smart is the model?" It's "how do we let the model think?"

References

Shojaee, P., et al. (2025). "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity." arXiv:2506.06941

Hao, S., et al. (2024). "Training Large Language Models to Reason in a Continuous Latent Space." arXiv:2412.06769

Gal, Y. & Ghahramani, Z. (2016). "Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning." ICML 2016

Park, K., et al. (2023). "The Linear Representation Hypothesis and the Geometry of Large Language Models." arXiv:2311.03658

Ethayarajh, K. (2019). "How Contextual are Contextualized Word Representations?" EMNLP 2019

Zou, A., et al. (2023). "Representation Engineering: A Top-Down Approach to AI Transparency." arXiv:2310.01405