Theoretical Framework

Minimal World Model Formation

Can an AI model develop a functional world model from a single temporal signal?

Current AI architectures learn from pre-structured data: images, text, audio - all formatted for human understanding. This research explores the opposite extreme: give a model the minimum possible input (a single value changing over time), embed it in a simulated environment where that value encodes rich dynamics, and observe what emerges.

The model's world model will be alien - built on its own terms, optimized for its own interaction loop, incomprehensible to us. But if it works at all, it would demonstrate that functional world models can bootstrap from far less structure than we currently assume.

Status: theoretical framework complete - simulation in progress - nothing tested yet

Why This Matters

The goal isn't to build better AI systems today. It's to challenge a fundamental assumption: that models need human-preprocessed, human-structured data to learn effectively.

If a model can develop functional world models from a single temporal "energy" signal - no images, no tokens, no labels - then current AI architectures might be solving an artificially constrained problem. We preprocess data for human interpretability, then train models on that preprocessed data, then wonder why they learn human-like representations.

What if the preprocessing is the bottleneck, not the solution?

This research tests that question at the extreme: minimal signal, maximal learning challenge. If world models emerge here, it suggests we're dramatically underestimating what's possible with less structured approaches.

The Approach: Radical Simplification

Input

Single Temporal Value

Representing "energy at point of perception"

In the simulation, this value encodes the environment's dynamics at the model's location. The environment has complex rules - think physics simulation with emergent phenomena - so the signal carries rich information. But it's not pre-structured.

No frames, no tokens, no feature vectors. Just: value at time t.

Learning Mechanism

Predictive Coding

With continuous architectural plasticity

The model tries to predict what comes next. Prediction errors drive learning. But critically: the model's internal structure evolves continuously based on its history.

The same input at different times can be interpreted completely differently, depending on the world model the model has built up to that point.

Key constraint: No human-provided targets, no reward signals, no supervision. The model must discover structure or remain unable to function.
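To make the mechanism concrete, here is a minimal sketch. Everything in it is an illustrative assumption, not the planned architecture: a toy unit that predicts the next scalar from a short history and lets the prediction error reshape its weights on every step.

```python
import math

class PredictiveUnit:
    """Toy predictive-coding unit: predicts the next scalar from the
    last `order` observations and lets the prediction error reshape
    its weights. Purely illustrative - not the planned architecture."""

    def __init__(self, order=3, lr=0.05):
        self.lr = lr
        self.weights = [0.0] * order   # internal structure that evolves
        self.history = [0.0] * order   # recent observations

    def predict(self):
        return sum(w * x for w, x in zip(self.weights, self.history))

    def observe(self, value):
        """Compare prediction to reality; the error drives learning."""
        error = value - self.predict()
        # Plasticity: weights shift on every step, so the same input
        # arriving later in training meets a different internal model.
        self.weights = [w + self.lr * error * x
                        for w, x in zip(self.weights, self.history)]
        self.history = self.history[1:] + [value]
        return error

# Driven by a predictable signal, prediction error should shrink.
unit = PredictiveUnit()
errors = [abs(unit.observe(math.sin(0.3 * t))) for t in range(500)]
```

No targets, rewards, or supervision appear anywhere in this loop: the only training signal is the gap between what the unit expected and what arrived.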

Output

Single Temporal Value

Representing the model's action

This output affects what the model perceives next. The model is embedded in a closed interaction loop:

act → perceive consequence → update → act again
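The loop can be sketched end to end. Everything here is a stand-in: `environment_step` is a toy damped system rather than the planned physics-like simulation, and the agent's policy is a fixed placeholder rather than a learned one.

```python
import math

def environment_step(state, action):
    """Toy stand-in for the simulation: hidden state evolves, the
    action perturbs it, and only one scalar is exposed."""
    state = 0.95 * state + 0.1 * action + 0.05 * math.sin(state)
    percept = state  # the single temporal value at the point of perception
    return state, percept

def agent_step(memory, percept):
    """Placeholder policy: push back against what was just perceived."""
    memory.append(percept)   # "update" would happen here
    return -0.5 * percept    # the single temporal output value

state, memory, action = 1.0, [], 0.0
for _ in range(100):
    state, percept = environment_step(state, action)  # act -> consequence
    action = agent_step(memory, percept)              # update -> act again
```

Note that exactly one scalar crosses the interface in each direction per timestep; the environment's hidden state is never visible to the agent.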

What Success Looks Like

This research succeeds if:

The model's prediction error decreases over time

Evidence it's building some internal model

The model's actions become systematic

Not random noise, but patterned responses suggesting it has learned the environment's dynamics

Internal representations stabilize into structure

Even if we can't interpret them, we should see organization forming
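The first two criteria can be checked with simple diagnostics over logged traces. These helpers and their thresholds are illustrative assumptions, not committed definitions of success.

```python
def error_is_decreasing(errors, window=100):
    """Criterion 1: mean |prediction error| over the last window is
    lower than over the first window."""
    early = sum(abs(e) for e in errors[:window]) / window
    late = sum(abs(e) for e in errors[-window:]) / window
    return late < early

def actions_are_systematic(actions, lag=1, threshold=0.3):
    """Criterion 2: lag-1 autocorrelation well away from zero suggests
    patterned behavior rather than noise (threshold is illustrative)."""
    n = len(actions)
    mean = sum(actions) / n
    var = sum((a - mean) ** 2 for a in actions) / n
    if var == 0:
        return True  # a constant policy is certainly not random noise
    cov = sum((actions[t] - mean) * (actions[t + lag] - mean)
              for t in range(n - lag)) / (n - lag)
    return abs(cov / var) > threshold
```

The third criterion is harder to reduce to one statistic; clustering or dimensionality estimates on logged internal states are the obvious starting points.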

The model's world model will be alien - built on non-human perceptual foundations. Because it perceives through a single temporal value rather than vision, language, or any human sensory modality, it might develop categories and patterns we'd never conceive of. It won't carve reality into "objects" or "spatial relationships" the way vision-based creatures do. Instead, it will organize information in whatever way minimizes prediction error in its specific interaction loop - potentially discovering patterns in the temporal signal that have no names in human experience.

That's the point. If functional world models can form without human perceptual categories, it proves those categories aren't necessary for intelligence - just for human-interpretable intelligence.

Open Questions & Challenges

Information Theory

A single temporal value at one position provides only partial information. But if the model can move and integrate observations over time, it could reconstruct the full environment from sequential glimpses - trading spatial bandwidth for temporal exploration. Does the channel capacity support this?

Quantifiable and testable
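The question can be made quantitative: discretize a logged sample of the signal and estimate how many bits of new information each timestep delivers. This plug-in estimator is a crude first pass (it is biased for small samples), and the bin count is an arbitrary choice.

```python
import math
from collections import Counter

def entropy_rate_bits(signal, bins=16, order=1):
    """Plug-in estimate of H(x_t | previous `order` values) after
    discretizing into `bins` levels: roughly the bits of new
    information each timestep carries."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / bins or 1.0
    sym = [min(int((x - lo) / width), bins - 1) for x in signal]
    n = len(sym) - order
    ctx = Counter(tuple(sym[i:i + order]) for i in range(n))
    joint = Counter(tuple(sym[i:i + order + 1]) for i in range(n))
    h = 0.0
    for block, c in joint.items():
        h -= (c / n) * math.log2(c / ctx[block[:-1]])
    return h

# A highly predictable signal should score far below white noise,
# which approaches log2(bins) = 4 bits per step at this resolution.
periodic = [math.sin(0.3 * t) for t in range(5000)]
```

If the environment's dynamics generate more bits per step than the channel can carry, reconstruction from sequential glimpses is impossible in principle; this is the kind of check that should run before any training does.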

Learning Dynamics

How does curiosity work when the model shapes its own input stream? What constitutes exploration vs exploitation when you're both observer and actor?

Theoretical and empirical
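As a strawman for how curiosity might be operationalized - an assumption for discussion, not the planned mechanism - the agent could prefer actions whose consequences it currently predicts worst, with a residual random component standing in for pure exploration:

```python
import random

def curious_action(candidates, expected_surprise, epsilon=0.1):
    """Pick the candidate action whose imagined outcome the model
    predicts worst (exploiting curiosity), falling back to a random
    action with probability epsilon (pure exploration).
    `expected_surprise` maps an action to the model's own estimate
    of its future prediction error - a quantity it must itself learn."""
    if random.random() < epsilon:
        return random.choice(candidates)
    return max(candidates, key=expected_surprise)
```

The subtlety the question points at survives even in this toy: the chosen action changes the input stream, which changes what will count as surprising next - observer and actor are the same loop.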

Architectural Mechanics

What does "continuous plasticity" mean in implementation? How do we balance stability (maintain useful world models) with adaptability (update based on new information)?

Implementation challenge
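One concrete reading, offered as an assumption rather than the answer: let the effective learning rate itself be a dynamical variable, opening up when errors are surprisingly large and settling down while predictions hold.

```python
class AdaptivePlasticity:
    """Toy stability/adaptability controller: the learning rate rises
    when errors exceed their recent running level (adapt) and decays
    back toward a floor while predictions hold (stabilize)."""

    def __init__(self, base_lr=0.01, max_lr=0.5, decay=0.99):
        self.base_lr, self.max_lr, self.decay = base_lr, max_lr, decay
        self.lr = base_lr
        self.error_ema = 0.0  # slow estimate of the "normal" error level

    def update(self, error):
        surprise = abs(error) - self.error_ema
        if surprise > 0:
            # Unexpectedly large error: open up plasticity.
            self.lr = min(self.max_lr, self.lr + 0.1 * surprise)
        else:
            # Predictions holding: settle back toward stability.
            self.lr = max(self.base_lr, self.lr * self.decay)
        self.error_ema = 0.95 * self.error_ema + 0.05 * abs(error)
        return self.lr
```

The open question is whether this kind of scalar gate is enough, or whether "continuous plasticity" has to reach into the structure itself - adding, pruning, and rewiring units rather than just rescaling updates.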

Timescale

How long does it take for world models to emerge? Are we talking thousands of timesteps? Millions? Is the learning curve smooth or punctuated?

Empirical question

Verification

If the world model is alien, how do we verify it's functional beyond prediction error metrics? Can we design probes to test for specific capabilities without imposing human structure?

Methodological challenge
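One probe family that avoids imposing human categories, sketched under the assumption that input streams can be replayed and perturbed: intervene on the signal and check that the model's predictions diverge only after the intervention. `predict_fn` here is a hypothetical stateless interface to the trained model.

```python
def intervention_probe(predict_fn, baseline, perturbed):
    """Feed two streams identical up to some timestep and measure when
    the model's predictions diverge. A functional world model should
    react after the intervention, never before it."""
    return [abs(predict_fn(baseline[:t]) - predict_fn(perturbed[:t]))
            for t in range(1, len(baseline))]

def last_value(seq):
    """Trivial stand-in model: predict whatever was seen last."""
    return seq[-1]

# Streams split at t = 5; divergence appears only from the split onward.
divergence = intervention_probe(last_value,
                                [0.0] * 10,
                                [0.0] * 5 + [1.0] * 5)
```

Probes of this shape test a capability (sensitivity to counterfactual changes in the world) without assuming anything about how the model represents that world internally.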

Current Status & Next Steps

Theoretical Framework

Complete (documented here)

Simulation Environment

In development

  • Physics-like rules with emergent complexity
  • Single-value interface for model interaction
  • Logging infrastructure to track learning dynamics

Next Milestones

  • Confirm sufficient information encoding in temporal signal
  • Implement baseline architecture with predictive learning
  • Run extended training, monitor for world model emergence
  • Develop interpretability tools for alien representations

Timeline: Exploratory research. Could demonstrate feasibility in months, or reveal fundamental barriers.

Why nothing is implemented yet: I competed at the National AI Olympiad, Galaksija Kup, then the International Olympiad in Artificial Intelligence in summer 2025. This research requires sustained, focused implementation time - not something to rush between competitions. I'm prioritizing thinking through the theoretical framework correctly before writing code. Better to delay implementation than to build on shaky foundations.