Why This Matters
The goal isn't to build better AI systems today. It's to challenge a fundamental assumption: that models need human-preprocessed, human-structured data to learn effectively.
If a model can develop functional world models from a single temporal "energy" signal - no images, no tokens, no labels - then current AI architectures might be solving an artificially constrained problem. We preprocess data for human interpretability, then train models on that preprocessed data, then wonder why they learn human-like representations.
This research tests that assumption at the extreme: minimal signal, maximal learning challenge. If world models emerge here, it suggests we're dramatically underestimating what's possible with less structured approaches.
The Approach: Radical Simplification
Input
Single Temporal Value
Representing "energy at point of perception"
In the simulation, this value encodes the environment's dynamics at the model's location. The environment has complex rules - think physics simulation with emergent phenomena - so the signal carries rich information. But it's not pre-structured.
No frames, no tokens, no feature vectors. Just: value at time t.
Learning Mechanism
Predictive Coding
With continuous architectural plasticity
The model tries to predict what comes next. Prediction errors drive learning. But critically: the model's internal structure evolves continuously based on its history.
The same input at different times can be interpreted completely differently, depending on the world model it has developed so far.
Key constraint: No human-provided targets, no reward signals, no supervision. The model must discover structure or remain unable to function.
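As a minimal sketch of what error-driven learning from a single value could look like, assuming a linear predictor over a short history (the `PredictiveUnit` name, its linear form, and the learning rate are illustrative placeholders, not the planned architecture):

```python
import math

# Illustrative sketch, not the project's actual architecture: a unit that
# predicts the next scalar from recent history and learns only from the
# mismatch between expectation and observation. No labels, no rewards.

class PredictiveUnit:
    def __init__(self, history_len=4, lr=0.01):
        self.w = [0.0] * history_len       # linear weights over recent values
        self.lr = lr
        self.history = [0.0] * history_len

    def predict(self):
        return sum(w * x for w, x in zip(self.w, self.history))

    def observe(self, value):
        """Receive the next signal value; update weights from the surprise."""
        error = value - self.predict()
        # Gradient step on squared prediction error.
        self.w = [w + self.lr * error * x for w, x in zip(self.w, self.history)]
        self.history = self.history[1:] + [value]
        return error

# Demo: feed a structured signal; prediction error should shrink over time.
unit = PredictiveUnit()
errors = [abs(unit.observe(math.sin(0.3 * t))) for t in range(2000)]
```

Even this toy version exhibits the key property: the signal itself is the only teacher.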
Output
Single Temporal Value
Representing the model's action
This output affects what the model perceives next. The model is embedded in a closed interaction loop:
act → perceive consequence → update → act again
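The loop can be written down directly. Everything below (the toy environment's dynamics, the one-parameter model, the `act`/`update` names) is a stand-in chosen for illustration, not the planned design:

```python
import math

# Illustrative closed interaction loop: the agent's output perturbs the
# environment, which changes what the agent perceives next.

class ToyEnv:
    def __init__(self):
        self.state = 0.3

    def step(self, action):
        # Internal dynamics plus a small action-dependent perturbation.
        self.state = 0.95 * self.state + 0.05 * math.tanh(action)
        return math.sin(3.0 * self.state)   # the single perceived value

class ToyModel:
    def __init__(self):
        self.last = 0.0
        self.w = 0.0                        # one-parameter "world model"

    def act(self):
        return self.w * self.last           # action derived from the model

    def update(self, signal):
        error = signal - self.w * self.last  # prediction error
        self.w += 0.1 * error * self.last    # error-driven update
        self.last = signal
        return error

env, model = ToyEnv(), ToyModel()
for t in range(500):
    action = model.act()           # act
    signal = env.step(action)      # perceive consequence
    model.update(signal)           # update, then act again
```

The point of the sketch is the wiring, not the dynamics: the model is both observer and participant, so its learning trajectory and its input stream are inseparable.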
What Success Looks Like
This research succeeds if:
- The model's prediction error decreases over time: evidence that it's building some internal model.
- The model's actions become systematic: not random noise, but patterned responses suggesting it has learned the environment's dynamics.
- Internal representations stabilize into structure: even if we can't interpret them, we should see organization forming.
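The first two criteria can be monitored with simple diagnostics over logged data. The sketch below assumes arrays of prediction errors and actions are available; it is one possible set of probes, not the final instrumentation:

```python
# Hypothetical diagnostics over logged training data.

def error_trend(errors, window=100):
    """Mean |error| late minus early: negative suggests learning."""
    early = sum(abs(e) for e in errors[:window]) / window
    late = sum(abs(e) for e in errors[-window:]) / window
    return late - early

def action_autocorrelation(actions, lag=1):
    """Lag-k autocorrelation: near 0 for noise, large for patterned behavior."""
    n = len(actions) - lag
    mean = sum(actions) / len(actions)
    cov = sum((actions[i] - mean) * (actions[i + lag] - mean) for i in range(n)) / n
    var = sum((a - mean) ** 2 for a in actions) / len(actions)
    return cov / var if var > 0 else 0.0
```

A genuinely random policy should score near zero on autocorrelation; systematic behavior should not. Neither metric requires interpreting the representations themselves, which is exactly what the third criterion demands we avoid.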
The model's world model will be alien - built on non-human perceptual foundations. Because it perceives through a single temporal value rather than vision, language, or any human sensory modality, it might develop categories and patterns we'd never conceive of. It won't carve reality into "objects" or "spatial relationships" the way vision-based creatures do. Instead, it will organize information in whatever way minimizes prediction error in its specific interaction loop - potentially discovering patterns in the temporal signal that have no names in human experience.
That's the point. If functional world models can form without human perceptual categories, it proves those categories aren't necessary for intelligence - just for human-interpretable intelligence.
Open Questions & Challenges
Information Theory
A single temporal value at one position provides only partial information. But if the model can move and integrate observations over time, it could reconstruct the full environment from sequential glimpses - trading spatial bandwidth for temporal exploration. Does the channel capacity support this?
Learning Dynamics
How does curiosity work when the model shapes its own input stream? What constitutes exploration vs exploitation when you're both observer and actor?
Architectural Mechanics
What does "continuous plasticity" mean in implementation? How do we balance stability (maintain useful world models) with adaptability (update based on new information)?
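One hypothetical answer, offered as an assumption rather than the design: give every parameter a fast component (adapts quickly, decays) and a slow component (consolidates what the fast weights keep rediscovering), so stability and adaptability live in separate timescales:

```python
# Hypothetical plasticity mechanism for illustration only: fast/slow weight
# pairs, loosely inspired by complementary-learning-systems ideas. Not the
# project's actual mechanism, which is an open question.

class FastSlowWeight:
    def __init__(self, fast_lr=0.1, slow_lr=0.001, decay=0.05):
        self.slow = 0.0    # stable world model
        self.fast = 0.0    # transient adaptation on top of it
        self.fast_lr, self.slow_lr, self.decay = fast_lr, slow_lr, decay

    @property
    def value(self):
        return self.slow + self.fast

    def update(self, gradient):
        self.fast += self.fast_lr * gradient
        self.fast -= self.decay * self.fast    # transients fade on their own
        self.slow += self.slow_lr * self.fast  # persistent signals consolidate
```

Under this rule a one-off perturbation washes out of the fast weights, while a sustained regularity slowly migrates into the slow ones.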
Timescale
How long does it take for world models to emerge? Are we talking thousands of timesteps? Millions? Is the learning curve smooth or punctuated?
Verification
If the world model is alien, how do we verify it's functional beyond prediction error metrics? Can we design probes to test for specific capabilities without imposing human structure?
Current Status & Next Steps
Theoretical Framework
Complete (documented here)
Simulation Environment
In development
- Physics-like rules with emergent complexity
- Single-value interface for model interaction
- Logging infrastructure to track learning dynamics
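A sketch of what the environment's interface might look like, with a coupled logistic-map lattice standing in for the physics-like rules still in development (the class name, dynamics, and logging shape are all provisional assumptions):

```python
# Provisional interface sketch for the planned environment. The dynamics
# here (a coupled logistic-map lattice) are an illustrative stand-in for
# the actual physics-like rules, which are still in development.

class SignalEnvironment:
    def __init__(self, size=64, r=3.9, coupling=0.1, seed=1):
        # Deterministic pseudo-random initial cell values in [0, 1).
        self.cells = [((seed * (i + 1) * 2654435761) % 1000) / 1000.0
                      for i in range(size)]
        self.r, self.coupling = r, coupling
        self.position = size // 2
        self.log = []    # learning-dynamics logging hook

    def step(self, action):
        """Advance the world one tick; return the single perceived value."""
        c, r, x = self.coupling, self.r, self.cells
        self.cells = [
            (1 - c) * r * x[i] * (1 - x[i])
            + c * 0.5 * (x[i - 1] + x[(i + 1) % len(x)])
            for i in range(len(x))
        ]
        # The action moves the point of perception along the lattice.
        self.position = (self.position + (1 if action > 0 else -1)) % len(self.cells)
        signal = self.cells[self.position]   # "energy at point of perception"
        self.log.append((self.position, signal))
        return signal
```

The essential constraint is visible in the signature: `step` accepts one value and returns one value, and everything else stays inside the environment.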
Next Milestones
- Confirm sufficient information encoding in temporal signal
- Implement baseline architecture with predictive learning
- Run extended training, monitor for world model emergence
- Develop interpretability tools for alien representations
Timeline: Exploratory research. Could demonstrate feasibility in months, or reveal fundamental barriers.
Why nothing is implemented yet: I competed at the National AI Olympiad, Galaksija Kup, then the International Olympiad in Artificial Intelligence in summer 2025. This research requires sustained, focused implementation time - not something to rush between competitions. I'm prioritizing thinking through the theoretical framework correctly before writing code. Better to delay implementation than to build on shaky foundations.