Parallax
A Central Workspace (CWS) modelled as a continuous-state dynamical system that settles into a metastable regime of an attractor landscape while being perturbed by a heterogeneous pool of expert probes. Reasoning is settling. The K loop (the reasoning iterations) is decoupled from the token loop. A candidate replacement for the residual stack itself.
The whole project is a slow drift away from the assumption that cognition is the same thing as token-clocked autoregressive prediction. The CWS runs K reasoning steps per context update, not per token. It does not see raw inputs - language is just one of several sensory channels. It is trained with a DEQ-style (deep equilibrium) contract (K-1 frozen iterations, one live final step), so training memory stays constant in K: reasoning depth costs no extra memory at training time. The bet: that you can get the dynamical-system flavour of cognition - settling, pondering, content-addressable basins - without giving up the throughput of modern transformers. It took three full iterations to figure out what the project actually is; v3 is where it started feeling honest.
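A minimal sketch of the training contract described above, in plain Python. It is not the project's code, and it rests on several assumptions: the update rule `z' = tanh(W z + x)`, the squared-error loss, the zero initial state, and every function name are hypothetical stand-ins, and "frozen" is read as "no gradient is tracked through these iterations". It only illustrates why memory stays flat in K: the K-1 settling steps keep just the latest state, and the gradient is taken through the single live step.

```python
import math

def step(W, z, x):
    """One workspace update (assumed form): z' = tanh(W z + x)."""
    return [math.tanh(sum(w * zj for w, zj in zip(row, z)) + xi)
            for row, xi in zip(W, x)]

def settle(W, x, K):
    """Run the K-1 frozen iterations from z = 0, keeping only the latest state."""
    z = [0.0] * len(x)
    for _ in range(K - 1):
        z = step(W, z, x)  # previous state is discarded; nothing saved for backprop
    return z

def live_loss(W, z_prev, x, target):
    """The single live step plus a squared-error loss (assumed objective)."""
    z_out = step(W, z_prev, x)
    loss = 0.5 * sum((a - t) ** 2 for a, t in zip(z_out, target))
    return loss, z_out

def truncated_grad(W, x, target, K):
    """dL/dW through the live step only; z_prev is treated as a constant."""
    z_prev = settle(W, x, K)
    loss, z_out = live_loss(W, z_prev, x, target)
    # Chain rule through tanh: d tanh(u)/du = 1 - tanh(u)^2
    delta = [(a - t) * (1.0 - a * a) for a, t in zip(z_out, target)]
    grad = [[d * zj for zj in z_prev] for d in delta]
    return loss, grad

# Hypothetical toy values, chosen so the map is contractive and settles.
W = [[0.2, -0.1], [0.1, 0.3]]
x = [0.5, -0.3]
target = [0.4, -0.2]
loss, grad = truncated_grad(W, x, target, K=8)
```

Raising K here changes how far the state has settled before the live step, but not how much is stored for the backward pass. Note that this one-step gradient is a truncation, not the exact implicit-function gradient that a full DEQ would compute at the fixed point.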
- How sharp does the attractor regime need to be? Strange-attractor dynamics emerge from the K budget, not the architecture - so what is the right K policy?
- Is K-DOF a graded reasoning-depth signal at language scale, or only a binary structural-recognition signal? Synthetic recall says graded; TinyStories at 6k steps says binary.
- Can the 'replaces ResNet' framing survive 100M+ parameters and 1B+ tokens, or is the constant-memory-in-K finding only useful at small scale?
- How do you write modality-specific motor decoders without sneaking cognition into them?