A language model and a company look like completely different things, and in most ways they are. A model isn't alive; a company is. But the one thing that matters here, they share: what makes each of them powerful isn't the machinery — it's the accumulated structure it runs on. I've spent a decade on why. It's the bet under everything in Calyx.
Intelligence — in a mind, an agent, or a company — is fluent navigation of an accumulated structure that was selected, not designed. The machinery is replaceable. The structure is the asset, because it's the one part that can't be shortcut.
That single claim has two faces. Point it at a company and you get the living-systems lens — why Calyx is built the way it is, and where the moat lives. Point it at a language model and you get the reasoning lens — why scaling produces reasoning in the first place. Below, both. Dive into whichever pulls you; they meet at the bottom.
This isn't a product page. It's a thinking artifact — the kind of thing I'd usually send instead of a coding session when someone asks whether I actually understand the technology I'm building on. None of it is decoration: the architecture decisions in Calyx fall directly out of what's below.
Lens 1. The Free Energy Principle, active inference, cybernetics: why agents and companies are built this way, and where the moat compounds.
Lens 2. Relational resolution and scaling: why a system trained to predict the next word can solve frontier mathematics.
This is the lens that shapes the product and the moat. It comes out of the science I studied formally — the Free Energy Principle, active inference, and cybernetics — applied to the question of what a company actually is.
And this isn't fringe philosophy. The Free Energy Principle is a serious, peer-reviewed framework — two decades of work by Karl Friston, one of the most-cited neuroscientists alive — and it's an active frontier in bio-inspired AI: dedicated annual conferences, an MIT Press textbook, and real results in robotics, multi-agent systems, and efficient planning, with a growing body of work treating it as a route to adaptive agents beyond today's LLMs. The field is only now turning toward it.
It's also the exact territory I've lived in for years — it's what my honours thesis was on, and what I spent a podcast interviewing researchers about. The record →
The Free Energy Principle (Karl Friston's framework) says that any system which persists, whether a cell, an animal, or an institution, does so by maintaining a model of its environment and acting to minimise surprise: the gap between what it predicts and what it encounters. (Formally, the quantity minimised is variational free energy, an upper bound on surprise.) This is active inference: you don't just update your model to fit the world, you also act on the world to fit your model. Sense, model, act, repeat. It's the same loop cybernetics described decades ago.
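The sense, model, act loop can be sketched in a few lines. This is a toy illustration under my own assumptions, not an implementation of the Free Energy Principle: the scalar `world`, the `preferred` state, and the two learning rates are all hypothetical, and prediction error stands in for free energy. It only shows the two ways of closing the gap: changing the belief to fit the world, and changing the world to fit the belief.

```python
import random

def run_loop(steps=200, seed=0):
    """Toy sense-model-act loop (illustrative only, not real active inference)."""
    rng = random.Random(seed)
    world = 10.0        # hidden state of the environment (hypothetical)
    belief = 0.0        # the agent's current estimate of that state
    preferred = 3.0     # the state the agent's model expects to find itself in
    lr_perceive = 0.3   # how fast the belief moves toward observations
    lr_act = 0.1        # how strongly actions push the world
    errors = []
    for _ in range(steps):
        # Sense: a noisy reading of the world.
        observation = world + rng.gauss(0.0, 0.1)
        # Model: update the belief to reduce prediction error.
        belief += lr_perceive * (observation - belief)
        # Act: push the world toward what the model expects.
        world += lr_act * (preferred - belief)
        errors.append(abs(preferred - world))
    return world, belief, errors

world, belief, errors = run_loop()
print(f"first gap {errors[0]:.2f}, final gap {errors[-1]:.2f}")
```

With these assumed rates the gap between the preferred and actual state shrinks over the run. Neither update does this alone: the belief has to track the world before acting on it helps.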
A company does exactly this, badly. It senses its market, holds a picture of itself, and acts to close the gap — except the picture lives smeared across a dozen tools and somebody's head, and nobody can keep it current. Calyx is the attempt to give that loop a real substrate.
Artemy Kolchinsky and the physicist David Wolpert (2018) made this precise: semantic information is the information a system holds about its environment that is causally necessary for it to keep existing. A bacterium tracking a chemical gradient holds semantic information about it; a rock holds none. Meaning isn't projected onto the world by minds. It's the subset of information that bears on survival.
That reframes a company's context. The decisions, the history, the model of itself — that is its semantic information, the part it must keep true to persist. Lose it and the organism goes blind.
A system can only act on a world it can model accurately. Drop an agent into an opaque, high-entropy substrate and its predictions degrade; give it a legible, ordered one and it reasons reliably. This isn't a preference — it falls straight out of the principle above.
So Calyx is a folder of plain markdown files: deterministic, inspectable, low-entropy. The file-first decision isn't aesthetic. It's the condition under which agentic reasoning actually works — which, as it happens, is also what makes the workspace pleasant for a human. The same choice serves both.
Put the three together. A company is a cybernetic system that persists by keeping a true model of itself; that model is its semantic information; and it can only run well on a legible substrate. Calyx is built to be that substrate — sensing your connected tools, updating the vault, acting to close the gap between what the organisation believes and what's actually true.
And the more of the company that lives inside it, the more capable its agents become and the more expensive it is to leave. That's a switching cost and a network effect that compound per customer, every day. Competitors can copy an interface in a weekend. They cannot copy a continuously-reconciled, living model of your company.
Agents are a commodity. The living context is the asset — and it's the only part that compounds.
The lens above explains the product. This one explains the technology it runs on — a theory I've been developing about why these models can reason at all. Watch for Kolchinsky & Wolpert returning in the first claim.
Language models get better at reasoning as they scale. This is well established empirically and still largely unexplained.
Scaling laws describe that it happens. Interpretability work describes which circuits appear. But there's no satisfying answer to the question underneath: why does training a system to predict the next word on a pile of internet text produce something that can solve frontier mathematics? Here's the account I find most convincing.
Take Kolchinsky & Wolpert's semantic information again — meaning as the information a system needs to persist — and point it at language. The words and structures that survive aren't random; they're the residue of a long selection process. Every word that exists is a distinction someone needed often enough to name. So the corpus of human language carries, implicitly, a vast graph of concept-to-concept relationships, and that graph isn't arbitrary — it was shaped by reality, because the people who produced it were.
Some links are strong and obvious (cat/animal, hot/cold). Others are weak and rare: the connection between thermodynamic entropy and information entropy, say. Borrowing from the sociologist Mark Granovetter's work on the strength of weak ties, I call these the weak bridges: sparse links between distant regions of the graph. They are rare but structurally critical, because they are what make non-obvious inference possible.
At low scale a model only learns the strong, common links — a blurry map, major landmarks only. At higher scale the weak bridges resolve: minor roads, then footpaths. The model can traverse paths that were always there but invisible at lower resolution. And the links become context-sensitive — A relates to B differently depending on what else is present. That's the thing that separates reasoning from lookup.
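A minimal sketch of the map metaphor, assuming a hand-made concept graph. The edge strengths and the `min_strength` cutoff are invented for illustration and are not measurements of any model. An edge counts as resolved only if its strength clears the cutoff, and a breadth-first search then asks whether a path exists at that resolution.

```python
from collections import deque

# Hypothetical concept graph: (concept, concept) -> link strength in the corpus.
EDGES = {
    ("heat", "thermodynamic entropy"): 0.9,
    ("thermodynamic entropy", "information entropy"): 0.05,  # the weak bridge
    ("information entropy", "compression"): 0.8,
    ("heat", "temperature"): 0.95,
}

def find_path(start, goal, min_strength):
    """Breadth-first search over the edges strong enough to be resolved."""
    adjacency = {}
    for (a, b), strength in EDGES.items():
        if strength >= min_strength:
            adjacency.setdefault(a, []).append(b)
            adjacency.setdefault(b, []).append(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in adjacency.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no route at this resolution

print(find_path("heat", "compression", min_strength=0.1))   # coarse map
print(find_path("heat", "compression", min_strength=0.01))  # fine map
```

At the coarse cutoff the weak bridge is invisible and the search finds nothing. At the fine cutoff the same graph yields a route from heat to compression. The graph never changed, only the resolution did.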
This explains why capabilities seem to appear suddenly. Per-token accuracy climbs smoothly, but a real task requires traversing a chain of weak bridges in sequence — and if any single link is still below resolution, the whole inference fails. When scale crosses the threshold where every link in the path resolves, the capability snaps into existence. Smooth underneath, discontinuous on the surface, because the task needs all the edges at once.
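The arithmetic behind that snap can be made concrete. A hedged sketch: assume each link in a chain is traversed correctly with the same probability, independently of the others. Both the independence and the ten-link chain are my simplifying assumptions, not claims about real models. Task success is then the per-link accuracy raised to the chain length.

```python
def chain_success(per_link_accuracy, chain_length):
    """Probability that every link in the chain holds, assuming independent links."""
    return per_link_accuracy ** chain_length

# Per-link accuracy improves gradually; success on a ten-link task does not.
for p in (0.80, 0.90, 0.95, 0.99):
    print(f"per-link {p:.2f} -> ten-link task {chain_success(p, 10):.3f}")
```

Moving per-link accuracy from 0.80 to 0.99 is a modest change on the underlying metric, but under these assumptions it takes the ten-link task from roughly one success in ten to roughly nine in ten. The longer the chain, the sharper the apparent jump.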
Each token a model emits materialises a concept into its working context. Before it's said, the concept is latent in the weights, reachable only via the right path; once it's said, it's a concrete vector the next computation can build on. Each token also changes which tokens become possible next, the way each stone in a vault changes what the next stone can bear. Words are the bricks, grammar the mortar, and reasoning the cathedral that only stands once the whole thing is in place.
So "think step by step" doesn't teach a model to reason — it gives it permission to lay down the surface it needs to reason on. A model can't reason about what it hasn't yet said.
Here's the part that should stop you. The model has never touched the world — no eyes, no hands, no experiment — and it converges on truth anyway. How? Because the structure it learned wasn't designed. It was selected — calibrated against reality by billions of people across thousands of years, the same way natural selection calibrates a body to its environment. Language is the fossil record of every distinction that ever helped someone survive. Train on it deeply enough and you inherit the map.
It even predicts the failures: a model confabulates most confidently exactly where the corpus is internally coherent but reality-detached — pseudoscience, folk etymology, superseded theory. The map is dense there, so it moves fluently. The paths just don't track truth, because the selection pressure was social, not empirical.
Read the two threads back to back and the spine is unmistakable. Not that the two are alive in the same way — a trained model is a frozen artifact; a company is a living system. What they share is narrower, and more interesting. A language model's value isn't its architecture — anyone can fork it. It's the accumulated, selected relational structure it learned to navigate. A company is the same shape: its value isn't the tooling, it's the accumulated context — the model of itself it's built up over time. In both, the structure is the asset, because it was selected, not designed, and so it can't be shortcut.
And here's where the two lenses actually touch: the model isn't alive — but the structure it runs on was laid down by billions of people who are. Language is the residue of living systems. The model just inherits the map they drew.
That's why Calyx is built the way it is — plain files, explicit structure, deterministic folders, a context layer that compounds per customer, every day. The theory of mind and the theory of moat are one theory. I didn't build the product and then reach for a justification. The product fell out of the worldview.
An honest note on what this is. The reasoning lens is a synthesis, not a bolt from the blue. It stands on real work: Piantadosi & Hill and Ellie Pavlick on LLMs and meaning, Olsson et al. on induction heads, and the existing emergence-as-phase-transition literature. What I think is mine is the way these stitch together, and the fact that it makes falsifiable predictions, a couple of which I'm keeping back for a proper paper rather than spelling out here. I'm not claiming I've cracked it. I'm showing you how I reason about the technology I'm building a company on top of. More on the background →
Models will keep getting cheaper. Features will keep getting copied over a weekend. The structure a company accumulates is the one thing that compounds — and building the place it lives is the whole game.