Decision boundaries, not tokens.

Apr 07, 7 min

The unit of work for a long agent isn't a token. It's a commit point. We explain how we find commit points and why nothing else we tried did the job.

Why the token isn't the unit

most of the work in the model-serving stack treats the token as the atomic unit. for a chatbot or a short completion, that's fine. for a long agent run, it's wrong. the model is not deciding one token at a time in the way that matters for the run's trajectory. it's deciding which tool to call next, which file to open, which intermediate claim to commit to, and which sub-goal to pursue.

a token-level signal washes out the structure that matters. by the time you've integrated over five hundred tokens, the commit point that broke the run is invisible.
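
a toy illustration of the washing-out effect, using made-up numbers rather than anything measured on a real run. the per-token `scores`, the 500-token window, and the spike position are all assumptions chosen for the example.

```python
# illustrative only: synthetic values, not measurements from a real run

def mean(xs):
    """arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)

# hypothetical per-token signal over a 500-token span: flat baseline...
scores = [0.1] * 500
# ...with one sharp spike at an assumed commit point
scores[137] = 5.0

window_mean = mean(scores)        # what a token-level aggregate sees
peak = max(scores)                # what a per-step view still sees
peak_index = scores.index(peak)   # where the spike sits in the span
```

the window mean lands around 0.11, barely above the 0.1 baseline, while the per-step view still shows the spike at position 137.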

What a decision boundary is

we define a decision boundary as the point in a tool call where the model commits to a path it can't easily back out of. choosing which file to edit. choosing what to assert about an unknown. choosing whether to write or test. these are the moments where the trajectory takes a turn.
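
one way to picture the definition is as a tag on steps of a run. this is a minimal sketch, not our internal representation: `BoundaryKind`, `Step`, and `boundaries` are hypothetical names, and the three kinds just mirror the examples in the paragraph above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class BoundaryKind(Enum):
    # the three kinds mirror the examples in the text; the names are made up
    FILE_CHOICE = "choosing which file to edit"
    ASSERTION = "choosing what to assert about an unknown"
    WRITE_OR_TEST = "choosing whether to write or test"


@dataclass
class Step:
    index: int
    action: str
    boundary: Optional[BoundaryKind] = None  # None means an ordinary step


def boundaries(run: List[Step]) -> List[Step]:
    """return only the steps tagged as decision boundaries."""
    return [s for s in run if s.boundary is not None]


# a hypothetical four-step run with two commit points
run = [
    Step(0, "list_dir"),
    Step(1, "open src/parser.py", BoundaryKind.FILE_CHOICE),
    Step(2, "read_file"),
    Step(3, "run_tests", BoundaryKind.WRITE_OR_TEST),
]
commit_points = [s.index for s in boundaries(run)]
```

here `commit_points` picks out steps 1 and 3. the tagging is done by hand in this sketch; how a step earns its tag is a separate question.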

drift events cluster at these boundaries. that's where we intervene.

How we find them

we look at the conditional distribution over the model's next action and identify low-entropy commits and high-stakes branch points. we won't say more here. the details are part of what we're keeping inside while we figure out the right release shape.
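
for readers who want the textbook version of the first half of that sentence: a generic sketch of shannon entropy over a next-action distribution. this is not our detector. the action names, the probabilities, and the 0.5-bit threshold are all assumptions made up for illustration, and the sketch does not attempt the "high-stakes" part at all.

```python
import math


def entropy_bits(probs):
    """shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def flag_low_entropy_steps(steps, threshold_bits=0.5):
    """return indices of steps whose next-action entropy is below the threshold.

    `steps` is a list of dicts mapping action name -> probability.
    the threshold is an arbitrary illustrative value.
    """
    return [
        i for i, dist in enumerate(steps)
        if entropy_bits(dist.values()) < threshold_bits
    ]


# hypothetical next-action distributions at three points in a run
steps = [
    {"read_file": 0.40, "run_tests": 0.35, "edit_file": 0.25},  # undecided
    {"edit_file": 0.97, "run_tests": 0.02, "read_file": 0.01},  # near-certain
    {"run_tests": 0.50, "edit_file": 0.50},                     # even split
]
flagged = flag_low_entropy_steps(steps)
```

only the second step falls under the threshold, so `flagged` is `[1]`. low entropy on its own only says the model has made up its mind, not that the choice is costly to reverse.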

What we tried that didn't work

we spent a month on token-level interventions before moving up the stack. they're cheap and predictable, but they don't act on what matters for long-horizon work. retry-on-fail and per-call routing have the same shape problem: they treat the run as a sequence of independent calls instead of a single object with a trajectory.

ryan

@ryanndngg