engines for long-form agent work.
long agent runs drift. we build small engines that watch for the moments where drift happens and steer the run at the decision boundary, without touching your prompt or your model. on two weeks of internal evals, quality matches the unguided baseline within noise.
we steer the trajectory, not the prompt.
we do not edit your prompt. we do not fine-tune. we do not route to a different model. we do not run a second model in the background. we rejected each of those interventions for a reason, and the reasons are written up in the docs.
we measure trajectory at every agent decision and intervene only at the boundaries where the run is committing to a path it can't easily back out of. when the drift detector reports high-confidence drift, the trajectory engine narrows the choice set. otherwise the engines watch.
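a minimal sketch of that gating logic, in python. this is an illustration under stated assumptions, not our implementation: the `Decision` fields, the `steer` function, the `on_track` set, and the 0.9 threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

# every name and number below is hypothetical, for illustration only.
DRIFT_THRESHOLD = 0.9  # assumed cutoff for "high-confidence" drift


@dataclass(frozen=True)
class Decision:
    options: tuple[str, ...]   # actions the agent could take next
    irreversible: bool         # is the run committing to a path it can't easily back out of?
    drift_confidence: float    # detector's confidence that the run is drifting, 0.0 to 1.0
    on_track: frozenset[str]   # options assumed to be consistent with the original task


def steer(decision: Decision) -> tuple[str, ...]:
    """Return the choice set the agent may pick from at this decision."""
    # watch only: either this is not a commitment boundary,
    # or the detector is not confident the run is drifting.
    if not decision.irreversible or decision.drift_confidence < DRIFT_THRESHOLD:
        return decision.options

    # intervene: keep only the options that stay on the original trajectory.
    narrowed = tuple(o for o in decision.options if o in decision.on_track)

    # never hand back an empty choice set; fall back to the original options.
    return narrowed or decision.options
```

in this sketch the prompt and the model are never touched. the only thing that changes is which options are on the table, and only when both conditions hold.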
our internal evals cover two weeks of long-horizon traces plus the standard agent benchmarks. quality matches the unguided baseline within noise. the methodology is being written up in pieces in the docs.