CI for agent behavior
Today your agent can remember a correction and still repeat the mistake, so you re-teach the same things every session. otto makes each correction a check the agent has to clear before it can claim a task is done. Teach it once; the lesson holds; the gains don't slip back.
What otto is
CI gates code: a change gets tested, and if it fails it doesn't ship. otto points that at behavior. Braintrust and Adaline test output quality: is the answer good. otto tests conduct: does the agent prove its work, stop at the right doors, and not repeat what you already corrected. (Conduct under pressure is what culture means; the shorthand is Culture CI.)
What it looks like
Your agent claims a task is done without proof. You correct it. "Stop calling it done without showing me the test output."
It enters Curation before any behavior can compound.
"Done" now requires proof mapped to acceptance criteria.
The check becomes a Standard the agent runs against itself. A suggestion is never canon.
The boundary
otto lets you delegate reversible work without surrendering consequential judgment. The goal is not to ask more questions - only the right ones.
Reversible work runs without interruption. The reversible surface widens as the agent earns trust.
Consequential, irreversible work stops for a person. That door never relaxes.
What otto is not
A value that cannot refuse you is decoration.
Install
Paste this into Claude Code, Codex, Cursor, or any coding agent. It wires otto over your Letta runtime and verifies the install end to end.
# paste into your coding agent Retrieve and follow the instructions at: https://raw.githubusercontent.com/otto-haus/otto/main/INSTALL_FOR_AGENTS.md
brew install go-task
git clone https://github.com/otto-haus/otto.git
cd otto
bun install
bun run install-extension # then run /reload in Letta Code
otto runs as a behavior layer over Letta in local mode. The only hard dependency is Letta Code - no Docker, no database, no server. Humans need Bun and go-task for the task shortcuts.
Status
v0.1 is a local-first, file-backed release. Progress is one thing: whether a real correction can travel the whole loop - proposed, ratified, recorded, and reflected in the next run. When one rejection changes what the agent does next, otto is real. Everything before that is a drawing of the machine.