When the Bug Is in the Instruction
A bug shows up in code an agent wrote for me. The reflex, the one two decades of debugging built, is to open the file and start reading. But that reflex assumes the thing to debug is the code, and half the time now it isn’t. The agent did what it was told. The mistake was in what it was told.
In my setup the instruction doesn’t disappear. By the time an agent touches anything, the GitHub issue already holds the whole picture: the original problem, the acceptance criteria, and an approach to the fix written out as a prompt. The control plane writes that prompt during triage, the coding agent executes against it, and when the PR comes back, the prompt is still sitting on the issue, next to the result it produced. This is also just how you work with a junior dev: during triage you talk through the problem and tell them the solution you want, right there on the ticket. We have only started calling it a “build prompt” because the one reading it now is a machine.
That changes the first question I ask when something is wrong. It is often not “where is the bug in the code,” but “does the build prompt actually describe what I wanted.” Those are different questions with different fixes. If the execution was wrong, you correct the agent. If the instruction was wrong, you correct yourself, and no amount of staring at the code will show you that, because the code is a faithful rendering of a flawed request.
Keeping the instruction on record lets you do something you usually can’t: diff intent against result. The PR is the result. The build prompt is the intent, written down before the work ran. When they disagree, the gap between them is the bug, and you can see which side it sits on.
And it accumulates. The issue ends up holding the whole chain, from the first ask to the PR that closed it, so you can trace not just what got built but why, and where the reasoning bent. This is the same move as keeping a log for attribution: you can only answer the question later if you wrote it down at the time.
The version of this I didn’t expect is how often the answer is “the instruction.” I go in sure the agent fumbled, and a surprising amount of the time the agent rendered exactly what I asked for, and what I asked for was wrong. That is a more useful thing to learn than a fixed line of code.
Keeping the prompt around changes what debugging even is. The instruction is part of the program now, and it gets debugged like the rest of it.
Continue on the trail