Fast Checks for Code Generation

Greg Tatum – June 24, 2026

I’ve been thinking a lot about code generation with AI agents lately, and wanted to start writing a bit about how I’m approaching the problem. I’ve been using AI-assisted workflows since ChatGPT came out, but I care quite deeply about code quality and maintaining a good mental model for the work I’m doing. In the early days I would mostly use these tools for scaffolding, but I’m finding we’re at a bit of an inflection point where they can be used for more production-ready coding tasks. This is a first article exploring it.

This technique, built around the principle “Automate Feedback”, is one I have found quite powerful for ensuring that the codegen is up to my standards. In my project Indie Web, which powers FloppyDisk.link and BrowserChords.com, I use Taskfiles as my script runner. I added a task check command that automates all the fast iterative checks I care about.

I found myself frustrated with constantly correcting the agent and with slow feedback cycles. The agent failed to apply automatic code formatting pretty much every time, and my atomic commits became messy with formatting fixes after the fact. Even on side projects I still prefer a clean commit style. I also kept correcting annoying coding patterns and style lints. Suddenly, my side project needed the classic linting checks meant for untrusted contributors, even though I was technically the only person on the project. We have had this multiple-contributor problem solved for quite a while with linting and automated feedback.

For agentic loops, this changes where the feedback gets applied. Here I want a single unambiguous check command that is as fast as possible to keep the agent on task without a human in the loop, until I am ready to review the work. In order to trust plausible-looking agent code, I need some kind of validation that is quick and that I can trust. Here I got the check down to a few seconds of parallelized runs. AI codegen is really good at creating plausible code that is fundamentally flawed. task check was my way to keep things higher quality.

How the command works – human in the loop

task check is the single command I need to run all of the checks. It’s fast, and parallelized.

For failures, only the failing task’s output is shown.

This way it is quick to diagnose errors and understand failures whenever I want to run a check.

How the command works – non-TTY

If I pipe the command into cat, it runs in a non-interactive mode. This is the same way an agent or standard CI would interact with the command. Here I wanted the output in a log-friendly format, without interactive check marks updating over time. I wanted to maximize signal for the agent, but retain signal for me as well.
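A minimal sketch of how a script could tell the two modes apart, assuming it runs on Node.js. The `pickReportMode` helper and the mode names are hypothetical and not taken from the real `bin/check.mts`; only `process.stdout.isTTY` is an actual Node property.

```typescript
// Hypothetical names: ReportMode and pickReportMode are for illustration only.
type ReportMode = "interactive" | "log";

// Node sets `isTTY` to true when a stream is attached to a terminal.
// When stdout is piped (as in `task check | cat`) the property is undefined.
function pickReportMode(stream: { isTTY?: boolean }): ReportMode {
  return stream.isTTY ? "interactive" : "log";
}

// Decide once at startup which reporter to use.
const mode: ReportMode = pickReportMode(process.stdout);
```

Taking the stream as a parameter, rather than reading `process.stdout` inside the function, keeps the decision easy to exercise without a real terminal.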

➤ task check | cat
RUN  Lint JS
RUN  Lint CSS
RUN  TypeScript Frontend
RUN  TypeScript Server
RUN  TypeScript Shared
RUN  Test Frontend
RUN  Test Server
FAIL Lint JS             5.5s
FAILURES

FAIL Lint JS exit 1 | run: task lint-js
cmd: task --silent --exit-code lint-js
output:

/Users/greg/dev/indie-web/src/frontend/index.tsx
  84:22  error  Replace `A.setDropboxAccessToken(oauth.accessToken,·0,·oauth.refreshToken)` with `⏎········A.setDropboxAccessToken(oauth.accessToken,·0,·oauth.refreshToken),⏎······`  prettier/prettier

✖ 1 problem (1 error, 0 warnings)
  1 error and 0 warnings potentially fixable with the `--fix` option.


task: Failed to run task "check": exit status 1
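The failure block above follows a simple shape: a header naming the check, its exit code, and the command that reruns only that check, followed by the captured output. Here is a rough sketch of a formatter that produces that shape. The `Failure` type and `formatFailure` function are hypothetical names, and the real script may well build this differently.

```typescript
// Hypothetical shape for a failed check; the field names are assumptions.
interface Failure {
  task: string; // Taskfile task name, e.g. "lint-js"
  label: string; // Human-readable label, e.g. "Lint JS"
  exitCode: number;
  output: string; // Captured stdout and stderr of the check
}

function formatFailure(failure: Failure): string {
  const { task, label, exitCode, output } = failure;
  return [
    // The header tells the reader (or the agent) how to rerun just this check.
    `FAIL ${label} exit ${exitCode} | run: task ${task}`,
    `cmd: task --silent --exit-code ${task}`,
    "output:",
    "",
    output.trimEnd(),
  ].join("\n");
}
```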

For the success case I got:

➤ task check | cat
RUN  Lint JS
RUN  Lint CSS
RUN  TypeScript Frontend
RUN  TypeScript Server
RUN  TypeScript Shared
RUN  Test Frontend
RUN  Test Server
PASS Lint CSS            0.6s
PASS Test Server         1.6s
PASS TypeScript Shared   3.6s
PASS TypeScript Server   3.8s
PASS Test Frontend       3.9s
PASS TypeScript Frontend 5.0s
PASS Lint JS             5.5s

Here it informs the agent that everything is passing as intended, and how long the check took. 5.5s for the longest task means that everything works quickly in an agentic loop.
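The aligned result lines can be produced with very little code. This is a sketch under assumptions: the `CheckResult` shape and `formatResults` name are hypothetical, and the column layout is inferred from the log above rather than taken from the real script.

```typescript
// Hypothetical result shape; the names are assumptions for illustration.
interface CheckResult {
  label: string;
  ok: boolean;
  ms: number; // Wall-clock duration in milliseconds
}

function formatResults(results: CheckResult[]): string[] {
  // Pad every label to the longest one so the durations line up in a column.
  const width = Math.max(0, ...results.map((r) => r.label.length));
  return [...results]
    .sort((a, b) => a.ms - b.ms) // Shortest duration first, like the log above.
    .map((r) => {
      const status = r.ok ? "PASS" : "FAIL";
      const seconds = (r.ms / 1000).toFixed(1);
      return `${status} ${r.label.padEnd(width)} ${seconds}s`;
    });
}
```

Fixed-width columns keep the output easy to scan for a human while staying plain text that an agent can read line by line.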

How this affects the agentic loop

I found that with this technique the agent just started getting things right without me having to correct it all of the time. Instead, it would notice every time there was a formatting error, and only run task lint-fix once it was done solving the bigger problems. I found I could get more reliable behavior.

The principles for the task check command are:

Fast enough to run constantly.
Narrow enough that failures are relevant.
Real enough that passing means something.
Quiet enough that an agent can use the output.
Pleasant enough that I will still run it.

To accomplish “fast enough” in my project, I parallelized everything. I care a lot about performant code, and my tests were already pretty fast to begin with. For this project, task check actually runs all of the CI checks. I could see that in larger projects you would need to make task check smart enough to only run the checks that are relevant. For my use case, my tests were fast enough to just run everything in parallel on my beefy dev machine.
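The payoff of running everything in parallel is that the total time tracks the slowest check rather than the sum of all of them. A small sketch of that idea, using timers as stand-ins for real checks; `fakeCheck` and `runInParallel` are hypothetical names, not part of the actual project.

```typescript
// Hypothetical stand-in for a real check: resolves after `ms` milliseconds.
function fakeCheck(label: string, ms: number): () => Promise<string> {
  return () => new Promise((resolve) => setTimeout(() => resolve(label), ms));
}

// Start every check at once and wait for all of them to finish.
async function runInParallel(
  checks: Array<() => Promise<string>>,
): Promise<{ labels: string[]; elapsedMs: number }> {
  const start = Date.now();
  const labels = await Promise.all(checks.map((check) => check()));
  return { labels, elapsedMs: Date.now() - start };
}
```

With stand-in checks of 100, 200, and 300 ms, the elapsed time should land near 300 ms instead of the 600 ms a sequential run would need.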

To make the failure path even faster for agent flows, I kill all of the remaining tasks as soon as one fails. Agents can just fix one failure at a time, so you can order checks by importance.
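A sketch of the fail-fast idea, assuming Node's `child_process.spawn` with an `AbortSignal` to kill the checks that are still running. The function names and the `Outcome` shape are hypothetical; the `task --silent --exit-code` command line is borrowed from the log output earlier in the article. The real `bin/check.mts` may do this differently.

```typescript
import { spawn } from "node:child_process";

interface Check { task: string; label: string }
// `code` is null when the check was killed before it could finish.
interface Outcome { check: Check; code: number | null; output: string }
type Command = [file: string, args: string[]];

// Mirrors the `cmd:` line shown in the failure log.
const taskCommand = (check: Check): Command => [
  "task",
  ["--silent", "--exit-code", check.task],
];

function runCheck(check: Check, [file, args]: Command, signal: AbortSignal): Promise<Outcome> {
  return new Promise((resolve) => {
    const child = spawn(file, args, { signal });
    let output = "";
    child.stdout.on("data", (chunk) => (output += chunk));
    child.stderr.on("data", (chunk) => (output += chunk));
    // Aborting the signal kills the child and emits "error"; report that as
    // code null. Any other error (such as a missing binary) counts as a failure.
    child.on("error", (error) =>
      resolve({ check, code: signal.aborted ? null : 1, output: output || String(error) }),
    );
    child.on("close", (code) => resolve({ check, code, output }));
  });
}

async function runUntilFirstFailure(
  checks: Check[],
  toCommand: (check: Check) => Command = taskCommand,
): Promise<Outcome | null> {
  const controller = new AbortController();
  const failures: Outcome[] = [];
  await Promise.all(
    checks.map(async (check) => {
      const outcome = await runCheck(check, toCommand(check), controller.signal);
      if (outcome.code !== null && outcome.code !== 0) {
        failures.push(outcome);
        controller.abort(); // Stop every check that is still running.
      }
    }),
  );
  return failures[0] ?? null;
}
```

Sharing one `AbortController` across all of the checks means a single `abort()` call stops everything in flight, so the agent sees one failure quickly instead of waiting for the slowest check.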

What this all looked like for me

An example Taskfile.yml shape:

check:
  desc: Run all checks in a blazing fast configuration with only failures reported.
  silent: true
  cmds:
    - node bin/check.mts

lint-js:
  desc: Lint the JavaScript files.
  cmds:
    - eslint "bin/**/*.{ts,mts}" "scripts/**/*.js"

format:
  desc: Check Prettier formatting.
  cmds:
    - prettier --check "bin/**/*.{ts,mts}" "scripts/**/*.js"

And then the check orchestrator lists the tasks in order.

const checks = [
  { task: 'lint-js', label: 'Lint JS' },
  { task: 'lint-css', label: 'Lint CSS' },
  { task: 'ts-frontend', label: 'TypeScript Frontend' },
  { task: 'ts-server', label: 'TypeScript Server' },
  { task: 'ts-shared', label: 'TypeScript Shared' },
  { task: 'test-frontend', label: 'Test Frontend' },
  { task: 'test-server', label: 'Test Server' },
];

Finally, I keep my AGENTS.md pretty short, because of the principle of “Patterns Are Stronger Than Prompts” for codegen. The agent will figure it out contextually and interactively.

Read all of README.md. I optimize for human usage over robot.

For all code written, run `task check` for quick agentic-focused testing that limits stdout.

This feedback loop has made me feel pretty productive without feeling sloppy, so that I can read the generated code and ensure it’s high quality for the task at hand.