TDD is the Only Way to Survive AI Coding Agents
AI agents write code faster than you can review it. If you don't write the tests first, you're just generating legacy code at scale.

Everyone is bragging about how fast AI agents write code. "It built the whole feature in 3 minutes!"
Cool. Now who reviews it? Who maintains it? Who makes sure it doesn't break the feature next to it?
If you use AI coding agents without strict Test-Driven Development (TDD), you aren't a 10x engineer. You're just generating technical debt 10x faster.
Here's what happens when you let an agent write code without tests: the diff looks plausible, you skim it, you merge. Two weeks later, you realize the agent handled the state updates inefficiently, missed three edge cases, and broke the parent component's re-render cycle.
The bottleneck isn't writing code. It's verifying correctness.
Human reviewers can't read 400 lines of agent-generated code and spot logical flaws quickly. The AI doesn't have the context to know what it broke. You need an automated, ruthless reviewer. That reviewer is your test suite.
When working with human engineers, Jira tickets and conversations form the contract. With AI agents, the test is the contract.
In my monorepo, the rules are explicit:
1. Write a failing test that describes the behavior.
2. Tell the agent: "Make this test pass. Do not change the test."
3. Agent writes code.
4. Test passes. Merge.
If the agent writes garbage, the test fails. If the agent hallucinates a library, the test fails. If the agent breaks existing logic, a different test fails.
The AI is the engine. The tests are the steering wheel and the brakes.
Here is how I actually ship features with agents:
Before writing tests, I write the data structures. The types are the boundaries.
```rust
pub struct WorkflowExecution {
    pub id: ExecutionId,
    pub status: ExecutionStatus,
    pub current_node: Option<NodeId>,
}
```
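The snippet above leans on types it doesn't define. For a self-contained version, here is a hypothetical sketch of what `ExecutionId`, `NodeId`, and `ExecutionStatus` might look like (the real definitions aren't shown in this post; newtypes over integer IDs are just one plausible choice):

```rust
// Hypothetical supporting types -- not the post's real definitions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ExecutionId(pub u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeId(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionStatus {
    Running,
    Paused,
    Completed,
    Failed,
}

// Repeated from above so this block compiles on its own.
pub struct WorkflowExecution {
    pub id: ExecutionId,
    pub status: ExecutionStatus,
    pub current_node: Option<NodeId>,
}

fn main() {
    let exec = WorkflowExecution {
        id: ExecutionId(1),
        status: ExecutionStatus::Paused,
        current_node: Some(NodeId(7)),
    };
    assert_eq!(exec.status, ExecutionStatus::Paused);
    assert_eq!(exec.current_node, Some(NodeId(7)));
    println!("status = {:?}", exec.status);
}
```

Defining the types first is the point: they are the vocabulary both the tests and the agent have to speak.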
I write the exact behavior I expect. This takes 10 minutes.
```rust
#[tokio::test]
async fn test_execution_pauses_at_human_gate() {
    let engine = setup_test_engine().await;
    let workflow = create_human_gate_workflow();

    let state = engine.execute(workflow).await;

    assert_eq!(state.status, ExecutionStatus::Paused);
    assert!(state.pending_approval.is_some());
}
```
I give the agent the context and the test: "Implement the execution engine logic to make `test_execution_pauses_at_human_gate` pass."
The agent figures out the implementation details. It handles the boilerplate. It writes the database queries. It does the typing.
I don't care how it writes the code, as long as it adheres to the project's style guidelines and makes the test pass.
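To make that concrete, here is one plausible shape of what an agent might produce for that test, reduced to a synchronous, std-only sketch (the real engine would be async and persist state; every name beyond the test's assertions is an assumption):

```rust
// Simplified, synchronous sketch of engine logic that would satisfy the test.
// The real implementation would be async, persistent, and far richer.
#[derive(Debug, PartialEq)]
enum ExecutionStatus { Running, Paused, Completed }

#[derive(Debug)]
enum Node {
    Task(&'static str),      // runs automatically
    HumanGate(&'static str), // requires human approval
}

struct ExecutionState {
    status: ExecutionStatus,
    pending_approval: Option<&'static str>,
}

fn execute(workflow: &[Node]) -> ExecutionState {
    for node in workflow {
        match node {
            Node::Task(_name) => {
                // Run the task; in the sketch this is a no-op.
            }
            Node::HumanGate(gate) => {
                // Stop and wait for a human -- exactly what the test asserts.
                return ExecutionState {
                    status: ExecutionStatus::Paused,
                    pending_approval: Some(*gate),
                };
            }
        }
    }
    ExecutionState { status: ExecutionStatus::Completed, pending_approval: None }
}

fn main() {
    let workflow = [Node::Task("ingest"), Node::HumanGate("legal-review")];
    let state = execute(&workflow);
    assert_eq!(state.status, ExecutionStatus::Paused);
    assert!(state.pending_approval.is_some());
    println!("paused at {:?}", state.pending_approval.unwrap());
}
```

Whether the agent writes it this way or another way is irrelevant; the test only pins down the observable behavior, which is the whole idea.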
**Confidence over speed.** When an agent hands me 500 lines of code, I don't feel anxiety. I run `cargo test`. If it's green, I know the behavioral contract is fulfilled.

**Agent refactoring.** Agents are terrible at refactoring undocumented code. They are incredible at refactoring fully tested code. "Refactor this module to use the new pattern, ensure all tests stay green."

**No context loss.** When an agent hallucinates, it's usually because it lacked context. A test provides perfect, executable context about what the code is supposed to do.
You can't test everything. Here is what I mandate:
| Change type | Required tests |
|---|---|
| API endpoint | Integration test (request → response) |
| Core logic | Unit test (input → output) |
| UI Component | Render + interaction test |
| Bug fix | Regression test reproducing the bug |
Notice what's not here: testing the AI's reasoning. You don't test the agent. You test the output the agent produces. (For testing the actual AI agents, see Eval Harnesses for AI Agents.)
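For the bug-fix row in the table, a regression test is just a unit test that encodes the exact input that triggered the bug. A hypothetical example (the function and bug are invented for illustration):

```rust
// Hypothetical: suppose empty/whitespace tags once slipped through and
// caused downstream breakage. The fix filters them; the regression test
// pins the triggering input so the bug can't silently return.
fn normalize_tags(tags: &[&str]) -> Vec<String> {
    tags.iter()
        .map(|t| t.trim().to_lowercase())
        .filter(|t| !t.is_empty()) // the original bug: empty tags were kept
        .collect()
}

fn main() {
    // Regression: whitespace-only and empty tags must be dropped, not kept.
    assert_eq!(normalize_tags(&["Rust", " ", ""]), vec!["rust".to_string()]);
    println!("regression test passed");
}
```

The test name and assertion should reference the bug, so the suite doubles as a changelog of everything that has ever gone wrong.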
We are moving from writing code to specifying behavior.
Writing tests is writing the specification in a language the computer can verify. The AI translates that specification into implementation.
If you skip the specification phase, you are just letting a junior developer with amnesia guess what you want.
TDD used to be a best practice for human teams. With AI agents, it's an absolute survival requirement. If you don't write the tests, the AI writes the legacy code.