AIDLC · The AI Development Life Cycle

A lifecycle for building software the agentic way.

AIDLC is the method I run on every engagement. Eight phases that take an idea from a framed problem to an operated, eval-guarded system. Agents do the heavy lifting at each step. Senior engineering keeps them inside the boundary.


Operating principles

What holds the lifecycle together?

Four principles run underneath all eight phases. They are what keep agentic speed from turning into agentic chaos.

01 / Spec first

Specs are the source of truth

Agents are only as good as what you point them at. Every phase produces an artifact a coding agent can act on without guessing. Ambiguity is resolved before generation, not during review.

02 / Evals as the gate

Evals guard every change

Velocity without a safety net is just faster regressions. Golden datasets and LLM-as-judge suites run on every diff, so the speed agents bring never trades against correctness.

03 / Human in the loop

Senior review on every diff

Agents generate; a senior engineer is accountable. The boundary is defined by people, enforced by the harness, and never left to the model to police on its own.

04 / Production from day one

Observability is not optional

Traces, costs, and the success metric sit on a dashboard from the first slice. If you cannot watch it, it does not ship. Handover later is a config change, not a rewrite.

The shape of the work

What does the AIDLC pipeline look like?

The first six phases run once to take a framed problem into a hardened build. The last two run as a loop, so every change in production passes back through evals before it ships.

First build · runs once

01 Frame
02 Spec
03 Scaffold
04 Generate
05 Eval
06 Harden

Production loop · every change

07 Ship
08 Operate
Back through Eval before it ships

The eight phases

How does AIDLC move from idea to operated system?

Each phase has one job and one set of outputs. Phases run in sequence on a first build and as a loop in production. No phase ships without the one before it.

Phase 01 of 8

Frame

Pin down the problem before a line of code exists. We agree on the user, the constraint, and the single metric that proves the system is earning its keep. The output is a one-page scope, not a proposal deck.

One-page problem scope
Success metric agreed in writing
Constraints and non-goals

Phase 02 of 8

Spec

Turn the frame into an executable specification. Agent roles, tool boundaries, data contracts, and acceptance criteria are written so that both a human and a coding agent can act on them without guessing.

Executable spec
Agent role and tool inventory
Acceptance criteria
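
As a rough illustration of what "executable" can mean here, the sketch below models a spec as typed data: a role, a tool boundary, and acceptance criteria that are plain checks on a result. Every name in it (AgentSpec, AcceptanceCriterion, the refund-triage role, the tool names) is hypothetical and chosen for the example; it is not the format AIDLC itself prescribes.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class AcceptanceCriterion:
    """One checkable statement about an agent's result (hypothetical shape)."""
    name: str
    check: Callable[[dict], bool]  # receives the agent's result as a dict


@dataclass(frozen=True)
class AgentSpec:
    """Illustrative spec record: role, tool boundary, acceptance criteria."""
    role: str
    allowed_tools: frozenset[str]
    criteria: tuple[AcceptanceCriterion, ...] = field(default_factory=tuple)

    def tool_allowed(self, tool: str) -> bool:
        # The tool boundary is data, so a harness can enforce it mechanically.
        return tool in self.allowed_tools

    def failed_criteria(self, result: dict) -> list[str]:
        # Names of the criteria this result does not satisfy, in spec order.
        return [c.name for c in self.criteria if not c.check(result)]


# Example spec for an assumed refund-triage agent.
refund_spec = AgentSpec(
    role="refund-triage",
    allowed_tools=frozenset({"lookup_order", "issue_refund"}),
    criteria=(
        AcceptanceCriterion(
            "has_decision", lambda r: r.get("decision") in {"approve", "deny"}
        ),
        AcceptanceCriterion("cites_order", lambda r: bool(r.get("order_id"))),
    ),
)
```

Because the criteria are callables rather than prose, the same object can be read by a reviewer and run by the eval harness.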

Phase 03 of 8

Scaffold

Stand up the skeleton. Repo conventions, type contracts, the eval harness, and the smallest runnable surface. Agents work best inside a structure that already enforces the rules, so we build that structure first.

Typed scaffold and conventions
Eval harness wired in
First runnable slice

Phase 04 of 8

Generate

Agents write the bulk of the code against the spec, supervised by senior review on every diff. Two-week vertical slices, demoable each Friday. Velocity comes from the agents; correctness comes from the harness and the reviewer.

Working vertical slices
Reviewed diffs
Friday demos

Phase 05 of 8

Eval

Behaviour is guarded by golden datasets and LLM-as-judge suites that run on every change. We test the agent's decisions, not just its output, so regressions surface before they reach users.

Golden datasets
Regression-gated CI
Trace-driven backlog
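
A minimal sketch of a golden-dataset gate, under stated assumptions: the agent is modelled as a function returning the tool it chose plus its answer, and scoring is exact match. A real suite would likely swap the exact-match comparison for an LLM-as-judge call, which is not shown. GoldenCase, run_gate, and toy_agent are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GoldenCase:
    prompt: str
    expected_tool: str    # the decision we expect the agent to make
    expected_answer: str  # the final output we expect


# Assumption: an agent is a function returning (tool_chosen, answer).
Agent = Callable[[str], tuple[str, str]]


def run_gate(
    agent: Agent, cases: list[GoldenCase], min_pass_rate: float = 1.0
) -> tuple[bool, list[str]]:
    """Return (gate_passed, prompts_that_failed) for a golden dataset."""
    failures = []
    for case in cases:
        tool, answer = agent(case.prompt)
        # Check the decision (which tool) as well as the output.
        if tool != case.expected_tool or answer != case.expected_answer:
            failures.append(case.prompt)
    # An empty dataset proves nothing, so it does not pass the gate.
    pass_rate = 1 - len(failures) / len(cases) if cases else 0.0
    return pass_rate >= min_pass_rate, failures


GOLDEN = [
    GoldenCase("Where is order 42?", "lookup_order", "shipped"),
    GoldenCase("Refund order 42", "issue_refund", "refunded"),
]


def toy_agent(prompt: str) -> tuple[str, str]:
    # Stand-in for a real agent call; it always looks the order up.
    return ("lookup_order", "shipped")


passed, failed = run_gate(toy_agent, GOLDEN)
```

In CI, a failed gate would map to a non-zero exit code so the diff cannot merge.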

Phase 06 of 8

Harden

Close the gaps that only appear under real load. Prompt-injection defenses, PII redaction, rate limits, fallbacks, and audit logs. Private LLM deployment when data residency, PDPL, or DIFC compliance demands it.

Guardrails and redaction
Audit logging
Residency-compliant deployment
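
To make the redaction item concrete, here is a deliberately simple sketch that masks e-mail addresses and phone-like digit runs before text reaches a model or a log. The two patterns are rough heuristics assumed for illustration; they will miss some PII and can over-match other long digit runs, so a production guardrail would need a vetted detector.

```python
import re

# Heuristic patterns, assumed for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def redact(text: str) -> str:
    """Replace e-mail addresses and phone-like digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


sample = redact("Mail sara@example.com or call +971 50 123 4567.")
```

Running the redactor at the boundary, before prompts and traces are stored, keeps the audit log useful without the log itself holding raw PII.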

Phase 07 of 8

Ship

Release behind a flag, instrument the loop, and roll out by cohort. Costs land on a dashboard from day one. Nothing goes to production that the eval suite and the observability surface cannot watch.

Flagged rollout
Cost and latency dashboards
Production observability
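
One common way to roll out by cohort is to hash each user into a stable bucket and compare it with the current rollout percentage. The sketch below assumes that approach; the function names and the flag name are hypothetical, and a hosted feature-flag service would replace this in practice.

```python
import hashlib


def bucket(user_id: str, flag: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given flag."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(user_id: str, flag: str, rollout_percent: int) -> bool:
    """True if the user falls inside the current rollout percentage."""
    return bucket(user_id, flag) < rollout_percent


# Example: which of these users see the flag at a 10% rollout?
early_cohort = [
    u for u in ("u1", "u2", "u3") if is_enabled(u, "new-agent", 10)
]
```

Because the bucket depends only on the flag and the user ID, raising the percentage only ever adds users; nobody who already has the feature loses it mid-rollout.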

Phase 08 of 8

Operate

The system improves in production. Evals run on every change, traces feed the backlog, and the metric stays on a dashboard. You take handover with runbooks, or keep us on a monthly retainer if you want us to keep operating it.

Runbooks and handover
Continuous eval loop
Operating cadence
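
"Traces feed the backlog" can be as simple as counting why runs failed and working on the most frequent cause first. The sketch below assumes each trace is a dict with an `ok` flag and an optional `failure_reason` label; both field names are hypothetical and would depend on the tracing tool in use.

```python
from collections import Counter


def backlog_from_traces(traces: list[dict], top_n: int = 3) -> list[tuple[str, int]]:
    """Rank failure reasons by how often they appear in failed traces."""
    reasons = Counter(
        t.get("failure_reason", "unknown")
        for t in traces
        if not t.get("ok", False)  # traces without an ok flag count as failed
    )
    return reasons.most_common(top_n)


traces = [
    {"ok": True},
    {"ok": False, "failure_reason": "wrong_tool"},
    {"ok": False, "failure_reason": "timeout"},
    {"ok": False, "failure_reason": "wrong_tool"},
    {"ok": False},
]
backlog = backlog_from_traces(traces)
```

Each top-ranked reason is a candidate for a new golden case, which closes the loop back into Eval.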

Read the method in depth

The phases, the principles, and how AIDLC changes every role on the team.

The thinking behind the phases

Each post goes deep on one part of running AIDLC in production.

How each role changed

AIDLC reshapes every job in the software lifecycle. One post per role.

Common questions

What teams ask about the method

Answers written for operators and engineers evaluating how AIDLC actually runs.

What is AIDLC?

AIDLC is the AI Development Life Cycle, an eight-phase method for building software where agents do the heavy lifting and senior engineering keeps them inside a defined boundary. The phases are Frame, Spec, Scaffold, Generate, Eval, Harden, Ship, and Operate. It runs in sequence on a first build and as a loop in production.

How is AIDLC different from the traditional SDLC?

The traditional SDLC assumes humans write most of the code. AIDLC assumes agents do, so it front-loads the work that makes agents effective: a precise spec, a scaffold that enforces conventions, and an eval harness that catches regressions on every diff. The phases that look familiar, like Ship and Operate, carry extra weight because an autonomous system needs the observability that humans would otherwise provide implicitly.

Do agents really write most of the code?

On a well-specified slice, yes, the bulk of generation is agent-driven. But every diff goes through senior review, and nothing merges without passing the eval suite. The speed comes from the agents while the accountability stays with a person. That split is the whole point of the lifecycle.

Where do evals fit, and why so early?

The eval harness is wired in during Scaffold, before any feature code exists, so it can gate every change from the first slice onward. Golden datasets and LLM-as-judge suites test the agent's decisions, not just its output. Catching a behavioural regression in CI is cheap; catching it from a user is not.

Can AIDLC run on a private or on-premise LLM?

Yes. The Harden phase covers private LLM deployment for cases where data residency, PDPL, or DIFC compliance rules out hosted models. The lifecycle is model-agnostic and runs on Claude, GPT, Gemini, or open-weights models like Llama and Mistral, so the deployment target is a choice made on constraints, not a rewrite.

How long does a first pass through AIDLC take?

Frame and Spec take roughly a week. The first demoable slice ships inside two weeks of Generate. After that, build runs in two-week vertical slices with a Friday demo. You decide whether to extend after each slice. There is no annual lock-in.

What happens after the build is done?

The Operate phase keeps the system improving in production. Evals run on every change, traces feed the backlog, and the success metric stays on a dashboard. You can take handover with runbooks and eval suites, or keep us on a monthly retainer to operate it. Both use the same observability surface, so switching later is a config change.

Work with the studio

Bring the method to your next build.

Send a one-paragraph brief on what you want built. You will get a written reply within twenty-four hours, with an honest read on how AIDLC would run on it.
