A lifecycle for building software the agentic way.
AIDLC is the method I run on every engagement. Eight phases that take an idea from a framed problem to an operated, eval-guarded system. Agents do the heavy lifting at each step. Senior engineering keeps them inside the boundary.
Four principles run underneath all eight phases. They are what keep agentic speed from turning into agentic chaos.
01 / Spec first
Specs are the source of truth
Agents are only as good as what you point them at. Every phase produces an artifact a coding agent can act on without guessing. Ambiguity is resolved before generation, not during review.
02 / Evals as the gate
Evals guard every change
Velocity without a safety net is just faster regressions. Golden datasets and LLM-as-judge suites run on every diff, so the speed agents bring never trades against correctness.
03 / Human in the loop
Senior review on every diff
Agents generate; a senior engineer is accountable. The boundary is defined by people, enforced by the harness, and never left to the model to police on its own.
04 / Production from day one
Observability is not optional
Traces, costs, and the success metric sit on a dashboard from the first slice. If you cannot watch it, it does not ship. Handover later is a config change, not a rewrite.
The shape of the work
What does the AIDLC pipeline look like?
The first six phases run once to take a framed problem into a hardened build. The last two run as a loop, so every change in production passes back through evals before it ships.
AIDLC pipeline. Frame, Spec, Scaffold, Generate, Eval, and Harden run in sequence on a first build. Ship and Operate run as a continuous loop in production, feeding changes back through evals before they ship.
Each phase has one job and one set of outputs. Phases run in sequence on a first build and as a loop in production. No phase ships without the one before it.
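The ordering rule can be sketched in a few lines of Python. This is an illustration only: the Phase enum and the next_phase and can_start helpers are hypothetical names, and modelling the production loop as Operate handing back to Ship is an assumption about how the loop is wired.

```python
from enum import IntEnum


class Phase(IntEnum):
    FRAME = 1
    SPEC = 2
    SCAFFOLD = 3
    GENERATE = 4
    EVAL = 5
    HARDEN = 6
    SHIP = 7
    OPERATE = 8


def next_phase(current: Phase) -> Phase:
    """Return the phase that follows `current`.

    Frame through Harden advance linearly on a first build.
    Ship and Operate alternate as the production loop (an assumption).
    """
    if current is Phase.OPERATE:
        return Phase.SHIP  # production loop: Operate feeds back into Ship
    return Phase(current + 1)


def can_start(phase: Phase, completed: set[Phase]) -> bool:
    """A phase may start only once every earlier phase has completed."""
    return all(Phase(p) in completed for p in range(1, phase))
```

With only Frame and Spec completed, can_start(Phase.GENERATE, ...) returns False, which is the "no phase ships without the one before it" rule expressed as a check.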
Phase 01 of 8
Frame
Pin down the problem before a line of code exists. We agree on the user, the constraint, and the single metric that proves the system is earning its keep. The output is a one-page scope, not a proposal deck.
One-page problem scope
Success metric agreed in writing
Constraints and non-goals
Phase 02 of 8
Spec
Turn the frame into an executable specification. Agent roles, tool boundaries, data contracts, and acceptance criteria written so both a human and a coding agent can act on them without guessing.
Executable spec
Agent role and tool inventory
Acceptance criteria
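One way to picture an executable spec is as typed data with a check for unresolved ambiguity. The sketch below is an assumption about shape, not the format used on engagements: Spec, AgentRole, AcceptanceCriterion, and gaps are hypothetical names, and the rules inside gaps are illustrative.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AcceptanceCriterion:
    id: str
    statement: str  # what must be true, in plain language
    check: str      # how it is verified, e.g. the name of an eval case


@dataclass(frozen=True)
class AgentRole:
    name: str
    allowed_tools: tuple[str, ...]  # the tool boundary for this role


@dataclass
class Spec:
    problem: str
    success_metric: str
    roles: list[AgentRole] = field(default_factory=list)
    criteria: list[AcceptanceCriterion] = field(default_factory=list)

    def gaps(self) -> list[str]:
        """List the ambiguities to resolve before generation starts."""
        problems = []
        if not self.success_metric.strip():
            problems.append("success metric is empty")
        if not self.roles:
            problems.append("no agent roles defined")
        for role in self.roles:
            if not role.allowed_tools:
                problems.append(f"role '{role.name}' has no tool boundary")
        for criterion in self.criteria:
            if not criterion.check.strip():
                problems.append(f"criterion '{criterion.id}' has no check")
        if not self.criteria:
            problems.append("no acceptance criteria")
        return problems
```

An empty list from gaps() is the signal that a coding agent can act on the spec without guessing. Anything else is a question for a person, answered before generation rather than during review.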
Phase 03 of 8
Scaffold
Stand up the skeleton. Repo conventions, type contracts, the eval harness, and the smallest runnable surface. Agents work best inside a structure that already enforces the rules, so we build that structure first.
Typed scaffold and conventions
Eval harness wired in
First runnable slice
Phase 04 of 8
Generate
Agents write the bulk of the code against the spec, supervised by senior review on every diff. Two-week vertical slices, demoable each Friday. Velocity comes from the agents; correctness comes from the harness and the reviewer.
Working vertical slices
Reviewed diffs
Friday demos
Phase 05 of 8
Eval
Behaviour is guarded by golden datasets and LLM-as-judge suites that run on every change. We test the agent's decisions, not just its output, so regressions surface before they reach users.
Golden datasets
Regression-gated CI
Trace-driven backlog
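A regression gate over a golden dataset can be sketched as below. This is a minimal illustration under stated assumptions: GoldenCase and run_gate are hypothetical names, the agent is any callable that maps a prompt to a decision label, and exact-match comparison stands in for an LLM-as-judge scorer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GoldenCase:
    case_id: str
    prompt: str
    expected_decision: str  # the decision the agent should make, e.g. a tool name


def run_gate(
    agent: Callable[[str], str],
    golden: list[GoldenCase],
    baseline_pass_rate: float,
) -> tuple[bool, float, list[str]]:
    """Run every golden case and fail the gate if the pass rate drops below baseline."""
    if not golden:
        raise ValueError("golden dataset is empty")
    # Compare the agent's decision, not its free-text output, against the golden label.
    failures = [c.case_id for c in golden if agent(c.prompt) != c.expected_decision]
    pass_rate = 1 - len(failures) / len(golden)
    return pass_rate >= baseline_pass_rate, pass_rate, failures
```

In CI, the first element of the result would decide whether the diff can merge, and the list of failing case ids is the kind of signal that can feed a trace-driven backlog.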
Phase 06 of 8
Harden
Close the gaps that only appear under real load. Prompt-injection defenses, PII redaction, rate limits, fallbacks, and audit logs. Private LLM deployment when residency, PDPL, or DIFC compliance demand it.
Guardrails and redaction
Audit logging
Residency-compliant deployment
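As a rough illustration of the redaction step, the sketch below masks emails and phone-like numbers before text reaches a model or a log. The patterns and the redact helper are assumptions made for the example. Real PII coverage needs far more than two regular expressions, and the counts returned here are only a stand-in for what an audit log entry might record.

```python
import re

# Deliberately simple patterns, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace emails and phone-like numbers with placeholders.

    Returns the redacted text and a count per category for the audit log.
    """
    text, emails = EMAIL.subn("[EMAIL]", text)
    text, phones = PHONE.subn("[PHONE]", text)
    return text, {"email": emails, "phone": phones}
```

Returning counts rather than the matched values keeps the audit trail useful without writing the sensitive data back into it.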
Phase 07 of 8
Ship
Release behind a flag, instrument the loop, and roll out by cohort. Costs land on a dashboard from day one. Nothing goes to production that the eval suite and the observability surface cannot watch.
Flagged rollout
Cost and latency dashboards
Production observability
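Rolling out by cohort behind a flag is often done with deterministic bucketing. The sketch below shows that general technique, not a specific flagging product: in_rollout is a hypothetical helper, and hashing the flag name together with the user id is an assumed design choice.

```python
import hashlib


def in_rollout(flag: str, user_id: str, percent: int) -> bool:
    """Deterministically place a user in or out of a percentage rollout."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Hash flag and user together so each flag gets an independent cohort split.
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    return bucket < percent
```

Because the bucket is stable, raising the percentage only adds users to the cohort. Nobody who already has the feature loses it mid-rollout, which keeps cost and latency dashboards comparable from one stage to the next.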
Phase 08 of 8
Operate
The system improves in production. Evals run on every change, traces feed the backlog, and the metric stays on a dashboard. We hand over to your team with runbooks, or stay on a monthly retainer if you want us to keep operating it.
Runbooks and handover
Continuous eval loop
Operating cadence
Further reading
Read the method in depth
The phases, the principles, and how AIDLC changes every role on the team.
The thinking behind the phases
Each post goes deep on one part of running AIDLC in production.
Answers written for operators and engineers evaluating how AIDLC actually runs.
What is AIDLC?
AIDLC is the AI Development Life Cycle, an eight-phase method for building software where agents do the heavy lifting and senior engineering keeps them inside a defined boundary. The phases are Frame, Spec, Scaffold, Generate, Eval, Harden, Ship, and Operate. It runs in sequence on a first build and as a loop in production.
How is AIDLC different from the traditional SDLC?
The traditional SDLC assumes humans write most of the code. AIDLC assumes agents do, so it front-loads the work that makes agents effective. That means a precise spec, a scaffold that enforces conventions, and an eval harness that catches regressions on every diff. The phases that look familiar, like Ship and Operate, carry extra weight because an autonomous system needs the observability that humans would otherwise provide implicitly.
Do agents really write most of the code?
On a well-specified slice, yes, the bulk of generation is agent-driven. But every diff goes through senior review, and nothing merges without passing the eval suite. The speed comes from the agents while the accountability stays with a person. That split is the whole point of the lifecycle.
Where do evals fit, and why so early?
The eval harness is wired in during Scaffold, before any feature code exists, so it can gate every change from the first slice onward. Golden datasets and LLM-as-judge suites test the agent's decisions, not just its output. Catching a behavioural regression in CI is cheap; catching it from a user is not.
Can AIDLC run on a private or on-premise LLM?
Yes. The Harden phase covers private LLM deployment for cases where data residency, PDPL, or DIFC compliance rule out hosted models. The lifecycle is model-agnostic and runs on Claude, GPT, Gemini, or open-weights models like Llama and Mistral, so the deployment target is a choice made on constraints, not a rewrite.
How long does a first pass through AIDLC take?
Frame and Spec take roughly a week. The first demoable slice ships within the first two weeks of Generate. After that, the build runs in two-week vertical slices with a Friday demo. You decide whether to extend after each slice. There is no annual lock-in.
What happens after the build is done?
The Operate phase keeps the system improving in production. Evals run on every change, traces feed the backlog, and the success metric stays on a dashboard. You can take handover with runbooks and eval suites, or keep us on a monthly retainer to operate it. Both use the same observability surface, so switching later is a config change.
Work with the studio
Bring the method to your next build.
Send a one-paragraph brief on what you want built. You will get a written reply within twenty-four hours, with an honest read on how AIDLC would run on it.