I have been making a quiet assumption in my work for the last 18 months. That AI adoption is a tooling problem. Use better tools, get better results. I was wrong.
The teams I have watched struggle with AI — and there are more of them than the LinkedIn highlight reel suggests — were not struggling because they had bad tools. They were struggling because they adopted AI into processes that were never designed for it.
Three patterns kept appearing:
AI coding agents deployed into requirements workflows that never defined what “done” means for a non-human implementer. A human developer reads an ambiguous requirement and asks a question. An agent reads an ambiguous requirement and makes a decision — silently, confidently, at 10× the speed.
AI testing deployed into codebases where nobody owns what “correct output” means for an LLM response. Hallucination rates that seem acceptable in isolation accumulate invisibly over weeks — until they surface in production, or in a customer demo, or in a compliance review.
AI agents with write permissions deployed into systems where the monitoring stack was built entirely for deterministic failures. An agent can introduce degraded behaviour that passes every check — because the checks were designed to catch crashes, not drift.
The tools worked fine. The systems around them were not ready.
This is the insight that changes how you think about AI adoption. The failure is almost never in the model. It is in the interface between the model and the human processes surrounding it: the requirements it receives, the tests that evaluate it, the monitoring that watches it, the deployment process that ships it. Get those interfaces right and AI performs. Leave them unchanged and AI amplifies exactly the problems that were already there — but faster, and with more confidence.
The SDLC as a Process for Two Kinds of Consumers
The software development lifecycle was designed for human consumers of its outputs. Requirements written for human developers. Tests evaluated by human judgement. Deployments monitored for human-recognisable failure modes.
AI agents are a different kind of consumer. They need different inputs, produce different outputs, and fail in different ways. The SDLC phases below are not broken. They need to be extended — not to accommodate AI as a novelty, but because AI is now a first-class participant in each phase.
Here is where I have seen things go wrong, one phase at a time:
“AI agents built exactly what was specified. It didn’t work. That’s an analysis problem.”
“We gave an agent a vague domain. Three months later we found it in the wrong database.”
“I almost shipped a race condition I didn’t write. How many am I missing?”
“4% hallucination rate. Six weeks. Zero alerts. The number isn’t the problem — not knowing is.”
“App Store review took 6 days. We’d designed for simultaneous release. We learned.”
“Textbook deployment. Zero errors. Users got wrong answers for four days. All dashboards green.”
“I pulled LLM traces out of curiosity. Three months of quiet degradation. No one had noticed.”
“‘Show us what your AI did.’ We couldn’t. We didn’t win the deal.”
“Two sentences added to a prompt on a Friday. Four days of degraded behaviour. No rollback path.”
None of these failures required the AI to malfunction. In every case, the model did exactly what it was asked to do — correctly, quickly, and with confidence. The problem was in what it was asked to do, in the absence of the right guard rails, in the lack of visibility into what it was doing, and in the failure to design for how it changes over time.
Each phase of the SDLC has a specific failure mode when AI is introduced without adapting the process. The next articles in this series go deeper — one phase at a time, with concrete changes that actually move the needle, not frameworks that sound good in a diagram.
◉ // Zoom out · the structural companion These nine phases are where adoption breaks. 8 Forces Reshaping How Software Gets Built is why — the same territory mapped as eight structural forces, requirements through reliability. The reference companion to this series; Articles 05–08 go deep on each force pair.Some of This Is Slightly Unpopular
The positions in this series are specific. Some of them push against prevailing opinion. That is intentional — not for the sake of being contrarian, but because the prevailing opinion is often shaped by people with something to sell.
- Domain-Driven Design is now an economic necessity, not a philosophy. When AI agents build your microservices, the quality of your domain model determines the quality of your implementation — directly, and at speed. Vague domains produce vague code, generated faster than you can review it.
- Most AI monitoring in production is currently hope masquerading as observability. Infrastructure metrics stay green while LLM output quality silently degrades. The monitoring stack most teams inherited was not built for non-deterministic systems — and most have not rebuilt it.
- The mobile App Store review process is the hardest constraint in modern software delivery, and no amount of AI tooling changes it. A five-track coordinated release — web, mobile, backend, data pipeline, AI/ML — can be AI-generated in hours and then wait five days for App Store review. That asymmetry matters enormously for release design.
I might be wrong on some of it. The honest position in a rapidly changing field is to say something specific and be correctable, rather than say something vague and be agreeable. Vague agreement produces no learning for either of us.
- You are an engineer asking where AI actually fits in your real workflow — not the demo workflow
- You are a tech lead deciding which process changes are worth the disruption
- You are an architect designing systems that serve humans and AI agents with equal reliability
- You are curious about the specific failure modes, not the general promise
- You are looking for validation that everything is going great with your AI adoption
- You want tool recommendations without the surrounding process context
- You are selling something AI-adjacent and want a co-signer for the pitch
- You are looking for hype — this is the honest version, and the honest version is more complicated
The teams winning with AI are not the ones with better models. They are the ones that designed their processes for how AI actually works — its context limits, its confidence without certainty, its inability to ask a clarifying question. That design is where the real work is.