The Code Quality Question Nobody Is Asking Properly

I had a conversation recently with a founder running a tech consultancy. He was worried about code quality. Specifically, what happens to standards when your developers are guiding AI rather than writing code themselves.

His concern was legitimate. But I'm not sure it was the right question.

The DORA reframe

I asked him about the DORA metrics (DevOps Research and Assessment). If the code works, and when it fails you can recover quickly, does the underlying quality actually matter?

He hadn't thought about it that way.

DORA metrics were built around outcomes, not craft. They don't care whether the code is elegant. They care whether the system is reliable and whether you can respond when it isn't.

If AI-generated code fails and you can detect it, fix it, and deploy in minutes, the argument for traditional quality standards gets harder to make. Not impossible. But harder.

That pause in the conversation told me something. A founder running a technology business, someone who thinks seriously about delivery and standards, hadn't been asked this yet. That's not a criticism. It's an observation about where most teams still are.

The standards worth keeping

Some quality gates exist because the reasons behind them haven't changed.

Journey tests catch failures before users do. Linting removes ambiguity and enforces consistency. Observability tells you something is wrong before a user has to. Security practices protect data that was never yours to lose.

These aren't craft standards. They're outcome standards. They exist to protect the people using the thing you built, and the business depending on it. AI writing the code doesn't change any of that. If anything, it raises the bar. AI-generated code isn't self-documenting in the way a thoughtful developer would make it. Tests become the spec. Observability becomes the memory.
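"Tests become the spec" can be sketched concretely. Here a journey-style test pins the observable outcomes of a hypothetical, in-memory signup flow; how the code underneath implements them is deliberately irrelevant, which is exactly why this kind of gate survives AI-generated code:

```python
# Hypothetical in-memory signup service, standing in for AI-generated code.
class SignupService:
    def __init__(self) -> None:
        self._users: dict[str, str] = {}

    def register(self, email: str, password: str) -> bool:
        # Reject duplicate emails and short passwords.
        if email in self._users or len(password) < 8:
            return False
        self._users[email] = password
        return True

    def can_log_in(self, email: str, password: str) -> bool:
        return self._users.get(email) == password

# The journey test is the spec: it asserts outcomes, not implementation details.
def test_signup_journey() -> None:
    svc = SignupService()
    assert svc.register("a@example.com", "s3cretpass")      # new user signs up
    assert svc.can_log_in("a@example.com", "s3cretpass")    # and can log in
    assert not svc.register("a@example.com", "otherpass1")  # duplicates rejected
    assert not svc.can_log_in("a@example.com", "wrongpass") # bad password fails

test_signup_journey()
print("journey passed")
```

If an AI rewrites the service internals tomorrow, the test still defines what correct looks like. That is the sense in which outcome gates carry more weight, not less, when the code underneath stops being hand-written.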

These standards stay.

The standards worth questioning

Others are harder to justify now.

Idiomatic patterns. Elegant abstractions. The instinct to refactor something that works because a human developer would have written it differently. These were proxies. Ways of reading a codebase to infer whether the people who built it understood what they were doing.

Those proxies made sense when reading the code was how you understood the system. That assumption is shifting.

The developer role is moving from writing to guiding. From managing dependencies to managing context. Knowing what to ask, how to constrain the output, when to trust it and when to push back. That is a different skill. It is not a lesser one. But it is not the same as the craft we spent years valuing.

The uncomfortable part

A lot of the resistance to AI-generated code isn't really about quality. It's about identity.

What it means to be good at this. How competence gets recognised. Who gets to set the standard and on what basis.

That is a harder conversation than any metric. And most teams, most consultancies, most engineering cultures haven't had it yet.

The founder I spoke to hadn't been constructively challenged on this before. Once the frame shifted, the conversation changed completely. Not because the concern about quality was wrong, but because it was pointed at the wrong thing.

Keep the gates that protect outcomes. Question the ones that protect familiarity.

Be honest about which is which. That distinction is where the real work is right now, and most teams are not making it deliberately.

The code underneath is changing. The reasons to care about reliability, security, and recoverability are not.

That is not a reason to lower standards. It is a reason to be precise about which standards still earn their place.