
Solving the Engineering Productivity Paradox


Tariq Shaukat

CEO


  • Code Quality
  • Code Security
  • AI

"Today, more than a quarter of all new code at Google is generated by AI, then reviewed and accepted by engineers. This helps our engineers do more and move faster.”


That’s what Sundar Pichai, CEO of Alphabet, said on the company’s Q3 2024 earnings call. And on the most recent call, he updated that number to “well over 30% now.”


But here's where things get interesting. On the Lex Fridman podcast this month, Sundar clarified those comments, saying:


“Looking at Google, we’ve given various stats around 30% of code now uses AI-generated suggestions or whatever. But the most important metric, and we measure it carefully, is how much has our engineering velocity increased as a company due to AI, right? It’s tough to measure, and we really try to measure it rigorously, and our estimates are that number is now at 10%.”


This has taken a lot of people by surprise. Casual observers assumed that 30% of code being AI-generated would translate into roughly a 30% increase in engineering productivity, so why only 10%?


I think the key point is right there in the original statement: “then reviewed and accepted by engineers.”


Sure, code is being written by AI, and it's being generated more quickly. But just like code written by a developer, that AI-generated code has to be scrutinized, verified, and fixed. We need to make sure it doesn't have any security issues, and crucially, that it's also reliable, maintainable, and understandable.

One of my favorite classes in graduate school was System Dynamics, taught by Professor John Sterman. Many of you are probably familiar with the concepts from the book "Thinking in Systems" by Donella Meadows. Systems thinking has been a foundational part of how I approach things throughout my professional life.

My graduate research, and my first job, focused on improving overall factory productivity using an approach we ended up calling “flow balancing.” Companies were spending a lot of time fixing specific stages of the car assembly process, but overall productivity wasn’t changing. When you optimize one step of a process, you often create side effects or bottlenecks somewhere else that largely cancel out the benefit. Flow balancing optimized the end-to-end system of the factory, not just stage by stage.


History is repeating itself in software development. There's a huge focus on speeding up code production using tools like GitHub Copilot, Cursor, and others. And the results are honestly stunning, just like Sundar mentioned in his earnings call.


But, and this is a big "but," bottlenecks are popping up elsewhere. Issues are appearing in production, where they are far more expensive and time-consuming to fix. According to Harness, almost 60% of developers report experiencing problems with deployments at least half the time when using AI coding tools. In companies that let issues slip through the cracks until the code is shipped, I wouldn’t be surprised to see net productivity actually decrease.


Increasingly, the bottleneck is the code review phase. And that's actually how it should be: AI-generated code absolutely must be reviewed before it's merged into your codebase, and certainly before it's deployed. Google has always had a strong code review culture, backed by mature tools and processes, which is likely why it hasn't seen a spike in issues from all that AI-generated code.


Many companies, however, don't have a sufficient culture, tooling, or process in place for code review, and those companies are taking a big risk. Company leaders need to create a culture of high-quality code and thorough code review, reinforcing accountability at both the developer and the team level. But companies also need to provide the right tools to make this manageable: the speed, complexity, and sheer volume of AI-generated code are all increasing rapidly.


That's where platforms like SonarQube come into play. Automated code assessment identifies and prioritizes potential issues so that developers can focus their time on the real problems. Companies that are doing this well take all the AI-generated code that gets accepted and analyze it with SonarQube to give their developers a boost.
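To make that concrete, here is a minimal sketch of what this can look like in practice, assuming a standard SonarQube server and scanner setup. The project key, source path, and server URL below are placeholders for illustration, not a reference configuration:

    # sonar-project.properties — illustrative only; values are placeholders
    sonar.projectKey=my-service
    sonar.sources=src
    sonar.host.url=https://sonarqube.example.com
    # Fail the analysis step when the quality gate fails, so flagged
    # issues block the merge instead of slipping into production
    sonar.qualitygate.wait=true

Running sonar-scanner against a configuration like this in CI (with credentials supplied via the SONAR_TOKEN environment variable) turns the review bottleneck into an automated checkpoint: every accepted piece of AI-generated code is analyzed before it can ship.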


Culture and tooling are both critical, but so is process. Companies need to define and enforce standards for their AI-generated code (honestly, this should be done for all code, as a best practice). I’ve written about this before in “The Seven Habits of Highly Effective AI Coding.” SonarQube’s AI Code Assurance capability helps you define and enforce the gates and checkpoints, ensuring all your teams are meeting the established standards, and giving company leaders, corporate boards, and regulators confidence that AI risks are being managed.


AI has massive potential for improving the productivity of the software development lifecycle. Just remember to think about the whole system, measure true end-to-end performance, and avoid creating new, and potentially riskier, bottlenecks. Vibe, then Verify.
