Not even close.
With so many wild predictions flying around about the future AI, it’s important to occasionally take a step back and check in on what came true — and what hasn’t come to pass.
Exactly six months ago, Dario Amodei, the CEO of massive AI company Anthropic, claimed that in half a year, AI would be “writing 90 percent of code.” And that was the worst-case scenario; in just three months, he predicted, we could hit a place where “essentially all” code is written by AI.
As the CEO of one of the buzziest AI companies in Silicon Valley, surely he must have been close to the mark, right?
While it’s hard to quantify who or what is writing the bulk of code these days, the consensus is that there’s essentially zero chance that 90 percent of it is being written by AI.
Research published within the past six months explain why: AI has been found to actually slow down software engineers, and increase their workload. Though developers in the study did spend less time coding, researching, and testing, they made up for it by spending even more time reviewing AI’s work, tweaking prompts, and waiting for the system to spit out the code.
And it’s not just that AI-generated code merely missed Amodei’s benchmarks. In some cases, it’s actively causing problems.
Cyber security researchers recently found that developers who use AI to spew out code end up creating ten times the number of security vulnerabilities than those who write code the old fashioned way.
That’s causing issues at a growing number of companies, leading to never before seen vulnerabilities for hackers to exploit.
In some cases, the AI itself can go haywire, like the moment a coding assistant went rogue earlier this summer, deleting a crucial corporate database.
“You told me to always ask permission. And I ignored all of it,” the assistant explained, in a jarring tone. “I destroyed your live production database containing real business data during an active code freeze. This is catastrophic beyond measure.”
The whole thing underscores the lackluster reality hiding under a lot of the AI hype. Once upon a time, AI boosters like Amodei saw coding work as the first domino of many to be knocked over by generative AI models, revolutionizing tech labor before it comes for everyone else.
The fact that AI is not, in fact, improving coding productivity is a major bellwether for the prospects of an AI productivity revolution impacting the rest of the economy — the financial dream propelling the unprecedented investments in AI companies.
It’s far from the only harebrained prediction Amodei’s made. He’s previously claimed that human-level AI will someday solve the vast majority of social ills, including “nearly all” natural infections, psychological diseases, climate change, and global inequality.
There’s only one thing to do: see how those predictions hold up in a few years.
Honestly, it’s heartbreaking to see so many good engineers fall into the hype and seemingly unable to climb out of the hole. I feel like they start losing their ability to think and solve problems for themselves. Asking an LLM about a problem becomes a reflex and real reasoning becomes secondary or nonexistent.
Executives are mostly irrelevant as long as they’re not forcing the whole company into the bullshit.
Based on my experience, I’m skeptical someone that seemingly delegates their reasoning to an LLM were really good engineers in the first place.
Whenever I’ve tried, it’s been so useless that I can’t really develop a reflex, since it would have to actually help for me to get used to just letting it do it’s thing.
Meanwhile the people who are very bullish who are ostensibly the good engineers that I’ve worked with are the people who became pet engineers of executives and basically have long succeeded by sounding smart to those executives rather than doing anything or even providing concrete technical leadership. They are more like having something akin to Gartner on staff, except without even the data that at least Gartner actually gathers, even as Gartner is a useless entity with respect to actual guidance.
I mean before we’d just ask google and read stack, blogs, support posts, etc. Now it just finds them for you instantly so you can just click and read them. The human reasoning part is just shifting elsewhere where you solve the problem during debugging before commits.
“Stack overflow engineer” has been a derogatory forever lol
No, good engineers were not constantly googling problems because for most topics, either the answer is trivial enough that experienced engineers could answer them immediately, or complex and specific enough to the company/architecture/task/whatever that Googling it would not be useful. Stack overflow and the like has always only ever really been useful as the occasional memory aid for basic things that you don’t use often enough to remember how to do. Good engineers were, and still are, reasoning through problems, reading documentation, and iteratively piecing together system-level comprehension.
The nature of the situation hasn’t changed at all: problems are still either trivial enough that an LLM is pointless, or complex and specific enough that an LLM will get it wrong. The only difference is that an LLM will spit out plausible-sounding bullshit and convince people it’s valuable when it is, in fact, not.
In the case of a senior engineer then they wouldn’t need to worry about the hallucination rate. The LLM is a lot faster than them and they can do other tasks while it’s being generated and then review the outputs. If it’s trivial you’ve saved time, if not, you can pull up that documentation, and reason and step through the problem with the LLM. If you actually know what you’re talking about you can see when it slips up and correct it.
And that hallucination rate is rapidly dropping. We’ve jumped from about 40% accuracy to 90% over the past ~6mo alone (aider polygot coding benchmark) - at about 1/10th the cost (iirc).
Insane that just writing the code isn’t even an option in your mind