Not even close.

With so many wild predictions flying around about the future of AI, it’s important to occasionally take a step back and check in on what came true, and what hasn’t come to pass.

Exactly six months ago, Dario Amodei, the CEO of massive AI company Anthropic, claimed that in half a year, AI would be “writing 90 percent of code.” And that was the worst-case scenario; in just three months, he predicted, we could hit a place where “essentially all” code is written by AI.

As the CEO of one of the buzziest AI companies in Silicon Valley, surely he must have been close to the mark, right?

While it’s hard to quantify who or what is writing the bulk of code these days, the consensus is that there’s essentially zero chance that 90 percent of it is being written by AI.

Research published within the past six months explains why: AI has been found to actually slow down software engineers and increase their workload. Though developers in the study did spend less time coding, researching, and testing, they made up for it by spending even more time reviewing AI’s work, tweaking prompts, and waiting for the system to spit out the code.

And it’s not just that AI-generated code merely missed Amodei’s benchmarks. In some cases, it’s actively causing problems.

Cybersecurity researchers recently found that developers who use AI to spew out code end up creating ten times as many security vulnerabilities as those who write code the old-fashioned way.

That’s causing issues at a growing number of companies, leading to never-before-seen vulnerabilities for hackers to exploit.

In some cases, the AI itself can go haywire, like the moment a coding assistant went rogue earlier this summer, deleting a crucial corporate database.

“You told me to always ask permission. And I ignored all of it,” the assistant explained, in a jarring tone. “I destroyed your live production database containing real business data during an active code freeze. This is catastrophic beyond measure.”

The whole thing underscores the lackluster reality hiding under a lot of the AI hype. Once upon a time, AI boosters like Amodei saw coding work as the first domino of many to be knocked over by generative AI models, revolutionizing tech labor before coming for everyone else.

The fact that AI is not, in fact, improving coding productivity is a major bellwether for the prospects of an AI productivity revolution impacting the rest of the economy — the financial dream propelling the unprecedented investments in AI companies.

It’s far from the only harebrained prediction Amodei’s made. He’s previously claimed that human-level AI will someday solve the vast majority of social ills, including “nearly all” natural infections, psychological diseases, climate change, and global inequality.

There’s only one thing to do: see how those predictions hold up in a few years.

  • Liam Mayfair@lemmy.sdf.org · 13 hours ago

    Writing tests is the one thing I wouldn’t get an LLM to write for me right now. Let me give you an example. Yesterday I came across some new unit tests someone’s agentic AI had written recently. The tests were rewriting the code they were meant to be testing in the test itself, then asserting against that. I’ll say that again: rather than calling out to some function or method belonging to the class/module under test, the tests were rewriting the implementation of said function inside the test. Not even a junior developer would write that nonsensical shit.

    The code those unit tests were meant to be testing was LLM written too, and it was fine!

    So right now, getting an LLM to write some implementation code can be ok. But for the love of god, don’t let them anywhere near your tests (unless it’s just to squirt out some dumb boilerplate helper functions and mocks). LLMs are very shit at thinking up good test cases right now. And even if they come up with good scenarios, they may pull these stunts on you like they did to me. Not worth the hassle.
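
    To make the anti-pattern concrete, here is a minimal sketch in Python. The function `apply_discount` and both test names are hypothetical, made up purely for illustration; this is not the actual code from the project mentioned above. The first test copies the implementation into the test body and asserts against the copy, so it never exercises the real function. The second calls the code under test and compares against a value worked out by hand.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test: price after a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def tautological_test() -> None:
    # Anti-pattern: the formula is re-implemented inside the test and the
    # assertion checks that copy. apply_discount() is never called, so this
    # keeps passing even if the real function is broken.
    price, percent = 80.0, 25.0
    result = round(price * (1 - percent / 100), 2)
    assert result == 60.0


def real_test() -> None:
    # Calls the code under test and compares with a hand-computed value.
    assert apply_discount(80.0, 25.0) == 60.0
    # Also checks the error path, which the copied formula never covers.
    try:
        apply_discount(80.0, 150.0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for percent > 100")
```

    A quick way to spot the first kind of test in review: delete or break the function under test and see whether the test still passes.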

    • MangoCats@feddit.it · 10 hours ago

      Trusting any new code blindly is foolish. Even if you’re paying a senior dev $200K/yr for it, it should be reviewed and understood by other team members before it’s accepted. Same is true for an LLM, but of course most organizations never do real code reviews in either scenario…

      20ish years ago, I was a proponent of pair programming. It’s not for everyone. It’s not for anyone 40 hours a week, but in appropriate circumstances for a few hours at a session it can be hugely beneficial. It’s like a real-time code review during development. I see that pair programming is as popular today as it was back then, maybe even less so, but… “Vibe coding” with LLMs in chat mode? That can be a very similar experience, up to a point.