@auraithx

auraithx@lemmy.dbzer0.com · edited 20 hours ago

Scale? It’s a personal ancestry site for my surname with graphs and shit, mate. It compares naming patterns, locations, DNA, clustering, etc. between generations and tries to place loose people. Works pretty well; I managed to find a bunch of missing connections through it.
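
To give a rough idea of what "placing loose people" could look like, here is a minimal sketch, not the site's actual code. The `Person` fields, the `parent_score` and `best_parent` helpers, the age-gap window, and the weights are all made up for illustration, and DNA and clustering are left out entirely.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    # Hypothetical record shape; a real tree would carry far more fields.
    given_name: str
    birth_year: int
    parish: str
    father_given_name: str = ""  # empty when unknown


def parent_score(child: Person, candidate: Person) -> int:
    """Score how plausible `candidate` is as a parent of `child`.

    The weights are arbitrary illustration values, not tuned on real data.
    """
    gap = child.birth_year - candidate.birth_year
    if not 16 <= gap <= 50:
        return 0  # implausible generation gap, rule the candidate out
    score = 1  # plausible generation gap
    if child.parish == candidate.parish:
        score += 1  # same location
    # Naming pattern: child named after the candidate's own father.
    if candidate.father_given_name and child.given_name == candidate.father_given_name:
        score += 2
    return score


def best_parent(child: Person, candidates: List[Person]) -> Optional[Person]:
    """Return the highest-scoring candidate, or None if none is plausible."""
    scored = [(parent_score(child, c), c) for c in candidates]
    scored = [pair for pair in scored if pair[0] > 0]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


# Invented example data.
loose = Person("Angus", 1850, "Kilmuir")
candidates = [
    Person("Donald", 1820, "Kilmuir", father_given_name="Angus"),
    Person("John", 1822, "Portree", father_given_name="Neil"),
    Person("Ewan", 1845, "Kilmuir", father_given_name="Angus"),
]
match = best_parent(loose, candidates)
```

In this toy data Donald wins on all three signals, John only has a plausible age gap, and Ewan is ruled out for being five years older than the "child". A real version would want a human to confirm any suggested link.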

auraithx@lemmy.dbzer0.com · 20 hours ago

A senior engineer wouldn’t need to worry much about the hallucination rate. The LLM is a lot faster than they are, so they can do other tasks while it generates and then review the output. If the task is trivial you’ve saved time; if not, you can pull up the documentation and reason and step through the problem with the LLM. If you actually know what you’re talking about, you can see when it slips up and correct it.

And that hallucination rate is rapidly dropping. We’ve jumped from about 40% accuracy to 90% over the past ~6 months alone (Aider polyglot coding benchmark), at about 1/10th the cost (iirc).

auraithx@lemmy.dbzer0.com · 21 hours ago

I mean, before we’d just ask Google and read Stack Overflow, blogs, support posts, etc. Now it just finds them for you instantly so you can click and read them. The human reasoning part is just shifting elsewhere: you solve the problem during debugging, before commits.

auraithx@lemmy.dbzer0.com · 21 hours ago

Sounds like they need to work on their prompts. I vibe code some hobby projects I wouldn’t have done otherwise and it’s never done that. I have it comment each change and I review it all in a diff checker, so that’s 90% of the time.
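
For anyone without a diff checker handy, the same review step can be roughed out with Python's standard `difflib`. This is just a sketch of the workflow described above; the helper names `review_diff` and `added_lines` and the sample before/after snippets are invented for the example.

```python
import difflib


def review_diff(before: str, after: str, name: str = "file.py") -> str:
    """Return a unified diff between two versions of a file's text."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(diff)


def added_lines(diff_text: str) -> list:
    """Collect the lines the change added, skipping the '+++' file header."""
    return [
        line
        for line in diff_text.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]


# Invented example: the "after" version carries the comment the LLM was
# asked to leave on each change.
before = "def total(xs):\n    return sum(xs)\n"
after = (
    "def total(xs):\n"
    "    # LLM: skip None entries before summing\n"
    "    return sum(x for x in xs if x is not None)\n"
)

diff_text = review_diff(before, after)
print(diff_text)
```

Reading only the `+` and `-` lines, with the model's comment sitting next to each change, is what keeps the review quick.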

auraithx@lemmy.dbzer0.com · 4 months ago

Working for me now