

In the case of a senior engineer then they wouldn’t need to worry about the hallucination rate. The LLM is a lot faster than them and they can do other tasks while it’s being generated and then review the outputs. If it’s trivial you’ve saved time, if not, you can pull up that documentation, and reason and step through the problem with the LLM. If you actually know what you’re talking about you can see when it slips up and correct it.
And that hallucination rate is rapidly dropping. We’ve jumped from about 40% accuracy to 90% over the past ~6mo alone (aider polygot coding benchmark) - at about 1/10th the cost (iirc).
Scale? It’s a personal ancestry site for my surname with graphs and shit mate. Compares naming patterns, locations, dna, clustering, etc between generations and tries to place loose people. Works pretty well, managed to find a bunch of missing connections through it.