Those mistakes would be easily solved by something that doesn’t even need to think. Just add a filter of acceptable orders, or hire a low-wage human who does not give a shit about the customers’ special orders.
That wouldn’t address the bulk of the issue, only the most egregious examples of it.
For every funny output like “I asked for 1 ice cream, it’s giving me 200 burgers”, there are likely tens, hundreds, thousands of outputs like “I asked for 1 ice cream, it’s giving me 1 burger” that sound sensible but are still the same problem.
It’s simply the wrong tool for the job. Using LLMs here is like hammering screws, or screwdriving nails. LLMs are a decent tool for things that you can supervise (not the case here), or where a large number of false positives and negatives is not a big deal (not the case here either).
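To make the point about the filter concrete, here is a rough sketch in Python. The menu, the quantity limit, and the function name are all made up for illustration; nothing here reflects how any real ordering system works. The filter only sees the order the LLM produced, not what the customer actually said, so it can reject the absurd order but has no way to notice the sensible-looking wrong one.

```python
# Hypothetical menu and limit, invented for this example.
MENU = {"ice cream", "burger", "fries"}
MAX_QUANTITY = 20


def is_acceptable(order: dict[str, int]) -> bool:
    """Return True if the order is non-empty, every item is on the menu,
    and every quantity is between 1 and MAX_QUANTITY."""
    return bool(order) and all(
        item in MENU and 1 <= qty <= MAX_QUANTITY
        for item, qty in order.items()
    )


asked = {"ice cream": 1}              # what the customer wanted
absurd = {"burger": 200}              # the funny failure
plausible_but_wrong = {"burger": 1}   # the boring failure

print(is_acceptable(absurd))               # False: the filter catches this one
print(is_acceptable(plausible_but_wrong))  # True: the filter lets this through
print(plausible_but_wrong == asked)        # False: it is still the wrong order
```

Tightening the limit or the menu doesn’t help with the second case, since the wrong order is a perfectly valid order.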
If I were to watch Dragon Ball Z now, I’d probably drop the series. I still remember it fondly, but it’s too slow.
The first two seasons of the Pokémon anime aged well for me. Individual games, too. But the series as a whole went from “I know all 386!” to “…it’s a Tentaquil”.
Chrono Trigger went from “it’s okay, it’s fun” to “…I spent my whole life underrating it, didn’t I?” So did Final Fantasy VI.
Same deal with Dostoyevsky. I guess you need some maturity to understand things.
Baudelaire, though? Hard pass.
I still love 1984 and Animal Farm, but I want to drown 90% of the muppets talking about them.
I can’t stand Legião Urbana any more. Pink Floyd, on the other hand, aged well, and so did Nenhum de Nós.
To be honest, I was never too much into movies. There are a few things that I like (Modern Times, 8 1/2, The Shining), but my taste there is mostly unchanged.