When I taught, a grade below 70 was colloquially referred to as “failing”: a student who scored below 70 in a course didn’t get credit for it.
As it stands right now, generative AI is failing when it comes to using data from memory and failing hard when it comes to interpreting charts, diagrams, and images. The current generation of generative AI is not that much better than the previous one, and if performance is plateauing, then the vast promises surrounding the technology ring hollow.
If a CEO announced, “I plan to replace my skilled workforce with low-paid people who are, at best, 68.8% accurate, and who will double down on inaccuracies when challenged”, or, “we’re going all-in on inaccuracy, environmental destruction, and frustrating our customers”, we would think that CEO had lost their ever-lovin’ mind. And if someone said they were going to revolutionize the world by putting BS artists into every development pipeline, expert system, and search engine, I would not see that revolution ending well.
https://venturebeat.com/ai/the-70-factuality-ceiling-why-googles-new-facts-benchmark-is-a-wake-up-call