A new AI benchmark offers a reality check on the current state of AI. The Remote Labor Index measures how well AI can automate economically valuable work, and it paints a bleak picture of AI's ability to replace freelance workers. According to researchers at Scale AI and the Center for AI Safety (CAIS), even the most advanced AI agents struggle with even simple tasks.
In the experiment, several leading AI models were given a range of simulated freelance work, including graphic design, video editing, game development, and administrative chores like data scraping. The result: even the best AI models could complete only around 3 percent of the tasks, earning a meager $1,810 out of a possible $143,991.
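To put the dollar figures in perspective, the share of available pay that the models earned can be worked out directly from the two amounts quoted above. The short sketch below does only that arithmetic; the variable and function names are illustrative and do not come from the benchmark itself.

```python
# Dollar figures as quoted in the article.
earned_usd = 1_810
available_usd = 143_991


def share_percent(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


earnings_share = share_percent(earned_usd, available_usd)
print(f"Share of available pay earned: {earnings_share:.2f}%")
```

By this measure the models earned a little over 1 percent of the money on offer, a smaller share than the roughly 3 percent of tasks they completed.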
Some might dismiss this as a single result, but it fits a bigger picture. AI experts warn that these models still have significant limitations, such as:
- They lack long-term memory and cannot learn continually from experience.
- They struggle with using different tools and performing complex tasks involving multiple steps.
This benchmark offers a counterpoint to OpenAI's earlier GDPval benchmark, which claimed that frontier AI models like GPT-5 were approaching human abilities on 220 tasks across various office jobs. The Remote Labor Index is not a perfect yardstick for AI's economic impact, however, and it may not cover all professions.
We can see why some experts are worried about AI taking over jobs: Amazon recently announced that it was cutting 14,000 jobs, partly because of the rapid rise of generative artificial intelligence. Amazon senior vice president Beth Galetti said that this technology "is enabling companies to innovate much faster than ever before."
Judging by the Remote Labor Index results, however, it seems unlikely that AI will be stepping into those vacated roles anytime soon.
So what can we take away from this? AI has made significant progress in recent years, but it still has a long way to go before it can perform complex tasks as humans do.