Demis Hassabis, CEO of Google DeepMind, recently expressed dismay on social media over a post by Sébastien Bubeck, a research scientist at OpenAI. Bubeck had excitedly announced that two mathematicians used OpenAI's latest language model, GPT-5, to solve previously unsolved mathematical problems, declaring that the acceleration of scientific discovery through AI had officially begun. The episode is a revealing example of the pitfalls of AI boosterism and the dangers of hasty claims amplified on social media.

The excitement, which unfolded in mid-October, centered on the Erdős problems, a collection of unsolved mathematical puzzles left behind by the prolific mathematician Paul Erdős. Thomas Bloom, a mathematician at the University of Manchester, catalogs these problems and their known solutions on a dedicated website. When Bubeck celebrated the model's supposed breakthroughs, Bloom promptly countered that the claims were misleading: a problem listed as unsolved on his site means only that he is not aware of a solution, not that none exists. Rather than making new discoveries, the model had most likely located existing solutions in the vast sea of mathematical literature, a feat that demonstrates its ability to sift through extensive data, not to do original mathematics.

This incident underscores the need for caution when making grandiose claims about AI's capabilities, especially in mathematics. A language model's ability to conduct literature searches is genuinely useful, but it is not the same as genuine problem-solving. Nor is the pattern of overstated accomplishments limited to OpenAI: in a similar episode, mathematicians demonstrated that no existing language model had solved a particular mathematical problem, only for subsequent claims to suggest that progress had nonetheless been made. As AI spreads into fields such as medicine and law, researchers have found that while models can assist with certain tasks, they often fall short of delivering reliable or consistent advice. These findings underline the importance of tempered expectations for AI's role in complex domains.


Source: How social media encourages the worst of AI boosterism via MIT Technology Review