In artificial intelligence, the drive for bigger and better models often centers on the sheer volume of data used for training. However, a recent discussion featuring Janusz Marecki, CEO and founder of Fractal Brain and an AI partner at Ahren Innovation Capital, highlighted a critical nuance: the quality and diversity of data are paramount, and simply adding more data might not solve AI's persistent challenges.
The Data Dilemma in AI
Merryn Somerset Webb, host of the "Merryn Talks Money" podcast, opened the conversation by asking Marecki about the current state of AI development. Marecki, who has a background in AI research and investment, pointed to a significant hurdle: 'hallucinations' in AI models. These are instances where a model confidently generates incorrect or nonsensical information, a problem closely tied to the data it is trained on.
Marecki then turned to the common approach of throwing more data at the problem. "We keep pouring money into bigger data centers, knowing that we've used all the data already," he stated, emphasizing the potential futility of this strategy. He likened the current situation to a calculator that is 95-99% accurate. That accuracy is impressive, but the remaining 1-5% of errors can be critical, because the incorrect outputs are difficult to distinguish from the correct ones.