DeepSeek: Pioneering the Shift from Big Data to Small AI Models

DeepSeek signals a shift away from the dominance of Big Data and Big AI, while not diminishing Nvidia's role in the tech landscape. Its emphasis on efficiency marks the beginning of a new competition centered around small AI models that require minimal data and computing resources. The introduction of DeepSeek’s affordable, cutting-edge AI model could redirect U.S. Big Tech from its traditional "bigger is better" mindset, fostering a surge of AI startups that prioritize the philosophy of "small is beautiful."

Much of the discussion surrounding DeepSeek, particularly on Wall Street, has fixated on its assertion that its AI model can compete with leading U.S. models at a significantly lower training cost. However, what sets DeepSeek apart is not just its computational efficiency but its data efficiency. The team behind DeepSeek curated a training set of only 800,000 examples, 600,000 of them reasoning-related answers, demonstrating how to adapt large language models into reasoning-based models. Notably, a team from the Hong Kong University of Science and Technology was able to replicate DeepSeek’s approach using just 8,000 examples.

According to Forbes magazine, this innovation initiates a new phase in AI development: the Small Data competition. The Turing Post described DeepSeek as an exciting instance of curiosity-driven research, emphasizing its focus on addressing specific challenges rather than merely striving to meet benchmarks. A key issue tackled in their research is whether reasoning performance can be enhanced by using a small amount of high-quality data to kickstart learning, a concept referred to as the "cold start" problem. This approach involves generating and fine-tuning data while retaining only accurate responses, relying on human expertise rather than automated data-cleaning processes.
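To make the idea of "retaining only accurate responses" concrete, here is a minimal sketch in Python. It is not DeepSeek's actual pipeline: the `Candidate` class, the `keep_accurate` function, and the toy arithmetic prompts are all hypothetical names chosen for illustration, and the exact-match check merely stands in for whatever review, human or otherwise, decides that a response is accurate.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A generated response to a prompt (hypothetical structure)."""
    prompt: str
    response: str
    final_answer: str


def keep_accurate(candidates, reference_answers):
    """Keep only candidates whose final answer matches a known reference.

    Assumption: accuracy is judged by exact string match against a
    reference answer. This is a stand-in for a real review step.
    """
    kept = []
    for candidate in candidates:
        expected = reference_answers.get(candidate.prompt)
        if expected is not None and candidate.final_answer.strip() == expected.strip():
            kept.append(candidate)
    return kept


# Toy data: three generated responses, one of them wrong.
candidates = [
    Candidate("2 + 2", "Two plus two is four.", "4"),
    Candidate("2 + 2", "Two plus two is five.", "5"),
    Candidate("3 * 3", "Three times three is nine.", "9"),
]
reference_answers = {"2 + 2": "4", "3 * 3": "9"}

curated = keep_accurate(candidates, reference_answers)
print(len(curated), "of", len(candidates), "responses kept")
```

The point of the sketch is only that the curated set is smaller than the generated set and contains nothing that failed the accuracy check, which is the sense in which a small, high-quality dataset is built.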

Despite the significant implications of using fewer training examples, the media spotlight has largely been on the $6 million training cost. This can be attributed to what I term the "Moore’s Law addiction": the belief, long dominant in the tech narrative, that larger models and more data are inherently better. Established giants like IBM and Intel have perpetuated this ideology, which has persisted even among new digital startups.

Nvidia emerged during a transformative period in data processing. It originally focused on graphics processing units (GPUs) and later adapted to the burgeoning field of Big Data. As we navigate this evolving landscape, there are signs of a gradual departure from the "bigger is better" paradigm. Startups are increasingly demonstrating the viability of smaller models, and Nvidia itself is exploring edge computing and desktop applications.

The attention generated by DeepSeek may further catalyze this movement toward valuing efficiency and innovation over sheer scale, heralding a new era where "small is beautiful" becomes the guiding principle in AI development.

* Stories are edited and translated by Info3 *
Non-Info3 articles reflect solely the opinion of the author or original source and do not necessarily reflect the views of Info3.