DeepSeek V3, DeepSeek's surprisingly inexpensive AI model, has shaken the AI market and triggered a significant drop in Nvidia's stock price. While DeepSeek claims a training cost of just $6 million, a closer look reveals a far more substantial investment.
Image: ensigame.com
DeepSeek V3's innovative architecture is key to its performance. It utilizes:
- Multi-token Prediction (MTP): Predicting several upcoming tokens at once rather than only the next one, for increased accuracy and efficiency.
- Mixture of Experts (MoE): Employing 256 expert sub-networks but activating only eight of them for each token, which reduces the computation per token, accelerating training and improving performance.
- Multi-head Latent Attention (MLA): Compressing the attention keys and values into a compact latent representation, which reduces memory use while retaining the details the model needs to capture crucial nuances.
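To make the MoE idea concrete, here is a minimal sketch of top-k expert routing for a single token. Only the numbers 256 and 8 come from the article. The random router scores, the `route_token` helper, and the softmax over the selected experts are illustrative assumptions, not DeepSeek's actual gating function.

```python
import math
import random

NUM_EXPERTS = 256  # experts in the layer, per the article
TOP_K = 8          # experts activated per token, per the article


def route_token(scores, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Hypothetical helper: a generic top-k gate, not DeepSeek's real router.
    """
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores only, so the k weights sum to 1.
    peak = max(scores[i] for i in top)
    exps = [math.exp(scores[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Stand-in router scores; a real model would compute these from the token.
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

chosen = route_token(scores)
active_fraction = TOP_K / NUM_EXPERTS  # 8 / 256 = 0.03125

print(f"experts used for this token: {len(chosen)} of {NUM_EXPERTS}")
print(f"share of experts active: {active_fraction:.3%}")
```

With eight of 256 experts active, each token touches only about 3% of the expert networks, which is where the savings in computation come from.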
However, SemiAnalysis reported that DeepSeek's actual infrastructure is far larger: approximately 50,000 Nvidia Hopper GPUs, including 10,000 H800s, 10,000 H100s, and additional H20s, spread across multiple data centers. This represents a server investment of about $1.6 billion and operational costs of about $944 million. The $6 million figure reflects only the GPU usage for pre-training and excludes research, refinement, data processing, and infrastructure.
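For a sense of scale, the figures quoted above can be set side by side with a few lines of arithmetic. This is a rough sketch: adding the server and operational estimates together is an assumption made here for illustration, and the variable names are not from the SemiAnalysis report.

```python
# Figures as quoted in the article, in US dollars.
SERVER_INVESTMENT = 1_600_000_000   # ~$1.6 billion
OPERATIONAL_COSTS = 944_000_000     # ~$944 million
PRETRAINING_CLAIM = 6_000_000       # the $6 million claim

# Assumption: treat the two estimates as additive for a rough total.
estimated_total = SERVER_INVESTMENT + OPERATIONAL_COSTS
ratio = estimated_total / PRETRAINING_CLAIM

print(f"estimated total: ${estimated_total / 1e9:.3f} billion")  # $2.544 billion
print(f"ratio to the $6 million claim: {ratio:.0f}x")            # 424x
```

On these numbers, the estimated infrastructure spending is roughly 424 times the headline training figure.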
DeepSeek is a subsidiary of High-Flyer, a Chinese hedge fund, and owns its data centers, which gives it direct control over its infrastructure and speeds up innovation. Being self-funded also makes it more agile. High salaries (over $1.3 million annually for some researchers) attract top Chinese talent, though the company doesn't hire foreign specialists.
DeepSeek's total investment in AI development exceeds $500 million. Its lean structure allows for efficient innovation compared to larger, more bureaucratic companies.
DeepSeek's success highlights the competitive potential of well-funded independent AI companies. While its "budget-friendly" claim is misleading, its costs remain significantly lower than those of its competitors. For example, DeepSeek's R1 model reportedly cost $5 million to train, compared to $100 million for ChatGPT-4. The reality is a combination of substantial investment, technological breakthroughs, and a highly skilled team.