OpenAI suspects that DeepSeek, a Chinese AI company whose models are significantly cheaper than Western counterparts, may have trained them using output from OpenAI's models, sparking controversy and market volatility. The emergence of DeepSeek's R1 model, promoted as a low-cost alternative reportedly trained for about $6 million, triggered a sharp drop in the stock prices of major AI-related companies. Nvidia, whose GPUs are crucial for training and running AI models, suffered its largest-ever single-day loss, shedding nearly $600 billion in market value. Microsoft, Meta, Alphabet, and Dell also saw substantial declines.
OpenAI and Microsoft are investigating whether DeepSeek violated OpenAI's terms of service by employing "distillation," a technique in which the outputs of a larger model are used to train a smaller one. OpenAI confirmed that it is aware of attempts by companies in China and elsewhere to distill leading US AI models. David Sacks, President Trump's AI czar, has also backed the claim that DeepSeek extracted data from OpenAI's models.
The situation highlights the irony of OpenAI's position: the company has itself acknowledged relying on copyrighted material to train ChatGPT, and it faces ongoing legal battles over that practice. In its January 2024 submission to the UK's House of Lords, OpenAI argued that it would be impossible to train today's leading AI models without using copyrighted material. That stance is complicated by lawsuits from the New York Times and 17 authors alleging copyright infringement. The legal landscape surrounding AI training data remains complex, particularly in light of the US Copyright Office's position, in a case that began with a 2018 application, that art generated by AI without human authorship cannot be copyrighted.