A week ago, on Monday 20 January 2025, a Chinese artificial intelligence startup, DeepSeek, made a bold announcement that sent shockwaves through the tech world. The company revealed its first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1--the former trained without supervised fine-tuning--and claimed that the R1 model achieved performance on par with OpenAI's o1 across math, code and reasoning tasks. Whilst the announcement itself didn't immediately stir major reactions in the financial world, what set it apart was the company's decision to open-source the models.
DeepSeek made the code and weights for DeepSeek-R1-Zero, DeepSeek-R1 and six dense distilled models available under the MIT licence on GitHub, essentially granting anyone near-unlimited freedoms, including modifying the models and using them in proprietary commercial products.
What was even more astonishing was that these models were capable of running on high-end Mac desktops, not just AI-optimised reference systems. Researchers and enthusiasts quickly began conducting their own benchmarks, further validating some of DeepSeek's claims.
A week later, financial markets reacted dramatically. Nvidia, the leading designer of the most powerful AI chips, saw its stock price plummet. On Friday 24 January, Nvidia shares closed at $142.62, but by Monday's close the price had dropped 17%, to $118.42, wiping out nearly $590bn of market value--the largest single-day loss in market value ever recorded by any company.
Cost debate
For some experts, DeepSeek's claim that its R1 model was developed using 2,048 Nvidia H800 GPUs and approximately 2.8m GPU training hours over two months seems highly improbable. Assuming the industry-standard rental price of $2 per GPU hour, this would imply a development cost of just $5.6m--a fraction of what OpenAI spends on similar models. By contrast, OpenAI reportedly spent more than $100m training GPT-4, using 25,000 of the more powerful Nvidia H100 chips. So, what's the catch?
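The cost figure above is a straightforward back-of-the-envelope calculation; the sketch below reproduces it using the approximate numbers cited in the article (the $2 per GPU-hour rental rate is the assumed industry benchmark, not a disclosed price).

```python
# Back-of-the-envelope check of DeepSeek's implied training cost,
# using the approximate figures cited above.
GPU_COUNT = 2_048      # Nvidia H800 GPUs
GPU_HOURS = 2.8e6      # reported total GPU training hours (approx.)
RATE_USD = 2.0         # assumed rental price per GPU-hour

cost = GPU_HOURS * RATE_USD
print(f"Implied training cost: ${cost / 1e6:.1f}m")  # → $5.6m

# Sanity check: could 2,048 GPUs accumulate that many hours in ~two months?
wall_clock_days = GPU_HOURS / GPU_COUNT / 24
print(f"Wall-clock time at full utilisation: {wall_clock_days:.0f} days")
```

The second calculation shows the claim is at least internally consistent: roughly 57 days of continuous use of the stated cluster yields the reported GPU-hour total.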
Some researchers and experts believe that DeepSeek had access to a much larger pool of Nvidia chips than disclosed. On the other hand, DeepSeek reportedly bypassed parts of Nvidia's proprietary CUDA programming platform--considered the industry gold standard--in favour of lower-level instructions, and claimed higher efficiency as a result. If true, this could have far-reaching implications, extending well beyond the current 'chip wars'.
What’s next?
While it is difficult to make industry-wide predictions, open-sourced models that are portable and capable of running on standard computers could bring a seismic shift in the economics of AI development. Moreover, if DeepSeek's claims about development costs hold true--even partially--the billions of dollars spent or earmarked for custom-built AI data centres, energy supplies and other infrastructure could be at risk. For companies that have invested billions in these facilities, this would pose a serious challenge to their long-term strategies.