China’s DeepSeek faces questions over claims after shaking up global tech

After causing shockwaves with an AI model whose capabilities rival the creations of Google and OpenAI, China’s DeepSeek is facing questions about whether its bold claims stand up to scrutiny.

The Hangzhou-based startup’s announcement that it developed R1 at a fraction of the cost of Silicon Valley’s latest models immediately called into question assumptions about the United States’s dominance in AI and the sky-high market valuations of its top tech firms.

Some sceptics, however, have challenged DeepSeek’s account of working on a shoestring budget, suggesting that the firm likely had access to more advanced chips and more funding than it has acknowledged.

“It’s very much an open question whether DeepSeek’s claims can be taken at face value. The AI community will be digging into them and we’ll find out,” Pedro Domingos, professor emeritus of computer science and engineering at the University of Washington, told Al Jazeera.

“It’s plausible to me that they can train a model with $6m,” Domingos added.

“But it’s also quite possible that that’s just the cost of fine-tuning and post-processing models that cost more, that DeepSeek couldn’t have done it without building on more expensive models by others.”

In a research paper released last week, the DeepSeek development team said they had used 2,000 Nvidia H800 GPUs – a less advanced chip originally designed to comply with US export controls – and spent $5.6m to train R1’s foundational model, V3.

OpenAI CEO Sam Altman has stated that it cost more than $100m to train its chatbot GPT-4, while analysts have estimated that the model used as many as 25,000 of the more advanced H100 GPUs.

The announcement by DeepSeek, founded in late 2023 by serial entrepreneur Liang Wenfeng, upended the widely held belief that companies seeking to be at the forefront of AI need to invest billions of dollars in data centres and large quantities of costly high-end chips.

It also raised questions about the effectiveness of Washington’s efforts to constrain China’s AI sector by banning exports of the most advanced chips.

Shares of California-based Nvidia, which holds a near-monopoly on the supply of GPUs that power generative AI, on Monday plunged 17 percent, wiping nearly $593bn off the chip giant’s market value – a figure comparable with the gross domestic product (GDP) of Sweden.

While there is broad consensus that DeepSeek’s release of R1 at least represents a significant achievement, some prominent observers have cautioned against taking its claims at face value.
