
DeepSeek looks fantastic but is no miracle and was not built for USD 5 million; panic over it seems overblown: Bernstein report

By ANI | Updated: January 29, 2025 10:05 IST


New Delhi [India], January 29: As social media platforms and stock markets buzz with the popularity of the new AI company DeepSeek, a report by Bernstein stated that DeepSeek looks fantastic but is not a miracle and was not built for USD 5 million.

The report addressed the buzz around DeepSeek's models, particularly the idea that the company built something comparable to OpenAI for just USD 5 million. According to the report, this claim is misleading and doesn't reflect the full picture.

It stated, "We believe that DeepSeek DID NOT 'build OpenAI for USD 5M'; the models look fantastic but we don't think they are miracles; and the resulting Twitter-verse panic over the weekend seems overblown."

The Bernstein report stated that DeepSeek has developed two main families of AI models: 'DeepSeek-V3' and 'DeepSeek R1'. The V3 model is a large language model that uses a Mixture-of-Experts (MoE) architecture.

This approach combines multiple smaller models that work together, delivering high performance while using significantly fewer computing resources than other large models. The V3 model has 671 billion parameters in total, with 37 billion active at any given time.
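To make the "37 billion of 671 billion" figure concrete, the sketch below shows a toy top-k gate of the kind commonly used in Mixture-of-Experts models, followed by the active-parameter share implied by the article's numbers. The `top_k_route` function, the gate scores, and the choice of `k` are hypothetical illustrations, not DeepSeek's actual routing scheme.

```python
import math

def top_k_route(gate_scores, k):
    """Toy gate (illustrative assumption): pick the k highest-scoring
    experts and normalise their scores into weights with a softmax."""
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    exps = [math.exp(gate_scores[i]) for i in top]
    total = sum(exps)
    return {i: e / total for i, e in zip(top, exps)}

# Figures quoted in the article, in billions of parameters.
TOTAL_PARAMS_B = 671
ACTIVE_PARAMS_B = 37

# Share of the model that is active at any given time.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Made-up gate scores for four experts; only two are selected.
routing = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)

print("experts used:", sorted(routing))
print(f"active share of parameters: {active_fraction:.1%}")
```

On the article's figures, roughly one parameter in eighteen is in use at any given time, which is the basis for the claim that the architecture needs fewer computing resources than a model that uses all of its parameters at once.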

It also incorporates innovative techniques like Multi-Head Latent Attention (MHLA), which reduces memory usage, and mixed-precision training using FP8 computation, which improves efficiency.

To train the V3 model, DeepSeek used a cluster of 2,048 NVIDIA H800 GPUs for about two months, totalling approximately 2.7 million GPU hours for pre-training and 2.8 million GPU hours including post-training.
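The "about two months" figure can be cross-checked against the GPU-hour totals. The sketch below assumes, for simplicity, that all 2,048 GPUs ran continuously for the whole period; the helper name `wall_clock_days` is hypothetical.

```python
# Figures quoted in the article.
GPUS = 2048
PRETRAIN_GPU_HOURS = 2_700_000   # pre-training only
TOTAL_GPU_HOURS = 2_800_000      # including post-training

def wall_clock_days(gpu_hours, gpus):
    """Convert GPU hours to calendar days, assuming every GPU is busy
    around the clock (a simplifying assumption)."""
    return gpu_hours / gpus / 24

pretrain_days = wall_clock_days(PRETRAIN_GPU_HOURS, GPUS)
total_days = wall_clock_days(TOTAL_GPU_HOURS, GPUS)

print(f"pre-training: about {pretrain_days:.0f} days")
print(f"including post-training: about {total_days:.0f} days")
```

Under that assumption both totals work out to just under two months, consistent with the duration cited in the report.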

While some have estimated the cost of this training at around USD 5 million based on a rental rate of USD 2 per GPU hour, the report points out that this figure doesn't account for the extensive research, experimentation, and other costs involved in developing the model.
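The arithmetic behind the "USD 5 million" headline figure is simple to reproduce. The sketch below multiplies the GPU-hour totals quoted in the article by the USD 2 rental rate; it covers rental cost only and, as the report stresses, leaves out research, experimentation, and other development costs. The function name `rental_cost_usd` is hypothetical.

```python
# Figures quoted in the article.
RATE_USD_PER_GPU_HOUR = 2
PRETRAIN_GPU_HOURS = 2_700_000   # pre-training only
TOTAL_GPU_HOURS = 2_800_000      # including post-training

def rental_cost_usd(gpu_hours, rate):
    """Rental cost of the final training run only; excludes research,
    failed experiments, staff, and hardware ownership."""
    return gpu_hours * rate

pretrain_cost = rental_cost_usd(PRETRAIN_GPU_HOURS, RATE_USD_PER_GPU_HOUR)
total_cost = rental_cost_usd(TOTAL_GPU_HOURS, RATE_USD_PER_GPU_HOUR)

print(f"pre-training: USD {pretrain_cost / 1e6:.1f} million")
print(f"including post-training: USD {total_cost / 1e6:.1f} million")
```

Both results land between USD 5 million and USD 6 million, which is where the widely quoted estimate comes from.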

The second model, 'DeepSeek R1', builds on the V3 foundation but uses Reinforcement Learning (RL) and other techniques to significantly improve reasoning capabilities. The R1 model has been particularly impressive, performing competitively against OpenAI's models in reasoning tasks.

However, the report noted that the additional resources required to develop R1 were likely substantial, though not quantified in the company's research paper.

Despite the hype, the report emphasized that DeepSeek's models are indeed impressive. The V3 model, for instance, performs as well as or better than other large models on language, coding, and math benchmarks while using only a fraction of the computing resources.

For example, pre-training V3 required about 2.7 million GPU hours, which is just 9 per cent of the compute resources needed to train some other leading models.
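The 9 per cent comparison implies a rough compute budget for the unnamed models it is measured against. The sketch below simply inverts the ratio; the article does not name those models or say whether their GPU hours were measured on comparable hardware, so the result is an order-of-magnitude illustration only.

```python
# Figures quoted in the article.
V3_PRETRAIN_GPU_HOURS = 2_700_000
V3_SHARE_OF_COMPARISON = 0.09    # "just 9 per cent"

# Implied GPU hours for the comparison models, assuming the 9 per cent
# figure is a straight ratio of GPU hours (an assumption).
implied_comparison_gpu_hours = V3_PRETRAIN_GPU_HOURS / V3_SHARE_OF_COMPARISON

print(f"implied comparison budget: about "
      f"{implied_comparison_gpu_hours / 1e6:.0f} million GPU hours")
```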

In conclusion, the report stated that while DeepSeek's achievements are remarkable, the panic is overblown and the claims of building an OpenAI competitor for USD 5 million are exaggerated.

Disclaimer: This post has been auto-published from an agency feed without any modifications to the text and has not been reviewed by an editor
