Cheaper and faster: China’s DeepSeek introduces new experimental AI model


Chinese AI developer DeepSeek has introduced a new "experimental" model, which it says is cheaper to train and better at processing long sequences of text.

The release was announced on the developer forum Hugging Face, where the company described the new model, DeepSeek-V3.2-Exp, as an “intermediate step toward our next-generation architecture.”

In the forum post, DeepSeek explains that, compared with previous large language models, the new model represents “ongoing research into more efficient transformer architectures, particularly focusing on improving computational efficiency when processing extended text sequences.”

Reuters reports that the system is expected to be DeepSeek’s biggest release since its earlier V3 and R1 models, which, upon release, surprised both Silicon Valley and global tech investors.

The DeepSeek-V3.2-Exp model builds on the previous version (V3.1-Terminus) by adding Sparse Attention, a technique that lets the LLM focus only on the parts of a text that matter, making the model faster and cheaper to train and run.

According to DeepSeek, the method is more efficient because the model no longer reads each word as if it carried equally important information, so it uses less computing power without compromising quality.
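To illustrate the general idea of attending only to the parts that matter, here is a minimal sketch of a generic top-k sparse attention step for a single query. It is not DeepSeek's actual mechanism, which the article does not describe in detail; the function names (`dense_attention`, `topk_sparse_attention`) and the parameter `k` are hypothetical and chosen only for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_scores(query, keys):
    """Scaled dot-product relevance score of the query against every key."""
    scale = 1.0 / math.sqrt(len(query))
    return [scale * sum(q * x for q, x in zip(query, key)) for key in keys]


def dense_attention(query, keys, values):
    """Standard attention: every token contributes to the output."""
    weights = softmax(scaled_scores(query, keys))
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def topk_sparse_attention(query, keys, values, k):
    """Illustrative sparse attention: only the k best-scoring tokens contribute.

    Assumption: tokens are selected by the same dot-product score. This toy
    version still scores every key, so the saving here is limited to the
    softmax and the weighted sum; a practical system would need a cheaper
    way to pick the tokens.
    """
    scores = scaled_scores(query, keys)
    ranked = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:k])  # indices of the tokens that are kept
    weights = softmax([scores[i] for i in kept])
    dim = len(values[0])
    output = [sum(w * values[i][d] for w, i in zip(weights, kept))
              for d in range(dim)]
    return output, kept


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [5.0, 0.0], [0.0, 0.1]]
    values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
    query = [1.0, 0.0]
    output, kept = topk_sparse_attention(query, keys, values, k=2)
    print("kept token indices:", kept)
    print("sparse output:", output)
```

When `k` equals the number of tokens, the sparse function reproduces dense attention exactly; smaller values of `k` trade a small approximation for less work per query, which is the trade-off the article describes.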

[Figure: two line charts. Source: Hugging Face]

The company says it will continue to experiment with new efficiency mechanisms that could lead to more powerful and cheaper AI models. For now, DeepSeek says it is cutting API prices by "50%+" with DeepSeek-V3.2-Exp.

As Cybernews has previously reported, DeepSeek has long been known for its quest for efficiency. One recent example is the reasoning-focused R1 model, which reportedly cost $294,000 to train, a fraction of the hundreds of millions of dollars often cited by US developers.

