DeepSeek is testing a 1-million-token context model ahead of a potential Lunar New Year launch, while its new "conditional memory" research could redefine how large language models handle long sequences.
DeepSeek has begun testing a new long-context model supporting 1 million tokens in its web and app versions, while its API service remains at V3.2 with 128K context. Industry observers speculate that DeepSeek may unveil a major new release during the upcoming Lunar New Year, potentially replicating the breakout momentum it achieved last year.
On January 12, DeepSeek published a new research paper titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models," with Liang Wenfeng listed among the authors. The paper introduces "conditional memory," which separates static pattern storage from dynamic computation through an Engram module. Under matched parameter counts and FLOPs budgets, the approach significantly outperforms pure MoE baselines.
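The general idea can be illustrated with a minimal sketch: a large lookup table serves as static pattern storage that is retrieved by key, while a small dense path handles per-token computation, so most parameters live in cheap embedding lookups rather than FLOP-heavy layers. The module name, hash scheme, and gating below are assumptions made for illustration only, not DeepSeek's actual Engram design.

```python
# Illustrative sketch only -- NOT DeepSeek's Engram implementation.
# Shows the general "conditional memory" idea: a large embedding table acts as
# static pattern storage (many parameters, cheap lookups), while a small MLP
# does the dynamic per-token computation (few parameters, most of the FLOPs).
import torch
import torch.nn as nn


class ConditionalMemoryBlock(nn.Module):
    def __init__(self, d_model: int, memory_slots: int = 1_000_000):
        super().__init__()
        # Static memory: parameters retrieved by index, no matmul over the table.
        self.memory = nn.Embedding(memory_slots, d_model)
        self.memory_slots = memory_slots
        # Dynamic computation: a small feed-forward path applied to every token.
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        # Learned gate deciding how much retrieved memory to mix in.
        self.gate = nn.Linear(d_model, 1)

    def forward(self, hidden: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, d_model); token_ids: (batch, seq)
        # Toy lookup key: hash adjacent token ids into a memory slot.
        prev_ids = torch.roll(token_ids, shifts=1, dims=1)
        keys = (token_ids * 31 + prev_ids) % self.memory_slots
        retrieved = self.memory(keys)                      # static lookup branch
        g = torch.sigmoid(self.gate(hidden))               # conditional gate
        return hidden + self.ffn(hidden) + g * retrieved   # mix both branches


# Usage: one block on a toy batch.
block = ConditionalMemoryBlock(d_model=64, memory_slots=10_000)
h = torch.randn(2, 16, 64)
ids = torch.randint(0, 32_000, (2, 16))
print(block(h, ids).shape)  # torch.Size([2, 16, 64])
```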
On December 1 last year, DeepSeek officially released two models: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale. V3.2 reportedly reached GPT-5-level performance on public reasoning benchmarks, while V3.2-Speciale won gold medals at IMO 2025, CMO 2025, the ICPC World Finals 2025, and IOI 2025.
NetEase Youdao Dictionary named "deepseek" its 2025 Word of the Year, citing 8,672,940 annual searches. According to the company, search interest surged sharply throughout the year, initially driven by DeepSeek's "low-cost" breakthrough in compute efficiency and reinforced by each major product update.
Source: The Paper
Tags: #DeepSeek #OpenSourceAI
