Suanmiao Technology, a Beijing-based 3D AI inference chip company, has secured nearly RMB 1 billion (approximately $140 million) across two funding rounds to mass-produce 3D AI inference chips made entirely in China, drawing on its Ethereum mining chip legacy to tackle the 'memory wall' bottleneck in large model inference.
Suanmiao Technology (SUNMMIO), a Beijing-based startup developing 3D AI inference chips, has raised nearly RMB 1 billion (approximately $140 million) across two consecutive funding rounds, positioning itself as a domestic challenger to Nvidia's dominance in AI inference hardware.
Funding Details and Strategic Investors
The Pre-A round was co-led by Source Code Capital and Shi Xi Capital, with participation from Lenovo Capital and other semiconductor industry funds. The subsequent Pre-A1 round was led by Xianghe Capital, joined by state-backed investors including CDB Capital and Beijing Shunxi. The proceeds will fund R&D and mass production of 100% domestically produced 3D AI inference chips.
Breaking the Memory Wall with 3D Stacking
Suanmiao's core innovation targets what the industry calls the "memory wall" - a critical bottleneck where compute units sit idle waiting for data to be fetched from memory. The company's solution combines architectural innovation with a domestic 3D IC (integrated circuit) supply chain that has been built up over a number of years.
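To make the memory wall concrete, here is a minimal sketch of the standard roofline model, which caps attainable throughput at the lesser of peak compute and memory bandwidth times arithmetic intensity. The accelerator figures below (1000 TFLOP/s, 4 TB/s) are round, hypothetical numbers chosen for illustration; they are not specifications of Suanmiao's A4 or of any Nvidia part.

```python
def attainable_flops(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Roofline bound: min(peak compute, bandwidth * arithmetic intensity).

    peak_flops    -- peak compute in FLOP/s
    mem_bandwidth -- memory bandwidth in bytes/s
    intensity     -- arithmetic intensity in FLOPs per byte moved
    """
    return min(peak_flops, mem_bandwidth * intensity)


def compute_utilization(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Fraction of peak compute that can be kept busy at a given intensity."""
    return attainable_flops(peak_flops, mem_bandwidth, intensity) / peak_flops


# Hypothetical accelerator (assumed values, not vendor specifications).
PEAK_FLOPS = 1000e12   # 1000 TFLOP/s
MEM_BANDWIDTH = 4e12   # 4 TB/s

# Token-by-token decoding at batch size 1 with 16-bit weights: each 2-byte
# weight is read once and used for one multiply-add (2 FLOPs), so the
# arithmetic intensity is roughly 1 FLOP per byte.
decode_intensity = 2 / 2

# Intensity needed before compute, not memory, becomes the limit.
ridge_point = PEAK_FLOPS / MEM_BANDWIDTH

utilization = compute_utilization(PEAK_FLOPS, MEM_BANDWIDTH, decode_intensity)
print(f"ridge point: {ridge_point:.0f} FLOPs/byte")
print(f"decode utilization at batch 1: {utilization:.1%}")
```

Under these assumed numbers the compute units would be busy well under 1% of the time during single-stream decoding, which is the idling the memory wall refers to. Larger batches raise arithmetic intensity; raising memory bandwidth moves the ridge point in the other direction.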

From Ethereum Mining to AI Inference
The company's origins trace back to 2009, when founder Wang Fuquan, a former researcher at the Chinese Academy of Sciences and a key participant in the Loongson (Godson) CPU project, launched Shengsheng Technology. There, he led development of the JASMINER X4, an Ethereum mining chip built on a 40nm process that achieved 20x the energy efficiency of Nvidia's 7nm flagship GPUs and generated RMB 800 million in revenue from a single product line.
This mining chip legacy provides Suanmiao with several advantages: proven expertise in custom chip design, established relationships with semiconductor foundries, and experience optimizing for specific computational workloads. The transition from cryptocurrency mining to AI inference represents a strategic pivot that leverages existing technical capabilities while addressing a much larger market opportunity.
Technical Architecture and Performance Claims
Suanmiao's in-development chip, codenamed A4, adopts a 3D TokenPU architecture. According to pre-silicon simulation data, its inference throughput is 1.26x to 2.19x that of Nvidia's H200 on Llama and Mixtral models.
The 3D TokenPU architecture appears to be specifically optimized for transformer-based models, which dominate the current AI landscape. By stacking multiple layers of processing units vertically, Suanmiao aims to reduce data movement and increase computational density - key factors in improving inference performance and energy efficiency.
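Why reduced data movement matters for inference can be sketched with a simple bandwidth-bound estimate: if every decode step must stream all model weights from memory once, tokens per second cannot exceed bandwidth divided by the bytes of weights. The sketch below assumes a dense 70-billion-parameter model with 16-bit weights and two hypothetical bandwidth figures; it ignores KV-cache traffic, mixture-of-experts sparsity, and the compute ceiling, and it does not describe how the 3D TokenPU architecture actually works or performs.

```python
def max_decode_tokens_per_s(
    n_params: float,
    bytes_per_param: float,
    mem_bandwidth: float,
    batch_size: int = 1,
) -> float:
    """Bandwidth-bound upper limit on decode throughput.

    Assumes each decode step reads every weight exactly once and produces
    one token per sequence in the batch.
    """
    bytes_per_step = n_params * bytes_per_param
    return batch_size * mem_bandwidth / bytes_per_step


# Illustrative assumptions (not figures from Suanmiao or Nvidia).
N_PARAMS = 70e9        # dense 70B-parameter model
BYTES_PER_PARAM = 2    # 16-bit weights

baseline = max_decode_tokens_per_s(N_PARAMS, BYTES_PER_PARAM, mem_bandwidth=4e12)
stacked = max_decode_tokens_per_s(N_PARAMS, BYTES_PER_PARAM, mem_bandwidth=8e12)

print(f"4 TB/s bound: {baseline:.2f} tokens/s per sequence")
print(f"8 TB/s bound: {stacked:.2f} tokens/s per sequence")
```

In this simplified model the bound scales linearly with effective memory bandwidth, which is why designs that place memory closer to compute are attractive for inference. Real gains depend on factors the sketch leaves out.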
Team Expertise and Industry Connections
The current team of nearly 150 includes CTO Liu Ming, a Loongson veteran with more than six years of 3D IC experience, and chief scientist Lou Jianguang, a former Microsoft Research principal who collaborated with OpenAI on Excel NLP features and joined Suanmiao in September 2025. This combination of domestic chip design experience and international AI research expertise positions Suanmiao to bridge the gap between cutting-edge AI algorithms and practical hardware implementation.
Market Context and Competitive Landscape
Suanmiao's emergence reflects China's broader push for semiconductor self-sufficiency amid ongoing technology restrictions and export controls. The focus on 100% domestic production addresses both strategic autonomy concerns and potential supply chain vulnerabilities.
However, competing with established players like Nvidia presents significant challenges. While pre-silicon simulations show promising performance metrics, the real test will come when the A4 chip moves into production and faces real-world workloads. The company must also build out software ecosystems, developer tools, and customer relationships to challenge Nvidia's entrenched position in the AI hardware market.
Implications for the AI Hardware Market
If successful, Suanmiao's approach could demonstrate that specialized 3D-stacked architectures can outperform general-purpose GPUs for specific AI workloads. This could accelerate a trend toward domain-specific AI accelerators and reduce dependence on a single vendor for critical AI infrastructure.
The company's progress will be closely watched by both domestic Chinese tech companies seeking alternatives to foreign hardware and international observers tracking China's semiconductor capabilities. The transition from a successful Ethereum mining chip to AI inference hardware represents an interesting case study in technology repurposing and market adaptation.
The next 12-18 months will be critical as Suanmiao moves from simulation to silicon, with the success of its A4 chip potentially reshaping competitive dynamics in the AI hardware space.
