Zhongke Diwuji’s Series A: What the Funding Actually Means for Embodied AI
Zhongke Diwuji announced a multi‑hundred‑million‑yuan Series A round aimed at scaling its few‑shot physical AI models. The money will fund model upgrades, reinforcement‑learning pipelines, robot manufacturing, and hiring, but the path from prototype to reliable industrial deployment remains steep.

Zhongke Diwuji, a Beijing‑based startup that builds what it calls few‑shot embodied AI models for robots, disclosed a Series A round worth “several hundred million yuan.” The round was led by Futi Capital and included a mix of state‑backed investors such as Shanghai Semiconductor Industry Investment and CAS Investment. Existing backer Zoyuan Asia also increased its stake.
What the press release claims
- The capital will accelerate development of the FAM (Few‑shot Adaptation Model) series and a next‑generation world model named BridgeV2W.
- Reinforcement‑learning (RL) research will be pushed into production‑grade pipelines.
- Manufacturing capacity and international sales teams will be expanded.
- The company positions itself as a “universal physical AI” provider for industrial robotics, already holding “several hundred million yuan” in overseas purchase orders.
What is actually new?
A modest step in model size, not a leap in capability
The FAM line appears to be an incremental scaling of the original few‑shot model introduced in late 2024. Benchmarks published by the company show a 12% improvement on the Real‑World Manipulation (RWM) suite over the first‑generation model, but the absolute success rate remains around 45% on tasks that require fine force control. By contrast, Google's open‑source RT‑1 model still outperforms FAM on comparable metrics, albeit with a much larger training budget.
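A 12% improvement can be read two ways, and the difference matters when the absolute success rate is only around 45%. The sketch below backs out the implied first‑generation success rate under each reading. It assumes, purely for illustration, that the improvement and the 45% figure refer to the same set of tasks, which the published material does not confirm; the function name is hypothetical.

```python
def implied_baseline(current_rate, improvement, relative):
    """Back out the earlier success rate from the current one.

    relative=True  reads the improvement as a relative gain (new = old * (1 + g)).
    relative=False reads it as percentage points      (new = old + g).
    """
    if relative:
        return current_rate / (1.0 + improvement)
    return current_rate - improvement


# Figures quoted in the text; applying them to the same task set is an assumption.
current_rate = 0.45
improvement = 0.12

relative_baseline = implied_baseline(current_rate, improvement, relative=True)
absolute_baseline = implied_baseline(current_rate, improvement, relative=False)

print(f"Relative reading: first generation at about {relative_baseline:.1%}")
print(f"Percentage-point reading: first generation at about {absolute_baseline:.1%}")
```

Under the relative reading the earlier model would have succeeded roughly 40% of the time; under the percentage‑point reading, 33%. Either way the task still fails more often than it succeeds.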
BridgeV2W is a re‑branding of an existing world‑model pipeline
BridgeV2W is described as a “next‑generation embodied world model.” In the technical brief posted on the company’s GitHub (see the bridgev2w repo), the architecture is a transformer‑based visual‑language model that ingests RGB‑D streams and predicts affordances. The codebase is a fork of the Perceiver‑IO framework, with a few custom heads for force‑feedback. No novel training objective is introduced; the improvement comes from a larger pre‑training dataset (≈2 B frames vs. 1.2 B in the prior version). In practice, that translates to a ~3 % reduction in sample complexity for a handful of benchmark tasks.
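To put the data scaling in proportion, the arithmetic can be sketched as follows. The frame counts and the 3% figure come from the paragraph above; the 100‑demonstration baseline is a hypothetical number chosen only to make the reduction concrete.

```python
# Pre-training dataset sizes quoted in the text (frames).
frames_prior = 1.2e9
frames_new = 2.0e9

# Fractional growth of the dataset between versions.
data_growth = frames_new / frames_prior - 1.0

# Reported reduction in sample complexity on a handful of benchmark tasks.
sample_complexity_reduction = 0.03

# Hypothetical task that needed 100 demonstrations with the prior version.
demos_before = 100
demos_after = demos_before * (1.0 - sample_complexity_reduction)

print(f"Pre-training data grew by about {data_growth:.0%}")
print(f"Demonstrations needed: {demos_before} -> about {demos_after:.0f}")
```

Roughly two‑thirds more pre‑training data buys about three fewer demonstrations per hundred on this hypothetical task, which is why the gain reads as incremental.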
Reinforcement learning is still a research‑grade component
The announcement promises to move RL from the lab to “production‑grade engineering.” The only concrete artifact is a technical note on a distributed PPO implementation that runs on a 64‑GPU cluster. The note admits that the system still suffers from high variance and requires frequent manual curriculum adjustments. Without a clear safety‑critical rollout plan, the claim of production readiness feels premature.
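For readers unfamiliar with PPO, the core of the algorithm is a clipped surrogate objective that limits how far each update can move the policy. The following is a minimal sketch of the standard textbook objective in plain Python. It is not the company's distributed implementation, and the function name and the default clip value of 0.2 are assumptions made for illustration.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate loss (to be minimized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)  # probability ratio pi_new / pi_old
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the clipped and unclipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return -total / len(advantages)


# A large policy shift (ratio = 2) on a positive advantage is capped at 1 + clip_eps.
capped_loss = ppo_clip_loss([math.log(2.0)], [0.0], [1.0])
print(capped_loss)
```

The clipping limits the size of each policy step, but it does nothing about noisy advantage estimates or reward design, which is consistent with the note's admission that variance and manual curriculum tuning remain problems.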
Limitations that the funding does not erase
- Data collection bottleneck – Few‑shot learning reduces the number of demonstrations per task, but each demonstration still requires a calibrated sensor suite and a controlled environment. Scaling this to thousands of new tasks across diverse factories will need a massive logistics effort that the current funding round only partially addresses.
- Simulation‑to‑real gap – The company relies heavily on a proprietary simulator for pre‑training. Prior work (e.g., OpenAI’s Dactyl experiments) shows that even with domain randomization, transfer to real hardware can lose 15‑20 % of performance. No evidence is provided that BridgeV2W closes this gap.
- Hardware integration – The announced expansion of manufacturing capacity is vague. Building a reliable robot arm with integrated force‑torque sensors, high‑bandwidth communication, and safety certifications is a multi‑year engineering program. The announcement does not explain how the new capital will accelerate that timeline.
- International sales pipeline – The claim of “several hundred million yuan” in purchase orders lacks detail. Publicly disclosed contracts from Chinese robotics firms (e.g., Horizon Robotics) typically involve pilot deployments rather than full‑scale rollouts. Without disclosed customer references, the commercial traction remains speculative.
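The simulation‑to‑real figure in the list above can be made concrete with a short calculation. This sketch treats the 15‑20% loss as a relative drop in success rate and uses a hypothetical simulator success rate of 80%; neither the interpretation nor the starting number comes from the company.

```python
def real_world_range(sim_success, loss_low=0.15, loss_high=0.20):
    """Expected real-hardware success range after a relative transfer loss."""
    worst = sim_success * (1.0 - loss_high)
    best = sim_success * (1.0 - loss_low)
    return worst, best


# Hypothetical simulator success rate, chosen only for illustration.
sim_success = 0.80
worst, best = real_world_range(sim_success)

print(f"Simulator: {sim_success:.0%}, real hardware: {worst:.0%} to {best:.0%}")
```

A policy that looks solid in simulation can therefore land well below the reliability that a production line would typically demand, even before data collection and hardware issues are counted.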
Why the funding matters, but not as a panacea
The Series A shows that Chinese venture capital and state‑backed funds are still willing to bet on embodied AI, likely because the sector is seen as a strategic priority for manufacturing automation. The money will enable Zhongke Diwuji to hire more PhDs, expand its GPU cluster, and possibly start small‑scale production runs. However, the technical hurdles (data efficiency, sim‑to‑real transfer, safety‑critical RL, and hardware reliability) are not solved by cash alone.
Bottom line
Zhongke Diwuji’s latest financing round is a significant vote of confidence in the company’s incremental progress, but the announced advances are modest when measured against publicly available benchmarks. The real test will be whether the “few‑shot” models can consistently replace hand‑engineered pipelines in a live factory without extensive human supervision. Until robust field data is available, the Series A should be viewed as a step forward rather than a breakthrough.
For further reading:
- The company’s technical blog on BridgeV2W: https://zhongke-diwuji.com/blog/bridgev2w
- Open‑source baseline: https://github.com/google-research/rt-1
- Recent review of sim‑to‑real transfer in robotics: https://arxiv.org/abs/2409.11234
