China’s open-source AI shifts to MoE models and hardware-first deployment
Chinese open-source AI teams are standardizing on MoE architectures, permissive licenses, and domestic-chip deployments, prioritizing system design over raw benchmark wins.

Key Takeaways
- Chinese open-source releases are standardizing on MoE to control inference cost while keeping capability competitive.
- Smaller 0.5B–30B models are gaining share because they’re easier to run locally and integrate into products; larger models increasingly act as distillation teachers.
- Releases are becoming hardware-first, shipping with reproducible inference/serving stacks tuned for domestic chips like Ascend, Cambricon, and Kunlun.
- Open-source competition is expanding beyond text into video, audio, and 3D toolchains that matter for production media workflows.
- Compute shortages are constraining availability, with Zhipu's usage restrictions publicly reported.
China’s open-source AI ecosystem is starting to look less like a model zoo and more like an industrial stack: architectures optimized for cost, licenses optimized for adoption, and deployment paths optimized for local chips.
MoE architectures and small models become the practical default
A clear pattern across recent Chinese releases is the move toward Mixture-of-Experts (MoE) designs from teams including Moonshot AI, MiniMax, and Alibaba’s Qwen line. MoE routes each request to a subset of “experts,” so inference doesn’t have to activate the full model every time—useful when compute budgets and hardware vary across customers.
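The routing idea above can be sketched in a few lines. This is a minimal, illustrative top-k gating loop (not any specific vendor's implementation): a gate scores all experts, only the k highest-scoring ones run, and their outputs are blended by softmax weight. All names and shapes here are hypothetical.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Toy top-k MoE forward pass (illustrative only).

    x       : (d,) input vector
    gate_w  : (d, n_experts) gating weights
    experts : list of callables, each mapping (d,) -> (d,)
    k       : number of experts activated per token
    """
    logits = x @ gate_w                    # gate score for each expert
    top = np.argsort(logits)[-k:]          # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts only
    # Only these k experts execute; the rest stay idle — that is the cost saving.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n = 8, 4
gate_w = rng.normal(size=(d, n))
# Each expert is a distinct random linear map (bound via default arg).
experts = [lambda x, W=rng.normal(size=(d, d)): W @ x for _ in range(n)]
y = moe_forward(rng.normal(size=d), gate_w, experts, k=2)
```

With k=2 of 4 experts active, only half the expert parameters are touched per token, which is why MoE decouples total capacity from per-request compute.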
At the same time, community gravity is pulling toward smaller models (roughly 0.5B to 30B parameters) because they’re easier to run locally, fine-tune, and ship inside products. Larger MoE models still matter, but increasingly as “teacher” systems used for distillation—compressing capabilities into smaller, cheaper models that better fit production constraints.
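The teacher-to-student compression mentioned above typically uses a knowledge-distillation objective: the student matches the teacher's temperature-softened output distribution. A minimal sketch, with illustrative names and toy logits rather than any released model's recipe:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T softens the distribution."""
    z = z / T
    z = z - z.max()          # numerical stability
    p = np.exp(z)
    return p / p.sum()

def distill_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on softened logits, scaled by T^2
    (the standard knowledge-distillation term)."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q))) * T * T)

t = np.array([2.0, 0.5, -1.0])
loss_same = distill_loss(t, t)        # identical logits: loss is zero
loss_diff = distill_loss(t, t[::-1])  # mismatched logits: positive loss
```

Minimizing this loss over the teacher's outputs transfers the large MoE's behavior into a small dense model cheap enough to run locally.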
Hardware-first open source expands beyond text into multimodal
The second major shift is packaging: releases now arrive with serving engines, quantization formats, and reproducible inference paths aimed at specific domestic accelerators. DeepSeek-V3.2-Exp, for example, shipped with day-zero inference support for Huawei Ascend and Cambricon, signaling that “downloadable weights” is no longer the end goal.
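The quantization formats bundled with these releases all rest on the same basic move: storing weights in low-bit integers plus a scale factor. A generic symmetric int8 sketch (real serving stacks use per-channel or block-wise variants; this is an assumption-laden toy, not any stack's actual format):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization (illustrative sketch)."""
    scale = np.abs(w).max() / 127.0      # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from int8 values and scale."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).normal(size=(4, 4)).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = np.abs(w - w_hat).max()            # bounded by half a quantization step
```

Cutting weights from 16-bit floats to 8-bit integers halves memory traffic, which is exactly the kind of tuning knob that hardware-specific serving paths for Ascend or Cambricon expose.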
This packaging shift is converging with a multimodal sprint. Chinese teams are open-sourcing competitive audio, image, video, and 3D components—relevant for marketers watching video AI pipelines and creative tooling. Tencent’s Hunyuan Video and Hunyuan 3D projects, plus StepFun’s audio and video work, illustrate the ecosystem’s move from text-only to full-funnel media generation and agent workflows.
Training disclosures are also becoming more explicit: Baidu documented training Qianfan-VL on over 5,000 Kunlun P800 chips, and Zhipu’s constraints have surfaced publicly amid a broader compute crunch, with usage restrictions reported by SCMP.
Conclusion: For B2B operators, the signal is pragmatic. China’s open-source race is increasingly about deployable systems—MoE + distillation + permissive licensing + hardware-aligned tooling—rather than chasing a single “best model.”
