China AI Native Industry Insights – 20260624 – ByteDance | Alibaba | Baichuan AI | more

Explore Volcano Engine’s Doubao models, Qwen-AgentWorld, Baichuan M4, and HappyHorse 1.1. Discover more in today’s China AI Native Industry Insights.
1. Volcano Engine FORCE Conference: ByteDance Launches Five Doubao Models, Led by Seed 2.1 and Seedance 2.5
At the Volcano Engine FORCE Conference, ByteDance unveiled five major Doubao AI models: Doubao Seed 2.1 Pro, the video generation model Seedance 2.5, Seedance 2.0 4K Edition, the image model Seedream 5.0 Pro, and Audio Generation Model 1.0. The Doubao model family now processes over 180 trillion tokens daily and holds a 49.5% share of China’s public cloud MaaS market. Seed 2.1 Pro delivers major upgrades in coding, agent, and multimodal capabilities at roughly 80% lower cost than Claude Opus 4.6. Seedance 2.5 supports native 30-second single-shot video generation and up to 50 multimodal reference inputs. Seedream 5.0 Pro introduces interactive editing and multi-layer separation. The new Audio Generation Model 1.0 can generate film-grade audio from a single prompt. Volcano Engine also announced a full upgrade to its agent infrastructure, launching HiAgent 3.0 and the AI Trust security framework to integrate agents into core enterprise workflows.
Read more: https://mp.weixin.qq.com/s/Vnv68cHAWfcX2CnszWR6Qg
Video Credit: The original article
2. Qwen-AgentWorld Open-Sourced: The First Native Language World Model That Teaches Agents to Predict Before Acting
Alibaba’s Qwen team has open-sourced Qwen-AgentWorld, the first language world model natively designed for agent environment modeling. Rather than learning through costly trial-and-error in real environments, the model trains AI to predict how an environment will respond to actions before executing them. Built on over 10 million real interaction trajectories through a three-stage CPT → SFT → RL pipeline, a single model covers seven domains: MCP, Search, Terminal, SWE, Web, OS, and Android. On AgentWorldBench, the 397B version scores 58.71, surpassing GPT-5.4 (58.25), while the 35B version outperforms Claude Sonnet 4.6 (56.04 vs 56.39). The team validates two complementary paradigms: first, as a decoupled environment simulator for RL training, where controllable Sim RL outperforms Real RL on WideSearch (F1: 50.3% vs 45.6%); second, as a unified agent foundation model, where LWM warm-up training transfers to 7 benchmarks across 5 domains — including 3 completely unseen during training — without any additional fine-tuning. The model and AgentWorldBench benchmark are fully open-sourced on HuggingFace, ModelScope, and GitHub.
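The "predict before acting" idea can be illustrated with a toy sketch. This is not Qwen-AgentWorld's API: the `toy_world_model` function, the `choose_action` helper, and the terminal-style commands are all hypothetical stand-ins, assuming only that a world model maps a state and a candidate action to a predicted observation.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interface: (state, action) -> predicted observation text.
WorldModel = Callable[[str, str], str]


def toy_world_model(state: str, action: str) -> str:
    """Toy predictor for a terminal-like environment (illustration only)."""
    if action.startswith("rm "):
        return "file deleted"
    if action.startswith("ls"):
        return "listing: notes.txt"
    return "unknown"


def choose_action(
    state: str,
    candidates: List[str],
    predict: WorldModel,
    is_acceptable: Callable[[str], bool],
) -> Tuple[Optional[str], Optional[str]]:
    """Return the first candidate whose predicted outcome is acceptable.

    Nothing is executed in the real environment; only predictions are compared.
    """
    for action in candidates:
        predicted = predict(state, action)
        if is_acceptable(predicted):
            return action, predicted
    return None, None


# Usage: reject any action predicted to delete a file.
action, predicted = choose_action(
    "cwd=/tmp",
    ["rm notes.txt", "ls"],
    toy_world_model,
    lambda obs: "deleted" not in obs,
)
print(action, "->", predicted)
```

In this sketch the agent screens candidate actions against predicted outcomes instead of trying each one in a live environment, which is the cost saving the paragraph above describes.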
Read more: https://mp.weixin.qq.com/s/NV9WGpGsfFz35jww5agM9g
Video Credit: The original article
3. Baichuan Launches M4 Medical AI Model: Triple HealthBench World #1, 3.3% Hallucination Rate, and Proactive Clinical Questioning
Baichuan Intelligence, in collaboration with Tsinghua University, has released Baichuan-M4, a next-generation medical AI model that ranks first globally across all three HealthBench leaderboards — overall, Hard, and Professional — with a composite score of 68.6, more than 10 points ahead of second-place GPT-5.5. Its factual hallucination rate of 3.3% is the lowest in the industry, compared to 3.8% for GPT-5.5, 6.9% for Claude Opus 4.7, and 9.8% for DeepSeek-V4-Pro. M4’s core breakthroughs span four clinically grounded capabilities: proactive inquiry, where the model asks follow-up questions like a real physician rather than waiting for complete information; full-episode memory, which connects historical records, multi-turn consultations, and medication responses across years, scoring 86.9 on long-context clinical memory benchmarks; evidence anchoring, where every medical conclusion maps to a specific paragraph in the original paper or guideline, achieving a citation precision of 90.0 versus GPT-5.5’s 54.7; and an Agent architecture via the proprietary Baichuan-Harness that autonomously orchestrates inquiry, memory, and evidence retrieval without manual instruction. M4 represents Baichuan’s sustained effort to close the gap between “answering medical questions” and “actually practicing medicine” — bringing quality clinical care within reach of every ordinary person.
Read more: https://mp.weixin.qq.com/s/WBWQFRH5d8z1MBvCMHSWDQ
Video Credit: Codex + Hyperframes
4. HappyHorse 1.1 Launches on Qianwen Cloud and Alibaba Cloud Bailian: Major Upgrades in Motion, Consistency, and Audio
HappyHorse 1.1, the latest version of the video generation model, is now available on the HappyHorse official website, Qianwen Cloud, and Alibaba Cloud Bailian, with a limited-time 40% discount. Compared to version 1.0, the update delivers improvements across five dimensions: enhanced motion dynamics, with smoother, more powerful movement in complex action scenes; multi-reference image-to-video (R2V) that now supports up to 9 character reference images simultaneously, improving consistency for e-commerce, multi-character dramas, and live-streaming content; stronger instruction-following for both concise and complex prompts; improved visual quality that reduces the “glossy” and “over-sharpened” look while preserving natural skin detail; and upgraded audio capabilities, with dynamic speech pacing, emotion-matched tone, and improved audio-visual sync. Pricing after the discount is 0.54 RMB/second for 720p and 0.72 RMB/second for 1080p.
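For readers budgeting a project, the per-second prices above translate into clip costs as follows. This is a rough sketch that assumes billing is linear in clip length and that the 40% discount applies uniformly to a per-second list price; the implied list prices it computes are derived from the quoted figures, not stated in the announcement, and the function names are illustrative.

```python
from decimal import Decimal

# Figures quoted in the announcement.
DISCOUNT = Decimal("0.40")
DISCOUNTED_RMB_PER_SECOND = {
    "720p": Decimal("0.54"),
    "1080p": Decimal("0.72"),
}


def clip_cost(resolution: str, seconds: int) -> Decimal:
    """Discounted cost in RMB, assuming linear per-second billing."""
    return DISCOUNTED_RMB_PER_SECOND[resolution] * Decimal(seconds)


def implied_list_price(resolution: str) -> Decimal:
    """Pre-discount price per second, assuming a uniform 40% discount."""
    return DISCOUNTED_RMB_PER_SECOND[resolution] / (Decimal("1") - DISCOUNT)


for res in ("720p", "1080p"):
    print(res, "10 s clip:", clip_cost(res, 10), "RMB;",
          "implied list price:", implied_list_price(res), "RMB/s")
```

Under these assumptions, a 10-second 1080p clip costs 7.20 RMB at the discounted rate, and the implied list prices are 0.90 RMB/second for 720p and 1.20 RMB/second for 1080p.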
Read more: https://mp.weixin.qq.com/s/0h2ChdG59DQcW9D5p4KUmQ
Video Credit: The original article
That’s all for today’s China AI Native Industry Insights. Join us at the AI Native Foundation Membership Dashboard for the latest insights on AI Native, or follow our LinkedIn account at AI Native Foundation and our Twitter account at AINativeF.