China AI Native Industry Insights – 20251217 – Tencent | ByteDance | Alibaba | more

Discover Tencent’s groundbreaking HY WorldPlay 1.5 for real-time interactivity, ByteDance’s Seedance 1.5 Pro with advanced narrative control, and Alibaba’s WanXiang 2.6 as China’s first role-playing video model. Explore these stories and more in today’s China AI Native Industry Insights.

1. Tencent Releases HY WorldPlay 1.5: Real-Time Interactive World Model

🔑 Key Details:
– Real-Time Exploration: Users can navigate AI-generated 3D scenes with keyboard, mouse, or controller, enabling game-like interaction.
– Prompt-to-World Creation: Worlds can be built from text or images, supporting rich scene prompts and text-triggered events.
– Breakthrough Memory: The model remembers and reconstructs previously visited 3D regions with geometric consistency.
– Multi-Perspective Views and Game-Style Use: Supports first- and third-person views, stylised environments, and event-triggered transitions.

💡 How It Helps:
– Developers: Offers an interactive simulation platform for embodied AI training and 3D space reasoning.
– Game Designers & Creators: Enables dynamic scene generation from natural language, ideal for game prototyping and immersive content.

🌟 Why It Matters:
Tencent HY WorldPlay 1.5 pushes the boundary of generative AI from static media to dynamic, explorable 3D worlds. This marks a critical step toward interactive, persistent AI-generated environments for gaming, VR, and simulation.

Original Chinese article: https://mp.weixin.qq.com/s/HcB_FhZSpcp3XD8XDlcY4Q

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FHcB_FhZSpcp3XD8XDlcY4Q

Video Credit: Tencent HY (@TencentHunyuan on X)

2. ByteDance Launches Seedance 1.5 Pro: Multimodal Audio-Visual Model with Narrative Control

🔑 Key Details:
– Joint Audio-Video Generation: Enables synchronized text-to-video (T2V) and image-to-video (I2V) creation with speech, motion, emotion, and ambient sound in one pass.
– Narrative-Centric Design: Improved semantic understanding supports coherent storytelling, dynamic camera motion, and character-driven sequences.
– Language and Dialect Support: Supports Mandarin, English, Japanese, Cantonese, Sichuan dialects, and more with accurate tone and expression.
– Cinematic Motion Mastery: Capable of long takes, Hitchcock zooms, and real-time tracking for professional-grade dynamic tension.
– Leading Benchmark Scores: High performance in instruction following, AV alignment, voice naturalness, and expressive fidelity.

💡 How It Helps:
– Filmmakers & Scripted Video Creators: Accelerates shot planning, emotional delivery, and scene continuity with expressive sound and motion.
– Animation & Ad Producers: Supports stylistic video generation with coordinated voiceover and cinematic structure.
– Game & Entertainment Developers: Creates immersive content with synced motion, ambient sound, and dialogue in multiple languages.

🌟 Why It Matters:
Seedance 1.5 Pro pushes beyond isolated clip generation into structured, narrative video creation. It marks a shift toward truly expressive, multimodal storytelling in AI, especially for language-rich and culturally nuanced content in the Chinese-speaking world.

Original Chinese article: https://mp.weixin.qq.com/s/C6YH3ifq7vJPjY_zSpWbfw

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FC6YH3ifq7vJPjY_zSpWbfw

Video Credit: The original article

3. Alibaba Unveils WanXiang 2.6: China’s First Video Model with Role-Playing

🔑 Key Details:
– Role-Playing Support: First in China to enable AI video generation with consistent voice and appearance from input references.
– Multi-Shot Video Generation: Converts simple prompts into coherent multi-shot scripts with stable subjects and scenes.
– Audio-Driven Generation: Combines user-provided text and voice to drive narrative video creation.
– Natural Audio-Visual Sync: Enhances realism in speech-driven video with improved voice texture and music synthesis.
– Longer Videos: Now supports up to 15-second videos for more complete storytelling.
– Advanced Image Generation: Integrates aesthetic style fusion, detailed realism, layout control, and logic-driven visual storytelling.

💡 How It Helps:
– Video Creators: Ideal for short film prototyping, lip-sync acting, and stylized multi-character scenes.
– Designers & Artists: Blend multiple styles, reference images, or scene elements into seamless new visuals with commercial-level consistency.
– Educators & Illustrators: Automatically generate full-length picture books and infographics with aligned text and images.

🌟 Why It Matters:
WanXiang 2.6 is now among the most feature-complete video generation models available. Its ability to merge video, audio, text, and visual design workflows makes it a powerful tool for AI-native content creators in entertainment, education, and marketing.

Original Chinese article: https://mp.weixin.qq.com/s/HU19meKxI2PDVYgXBNx5Qw

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FHU19meKxI2PDVYgXBNx5Qw

Video Credit: The original article

4. Jimeng Web Update: One-Stop AI Studio for Multimodal Creativity

🔑 Key Details:
– New Model: Video 3.5 Pro generates both video and audio, with improved realism in lip sync, actions, and lighting.
– Long-Take Workflow: Smart Multi-Frame 2.0 enables flexible scene stitching, frame-level editing, and prompt-based openings.
– Enhanced Image Generation: Image 4.1 strengthens design tasks; 4.5 boosts portrait quality, aesthetics, and editability.
– UI & Interaction Upgrade: Canvas offers visual asset management and modal switching; Agent enables inspiration search and goal-driven content generation.

💡 How It Helps:
– Creators: Produce lifelike AI short films or stylised animations with advanced video and image models.
– Designers & Brands: Generate eye-catching posters, illustrations, or mockups with more visual control and creative fidelity.
– Product Teams: Use canvas + agents to streamline multimodal workflows and enable efficient, AI-powered collaboration.

🌟 Why It Matters:
The Jimeng Web upgrade transforms the tool into a full-stack AI studio—merging generation, design, and inspiration into one seamless platform. It empowers creators across film, marketing, and design to turn bold ideas into finished work.

Original Chinese article: https://mp.weixin.qq.com/s/Szz86cppqABlYp45cezO4g

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FSzz86cppqABlYp45cezO4g

Video Credit: The original article

5. SenseTime Releases Xiaohuanxiong 3.0: AI Agent for High-Quality PPT Creation

🔑 Key Details
– Outcome-Oriented AI: Transforms vague ideas into finished PPTs with visual consistency and editable charts, text, and images—on both cloud and local devices.
– Deep Understanding: Supports long-chain reasoning and multimodal analysis with million-row data handling.
– Workflow Integration: Moves from single-point tools to cross-platform workflows; a mobile app has now launched, with >95% accuracy reported in enterprise settings.

💡 How It Helps
– Professionals: Produces draft-free reports and presentations usable directly in meetings.
– Enterprises: Accelerates internal analysis tasks by 90% with secure access to private and public data.
– Students & Educators: Enables fast knowledge structuring and visual storytelling; an educational edition has been donated to Zhejiang University Library.

🌟 Why It Matters
Xiaohuanxiong 3.0 marks a shift from file-driven to task-driven productivity. By rethinking the way we create, analyse, and collaborate, SenseTime positions AI not just as a tool—but as a native partner in every workflow.

Original Chinese article: https://mp.weixin.qq.com/s/hSl1ElDlQ_GCp5k0ZtNJkw

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FhSl1ElDlQ_GCp5k0ZtNJkw

Video Credit: The original article

6. Xiaomi MiMo-V2-Flash Open-Sourced: Maximising Speed, Minimising Cost for Agent & Code LLMs

🔑 Key Details
– 309B MoE model with 15B active parameters, designed for ultra-efficient inference.
– Achieves a top-2 ranking on global open-source agent benchmarks; coding ability rivals Claude 4.5 Sonnet.
– Introduces Hybrid Attention + Multi-layer MTP for high-speed generation (2× faster than Claude 4.5 at just 2.5% of the cost).
– Pioneers MOPD: Multi-Teacher On-Policy Distillation for low-resource reward training.
– Fully open-sourced on Hugging Face with SGLang inference support, delivering >15K tok/s decoding speed at 16K context.

💡 How It Helps
– Developers: Build high-performance agents and code tools with extreme throughput and low latency.
– Enterprises: Leverage MiMo’s cost-effective API (input ¥0.7 /M tokens, output ¥2.1 /M) to scale RL and multimodal workflows.
– Researchers: Explore new RLHF paradigms with token-level rewards and efficient on-policy training.
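To put the announced numbers in perspective, the pricing and decoding speed above (¥0.7/M input tokens, ¥2.1/M output tokens, >15K tok/s at 16K context) can be combined into a back-of-the-envelope estimate of what a workload would cost and how long decoding would take. The figures come from the announcement; the helper function itself is a hypothetical sketch, not part of any official MiMo SDK.

```python
# Back-of-the-envelope cost/latency estimate for MiMo-V2-Flash API usage,
# using the publicly announced figures. The function is an illustrative
# helper (an assumption), not an official client.

INPUT_PRICE_PER_M = 0.7    # ¥ per million input tokens (announced)
OUTPUT_PRICE_PER_M = 2.1   # ¥ per million output tokens (announced)
DECODE_TOK_PER_S = 15_000  # reported decoding speed at 16K context

def estimate_job(input_tokens: int, output_tokens: int) -> dict:
    """Return estimated cost (¥) and decode time (s) for one job."""
    cost = (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000
    decode_seconds = output_tokens / DECODE_TOK_PER_S
    return {"cost_yuan": round(cost, 4),
            "decode_seconds": round(decode_seconds, 3)}

# Example: a job with 1M input tokens and 200K output tokens
print(estimate_job(1_000_000, 200_000))
# → {'cost_yuan': 1.12, 'decode_seconds': 13.333}
```

Even a large job of this size lands at roughly one yuan in token cost, which is the practical meaning of the "2.5% of the cost" comparison made above.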

🌟 Why It Matters
MiMo-V2-Flash represents a new peak of open-source LLM efficiency. Xiaomi’s architectural innovations push the boundaries of inference speed, cost, and agent performance—setting a new standard for practical deployment of powerful models.

Original Chinese article: https://mp.weixin.qq.com/s/66xbk8hH71YAdi9u7gIMNw

English translation via free online service: https://mp-weixin-qq-com.translate.goog/s/66xbk8hH71YAdi9u7gIMNw?_x_tr_sl=auto&_x_tr_tl=en&_x_tr_hl=en&_x_tr_pto=wapp

Video Credit: The original article

That’s all for today’s China AI Native Industry Insights. Join us at the AI Native Foundation Membership Dashboard for the latest insights on AI Native, or follow our LinkedIn account at AI Native Foundation and our X (Twitter) account at AINativeF.

Copyright 2025 AI Native Foundation©. All rights reserved.