China AI Native Industry Insights – 20250630 – Kling AI | Alibaba | Tencent | more

Explore Kling AI’s latest lip sync enhancements, Alibaba’s new Qwen VLo model, and Tencent’s Hunyuan-A13B lightweight MoE model. Discover more in Today’s China AI Native Industry Insights.
1. Kling AI Upgrades Lip Sync Tool with Character Control and 60s Video Support
🔑 Key Details:
– Feature Update: Kling AI’s Lip Sync tool now supports character selection, timeline adjustments, and original audio retention.
– Extended Duration: Users can now create synced videos up to 60 seconds long, doubling previous limits.
– Flexible Editing: Fine-tune timing and choose characters to better match audio and visual intent.
– Audio Support: Keep original voice or music intact while enhancing visual sync.
💡 How It Helps:
– Creators: Produce more expressive, tailored lip-sync videos with greater control over timing and characters.
– Musicians: Sync vocals with animated characters for music videos or performance previews.
– Educators: Generate engaging language-learning content with accurate lip movements and audio.
– Marketers: Craft high-impact promos or explainers using voice-over and character sync.
🌟 Why It Matters:
With improved flexibility and longer durations, Kling AI’s enhanced Lip Sync tool empowers users to create higher-quality, more personalized content. By allowing original audio and character customization, it bridges the gap between expressive storytelling and efficient content creation—ideal for global creators aiming to scale multimedia output.
Original article: https://x.com/Kling_ai/status/1939528773436838102
Video Credit: Kling AI (@Kling_ai on X)
2. Alibaba Launches Qwen VLo: A Unified Multimodal Understanding and Generation Model
🔑 Key Details:
– New Model Release: Qwen VLo builds upon previous Qwen-VL models, enabling both image understanding and high-quality image generation/editing.
– Progressive Generation: Renders images gradually from left to right, top to bottom, allowing for real-time adjustments and optimizations during creation.
– Advanced Capabilities: Supports complex instructions, multi-image input/output, dynamic aspect ratios, and tasks like style transfer, object replacement, and visual perception.
– Multilingual Support: Handles instructions in multiple languages including Chinese and English.
💡 How It Helps:
– Content Creators: Offers precise image editing that maintains semantic consistency when modifying colors, styles, or elements.
– Designers: Supports generation of custom-sized visuals (aspect ratios ranging from 4:1 to 1:3) for various media formats.
– Marketers: Enables creation of complex visuals through simple language instructions, including poster generation and multi-element compositing.
– Developers: Provides analytical capabilities to interpret generated images, making it useful for both creation and understanding tasks.
🌟 Why It Matters:
Qwen VLo signals a leap in multimodal AI by integrating creation and perception in a single model. Its progressive generation method offers creators fine-grained control, while its understanding features enable contextual consistency. This dual-functionality opens new possibilities for generative design, education, and intelligent visual communication, reinforcing Alibaba’s position in the frontier of creative AI.
Original Chinese article: https://mp.weixin.qq.com/s/E655CRmdgd5bySyuHPDoEw
English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FE655CRmdgd5bySyuHPDoEw
Video Credit: Qwen (@Alibaba_Qwen on X)
3. Tencent Unveils Hunyuan-A13B: Lightweight Open-Source MoE Model with 80B Parameters
🔑 Key Details:
– New MoE Architecture: Hunyuan-A13B features 80B total parameters but only activates 13B during inference, balancing performance with efficiency.
– Minimal Hardware Requirements: In the most constrained setups, deployment requires just one mid-to-low-end GPU, significantly lowering the barrier to entry.
– Dual Thinking Modes: The model offers both fast- and slow-thinking modes, allocating compute based on task complexity.
– Strong Performance: Achieves leading results in mathematics, science, and logical reasoning benchmarks compared to similar models.
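The efficiency claim above rests on sparse Mixture-of-Experts routing: a router scores all experts, but each token passes through only the top-k of them, so most parameters sit idle on any given forward pass. Below is a minimal pure-Python sketch of top-k routing; the expert count, dimensions, and k are toy values for illustration, not Hunyuan-A13B's actual configuration.

```python
import math
import random

random.seed(0)

NUM_EXPERTS = 8   # toy values; the real model's counts differ
TOP_K = 2         # experts activated per token
D_MODEL = 8

def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(mat, vec):
    # mat is rows x len(vec); returns a rows-length vector
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]

# One tiny feed-forward "expert" per slot, plus a router that scores them.
experts = [rand_matrix(D_MODEL, D_MODEL) for _ in range(NUM_EXPERTS)]
router = rand_matrix(NUM_EXPERTS, D_MODEL)

def moe_forward(x):
    """Route one token vector through only its TOP_K best-scoring experts."""
    logits = matvec(router, x)
    top = sorted(range(NUM_EXPERTS), key=lambda i: logits[i])[-TOP_K:]
    mx = max(logits[i] for i in top)
    gates = [math.exp(logits[i] - mx) for i in top]
    total = sum(gates)
    gates = [g / total for g in gates]          # softmax over selected experts only
    out = [0.0] * D_MODEL
    for g, i in zip(gates, top):
        for j, val in enumerate(matvec(experts[i], x)):
            out[j] += g * val                   # gate-weighted expert outputs
    return out, top

token = [random.gauss(0, 1) for _ in range(D_MODEL)]
out, used = moe_forward(token)
print(len(out), sorted(used))   # only TOP_K of NUM_EXPERTS experts ran
```

At model scale, this same pattern is why 80B total parameters can cost only ~13B parameters' worth of compute per token.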
💡 How It Helps:
– Individual Developers: Provides access to advanced AI capabilities without requiring enterprise-level computational resources.
– Small Businesses: Enables AI implementation with substantially lower infrastructure costs while maintaining competitive performance.
– AI Researchers: Two new open-source benchmarks (ArtifactsBench and C3-Bench) fill evaluation gaps for code and agent scenarios.
– Application Builders: Agent capabilities support complex tool-use tasks such as travel planning and data analysis for practical implementations.
🌟 Why It Matters:
Hunyuan-A13B addresses a critical industry challenge by democratizing access to advanced AI. Through its innovative MoE architecture, Tencent has created a path for smaller players to leverage high-performance AI without prohibitive infrastructure investments. This represents a significant step toward making cutting-edge AI technology more inclusive and accessible beyond large corporations with extensive resources.
Original Chinese article: https://mp.weixin.qq.com/s/mWfUrWz7bc7f9RhnOltQOA
English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FmWfUrWz7bc7f9RhnOltQOA
Video Credit: The original article
4. Baidu Open-Sources ERNIE 4.5 Model Series with Multimodal Capabilities
🔑 Key Details:
– Model Release: Baidu has open-sourced 10 ERNIE 4.5 models of varying sizes, including MoE models with 47B and 3B activated parameters (the largest totaling 424B parameters) and a 0.3B dense model.
– Technical Innovation: Features a multimodal heterogeneous architecture with cross-modal parameter sharing that enhances both text and visual capabilities simultaneously.
– Development Framework: All models are built on the PaddlePaddle deep learning framework, achieving 47% Model FLOPs Utilization (MFU) during pre-training.
– Accessibility: Models available on Hugging Face, GitHub, and AiStudio under Apache 2.0 license, with comprehensive development tools.
💡 How It Helps:
– AI Researchers: Access to state-of-the-art models across multiple parameter sizes for academic exploration and benchmarking.
– Enterprise Developers: Complete toolkit with ERNIEKit for fine-tuning and FastDeploy for efficient multi-hardware deployment.
– Application Engineers: Ready-made integration with vLLM and OpenAI protocols for simplified implementation.
– Educators: Access to 20+ open courses and educational materials through PaddlePaddle’s ecosystem.
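Since the models advertise OpenAI-protocol compatibility via vLLM, integration amounts to sending a standard chat-completions request to a locally served checkpoint. The sketch below only constructs and prints the request payload to show the wire format; the endpoint port and model identifier are assumptions (vLLM's default route and a hypothetical checkpoint name), so substitute the checkpoint you actually serve.

```python
import json

# vLLM's default OpenAI-compatible route when serving locally (assumed port).
ENDPOINT = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "baidu/ERNIE-4.5-0.3B",  # hypothetical identifier; use your served checkpoint's name
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the ERNIE 4.5 release in one sentence."},
    ],
    "max_tokens": 128,
}

# In a live setup you would POST this JSON to ENDPOINT (e.g. with the
# `openai` client pointed at the local base URL); here we just show the shape.
print(json.dumps(payload, indent=2))
```

Because the request body is the standard OpenAI chat format, existing OpenAI-client code should work against the local server with only the base URL and model name changed.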
🌟 Why It Matters:
Baidu’s ERNIE 4.5 release advances open-source AI with strong multimodal capabilities and efficient MoE architecture. Cross-modal sharing improves performance across tasks, while accessible tooling and educational resources promote broader industry adoption. This move strengthens China’s role in global open AI development and lowers the barrier for real-world applications.
Original Chinese article: https://mp.weixin.qq.com/s/MflKTGJKvS2SZd8_MMIFxQ
English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FMflKTGJKvS2SZd8_MMIFxQ
Video Credit: The original article
That’s all for today’s China AI Native Industry Insights. Join us at the AI Native Foundation Membership Dashboard for the latest insights on AI Native, follow our LinkedIn account at AI Native Foundation, and our X (Twitter) account at AINativeF.