20241018 – Sana | IterComp | Hallo2 | RDT-1B | more

1. A 4090 laptop generates 1024×1024 images in just 0.37 seconds! NVIDIA collaborates with MIT and Tsinghua to unveil the Sana architecture, which outpaces FLUX in speed.

NVIDIA, in collaboration with MIT and Tsinghua University, has introduced the Sana architecture, which lets a laptop with a 4090 GPU generate a high-quality 1024×1024 image in just 0.37 seconds. According to the article, Sana runs up to 100 times faster than existing models such as FLUX while supporting resolutions up to 4K. Its core design centers on deep compression, efficient linear attention, and improved text-image alignment, with the aim of lowering the cost of content creation.

Original Chinese article: https://mp.weixin.qq.com/s/ZVQBSigc2VoW7qG30SekNQ

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FZVQBSigc2VoW7qG30SekNQ
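
For readers unfamiliar with the term, the sketch below illustrates the general idea behind linear attention, using a ReLU feature map as an assumed example. It is a generic textbook-style illustration in plain Python, not Sana's actual implementation, and the function names are hypothetical. The first function builds every query-key weight explicitly, which costs time quadratic in the number of tokens; the second summarizes the keys and values once and reuses that summary for every query, which costs time linear in the number of tokens and gives the same result.

```python
# Illustrative sketch only: generic ReLU-kernel linear attention.
# This is an assumption-based example, not code from the Sana project.

def relu(x):
    return x if x > 0.0 else 0.0

def quadratic_attention(Q, K, V):
    """Reference form: computes all N x N weights relu(q_i) . relu(k_j), then normalizes."""
    out = []
    for q in Q:
        fq = [relu(x) for x in q]
        weights = [sum(a * relu(b) for a, b in zip(fq, k)) for k in K]
        total = sum(weights) + 1e-9  # small constant avoids division by zero
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / total
                    for c in range(len(V[0]))])
    return out

def linear_attention(Q, K, V):
    """Same result, but K and V are summarized once, so cost grows linearly in N."""
    d, dv = len(K[0]), len(V[0])
    kv = [[0.0] * dv for _ in range(d)]  # sum over tokens of relu(k)^T v, shape d x dv
    k_sum = [0.0] * d                    # sum over tokens of relu(k), shape d
    for k, v in zip(K, V):
        for a in range(d):
            fk = relu(k[a])
            k_sum[a] += fk
            for c in range(dv):
                kv[a][c] += fk * v[c]
    out = []
    for q in Q:
        fq = [relu(x) for x in q]
        total = sum(a * b for a, b in zip(fq, k_sum)) + 1e-9
        out.append([sum(fq[a] * kv[a][c] for a in range(d)) / total
                    for c in range(dv)])
    return out
```

The two functions agree up to floating-point rounding. The saving matters most at high resolutions, where the number of image tokens is large.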

2. A more powerful text-to-image model than Flux is here! The secret lies in “drawing from a hundred schools of thought.”

A new text-to-image framework, IterComp, integrates the strengths of existing models such as FLUX, Stable Diffusion, and Omost to deliver substantial improvements in compositional generation. Developed by researchers from prestigious institutions, IterComp aligns the capabilities of these models without increasing computational complexity. Early results show it significantly outperforming its peers in image quality and aesthetics, suggesting it could serve as a strong backbone for improving other generation models.

Original Chinese article: https://mp.weixin.qq.com/s/K02rTzdXTkvCsyM9U5o-rA

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FK02rTzdXTkvCsyM9U5o-rA

3. Fudan University and Baidu join forces to create a new AI model, Hallo2, capable of generating 4K ultra-high-definition videos up to 1 hour long!

Researchers from Fudan University and Baidu have unveiled Hallo2, an AI model capable of generating 4K character animations up to an hour long, controllable through voice and text prompts. The technology significantly reduces the time and resources traditionally required to create high-quality animation, and it has shown strong performance on multiple datasets. The launch of Hallo2 sets a new benchmark in AI-driven animation, with potential applications in fields such as film, gaming, and virtual assistance.

Original Chinese article: https://www.aibase.com/zh/news/12533

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fwww.aibase.com%2Fzh%2Fnews%2F12533

4. Tsinghua has released RDT, the world’s largest dual-arm robot diffusion model, capable of cocktail mixing and dog walking, topping the Hugging Face trending list for embodied AI.

Tsinghua University has unveiled RDT, the largest bimanual robotics diffusion model in the world, capable of performing complex tasks autonomously, including bartending and dog-walking. Across seven challenging tasks, it achieves a 56% higher success rate than current models, marking a step toward more sophisticated robotic intelligence. RDT is open-sourced to accelerate advances in robotics research and industry.

Original Chinese article: https://mp.weixin.qq.com/s/LtuK9bN45Bkm2uDCi3cCvA

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FLtuK9bN45Bkm2uDCi3cCvA

5. Performance Comparable to SOTA, Computational Load Only Half of DiT! A New Paradigm for T2X Tasks is Here | Sun Yat-sen University & 360 AI Research.

Sun Yat-sen University and 360 AI Research have unveiled PT-DiT, a cutting-edge model that demonstrates performance on par with state-of-the-art models while reducing computational costs to only 51.4% of DiT and 17.5% of Lumina-Next. Designed for various tasks including text-to-image and text-to-video, PT-DiT utilizes a proxy token mechanism to enhance efficiency in visual generation while maintaining high fidelity in outputs. The research is set for open-source release, promising significant advancements in AI-driven visual content creation.

Original Chinese article: https://mp.weixin.qq.com/s/UUqtHn7f8zdeINA9eUNlFg

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FUUqtHn7f8zdeINA9eUNlFg

6. Egistec and Arm collaborate to create an efficient chiplet solution for the next generation of AI HPC servers.

Egistec has partnered with Arm to drive innovation in AI HPC chip technology, marking a significant step toward advanced computing solutions. This collaboration aims to enhance chiplet technology within high-performance computing servers, leveraging Arm’s design capabilities and Egistec’s expertise in connectivity IPs. Together, they are positioned to lead the development of competitive chip solutions, meeting the rapidly growing demand for efficient computing in AI applications.

Original Chinese article: https://www.prnasia.com/story/464419-1.shtml

English translation via free online service: https://www-prnasia-com.translate.goog/story/464419-1.shtml?_x_tr_sl=zh-CN&_x_tr_tl=en&_x_tr_hl=zh-CN&_x_tr_pto=wapp

7. Robin Li (Li Yanhong): In the next 5 to 10 years, generative AI will enable everyone to have programming capabilities.

During a recent dialogue, Baidu’s founder Robin Li discussed the transformative potential of generative AI, likening its impact to that of the Industrial Revolution. He emphasized that while AI will replace the most arduous jobs, it will simultaneously create new, more comfortable roles. Looking ahead, Li predicted that within the next decade, everyone will possess coding capabilities, fundamentally altering productivity and the workforce landscape.

Original Chinese article: https://finance.sina.com.cn/tech/discovery/2024-10-17/doc-incsvmwm3021232.shtml

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Ffinance.sina.com.cn%2Ftech%2Fdiscovery%2F2024-10-17%2Fdoc-incsvmwm3021232.shtml

8. Alibaba leads the investment, Tsinghua-backed humanoid robot company raises 300 million yuan.

Tsinghua University-backed humanoid robotics company “Robot Era” has successfully raised nearly 300 million RMB in its Pre-A financing round, led by Alibaba, with support from several notable investors. This funding will accelerate breakthroughs in embodied intelligence technology and enhance the commercialization of universal humanoid robots, highlighting the company’s rapid development since its establishment just one year ago. Robot Era not only achieved technological breakthroughs in the laboratory but also demonstrated powerful performance of its robots in complex outdoor environments.

Original Chinese article: https://mp.weixin.qq.com/s/YPw2JRFgO-GiSvRQLj9XYA

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fmp.weixin.qq.com%2Fs%2FYPw2JRFgO-GiSvRQLj9XYA

9. Qboson has raised hundreds of millions in Series A financing. How far are we from quantum computing?

Beijing Qboson Technology has secured hundreds of millions in Series A funding, marking its fifth round of financing since its inception in 2020. This capital will support the ongoing development of coherent photonic quantum computers, build a “quantum computing +” ecosystem, and advance general photonic quantum computing technologies. As quantum computing rapidly progresses, Qboson aims to lead the commercialization of specialized quantum applications across various industries.

Original Chinese article: https://www.jazzyear.com/article_info.html?id=1402

English translation via free online service: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&u=https%3A%2F%2Fwww.jazzyear.com%2Farticle_info.html%3Fid%3D1402

10. TSMC’s Q3 Net Profit Surges 54% to $10.1 Billion Amid AI-Driven Demand.

TSMC reported a Q3 2024 net profit of 325.3 billion New Taiwan dollars ($10.1 billion), up 54% year-over-year, driven by strong demand for high-performance chips. The surge in AI applications has significantly boosted the company’s growth, and the result exceeded market expectations. TSMC continues to expand production capacity to meet increasing global chip demand.


That’s all for today’s China AI Native Industry Insights. Join us at the AI Native Foundation Membership Dashboard for the latest insights on AI Native, or follow our LinkedIn account at AI Native Foundation and our Twitter account at AINativeF.


Copyright © 2024 AI Native Foundation. All rights reserved.