AI Native Daily Paper Digest – 20250509
1. Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models 🔑 Keywords: Large Multimodal Reasoning Models, Multimodal Reasoning, Cross-modal […]
AI Native Daily Paper Digest – 20250508
1. Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities 🔑 Keywords: multimodal understanding, image generation, autoregressive-based architectures, diffusion-based models, GPT-4o […]
AI Native Daily Paper Digest – 20250507
1. Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning 🔑 Keywords: multimodal Reward Models, CoT reasoning, UnifiedReward-Think, reinforcement fine-tuning 💡 Category: Reinforcement […]
AI Native Daily Paper Digest – 20250506
1. Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play 🔑 Keywords: Voice AI, AI Native, Low-latency conversations, Multilingual speech […]
AI Native Daily Paper Digest – 20250505
1. PixelHacker: Image Inpainting with Structural and Semantic Consistency 🔑 Keywords: Image Inpainting, Latent Categories Guidance, Diffusion-Based Model, PixelHacker, Linear Attention 💡 […]
AI Native Daily Paper Digest – 20250502
1. A Survey of Interactive Generative Video 🔑 Keywords: Interactive Generative Video, generative capabilities, interactive features, control signals, responsive feedback 💡 Category: […]
AI Native Daily Paper Digest – 20250501
1. Sadeed: Advancing Arabic Diacritization Through Small Language Model 🔑 Keywords: Arabic text diacritization, morphological richness, fine-tuned, benchmarking, SadeedDiac-25 💡 Category: Natural […]
AI Native Daily Paper Digest – 20250430
1. Reinforcement Learning for Reasoning in Large Language Models with One Training Example 🔑 Keywords: Reinforcement Learning, Large Language Models, Mathematical Reasoning, […]
AI Native Daily Paper Digest – 20250429
1. CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges 🔑 Keywords: Large Language Models, Cryptographic Reasoning, AI Native 💡 […]
AI Native Daily Paper Digest – 20250428
1. Towards Understanding Camera Motions in Any Video 🔑 Keywords: CameraBench, Structure-from-Motion, Video-Language Models, motion-augmented captioning 💡 Category: Computer Vision 🌟 Research […]
AI Native Daily Paper Digest – 20250425
1. Step1X-Edit: A Practical Framework for General Image Editing 🔑 Keywords: Image Editing, Multimodal Models, Step1X-Edit, GPT-4o, Gemini2 Flash 💡 Category: Multi-Modal […]
AI Native Daily Paper Digest – 20250424
1. VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models 🔑 Keywords: Visual reasoning, MLLMs, VisuLogic, human-verified problems, reinforcement-learning […]