AI Native Daily Paper Digest – 20250428
1. Towards Understanding Camera Motions in Any Video 🔑 Keywords: CameraBench, Structure-from-Motion, Video-Language Models, motion-augmented captioning 💡 Category: Computer Vision 🌟 Research […]
AI Native Daily Paper Digest – 20250425
1. Step1X-Edit: A Practical Framework for General Image Editing 🔑 Keywords: Image Editing, Multimodal Models, Step1X-Edit, GPT-4o, Gemini2 Flash 💡 Category: Multi-Modal […]
AI Native Daily Paper Digest – 20250424
1. VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models 🔑 Keywords: Visual reasoning, MLLMs, VisuLogic, human-verified problems, reinforcement-learning […]
AI Native Daily Paper Digest – 20250423
1. Kuwain 1.5B: An Arabic SLM via Language Injection 🔑 Keywords: Language Model Expansion, Large Language Model (LLM), Arabic Language, Benchmarks 💡 […]
AI Native Daily Paper Digest – 20250422
1. Learning to Reason under Off-Policy Guidance 🔑 Keywords: Large Reasoning Models (LRMs), Reinforcement Learning (RL), Zero-RL, Off-Policy, LUFFY 💡 Category: Reinforcement […]
AI Native Daily Paper Digest – 20250421
1. Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? 🔑 Keywords: Reinforcement Learning, Verifiable Rewards, reasoning capabilities, […]
AI Native Daily Paper Digest – 20250418
1. CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training 🔑 Keywords: CLIMB, semantic space, proxy model, ClimbLab, ClimbMix 💡 Category: […]
AI Native Daily Paper Digest – 20250417
1. ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness 🔑 Keywords: Vision-Language […]
AI Native Daily Paper Digest – 20250416
1. xVerify: Efficient Answer Verifier for Reasoning Model Evaluations 🔑 Keywords: reasoning models, complex reasoning, xVerify, equivalence judgment, VAR dataset 💡 Category: […]
AI Native Daily Paper Digest – 20250415
1. InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models 🔑 Keywords: InternVL3, multimodal pre-training paradigm, MLLM, V2PE, open-source MLLMs […]
AI Native Daily Paper Digest – 20250411
1. Kimi-VL Technical Report 🔑 Keywords: Mixture-of-Experts (MoE), Vision-Language Model (VLM), Multimodal Reasoning, Long Context Understanding, Reinforcement Learning (RL) 💡 Category: Multi-Modal […]
AI Native Daily Paper Digest – 20250408
1. SmolVLM: Redefining small and efficient multimodal models 🔑 Keywords: Vision-Language Models, Resource-Efficient, On-Device Applications, Tokenization, Multimodal Performance 💡 Category: Multi-Modal Learning […]