AI Native Daily Paper Digest – 20250717
1. Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs 🔑 Keywords: Reasoning-Enhanced RAG, RAG-Enhanced Reasoning, Synergized RAG-Reasoning, […]
AI Native Daily Paper Digest – 20250716
1. Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models 🔑 Keywords: Vision-Language Models, VLV auto-encoder, semantic understanding, fine-tuning, cost-efficiency 💡 Category: Multi-Modal […]
AI Native Daily Paper Digest – 20250715
1. SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation 🔑 Keywords: SpeakerVid-5M, audio-visual, virtual human, large-scale dataset, dyadic interaction […]
AI Native Daily Paper Digest – 20250714
1. Test-Time Scaling with Reflective Generative Model 🔑 Keywords: MetaStone-S1, Self-supervised Process Reward Model, Reflective Generative Model, Test Time Scaling, Scaling Law […]
AI Native Daily Paper Digest – 20250711
1. Scaling RL to Long Videos 🔑 Keywords: Vision-Language Models, Reinforcement Learning, Long Video QA, Multi-modal Reinforcement Sequence Parallelism, LongVideo-Reason 💡 Category: […]
AI Native Daily Paper Digest – 20250710
1. 4KAgent: Agentic Any Image to 4K Super-Resolution 🔑 Keywords: agentic super-resolution, Profiling, Perception Agent, Restoration Agent, low-level vision tasks 💡 Category: […]
AI Native Daily Paper Digest – 20250709
1. SingLoRA: Low Rank Adaptation Using a Single Matrix 🔑 Keywords: SingLoRA, Low-Rank Adaptation, parameter-efficient, fine-tuning, common sense reasoning 💡 Category: Foundations […]
AI Native Daily Paper Digest – 20250707
1. How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks 🔑 Keywords: Multimodal Foundation Models, prompt […]
AI Native Daily Paper Digest – 20250704
1. WebSailor: Navigating Super-human Reasoning for Web Agent 🔑 Keywords: WebSailor, LLM, proprietary agents, reasoning capabilities, complex information-seeking tasks 💡 Category: Reinforcement […]
AI Native Daily Paper Digest – 20250703
1. Kwai Keye-VL Technical Report 🔑 Keywords: Multimodal Large Language Models, short-video understanding, vision-language alignment 💡 Category: Multi-Modal Learning 🌟 Research Objective: […]
AI Native Daily Paper Digest – 20250702
1. GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning 🔑 Keywords: Vision-Language Model, Reinforcement Learning, Multimodal Reasoning, Curriculum Sampling, General-Purpose 💡 […]
AI Native Daily Paper Digest – 20250701
1. Ovis-U1 Technical Report 🔑 Keywords: Ovis-U1, multimodal understanding, text-to-image generation, image editing, diffusion-based visual decoder 💡 Category: Generative Models 🌟 Research […]