GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages 2024-11-01
Minimum Entropy Coupling with Bottleneck 2024-11-01
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation 2024-10-31
Decoding Reading Goals from Eye Movements 2024-10-31
ReferEverything: Towards Segmenting Everything We Can Speak of in Videos 2024-10-31
A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks 2024-10-31
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters 2024-10-31
Stealing User Prompts from Mixture of Experts 2024-10-31
AutoMIR: Effective Zero-Shot Medical Information Retrieval without Relevance Labels 2024-10-31
SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation 2024-10-31
On Memorization of Large Language Models in Logical Reasoning 2024-10-31
Toxicity of the Commons: Curating Open-Source Pre-Training Data 2024-10-31
CLEAR: Character Unlearning in Textual and Visual Modalities 2024-10-30
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions 2024-10-30
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization 2024-10-30
SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization 2024-10-30
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning 2024-10-30
Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Dataset 2024-10-30
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference 2024-10-30
Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning 2024-10-30