FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally 2024-09-13
Can OOD Object Detectors Learn from Foundation Models? 2024-09-13
PiTe: Pixel-Temporal Alignment for Large Video-Language Model 2024-09-13
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation 2024-09-12
MEDIC: Towards a Comprehensive Framework for Evaluating LLMs in Clinical Applications 2024-09-12
Agent Workflow Memory 2024-09-12
Gated Slot Attention for Efficient Linear-Time Sequence Modeling 2024-09-12
VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos 2024-09-12
Self-Harmonized Chain of Thought 2024-09-12
Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models 2024-09-12
MVLLaVA: An Intelligent Agent for Unified and Flexible Novel View Synthesis 2024-09-12
gsplat: An Open-Source Library for Gaussian Splatting 2024-09-12
Can Large Language Models Unlock Novel Scientific Research Ideas? 2024-09-12
ProteinBench: A Holistic Evaluation of Protein Foundation Models 2024-09-12
Generative Hierarchical Materials Search 2024-09-12
Instant Facial Gaussians Translator for Relightable and Interactable Facial Rendering 2024-09-12
SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories 2024-09-12
GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering 2024-09-11
LLaMA-Omni: Seamless Speech Interaction with Large Language Models 2024-09-11
INTRA: Interaction Relationship-aware Weakly Supervised Affordance Grounding 2024-09-11