LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation 2024-11-11
Balancing Pipeline Parallelism with Vocabulary Parallelism 2024-11-11
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images 2024-11-11
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding 2024-11-11
DELIFT: Data Efficient Language model Instruction Fine Tuning 2024-11-11
Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study 2024-11-11
RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models 2024-11-11
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities 2024-11-11
Improving the detection of technical debt in Java source code with an enriched dataset 2024-11-11
CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM 2024-11-11
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models 2024-11-08
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning 2024-11-08
BitNet a4.8: 4-bit Activations for 1-bit LLMs 2024-11-08
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion 2024-11-08
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models 2024-11-08
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation 2024-11-08
Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model 2024-11-08
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos 2024-11-08
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks? 2024-11-08
RetrieveGPT: Merging Prompts and Mathematical Models for Enhanced Code-Mixed Information Retrieval 2024-11-08