Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation 2024-10-21
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples 2024-10-21
MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models 2024-10-21
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs 2024-10-21
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model 2024-10-21
Mini-Omni2: Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities 2024-10-21
Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion 2024-10-21
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation 2024-10-21
Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts 2024-10-21
DPLM-2: A Multimodal Diffusion Protein Language Model 2024-10-21
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer 2024-10-21
How Do Training Methods Influence the Utilization of Vision Models? 2024-10-21
Looking Inward: Language Models Can Learn About Themselves by Introspection 2024-10-21
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement 2024-10-21
Context is Key(NMF): Modelling Topical Information Dynamics in Chinese Diaspora Media 2024-10-21
SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments 2024-10-21
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities 2024-10-21
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning 2024-10-21
Teaching Models to Balance Resisting and Accepting Persuasion 2024-10-21
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures 2024-10-18