Vista3D: Unravel the 3D Darkside of a Single Image 2024-09-19
Towards Diverse and Efficient Audio Captioning via Diffusion Models 2024-09-19
SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer 2024-09-19
RoMath: A Mathematical Reasoning Benchmark in Romanian 2024-09-19
CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark 2024-09-19
fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction 2024-09-19
Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning 2024-09-19
BERT-VBD: Vietnamese Multi-Document Summarization Framework 2024-09-19
Measuring Human and AI Values based on Generative Psychometrics with Large Language Models 2024-09-19
OmniGen: Unified Image Generation 2024-09-18
NVLM: Open Frontier-Class Multimodal LLMs 2024-09-18
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think 2024-09-18
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion 2024-09-18
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models 2024-09-18
OSV: One Step is Enough for High-Quality Image to Video Generation 2024-09-18
A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language Models: An Experimental Analysis up to 405B 2024-09-18
Agile Continuous Jumping in Discontinuous Terrains 2024-09-18
EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer 2024-09-18
On the limits of agency in agent-based models 2024-09-18
SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction 2024-09-18