HF Papers

MAOAM: Unified Object and Material Selection with Vision-Language Models

2026-06-05

The Shadow Price of Reasoning: Economic Perspective on Optimal Budget Allocation for LLMs

2026-06-05

World-Language-Action Model for Unified World Modeling, Language Reasoning, and Action Synthesis

2026-06-05

Imagine Before You Predict: Interleaved Latent Visual Reasoning for Video Event Prediction

2026-06-05

Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents

2026-06-05

Towards One-to-Many Temporal Grounding

2026-06-05

MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery

2026-06-05

Learning Geometric Representations from Videos for Spatial Intelligent Multimodal Large Language Models

2026-06-05

AffordanceVLA: A Vision-Language-Action Model Empowering Action Generation through Affordance-Aware Understanding

2026-06-05

Towards Truly Multilingual ASR: Generalizing Code-Switching ASR to Unseen Language Pairs

2026-06-05

SePO: Self-Evolving Prompt Agent for System Prompt Optimization

2026-06-05

Flash-WAM: Modality-Aware Distillation for World Action Models

2026-06-05

SEAOTTER: Sensor Embedded Autoencoding with One-Time Transcode for Efficient Reconstruction

2026-06-05

AdaCodec: A Predictive Visual Code for Video MLLMs

2026-06-05

MechVQA: Benchmarking and Enhancing Multimodal LLMs on Comprehensive Mechanical Drawing Understanding

2026-06-05

Revising Context, Shifting Simulated Stance: Auditing LLM-Based Stance Simulation in Online Discussions

2026-06-05

Latent Reasoning with Normalizing Flows

2026-06-05

EvoDS: Self-Evolving Autonomous Data Science Agent with Skill Learning and Context Management

2026-06-05

Regret Minimization with Adaptive Opponents in Repeated Games

2026-06-05

AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents

2026-06-05