Paper Archive

HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation

9.0/10

Xin Zhou, Dingkang Liang, Xiwu Chen, Feiyang Tan, Dingyuan Zhang, Hengshuang Zhao, Xiang Bai 4/30/2026 arxiv

machine learning

Driving world models serve as a pivotal technology for autonomous driving by simulating environmental dynamics. However, existing approaches predominantly focus on future scene generation, often overlooking comprehensive 3D scene understanding. Conversely, while Large Language Models (LLMs) demonstr...

Keywords: HERMES++, driving world model, BEV representation, LLM-enhanced queries, Current-to-Future Link, Joint Geometric Optimization, 3D scene understanding, future point cloud prediction

OmniRobotHome: A Multi-Camera Platform for Real-Time Multiadic Human-Robot Interaction

9.0/10

Junyoung Lee, Sookwan Han, Jeonghwan Kim, Inhee Lee, Mingi Choi, Jisoo Kim, Wonjung Woo, Hanbyul Joo 4/30/2026 arxiv

machine learning

Human-robot collaboration has been studied primarily in dyadic or sequential settings. However, real homes require multiadic collaboration, where multiple humans and robots share a workspace, acting concurrently on interleaved subtasks with tight spatial and temporal coupling. This regime remains un...

Keywords: room-scale perception, multi-robot, human-robot interaction, markerless tracking, occlusion robustness, behavior modeling, Franka, real-time

LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models

9.0/10

Hao Chen, Jiaming Liu, Zhonghao Yan, Nuowei Han, Renrui Zhang, Chenyang Gu, Jialin Gao, Ziyu Guo, Siyuan Qian, Yinxi Wang, Peng Jia, Chi-Wing Fu, Shanghang Zhang, Pheng-Ann Heng 4/30/2026 arxiv

robotics

Vision-Language-Action (VLA) models have increasingly incorporated reasoning mechanisms for complex robotic manipulation. However, existing approaches share a critical limitation: whether employing explicit linguistic reasoning that suffers from latency and discretization, or utilizing more expressi...

Keywords: LaST-R1, Latent Chain-of-Thought, LAPO, Vision-Language-Action, reinforcement learning, robotic manipulation, adaptive reasoning, LIBERO benchmark

Representation Fréchet Loss for Visual Generation

9.0/10

Jiawei Yang, Zhengyang Geng, Xuan Ju, Yonglong Tian, Yue Wang 4/30/2026 arxiv

computer vision

We show that Fréchet Distance (FD), long considered impractical as a training objective, can in fact be effectively optimized in the representation space. Our idea is simple: decouple the population size for FD estimation (e.g., 50k) from the batch size for gradient computation (e.g., 1024). We term...

Keywords: Fréchet Distance, FD-loss, Fréchet Inception Distance, FDr^k, representation space, one-step generator, ImageNet, generative models

Computing Equilibrium beyond Unilateral Deviation

9.0/10

Mingyang Liu, Gabriele Farina, Asuman Ozdaglar 4/30/2026 arxiv

machine learning

Most familiar equilibrium concepts, such as Nash and correlated equilibrium, guarantee only that no single player can improve their utility by deviating unilaterally. They offer no guarantees against profitable coordinated deviations by coalitions. Although the literature proposes solution concepts ...

Keywords: Nash equilibrium, correlated equilibrium, coalitional deviations, strong Nash, coalition-proof equilibrium, exploitability, Exploitability Welfare Frontier, algorithmic game theory

Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

9.0/10

Keming Wu, Zuhao Yang, Kaichen Zhang, Shizun Wang, Haowei Zhu, Sicong Leng, Zhongyu Yang, Qijie Wang, Sudong Wang, Ziting Wang, Zili Wang, Hui Zhang, Haonan Wang, Hang Zhou, Yifan Pu, Xingxuan Li, Fangneng Zhan, Bo Li, Lidong Bing, Yuxin Song, Ziwei Liu, Wenhu Chen, Jingdong Wang, Xinchao Wang, Xiaojuan Qi, Shijian Lu, Bin Wang 4/30/2026 arxiv

machine learning

Recent visual generation models have made major progress in photorealism, typography, instruction following, and interactive editing, yet they still struggle with spatial reasoning, persistent state, long-horizon consistency, and causal understanding. We argue that the field should move beyond appea...

Keywords: visual generation, agentic generation, world modeling, flow matching, unified models, visual representations, reward modeling, data curation

Exploration Hacking: Can LLMs Learn to Resist RL Training?

9.0/10

Eyon Jang, Damon Falck, Joschka Braun, Nathalie Kirch, Achu Menon, Perusha Moodley, Scott Emmons, Roland S. Zimmermann, David Lindner 4/30/2026 arxiv

machine learning

Reinforcement learning (RL) has become essential to the post-training of large language models (LLMs) for reasoning, agentic capabilities and alignment. Successful RL relies on sufficient exploration of diverse actions by the model during training, which creates a potential failure mode: a model cou...

Keywords: exploration hacking, LLM, reinforcement learning, post-training, model organisms, mitigation, detection, AI safety

Synthetic Computers at Scale for Long-Horizon Productivity Simulation

9.0/10

Tao Ge, Baolin Peng, Hao Cheng, Jianfeng Gao 4/30/2026 arxiv

machine learning

Realistic long-horizon productivity work is strongly conditioned on user-specific computer environments, where much of the work context is stored and organized through directory structures and content-rich artifacts. To scale synthetic data creation for such productivity scenarios, we introduce Synt...

Keywords: synthetic computers, long-horizon simulation, productivity agents, synthetic data, personas, agentic RL, filesystem grounding, multi-agent simulation

An adaptive wavelet-based PINN for problems with localized high-magnitude source

9.0/10

Himanshu Pandey, Ratikanta Behera 4/30/2026 arxiv

machine learning

In recent years, physics-informed neural networks (PINNs) have gained significant attention for solving differential equations, although they suffer from two fundamental limitations, namely, spectral bias inherent in neural networks and loss imbalance arising from multiscale phenomena. This paper pr...

Keywords: AW-PINN, wavelets, physics-informed neural network, PINN, spectral bias, loss imbalance, Gaussian process, NTK

Stop Holding Your Breath: CT-Informed Gaussian Splatting for Dynamic Bronchoscopy

9.0/10

Andrea Dunn Beltran, Daniel Rho, Aarav Mehta, Xinqi Xiong, Raúl San José Estépar, Ron Alterovitz, Marc Niethammer, Roni Sengupta 4/30/2026 arxiv

machine learning

Bronchoscopic navigation relies on registering endoscopic video to a preoperative CT scan, but respiratory motion deforms the airway by 5-20 mm, creating CT-to-body divergence that limits localization accuracy. In practice, this is mitigated through breath-hold protocols, which attempt to match the ...

Keywords: bronchoscopy, Gaussian splatting, respiratory motion, CT-to-body divergence, patient-specific modeling, breathing phase estimation, RESPIRE, medical imaging

Papers for May 1, 2026
