Paper Archive

The Value Axis: Language Models Encode Whether They're on the Right Track

5.0/10

Nick Jiang, Isaac Kauvar, Jack Lindsey 6/15/2026 arxiv

natural language processing

We investigate whether language models internally track the value of their current trajectory, defined as the likelihood that their ongoing strategy will achieve their goals. Using synthetic, in-context reinforcement learning data, we construct a "value" axis for Qwen3-8B. We find that activations a...

Keywords: reinforcement learning, fine-tuning

T-Rex: Tactile-Reactive Dexterous Manipulation

5.0/10

Dantong Niu, Zhuoyang Liu, Zekai Wang, Boning Shao, Zhao-Heng Yin, Anirudh Pai, Yuvan Sharma, Stefano Saravalle, Ruijie Zheng, Jing Wang, Ryan Punamiya, Mengda Xu, Yuqi Xie, Yunfan Jiang, Letian Fu, Konstantinos Kallidromitis, Matteo Gioia, Junyi Zhang, Jiaxin Ge, Haiwen Feng, Fabio Galasso, Wei Zhan, David M. Chan, Yutong Bai, Roei Herzig, Jiahui Lei, Fei-Fei Li, Ken Goldberg, Jitendra Malik, Pieter Abbeel, Yuke Zhu, Danfei Xu, Jim Fan, Trevor Darrell 6/15/2026 arxiv

computer vision

The ability to react dynamically to tactile signals has long been considered crucial to agile human-level dexterity. Yet contemporary learning-based Vision-Language-Action (VLA) models for robotic manipulation generally either overlook the tactile modality or are limited to encoders with static cues...

Keywords: transformer

Human Universal Grasping

5.0/10

Kevin Yuanbo Wu, Tianxing Zhou, Isaac Tu, Billy Yan, Irmak Guzey, David Fouhey, Dandan Shan, Lerrel Pinto 6/15/2026 arxiv

computer vision

Humans can grasp objects effortlessly, whereas multi-fingered robots are far from this level of generality. We argue that the most natural source of robot grasping data is from humans, who pick up thousands of objects every day. We present HUG, a flow-matching model that generates diverse human gras...

Context-Aware RL for Agentic and Multimodal LLMs

5.0/10

Peiyang Xu, Bangzheng Li, Sijia Liu, Karthik R. Narasimhan, Pramod Viswanath, Prateek Mittal, Xingyu Fu 6/15/2026 arxiv

computer vision

Large language models (LLMs) often fail when answering requires identifying a small but decisive piece of evidence within a long or complex context, such as a single line in a tool trace or a subtle detail in an image. We propose ContextRL, a context-aware reinforcement learning (RL) method that imp...

Keywords: reinforcement learning

BRDFusion: Physics Meets Generation for Urban Scene Inverse Rendering

5.0/10

Yi-Ruei Liu, Jie-Ying Lee, Zheng-Hui Huang, Yu-Lun Liu, Chih-Hao Lin 6/15/2026 arxiv

machine learning

Inverse rendering of urban scenes from captured videos enables numerous applications, including content creation and autonomous driving simulation. Physically-based rendering methods follow and control lighting physics, but suffer from reconstruction and rendering artifacts. While generative models ...

Exact Posterior Score Estimation for Solving Linear Inverse Problems

5.0/10

Abbas Mammadov, Ozgur Kara, Kaan Oktay, Iskander Azangulov, Adil Kaan Akan, Hyungjin Chung, James Matthew Rehg, Yee Whye Teh 6/15/2026 arxiv

computer vision

Diffusion and flow-based models learn powerful data priors by training a denoiser to reverse Gaussian corruption. To use this prior to solve a linear inverse problem, one needs to sample from the posterior, but the score that the prior provides is the unconditional score, not the posterior score. Ex...

Keywords: pretraining

Geometric Action Model for Robot Policy Learning

5.0/10

Jisang Han, Seonghu Jeon, Jaewoo Jung, René Zurbrügg, Honggyu An, Tifanny Portela, Marco Hutter, Marc Pollefeys, Seungryong Kim, Sunghwan Hong 6/15/2026 arxiv

computer vision

Generalist robot policies must follow user instructions while reasoning about how objects, cameras, and robot actions interact in the 3D physical world. Recent vision-language-action models (VLAs) and video world-action models (WAMs) inherit strong semantic or temporal priors from large-scale founda...

Hierarchical Advantage Weighting for Online RL Fine-Tuning of VLAs from Sparse Episode Outcomes

5.0/10

Tongyan Fang, Siyuan Huang, Naiyu Fang, Ganlong Zhao, Zhongjin Luo, Jianbo Liu, Xiaogang Wang, Ying Dong, Hongsheng Li 6/15/2026 arxiv

computer vision

When pretrained VLA policies are fine-tuned through online RL, each rollout episode produces only a single binary outcome (success or failure), yet the actor update requires per-transition supervision. Existing approaches commonly reduce this sparse outcome to a single scalar reward or advantage sig...

Keywords: fine-tuning

Benchmarking LLM Agents on Meta-Analysis Articles from Nature Portfolio

5.0/10

Anzhe Xie, Weihang Su, Yujia Zhou, Yiqun Liu, Qingyao Ai 6/15/2026 arxiv

reinforcement learning

Meta-analysis is a demanding form of evidence synthesis that combines literature retrieval, PI/ECO-guided study selection, and statistical aggregation. Its structured, verifiable workflow makes it an ideal substrate for evaluating systematic scientific reasoning, yet existing benchmarks lack ground ...

R2RDreamer: 3D-aware Data Augmentation for Spatially-generalized 2D Manipulation Policies

5.0/10

Xiuwei Xu, Haowen Sun, Angyuan Ma, Yiwei Zhang, Zhenyu Wu, Xiaofeng Wang, Bingyao Yu, Zheng Zhu, Jie Zhou, Jiwen Lu 6/15/2026 arxiv

computer vision

Spatial generalization is critical for imitation-learned manipulation policies, but achieving it typically requires scaling demonstrations across diverse object poses, robot configurations, and camera viewpoints. Data augmentation from a few source demonstrations offers a practical alternative to co...
