Paper Archive

OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation

9.0/10

Yuheng Liu, Xin Lin, Xinke Li, Baihan Yang, Chen Wang, Kalyan Sunkavalli, Yannick Hold-Geoffroy, Hao Tan, Kai Zhang, Xiaohui Xie, Zifan Shi, Yiwei Hu 3/31/2026 arxiv

computer vision

Modeling scenes using video generation models has garnered growing research interest in recent years. However, most existing approaches rely on perspective video models that synthesize only limited observations of a scene, leading to issues of completeness and global consistency. We propose OmniRoam...

Keywords: panoramic video, long-horizon generation, preview-refine, controllable video, 360° representation, high-resolution upsampling, video datasets, 3D reconstruction

Video Models Reason Early: Exploiting Plan Commitment for Maze Solving

9.0/10

Kaleb Newman, Tyler Zhu, Olga Russakovsky 3/31/2026 arxiv

computer vision

Video diffusion models exhibit emergent reasoning capabilities like solving mazes and puzzles, yet little is understood about how they reason during generation. We take a first step towards understanding this and study the internal planning dynamics of video models using 2D maze solving as a control...

Keywords: video diffusion models, early plan commitment, maze solving, ChEaP, chaining, denoising steps, path length, Frozen Lake

Automatic Identification of Parallelizable Loops Using Transformer-Based Source Code Representations

9.0/10

Izavan dos S. Correia, Henrique C. T. Santos, Tiago A. E. Ferreira 3/31/2026 arxiv

machine learning

Automatic parallelization remains a challenging problem in software engineering, particularly in identifying code regions where loops can be safely executed in parallel on modern multi-core architectures. Traditional static analysis techniques, such as dependence analysis and polyhedral models, ofte...

Keywords: DistilBERT, Transformer, source code representation, loop parallelization, automatic parallelization, subword tokenization, 10-fold cross-validation

Benchmarking PhD-Level Coding in 3D Geometric Computer Vision

9.0/10

Wenyi Li, Renkai Luo, Yue Yu, Huan-ang Gao, Mingju Gao, Li Yuan, Chaoyou Fu, Hao Zhao 3/31/2026 arxiv

computer vision

AI-assisted coding has rapidly reshaped software practice and research workflows, yet today's models still struggle to produce correct code for complex 3D geometric vision. If models could reliably write such code, the research of our community would change substantially. To measure progress toward ...

Keywords: GeoCodeBench, 3D geometric vision, benchmark, unit tests, reproducible evaluation, GPT-5, geometric transformations, mechanics

Aligned, Orthogonal or In-conflict: When can we safely optimize Chain-of-Thought?

9.0/10

Max Kaufmann, David Lindner, Roland S. Zimmermann, and Rohin Shah 3/31/2026 arxiv

machine learning

Chain-of-Thought (CoT) monitoring, in which automated systems monitor the CoT of an LLM, is a promising approach for effectively overseeing AI systems. However, the extent to which a model's CoT helps us oversee the model - the monitorability of the CoT - can be affected by training, for instance by...

Keywords: chain-of-thought, monitorability, reward decomposition, reinforcement learning, LLM, alignment, interpretability, AI oversight

Tucker Attention: A generalization of approximate attention mechanisms

9.0/10

Timon Klein, Jonas Kusch, Sebastian Sager, Stefan Schnake, Steffen Schotthöfer 3/31/2026 arxiv

machine learning

The pursuit of reducing the memory footprint of the self-attention mechanism in multi-headed self attention (MHA) spawned a rich portfolio of methods, e.g., group-query attention (GQA) and multi-head latent attention (MLA). The methods leverage specialized low-rank factorizations across embedding di...

Keywords: Tucker Attention, low-rank, factorization, self-attention, GQA, MLA, MHA, flash-attention

Covertly improving intelligibility with data-driven adaptations of speech timing

9.0/10

Paige Tuttösí, Angelica Lim, H. Henny Yeung, Yue Wang, Jean-Julien Aucouturier 3/31/2026 arxiv

machine learning

Human talkers often address listeners with language-comprehension challenges, such as hard-of-hearing or non-native adults, by globally slowing down their speech. However, it remains unclear whether this strategy actually makes speech more intelligible. Here, we take advantage of recent advancements...

Keywords: speech rate, intelligibility, reverse-correlation, text-to-speech, L2 comprehension, temporal structure, vowel contrast, targeted slowing

Hybrid Framework for Robotic Manipulation: Integrating Reinforcement Learning and Large Language Models

9.0/10

Md Saad, Sajjad Hussain, Mohd Suhaib 3/31/2026 arxiv

robotics

This paper introduces a new hybrid framework that combines Reinforcement Learning (RL) and Large Language Models (LLMs) to improve robotic manipulation tasks. By utilizing RL for accurate low-level control and LLMs for high-level task planning and understanding of natural language, the proposed fram...

Keywords: reinforcement learning, large language models, robotic manipulation, PyBullet, Franka Emika Panda, sim-to-real, human-robot interaction

Scalable AI-assisted Workflow Management for Detector Design Optimization Using Distributed Computing

9.0/10

Derek Anderson, Amit Bashyal, Markus Diefenthaler, Cristiano Fanelli, Wen Guan, Tanja Horn, Alex Jentsch, Meifeng Lin, Tadashi Maeno, Kei Nagai, Hemalata Nayak, Connor Pecar, Karthik Suresh, Fang-Ying Tsai, Anselm Vossen, Tianle Wang, Torre Wenaus 3/31/2026 arxiv

machine learning

The Production and Distributed Analysis (PanDA) system, originally developed for the ATLAS experiment at the CERN Large Hadron Collider (LHC), has evolved into a robust platform for orchestrating large-scale workflows across distributed computing resources. Coupled with its intelligent Distributed D...

Keywords: PanDA, iDDS, Bayesian optimization, detector design, distributed computing, workflow orchestration, multi-objective optimization, EIC

ATP-Bench: Towards Agentic Tool Planning for MLLM Interleaved Generation

9.0/10

3/31/2026 huggingface

machine learning

Interleaved text-and-image generation represents a significant frontier for Multimodal Large Language Models (MLLMs), offering a more intuitive way to convey complex information. Current paradigms rely on either image generation or retrieval augmentation, yet they typically treat the two as mutually...

Keywords: Agentic Tool Planning, ATP-Bench, MAM, MLLM, interleaved generation, VQA, tool-call precision, multimodal benchmark

Papers for April 1, 2026
