Paper Archive

Transformer-Based Inpainting for Real-Time 3D Streaming in Sparse Multi-Camera Setups

9.0/10

Leif Van Holland, Domenic Zingsheim, Mana Takhsha, Hannah Dröge, Patrick Stotko, Markus Plack, Reinhard Klein 3/5/2026 arxiv

computer vision

High-quality 3D streaming from multiple cameras is crucial for immersive experiences in many AR/VR applications. The limited number of views - often due to real-time constraints - leads to missing information and incomplete surfaces in the rendered images. Existing approaches typically rely on simpl...

Keywords: transformer, inpainting, 3D streaming, multi-camera, spatio-temporal embeddings, real-time, adaptive patch selection, resolution-independent

FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning

9.0/10

Weijie Lyu, Ming-Hsuan Yang, Zhixin Shu 3/5/2026 arxiv

computer vision

We introduce FaceCam, a system that generates video under customizable camera trajectories for monocular human portrait video input. Recent camera control approaches based on large video-generation models have shown promising progress but often exhibit geometric distortions and visual artifacts on p...

Keywords: FaceCam, scale-aware representation, portrait video, camera control, synthetic camera motion, multi-shot stitching, Ava-256, video generation

RoboPocket: Improve Robot Policies Instantly with Your Phone

9.0/10

Junjie Fang, Wendi Chen, Han Xue, Fangyuan Zhou, Tian Le, Yi Wang, Yuting Zhang, Jun Lv, Chuan Wen, Cewu Lu 3/5/2026 arxiv

robotics

Scaling imitation learning is fundamentally constrained by the efficiency of data collection. While handheld interfaces have emerged as a scalable solution for in-the-wild data acquisition, they predominantly operate in an open-loop manner: operators blindly collect demonstrations without knowing th...

Keywords: RoboPocket, remote inference, AR visual foresight, imitation learning, online finetuning, data efficiency, covariate shift, DAgger

Accelerating Text-to-Video Generation with Calibrated Sparse Attention

9.0/10

Shai Yehezkel, Shahar Yadin, Noam Elata, Yaron Ostrovsky-Berman, Bahjat Kawar 3/5/2026 arxiv

computer vision

Recent diffusion models enable high-quality video generation, but suffer from slow runtimes. The large transformer-based backbones used in these models are bottlenecked by spatiotemporal attention. In this paper, we identify that a significant fraction of token-to-token connections consistently yiel...

Keywords: CalibAtt, calibrated sparse attention, diffusion models, text-to-video, spatiotemporal attention, Wan 2.1 14B, Mochi 1, training-free

POET-X: Memory-efficient LLM Training by Scaling Orthogonal Transformation

9.0/10

Zeju Qiu, Lixin Liu, Adrian Weller, Han Shi, Weiyang Liu 3/5/2026 arxiv

machine learning

Efficient and stable training of large language models (LLMs) remains a core challenge in modern machine learning systems. To address this challenge, Reparameterized Orthogonal Equivalence Training (POET), a spectrum-preserving framework that optimizes each weight matrix through orthogonal equivalen...

Keywords: POET-X, POET, orthogonal equivalence, spectrum-preserving, LLM training, memory-efficient, NVIDIA H100, AdamW

The Spike, the Sparse and the Sink: Anatomy of Massive Activations and Attention Sinks

9.0/10

Shangwen Sun, Alfredo Canziani, Yann LeCun, Jiachen Zhu 3/5/2026 arxiv

natural language processing

We study two recurring phenomena in Transformer language models: massive activations, in which a small number of tokens exhibit extreme outliers in a few channels, and attention sinks, in which certain tokens attract disproportionate attention mass regardless of semantic relevance. Prior work observ...

Keywords: Transformers, massive activations, attention sinks, pre-norm, interpretability, ablation study, architecture

Safe-SAGE: Social-Semantic Adaptive Guidance for Safe Engagement through Laplace-Modulated Poisson Safety Functions

9.0/10

Lizhi Yang, Ryan M. Bena, Meg Wilkinson, Gilbert Bahati, Andy Navarro Brenes, Ryan K. Cosner, Aaron D. Ames 3/5/2026 arxiv

robotics

Traditional safety-critical control methods, such as control barrier functions, suffer from semantic blindness, exhibiting the same behavior around obstacles regardless of contextual significance. This limitation leads to the uniform treatment of all obstacles, despite their differing semantic meani...

Keywords: Poisson safety function, Laplace guidance field, control barrier function, model predictive control, semantic segmentation, multi-sensor fusion, persistent tracking, legged robots

Cheap Thrills: Effective Amortized Optimization Using Inexpensive Labels

9.0/10

Khai Nguyen, Petros Ellinas, Anvita Bhagavathula, Priya Donti 3/5/2026 arxiv

machine learning

To scale the solution of optimization and simulation problems, prior work has explored machine-learning surrogates that inexpensively map problem parameters to corresponding solutions. Commonly used approaches, including supervised and self-supervised learning with either soft or hard feasibility en...

Keywords: amortized optimization, cheap labels, self-supervised learning, supervised pretraining, merit-based criterion, nonconvex optimization, power systems, dynamical systems

Censored LLMs as a Natural Testbed for Secret Knowledge Elicitation

9.0/10

Helena Casademunt, Bartosz Cywiński, Khoi Tran, Arya Jakkli, Samuel Marks, Neel Nanda 3/5/2026 arxiv

machine learning

Large language models sometimes produce false or misleading responses. Two approaches to this problem are honesty elicitation -- modifying prompts or weights so that the model answers truthfully -- and lie detection -- classifying whether a given response is false. Prior work evaluates such methods ...

Keywords: honesty elicitation, lie detection, censored LLMs, Qwen3, few-shot prompting, fine-tuning, self-classification, linear probes

Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought

9.0/10

Siddharth Boppana, Annabel Ma, Max Loeffler, Raphael Sarfati, Eric Bigelow, Atticus Geiger, Owen Lewis, Jack Merullo 3/5/2026 arxiv

machine learning

We provide evidence of performative chain-of-thought (CoT) in reasoning models, where a model becomes strongly confident in its final answer, but continues generating tokens without revealing its internal belief. Our analysis compares activation probing, early forced answering, and a CoT monitor acr...

Keywords: chain-of-thought, activation probing, attention probing, early exit, MMLU, GPQA-Diamond, model belief, interpretability

Papers for March 7, 2026

Transformer-Based Inpainting for Real-Time 3D Streaming in Sparse Multi-Camera Setups

FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning

RoboPocket: Improve Robot Policies Instantly with Your Phone

Accelerating Text-to-Video Generation with Calibrated Sparse Attention

POET-X: Memory-efficient LLM Training by Scaling Orthogonal Transformation

The Spike, the Sparse and the Sink: Anatomy of Massive Activations and Attention Sinks

Safe-SAGE: Social-Semantic Adaptive Guidance for Safe Engagement through Laplace-Modulated Poisson Safety Functions

Cheap Thrills: Effective Amortized Optimization Using Inexpensive Labels

Censored LLMs as a Natural Testbed for Secret Knowledge Elicitation

Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought