Paper Archive

Browse and export your curated research paper collection

Archived Days: 176 • Total Papers: 1748 • Avg Score: 7.8 • Categories: 9

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations
Browse by Date

Papers for March 6, 2026

10 papers found

Weijie Lyu, Ming-Hsuan Yang, Zhixin Shu • 3/5/2026 • arXiv

computer vision

We introduce FaceCam, a system that generates video under customizable camera trajectories for monocular human portrait video input. Recent camera control approaches based on large video-generation models have shown promising progress but often exhibit geometric distortions and visual artifacts on p...

Keywords: FaceCam, scale-aware representation, camera control, portrait video, video generation, conditioning, multi-view, in-the-wild

Junjie Fang, Wendi Chen, Han Xue, Fangyuan Zhou, Tian Le, Yi Wang, Yuting Zhang, Jun Lv, Chuan Wen, Cewu Lu • 3/5/2026 • arXiv

robotics

Scaling imitation learning is fundamentally constrained by the efficiency of data collection. While handheld interfaces have emerged as a scalable solution for in-the-wild data acquisition, they predominantly operate in an open-loop manner: operators blindly collect demonstrations without knowing th...

Keywords: RoboPocket, Remote Inference, AR Visual Foresight, imitation learning, online finetuning, data efficiency, robot-free, mobile data collection

Shai Yehezkel, Shahar Yadin, Noam Elata, Yaron Ostrovsky-Berman, Bahjat Kawar • 3/5/2026 • arXiv

machine learning

Recent diffusion models enable high-quality video generation, but suffer from slow runtimes. The large transformer-based backbones used in these models are bottlenecked by spatiotemporal attention. In this paper, we identify that a significant fraction of token-to-token connections consistently yiel...

Keywords: CalibAtt, sparse attention, text-to-video, diffusion models, transformer, spatiotemporal attention, attention calibration

Khai Nguyen, Petros Ellinas, Anvita Bhagavathula, Priya Donti • 3/5/2026 • arXiv

optimization

To scale the solution of optimization and simulation problems, prior work has explored machine-learning surrogates that inexpensively map problem parameters to corresponding solutions. Commonly used approaches, including supervised and self-supervised learning with either soft or hard feasibility en...

Keywords: amortized optimization, cheap labels, self-supervised learning, supervised pretraining, inexact labels, surrogate models, nonconvex constrained optimization, power-grid operation

Balakumar Sundaralingam, Adithyavairavan Murali, Stan Birchfield • 3/5/2026 • arXiv

robotics

Effective robot autonomy requires motion generation that is safe, feasible, and reactive. Current methods are fragmented: fast planners output physically unexecutable trajectories, reactive controllers struggle with high-fidelity perception, and existing solvers fail on high-DoF systems. We present ...

Keywords: cuRoboV2, TSDF, ESDF, depth-fused distance fields, B-spline trajectory optimization, GPU-native, high-DoF, differentiable inverse dynamics

Siddharth Boppana, Annabel Ma, Max Loeffler, Raphael Sarfati, Eric Bigelow, Atticus Geiger, Owen Lewis, Jack Merullo • 3/5/2026 • arXiv

machine learning

We provide evidence of performative chain-of-thought (CoT) in reasoning models, where a model becomes strongly confident in its final answer, but continues generating tokens without revealing its internal belief. Our analysis compares activation probing, early forced answering, and a CoT monitor acr...

Keywords: chain-of-thought, performative reasoning, activation probing, early exit, adaptive computation, interpretability, LLM, MMLU

Hugo Buurmeijer, Carmen Amo Alonso, Aiden Swann, Marco Pavone • 3/5/2026 • arXiv

machine learning

Vision-Language-Action Models (VLAs) have shown remarkable progress towards embodied intelligence. While their architecture partially resembles that of Large Language Models (LLMs), VLAs exhibit higher complexity due to their multi-modal inputs/outputs and often hybrid nature of transformer and diff...

Keywords: vision-language-action, interpretability, feature-observability, feature-controllability, linear intervention, optimal control, closed-loop, π_{0.5}

Benjamin Feuer, Lucas Rosenblatt, Oussama Elachqar • 3/5/2026 • arXiv

machine learning

As AI models progress beyond simple chatbots into more complex workflows, we draw ever closer to the event horizon beyond which AI systems will be utilized in autonomous, self-maintaining feedback loops. Any autonomous AI system will depend on automated, verifiable rewards and feedback; in settings ...

Keywords: average bias-boundedness, A-BB, LLM judge, bias guarantees, provable evaluation, Arena-Hard-Auto, automated feedback, adversarial bias

Guo Chen, Lidong Lu, Yicheng Liu, Liangrui Dong, Lidong Zou, Jixin Lv, Zhenquan Li, Xinyi Mao, Baoqi Pei, Shihao Wang, Zhiqi Li, Karan Sapra, Fuxiao Liu, Yin-Dong Zheng, Yifei Huang, Limin Wang, Zhiding Yu, Andrew Tao, Guilin Liu, Tong Lu • 3/5/2026 • arXiv

machine learning

While datasets for video understanding have scaled to hour-long durations, they typically consist of densely concatenated clips that differ from natural, unscripted daily life. To bridge this gap, we introduce MM-Lifelong, a dataset designed for Multimodal Lifelong Understanding. Comprising 181.1 ho...

Keywords: MM-Lifelong, ReMA, multimodal, lifelong learning, working memory bottleneck, global localization collapse, dynamic memory, recursive belief state

Leif Van Holland, Domenic Zingsheim, Mana Takhsha, Hannah Dröge, Patrick Stotko, Markus Plack, Reinhard Klein • 3/5/2026 • arXiv

computer vision

High-quality 3D streaming from multiple cameras is crucial for immersive experiences in many AR/VR applications. The limited number of views - often due to real-time constraints - leads to missing information and incomplete surfaces in the rendered images. Existing approaches typically rely on simpl...

Keywords: transformer, inpainting, real-time 3D streaming, multi-camera, spatio-temporal embeddings, adaptive patch selection, resolution-independent, AR/VR