Paper Archive

Browse and export your curated research paper collection

221 Archived Days • 2198 Total Papers • 8.0 Avg Score • 9 Categories

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations
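As a rough illustration of how the JSON export could be flattened into the other formats, here is a minimal Python sketch. The field names (`title`, `authors`, `date`, `category`, `score`) are assumptions about the export schema, not the archive's actual format:

```python
import csv
import io
import json

# Hypothetical shape of one entry in the JSON export; real field names may differ.
papers_json = json.dumps([
    {"title": "Example Paper", "authors": ["A. Author", "B. Author"],
     "date": "2026-04-23", "category": "machine learning", "score": 8.0},
])

def json_to_csv(raw: str) -> str:
    """Flatten the assumed JSON export into tabular CSV for analysis."""
    rows = json.loads(raw)
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["title", "authors", "date", "category", "score"])
    writer.writeheader()
    for row in rows:
        # Join the author list into a single cell.
        writer.writerow(dict(row, authors="; ".join(row["authors"])))
    return buf.getvalue()

def json_to_bibtex(raw: str) -> str:
    """Emit minimal @misc BibTeX entries from the assumed JSON export."""
    entries = []
    for i, p in enumerate(json.loads(raw)):
        entries.append(
            f"@misc{{paper{i},\n"
            f"  title = {{{p['title']}}},\n"
            f"  author = {{{' and '.join(p['authors'])}}},\n"
            f"  year = {{{p['date'][:4]}}},\n"
            f"}}"
        )
    return "\n\n".join(entries)
```

A converter like this would let the same JSON download serve both spreadsheet analysis and citation managers, without re-exporting from the archive.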
Browse by Date

Papers for April 25, 2026

10 papers found

Yen-Siang Wu, Rundong Luo, Jingsen Zhu, Tao Tu, Ali Farhadi, Matthew Wallingford, Yu-Chiang Frank Wang, Steve Marschner, Wei-Chiu Ma 4/23/2026 arxiv

computer vision

How can we tell whether a video has been sped up or slowed down? How can we generate videos at different speeds? Although videos have been central to modern computer vision research, little attention has been paid to perceiving and controlling the passage of time. In this paper, we study time as a l...

Keywords: time perception, self-supervised learning, playback speed estimation, speed-conditioned generation, temporal super-resolution, slow-motion dataset, video forensics, temporal control

Nicolae Filat, Ahmed Hussain, Konstantinos Kalogiannis, Elena Burceanu 4/23/2026 arxiv

machine learning

Streaming Continual Learning (CL) typically converts a continuous stream into a sequence of discrete tasks through temporal partitioning. We argue that this temporal taskification step is not a neutral preprocessing choice, but a structural component of evaluation: different valid splits of the same...

Keywords: streaming continual learning, temporal taskification, plasticity, stability, Boundary-Profile Sensitivity, benchmarking, CESNET-Timeseries24, experience replay

Thibault Bañeras-Roux, Shashi Kumar, Driss Khalil, Sergio Burdisso, Petr Motlicek, Shiran Liu, Mickael Rouvier, Jane Wottawa, Richard Dufour 4/23/2026 arxiv

machine learning

Automatic Speech Recognition (ASR) is traditionally evaluated using Word Error Rate (WER), a metric that is insensitive to meaning. Embedding-based semantic metrics are better correlated with human perception, but decoder-based Large Language Models (LLMs) remain underexplored for this task. This pa...

Keywords: ASR, WER, large language models, generative embeddings, semantic evaluation, HATS dataset

Paul-Tiberiu Iordache, Elena Burceanu 4/23/2026 arxiv

machine learning

Continual learning (CL) studies how models acquire tasks sequentially while retaining previously learned knowledge. Despite substantial progress in benchmarking CL methods, comparative evaluations typically keep the fine-tuning regime fixed. In this paper, we argue that the fine-tuning regime, defin...

Keywords: continual learning, fine-tuning regime, trainable depth, projected optimization, forgetting, online EWC, LwF, SI

Isabella Liu, An-Chieh Cheng, Rui Yan, Geng Chen, Ri-Zhao Qiu, Xueyan Zou, Sha Yi, Hongxu Yin, Xiaolong Wang, Sifei Liu 4/23/2026 arxiv

robotics

Long-horizon manipulation remains challenging for vision-language-action (VLA) policies: real tasks are multi-step, progress-dependent, and brittle to compounding execution errors. We present LoHo-Manip, a modular framework that scales short-horizon VLA execution to long-horizon instruction followin...

Keywords: long-horizon manipulation, vision-language-action (VLA), vision-language model (VLM), trace-conditioned planning, receding-horizon, visual trace, keypoint trajectory, closed-loop replanning

Natalie Collina, Jiuyao Lu, Georgy Noarov, Aaron Roth 4/23/2026 arxiv

machine learning

We study the minimax sample complexity of multicalibration in the batch setting. A learner observes $n$ i.i.d. samples from an unknown distribution and must output a (possibly randomized) predictor whose population multicalibration error, measured by Expected Calibration Error (ECE), is at most $\va...

Keywords: multicalibration, expected calibration error, sample complexity, minimax, online-to-batch, L_p, elicitable properties, calibration

Ceyuan Yang, Zhijie Lin, Yang Zhao, Fei Xiao, Hao He, Qi Zhao, Chaorui Deng, Kunchang Li, Zihan Ding, Yuwei Guo, Fuyun Wang, Fangqi Zhu, Xiaonan Nie, Shenhan Zhu, Shanchuan Lin, Hongsheng Li, Weilin Huang, Guang Shi, Haoqi Fan 4/23/2026 arxiv

machine learning

We present Omni, a unified multimodal model natively trained on diverse modalities, including text, images, videos, 3D geometry, and hidden representations. We find that such training enables Context Unrolling, where the model explicitly reasons across multiple modal representations before producing...

Keywords: Omni, Context Unrolling, multimodal, multimodal reasoning, in-context generation, 3D geometry, video, hidden representations

Zhiqiu Xu, Shibo Jin, Shreya Arya, Mayur Naik 4/23/2026 arxiv

machine learning

As frontier language models attain near-ceiling performance on static mathematical benchmarks, existing evaluations are increasingly unable to differentiate model capabilities, largely because they cast models solely as solvers of fixed problem sets. We introduce MathDuels, a self-play benchmark in ...

Keywords: MathDuels, self-play, LLMs, benchmarking, Rasch model, meta-prompting, difficulty amplification, adversarial evaluation

Kuan Heng Lin, Zhizheng Liu, Pablo Salamanca, Yash Kant, Ryan Burgert, Yuancheng Xu, Koichi Namekata, Yiwei Zhao, Bolei Zhou, Micah Goldblum, Paul Debevec, Ning Yu 4/23/2026 arxiv

computer vision

We present Vista4D, a robust and flexible video reshooting framework that grounds the input video and target cameras in a 4D point cloud. Specifically, given an input video, our method re-synthesizes the scene with the same dynamics from a different camera trajectory and viewpoint. Existing video re...

Keywords: 4D point cloud, video reshooting, view synthesis, dynamic scenes, multiview training, camera control, static-pixel segmentation, reconstruction

Songen Gu, Yuhang Zheng, Weize Li, Yupeng Zheng, Yating Feng, Xiang Li, Yilun Chen, Pengfei Li, Wenchao Ding 4/23/2026 arxiv

robotics

Recently, end-to-end robotic manipulation models have gained significant attention for their generalizability and scalability. However, they often suffer from limited robustness to camera viewpoint changes when trained with a fixed camera. In this paper, we propose VistaBot, a novel framework that ...

Keywords: view synthesis, video diffusion, robot manipulation, 4D geometry, latent action learning, view generalization score, closed-loop control, action-chunking