Paper Archive

Browse and export your curated research paper collection

247
Archived Days
2458
Total Papers
7.8
Avg Score
9
Categories

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations
Browse by Date

Papers for May 22, 2026

10 papers found

[object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object] 5/21/2026 huggingface

reinforcement learning

Camera pose matters. The position and orientation of each viewpoint define a shared spatial coordinate frame that relates observations across video frames. Yet this signal is largely absent from multimodal LLMs (MLLMs) for video understanding, which process frames as isolated 2D snapshots, instead o...

Keywords: regression

[object Object], [object Object], [object Object], [object Object], [object Object], [object Object] 5/21/2026 huggingface

computer vision

Current motion-controlled image-to-video generation models rigidly follow user-provided trajectories that are often sparse, imprecise, and causally incomplete. Such reliance often yields unnatural or implausible outcomes, especially by missing secondary causal consequences. To address this, we intro...

[object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object] 5/21/2026 huggingface

computer vision

Robust training and validation of Autonomous Driving Systems (ADS) require massive, diverse datasets. Proprietary data collected by Autonomous Vehicle (AV) fleets, while high-fidelity, are limited in scale, diversity of sensor configurations, as well as geographic and long-tail-behavioral coverage. ...

Keywords: multi-modal

[object Object], [object Object], [object Object] 5/21/2026 huggingface

natural language processing

Linear attention replaces the unbounded cache of softmax attention with a fixed-size recurrent state, reducing sequence mixing to linear time and decoding to constant memory. The hard part is not just what to forget, but how to edit this compressed memory without scrambling existing associations. De...

Keywords: attention

[object Object], [object Object], [object Object], [object Object], [object Object], [object Object] 5/21/2026 huggingface

computer vision

Representation Autoencoders (RAEs) leverage frozen vision foundation models (VFMs) as tokenizer encoders, providing robust high-level representations that facilitate fast convergence and high-quality generation in latent diffusion models. However, freezing the VFM inherently constrains its spatial r...

Keywords: diffusion model, fine-tuning

[object Object], [object Object] 5/21/2026 huggingface

natural language processing

Model cards describe model behavior through a mixture of textual descriptions and structured artifacts, including performance, configuration, and dataset tables. Existing model search systems rely predominantly on semantic similarity over text, which can produce homogeneous result sets and limit exp...

[object Object], [object Object], [object Object], [object Object], [object Object], [object Object] 5/21/2026 huggingface

computer vision

Autoregressive video diffusion models have enabled real-time, action-conditioned world generation. However, sustaining a persistent world, where revisiting a previously seen viewpoint yields consistent content, remains an open problem. Full KV-cache attention preserves this consistency but breaks re...

Keywords: attention, diffusion model, fine-tuning

[object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object] 5/21/2026 huggingface

machine learning

Artificial intelligence (AI) is increasingly embedded in scientific discovery, yet whether it can anticipate scientific progress remains unclear. To study this question, we introduce a temporally grounded evaluation framework for forecasting scientific progress under controlled knowledge constraints...

[object Object], [object Object], [object Object], [object Object], [object Object], [object Object] 5/21/2026 huggingface

computer vision

While most frames in long-form video are redundant, the critical information resides in temporal surprises: moments where the actual visual features deviate from their predicted evolution. Inspired by the human brain's predictive coding, we introduce Swift Sampling, an elegant, training-free frame s...

[object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object] 5/21/2026 huggingface

natural language processing

Spreadsheet systems (e.g., Microsoft Excel, Google Sheets) play a central role in modern data-centric workflows. As AI agents grow increasingly capable of automating complex tasks, such as controlling computers and generating presentations, building an AI-driven spreadsheet agent has emerged as a pr...

Keywords: reinforcement learning, fine-tuning
Loading...

Preparing your export...