Paper Archive

Browse and export your curated research paper collection

Archived Days: 274
Total Papers: 2723
Avg Score: 7.6
Categories: 9

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis | CSV: Tabular data for analysis | Markdown: Human-readable reports | BibTeX: Academic citations
Browse by Date

Papers for June 19, 2026

10 papers found

Siang-Ling Zhang, Huai-Hsun Cheng, Tsung-Ju Yang, Yu-Lun Liu 6/18/2026 arxiv

computer vision

Creating 3D visual illusions, a single 3D mesh that reveals entirely different semantics from various viewing angles, is a fascinating but tough challenge. Existing optimization-based methods are slow and can produce oversaturated colors. In contrast, naive stitching approaches fail to produce geome...

Keywords: 3D visual illusions, cross-space denoising, view-conditioned texture synthesis, CLIP-guided orientation alignment, Signed Distance Field blending

Sizhe Yang, Juncheng Mu, Tianming Wei, Chenhao Lu, Xiaofan Li, Linning Xu, Zhengrong Xue, Zhecheng Yuan, Dahua Lin, Jiangmiao Pang, Huazhe Xu 6/18/2026 arxiv

computer vision

Robust robotic manipulation in the real world requires not only an understanding of the current observation, but also memory and dynamics modeling. World action models (WAMs) possess these capabilities by jointly modeling visual foresight and actions conditioned on both current and historical observ...

Keywords: MemoryWAM, world action modeling, efficient memory, robotic manipulation, attention mechanism

Arkaprava Sinha, Dominick Reilly, Siddharth Krishnan, Hieu Le, Srijan Das 6/18/2026 arxiv

computer vision

Long Video Question Answering (LVQA) requires identifying sparse, query-relevant evidence within hours-long untrimmed videos. Existing approaches either process videos densely with large vision-language models (VLMs), incurring prohibitive computational cost, or rely on sparse caption-based reasonin...

Keywords: Long Video Question Answering, Temporal Reasoning, Action-based Candidate Evidence, OpenTSUBench, Activities of Daily Living

Joshua Engels, Callum McDougall, Bilal Chughtai, Janos Kramar, Senthoran Rajamanoharan, Cindy Wu, Arthur Conmy, Asic Q Chen, Jean Tarbouriech, Min Ma, Brendan O'Donoghue, João Gabriel Lopes de Oliveira, Rohin Shah, Neel Nanda 6/18/2026 arxiv

machine learning

LLM reasoning transparency is a critical affordance for understanding model decisions, mitigating misuse and misalignment, and debugging surprising model behaviors. However, DiffusionGemma performs a larger fraction of its computation in a continuous latent space; does this make its reasoning less t...

Keywords: LLM reasoning, DiffusionGemma, transparency, interpretability, algorithmic transparency, variable transparency, monitorability

Wenhao Chi, Arkaprava Sinha, Dominick Reilly, Hieu Le, Srijan Das 6/18/2026 arxiv

computer vision

Egocentric video understanding is inherently limited by the narrow perspective of wearable cameras: a single viewpoint, a single modality, a single model cannot capture the full richness of human action. We argue that a truly expressive egocentric representation must subsume complementary knowledge ...

Keywords: egocentric video, multi-teacher distillation, proxy models, action recognition, video retrieval

Georgy Noarov, Aaron Roth 6/18/2026 arxiv

machine learning

A model is multicalibrated on a collection of group weights $G$ if it is calibrated -- i.e. unbiased even conditional on its prediction -- not just overall, but also after reweighting contexts by each $g \in G$. It is a useful property for many downstream applications and is a basic desideratum of t...

Keywords: multicalibration, omniprediction, deterministic predictors, optimal sample complexity, unbiased AI

Pradhaan S Bhat, Naveen Chandra R, Rishubh Parihar, Vaibhav Vavilala, R. Venkatesh Babu, D. A. Forsyth, Anand Bhattad 6/18/2026 arxiv

computer vision

Text and 2D-conditioning interfaces provide weak, ambiguous control over spatial transformations in image editing -- particularly under large object motions and camera changes. Prior work has used 3D primitives such as boxes, but only as loose conditioning signals indicating approximate object locat...

Keywords: 3D editing, real images, spatial transformations, depth-aligned planar floor, generalization

Ruizhong Qiu, Yinglong Xia, Dongqi Fu, Hanqing Zeng, Ren Chen, Xiangjun Fan, Hong Li, Hong Yan, Hanghang Tong 6/18/2026 arxiv

machine learning

Generative recommendation is an emerging paradigm that has shown promise in industrial recommendation systems, aiming to predict users' next interactions from their historical behaviors. At the core of generative recommendation lies item tokenization, which bridges item semantics and recommendation ...

Keywords: Generative Recommendation, User Interest Context, Graph Neural Networks, Semantic Tokenization, Scalability

Sha Yi, Nicklas Hansen, Xueqian Bai, Carmelo Sferrazza, Michael T. Tolley, Xiaolong Wang 6/18/2026 arxiv

robotics

Robot learning has advanced rapidly in learning control, but learning the physical body of a robot remains much more difficult because jointly searching over design and control creates a very large combinatorial problem. Here, we present a data-driven framework for generating robot hands from human ...

Keywords: robot hand design, human demonstration learning, inverse kinematics, reinforcement learning, robot control

Przemyslaw Musialski 6/18/2026 arxiv

computer vision

We place the attention token on the group: a token is an element $g_i$ of a matrix Lie group $G$ -- a bare transformation, with no feature payload and no external action $ρ(g)$ carrying it. To our knowledge this is the first attention construction whose tokens are bare matrix Lie group elements: the...

Keywords: Lie-Algebra Attention, matrix Lie groups, computer vision, attention mechanisms, SE(2), SO(3), Aff(2)