Paper Archive

Browse and export your curated research paper collection

Archived Days: 197
Total Papers: 1958
Avg Score: 7.9
Categories: 9

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations
Browse by Date

Papers for March 31, 2026

10 papers found

Kaituo Feng, Manyuan Zhang, Shuang Chen, Yunlong Lin, Kaixuan Fan, Yilei Jiang, Hongyu Li, Dian Zheng, Chenyang Wang, Xiangyu Yue • 3/30/2026 • arXiv

computer vision

Recent image generation models have shown strong capabilities in generating high-fidelity and photorealistic images. However, they are fundamentally constrained by frozen internal knowledge, thus often failing on real-world scenarios that are knowledge-intensive or require up-to-date information. In...

Keywords: search-augmented, image generation, multi-hop reasoning, reinforcement learning, SFT, GRPO, KnowGen, datasets

Zimu Zhang, Yucheng Zhang, Xiyan Xu, Ziyin Wang, Sirui Xu, Kai Zhou, Bing Zhou, Chuan Guo, Jian Wang, Yu-Xiong Wang, Liang-Yan Gui • 3/30/2026 • arXiv

computer vision

Synthesizing human motion has advanced rapidly, yet realistic hand motion and bimanual interaction remain underexplored. Whole-body models often miss the fine-grained cues that drive dexterous behavior, finger articulation, contact timing, and inter-hand coordination, and existing resources lack hig...

Keywords: bimanual, hand motion, motion capture, LLM annotation, diffusion models, autoregressive models, dataset, hand-focused metrics

Jack Cook, Hyemin S. Lee, Kathryn Le, Junxian Guo, Giovanni Traverso, Anantha P. Chandrakasan, Song Han • 3/30/2026 • arXiv

machine learning

NVFP4 has grown increasingly popular as a 4-bit format for quantizing large language models due to its hardware support and its ability to retain useful information with relatively few bits per parameter. However, the format is not without limitations: recent work has shown that NVFP4 suffers from i...

Keywords: IF4, NVFP4, 4-bit quantization, block-scaled, E4M3, IF3, IF6, quantized training

N Alex Cayco Gajic, Arthur Pellegrino • 3/30/2026 • arXiv

machine learning

Similarity measures are widely used to interpret the representational geometries used by neural networks to solve tasks. Yet, because existing methods compare the extrinsic geometry of representations in state space, rather than their intrinsic geometry, they may fail to capture subtle yet crucial d...

Keywords: Metric Similarity Analysis, MSA, Riemannian geometry, statistical manifolds, intrinsic geometry, neural representations, manifold hypothesis, diffusion models

Lorenza Prospero, Orest Kupyn, Ostap Viniavskyi, João F. Henriques, Christian Rupprecht • 3/30/2026 • arXiv

computer vision

Acquiring labeled datasets for 3D human mesh estimation is challenging due to depth ambiguities and the inherent difficulty of annotating 3D geometry from monocular images. Existing datasets are either real, with manually annotated 3D geometry and limited scale, or synthetic, rendered from 3D engine...

Keywords: diffusion models, 3D human mesh, synthetic data, Direct Preference Optimization, data generation, curriculum learning, quality filtering

Omer Dahary, Benaya Koren, Daniel Garibi, Daniel Cohen-Or • 3/30/2026 • arXiv

computer vision

Modern Text-to-Image (T2I) diffusion models have achieved remarkable semantic alignment, yet they often suffer from a significant lack of variety, converging on a narrow set of visual solutions for any given prompt. This typicality bias presents a challenge for creative applications that require a w...

Keywords: Diffusion Transformers, Contextual Space, on-the-fly repulsion, multimodal attention, text-to-image, diversity, Turbo models, distilled models

Patrick Rim, Kevin Harris, Braden Copple, Shangchen Han, Xu Xie, Ivan Shugurov, Sizhe An, He Wen, Alex Wong, Tomas Hodan, Kun He • 3/30/2026 • arXiv

computer vision

Accurate 3D understanding of human hands and objects during manipulation remains a significant challenge for egocentric computer vision. Existing hand-object interaction datasets are predominantly captured in controlled studio settings, which limits both environmental diversity and the ability of mo...

Keywords: egocentric vision, hand-object interaction, 3D annotations, multi-camera rig, dataset, marker-less capture, ego-exo tracking, SHOW3D

Sadra Safadoust, Fabio Tosi, Matteo Poggi, Fatma Güney • 3/30/2026 • arXiv

computer vision

We present FlowIt, a novel architecture for optical flow estimation designed to robustly handle large pixel displacements. At its core, FlowIt leverages a hierarchical transformer architecture that captures extensive global context, enabling the model to effectively model long-range correspondences....

Keywords: optical flow, transformer, hierarchical transformer, optimal transport, confidence map, occlusion handling, guided refinement, cross-dataset generalization

Derong Jin, Xiyi Chen, Ming C. Lin, Ruohan Gao • 3/30/2026 • arXiv

machine learning

Tremendous progress in visual scene generation now turns a single image into an explorable 3D world, yet immersion remains incomplete without sound. We introduce Image2AVScene, the task of generating a 3D audio-visual scene from a single image, and present SonoWorld, the first framework to tackle th...

Keywords: image-to-audio-visual, ambisonics, panorama outpainting, 3D reconstruction, spatial audio, sound anchors, audio-visual learning, one-shot acoustic learning

Aur Shalev Merin • 3/30/2026 • arXiv

machine learning

Recurrent networks do not need Jacobian propagation to adapt online. The hidden state already carries temporal credit through the forward pass; immediate derivatives suffice if you stop corrupting them with stale trace memory and normalize gradient scales across parameter groups. An architectural ru...

Keywords: RTRL, RMSprop, online learning, recurrent networks, temporal credit assignment, normalization, hidden state, scalability