Paper Archive

Browse and export your curated research paper collection

Archived Days: 221
Total Papers: 2198
Avg Score: 8.0
Categories: 9

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations
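
The export formats listed above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the record fields (`date`, `source`, `category`, `keywords`) and the helper names `to_json`, `to_csv`, and `to_markdown` are hypothetical, since this page does not show the archive's actual export schema. BibTeX is left out because it needs titles and author names, which are not visible in this listing.

```python
import csv
import io
import json

# Hypothetical record shape (an assumption; the real export schema is not shown).
papers = [
    {
        "date": "4/16/2026",
        "source": "huggingface",
        "category": "computer vision",
        "keywords": ["flow matching", "LeapAlign", "ODE sampling"],
    },
    {
        "date": "4/16/2026",
        "source": "huggingface",
        "category": "robotics",
        "keywords": ["3D policy learning", "imitation learning"],
    },
]

FIELDS = ["date", "source", "category", "keywords"]


def to_json(records):
    """Complete data: serialize every field, keeping keywords as a list."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Tabular data: one row per paper, keywords flattened into one cell."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({**record, "keywords": "; ".join(record["keywords"])})
    return buf.getvalue()


def to_markdown(records):
    """Human-readable report: one bullet per paper."""
    return "\n".join(
        f"- {r['date']} ({r['source']}, {r['category']}): {', '.join(r['keywords'])}"
        for r in records
    )


if __name__ == "__main__":
    print(to_csv(papers))
    print(to_markdown(papers))
```

In this sketch the CSV export joins keywords with a semicolon so each paper stays on a single row, while the JSON export keeps the list structure intact.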
Browse by Date

Papers for April 19, 2026

10 papers found

4/16/2026 huggingface

computer vision

This paper focuses on the alignment of flow matching models with human preferences. A promising way is fine-tuning by directly backpropagating reward gradients through the differentiable generation process of flow matching. However, backpropagating through long trajectories results in prohibitive me...

Keywords: flow matching, LeapAlign, post-training fine-tuning, direct-gradient, ODE sampling, trajectory shortening, gradient stabilization, image-text alignment

4/16/2026 huggingface

machine learning

High-level autonomous driving requires motion planners capable of modeling multimodal future uncertainties while remaining robust in closed-loop interactions. Although diffusion-based planners are effective at modeling complex trajectory distributions, they often suffer from stochastic instabilities...

Keywords: RAD-2, diffusion models, generator-discriminator, reinforcement learning, BEV-Warp, trajectory planning, autonomous driving, closed-loop evaluation

4/16/2026 huggingface

machine learning

The efficient spatial allocation of primitives serves as the foundation of 3D Gaussian Splatting, as it directly dictates the synergy between representation compactness, reconstruction speed, and rendering fidelity. Previous solutions, whether based on iterative optimization or feed-forward inferenc...

Keywords: 3D Gaussian Splatting, global scene tokens, novel-view synthesis, RealEstate10K, ACID, coarse-to-fine training, compact 3D representation, real-time rendering

4/16/2026 huggingface

robotics

3D policy learning promises superior generalization and cross-embodiment transfer, but progress has been hindered by training instabilities and severe overfitting, precluding the adoption of powerful 3D perception models. In this work, we systematically diagnose these failures, identifying the omiss...

Keywords: 3D policy learning, transformer, diffusion decoder, imitation learning, 3D data augmentation, batch normalization, robotics, manipulation

4/16/2026 huggingface

machine learning

Reliable uncertainty estimation is critical for medical image segmentation, where automated contours feed downstream quantification and clinical decision support. Many strong uncertainty methods require repeated inference, while efficient single-forward-pass alternatives often provide weaker failure...

Keywords: medical image segmentation, uncertainty estimation, perturbation energy, single-forward-pass, rank-1 posterior probes, calibration, error ranking, AUROC

4/16/2026 huggingface

machine learning

Vision-language models (VLMs) have markedly advanced AI-driven interpretation and reporting of complex medical imaging, such as computed tomography (CT). Yet, existing methods largely relegate clinicians to passive observers of final outputs, offering no interpretable reasoning trace for them to insp...

Keywords: RadAgent, vision-language model, tool-using agent, chest CT, interpretability, faithfulness, robustness, medical AI

4/16/2026 huggingface

robotics

Mobile agents powered by vision-language models have demonstrated impressive capabilities in automating mobile tasks, with recent leading models achieving a marked performance leap, e.g., nearly 70% success on AndroidWorld. However, these systems keep their training data closed and remain opaque abo...

Keywords: OpenMobile, task synthesis, trajectory synthesis, policy-switching, vision-language models, AndroidWorld, Qwen3-VL, open-source dataset

4/16/2026 huggingface

machine learning

Recent advances in video-to-audio (V2A) generation enable high-quality audio synthesis from visual content, yet achieving robust and fine-grained controllability remains challenging. Existing methods suffer from weak textual controllability under visual-text conflict and imprecise stylistic control ...

Keywords: video-to-audio, multimodal, controllability, CLIP, temporal-timbre decoupling, REPA, VGGSound-TVC, audio generation

4/16/2026 huggingface

machine learning

Recent advances in LLM-based agent systems have shown promise in tackling complex, long-horizon tasks. However, existing agent protocols (e.g., A2A and MCP) underspecify cross-entity lifecycle and context management, version tracking, and evolution-safe update interfaces, which encourages monolithi...

Keywords: Autogenesis, AGP, RSPL, SEPL, AGS, self-evolving agents, multi-agent system, resource protocol

4/16/2026 huggingface

machine learning

Retrieval-Augmented Generation (RAG) extends Large Vision-Language Models (LVLMs) with external visual knowledge. However, existing visual RAG systems typically rely on generic retrieval signals that overlook the fine-grained visual semantics essential for complex reasoning. To address this limitati...

Keywords: UniDoc-RL, Retrieval-Augmented Generation, LVLM, reinforcement learning, hierarchical actions, dense rewards, Group Relative Policy Optimization, active visual perception