Paper Archive

Browse and export your curated research paper collection

221 Archived Days · 2198 Total Papers · 8.0 Avg Score · 9 Categories

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations
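The four export formats above can all be produced from the same archived record. A minimal sketch, assuming a paper is stored as a dict; the field names and record below are illustrative, not the archive's actual schema:

```python
import csv
import io
import json

# Hypothetical archived-paper record; field names are illustrative.
paper = {
    "title": "Example Paper",
    "authors": ["A. Author", "B. Author"],
    "year": 2026,
    "score": 8.0,
}

def to_json(p):
    # JSON: complete data with analysis.
    return json.dumps(p, indent=2)

def to_csv(p):
    # CSV: tabular data for analysis (authors joined into one cell).
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["title", "authors", "year", "score"])
    writer.writerow([p["title"], "; ".join(p["authors"]), p["year"], p["score"]])
    return buf.getvalue()

def to_markdown(p):
    # Markdown: one human-readable report line per paper.
    return f"- **{p['title']}** ({p['year']}), score {p['score']}"

def to_bibtex(p):
    # BibTeX: academic citation entry keyed on first title word + year.
    key = p["title"].split()[0].lower() + str(p["year"])
    authors = " and ".join(p["authors"])
    return (f"@article{{{key},\n"
            f"  title = {{{p['title']}}},\n"
            f"  author = {{{authors}}},\n"
            f"  year = {{{p['year']}}}\n"
            f"}}")
```

Each converter takes the same record, so adding a format is one extra function rather than a parallel data pipeline.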
Browse by Date

Papers for April 17, 2026

10 papers found

4/16/2026 · huggingface

computer vision

This paper focuses on the alignment of flow matching models with human preferences. A promising approach is to fine-tune by directly backpropagating reward gradients through the differentiable generation process of flow matching. However, backpropagating through long trajectories results in prohibitive me...

Keywords: flow matching, LeapAlign, direct-gradient, GRPO, fine-tuning, trajectory shortening, ODE sampling, image-text alignment

4/16/2026 · huggingface

machine learning

High-level autonomous driving requires motion planners capable of modeling multimodal future uncertainties while remaining robust in closed-loop interactions. Although diffusion-based planners are effective at modeling complex trajectory distributions, they often suffer from stochastic instabilities...

Keywords: diffusion-based planning, generator-discriminator, reinforcement learning, Temporally Consistent Group Relative Policy Optimization, On-policy Generator Optimization, BEV-Warp, closed-loop planning, motion planning

4/16/2026 · huggingface

machine learning

Recent advances in video-to-audio (V2A) generation enable high-quality audio synthesis from visual content, yet achieving robust and fine-grained controllability remains challenging. Existing methods suffer from weak textual controllability under visual-text conflict and imprecise stylistic control ...

Keywords: video-to-audio, V2A, multimodal, CLIP, temporal-timbre decoupling, REPA, modality dropout, VGGSound-TVC

4/16/2026 · huggingface

machine learning

Retrieval-Augmented Generation (RAG) extends Large Vision-Language Models (LVLMs) with external visual knowledge. However, existing visual RAG systems typically rely on generic retrieval signals that overlook the fine-grained visual semantics essential for complex reasoning. To address this limitati...

Keywords: UniDoc-RL, RAG, LVLM, hierarchical actions, dense rewards, GRPO, visual retrieval, active perception

4/16/2026 · huggingface

machine learning

Concatenating quantum error-correcting codes scales error-correction capability by driving logical error rates down double-exponentially across levels. However, the noise structure shifts under concatenation, making it hard to choose an optimal code sequence. We automate this choice by estimating th...

Keywords: concatenated quantum codes, quantum error correction, logical error rate, noise estimation, learning-based encoders, non-additive encoders, stabilizer codes, resource reduction
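The double-exponential suppression this abstract refers to can be illustrated numerically. A toy sketch, assuming each concatenation level squares the rescaled error rate, p_{k+1} = A·p_k², below threshold; the prefactor A and physical error rate are illustrative values, not taken from the paper:

```python
# Toy model of concatenated-code error suppression: below threshold,
# each level squares the (rescaled) error rate, so the logical error
# rate falls double-exponentially in the number of levels.
A = 100.0        # illustrative combinatorial prefactor (threshold ~ 1/A)
p_phys = 1e-3    # illustrative physical error rate, below threshold

def logical_error_rate(levels):
    p = p_phys
    for _ in range(levels):
        p = A * p * p   # p_{k+1} = A * p_k^2
    return p

for k in range(4):
    print(k, logical_error_rate(k))
# rates drop 1e-3 -> 1e-4 -> 1e-6 -> 1e-10 as levels increase
```

The exponent of the suppression roughly doubles with every level, which is why a good (or bad) choice of code at each level compounds so strongly.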

4/16/2026 · huggingface

machine learning

Scientific discovery in digital health requires converting continuous physiological signals from wearable devices into clinically actionable biomarkers. We introduce CoDaS (AI Co-Data-Scientist), a multi-agent system that structures biomarker discovery as an iterative process combining hypothesis ge...

Keywords: CoDaS, digital biomarkers, wearable sensors, circadian instability, depression, insulin resistance, adversarial validation, multi-agent system

4/16/2026 · huggingface

machine learning

Majority voting over multiple LLM attempts improves mathematical reasoning, but correlated errors limit the effective sample size. A natural fix is to assign different reasoning strategies to different voters. The approach, Diverse Prompt Mixer, is tested on the AIMO 3 competition: 3 models, 23+ exp...

Keywords: LLMs, majority voting, prompt engineering, Diverse Prompt Mixer, AIMO 3, IMO-level problems, selection loss, high-temperature sampling
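The correlated-error problem this abstract describes can be sketched with plain majority voting over answers produced under different prompt strategies. Everything below (the strategy names and answers) is illustrative stand-in data, not the paper's Diverse Prompt Mixer:

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer among the voters."""
    return Counter(answers).most_common(1)[0][0]

# Illustrative: each prompt strategy yields one candidate answer.
# With identical prompts, voters tend to repeat the same mistake;
# mixing strategies decorrelates the errors so one bad strategy
# is outvoted by the others.
answers_by_prompt = {
    "direct":        "42",
    "step-by-step":  "42",
    "work-backward": "17",   # one strategy goes wrong
    "check-units":   "42",
    "estimate":      "42",
}
print(majority_vote(list(answers_by_prompt.values())))
```

If all five voters shared the "work-backward" failure mode, the vote would converge on the wrong answer regardless of sample count, which is the effective-sample-size limit the abstract mentions.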

4/15/2026 · huggingface

machine learning

Synthetic data is a standard component in training large language models, yet systematic comparisons across design dimensions, including rephrasing strategy, generator model, and source data, remain absent. We conduct extensive controlled experiments, generating over one trillion tokens, to identify...

Keywords: synthetic data, pretraining, prompt design, rephrasing, FinePhrase, structured outputs, generator size, dataset release

4/15/2026 · huggingface

computer vision

We introduce HY-World 2.0, a multi-modal world model framework that advances our prior project HY-World 1.0. HY-World 2.0 accommodates diverse input modalities, including text prompts, single-view images, multi-view images, and videos, and produces 3D world representations. With text or single-view ...

Keywords: HY-World 2.0, 3D Gaussian Splatting, HY-Pano 2.0, WorldNav, WorldStereo 2.0, WorldMirror 2.0, WorldLens, multi-modal

4/15/2026 · huggingface

computer vision

Existing segmentation models based on multimodal large language models (MLLMs), such as LISA, often struggle with novel or emerging entities due to their inability to incorporate up-to-date knowledge. To address this challenge, we introduce the Novel Emerging Segmentation Task (NEST), which focuses ...

Keywords: ROSE, NEST, MLLM, segmentation, retrieval-augmented, visual prompt, WebSense, gIoU