Paper Archive

Browse and export your curated research paper collection

Archived Days: 79
Total Papers: 788
Avg Score: 8.4
Categories: 9

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis
CSV: Tabular data for analysis
Markdown: Human-readable reports
BibTeX: Academic citations

Browse by Date

Papers for December 3, 2025

10 papers found

Unknown authors 12/3/2025 huggingface

machine learning

We present MG-Nav (Memory-Guided Navigation), a dual-scale framework for zero-shot visual navigation that unifies global memory-guided planning with local geometry-enhanced control. At its core is the Sparse Spatial Memory Graph (SMG), a compact, region-centric memory where each node aggregates mult...

Keywords: MG-Nav, Sparse Spatial Memory Graph, SMG, VGGT-adapter, zero-shot navigation, global planning, local control, image-goal

Unknown authors 12/3/2025 huggingface

machine learning

Achieving fully autonomous driving systems requires learning rational decisions in a wide span of scenarios, including safety-critical and out-of-distribution ones. However, such cases are underrepresented in real-world corpora collected by human experts. To compensate for the lack of data diversity,...

Keywords: simulation, neural rendering, pseudo-expert, co-training, autonomous driving, data synthesis, generalization, robustness

Unknown authors 12/3/2025 huggingface

computer vision

Current video generation techniques excel at single-shot clips but struggle to produce narrative multi-shot videos, which require flexible shot arrangement, coherent narrative, and controllability beyond text prompts. To tackle these challenges, we propose MultiShotMaster, a framework for highly con...

Keywords: multi-shot video, RoPE, Multi-Shot Narrative RoPE, Spatiotemporal Position-Aware RoPE, video generation, reference grounding, data annotation pipeline, controllable generation

Unknown authors 12/3/2025 huggingface

machine learning

AI self-evolution has long been envisioned as a path toward superintelligence, where models autonomously acquire, refine, and internalize knowledge from their own learning experiences. Yet in practice, unguided self-evolving systems often plateau quickly or even degrade as training progresses. These...

Keywords: self-evolving, R-Few, Challenger-Solver, in-context grounding, mixed training, curriculum learning, concept drift, synthetic data

Unknown authors 12/3/2025 huggingface

computer vision

Recent advances in video large language models have demonstrated strong capabilities in understanding short clips. However, scaling them to hours- or days-long videos remains highly challenging due to limited context capacity and the loss of critical visual details during abstraction. Existing memory-augmen...

Keywords: multimodal memory, long video reasoning, episodic memory, semantic memory, visual memory, adaptive retrieval, temporal granularity, video QA

Unknown authors 12/3/2025 huggingface

machine learning

Despite progress in video-to-audio generation, the field focuses predominantly on mono output, lacking spatial immersion. Existing binaural approaches remain constrained by a two-stage pipeline that first generates mono audio and then performs spatialization, often resulting in error accumulation an...

Keywords: binaural audio, video-to-audio, spatial audio, conditional flow matching, dual-branch architecture, BiAudio dataset, end-to-end, multimodal

Unknown authors 12/3/2025 huggingface

computer vision

This paper presents DualCamCtrl, a novel end-to-end diffusion model for camera-controlled video generation. Recent works have advanced this field by representing camera poses as ray-based conditions, yet they often lack sufficient scene understanding and geometric awareness. DualCamCtrl specifically...

Keywords: diffusion model, camera-controlled video, depth estimation, RGB-depth fusion, semantic alignment, geometry-aware, SIGMA, dual-branch

Unknown authors 12/3/2025 huggingface

machine learning

Recent audio-video generative systems suggest that coupling modalities benefits not only audio-video synchrony but also the video modality itself. We pose a fundamental question: Does audio-video joint denoising training improve video generation, even when we only care about video quality? To study ...

Keywords: audio-video joint denoising, AVFullDiT, text-to-video, text-to-audio, multimodal learning, video generation, privileged signal, audio-visual causality

Unknown authors 12/3/2025 huggingface

natural language processing

This paper presents research on DeepSeek-V3.2 and open large language models. The full abstract is not available at this time. Please visit the paper's website for complete details about the methodology, results, and contributions.

Keywords: DeepSeek-V3.2, open large language models, open LLMs, large language models, paper metadata missing

Unknown authors 12/3/2025 huggingface

computer vision

This paper presents research on Mixture of Horizons and action chunking. The full abstract is not available at this time. Please visit the paper's website for complete details about the methodology, results, and contributions.

Keywords: Mixture of Horizons, Action Chunking, temporal modeling, mixture models