Browse and export your curated research paper collection
Unknown authors 12/3/2025 huggingface
machine learning
We present MG-Nav (Memory-Guided Navigation), a dual-scale framework for zero-shot visual navigation that unifies global memory-guided planning with local geometry-enhanced control. At its core is the Sparse Spatial Memory Graph (SMG), a compact, region-centric memory where each node aggregates mult...
Unknown authors 12/3/2025 huggingface
machine learning
Achieving fully autonomous driving systems requires learning rational decisions across a wide span of scenarios, including safety-critical and out-of-distribution ones. However, such cases are underrepresented in real-world corpora collected by human experts. To compensate for the lack of data diversity,...
Unknown authors 12/3/2025 huggingface
computer vision
Current video generation techniques excel at single-shot clips but struggle to produce narrative multi-shot videos, which require flexible shot arrangement, coherent narrative, and controllability beyond text prompts. To tackle these challenges, we propose MultiShotMaster, a framework for highly con...
Unknown authors 12/3/2025 huggingface
machine learning
AI self-evolution has long been envisioned as a path toward superintelligence, where models autonomously acquire, refine, and internalize knowledge from their own learning experiences. Yet in practice, unguided self-evolving systems often plateau quickly or even degrade as training progresses. These...
Unknown authors 12/3/2025 huggingface
computer vision
Recent advances in video large language models have demonstrated strong capabilities in understanding short clips. However, scaling them to hours- or days-long videos remains highly challenging due to limited context capacity and the loss of critical visual details during abstraction. Existing memory-augmen...
Unknown authors 12/3/2025 huggingface
machine learning
Despite progress in video-to-audio generation, the field focuses predominantly on mono output, lacking spatial immersion. Existing binaural approaches remain constrained by a two-stage pipeline that first generates mono audio and then performs spatialization, often resulting in error accumulation an...
Unknown authors 12/3/2025 huggingface
computer vision
This paper presents DualCamCtrl, a novel end-to-end diffusion model for camera-controlled video generation. Recent works have advanced this field by representing camera poses as ray-based conditions, yet they often lack sufficient scene understanding and geometric awareness. DualCamCtrl specifically...
Unknown authors 12/3/2025 huggingface
machine learning
Recent audio-video generative systems suggest that coupling modalities benefits not only audio-video synchrony but also the video modality itself. We pose a fundamental question: Does audio-video joint denoising training improve video generation, even when we only care about video quality? To study ...
Unknown authors 12/3/2025 huggingface
natural language processing
This paper presents research on deepseek-v3.2, pushing, frontier. The full abstract is not available at this time. Please visit the paper's website for complete details about the methodology, results, and contributions.
Unknown authors 12/3/2025 huggingface
computer vision
This paper presents research on mixture, horizons, action. The full abstract is not available at this time. Please visit the paper's website for complete details about the methodology, results, and contributions.