Paper Archive

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

9.0/10

3/26/2026 huggingface

machine learning

Multi-shot video generation is crucial for long narrative storytelling, yet current bidirectional architectures suffer from limited interactivity and high latency. We propose ShotStream, a novel causal multi-shot architecture that enables interactive storytelling and efficient on-the-fly frame gener...

Keywords: multi-shot video generation, causal generation, distribution matching distillation, dual-cache memory, RoPE discontinuity, interactive storytelling, text-to-video, real-time inference

Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting

9.0/10

3/26/2026 huggingface

computer vision

Existing feed-forward 3D Gaussian Splatting methods predict pixel-aligned primitives, leading to a quadratic growth in primitive count as resolution increases. This fundamentally limits their scalability, making high-resolution synthesis such as 4K intractable. We introduce LGTM (Less Gaussians, Tex...

Keywords: Gaussian Splatting, feed-forward, 4K, novel view synthesis, per-primitive texture, scalability, rendering

MuRF: Unlocking the Multi-Scale Potential of Vision Foundation Models

9.0/10

3/26/2026 huggingface

computer vision

Vision Foundation Models (VFMs) have become the cornerstone of modern computer vision, offering robust representations across a wide array of tasks. While recent advances allow these models to handle varying input sizes during training, inference typically remains restricted to a single, fixed scale...

Keywords: Multi-Resolution Fusion, MuRF, Vision Foundation Models, multi-scale, inference-time, training-free, feature fusion, DINOv2

Vega: Learning to Drive with Natural Language Instructions

9.0/10

3/26/2026 huggingface

machine learning

Vision-language-action models have reshaped autonomous driving by incorporating language into the decision-making process. However, most existing pipelines only utilize the language modality for scene descriptions or reasoning and lack the flexibility to follow diverse user instructions for personali...

Keywords: Vega, InstructScene, vision-language-action, instruction following, diffusion models, autoregressive, joint attention, autonomous driving

MegaFlow: Zero-Shot Large Displacement Optical Flow

9.0/10

3/26/2026 huggingface

computer vision

Accurate estimation of large displacement optical flow remains a critical challenge. Existing methods typically rely on iterative local search and/or domain-specific fine-tuning, which severely limits their performance in large displacement and zero-shot generalization scenarios. To overcome this, w...

Keywords: optical flow, zero-shot, large displacement, Vision Transformer, global matching, transfer learning, motion estimation, long-range tracking

PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

9.0/10

3/26/2026 huggingface

machine learning

Autoregressive video diffusion models have demonstrated remarkable progress, yet they remain bottlenecked by intractable linear KV-cache growth, temporal repetition, and compounding errors during long-video generation. To address these challenges, we present PackForcing, a unified framework that eff...

Keywords: PackForcing, KV-cache, context compression, video diffusion, long-video generation, Temporal RoPE, VBench

Iterated beta integrals

9.0/10

3/26/2026 huggingface

machine learning

We introduce iterated beta integrals, a new class of iterated integrals on the universal abelian covering of the punctured projective line that unifies hyperlogarithms and classical beta integrals while preserving their fundamental properties. We establish various analytic properties of these integr...

Keywords: iterated beta integrals, hyperlogarithms, beta integrals, iterated integrals, universal abelian covering, punctured projective line, multiple zeta values, t-values

Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

9.0/10

3/26/2026 huggingface

computer vision

Video world models have shown immense potential in simulating the physical world, yet existing memory mechanisms primarily treat environments as static canvases. When dynamic subjects hide out of sight and later re-emerge, current methods often struggle, leading to frozen, distorted, or vanishing su...

Keywords: Hybrid Memory, video world models, HyDRA, HM-World, spatiotemporal retrieval, dynamic subject tracking, video generation, memory compression

Spectrum of SL(2,R)-characters: the once-punctured torus case

9.0/10

3/26/2026 huggingface

machine learning

Consider a topological surface Σ. We introduce the spectrum of a representation from the fundamental group of Σ to SL(2,R): a subset of the projective measured laminations on the surface that captures the directions along which the representation fails to be Fuchsian, and which characterizes t...

Keywords: SL(2,R), spectrum, projective measured lamination, once-punctured torus, Cantor set, mapping class group, interval exchange transformation, cocycles

Compiling molecular ultrastructure into neural dynamics

9.0/10

3/26/2026 huggingface

machine learning

High-resolution brain imaging can now capture not just synapse locations but their molecular composition, with the cost of such mapping falling exponentially. Yet such ultrastructural data has so far told us little about local neuronal physiology - specifically, the parameters (e.g., synaptic effica...

Keywords: ultrastructure-to-dynamics compiler, molecularly annotated ultrastructure, physiological parameters, simulator-ready, uncertainty-aware, paired training data, biophysical simulations, structure-to-function

Browse by Date

Papers for March 30, 2026

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting

MuRF: Unlocking the Multi-Scale Potential of Vision Foundation Models

Vega: Learning to Drive with Natural Language Instructions

MegaFlow: Zero-Shot Large Displacement Optical Flow

PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

Iterated beta integrals

Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

Spectrum of SL(2,R)-characters: the once-punctured torus case

Compiling molecular ultrastructure into neural dynamics