Paper Archive

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

9.0/10

5/28/2026 huggingface

robotics

Robot manipulation critically depends on perception that preserves the action-relevant aspects of a scene. Yet most robot learning pipelines are built upon visual encoders pre-trained for static recognition or vision-language alignment, leaving motion understanding to downstream policies. We introdu...

Keywords: robotics perception, multimodal learning, dynamics-aware representation, computer vision, robot manipulation, pre-training, visual encoders

LLMSurgeon: Diagnosing Data Mixture of Large Language Models

9.0/10

5/28/2026 huggingface

machine learning

The pretraining data mixture of Large Language Models (LLMs) constitutes their "digital DNA", shaping model behaviors, capabilities, and failure modes. Yet this composition is rarely disclosed, making post-hoc auditing of data combination or provenance difficult. In this work, we formalize Data Mixt...

Keywords: LLM auditing, data mixture, provenance, digital DNA, inverse problem, LLMSurgeon, LLMScan

Colored Noise Diffusion Sampling

9.0/10

5/28/2026 huggingface

computer vision

Diffusion models achieve state-of-the-art image synthesis, with their generative trajectories fundamentally exhibiting a spectral bias, resolving low-frequency global structures early and high-frequency fine details later. Conventional stochastic differential equation (SDE) solvers fail to account f...

Keywords: diffusion models, colored noise, sampling methods, spectral bias, image synthesis, SDE solvers, frequency-decoupled energy transfer

Tiny but Trusted: Efficient Vision-Language Reasoning for Time-Series Anomaly Detection

9.0/10

5/28/2026 huggingface

computer vision

Recent advances in Vision-Language Models (VLMs) have achieved impressive performance across many tasks, yet prior studies report unsatisfactory performance when applying large language or multimodal models to finding abnormal patterns in sequential data. Public anomaly detection benchmarks typicall...

Keywords: Vision-Language Models, Time-Series Anomaly Detection, Explainable AI, Parameter-Efficient Fine-Tuning, Benchmark

AdaState: Self-Evolving Anchors for Streaming Video Generation

9.0/10

5/28/2026 huggingface

computer vision

Autoregressive video diffusion models generate streaming video by producing frames sequentially, conditioning each chunk on previously generated content. These models are structurally anchored to the first frame: its key-value representation occupies a privileged position in the attention cache and ...

Keywords: video generation, autoregressive models, diffusion models, adaptive state, streaming video, temporal dynamics, attention mechanisms

Papers for May 30, 2026

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

LLMSurgeon: Diagnosing Data Mixture of Large Language Models

Colored Noise Diffusion Sampling

Tiny but Trusted: Efficient Vision-Language Reasoning for Time-Series Anomaly Detection

AdaState: Self-Evolving Anchors for Streaming Video Generation