Paper Archive

Seeing Fast and Slow: Learning the Flow of Time in Videos

9.0/10

4/23/2026 huggingface

computer vision

How can we tell whether a video has been sped up or slowed down? How can we generate videos at different speeds? Although videos have been central to modern computer vision research, little attention has been paid to perceiving and controlling the passage of time. In this paper, we study time as a l...

Keywords: temporal reasoning, self-supervised learning, speed estimation, slow-motion dataset, speed-conditioned generation, temporal super-resolution, video forensics, multimodal cues

Vista4D: Video Reshooting with 4D Point Clouds

9.0/10

4/23/2026 huggingface

computer vision

We present Vista4D, a robust and flexible video reshooting framework that grounds the input video and target cameras in a 4D point cloud. Specifically, given an input video, our method re-synthesizes the scene with the same dynamics from a different camera trajectory and viewpoint. Existing video re...

Keywords: 4D point cloud, video reshooting, dynamic scene, multiview reconstruction, camera control, view synthesis, scene recomposition

UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection

9.0/10

4/23/2026 huggingface

computer vision

In recent years, significant progress has been made in both image generation and generated image detection. Despite their rapid, yet largely independent, development, these two fields have evolved distinct architectural paradigms: the former predominantly relies on generative networks, while the lat...

Keywords: unified framework, generative-discriminative, symbiotic self-attention, detector-informed alignment, image generation, deepfake detection, co-evolution

TingIS: Real-time Risk Event Discovery from Noisy Customer Incidents at Enterprise Scale

9.0/10

4/23/2026 huggingface

machine learning

Real-time detection and mitigation of technical anomalies are critical for large-scale cloud-native services, where even minutes of downtime can result in massive financial losses and diminished user trust. While customer incidents serve as a vital signal for discovering risks missed by monitoring, ...

Keywords: incident detection, event linking, LLMs, noise reduction, real-time, routing, clustering, SRE

Revisiting Non-Verbatim Memorization in Large Language Models: The Role of Entity Surface Forms

9.0/10

4/23/2026 huggingface

natural language processing

Understanding what kinds of factual knowledge large language models (LLMs) memorize is essential for evaluating their reliability and limitations. Entity-based QA is a common framework for analyzing non-verbatim memorization, but typical evaluations query each entity using a single canonical surface...

Keywords: non-verbatim memorization, RedirectQA, entity surface forms, Wikipedia redirects, Wikidata, LLMs, entity frequency, QA robustness

Thinking with Reasoning Skills: Fewer Tokens, More Accuracy

9.0/10

4/23/2026 huggingface

machine learning

Reasoning LLMs often spend substantial tokens on long intermediate reasoning traces (e.g., chain-of-thought) when solving new problems. We propose to summarize and store reusable reasoning skills distilled from extensive deliberation and trial-and-error exploration, and to retrieve these skills at i...

Keywords: reasoning skills, skill distillation, retrieval, chain-of-thought, efficiency, coding reasoning, mathematical reasoning, LLMs

StyleID: A Perception-Aware Dataset and Metric for Stylization-Agnostic Facial Identity Recognition

9.0/10

4/23/2026 huggingface

computer vision

Creative face stylization aims to render portraits in diverse visual idioms such as cartoons, sketches, and paintings while retaining recognizable identity. However, current identity encoders, which are typically trained and calibrated on natural photographs, exhibit severe brittleness under styliza...

Keywords: StyleID, StyleBench, face recognition, stylization, psychophysics, 2AFC, perception-aware, dataset

WorldMark: A Unified Benchmark Suite for Interactive Video World Models

9.0/10

4/23/2026 huggingface

computer vision

Interactive video generation models such as Genie, YUME, HY-World, and Matrix-Game are advancing rapidly, yet every model is evaluated on its own benchmark with private scenes and trajectories, making fair cross-model comparison impossible. Existing public benchmarks offer useful metrics such as tra...

Keywords: interactive video, world models, benchmark, action mapping, evaluation toolkit, control alignment, visual quality, world consistency

Sapiens2

9.0/10

4/23/2026 huggingface

computer vision

We present Sapiens2, a model family of high-resolution transformers for human-centric vision focused on generalization, versatility, and high-fidelity outputs. Our model sizes range from 0.4 to 5 billion parameters, with native 1K resolution and hierarchical variants that support 4K. Sapiens2 substa...

Keywords: Sapiens2, high-resolution transformer, human-centric vision, masked image reconstruction, self-distilled contrastive, windowed attention, 4K, curated dataset

DiffNR: Diffusion-Enhanced Neural Representation Optimization for Sparse-View 3D Tomographic Reconstruction

9.0/10

4/23/2026 huggingface

machine learning

Neural representations (NRs), such as neural fields and 3D Gaussians, effectively model volumetric data in computed tomography (CT) but suffer from severe artifacts under sparse-view settings. To address this, we propose DiffNR, a novel framework that enhances NR optimization with diffusion priors. ...

Keywords: DiffNR, SliceFixer, diffusion priors, neural representations, sparse-view CT, tomographic reconstruction, PSNR

Browse by Date

Papers for April 24, 2026