Browse and export your curated research paper collection
4/23/2026 huggingface
computer vision
How can we tell whether a video has been sped up or slowed down? How can we generate videos at different speeds? Although videos have been central to modern computer vision research, little attention has been paid to perceiving and controlling the passage of time. In this paper, we study time as a l...
4/23/2026 huggingface
computer vision
We present Vista4D, a robust and flexible video reshooting framework that grounds the input video and target cameras in a 4D point cloud. Specifically, given an input video, our method re-synthesizes the scene with the same dynamics from a different camera trajectory and viewpoint. Existing video re...
4/23/2026 huggingface
computer vision
In recent years, significant progress has been made in both image generation and generated image detection. Despite their rapid, yet largely independent, development, these two fields have evolved distinct architectural paradigms: the former predominantly relies on generative networks, while the lat...
4/23/2026 huggingface
machine learning
Real-time detection and mitigation of technical anomalies are critical for large-scale cloud-native services, where even minutes of downtime can result in massive financial losses and diminished user trust. While customer incidents serve as a vital signal for discovering risks missed by monitoring, ...
4/23/2026 huggingface
natural language processing
Understanding what kinds of factual knowledge large language models (LLMs) memorize is essential for evaluating their reliability and limitations. Entity-based QA is a common framework for analyzing non-verbatim memorization, but typical evaluations query each entity using a single canonical surface...
4/23/2026 huggingface
machine learning
Reasoning LLMs often spend substantial tokens on long intermediate reasoning traces (e.g., chain-of-thought) when solving new problems. We propose to summarize and store reusable reasoning skills distilled from extensive deliberation and trial-and-error exploration, and to retrieve these skills at i...
4/23/2026 huggingface
computer vision
Creative face stylization aims to render portraits in diverse visual idioms such as cartoons, sketches, and paintings while retaining recognizable identity. However, current identity encoders, which are typically trained and calibrated on natural photographs, exhibit severe brittleness under styliza...
4/23/2026 huggingface
computer vision
Interactive video generation models such as Genie, YUME, HY-World, and Matrix-Game are advancing rapidly, yet every model is evaluated on its own benchmark with private scenes and trajectories, making fair cross-model comparison impossible. Existing public benchmarks offer useful metrics such as tra...
4/23/2026 huggingface
computer vision
We present Sapiens2, a model family of high-resolution transformers for human-centric vision focused on generalization, versatility, and high-fidelity outputs. Our model sizes range from 0.4 to 5 billion parameters, with native 1K resolution and hierarchical variants that support 4K. Sapiens2 substa...
4/23/2026 huggingface
machine learning
Neural representations (NRs), such as neural fields and 3D Gaussians, effectively model volumetric data in computed tomography (CT) but suffer from severe artifacts under sparse-view settings. To address this, we propose DiffNR, a novel framework that enhances NR optimization with diffusion priors. ...