Paper Archive

TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction

5.0/10

5/25/2026 huggingface

computer vision

Sparse-view 3D reconstruction is increasingly addressed with feed-forward splatting networks that predict explicit primitives directly from images. Yet most existing methods remain centered on Gaussian primitives and expose surfaces only indirectly: extracting a usable mesh for downstream simulation...

MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research

5.0/10

5/25/2026 huggingface

natural language processing

We present MobileGym, a browser-hosted, lightweight, fully controllable environment for everyday mobile use, targeting interaction fidelity without replicating proprietary backends. It enables two capabilities previously out of reach for everyday apps: verifiable outcome signals through deterministi...

Helix4D: Complex 4D Mesh Generation

5.0/10

5/25/2026 huggingface

computer vision

Current video-to-4D methods struggle with complex topology changes, transparent materials, thin structures, and inner surfaces. We present Helix4D, a dynamic mesh generation framework that inherits the expressive representation of Trellis2, adapting it from image-to-3D to video-conditioned 4D genera...

Keywords: attention

Reinforcing Few-step Generators via Reward-Tilted Distribution Matching

5.0/10

5/25/2026 huggingface

computer vision

Recent advances in few-step diffusion distillation have enabled efficient image generation, yet aligning these models with human preferences remains challenging. We propose Reward-Tilted Distribution Matching Distillation (RTDMD), a two-stage framework that unifies distribution matching distillation...

Keywords: reinforcement learning, backpropagation

On-Policy Adversarial Flow Distillation for Autoregressive Video Generation

5.0/10

5/25/2026 huggingface

computer vision

Autoregressive video generators are attractive for streaming, long-horizon, and interactive applications, but distilling strong black-box teachers into causal students remains difficult. The student must learn under its own rollout distribution, whereas practical teachers may expose only prompt-cond...

Keywords: reinforcement learning, fine-tuning

InstructSAM: Segment Any Instance with Any Instructions

5.0/10

5/25/2026 huggingface

computer vision

In this paper, we introduce InstructSAM, a unified and streamlined framework designed for multi-instance segmentation under arbitrary instructions. We formulate instruction-driven instance segmentation as a set-structured query prediction problem and propose an explicit reasoning-to-instance query ...

Keywords: attention, segmentation

Pixel-Level Pavement Distress Assessment Using Instance Segmentation

5.0/10

5/25/2026 huggingface

computer vision

Automated pavement distress assessment requires more than image-level classification or coarse bounding box detection, demanding precise localization of thin, branching, and irregular cracks to achieve the geometric precision necessary for maintenance-relevant quantification. This paper presents a v...

Keywords: cnn, fine-tuning, segmentation, detection, classification

Channel-wise Vector Quantization

5.0/10

5/25/2026 huggingface

computer vision

We present Channel-wise Vector Quantization (CVQ), a novel image tokenization paradigm that replaces patch-wise tokens with channel-wise tokens. Unlike conventional vector quantization, which assigns a discrete token to each patch feature vector, CVQ quantizes each channel of the feature map. This f...

Claw-Anything: Benchmarking Always-On Personal Assistants with Broader Access to User's Digital World

5.0/10

5/25/2026 huggingface

computer vision

Large language model agents are increasingly envisioned as always-on personal assistants with access to anything relevant in the user's digital world. Yet current systems operate over only narrow slices of that world, limiting context-sensitive reasoning and effective assistance. Existing benchmarks...

Keywords: gpt

SemBridge: Language Transfer in Sparse Encoders via Multilingual Semantic Bridges

5.0/10

5/25/2026 huggingface

natural language processing

Sparse encoders offer high-precision retrieval by representing term importance within a vocabulary space, yet their English-centric structures pose a critical impediment to language transfer for non-English languages. To overcome this structural limitation, we propose SemBridge, a novel embedding in...

Keywords: fine-tuning

Browse by Date

Papers for May 26, 2026

TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction

MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research

Helix4D: Complex 4D Mesh Generation

Reinforcing Few-step Generators via Reward-Tilted Distribution Matching

On-Policy Adversarial Flow Distillation for Autoregressive Video Generation

InstructSAM: Segment Any Instance with Any Instructions

Pixel-Level Pavement Distress Assessment Using Instance Segmentation

Channel-wise Vector Quantization

Claw-Anything: Benchmarking Always-On Personal Assistants with Broader Access to User's Digital World

SemBridge: Language Transfer in Sparse Encoders via Multilingual Semantic Bridges