Paper Archive

SimuScene: Simulation-Ready Compositional 3D Scene Reconstruction from a Single Image

5.0/10

6/2/2026 huggingface

computer vision

Reconstructing interactive, simulation-ready 3D scenes from a single image is a critical bottleneck for robotic manipulation. While recent single-image lifters recover plausible per-object shapes, composing them yields scenes that collapse under physical simulation due to interpenetrating, hovering,...

Value-Aware Stochastic KV Cache Eviction for Reasoning Models

5.0/10

6/2/2026 huggingface

machine learning

Reasoning models improve accuracy through extended chains of thought, but their long outputs create a memory and compute bottleneck. KV cache eviction methods reduce this cost by evicting unimportant key-value pairs from the cache, yet they often yield worse accuracy than selection-based sparse atte...

Keywords: attention

KletterMix: Climbing Toward High-Quality German Pretraining Data

5.0/10

6/2/2026 huggingface

natural language processing

High-quality pretraining data is a central ingredient in modern language models, but German-language resources remain far less developed than their English counterparts: they are often smaller, less carefully curated, weakly documented, and rarely validated through controlled training experiments. W...

Keywords: natural language processing, pretraining

World Models Meet Language Models: On the Complementarity of Concrete and Abstract Reasoning

5.0/10

6/2/2026 huggingface

computer vision

World models and multimodal large language models (MLLMs) provide complementary capabilities for predicting future outcomes from static visual observations. World models can generate concrete visual rollouts of possible futures, while MLLMs can reason abstractly over questions, goals, and rules. How...

Diffusing in the Right Space: A Systematic Study of Latent Diffusability

5.0/10

6/2/2026 huggingface

computer vision

Latent diffusion models leverage visual tokenizers to compress images into latent spaces for efficient generative modeling. However, better reconstruction quality of a tokenizer does not necessarily translate into better generation quality, suggesting that latent representations should be evaluated ...

Keywords: diffusion model

Acenaphthene Derivatives as Signatures of C_{11}H_9^+ Reactivity with Methylated Naphthalenes

5.0/10

6/2/2026 huggingface

machine learning

The C_{11}H_9^+ ion is the dominant fragment cation formed from methyl-naphthalene (MeNp) and dimethyl-naphthalene (diMeNp). Using the multiplex capabilities of PIRENEA, a setup dedicated to laboratory astrophysics, we studied the reactivity of the benzylium-like isomers of C_{11}H_9^+ with diMeNp under...

When Attention Collapses: Stage-Aware Visual Token Pruning from Structure to Semantics

5.0/10

6/2/2026 huggingface

computer vision

Vision-Language Models (VLMs) have demonstrated remarkable capabilities but suffer from significant computational overhead during inference. While visual token pruning offers a promising solution, existing methods predominantly rely on initial attention scores. This single-metric paradigm presents a...

Keywords: attention

ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning

5.0/10

6/2/2026 huggingface

reinforcement learning

Large Reasoning Models (LRMs) have achieved remarkable progress thanks to Reinforcement Learning with Verifiable Rewards (RLVR) on Chain-of-Thoughts (CoTs). However, since long CoTs naturally contain trial and error and mainstream RLVR approaches choose outcome-correct CoT trajectories for memoriza...

Keywords: reinforcement learning

WavTTS: Towards High-Quality Zero-Shot TTS via Direct Raw Waveform Modeling

5.0/10

6/2/2026 huggingface

computer vision

Recently, diffusion models operating on VAE latents or mel-spectrograms have become the dominant paradigm for zero-shot TTS. Although these compressed representations improve generation efficiency, they inevitably suffer from information loss and non-end-to-end training. Theoretically, directly mode...

Keywords: transformer, diffusion model

SEA-NLI: Natural Language Inference as a Lens into Southeast Asian Cultural Understanding

5.0/10

6/2/2026 huggingface

natural language processing

Frontier LLMs perform well in Western contexts, but remain poorly tested on underrepresented cultures such as those in Southeast Asia (SEA). Existing NLI benchmarks are largely Western-centric, translation-derived, or monolingual, limiting their ability to measure culturally grounded reasoning. We i...

Browse by Date

Papers for June 3, 2026

SimuScene: Simulation-Ready Compositional 3D Scene Reconstruction from a Single Image

Value-Aware Stochastic KV Cache Eviction for Reasoning Models

KletterMix: Climbing Toward High-Quality German Pretraining Data

World Models Meet Language Models: On the Complementarity of Concrete and Abstract Reasoning

Diffusing in the Right Space: A Systematic Study of Latent Diffusability

Acenaphthene Derivatives as Signatures of C_{11}H_9^+ Reactivity with Methylated Naphthalenes

When Attention Collapses: Stage-Aware Visual Token Pruning from Structure to Semantics

ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning

WavTTS: Towards High-Quality Zero-Shot TTS via Direct Raw Waveform Modeling

SEA-NLI: Natural Language Inference as a Lens into Southeast Asian Cultural Understanding