Browse and export your curated research paper collection
Valentin Lacombe, Valentin Quesnel, Damien Sileo 3/2/2026 arxiv
machine learningTraining on verifiable symbolic data is a promising way to expand the reasoning frontier of language models beyond what standard pre-training corpora provide. Yet existing procedural generators often rely on fixed puzzles or templates and do not deliver the distributional breadth needed at scale. We...
Amir Asiaee, Kavey Aryan, James P. Long 3/2/2026 arxiv
machine learningSelective conformal prediction can yield substantially tighter uncertainty sets when we can identify calibration examples that are exchangeable with the test example. In interventional settings, such as perturbation experiments in genomics, exchangeability often holds only within subsets of interven...
Alex Serrano, Wen Xing, David Lindner, Erik Jenner 3/2/2026 arxiv
machine learningPre-deployment evaluations inspect only a limited sample of model actions. A malicious model seeking to evade oversight could exploit this by randomizing when to "defect": misbehaving so rarely that no malicious actions are observed during evaluation, but often enough that they occur eventually in d...
Drew Prinster, Clara Fannjiang, Ji Won Park, Kyunghyun Cho, Anqi Liu, Suchi Saria, Samuel Stanton 3/2/2026 arxiv
reinforcement learningAn agent must try new behaviors to explore and improve. In high-stakes environments, an agent that violates safety constraints may cause harm and must be taken offline, curtailing any future interaction. Imitating old behavior is safe, but excessive conservatism discourages exploration. How much beh...
Richard Freinschlag, Timo Bertram, Erich Kobler, Andreas Mayr, Günter Klambauer 3/2/2026 arxiv
machine learningReasoning problems such as Sudoku and ARC-AGI remain challenging for neural networks. The structured problem solving architecture family of Recurrent Reasoning Models (RRMs), including Hierarchical Reasoning Model (HRM) and Tiny Recursive Model (TRM), offer a compact alternative to large language mo...
Songtao Liu, Hongwu Peng, Zhiwei Zhang, Zhengyu Chen, Yue Guo 3/2/2026 arxiv
natural language processingLong-context inference in large language models is bottlenecked by Key--Value (KV) cache loading during the decoding stage, where the sequential nature of generation requires repeatedly transferring the KV cache from off-chip High-Bandwidth Memory (HBM) to on-chip Static Random-Access Memory (SRAM) ...
Jinqi Wu, Sishuo Chen, Zhangming Chan, Yong Bai, Lei Zhang, Sheng Chen, Chenghuan Hou, Xiang-Rong Sheng, Han Zhu, Jian Xu, Bo Zheng, Chaoyou Fu 3/2/2026 arxiv
machine learningMulti-attribution learning (MAL), which enhances model performance by learning from conversion labels yielded by multiple attribution mechanisms, has emerged as a promising learning paradigm for conversion rate (CVR) prediction. However, the conversion labels in public CVR datasets are generated by ...
Quoc-Khang Tran, Minh-Thien Nguyen, Nguyen-Khang Pham 3/2/2026 arxiv
computer visionThe classification of Intangible Cultural Heritage (ICH) images in the Mekong Delta poses unique challenges due to limited annotated data, high visual similarity among classes, and domain heterogeneity. In such low-resource settings, conventional deep learning models often suffer from high variance ...
Hao Li, Chunjiang Mu, Jianhao Chen, Siyue Ren, Zhiyao Cui, Yiqun Zhang, Lei Bai, Shuyue Hu 3/2/2026 arxiv
machine learningThe rapid proliferation of Claude agent skills has raised the central question of how to effectively leverage, manage, and scale the agent skill ecosystem. In this paper, we propose AgentSkillOS, the first principled framework for skill selection, orchestration, and ecosystem-level management. Agent...
AI Research Community 3/3/2026 huggingface
generative modelsWe present novel techniques for accelerating diffusion models while maintaining high-quality image generation. Our approach combines architectural improvements with advanced sampling strategies to achieve significant speedup in inference time without compromising output quality.
Preparing your export...