Paper Archive

Revisiting Generalization Across Difficulty Levels: It's Not So Easy

9.0/10

Yeganeh Kordi, Nihal V. Nayak, Max Zuo, Ilana Nguyen, Stephen H. Bach 11/26/2025 arxiv

natural language processing

We investigate how well large language models (LLMs) generalize across different task difficulties, a key question for effective data curation and evaluation. Existing research is mixed regarding whether training on easier or harder data leads to better results, and whether those gains come on easie...

Keywords: LLMs, generalization, difficulty, Item Response Theory, dataset curation, evaluation, curriculum learning

Canvas-to-Image: Compositional Image Generation with Multimodal Controls

9.0/10

Yusuf Dalva, Guocheng Gordon Qian, Maya Goldenberg, Tsai-Shien Chen, Kfir Aberman, Sergey Tulyakov, Pinar Yanardag, Kuan-Chieh Jackson Wang 11/26/2025 arxiv

computer vision

While modern diffusion models excel at generating high-quality and diverse images, they still struggle with high-fidelity compositional and multimodal control, particularly when users simultaneously specify text prompts, subject references, spatial arrangements, pose constraints, and layout annotati...

Keywords: Canvas-to-Image, compositional_generation, diffusion_models, multimodal_controls, composite_canvas, multi-task_training, layout, pose

TraceGen: World Modeling in 3D Trace Space Enables Learning from Cross-Embodiment Videos

9.0/10

Seungjae Lee, Yoonkyo Jung, Inkook Chun, Yao-Chih Lee, Zikui Cai, Hongjia Huang, Aayush Talreja, Tan Dat Dao, Yongyuan Liang, Jia-Bin Huang, Furong Huang 11/26/2025 arxiv

robotics

Learning new robot tasks on new platforms and in new scenes from only a handful of demonstrations remains challenging. While videos of other embodiments - humans and different robots - are abundant, differences in embodiment, camera, and environment hinder their direct use. We address the small-data...

Keywords: TraceGen, TraceForge, 3D trace-space, world model, cross-embodiment, pretraining, robot learning, few-shot

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

9.0/10

Hongjin Su, Shizhe Diao, Ximing Lu, Mingjie Liu, Jiacheng Xu, Xin Dong, Yonggan Fu, Peter Belcak, Hanrong Ye, Hongxu Yin, Yi Dong, Evelina Bakhturina, Tao Yu, Yejin Choi, Jan Kautz, Pavlo Molchanov 11/26/2025 arxiv

machine learning

Large language models are powerful generalists, yet solving deep and complex problems such as those of the Humanity's Last Exam (HLE) remains both conceptually challenging and computationally expensive. We show that small orchestrators managing other models and a variety of tools can both push the u...

Keywords: ToolOrchestra, Orchestrator, tool-use agents, reinforcement learning, efficiency-aware rewards, user-preference alignment, HLE, GPT-5

G$^2$VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning

9.0/10

Wenbo Hu, Jingli Lin, Yilin Long, Yunlong Ran, Lihan Jiang, Yifan Wang, Chenming Zhu, Runsen Xu, Tai Wang, Jiangmiao Pang 11/26/2025 arxiv

computer vision

Vision-Language Models (VLMs) still lack robustness in spatial intelligence, demonstrating poor performance on spatial understanding and reasoning tasks. We attribute this gap to the absence of a visual geometry learning process capable of reconstructing 3D space from 2D images. We present G$^2$VLM,...

Keywords: G2VLM, vision-language model, 3D reconstruction, spatial reasoning, geometry grounding, multi-view, in-context learning, 3D visual priors

Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework

9.0/10

Dong Wang, Yang Li, Ansong Ni, Ching-Feng Yeh, Youssef Emad, Xinjie Lei, Liam Robbins, Karthik Padthe, Hu Xu, Xian Li, Asli Celikyilmaz, Ramya Raghavendra, Lifei Huang, Carole-Jean Wu, Shang-Wen Li 11/26/2025 arxiv

machine learning

Synthetic data has become increasingly important for training large language models, especially when real data is scarce, expensive, or privacy-sensitive. Many such generation tasks require coordinated multi-agent workflows, where specialized agents collaborate to produce data that is higher quality...

Keywords: synthetic data, multi-agent, peer-to-peer, distributed queues, Ray, scalability, LLM inference, data generation

Seeing without Pixels: Perception from Camera Trajectories

9.0/10

Zihui Xue, Kristen Grauman, Dima Damen, Andrew Zisserman, Tengda Han 11/26/2025 arxiv

computer vision

Can one perceive a video's content without seeing its pixels, just from the camera trajectory, the path it carves through space? This paper is the first to systematically investigate this seemingly implausible question. Towards this end, we propose a contrastive learning framework to train CamFormer,...

Keywords: camera trajectory, CamFormer, contrastive learning, pose embeddings, vision-and-language, egocentric, exocentric, representation learning

On Evolution-Based Models for Experimentation Under Interference

9.0/10

Sadegh Shirani, Mohsen Bayati 11/26/2025 arxiv

machine learning

Causal effect estimation in networked systems is central to data-driven decision making. In such settings, interventions on one unit can spill over to others, and in complex physical or social systems, the interaction pathways driving these interference structures remain largely unobserved. We argue...

Keywords: interference, causal inference, exposure mapping, evolution-based models, causal message passing, difference-in-differences, spillover effects, influencer networks

Revolutionizing Glioma Segmentation & Grading Using 3D MRI - Guided Hybrid Deep Learning Models

9.0/10

Pandiyaraju V, Sreya Mynampati, Abishek Karthik, Poovarasan L, D. Saraswathi 11/26/2025 arxiv

machine learning

Gliomas are brain tumors with a high mortality rate, which makes early and accurate diagnosis important for therapeutic intervention. To address this difficulty, the proposed research will develop a hybrid deep learning model which integrates U-Net based segmentation and a...

Keywords: glioma, glioma segmentation, 3D MRI, U-Net, DenseNet, VGG, hybrid model, multi-head attention

Escaping the Verifier: Learning to Reason via Demonstrations

9.0/10

Locke Cai, Ivan Provilkov 11/26/2025 arxiv

machine learning

Training Large Language Models (LLMs) to reason often relies on Reinforcement Learning (RL) with task-specific verifiers. However, many real-world reasoning-intensive tasks lack verifiers, despite offering abundant expert demonstrations that remain under-utilized for reasoning-focused training. We i...

Keywords: RARO, RelativisticCritic, InverseReinforcementLearning, AdversarialTraining, Reasoning, LLMs, Verifier-free, StabilizationTechniques

Papers for November 28, 2025
