Paper Archive

Browse and export your curated research paper collection

Archived Days: 33
Total Papers: 330
Avg Score: 7.8
Categories: 7

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations
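As a rough illustration of moving between the export formats, the sketch below converts a JSON export into BibTeX entries. The record fields (`authors`, `date`, `keywords`) and the entry-key scheme are assumptions based on the listing shown here, not the archive's actual export schema.

```python
import json

def archive_to_bibtex(json_text):
    """Convert a paper-archive JSON export to BibTeX @misc entries.

    Assumed schema (hypothetical): a list of records, each with
    "authors" (list of strings), "date" (MM/DD/YYYY), and
    "keywords" (list of strings).
    """
    entries = []
    for i, paper in enumerate(json.loads(json_text)):
        # Build a citation key from the first author's surname and the year.
        first_surname = paper["authors"][0].split()[-1].lower()
        year = paper["date"].split("/")[-1]
        key = f"{first_surname}{year}_{i}"
        entries.append(
            "@misc{" + key + ",\n"
            "  author = {" + " and ".join(paper["authors"]) + "},\n"
            "  year = {" + year + "},\n"
            "  keywords = {" + ", ".join(paper["keywords"]) + "}\n"
            "}"
        )
    return "\n\n".join(entries)
```

Real BibTeX entries would also need a `title` field, which the truncated listing above does not provide.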
Browse by Date

Papers for October 21, 2025

10 papers found

Zixin Yin, Ling-Hao Chen, Lionel Ni, Xili Dai 10/20/2025 arxiv

computer vision

Recent advances in training-free attention control methods have enabled flexible and efficient text-guided editing capabilities for existing generation models. However, current approaches struggle to simultaneously deliver strong editing strength while preserving consistency with the source. This li...

Keywords: ConsistEdit, MM-DiT, attention control, vision-only attention, mask-guided fusion, query-key-value manipulation, training-free, image editing

Rui Pan, Yang Luo, Yuxing Liu, Yang You, Tong Zhang 10/20/2025 arxiv

machine learning

Memory-efficient optimization is critical for training increasingly large language models (LLMs). A popular strategy involves gradient low-rank projection, storing only the projected optimizer states, with GaLore being a representative example. However, a significant drawback of many such methods is...

Keywords: low-rank projection, GaLore, Muon, GUM, layerwise sampling, unbiased optimizer, convergence guarantees, LLM fine-tuning

Yulin Luo, Chun-Kai Fan, Menghang Dong, Jiayu Shi, Mengdi Zhao, Bo-Wen Zhang, Cheng Chi, Jiaming Liu, Gaole Dai, Rongyu Zhang, Ruichuan An, Kun Wu, Zhengping Che, Shaoxuan Xie, Guocai Yao, Zhongxia Zhao, Pengwei Wang, Guang Liu, Zhongyuan Wang, Tiejun Huang, Shanghang Zhang 10/20/2025 arxiv

robotics

Building robots that can perceive, reason, and act in dynamic, unstructured environments remains a core challenge. Recent embodied systems often adopt a dual-system paradigm, where System 2 handles high-level reasoning while System 1 executes low-level control. In this work, we refer to System 2 as ...

Keywords: RoboBench, multimodal LLM, embodied AI, benchmark, robotics, affordance, planning, perception reasoning

Jiale Cheng, Yusen Liu, Xinyu Zhang, Yulin Fei, Wenyi Hong, Ruiliang Lyu, Weihan Wang, Zhe Su, Xiaotao Gu, Xiao Liu, Yushi Bai, Jie Tang, Hongning Wang, Minlie Huang 10/20/2025 arxiv

machine learning

Large language models (LLMs) increasingly rely on long-context modeling for tasks such as document understanding, code analysis, and multi-step reasoning. However, scaling context windows to the million-token level brings prohibitive computational and memory costs, limiting the practicality of long-...

Keywords: Glyph, visual-text compression, vision-language models, long-context LLMs, LLM-driven genetic search, token compression, document understanding

Akshara Prabhakar, Roshan Ram, Zixiang Chen, Silvio Savarese, Frank Wang, Caiming Xiong, Huan Wang, Weiran Yao 10/20/2025 arxiv

machine learning

As information grows exponentially, enterprises face increasing pressure to transform unstructured data into coherent, actionable insights. While autonomous agents show promise, they often struggle with domain-specific nuances, intent alignment, and enterprise integration. We present Enterprise Deep...

Keywords: EDR, multi-agent system, enterprise analytics, Master Planning Agent, reflection mechanism, NL2SQL, visualization agent, DeepResearch Bench

Yujie Luo, Zhuoyun Yu, Xuehai Wang, Yuqi Zhu, Ningyu Zhang, Lanning Wei, Lun Du, Da Zheng, Huajun Chen 10/20/2025 arxiv

machine learning

Replicating AI research is a crucial yet challenging task for large language model (LLM) agents. Existing approaches often struggle to generate executable code, primarily due to insufficient background knowledge and the limitations of retrieval-augmented generation (RAG) methods, which fail to captu...

Keywords: Executable Knowledge Graphs, xKG, reproducibility, PaperBench, RAG, knowledge graph, code extraction, LLM agents

Austin Xu, Xuan-Phi Nguyen, Yilun Zhou, Chien-Sheng Wu, Caiming Xiong, Shafiq Joty 10/20/2025 arxiv

machine learning

Finetuning specialized generative evaluators has emerged as a popular paradigm to meet the increasing demand for scalable evaluation during both training and test-time. However, recent work has largely focused on applying new methodology, such as reinforcement learning (RL), to training evaluators, ...

Keywords: foundational evaluators, FARE, evaluation, reasoning, supervised finetuning, rejection sampling, MATH, RL verification

Yuhao Yang, Zhen Yang, Zi-Yi Dou, Anh Nguyen, Keen You, Omar Attia, Andrew Szot, Michael Feng, Ram Ramrakhya, Alexander Toshev, Chao Huang, Yinfei Yang, Zhe Gan 10/20/2025 arxiv

machine learning

Multimodal agents for computer use rely exclusively on primitive actions (click, type, scroll) that require accurate visual grounding and lengthy execution chains, leading to cascading failures and performance bottlenecks. While other agents leverage rich programmatic interfaces (APIs, MCP servers, ...

Keywords: Hybrid Action, Computer Use Agents, GUI primitives, Programmatic Tools, Synthetic Data, Online Reinforcement Learning, Supervised Fine-tuning, OSWorld

Jackson Harmon, Andreas Hochlehnert, Matthias Bethge, Ameya Prabhu 10/20/2025 arxiv

natural language processing

Scaled post-training now drives many of the largest capability gains in language models (LMs), yet its effect on pretrained knowledge remains poorly understood. Not all forgetting is equal: Forgetting one fact (e.g., a U.S. president or an API call) does not "average out" by recalling another. Hence...

Keywords: post-training, forgetting, backward transfer, sample-wise metric, chance-adjusted accuracy, domain-continual pretraining, RL/SFT, instruction tuning

Samir Khaki, Junxian Guo, Jiaming Tang, Shang Yang, Yukang Chen, Konstantinos N. Plataniotis, Yao Lu, Song Han, Zhijian Liu 10/20/2025 arxiv

machine learning

Vision Language Models (VLMs) have rapidly advanced in integrating visual and textual reasoning, powering applications across high-resolution image understanding, long-video analysis, and multi-turn conversation. However, their scalability remains limited by the growing number of visual tokens that ...

Keywords: SparseVILA, visual sparsity, vision-language models, prefill pruning, query-aware retrieval, AWQ, inference acceleration, multimodal