Paper Archive

Browse and export your curated research paper collection

36 Archived Days • 360 Total Papers • 7.9 Avg Score • 7 Categories
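The headline numbers above can be recomputed directly from an exported JSON snapshot (described in the next section). The sketch below is only illustrative: the field names (date, score, category) and the file name archive_export.json are assumptions about the export schema, not a documented format.

```python
import json
from statistics import mean

def summarize(path: str) -> dict:
    """Recompute the archive-level summary stats from a JSON export file."""
    with open(path, encoding="utf-8") as f:
        papers = json.load(f)  # assumed: a list of per-paper records
    return {
        "archived_days": len({p["date"] for p in papers}),      # distinct archive dates
        "total_papers": len(papers),
        "avg_score": round(mean(p["score"] for p in papers), 1),  # assumed per-paper score field
        "categories": len({p["category"] for p in papers}),
    }

if __name__ == "__main__":
    print(summarize("archive_export.json"))
```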

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations
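As a rough sketch of how the JSON export maps onto the BibTeX option, the snippet below converts one archived record into a BibTeX entry. The record shape (title, authors, date, source, category, keywords) and the sample values are hypothetical assumptions about the export schema, not the archive's actual output.

```python
import json

# Hypothetical example of one exported record; the real schema may differ.
SAMPLE_EXPORT = """
[
  {
    "title": "Example Paper Title",
    "authors": ["Ada Lovelace", "Alan Turing"],
    "date": "10/22/2025",
    "source": "arxiv",
    "category": "machine learning",
    "keywords": ["example", "sketch"]
  }
]
"""

def to_bibtex(paper: dict) -> str:
    """Render one archived paper as a BibTeX @misc entry."""
    first_author_surname = paper["authors"][0].split()[-1].lower()
    year = paper["date"].split("/")[-1]
    key = f"{first_author_surname}{year}"          # e.g. lovelace2025
    authors = " and ".join(paper["authors"])       # BibTeX author separator
    keywords = ", ".join(paper.get("keywords", []))
    return (
        f"@misc{{{key},\n"
        f"  title    = {{{paper['title']}}},\n"
        f"  author   = {{{authors}}},\n"
        f"  year     = {{{year}}},\n"
        f"  note     = {{{paper['source']}, {paper['category']}}},\n"
        f"  keywords = {{{keywords}}}\n"
        f"}}"
    )

if __name__ == "__main__":
    for paper in json.loads(SAMPLE_EXPORT):
        print(to_bibtex(paper))
```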
Browse by Date

Papers for October 23, 2025

10 papers found

Ilona Demler, Saumya Chauhan, Georgia Gkioxari 10/22/2025 arxiv

computer vision

We introduce ITTO, a challenging new benchmark suite for evaluating and diagnosing the capabilities and limitations of point tracking methods. Our videos are sourced from existing datasets and egocentric real-world recordings, with high-quality human annotations collected through a multi-stage pipel...

Keywords: ITTO, point tracking, benchmark, occlusion, egocentric, re-identification, dataset, motion complexity

Jacob Berg, Chuning Zhu, Yanda Bao, Ishan Durugkar, Abhishek Gupta 10/22/2025 arxiv

robotics

Planning with world models offers a powerful paradigm for robotic control. Conventional approaches train a model to predict future frames conditioned on current frames and actions, which can then be used for planning. However, the objective of predicting future pixels is often at odds with the actua...

Keywords: semantic world models, vision-language models, visual question answering, planning, robotics, policy improvement, generalization, action-conditional modeling

Jake Poznanski, Luca Soldaini, Kyle Lo 10/22/2025 arxiv

machine learning

We present olmOCR 2, the latest in our family of powerful OCR systems for converting digitized print documents, like PDFs, into clean, naturally ordered plain text. olmOCR 2 is powered by olmOCR-2-7B-1025, a specialized, 7B vision language model (VLM) trained using reinforcement learning with verifi...

Keywords: OCR, vision-language model, reinforcement learning, unit tests, synthetic data, document parsing, tables, math OCR

Siyang Wu, Jack Nugent, Willow Yang, Jia Deng 10/22/2025 arxiv

computer vision

Monocular depth estimation is an important task with rapid progress, but how to evaluate it remains an open question, as evidenced by a lack of standardization in existing literature and a large selection of evaluation metrics whose trade-offs and behaviors are not well understood. This paper contri...

Keywords: monocular depth estimation, evaluation metrics, sensitivity analysis, relative surface normals, human judgment, depth visualization, composite metrics, curvature

Johnny Tian-Zheng Wei, Ameya Godbole, Mohammad Aflah Khan, Ryan Wang, Xiaoyuan Zhu, James Flemings, Nitya Kashyap, Krishna P. Gummadi, Willie Neiswanger, Robin Jia 10/22/2025 arxiv

machine learning

We present Hubble, a suite of fully open-source large language models (LLMs) for the scientific study of LLM memorization. Hubble models come in standard and perturbed variants: standard models are pretrained on a large English corpus, and perturbed models are trained in the same way but with contro...

Keywords: LLM memorization, data privacy, membership inference, machine unlearning, open-source models, training dynamics, dataset perturbation

Yusu Qian, Eli Bocek-Rivele, Liangchen Song, Jialing Tong, Yinfei Yang, Jiasen Lu, Wenze Hu, Zhe Gan 10/22/2025 arxiv

machine learning

Recent advances in multimodal models have demonstrated remarkable text-guided image editing capabilities, with systems like GPT-4o and Nano-Banana setting new benchmarks. However, the research community's progress remains constrained by the absence of large-scale, high-quality, and openly accessible...

Keywords: Pico-Banana-400K, image editing, dataset, Nano-Banana, OpenImages, MLLM quality scoring, multi-turn edits, alignment

Xichen Zhang, Sitong Wu, Yinghao Zhu, Haoru Tan, Shaozuo Yu, Ziyi He, Jiaya Jia 10/22/2025 arxiv

reinforcement learning

Reinforcement learning from verifiable rewards has emerged as a powerful technique for enhancing the complex reasoning abilities of Large Language Models (LLMs). However, these methods are fundamentally constrained by the "learning cliff" phenomenon: when faced with problems far beyond their curre...

Keywords: Scaf-GRPO, GRPO, scaffolding, learning cliff, in-prompt hints, LLM reasoning, reinforcement learning from verifiable rewards, Qwen2.5-Math-7B

David Mora, Viraat Aryabumi, Wei-Yin Ko, Sara Hooker, Julia Kreutzer, Marzieh Fadaee 10/22/2025 arxiv

machine learning

Synthetic data has become a cornerstone for scaling large language models, yet its multilingual use remains bottlenecked by translation-based prompts. This strategy inherits English-centric framing and style and neglects cultural dimensions, ultimately constraining model generalization. We argue tha...

Keywords: multilingual, synthetic data, prompt optimization, cultural adaptation, difficulty enhancement, Global-MMLU, Flores XCometXL, mArenaHard

Sandra Malagon, Monica A. Ulloa Ruiz, Tatiana Elizabeth Sandoval Plaza, Gabriel Rafael Rosario Bolívar, Valentina García Mesa, Ivanna Alvarado Morales 10/22/2025 arxiv

machine learning

The rapid escalation of computational requirements for training large-scale language models has reinforced structural asymmetries between high-capacity jurisdictions and countries in the Global South. This paper examines the technical and fiscal feasibility of sovereign-scale language model training...

Keywords: sovereign language models, Brazil, Mexico, H100, A100, compute governance, energy consumption, fiscal feasibility

Roey Magen, Gal Vardi 10/22/2025 arxiv

machine learning

Transformers have demonstrated impressive in-context learning (ICL) capabilities, raising the question of whether they can serve as metalearners that adapt to new tasks using only a small number of in-context examples, without any further training. While recent theoretical work has studied transform...

Keywords: Transformers, In-context learning, Metalearning, Sample complexity, Gaussian mixture, Subspace, Gradient descent, Linear classification