Paper Archive

Browse and export your curated research paper collection

Archived Days: 33
Total Papers: 330
Avg Score: 7.8
Categories: 7

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations

Browse by Date

Papers for October 11, 2025

10 papers found

Haofei Xu, Daniel Barath, Andreas Geiger, Marc Pollefeys 10/9/2025 arxiv

computer vision

While feed-forward Gaussian splatting models provide computational efficiency and effectively handle sparse input settings, their performance is fundamentally limited by the reliance on a single forward pass during inference. We propose ReSplat, a feed-forward recurrent Gaussian splatting model that...

Keywords: Gaussian splatting, recurrent network, view synthesis, 3D reconstruction, rendering error feedback, realestate10k, DL3DV

Rocktim Jyoti Das, Harsh Singh, Diana Turmakhan, Muhammad Abdullah Sohail, Mingfei Han, Preslav Nakov, Fabio Pizzati, Ivan Laptev 10/9/2025 arxiv

robotics

Scaling data and models has played a pivotal role in the remarkable progress of computer vision and language. Inspired by these domains, recent efforts in robotics have similarly focused on scaling both data and model size to develop more generalizable and robust policies. However, unlike vision and...

Keywords: BLAZER, LLM planners, zero-shot data generation, robotic manipulation, simulation, sim-to-real transfer, fine-tuning, data scaling

Nimrod Berman, Assaf Hallak, Assaf Shocher 10/9/2025 arxiv

machine learning

Neural networks are famously nonlinear. However, linearity is defined relative to a pair of vector spaces, $f: X \to Y$. Is it possible to identify a pair of non-standard vector spaces for which a conventionally nonlinear function is, in fact, linear? This paper introduces a method that makes s...

Keywords: Linearizer, invertible neural networks, linear operator, vector spaces, idempotency, projective generative model, diffusion models, SVD

Animikh Aich, Adwait Kulkarni, Eshed Ohn-Bar 10/9/2025 arxiv

machine learning

Real-world evaluation of perception-based planning models for robotic systems, such as autonomous vehicles, can be safely and inexpensively conducted offline, i.e., by computing model prediction error over a pre-collected validation dataset with ground-truth annotations. However, extrapolating from ...

Keywords: offline metrics, epistemic uncertainty, autonomous driving, closed-loop evaluation, simulation, real-world validation, perception-based planning, offline-to-online correlation

Hongyu Li, Lingfeng Sun, Yafei Hu, Duy Ta, Jennifer Barry, George Konidaris, Jiahui Fu 10/9/2025 arxiv

robotics

Enabling robots to execute novel manipulation tasks zero-shot is a central goal in robotics. Most existing methods assume in-distribution tasks or rely on fine-tuning with embodiment-matched data, limiting transfer across platforms. We present NovaFlow, an autonomous manipulation framework that conv...

Keywords: zero-shot, video generation, actionable flow, object flow, robot manipulation, deformable objects, trajectory optimization, particle-based dynamics

Qin Liu, Jacob Dineen, Yuxi Huang, Sheng Zhang, Hoifung Poon, Ben Zhou, Muhao Chen 10/9/2025 arxiv

machine learning

Benchmarks are central to measuring the capabilities of large language models and guiding model development, yet widespread data leakage from pretraining corpora undermines their validity. Models can match memorized content rather than demonstrate true generalization, which inflates scores, distorts...

Keywords: ArenaBencher, benchmark evolution, benchmarking, data leakage, LLM judge, multi-model evaluation, in-context learning, test-case generation

Tajamul Ashraf, Umair Nawaz, Abdelrahman M. Shaker, Rao Anwer, Philip Torr, Fahad Shahbaz Khan, Salman Khan 10/9/2025 arxiv

multimodal learning

Vision language models (VLMs) are increasingly deployed as controllers with access to external tools for complex reasoning and decision-making, yet their effectiveness remains limited by the scarcity of high-quality multimodal trajectories and the cost of manual annotation. We address this challenge...

Keywords: vision-language models, multimodal trajectories, agent tuning, M-TRACE, Pref-X, preference learning, tool use, MATRIX Agent

Meixi Song, Xin Lin, Dizhe Zhang, Haodong Li, Xiangtai Li, Bo Du, Lu Qi 10/9/2025 arxiv

computer vision

Recent advances in 3D Gaussian Splatting (3DGS) enable real-time, high-fidelity novel view synthesis (NVS) with explicit 3D representations. However, performance degradation and instability remain significant under sparse-view conditions. In this work, we identify two key failure modes under sparse-...

Keywords: 3D Gaussian Splatting, sparse-view reconstruction, novel view synthesis, depth-guided dropout, distance-aware supervision, stability metric, 3D reconstruction, rendering robustness

Zhen Zhu, Yiming Gong, Yao Xiao, Yaoyao Liu, Derek Hoiem 10/9/2025 arxiv

machine learning

How can we teach large multimodal models (LMMs) new skills without erasing prior abilities? We study sequential fine-tuning on five target skills while monitoring general ability on eight held-out benchmarks across three model families. We observe that apparent "forgetting" on held-out tasks after n...

Keywords: large multimodal models, fine-tuning, catastrophic forgetting, self-attention projection, MLP Gate, token distribution, counting-bias probe, continual learning

Changyao Tian, Hao Li, Gen Luo, Xizhou Zhu, Weijie Su, Hanming Deng, Jinguo Zhu, Jie Shao, Ziran Zhu, Yunpeng Liu, Lewei Lu, Wenhai Wang, Hongsheng Li, Jifeng Dai 10/9/2025 arxiv

machine learning

Compositional training has been the de-facto paradigm in existing Multimodal Large Language Models (MLLMs), where pre-trained vision encoders are connected with pre-trained LLMs through continuous multimodal pre-training. However, the multimodal scaling property of this paradigm remains difficult to...

Keywords: NaViL, native MLLM, end-to-end training, scaling laws, data-constrained, visual encoder, LLM, multimodal