Paper Archive

Browse and export your curated research paper collection

Archived Days: 33 • Total Papers: 330 • Avg Score: 7.8 • Categories: 7

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations
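As a rough illustration of how the JSON export could feed the BibTeX format, here is a minimal Python sketch that renders one exported record as a BibTeX entry. The field names (`authors`, `title`, `date`, `source`) are assumptions about the export schema, not a documented format, and the example title is a placeholder.

```python
def to_bibtex(paper: dict) -> str:
    """Render one exported paper record as a BibTeX @misc entry.

    Assumes hypothetical schema keys: "authors" (comma-separated),
    "title", "date" (MM/DD/YYYY), and "source".
    """
    first_author = paper["authors"].split(",")[0].strip()
    year = paper["date"].split("/")[-1]
    # Citation key: first author's last name + year, e.g. "lu2025".
    key = f"{first_author.split()[-1].lower()}{year}"
    return "\n".join([
        f"@misc{{{key},",
        f"  author = {{{paper['authors'].replace(', ', ' and ')}}},",
        f"  title  = {{{paper['title']}}},",
        f"  year   = {{{year}}},",
        f"  note   = {{{paper['source']}}}",
        "}",
    ])

# Example record mirroring one listing above (title is a placeholder).
record = {
    "authors": "Ziqing Lu, Lifeng Lai, Weiyu Xu",
    "title": "Placeholder Title",
    "date": "10/15/2025",
    "source": "arXiv",
}
print(to_bibtex(record))
```

A real exporter would also need to escape BibTeX special characters ({, }, %, &) in titles and author names before emitting entries.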
Browse by Date

Papers for October 16, 2025

10 papers found

Dominick Reilly, Manish Kumar Govind, Le Xue, Srijan Das • 10/15/2025 • arXiv

machine learning

Large Vision-Language Models (VLMs) excel at general visual reasoning tasks but exhibit sharp performance degradation when applied to novel domains with substantial distribution shifts from pretraining data. Existing domain adaptation approaches finetune different VLM components, but this often resu...

Keywords: Vision-Language Models, Domain Adaptation, Visual Probes, VisCoP, Cross-modal Transfer, Egocentric Vision

Xinchen Zhang, Xiaoying Zhang, Youbin Wu, Yanbin Cao, Renrui Zhang, Ruihang Chu, Ling Yang, Yujiu Yang • 10/15/2025 • arXiv

machine learning

We introduce Generative Universal Verifier, a novel concept and plugin designed for next-generation multimodal reasoning in vision-language models and unified multimodal models, providing the fundamental capability of reflection and refinement on visual outcomes during the reasoning and generation p...

Keywords: Generative Universal Verifier, OmniVerifier-7B, OmniVerifier-TTS, ViVerBench, visual verification, multimodal reasoning, test-time scaling, vision-language models

Xinhang Liu, Yuxi Xiao, Donny Y. Chen, Jiashi Feng, Yu-Wing Tai, Chi-Keung Tang, Bingyi Kang • 10/15/2025 • arXiv

computer vision

Effective spatio-temporal representation is fundamental to modeling, understanding, and predicting dynamics in videos. The atomic unit of a video, the pixel, traces a continuous 3D trajectory over time, serving as the primitive element of dynamics. Based on this principle, we propose representing an...

Keywords: Trajectory Field, Trace Anything, video dynamics, 4D, B-spline, spatio-temporal representation, point tracking, neural field

Jia-Chen Gu, Junyi Zhang, Di Wu, Yuankai Li, Kai-Wei Chang, Nanyun Peng • 10/15/2025 • arXiv

machine learning

As retrieval-augmented generation (RAG) tackles complex tasks, increasingly expanded contexts offer richer information, but at the cost of higher latency and increased cognitive load on the model. To mitigate this bottleneck, especially for intricate multi-hop questions, we introduce BRIEF-Pro. It i...

Keywords: BRIEF-Pro, context compression, short-to-long synthesis, RAG, multi-hop QA, abstractive summarization, in-context learning, efficiency

Giovanni Monea, Yair Feldman, Shankar Padmanabhan, Kianté Brantley, Yoav Artzi • 10/15/2025 • arXiv

machine learning

The scalability of large language models for long-context reasoning is severely constrained by the linear growth of their Transformer key-value cache, which incurs significant memory and computational costs. We posit that as a model generates reasoning tokens, the informational value of past generat...

Keywords: KV cache, cache compression, compression beacons, breadcrumbs reasoning, transformer, reinforcement learning, distillation, memory-efficiency

Shuyu Wu, Ziqiao Ma, Xiaoxi Luo, Yidong Huang, Josue Torres-Fonseca, Freda Shi, Joyce Chai • 10/15/2025 • arXiv

machine learning

Symbol grounding (Harnad, 1990) describes how symbols such as words acquire their meanings by connecting to real-world sensorimotor experiences. Recent work has shown preliminary evidence that grounding may emerge in (vision-)language models trained at scale without using explicit grounding objectiv...

Keywords: symbol grounding, mechanistic interpretability, multimodal models, attention heads, aggregate mechanism, Transformers, state-space models, LSTM

Yi Zhang, Bolin Ni, Xin-Sheng Chen, Heng-Rui Zhang, Yongming Rao, Houwen Peng, Qinglin Lu, Han Hu, Meng-Hao Guo, Shi-Min Hu • 10/15/2025 • arXiv

machine learning

Fully open multimodal large language models (MLLMs) currently lag behind proprietary counterparts, primarily due to a significant gap in data quality for supervised fine-tuning (SFT). Existing open-source datasets are often plagued by widespread noise and a critical deficit in complex reasoning data...

Keywords: Honey-Data-15M, HoneyPipe, DataStudio, Bee-8B, multimodal-LLM, Chain-of-Thought, data curation, supervised fine-tuning

Nir Goren, Oren Katzir, Abhinav Nakarmi, Eyal Ronen, Mahmood Sharif, Or Patashnik • 10/15/2025 • arXiv

computer vision

With the rapid adoption of diffusion models for visual content generation, proving authorship and protecting copyright have become critical. This challenge is particularly important when model owners keep their models private and may be unwilling or unable to handle authorship issues, making third-p...

Keywords: NoisePrints, watermarking, diffusion models, seed-based verification, zero-knowledge proofs, authorship verification, model provenance, cryptographic hashing

Ziqing Lu, Lifeng Lai, Weiyu Xu • 10/15/2025 • arXiv

reinforcement learning

Reinforcement learning (RL) for Markov Decision Processes (MDPs) has emerged in many security-related applications, such as autonomous driving, financial decisions, and drone/robot algorithms. In order to improve the robustness/defense of RL systems against adversaries, studying various adversarial...

Keywords: rate-distortion, information-theoretic, adversarial attacks, reinforcement learning, MDP, reward regret, model-based, model-free

Md. Joshem Uddin, Soham Changani, Baris Coskunuzer • 10/15/2025 • arXiv

machine learning

Temporal graph classification plays a critical role in applications such as cybersecurity, brain connectivity analysis, social dynamics, and traffic monitoring. Despite its significance, this problem remains underexplored compared to temporal link prediction or node forecasting. Existing methods oft...

Keywords: T3former, temporal graphs, temporal graph classification, topological descriptors, spectral descriptors, descriptor-attention, stability guarantees, dynamic social networks