Paper Archive

Browse and export your curated research paper collection

Archived Days: 175
Total Papers: 1738
Avg Score: 7.7
Categories: 9

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations

Browse by Date

Papers for February 28, 2026

10 papers found

Sahal Shaji Mullappilly, Mohammed Irfan Kurpath, Omair Mohamed, Mohamed Zidan, Fahad Khan, Salman Khan, Rao Anwer, Hisham Cholakkal 2/26/2026 arxiv

machine learning

We introduce MediX-R1, an open-ended Reinforcement Learning (RL) framework for medical multimodal large language models (MLLMs) that enables clinically grounded, free-form answers beyond multiple-choice formats. MediX-R1 fine-tunes a baseline vision-language backbone with Group Based RL and a compos...

Keywords: MediX-R1, medical reinforcement learning, multimodal LLM, vision-language model, Group Based RL, composite reward, LLM-as-judge, medical embeddings

Sven Elflein, Ruilong Li, Sérgio Agostinho, Zan Gojcic, Laura Leal-Taixé, Qunjie Zhou, Aljosa Osep 2/26/2026 arxiv

computer vision

We present a scalable 3D reconstruction model that addresses a critical limitation in offline feed-forward methods: their computational and memory requirements grow quadratically w.r.t. the number of input images. Our approach is built on the key insight that this bottleneck stems from the varying-l...

Keywords: VGG-T3, test-time training, MLP distillation, Key-Value representation, 3D reconstruction, scalability, linear-time, softmax attention

Eric Eaton, Surbhi Goel, Marcel Hussing, Michael Kearns, Aaron Roth, Sikata Bela Sengupta, Jessica Sorrell 2/26/2026 arxiv

machine learning

Numerous lines of work aim to control $\textit{model disagreement}$ -- the extent to which two machine learning models disagree in their predictions. We adopt a simple and standard notion of model disagreement in real-valued prediction problems, namely the expected squared difference in predictions betwe...

Keywords: model disagreement, anchoring, stacked aggregation, gradient boosting, neural architecture search, regression trees, theoretical bounds, squared loss

Vaibhav Agrawal, Rishubh Parihar, Pradhaan Bhat, Ravi Kiran Sarvadevabhatla, R. Venkatesh Babu 2/26/2026 arxiv

computer vision

We identify occlusion reasoning as a fundamental yet overlooked aspect for 3D layout-conditioned generation. It is essential for synthesizing partially occluded objects with depth-consistent geometry and scale. While existing methods can generate realistic scenes that follow input layouts, they ofte...

Keywords: SeeThrough3D, occlusion reasoning, 3D layout, OSCR, occlusion-aware 3D scene representation, translucent 3D boxes, text-to-image, flow-based generative model

Elad Kimchi Shoshani, Leeyam Gabay, Yedid Hoshen 2/26/2026 arxiv

computer vision

A dataset server must often distribute the same large payload to many clients, incurring massive communication costs. Since clients frequently operate on diverse hardware and software frameworks, transmitting a pre-trained model is often infeasible; instead, agents require raw data to train their ow...

Keywords: PLADA, pseudo-labels, dataset serving, dataset distillation, communication-efficient ML, ImageNet, pruning, data efficiency

Simon Roschmann, Paul Krzakala, Sonia Mazelet, Quentin Bouniot, Zeynep Akata 2/26/2026 arxiv

machine learning

The Platonic Representation Hypothesis posits that neural networks trained on different modalities converge toward a shared statistical model of the world. Recent work exploits this convergence by aligning frozen pretrained vision and language models with lightweight alignment layers, but typically ...

Keywords: SOTAlign, semi-supervised, optimal transport, vision-language, unimodal encoders, alignment, contrastive learning

Amita Kamath, Jack Hessel, Khyathi Chandu, Jena D. Hwang, Kai-Wei Chang, Ranjay Krishna 2/26/2026 arxiv

machine learning

The lack of reasoning capabilities in Vision-Language Models (VLMs) has remained at the forefront of research discourse. We posit that this behavior stems from a reporting bias in their training data. That is, how people communicate about visual content by default omits tacit information needed to s...

Keywords: reporting bias, vision-language, VLM, pragmatics, spatial reasoning, temporal reasoning, negation, counting

Jose Javier Gonzalez Ortiz, Abhay Gupta, Chris Renard, Davis Blalock 2/26/2026 arxiv

machine learning

Standard mixed-precision training of neural networks requires many bytes of accelerator memory for each model parameter. These bytes reflect not just the parameter itself, but also its gradient and one or more optimizer state variables. With each of these values typically requiring 4 bytes, training...

Keywords: FlashOptim, optimizer quantization, memory efficiency, companding, master weight splitting, AdamW, Lion, mixed-precision

Alkis Kalavasis, Anay Mehrotra, Manolis Zampetakis, Felix Zhou, Ziyu Zhu 2/26/2026 arxiv

machine learning

Coarse data arise when learners observe only partial information about samples; namely, a set containing the sample rather than its exact value. This occurs naturally through measurement rounding, sensor limitations, and lag in economic systems. We study Gaussian mean estimation from coarse data, wh...

Keywords: coarse data, mean estimation, Gaussian, convex partitions, identifiability, algorithms, computational complexity, NP-hard

Tilemachos Aravanis, Vladan Stojnić, Bill Psomas, Nikos Komodakis, Giorgos Tolias 2/26/2026 arxiv

computer vision

Open-vocabulary segmentation (OVS) extends the zero-shot recognition capabilities of vision-language models (VLMs) to pixel-level prediction, enabling segmentation of arbitrary categories specified by text prompts. Despite recent progress, OVS lags behind fully supervised approaches due to two chall...

Keywords: open-vocabulary segmentation, few-shot, retrieval-augmented, test-time adapter, vision-language models, per-query fusion, personalized segmentation