Paper Archive

Browse and export your curated research paper collection

Archived Days: 33
Total Papers: 330
Avg Score: 7.8
Categories: 7

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations
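
As a rough illustration of how one record might map onto the four export formats listed above, here is a minimal Python sketch. The record layout, the field names (`authors`, `date`, `source`, `category`, `keywords`), and the helper functions are assumptions made for this example; the archive's actual export schema is not shown on this page.

```python
import csv
import io
import json

# Hypothetical record layout; the field names are assumptions,
# not the archive's actual schema.
papers = [
    {
        "authors": ["Cheng Xin", "Fan Xu", "Xin Ding", "Jie Gao", "Jiaxin Ding"],
        "date": "2025-10-06",
        "source": "arxiv",
        "category": "machine learning",
        "keywords": ["persistent homology", "graph neural networks", "interpretability"],
    }
]


def to_json(records):
    """Serialize the full records, nested lists included."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def to_csv(records):
    """Flatten list fields into '; '-joined strings for a tabular export."""
    fields = ["authors", "date", "source", "category", "keywords"]
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for r in records:
        row = dict(r)
        row["authors"] = "; ".join(r["authors"])
        row["keywords"] = "; ".join(r["keywords"])
        writer.writerow(row)
    return buf.getvalue()


def to_markdown(records):
    """Render each record as a nested bullet list."""
    lines = []
    for r in records:
        lines.append(f"- **{', '.join(r['authors'])}** ({r['date']}, {r['source']})")
        lines.append(f"  - Category: {r['category']}")
        lines.append(f"  - Keywords: {', '.join(r['keywords'])}")
    return "\n".join(lines)


def to_bibtex(records):
    """Emit one @misc entry per record; the citation key scheme is made up here."""
    entries = []
    for i, r in enumerate(records, start=1):
        year = r["date"][:4]
        key = f"{r['authors'][0].split()[-1].lower()}{year}_{i}"
        entries.append(
            "@misc{" + key + ",\n"
            f"  author = {{{' and '.join(r['authors'])}}},\n"
            f"  year = {{{year}}},\n"
            f"  keywords = {{{', '.join(r['keywords'])}}},\n"
            f"  howpublished = {{{r['source']}}}\n"
            "}"
        )
    return "\n\n".join(entries)
```

Calling `to_csv(papers)` or `to_bibtex(papers)` returns a string that can be written to a file. JSON keeps the nested lists intact, while CSV has to flatten them, which is one reason the page describes JSON as the complete export.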
Browse by Date

Papers for October 7, 2025

10 papers found

Cheng Xin, Fan Xu, Xin Ding, Jie Gao, Jiaxin Ding 10/6/2025 arxiv

machine learning

Graph Neural Networks (GNNs) have shown remarkable success across various scientific fields, yet their adoption in critical decision-making is often hindered by a lack of interpretability. Recently, intrinsically interpretable GNNs have been studied to provide insights into model predictions by iden...

Keywords: persistent homology, graph neural networks, interpretability, rationale filtration, topological discrepancy, explainable AI, autoregressive rationale generation

Mingkang Zhu, Xi Chen, Bei Yu, Hengshuang Zhao, Jiaya Jia 10/6/2025 arxiv

machine learning

Large reasoning models (LRMs) generate intermediate reasoning traces before producing final answers, yielding strong gains on multi-step and mathematical tasks. Yet aligning LRMs with human preferences, a crucial prerequisite for model deployment, remains underexplored. The statistically correct obj...

Keywords: BVPO, preference optimization, bias-variance trade-off, gradient variance, reasoning traces, alignment, large reasoning models

Ziqi Huang, Ning Yu, Gordon Chen, Haonan Qiu, Paul Debevec, Ziwei Liu 10/6/2025 arxiv

computer vision

Recent video generation models can produce smooth and visually appealing clips, but they often struggle to synthesize complex dynamics with a coherent chain of consequences. Accurately modeling visual outcomes and state transitions over time remains a core challenge. In contrast, large language and ...

Keywords: VChain, chain-of-visual-thought, video generation, multimodal models, keyframes, inference-time tuning, sparse tuning, visual reasoning

Le Zhuo, Songhao Han, Yuandong Pu, Boxiang Qiu, Sayak Paul, Yue Liao, Yihao Liu, Jie Shao, Xi Chen, Si Liu, Hongsheng Li 10/6/2025 arxiv

computer vision

While modern visual generation models excel at creating aesthetically pleasing natural images, they struggle with producing or editing structured visuals like charts, diagrams, and mathematical figures, which demand composition planning, text rendering, and multimodal reasoning for factual fidelity....

Keywords: structured visuals, dataset, 1.3M image pairs, chain-of-thought, VLM, FLUX.1 Kontext, three-stage training, external reasoner

Runchu Tian, Junxia Cui, Xueqiang Xu, Feng Yao, Jingbo Shang 10/6/2025 arxiv

natural language processing

Diffusion large language models (dLLMs) have recently emerged as a promising alternative to autoregressive (AR) models, offering advantages such as accelerated parallel decoding and bidirectional context modeling. However, the vanilla decoding strategy in discrete dLLMs suffers from a critical limit...

Keywords: diffusion LLM, Tolerator, token-level cross-validation, decoding algorithm, remasking, sequence fill-up, diffusion models, code generation

Janos Perczel, Jin Chow, Dorottya Demszky 10/6/2025 arxiv

machine learning

The promise of generative AI to revolutionize education is constrained by the pedagogical limits of large language models (LLMs). A major issue is the lack of access to high-quality training data that reflect the learning of actual students. Prompt engineering has emerged as a stopgap, but the abili...

Keywords: LLM, education, fine-tuning, parameter-efficient fine-tuning, synthetic student model, dialogue evaluation, Polygence dataset, student-tutor interactions

Ronen Kamenetsky, Sara Dorfman, Daniel Garibi, Roni Paiss, Or Patashnik, Daniel Cohen-Or 10/6/2025 arxiv

computer vision

Large-scale text-to-image diffusion models have become the backbone of modern image editing, yet text prompts alone do not offer adequate control over the editing process. Two properties are especially desirable: disentanglement, where changing one attribute does not unintentionally alter others, an...

Keywords: Sparse Autoencoder, token-level control, text embeddings, disentangled editing, continuous control, diffusion models, model-agnostic

Siheng Zhao, Yanjie Ze, Yue Wang, C. Karen Liu, Pieter Abbeel, Guanya Shi, Rocky Duan 10/6/2025 arxiv

robotics

Humanoid whole-body loco-manipulation promises transformative capabilities for daily service and warehouse tasks. While recent advances in general motion tracking (GMT) have enabled humanoids to reproduce diverse human motions, these policies lack the precision and object awareness required for loco...

Keywords: ResMimic, residual learning, humanoid, loco-manipulation, general motion tracking, point-cloud reward, contact reward, curriculum learning

Unknown authors 10/7/2025 huggingface

computer vision

Video understanding represents the most challenging frontier in computer vision, requiring models to reason about complex spatiotemporal relationships, long-term dependencies, and multimodal evidence. The recent emergence of Video-Large Multimodal Models (Video-LMMs), which integrate visual encoders...

Keywords: Video-LMMs, supervised fine-tuning, reinforcement learning, test-time scaling, temporal localization, spatiotemporal grounding, long video efficiency, multimodal evidence integration

Unknown authors 10/7/2025 huggingface

natural language processing

ArXiv: https://arxiv.org/pdf/2510.04800. Code and detailed results will be released later.

Keywords: self-attention mechanisms, structured state space models, Mamba, hybrid architectures, inter-layer fusion, intra-layer fusion, long-context capabilities, scaling analysis