Paper Archive

Browse and export your curated research paper collection

Archived Days: 175
Total Papers: 1738
Avg Score: 7.7
Categories: 9

Export Archive Data

Download your archived papers in various formats

JSON: Complete data with analysis • CSV: Tabular data for analysis • Markdown: Human-readable reports • BibTeX: Academic citations

Browse by Date

Papers for March 4, 2026

10 papers found

Yujia Zhang, Xiaoyang Wu, Yunhan Yang, Xianzhe Fan, Han Li, Yuechen Zhang, Zehao Huang, Naiyan Wang, Hengshuang Zhao 3/3/2026 arxiv

computer vision

We dream of a future where point clouds from all domains can come together to shape a single model that benefits them all. Toward this goal, we present Utonia, a first step toward training a single self-supervised point transformer encoder across diverse domains, spanning remote sensing, outdoor LiD...

Keywords: point cloud, self-supervised learning, point transformer, cross-domain, LiDAR, RGB-D, CAD, foundation model

M. Hamza Mughal, Rishabh Dabral, Vera Demberg, Christian Theobalt 3/3/2026 arxiv

machine learning

Embodied Conversational Agents (ECAs) aim to emulate human face-to-face interaction through speech, gestures, and facial expressions. Current large language model (LLM)-based conversational agents lack embodiment and the expressive gestures essential for natural interaction. Existing solutions for E...

Keywords: co-speech gesture synthesis, embodied conversational agents, causal autoregressive, body-part codecs, discrete tokens, LLM conditioning, real-time, expressivity

Hanyang Wang, Yiyang Liu, Jiawei Chi, Fangfu Liu, Ran Xue, Yueqi Duan 3/3/2026 arxiv

computer vision

Classifier-Free Guidance (CFG) has emerged as a central approach for enhancing semantic alignment in flow-based diffusion models. In this paper, we explore a unified framework called CFG-Ctrl, which reinterprets CFG as a control applied to the first-order continuous-time generative flow, using the c...

Keywords: Classifier-Free Guidance, CFG-Ctrl, SMC-CFG, Sliding Mode Control, diffusion models, semantic alignment, Lyapunov stability, Stable Diffusion 3.5

Toru Lin, Shuying Deng, Zhao-Heng Yin, Pieter Abbeel, Jitendra Malik 3/3/2026 arxiv

robotics

Many essential manipulation tasks - such as food preparation, surgery, and craftsmanship - remain intractable for autonomous robots. These tasks are characterized not only by contact-rich, force-sensitive dynamics, but also by their "implicit" success criteria: unlike pick-and-place, task quality in...

Keywords: robotic manipulation, preference-based finetuning, imitation learning, force-aware data collection, learned reward model, fine-grained manipulation, peeling, human preference

Xialin He, Sirui Xu, Xinyao Li, Runpei Dong, Liuyu Bian, Yu-Xiong Wang, Liang-Yan Gui 3/3/2026 arxiv

robotics

Achieving autonomous and versatile whole-body loco-manipulation remains a central barrier to making humanoids practically useful. Yet existing approaches are fundamentally constrained: retargeted data are often scarce or low-quality; methods struggle to scale to large skill repertoires; and, most im...

Keywords: ULTRA, neural retargeting, multimodal controller, loco-manipulation, humanoid, egocentric perception, reinforcement learning, skill latent space

William Liang, Sam Wang, Hung-Ju Wang, Osbert Bastani, Yecheng Jason Ma, Dinesh Jayaraman 3/3/2026 arxiv

robotics

The ability to conduct and learn from interaction and experience is a central challenge in robotics, offering a scalable alternative to labor-intensive human demonstrations. However, realizing such "play" requires (1) a policy robust to diverse, potentially out-of-distribution environment states, an...

Keywords: autonomous play, trajectory warping, semantic correspondences, vision-language models, data-efficient imitation, open-loop policy, robot learning

Shengbang Tong, David Fan, John Nguyen, Ellis Brown, Gaoyue Zhou, Shengyi Qian, Boyang Zheng, Théophane Vallaeys, Junlin Han, Rob Fergus, Naila Murray, Marjan Ghazvininejad, Mike Lewis, Nicolas Ballas, Amir Bar, Michael Rabbat, Jakob Verbeek, Luke Zettlemoyer, Koustuv Sinha, Yann LeCun, Saining Xie 3/3/2026 arxiv

machine learning

The visual world offers a critical axis for advancing foundation models beyond language. Despite growing interest in this direction, the design space for native multimodal models remains opaque. We provide empirical clarity through controlled, from-scratch pretraining experiments, isolating the fact...

Keywords: multimodal, Transfusion, Representation Autoencoder, RAE, Mixture-of-Experts, MoE, IsoFLOP, scaling laws

Jessie Z. Li, Zhiqing Hong, Toru Shirakawa, Serina Chang 3/3/2026 arxiv

machine learning

Human mobility trajectories are widely studied in public health and social science, where different demographic groups exhibit significantly different mobility patterns. However, existing trajectory generation models rarely capture this heterogeneity because most trajectory datasets lack demographic...

Keywords: ATLAS, weak supervision, mobility trajectories, demographic-conditioned generation, aggregate supervision, census data, JSD, trajectory generation

Adam Dorian Wong, John D. Hastings 3/3/2026 arxiv

machine learning

Mobile devices are frequent targets of eCrime threat actors through SMS spearphishing (smishing) links that leverage Domain Generation Algorithms (DGA) to rotate hostile infrastructure. Despite this, DGA research and evaluation largely emphasize malware C2 and email phishing datasets, leaving limite...

Keywords: DGA, smishing, mobile security, domain generation algorithm, Gravity Falls dataset, entropy, LSTM, COSSAS DGAD

Junyi Zhang, Charles Herrmann, Junhwa Hur, Chen Sun, Ming-Hsuan Yang, Forrester Cole, Trevor Darrell, Deqing Sun 3/3/2026 arxiv

computer vision

Feedforward geometric foundation models achieve strong short-window reconstruction, yet scaling them to minutes-long videos is bottlenecked by quadratic attention complexity or limited effective memory in recurrent designs. We present LoGeR (Long-context Geometric Reconstruction), a novel architectu...

Keywords: 3D reconstruction, long-context, hybrid memory, test-time training, sliding window attention, feedforward geometric models, KITTI, VBR dataset