Browse and export your curated research paper collection
Unknown authors 10/13/2025 huggingface
machine learning
Project Page: https://kangliao929.github.io/projects/puffin/
GitHub: https://github.com/KangLiao929/Puffin
Dataset: https://huggingface.co/datasets/KangLiao/Puffin-4M
Model: https://huggingface.co/KangLiao/Puffin
Demo: https://huggingface.co/spaces/KangLiao/Puffin
Unknown authors 10/13/2025 huggingface
machine learning
We present D2E 🎮→🤖, a framework that scales Vision-Action Pretraining on desktop interaction data to accelerate Embodied AI 🚀. By turning ordinary game and desktop interactions into training fuel, D2E builds rich visuomotor priors that transfer from screens to robots.
✨ OWA Toolkit 🖥️: a unified...
Unknown authors 10/13/2025 huggingface
machine learning
We introduce the problem of multimodal prompt optimization and propose the multimodal prompt optimizer to harness the full capacity of multimodal large language models beyond text.
Unknown authors 10/13/2025 huggingface
reinforcement learning
1T).
Our Webscale-RL pipeline converts pretraining text into diverse RL-ready QA data, scaling RL to pretraining levels!
All code and datasets are open source!
HF🤗: https://huggingface.co/datasets/Salesforce/Webscale-RL
GitHub 🤖: https://github.com/SalesforceAIResearch/Pr...
Unknown authors 10/13/2025 huggingface
machine learning
Recent trends in test-time scaling for reasoning models (e.g., OpenAI o1, DeepSeek-R1) have led to remarkable improvements through long Chain-of-Thought (CoT). However, existing benchmarks mainly focus on immediate, single-horizon tasks, failing to adequately evaluate models' ability to understand a...
Unknown authors 10/13/2025 huggingface
machine learning
With the current surge in spatial reasoning explorations, researchers have made significant progress in understanding indoor scenes, but still struggle with diverse applications such as robotics and autonomous driving. This paper aims to advance all-scale spatial reasoning across diverse scenarios b...
Unknown authors 10/13/2025 huggingface
machine learning
Paper (arXiv): https://arxiv.org/abs/2510.08457
GitHub: https://github.com/shawn0728/ARES
Model & Dataset (Hugging Face): https://huggingface.co/collections/ares0728/ares-68e7c7160dcb48734dee4e95
Unknown authors 10/13/2025 huggingface
machine learning
StreamingVLM enables real-time, stable understanding of effectively infinite video by keeping a compact KV cache and aligning training with streaming inference. It avoids quadratic cost and sliding-window pitfalls, runs at up to 8 FPS on a single H100, and achieves a 66.18% win rate against GPT-4o mini on a new long-video...
Unknown authors 10/13/2025 huggingface
computer vision
Recent diffusion models achieve state-of-the-art performance in image generation, but often suffer from semantic inconsistencies or hallucinations. While various inference-time guidance methods can enhance generation, they often operate indirectly by relying on external signals or architectural ...
Unknown authors 10/13/2025 huggingface
machine learning
🧐 Why AutoPR?
The academic community continues to expand its output each year without a corresponding increase in visibility or value. In 2024 alone, NeurIPS accepted over 4,000 papers, with conference volumes at CVPR and ICCV also soaring. In such an environment of overwhelming information, how can ind...