✨WAFT-Stereo: Warping-Alone Field Transforms for Stereo Matching 📝 Summary: WAFT-Stereo achieves state-of-the-art stereo matching performance by replacing cost volumes with warping techniques, demonstrating superior efficiency and accuracy on major benchmarks. 🔹 Publication Date: Published on Mar 25 🔹 Paper Links: • arXiv Page: https://arxiv.org/abs/2603.24836 • PDF: https://arxiv.org/pdf/2603.24836 • Github: https://github.com/princeton-vl/WAFT-Stereo 🔹 Models citing this...
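The core idea of replacing a cost volume with direct warping can be pictured with a toy example. This is a minimal sketch under simplifying assumptions (integer disparity, zero padding, 1-D feature rows); `warp_row` is a hypothetical helper, not the paper's implementation, which operates on learned feature fields:

```python
# Toy sketch of disparity warping for stereo matching: rather than building
# a full cost volume over all disparities, warp right-image features into
# the left view at a candidate disparity and compare them directly.
# Assumptions (not from the paper): integer disparity, zero padding, 1-D rows.

def warp_row(right_feat, d):
    """Warp one row of right-image features to the left view at disparity d."""
    return [right_feat[x - d] if x - d >= 0 else 0.0
            for x in range(len(right_feat))]

left  = [1.0, 2.0, 3.0, 4.0]
right = [2.0, 3.0, 4.0, 5.0]   # same scene content shifted by disparity 1
warped = warp_row(right, 1)
residual = [abs(l - w) for l, w in zip(left, warped)]
print(warped)    # [0.0, 2.0, 3.0, 4.0]
print(residual)  # zero everywhere except the occluded left border
```

At the correct disparity the warped right features line up with the left features, so the residual vanishes away from the border, which is the signal a warping-based matcher iterates on.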
ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers. Admin: @HusseinSheikho || @Hussein_Sheikho
Charts: average post reach · ERR % by day · publications by day · content types
Best publications
✨QuantAgent: Price-Driven Multi-Agent LLMs for High-Frequency Trading 📝 Summary: QuantAgent is a multi-agent LLM framework for high-frequency trading. It uses specialized agents for indicators, patterns, trends, and risk to make rapid decisions. It outperforms existing neural and rule-based systems in accuracy and returns. 🔹 Publication Date: Published on Sep 12, 2025 🔹 Paper Links: • arXiv Page: https://arxiv.org/abs/2509.09995 • PDF: https://arxiv.org/pdf/2509.09995 • Project Page: https://...
✨AVO: Agentic Variation Operators for Autonomous Evolutionary Search 📝 Summary: Agentic variation operators enable autonomous discovery of performance-critical micro-architectural optimizations for attention kernels, outperforming state-of-the-art implementations on advanced GPU ... 🔹 Publication Date: Published on Mar 25 🔹 Paper Links: • arXiv Page: https://arxiv.org/abs/2603.24517 • PDF: https://arxiv.org/pdf/2603.24517 ================================== For more data science resources: ✓ h...
✨Reaching Beyond the Mode: RL for Distributional Reasoning in Language Models 📝 Summary: Language models typically give one answer, but many tasks have multiple solutions. This paper presents multi-answer RL, allowing LMs to generate multiple plausible answers with confidence in a single pass, improving diversity, accuracy, and computational efficiency. 🔹 Publication Date: Published on Mar 25 🔹 Paper Links: • arXiv Page: https://arxiv.org/abs/2603.24844 • PDF: https://arxiv.org/pdf/2603.24844...
✨Nudging Hidden States: Training-Free Model Steering for Chain-of-Thought Reasoning in Large Audio-Language Models 📝 Summary: This paper introduces training-free inference-time model steering to enhance Chain-of-Thought reasoning in Large Audio-Language Models. It achieves accuracy gains up to 4.4% and shows cross-modal transfer, where text-derived steering vectors efficiently guide speech reasoning. This positions mode... 🔹 Publication Date: Published on Mar 15 🔹 Paper Links: • arXiv Page: h...
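Training-free steering of this kind typically adds a direction vector to intermediate activations at inference time, with no parameter updates. A toy sketch under that assumption follows; the vector values and `alpha` are illustrative, and the paper's extraction of text-derived steering vectors is not shown:

```python
# Toy sketch of inference-time activation steering: nudge a hidden state
# by a scaled steering vector. The steering vector here is made up; in
# practice it would be derived from contrasting activations on CoT vs.
# non-CoT text examples.

def steer(hidden, steering_vec, alpha=1.0):
    """Return hidden + alpha * steering_vec, element-wise."""
    return [h + alpha * v for h, v in zip(hidden, steering_vec)]

hidden = [0.5, -0.2, 0.1]          # one token's hidden state (toy size)
steering_vec = [0.1, 0.3, -0.1]    # hypothetical CoT direction
print(steer(hidden, steering_vec, alpha=2.0))
```

Because the nudge is applied only at inference, the same vector can be reused across inputs, which is what makes cross-modal transfer (text-derived vectors steering speech reasoning) cheap to test.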
✨Pixel-level Scene Understanding in One Token: Visual States Need What-is-Where Composition 📝 Summary: CroBo is a visual state representation framework that learns what-is-where composition for robotics. It uses global-to-local reconstruction to encode scene element identities and spatial locations in a compact token. This enables tracking scene dynamics for sequential decision making. 🔹 Publication Date: Published on Mar 14 🔹 Paper Links: • arXiv Page: https://arxiv.org/abs/2603.13904 • PDF:...
✨VFIG: Vectorizing Complex Figures in SVG with Vision-Language Models 📝 Summary: VFIG is a vision-language model that converts raster images into scalable vector graphics (SVG). It employs a 66K dataset and hierarchical training for high-fidelity conversion, outperforming open-source models and matching proprietary ones. 🔹 Publication Date: Published on Mar 25 🔹 Paper Links: • arXiv Page: https://arxiv.org/pdf/2603.24575 • PDF: https://arxiv.org/pdf/2603.24575 • Project Page: https://vfig-proj....
✨MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning and In-Situ Self-Evolution 📝 Summary: MemMA is a multi-agent framework that coordinates the memory cycle in LLM agents. It uses a Meta-Thinker for strategic guidance and in-situ self-evolving repair for memory construction and retrieval. MemMA consistently outperforms existing baselines. 🔹 Publication Date: Published on Mar 19 🔹 Paper Links: • arXiv Page: https://arxiv.org/abs/2603.18718 • PDF: https://arxiv.org/pdf/2603.1871...
✨Can MLLMs Read Students' Minds? Unpacking Multimodal Error Analysis in Handwritten Math 📝 Summary: ScratchMath introduces a benchmark for analyzing errors in student handwritten math. It reveals MLLMs significantly lag human experts in visual and logical reasoning, but proprietary models show potential for error explanation. 🔹 Publication Date: Published on Mar 26 🔹 Paper Links: • arXiv Page: https://arxiv.org/abs/2603.24961 • PDF: https://arxiv.org/pdf/2603.24961 • Project Page: https://bbs...
✨Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration 📝 Summary: Calibri enhances Diffusion Transformers by adding a single learned scaling parameter to improve generative quality. This parameter-efficient method, optimizing only ~100 parameters, reduces inference steps across various text-to-image models while maintaining high-quality outputs. 🔹 Publication Date: Published on Mar 25 🔹 Paper Links: • arXiv Page: https://arxiv.org/abs/2603.24800 • PDF: https://arxiv.or...
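A single learned scalar per module is easy to picture. The sketch below assumes the scalar rescales a frozen block's residual update; the class and attribute names are hypothetical, not Calibri's actual parameterization:

```python
# Sketch of parameter-efficient calibration: wrap a frozen block and learn
# only one scalar that rescales its residual update. With on the order of
# 100 such blocks, the trainable budget is ~100 parameters, in the spirit
# of the summary's "only ~100 parameters" claim.

class CalibratedBlock:
    def __init__(self, block_fn, gamma=1.0):
        self.block_fn = block_fn   # frozen sub-network (here a plain function)
        self.gamma = gamma         # the single trainable calibration scalar

    def __call__(self, x):
        delta = self.block_fn(x)   # frozen block's residual output
        return [xi + self.gamma * di for xi, di in zip(x, delta)]

block = CalibratedBlock(lambda x: [2.0 * xi for xi in x], gamma=0.5)
print(block([1.0, 2.0]))  # [1.0 + 0.5*2.0, 2.0 + 0.5*4.0] -> [2.0, 4.0]
```

Because everything but `gamma` stays frozen, calibration can be trained quickly and dropped into an existing diffusion transformer without touching its weights.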