Notizie
Notizie
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
arXiv:2505.04588v1 Announce Type: new Abstract: Effective information searching is essential for enhancing the reasoning and...
ZeroSearch from Alibaba Uses Reinforcement Learning and Simulated Documents to Teach LLMs Retrieval Without Real-Time Search
Large language models are now central to various applications, from coding to academic tutoring and...
Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning
arXiv:2507.16746v2 Announce Type: replace-cross Abstract: Humans often use visual aids, for example diagrams or sketches...
Your AI models are failing in production—Here’s how to fix model selection
The Allen Institute of AI updated its reward model evaluation RewardBench to better reflect real-life...
You Don’t Need to Share Data to Train a Language Model Anymore—FlexOlmo Demonstrates How
The development of large-scale language models (LLMs) has historically required centralized access to extensive datasets...
You can now fine-tune your enterprise’s own version of OpenAI’s o4-mini reasoning model with reinforcement learning
For organizations with clearly defined problems and verifiable answers, RFT offers a compelling way to...
Yandex Releases Yambda: The World’s Largest Event Dataset to Accelerate Recommender Systems
Yandex has recently made a significant contribution to the recommender systems community by releasing Yambda...
Yandex Releases Alchemist: A Compact Supervised Fine-Tuning Dataset for Enhancing Text-to-Image T2I Model Quality
Despite the substantial progress in text-to-image (T2I) generation brought about by models such as DALL-E...
Xiaomi Released MiMo-Audio, a 7B Speech Language Model Trained on 100M+ Hours with High-Fidelity Discrete Tokens
Xiaomi’s MiMo team released MiMo-Audio, a 7-billion-parameter audio-language model that runs a single next-token objective...
xAI launches Grok-4-Fast: Unified Reasoning and Non-Reasoning Model with 2M-Token Context and Trained End-to-End with Tool-Use Reinforcement Learning (RL)
xAI introduced Grok-4-Fast, a cost-optimized successor to Grok-4 that merges “reasoning” and “non-reasoning” behaviors into...
Why open-source AI became an American national priority
To reflect democratic principles, AI must be built in the open. If the U.S. wants...
Why Generalization in Flow Matching Models Comes from Approximation, Not Stochasticity
Introduction: Understanding Generalization in Deep Generative Models Deep generative models, including diffusion and flow matching...