News
Multimodal Foundation Models Fall Short on Physical Reasoning: PHYX Benchmark Highlights Key Limitations in Visual and Symbolic Integration
State-of-the-art models show human-competitive accuracy on AIME, GPQA, MATH-500, and OlympiadBench, solving Olympiad-level problems. Recent...
Multimodal Assessment of Classroom Discourse Quality: A Text-Centered Attention-Based Multi-Task Learning Approach
arXiv:2505.07902v1 Announce Type: cross Abstract: Classroom discourse is an essential vehicle through which teaching and...
Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You
arXiv:2401.16092v4 Announce Type: replace Abstract: Text-to-image generation models have recently achieved astonishing results in image...
Multilingual Machine Translation with Quantum Encoder Decoder Attention-based Convolutional Variational Circuits
arXiv:2505.09407v1 Announce Type: new Abstract: Cloud-based multilingual translation services like Google Translate and Microsoft Translator...
Multi2: Multi-Agent Test-Time Scalable Framework for Multi-Document Processing
arXiv:2502.20592v3 Announce Type: replace Abstract: Recent advances in test-time scaling have shown promising results in...
Multi-domain Multilingual Sentiment Analysis in Industry: Predicting Aspect-based Opinion Quadruples
arXiv:2505.10389v2 Announce Type: replace Abstract: This paper explores the design of an aspect-based sentiment analysis...
Moonshot AI’s Kimi K2 outperforms GPT-4 in key benchmarks — and it’s free
Chinese AI startup Moonshot releases open-source Kimi K2 model that outperforms OpenAI and Anthropic on...
Moonshot AI Releases Kimi K2: A Trillion-Parameter MoE Model Focused on Long Context, Code, Reasoning, and Agentic Behavior
Kimi K2, launched by Moonshot AI in July 2025, is a purpose-built, open-source Mixture-of-Experts (MoE)...
MoE Architecture Comparison: Qwen3 30B-A3B vs. GPT-OSS 20B
This article provides a technical comparison between two recently released Mixture-of-Experts (MoE) transformer models: Alibaba’s...
Model Context Protocol: A promising AI integration layer, but not a standard (yet)
Enterprises should experiment with MCP where it adds value, isolate dependencies and prepare for a...
MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and Muon
Training large-scale transformers stably has been a longstanding challenge in deep learning, particularly as models...