News
Explaining Length Bias in LLM-Based Preference Evaluations
arXiv:2407.01085v4 Announce Type: replace-cross Abstract: The use of large language models (LLMs) as judges, particularly...
Explaining Large Language Models with gSMILE
arXiv:2505.21657v5 Announce Type: replace Abstract: Large Language Models (LLMs) such as GPT, LLaMA, and Claude...
Expanding the WMT24++ Benchmark with Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, and Vallader
arXiv:2509.03148v2 Announce Type: replace Abstract: The Romansh language, spoken in Switzerland, has limited resources for...
ExCyTIn-Bench: Evaluating LLM agents on Cyber Threat Investigation
arXiv:2507.14201v2 Announce Type: replace-cross Abstract: We present ExCyTIn-Bench, the first benchmark to Evaluate an LLM...
Exclusive eBook: The great AI hype correction of 2025
2025 was a year of reckoning, including how the heads of the top AI companies...
EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes
arXiv:2507.11407v2 Announce Type: replace Abstract: This technical report introduces EXAONE 4.0, which integrates a Non-reasoning...
Exa AI Introduces Exa Instant: A Sub-200ms Neural Search Engine Designed to Eliminate Bottlenecks for Real-Time Agentic Workflows
In the world of Large Language Models (LLMs), speed is the only feature that matters...
EVs could be cheaper to own than gas cars in Africa by 2040
Electric vehicles could be economically competitive in Africa sooner than expected. Just 1% of new...
EvoSchema: Towards Text-to-SQL Robustness Against Schema Evolution
arXiv:2603.10697v1 Announce Type: cross Abstract: Neural text-to-SQL models, which translate natural language questions (NLQs) into...
Everything you need to know about estimating AI’s energy and emissions burden
When we set out to write a story on the best available estimates for AI’s...
Everyone’s looking to get in on vibe coding — and Google is no different with Stitch, its follow-up to Jules
Google is looking to compete in vibe coding with Stitch, which designs user interfaces (UIs)...
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model
arXiv:2510.18855v2 Announce Type: replace Abstract: We present Ring-1T, the first open-source, state-of-the-art thinking model with...

