YouZum

News

Explaining Length Bias in LLM-Based Preference Evaluations

arXiv:2407.01085v4 Announce Type: replace-cross Abstract: The use of large language models (LLMs) as judges, particularly...

Explaining Large Language Models with gSMILE

arXiv:2505.21657v5 Announce Type: replace Abstract: Large Language Models (LLMs) such as GPT, LLaMA, and Claude...

Expanding the WMT24++ Benchmark with Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, and Vallader

arXiv:2509.03148v2 Announce Type: replace Abstract: The Romansh language, spoken in Switzerland, has limited resources for...

ExCyTIn-Bench: Evaluating LLM agents on Cyber Threat Investigation

arXiv:2507.14201v2 Announce Type: replace-cross Abstract: We present ExCyTIn-Bench, the first benchmark to Evaluate an LLM...

Exclusive eBook: The great AI hype correction of 2025

2025 was a year of reckoning, including how the heads of the top AI companies...

EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes

arXiv:2507.11407v2 Announce Type: replace Abstract: This technical report introduces EXAONE 4.0, which integrates a Non-reasoning...

EVs could be cheaper to own than gas cars in Africa by 2040

Electric vehicles could be economically competitive in Africa sooner than expected. Just 1% of new...

EvoSchema: Towards Text-to-SQL Robustness Against Schema Evolution

arXiv:2603.10697v1 Announce Type: cross Abstract: Neural text-to-SQL models, which translate natural language questions (NLQs) into...

Everything you need to know about estimating AI’s energy and emissions burden

When we set out to write a story on the best available estimates for AI’s...

Everyone’s looking to get in on vibe coding — and Google is no different with Stitch, its follow-up to Jules

Google is looking to compete in vibe coding with Stitch, which designs user interfaces (UIs)...

Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model

arXiv:2510.18855v2 Announce Type: replace Abstract: We present Ring-1T, the first open-source, state-of-the-art thinking model with...
