YouZum

News

Fast, Slow, and Tool-augmented Thinking for LLMs: A Review

arXiv:2508.12265v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated remarkable progress in reasoning...

FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning

arXiv:2505.08054v1 Announce Type: new Abstract: Safety alignment approaches in large language models (LLMs) often lead...

False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize

arXiv:2509.03888v1 Announce Type: new Abstract: Large Language Models (LLMs) can comply with harmful instructions, raising...

Failure by Interference: Language Models Make Balanced Parentheses Errors When Faulty Mechanisms Overshadow Sound Ones

arXiv:2507.00322v1 Announce Type: new Abstract: Despite remarkable advances in coding capabilities, language models (LMs) still...

Exploring the Escalation of Source Bias in User, Data, and Recommender System Feedback Loop

arXiv:2405.17998v2 Announce Type: replace-cross Abstract: Recommender systems are essential for information access, allowing users to...

Exploring Procedural Data Generation for Automatic Acoustic Guitar Fingerpicking Transcription

arXiv:2508.07987v1 Announce Type: cross Abstract: Automatic transcription of acoustic guitar fingerpicking performances remains a challenging...

Exploring LLM Autoscoring Reliability in Large-Scale Writing Assessments Using Generalizability Theory

arXiv:2507.19980v1 Announce Type: new Abstract: This study investigates the estimation of reliability for large language...

Exploration of Plan-Guided Summarization for Narrative Texts: the Case of Small Language Models

arXiv:2504.09071v2 Announce Type: replace Abstract: Plan-guided summarization attempts to reduce hallucinations in small language models...

Exploiting Adaptive Contextual Masking for Aspect-Based Sentiment Analysis

arXiv:2402.13722v2 Announce Type: replace Abstract: Aspect-Based Sentiment Analysis (ABSA) is a fine-grained linguistics problem that...

Explaining Length Bias in LLM-Based Preference Evaluations

arXiv:2407.01085v4 Announce Type: replace-cross Abstract: The use of large language models (LLMs) as judges, particularly...

ExCyTIn-Bench: Evaluating LLM agents on Cyber Threat Investigation

arXiv:2507.14201v2 Announce Type: replace-cross Abstract: We present ExCyTIn-Bench, the first benchmark to Evaluate an LLM...

Everything you need to know about estimating AI’s energy and emissions burden

When we set out to write a story on the best available estimates for AI’s...
