YouZum

News

FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning

arXiv:2505.08054v1 Announce Type: new Abstract: Safety alignment approaches in large language models (LLMs) often lead...

Failure by Interference: Language Models Make Balanced Parentheses Errors When Faulty Mechanisms Overshadow Sound Ones

arXiv:2507.00322v1 Announce Type: new Abstract: Despite remarkable advances in coding capabilities, language models (LMs) still...

Exploring the Escalation of Source Bias in User, Data, and Recommender System Feedback Loop

arXiv:2405.17998v2 Announce Type: replace-cross Abstract: Recommender systems are essential for information access, allowing users to...

Exploiting Adaptive Contextual Masking for Aspect-Based Sentiment Analysis

arXiv:2402.13722v2 Announce Type: replace Abstract: Aspect-Based Sentiment Analysis (ABSA) is a fine-grained linguistics problem that...

Everything you need to know about estimating AI’s energy and emissions burden

When we set out to write a story on the best available estimates for AI’s...

Everyone’s looking to get in on vibe coding — and Google is no different with Stitch, its follow-up to Jules

Google is looking to compete in vibe coding with Stitch, which designs user interfaces (UIs)...

EventHunter: Dynamic Clustering and Ranking of Security Events from Hacker Forum Discussions

arXiv:2507.09762v1 Announce Type: cross Abstract: Hacker forums provide critical early warning signals for emerging cybersecurity...

Evaluating Rare Disease Diagnostic Performance in Symptom Checkers: A Synthetic Vignette Simulation Approach

arXiv:2506.19750v2 Announce Type: replace Abstract: Symptom Checkers (SCs) provide users with personalized medical information. To...

Evaluating Creative Short Story Generation in Humans and Large Language Models

arXiv:2411.02316v5 Announce Type: replace Abstract: Story-writing is a fundamental aspect of human imagination, relying heavily...

Evaluating and Improving Robustness in Large Language Models: A Survey and Future Directions

arXiv:2506.11111v1 Announce Type: new Abstract: Large Language Models (LLMs) have gained enormous attention in recent...

EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees

arXiv:2503.08893v2 Announce Type: replace Abstract: An ideal model evaluation should achieve two goals: identifying where...

Estimating LLM Uncertainty with Logits

arXiv:2502.00290v4 Announce Type: replace Abstract: Over the past few years, Large Language Models (LLMs) have...