News

February 28, 2026 | admin | NUAI, Committee, News, Uncategorized | 0

Under the Influence: Quantifying Persuasion and Vigilance in Large Language Models

arXiv:2602.21262v2 Announce Type: replace Abstract: With increasing integration of Large Language Models (LLMs) into areas...

May 27, 2026 | admin | NUAI, Committee, News, Uncategorized | 0

Uncertainty-Aware Budget Allocation for Adaptive Test-Time Reasoning

arXiv:2605.26849v1 Announce Type: new Abstract: Sampling multiple responses improves language model reasoning, but uniform compute...

November 13, 2025 | admin | NUAI, Committee, News, Uncategorized | 0

Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers

arXiv:2504.19254v4 Announce Type: replace Abstract: Hallucinations are a persistent problem with Large Language Models (LLMs)...

October 24, 2025 | admin | NUAI, Committee, News, Uncategorized | 0

UltraCUA: A Foundation Computer-Use Agents Model that Bridges the Gap between General-Purpose GUI Agents and Specialized API-based Agents

Computer-use agents have been limited to primitives. They click, they type, they scroll. Long action...

October 30, 2025 | admin | NUAI, Committee, News, Uncategorized | 0

TwinVoice: A Multi-dimensional Benchmark Towards Digital Twins via LLM Persona Simulation

arXiv:2510.25536v1 Announce Type: new Abstract: Large Language Models (LLMs) are exhibiting emergent human-like abilities and...

August 4, 2025 | admin | NUAI, Committee, News, Uncategorized | 0

Tutorial: Exploring SHAP-IQ Visualizations

In this tutorial, we’ll explore a range of SHAP-IQ visualizations that provide insights into how...

September 18, 2025 | admin | NUAI, Committee, News, Uncategorized | 0

Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions

arXiv:2501.01872v5 Announce Type: replace Abstract: Large language models, despite extensive alignment with human values and...

TurkBench: A Benchmark for Evaluating Turkish Large Language Models

arXiv:2601.07020v2 Announce Type: replace Abstract: With the recent surge in the development of large language...

May 14, 2025 | admin | NUAI, Committee, News, Uncategorized | 0

TUMS: Enhancing Tool-use Abilities of LLMs with Multi-structure Handlers

arXiv:2505.08402v1 Announce Type: new Abstract: Recently, large language models (LLMs) have played an increasingly important role...

TuCo: Measuring the Contribution of Fine-Tuning to Individual Responses of LLMs

arXiv:2506.23423v1 Announce Type: new Abstract: Past work has studied the effects of fine-tuning on large...

TSEmbed: Unlocking Task Scaling in Universal Multimodal Embeddings

arXiv:2603.04772v1 Announce Type: new Abstract: Despite the exceptional reasoning capabilities of Multimodal Large Language Models...

January 21, 2026 | admin | NUAI, Committee, News, Uncategorized | 0

Trustworthy Data-driven Chronological Age Estimation from Panoramic Dental Images

arXiv:2601.12960v1 Announce Type: new Abstract: Integrating deep learning into healthcare enables personalized care but raises...

