News
Falcon: A Comprehensive Chinese Text-to-SQL Benchmark for Enterprise-Grade Evaluation
arXiv:2510.24762v1 Announce Type: new Abstract: We introduce Falcon, a cross-domain Chinese text-to-SQL benchmark grounded in...
FaithUn: Toward Faithful Forgetting in Language Models by Investigating the Interconnectedness of Knowledge
arXiv:2502.19207v2 Announce Type: replace Abstract: Various studies have attempted to remove sensitive or private knowledge...
Faithfulness-Aware Uncertainty Quantification for Fact-Checking the Output of Retrieval Augmented Generation
arXiv:2505.21072v5 Announce Type: replace Abstract: Large Language Models (LLMs) enhanced with retrieval, an approach known...
Faithfulness metric fusion: Improving the evaluation of LLM trustworthiness across domains
arXiv:2512.05700v1 Announce Type: new Abstract: We present a methodology for improving the accuracy of faithfulness...
Fair-GPTQ: Bias-Aware Quantization for Large Language Models
arXiv:2509.15206v1 Announce Type: new Abstract: High memory demands of generative language models have drawn attention...
Fair Text Classification via Transferable Representations
arXiv:2503.07691v2 Announce Type: replace-cross Abstract: Group fairness is a central research topic in text classification...
Failure by Interference: Language Models Make Balanced Parentheses Errors When Faulty Mechanisms Overshadow Sound Ones
arXiv:2507.00322v1 Announce Type: new Abstract: Despite remarkable advances in coding capabilities, language models (LMs) still...
External Hippocampus: Topological Cognitive Maps for Guiding Large Language Model Reasoning
arXiv:2512.18190v1 Announce Type: cross Abstract: This paper proposes the External Hippocampus framework, which models language...
Exploring the Influence of Relevant Knowledge for Natural Language Generation Interpretability
arXiv:2510.24179v1 Announce Type: new Abstract: This paper explores the influence of external knowledge integration in...
Exploring the Escalation of Source Bias in User, Data, and Recommender System Feedback Loop
arXiv:2405.17998v2 Announce Type: replace-cross Abstract: Recommender systems are essential for information access, allowing users to...
Exploring Procedural Data Generation for Automatic Acoustic Guitar Fingerpicking Transcription
arXiv:2508.07987v1 Announce Type: cross Abstract: Automatic transcription of acoustic guitar fingerpicking performances remains a challenging...
Exploring LLM Autoscoring Reliability in Large-Scale Writing Assessments Using Generalizability Theory
arXiv:2507.19980v1 Announce Type: new Abstract: This study investigates the estimation of reliability for large language...