YouZum

Committee

AI, Committee, News, Uncategorized

A Systematic Literature Review of Retrieval-Augmented Generation: Techniques, Metrics, and Challenges

arXiv:2508.06401v3 Announce Type: replace-cross Abstract: This systematic review of the research literature on retrieval-augmented generation (RAG) provides a focused analysis of the most highly cited studies published between 2020 and May 2025. A total of 128 articles met our inclusion criteria. The records were retrieved from ACM Digital Library, IEEE Xplore, Scopus, ScienceDirect, and the Digital Bibliography and Library Project (DBLP). RAG couples a neural retriever with a generative language model, grounding output in up-to-date, non-parametric memory while retaining the semantic generalisation stored in model weights. Guided by the PRISMA 2020 framework, we (i) specify explicit inclusion and exclusion criteria based on citation count and research questions, (ii) catalogue datasets, architectures, and evaluation practices, and (iii) synthesise empirical evidence on the effectiveness and limitations of RAG. To mitigate citation-lag bias, we applied a lower citation-count threshold to papers published in 2025 so that emerging breakthroughs with naturally fewer citations were still captured. This review clarifies the current research landscape, highlights methodological gaps, and charts priority directions for future research.
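
The retriever-plus-generator coupling described in the abstract can be illustrated with a minimal sketch. It is not taken from any of the reviewed papers: the bag-of-words cosine retriever stands in for a neural retriever, the generator is left out entirely, and the names `retrieve` and `build_prompt` and the toy corpus are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Return the k passages most similar to the query (toy retriever)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        corpus,
        key=lambda doc: cosine(q, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, passages):
    """Put retrieved passages ahead of the question so a generator can ground its answer."""
    context = "\n".join("- " + p for p in passages)
    return "Context:\n" + context + "\n\nQuestion: " + query + "\nAnswer:"


# Hypothetical three-passage corpus standing in for the non-parametric memory.
corpus = [
    "RAG couples a neural retriever with a generative language model.",
    "PRISMA 2020 is a reporting framework for systematic reviews.",
    "Citation counts lag behind publication dates.",
]
query = "What does RAG couple with a retriever?"
prompt = build_prompt(query, retrieve(query, corpus, k=1))
```

In a real system the prompt would be passed to a language model; swapping the corpus updates what the model can cite without touching its weights.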

GENUINE: Graph Enhanced Multi-level Uncertainty Estimation for Large Language Models

arXiv:2509.07925v1 Announce Type: new Abstract: Uncertainty estimation is essential for enhancing the reliability of Large Language Models (LLMs), particularly in high-stakes applications. Existing methods often overlook semantic dependencies, relying on token-level probability measures that fail to capture structural relationships within the generated text. We propose GENUINE: Graph ENhanced mUlti-level uncertaINty Estimation for Large Language Models, a structure-aware framework that leverages dependency parse trees and hierarchical graph pooling to refine uncertainty quantification. By incorporating supervised learning, GENUINE effectively models semantic and structural relationships, improving confidence assessments. Extensive experiments across NLP tasks show that GENUINE achieves up to 29% higher AUROC than semantic entropy-based approaches and reduces calibration errors by over 15%, demonstrating the effectiveness of graph-based uncertainty modeling. The code is available at https://github.com/ODYSSEYWT/GUQ.
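
The abstract reports results in AUROC, which for uncertainty estimation measures how well confidence scores separate correct from incorrect answers. The sketch below computes the metric by pairwise comparison; it is a generic illustration of the metric, not code from the GENUINE repository, and the function name `auroc` is an assumption.

```python
def auroc(confidences, correct):
    """Probability that a randomly chosen correct answer receives a higher
    confidence than a randomly chosen incorrect one (ties count as 0.5)."""
    pos = [c for c, y in zip(confidences, correct) if y]
    neg = [c for c, y in zip(confidences, correct) if not y]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical confidences for four answers, two of them correct.
score = auroc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; a rank-based computation is the usual choice for large evaluation sets.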

Step-level Verifier-guided Hybrid Test-Time Scaling for Large Language Models

arXiv:2507.15512v3 Announce Type: replace Abstract: Test-Time Scaling (TTS) is a promising approach to progressively elicit the model’s intelligence during inference. Recently, training-based TTS methods, such as continued reinforcement learning (RL), have surged in popularity, while training-free TTS methods are gradually fading from prominence. However, the additional computation overhead of training amplifies the burden on test-time scaling. In this paper, we focus on training-free TTS methods for reasoning. We first design Conditional Step-level Self-refinement, a fine-grained sequential scaling method guided by process verification. Building on its effectiveness, we combine it with classical parallel scaling methods at the step level to introduce a novel inference paradigm called Hybrid Test-Time Scaling. Extensive experiments on five instruction-tuned LLMs across different scales (3B-14B) and families demonstrate that a hybrid strategy incorporating various training-free TTS methods at a fine granularity has considerable potential for expanding the reasoning performance boundaries of LLMs.
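
One way to picture the combination of parallel and sequential scaling at the step level is the sketch below. It is an assumption-laden illustration, not the paper's algorithm: each step samples several candidates (parallel scaling), keeps the one the verifier scores highest, and refines it only if the score falls below a threshold (conditional sequential scaling). The callables `propose`, `refine`, and `verify`, and all default parameter values, are hypothetical.

```python
def hybrid_step_search(propose, refine, verify, n_steps,
                       n_candidates=4, threshold=0.5, max_refine=2):
    """Build a reasoning chain step by step.

    propose(prefix, i) -> candidate step (i indexes the parallel sample)
    refine(prefix, step) -> revised step
    verify(prefix, step) -> score, higher is better
    """
    steps = []
    for _ in range(n_steps):
        # Parallel scaling: sample several candidates for this step.
        candidates = [propose(steps, i) for i in range(n_candidates)]
        best = max(candidates, key=lambda c: verify(steps, c))
        # Conditional sequential scaling: refine only while the verifier is unsatisfied.
        rounds = 0
        while verify(steps, best) < threshold and rounds < max_refine:
            best = refine(steps, best)
            rounds += 1
        steps.append(best)
    return steps


# Toy stand-ins: a "step" is an integer and its quality is step / 10.
def toy_propose(prefix, i):
    return i


def toy_refine(prefix, step):
    return step + 1


def toy_verify(prefix, step):
    return step / 10


chain = hybrid_step_search(toy_propose, toy_refine, toy_verify, n_steps=2)
```

In practice `propose` and `refine` would call a language model and `verify` a process verifier; the control flow is the only part this sketch tries to convey.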

UPLex: Fine-Grained Personality Control in Large Language Models via Unsupervised Lexical Modulation

arXiv:2310.16582v3 Announce Type: replace Abstract: Personality is a crucial factor that shapes human communication patterns, so regulating the personalities of large language models (LLMs) holds significant potential for enhancing their user experience. Previous approaches either relied on fine-tuning LLMs on specific corpora or required manually crafted prompts to evoke specific personalities from LLMs. However, the former is inefficient and costly, while the latter cannot precisely manipulate personality traits at a fine-grained level. To address these challenges, we propose UPLex, a method that uses an Unsupervisedly-Built Personalized Lexicon (UPL) during the decoding phase to manipulate an LLM’s personality traits. UPL can be constructed from a newly built situational judgment test dataset in an unsupervised fashion, and used to modulate the personality expression of LLMs by dynamically altering their predicted probability of upcoming words in a pluggable fashion. Extensive experiments demonstrate the effectiveness and pluggability of our method for fine-grained manipulation of LLMs’ personalities.
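
The idea of altering next-word probabilities with a lexicon at decoding time can be sketched as an additive shift on the logits. This is a simplified illustration under stated assumptions, not the UPLex implementation: the additive form, the `alpha` strength parameter, and the toy lexicon weights are all hypothetical.

```python
import math


def softmax(logits):
    """Convert a token -> logit mapping into a token -> probability mapping."""
    peak = max(logits.values())
    exps = {tok: math.exp(v - peak) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def modulate(logits, lexicon, alpha):
    """Shift each token's logit by alpha times its lexicon weight.

    Tokens missing from the lexicon are left unchanged, so the adjustment
    can be plugged in or removed without touching the model itself.
    """
    return {tok: v + alpha * lexicon.get(tok, 0.0) for tok, v in logits.items()}


# Hypothetical next-token logits and a toy lexicon for one trait.
logits = {"party": 1.0, "book": 1.0, "quiet": 1.0}
lexicon = {"party": 1.0, "quiet": -1.0}

neutral = softmax(modulate(logits, lexicon, alpha=0.0))
shifted = softmax(modulate(logits, lexicon, alpha=2.0))
```

Varying `alpha` continuously is what would give fine-grained control in this sketch: zero leaves the distribution untouched, and larger values push probability toward positively weighted words.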

Avoiding Knowledge Edit Skipping in Multi-hop Question Answering with Guided Decomposition

arXiv:2509.07555v1 Announce Type: new Abstract: In a rapidly evolving world where information updates swiftly, knowledge in large language models (LLMs) becomes outdated quickly. Retraining LLMs is not a cost-effective option, making knowledge editing (KE) without modifying parameters particularly necessary. We find that although existing retrieval-augmented generation (RAG)-based KE methods excel at editing simple knowledge, they struggle with KE in multi-hop question answering due to the issue of “edit skipping”, which refers to skipping the relevant edited fact during inference. In addition to the diversity of natural language expressions of knowledge, edit skipping also arises from the mismatch between the granularity at which LLMs solve problems and the granularity of the facts in the edited memory. To address this issue, we propose a novel Iterative Retrieval-Augmented Knowledge Editing method with guided decomposition (IRAKE), which draws guidance from single edited facts and entire edited cases. Experimental results demonstrate that IRAKE mitigates the failure of editing caused by edit skipping and outperforms state-of-the-art methods for KE in multi-hop question answering.
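
Edit skipping can be illustrated with a toy multi-hop lookup. The sketch below is not IRAKE: it only shows why resolving a question one hop at a time, and consulting the edit memory at every hop, avoids answering from a stale fact. The function name, the tuple-keyed fact stores, and the placeholder entities are assumptions made for illustration.

```python
def answer_multihop(start_entity, relations, base_facts, edited_facts):
    """Resolve a chain of relations hop by hop.

    At every hop the edit memory is checked before the base knowledge,
    so an edited fact in the middle of the chain is not skipped.
    Returns the final entity and a trace of (key, value, source) per hop.
    """
    entity = start_entity
    trace = []
    for relation in relations:
        key = (entity, relation)
        if key in edited_facts:
            entity, source = edited_facts[key], "edit"
        elif key in base_facts:
            entity, source = base_facts[key], "base"
        else:
            raise KeyError("no fact stored for %r" % (key,))
        trace.append((key, entity, source))
    return entity, trace


# Placeholder knowledge: the edit replaces the leader of "Country A".
base_facts = {
    ("Country A", "leader"): "Person A",
    ("Person A", "birthplace"): "City A",
    ("Person B", "birthplace"): "City B",
}
edited_facts = {("Country A", "leader"): "Person B"}

answer, trace = answer_multihop(
    "Country A", ["leader", "birthplace"], base_facts, edited_facts
)
```

Answering the two-hop question in a single step from the base facts would yield "City A"; the hop-level lookup follows the edit and reaches "City B" instead.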

Antidistillation Sampling

arXiv:2504.13146v4 Announce Type: replace-cross Abstract: Frontier models that generate extended reasoning traces inadvertently produce rich token sequences that can facilitate model distillation. Recognizing this vulnerability, model owners may seek sampling strategies that limit the effectiveness of distillation without compromising model performance. Antidistillation sampling provides exactly this capability. By strategically modifying a model’s next-token probability distribution, antidistillation sampling poisons reasoning traces, rendering them significantly less effective for distillation while preserving the model’s practical utility. For further details, see https://antidistillation.com.

No Encore: Unlearning as Opt-Out in Music Generation

arXiv:2509.06277v1 Announce Type: new Abstract: AI music generation is rapidly emerging in the creative industries, enabling intuitive music generation from textual descriptions. However, these systems risk exploiting copyrighted creations, raising ethical and legal concerns. In this paper, we present preliminary results from ongoing research on the first application of machine unlearning techniques to prevent inadvertent usage of creative content. In particular, we apply existing machine unlearning methods to a pre-trained Text-to-Music (TTM) baseline and analyze their efficacy in unlearning datasets seen during pre-training without harming model performance. Through our experiments, we provide insights into the challenges of applying unlearning in music generation, offering a foundational analysis for future work on unlearning for music generative models.

Support or Refute: Analyzing the Stance of Evidence to Detect Out-of-Context Mis- and Disinformation

arXiv:2311.01766v5 Announce Type: replace Abstract: Mis- and disinformation online have become a major societal problem and a major source of online harms of different kinds. One common form of mis- and disinformation is out-of-context (OOC) information, where different pieces of information are falsely associated, e.g., a real image combined with a false textual caption or a misleading textual description. Although some past studies have attempted to defend against OOC mis- and disinformation through external evidence, they tend to disregard the role of different pieces of evidence with different stances. Motivated by the intuition that the stance of evidence represents a bias towards different detection results, we propose a stance extraction network (SEN) that can extract the stances of different pieces of multi-modal evidence in a unified framework. Moreover, we introduce into the textual SEN a support-refutation score calculated from the co-occurrence relations of named entities. Extensive experiments on a public large-scale dataset demonstrated that our proposed method outperformed the state-of-the-art baselines, with the best model achieving a performance gain of 3.2% in accuracy. The source code and checkpoints are publicly available at https://github.com/yx3266/SEN.

Automatic Prompt Optimization with Prompt Distillation

arXiv:2508.18992v2 Announce Type: replace Abstract: Autoprompting is the process of automatically selecting optimized prompts for language models, which is gaining popularity due to the rapid development of prompt engineering driven by extensive research in the field of large language models (LLMs). This paper presents DistillPrompt, a novel autoprompting method based on large language models that employs a multi-stage integration of task-specific information into prompts using training data. DistillPrompt utilizes distillation, compression, and aggregation operations to explore the prompt space more thoroughly. The method was tested on different datasets for text classification and generation tasks using the t-lite-instruct-0.1 language model. The results demonstrate a significant average improvement (e.g., 20.12% across the entire dataset compared to GrIPS) in key metrics over existing methods in the field, establishing DistillPrompt as one of the most effective non-gradient approaches in autoprompting.

Anchoring Refusal Direction: Mitigating Safety Risks in Tuning via Projection Constraint

arXiv:2509.06795v1 Announce Type: new Abstract: Instruction Fine-Tuning (IFT) has been widely adopted as an effective post-training strategy to enhance various abilities of Large Language Models (LLMs). However, prior studies have shown that IFT can significantly compromise LLMs’ safety, particularly their ability to refuse malicious instructions, raising significant concerns. Recent research into the internal mechanisms of LLMs has identified the refusal direction (r-direction) in the hidden states, which plays a pivotal role in governing refusal behavior. Building on this insight, our study reveals that the r-direction tends to drift during training, which we identify as one of the causes of the associated safety risks. To mitigate such drift, our proposed ProCon method introduces a projection-constrained loss term that regularizes the projection magnitude of each training sample’s hidden state onto the r-direction. Our initial analysis shows that applying an appropriate constraint can effectively mitigate the refusal direction drift and associated safety risks, but remains limited by overall performance barriers. To overcome this barrier, informed by our observation of early-stage sharp drift and a data-driven perspective, we introduce a warm-up strategy that emphasizes early-stage strong constraints and broadens the data distribution to strengthen constraint signals, leading to an enhanced ProCon method. Experimental results under various datasets, scenarios, and LLMs demonstrate that our method can significantly mitigate safety risks posed by IFT while preserving task performance gains. Even compared with strong baselines, our method consistently delivers superior overall performance. Crucially, our analysis indicates that ProCon can contribute to stabilizing the r-direction during training, while such an interpretability-driven exploration of LLMs’ internal mechanisms lays a solid foundation for future safety research.
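
A loss term that regularizes the projection of hidden states onto a fixed direction might look like the sketch below. The abstract does not give the exact formula, so everything here is an assumption: the penalty is taken to be the mean squared gap between each sample's current projection and a reference projection recorded before tuning, and `weight` is a hypothetical coefficient.

```python
import math


def projection(hidden, direction):
    """Scalar projection of a hidden-state vector onto the given direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return sum(h * d for h, d in zip(hidden, direction)) / norm


def projection_penalty(hidden_states, reference_projections, direction):
    """Mean squared gap between current and reference projections (assumed form)."""
    gaps = [
        (projection(h, direction) - ref) ** 2
        for h, ref in zip(hidden_states, reference_projections)
    ]
    return sum(gaps) / len(gaps)


def total_loss(task_loss, hidden_states, reference_projections, direction, weight=0.1):
    """Task loss plus the weighted projection penalty."""
    penalty = projection_penalty(hidden_states, reference_projections, direction)
    return task_loss + weight * penalty


# Toy 2-D example: one sample whose projection drifted from 1.0 to 3.0.
loss = total_loss(1.0, [[3.0, 4.0]], [1.0], [1.0, 0.0], weight=0.5)
```

A warm-up schedule of the kind the abstract mentions could be approximated here by starting with a large `weight` and decaying it over training, though that schedule is likewise an assumption.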
