
RA3: Mid-Training with Temporal Action Abstractions for Faster Reinforcement Learning (RL) Post-Training in Code LLMs

TL;DR: New research from Apple formalizes what "mid-training" should do before reinforcement learning (RL) post-training and introduces RA3 (Reasoning as Action Abstractions), an EM-style procedure that learns temporally consistent latent actions from expert traces and then fine-tunes on those bootstrapped traces. The analysis shows mid-training should (1) prune to a compact, near-optimal action subspace and (2) shorten the effective planning horizon, both of which improve RL convergence. Empirically, RA3 improves HumanEval/MBPP by roughly 8/4 points over the base model and an NTP baseline, and accelerates RLVR on HumanEval+, MBPP+, LiveCodeBench, and Codeforces.

What does the research present?

The research team presents the first formal treatment of how mid-training shapes post-training RL. They decompose outcomes into (i) pruning efficiency, i.e. how well mid-training selects a compact near-optimal action subset that shapes the initial policy prior, and (ii) RL convergence, i.e. how quickly post-training improves within that restricted set. The analysis argues that mid-training is most effective when the decision space is compact and the effective horizon is short, favoring temporal abstractions over primitive next-token actions.

https://arxiv.org/pdf/2509.25810

Algorithm: RA3 in one pass

RA3 derives a sequential variational lower bound (a temporal ELBO) and optimizes it with an EM-like loop:

E-step (latent discovery): use RL to infer temporally consistent latent structures (abstractions) aligned to expert sequences.

M-step (model update): perform next-token prediction on the bootstrapped, latent-annotated traces so that the abstractions become part of the model's policy.

Results: code generation and RLVR

On Python code tasks, the research team reports that, across multiple base models, RA3 improves average pass@k on HumanEval and MBPP by roughly 8 and 4 points over the base model and an NTP mid-training baseline. In post-training, RLVR converges faster and reaches higher final performance on HumanEval+, MBPP+, LiveCodeBench, and Codeforces when initialized from RA3. These are mid- and post-training effects, respectively; the evaluation scope is code generation.

Key Takeaways

The research team formalizes mid-training via two determinants, pruning efficiency and impact on RL convergence, arguing that effectiveness rises when the decision space is compact and the effective horizon is short.

RA3 optimizes a sequential variational lower bound by iteratively discovering temporally consistent latent structures with RL and then fine-tuning on the bootstrapped traces (EM-style).

On code generation, RA3 reports roughly +8 (HumanEval) and +4 (MBPP) average pass@k gains over base and NTP mid-training baselines across several model scales.

Initializing post-training with RA3 accelerates RLVR convergence and improves asymptotic performance on HumanEval+, MBPP+, LiveCodeBench, and Codeforces.

Editorial Comments

RA3's contribution is concrete and narrow: it formalizes mid-training around two determinants (pruning efficiency and RL convergence) and operationalizes them via a temporal ELBO optimized in an EM loop to learn persistent action abstractions before RLVR. The researchers report roughly +8 (HumanEval) and +4 (MBPP) average pass@k gains over base and NTP baselines, plus faster RLVR convergence on HumanEval+, MBPP+, LiveCodeBench, and Codeforces.
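The paper does not ship a reference implementation, so the following is only a toy, runnable Python sketch of the EM-style alternation described above. The "model", the RL-based E-step, and the NTP M-step are all replaced by trivial stand-ins (a chunk counter, a greedy segmenter, a counting update); none of these names or choices come from the paper.

    # Toy sketch of an RA3-style EM loop: E-step discovers chunk-level
    # "latent actions" in expert traces, M-step folds them into a policy prior.
    import random
    from collections import Counter

    def infer_latent_abstractions(model, trace, max_len=4):
        """Stand-in E-step: greedily segment a token trace into chunks.
        A real E-step would use RL to find temporally consistent abstractions."""
        segments, i = [], 0
        while i < len(trace):
            limit = min(max_len, len(trace) - i)
            # Prefer chunks the current 'model' has seen often, with a small
            # tie-break toward longer (more temporally extended) chunks.
            best = max(range(1, limit + 1),
                       key=lambda k: model[tuple(trace[i:i + k])] * k + 1e-3 * k)
            segments.append(tuple(trace[i:i + best]))
            i += best
        return segments

    def fine_tune_ntp(model, annotated_traces):
        """Stand-in M-step: 'next-token prediction' reduced to counting chunks."""
        for segments in annotated_traces:
            model.update(segments)
        return model

    def ra3_mid_training(expert_traces, num_rounds=3):
        model = Counter()  # toy policy prior: chunk frequencies
        for _ in range(num_rounds):
            annotated = [infer_latent_abstractions(model, t) for t in expert_traces]  # E-step
            model = fine_tune_ntp(model, annotated)                                   # M-step
        return model

    if __name__ == "__main__":
        random.seed(0)
        traces = [[random.choice("abc") for _ in range(12)] for _ in range(20)]
        print(ra3_mid_training(traces).most_common(5))

The point of the sketch is the control flow only: abstractions discovered in one round persist in the prior that the next round (and, in the paper, RLVR post-training) starts from.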
Check out the technical paper (https://arxiv.org/pdf/2509.25810). The post RA3: Mid-Training with Temporal Action Abstractions for Faster Reinforcement Learning (RL) Post-Training in Code LLMs appeared first on MarkTechPost.


Fun-ASR Technical Report

arXiv:2509.12508v3 Announce Type: replace Abstract: In recent years, automatic speech recognition (ASR) has witnessed transformative advancements driven by three complementary paradigms: data scaling, model size scaling, and deep integration with large language models (LLMs). However, LLMs are prone to hallucination, which can significantly degrade user experience in real-world ASR applications. In this paper, we present Fun-ASR, a large-scale, LLM-based ASR system that synergistically combines massive data, large model capacity, LLM integration, and reinforcement learning to achieve state-of-the-art performance across diverse and complex speech recognition scenarios. Moreover, Fun-ASR is specifically optimized for practical deployment, with enhancements in streaming capability, noise robustness, code-switching, hotword customization, and other real-world application requirements. Experimental results show that while most LLM-based ASR systems achieve strong performance on open-source benchmarks, they often underperform on real industry evaluation sets. Thanks to production-oriented optimizations, Fun-ASR achieves state-of-the-art performance on real application datasets, demonstrating its effectiveness and robustness in practical settings.


Is It Thinking or Cheating? Detecting Implicit Reward Hacking by Measuring Reasoning Effort

arXiv:2510.01367v3 Announce Type: replace-cross Abstract: Reward hacking, where a reasoning model exploits loopholes in a reward function to achieve high rewards without solving the intended task, poses a significant threat. This behavior may be explicit, i.e., verbalized in the model's chain-of-thought (CoT), or implicit, where the CoT appears benign and thus bypasses CoT monitors. To detect implicit reward hacking, we propose TRACE (Truncated Reasoning AUC Evaluation). Our key observation is that hacking occurs when exploiting the loophole is easier than solving the actual task. This means that the model is using less 'effort' than required to achieve high reward. TRACE quantifies effort by measuring how early a model's reasoning becomes sufficient to obtain the reward. We progressively truncate a model's CoT at various lengths, force the model to answer, and estimate the expected reward at each cutoff. A hacking model, which takes a shortcut, will achieve a high expected reward with only a small fraction of its CoT, yielding a large area under the accuracy-vs-length curve. TRACE achieves over 65% gains over our strongest 72B CoT monitor in math reasoning, and over 30% gains over a 32B monitor in coding. We further show that TRACE can discover unknown loopholes during training. Overall, TRACE offers a scalable unsupervised approach for oversight where current monitoring methods prove ineffective.
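The truncate-and-force-answer measurement is specified closely enough in the abstract to sketch. Below is a minimal Python version; the grading callable is a toy stand-in for querying the evaluated model with a truncated CoT and scoring its forced answer, not the paper's evaluation harness.

    def trace_auc(prompt, cot_tokens, grade, num_cutoffs=11):
        """Expected reward at progressively longer CoT prefixes, summarized as the
        area under the reward-vs-truncation-fraction curve (higher = less 'effort')."""
        fractions = [i / (num_cutoffs - 1) for i in range(num_cutoffs)]
        rewards = [grade(prompt, cot_tokens[:int(round(f * len(cot_tokens)))])
                   for f in fractions]
        # Trapezoidal rule; the result lies in [0, 1] if grade() does.
        return sum((rewards[i] + rewards[i + 1]) / 2 * (fractions[i + 1] - fractions[i])
                   for i in range(num_cutoffs - 1))

    if __name__ == "__main__":
        cot = ["step"] * 100
        # Toy graders: an honest solver only earns the reward near the end of its
        # reasoning, while a hacking model earns it almost immediately.
        honest = lambda p, c: float(len(c) >= 90)
        hacker = lambda p, c: float(len(c) >= 5)
        print("honest AUC:", trace_auc("prompt", cot, honest))  # 0.15
        print("hacker AUC:", trace_auc("prompt", cot, hacker))  # 0.95

A large gap between the two AUCs is exactly the "low reasoning effort" signature the method flags.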


Partial Information Decomposition via Normalizing Flows in Latent Gaussian Distributions

arXiv:2510.04417v1 Announce Type: cross Abstract: The study of multimodality has garnered significant interest in fields where the analysis of interactions among multiple information sources can enhance predictive modeling, data fusion, and interpretability. Partial information decomposition (PID) has emerged as a useful information-theoretic framework to quantify the degree to which individual modalities independently, redundantly, or synergistically convey information about a target variable. However, existing PID methods depend on optimizing over a joint distribution constrained by estimated pairwise probability distributions, which are costly and inaccurate for continuous and high-dimensional modalities. Our first key insight is that the problem can be solved efficiently when the pairwise distributions are multivariate Gaussians, and we refer to this problem as Gaussian PID (GPID). We propose a new gradient-based algorithm that substantially improves the computational efficiency of GPID based on an alternative formulation of the underlying optimization problem. To generalize the applicability to non-Gaussian data, we learn information-preserving encoders to transform random variables of arbitrary input distributions into pairwise Gaussian random variables. Along the way, we resolved an open problem regarding the optimality of joint Gaussian solutions for GPID. Empirical validation in diverse synthetic examples demonstrates that our proposed method provides more accurate and efficient PID estimates than existing baselines. We further evaluate a series of large-scale multimodal benchmarks to show its utility in real-world applications of quantifying PID in multimodal datasets and selecting high-performing models.
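The abstract does not spell out the GPID optimization itself, so no attempt is made to reproduce it here. As background only, the sketch below shows the standard closed-form mutual information between jointly Gaussian variables, which is the quantity any Gaussian PID then has to apportion into unique, redundant, and synergistic parts; the function and example are generic, not the authors' estimator.

    # Closed-form Gaussian mutual information: I(X; Y) = 0.5 * (log det Sxx
    # + log det Syy - log det S), with S the full joint covariance.
    import numpy as np

    def gaussian_mi(cov, dim_x):
        """Mutual information (in nats) between the first dim_x coordinates and the rest."""
        _, logdet_x = np.linalg.slogdet(cov[:dim_x, :dim_x])
        _, logdet_y = np.linalg.slogdet(cov[dim_x:, dim_x:])
        _, logdet_xy = np.linalg.slogdet(cov)
        return 0.5 * (logdet_x + logdet_y - logdet_xy)

    if __name__ == "__main__":
        # Two scalar modalities sharing a correlation of 0.8.
        cov = np.array([[1.0, 0.8],
                        [0.8, 1.0]])
        print(gaussian_mi(cov, dim_x=1))  # 0.5 * ln(1 / (1 - 0.64)) ~= 0.511 nats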


How Many Parameters Does Your Task Really Need? Task Specific Pruning with LLM-Sieve

arXiv:2505.18350v2 Announce Type: replace-cross Abstract: As Large Language Models (LLMs) are increasingly deployed for narrow tasks in resource-constrained settings, a central question arises: how much of an LLM is truly necessary for a given task? We present LLM-Sieve, a framework that prunes LLMs down to the minimal parameter subset needed to preserve task performance. Our approach introduces two innovations: (i) output-aligned non-orthogonal projections, which yield more faithful low-rank approximations than traditional PCA/SVD by aligning directly with layer outputs; and (ii) adaptive pruning via a Genetic Algorithm, which automatically discovers matrix-specific pruning levels and exposes the uneven distribution of task-relevant knowledge. Across models from 3.8B to 70B parameters, LLM-Sieve removes 20-75% of weights with only 1-5% accuracy loss, substantially ahead of prior pruning methods. Beyond efficiency, our framework reveals bottleneck matrices that concentrate critical knowledge, suggesting architectural implications for future LLM design. LLM-Sieve integrates seamlessly with LoRA fine-tuning and quantization, enabling both efficient deployment and deeper understanding of knowledge organization in LLMs.
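The abstract names output-aligned projections without giving formulas, so the following is only a generic activation-aware low-rank sketch (an assumption about the general idea, not LLM-Sieve's actual projection): choose the rank-r approximation of a weight matrix that minimizes error on the layer's outputs over calibration activations, rather than on the raw weights.

    # Pick rank-r W_r minimizing ||(W - W_r) X^T||_F by taking the SVD of W
    # whitened with the activation covariance, then mapping back.
    import numpy as np

    def output_aligned_low_rank(W, X, rank, eps=1e-6):
        """W: (d_out, d_in) weight; X: (n, d_in) calibration activations."""
        cov = X.T @ X + eps * np.eye(X.shape[1])   # small ridge for stability
        S = np.linalg.cholesky(cov)                # S @ S.T == cov
        U, sv, Vt = np.linalg.svd(W @ S, full_matrices=False)
        M_r = (U[:, :rank] * sv[:rank]) @ Vt[:rank]
        return M_r @ np.linalg.inv(S)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        W = rng.standard_normal((64, 128))
        # Anisotropic activations, so weight-space and output-space errors differ.
        X = rng.standard_normal((256, 128)) * np.linspace(0.05, 3.0, 128)
        W_r = output_aligned_low_rank(W, X, rank=16)
        U, sv, Vt = np.linalg.svd(W, full_matrices=False)
        W_svd = (U[:, :16] * sv[:16]) @ Vt[:16]    # plain weight-space SVD baseline
        err = lambda A: np.linalg.norm((W - A) @ X.T)
        print("output error, activation-aware:", err(W_r))
        print("output error, plain SVD:       ", err(W_svd))

On the output metric the activation-aware factorization is never worse than plain SVD, which is the motivation the abstract gives for aligning with layer outputs.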


Google DeepMind Introduces CodeMender: A New AI Agent that Uses Gemini Deep Think to Automatically Patch Critical Software Vulnerabilities

What if an AI agent could localize a root cause, prove a candidate fix via automated analysis and testing, and proactively rewrite related code to eliminate an entire vulnerability class, then open an upstream patch for review? Google DeepMind introduces CodeMender, an AI agent that generates, validates, and upstreams fixes for real-world vulnerabilities using Gemini "Deep Think" reasoning and a tool-augmented workflow. In six months of internal deployment, CodeMender contributed 72 security patches across open-source projects, including codebases up to ~4.5M lines, and is designed to act both reactively (patching known issues) and proactively (rewriting code to remove vulnerability classes).

Understanding the Architecture

The agent couples large-scale code reasoning with program-analysis tooling: static and dynamic analysis, differential testing, fuzzing, and satisfiability modulo theories (SMT) solvers. A multi-agent design adds specialized "critique" reviewers that inspect semantic diffs and trigger self-corrections when regressions are detected. These components let the system localize root causes, synthesize candidate patches, and automatically regression-test changes before surfacing them for human review.

https://deepmind.google/discover/blog/introducing-codemender-an-ai-agent-for-code-security/

Validation Pipeline and Human Gate

DeepMind emphasizes automatic validation before any human touches a patch: the system tests for root-cause fixes, functional correctness, absence of regressions, and style compliance; only high-confidence patches are proposed for maintainer review. This workflow is explicitly tied to Gemini Deep Think's planning-centric reasoning over debugger traces, code-search results, and test outcomes.

Proactive Hardening: Compiler-Level Guards

Beyond patching, CodeMender applies security-hardening transforms at scale. Example: automated insertion of Clang's -fbounds-safety annotations in libwebp to enforce compiler-level bounds checks, an approach that would have neutralized the 2023 libwebp heap overflow (CVE-2023-4863) exploited in a zero-click iOS chain, as well as similar buffer over- and underflows wherever the annotations are applied.

Case Studies

DeepMind details two non-trivial fixes: (1) a crash initially flagged as a heap overflow that was traced to incorrect XML stack management; and (2) a lifetime bug requiring edits to a custom C-code generator. In both cases, agent-generated patches passed automated analysis and an LLM-judge check for functional equivalence before being proposed.

Deployment Context and Related Initiatives

Google's broader announcement frames CodeMender as part of a defensive stack that includes a new AI Vulnerability Reward Program (consolidating AI-related bounties) and the Secure AI Framework 2.0 for agent security. The post reiterates the motivation: as AI-powered vulnerability discovery scales (e.g., via BigSleep and OSS-Fuzz), automated remediation must scale in tandem.
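DeepMind has not published CodeMender's code, so the snippet below is only a schematic Python illustration of the kind of automated gate described in "Validation Pipeline and Human Gate" above: a candidate patch is surfaced to a human only if every automated check passes. All check callables are hypothetical toy stand-ins, not DeepMind's pipeline.

    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class CandidatePatch:
        diff: str

    def passes_validation(patch: CandidatePatch,
                          checks: List[Callable[[CandidatePatch], bool]]) -> bool:
        """Propose a patch for human review only if every automated check passes."""
        return all(check(patch) for check in checks)

    if __name__ == "__main__":
        # Toy stand-ins for the four gates named above: root-cause fix,
        # functional correctness, no regressions, style compliance.
        checks = [
            lambda p: "bounds check" in p.diff,         # pretend root-cause regression test
            lambda p: "FIXME" not in p.diff,            # pretend functional test suite
            lambda p: len(p.diff.splitlines()) < 500,   # pretend regression/fuzz budget
            lambda p: p.diff.endswith("\n"),            # pretend style linter
        ]
        patch = CandidatePatch(diff="if (i < n) { /* bounds check */ buf[i] = v; }\n")
        print("propose for human review:", passes_validation(patch, checks))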
Our Comments

CodeMender operationalizes Gemini Deep Think plus program-analysis tools (static/dynamic analysis, fuzzing, SMT solvers) to localize root causes and propose patches that pass automated validation before human review. Reported early data: 72 upstreamed security fixes across open-source projects over six months, including codebases on the order of ~4.5M lines. The system also applies proactive hardening (e.g., compiler-enforced bounds via Clang's -fbounds-safety) to reduce memory-safety bug classes rather than only patching individual instances. No latency or throughput benchmarks have been published yet, so impact is best measured by validated fixes and the scope of hardened code.

Check out the technical details (https://deepmind.google/discover/blog/introducing-codemender-an-ai-agent-for-code-security/). The post Google DeepMind Introduces CodeMender: A New AI Agent that Uses Gemini Deep Think to Automatically Patch Critical Software Vulnerabilities appeared first on MarkTechPost.


Internal World Models as Imagination Networks in Cognitive Agents

arXiv:2510.04391v1 Announce Type: cross Abstract: What is the computational objective of imagination? While classical interpretations suggest imagination is useful for maximizing rewards, recent findings challenge this view. In this study, we propose that imagination serves to access an internal world model (IWM) and use psychological network analysis to explore IWMs in humans and large language models (LLMs). Specifically, we assessed imagination vividness ratings using two questionnaires and constructed imagination networks from these reports. Imagination networks from human groups showed correlations between different centrality measures, including expected influence, strength, and closeness. However, imagination networks from LLMs showed a lack of clustering and lower correlations between centrality measures under different prompts and conversational memory conditions. Together, these results indicate a lack of similarity between IWMs in human and LLM agents. Overall, our study offers a novel method for comparing internally-generated representations in humans and AI, providing insights for developing human-like imagination in artificial intelligence.
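The abstract names three centrality measures from psychological network analysis (expected influence, strength, closeness). The sketch below is a generic, self-contained Python version of that kind of analysis on a correlation network built from vividness ratings; it is not the authors' pipeline, and the synthetic data are purely illustrative.

    # Build an item-by-item correlation network from vividness ratings and
    # compute strength, expected influence, and closeness centrality.
    import numpy as np

    def centralities(ratings):
        """ratings: (n_participants, n_items) vividness scores."""
        A = np.corrcoef(ratings.T)         # item-by-item correlation network
        np.fill_diagonal(A, 0.0)
        strength = np.abs(A).sum(axis=1)           # sum of |edge weights|
        expected_influence = A.sum(axis=1)         # signed sum of edge weights
        # Closeness on distances 1/|w|, via Floyd-Warshall shortest paths.
        with np.errstate(divide="ignore"):
            D = np.where(np.abs(A) > 0, 1.0 / np.abs(A), np.inf)
        np.fill_diagonal(D, 0.0)
        for k in range(D.shape[0]):
            D = np.minimum(D, D[:, [k]] + D[[k], :])
        closeness = (D.shape[0] - 1) / D.sum(axis=1)
        return strength, expected_influence, closeness

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        ratings = rng.normal(size=(200, 8)) + rng.normal(size=(200, 1))  # shared factor
        s, ei, c = centralities(ratings)
        print("strength vs closeness r:         ", np.corrcoef(s, c)[0, 1])
        print("strength vs expected influence r:", np.corrcoef(s, ei)[0, 1])

Correlations between such centrality measures are the quantity the study compares between human and LLM imagination networks.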


Large Language Models Preserve Semantic Isotopies in Story Continuations

arXiv:2510.04400v1 Announce Type: new Abstract: In this work, we explore the relevance of textual semantics to Large Language Models (LLMs), extending previous insights into the connection between distributional semantics and structural semantics. We investigate whether LLM-generated texts preserve semantic isotopies. We design a story continuation experiment using 10,000 ROCStories prompts completed by five LLMs. We first validate GPT-4o’s ability to extract isotopies from a linguistic benchmark, then apply it to the generated stories. We then analyze structural (coverage, density, spread) and semantic properties of isotopies to assess how they are affected by completion. Results show that LLM completion within a given token horizon preserves semantic isotopies across multiple properties.


SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations

arXiv:2510.04398v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly deployed in high-risk domains. However, state-of-the-art LLMs often produce hallucinations, raising serious concerns about their reliability. Prior work has explored adversarial attacks for hallucination elicitation in LLMs, but it often produces unrealistic prompts, either by inserting gibberish tokens or by altering the original meaning. As a result, these approaches offer limited insight into how hallucinations may occur in practice. While adversarial attacks in computer vision often involve realistic modifications to input images, the problem of finding realistic adversarial prompts for eliciting LLM hallucinations has remained largely underexplored. To address this gap, we propose Semantically Equivalent and Coherent Attacks (SECA) to elicit hallucinations via realistic modifications to the prompt that preserve its meaning while maintaining semantic coherence. Our contributions are threefold: (i) we formulate finding realistic attacks for hallucination elicitation as a constrained optimization problem over the input prompt space under semantic equivalence and coherence constraints; (ii) we introduce a constraint-preserving zeroth-order method to effectively search for adversarial yet feasible prompts; and (iii) we demonstrate through experiments on open-ended multiple-choice question answering tasks that SECA achieves higher attack success rates while incurring almost no constraint violations compared to existing methods. SECA highlights the sensitivity of both open-source and commercial gradient-inaccessible LLMs to realistic and plausible prompt variations. Code is available at https://github.com/Buyun-Liang/SECA.
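The released implementation is linked in the abstract; the snippet below is only a generic Python sketch of a constraint-preserving zeroth-order search in the spirit described there. The paraphrase proposer, the two constraint checks, and the hallucination score are hypothetical callables, replaced here by toy stand-ins so the loop runs end to end.

    # Hill-climb over paraphrases, keeping only feasible (semantically equivalent
    # and coherent) candidates and accepting those that raise the hallucination score.
    import random

    def seca_style_search(prompt, propose, is_equivalent, is_coherent,
                          hallucination_score, budget=50, seed=0):
        rng = random.Random(seed)
        best, best_score = prompt, hallucination_score(prompt)
        for _ in range(budget):
            cand = propose(best, rng)                      # query-only (zeroth-order) step
            if not (is_equivalent(prompt, cand) and is_coherent(cand)):
                continue                                   # reject infeasible edits
            score = hallucination_score(cand)
            if score > best_score:
                best, best_score = cand, score
        return best, best_score

    if __name__ == "__main__":
        synonyms = {"car": "automobile", "fast": "quick", "buy": "purchase"}
        propose = lambda p, rng: " ".join(
            synonyms.get(w, w) if rng.random() < 0.3 else w for w in p.split())
        is_equivalent = lambda a, b: len(a.split()) == len(b.split())  # toy check
        is_coherent = lambda p: True                                   # toy check
        hallucination_score = lambda p: sum(w in synonyms.values() for w in p.split())
        print(seca_style_search("please buy a fast car", propose,
                                is_equivalent, is_coherent, hallucination_score))

In a real attack the score would come from querying the target LLM, and the equivalence and coherence checks would be the constraints the paper formalizes.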


Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation

arXiv:2509.08825v2 Announce Type: replace Abstract: Large language models are rapidly transforming social science research by enabling the automation of labor-intensive tasks like data annotation and text analysis. However, LLM outputs vary significantly depending on the implementation choices made by researchers (e.g., model selection or prompting strategy). Such variation can introduce systematic biases and random errors, which propagate to downstream analyses and cause Type I (false positive), Type II (false negative), Type S (wrong sign), or Type M (exaggerated effect) errors. We call this phenomenon where configuration choices lead to incorrect conclusions LLM hacking. We find that intentional LLM hacking is strikingly simple. By replicating 37 data annotation tasks from 21 published social science studies, we show that, with just a handful of prompt paraphrases, virtually anything can be presented as statistically significant. Beyond intentional manipulation, our analysis of 13 million labels from 18 different LLMs across 2361 realistic hypotheses shows that there is also a high risk of accidental LLM hacking, even when following standard research practices. We find incorrect conclusions in approximately 31% of hypotheses for state-of-the-art LLMs, and in half the hypotheses for smaller language models. While higher task performance and stronger general model capabilities reduce LLM hacking risk, even highly accurate models remain susceptible. The risk of LLM hacking decreases as effect sizes increase, indicating the need for more rigorous verification of LLM-based findings near significance thresholds. We analyze 21 mitigation techniques and find that human annotations provide crucial protection against false positives. Common regression estimator correction techniques can restore valid inference but trade off Type I vs. Type II errors. We publish a list of practical recommendations to prevent LLM hacking.
