YouZum

News

Analytic Subspace Routing: How Recursive Least Squares Works in Continual Learning of Large Language Model

arXiv:2503.13575v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) possess broad capabilities that allow them to handle diverse language-related tasks. However, fine-tuning an LLM diminishes these general skills, and continual fine-tuning further causes severe degradation of accumulated knowledge. Recently, Continual Learning (CL) for LLMs has emerged, which aims to continually adapt LLMs to new tasks while maintaining previously learned knowledge and inheriting general skills. Existing techniques either replay previous data, incurring extra computational costs, or rely on a single parameter-efficient module to learn the downstream task, constraining the absorption of new knowledge through interference between different tasks. To address these issues, this paper proposes Analytic Subspace Routing (ASR). For each task, we isolate learning within a subspace of deep layers' features via low-rank adaptation, eliminating knowledge interference between different tasks. Additionally, we propose an analytic routing mechanism to properly utilize knowledge learned in different subspaces. Our approach employs Recursive Least Squares to train a multi-task router model, allowing the router to dynamically adapt to incoming data without requiring access to historical data. The router effectively assigns the current task to an appropriate subspace and has a non-forgetting property for previously learned tasks, with a solid theoretical guarantee. Experimental results demonstrate that our method achieves near-perfect retention of prior knowledge while seamlessly integrating new information, effectively overcoming the core limitations of existing methods. Our code will be released after acceptance.
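
The analytic routing component can be illustrated with a textbook recursive-least-squares (RLS) update, which refines a linear router from streaming features without revisiting historical data. The sketch below is a minimal illustration of that idea, not the authors' released code; the class, dimensions, and the argmax routing to a task subspace are assumptions made for illustration.

```python
import numpy as np

class RLSRouter:
    """Minimal recursive-least-squares router sketch (hypothetical, not the ASR release)."""

    def __init__(self, feat_dim: int, num_tasks: int, reg: float = 1.0):
        self.W = np.zeros((feat_dim, num_tasks))   # linear routing weights
        self.P = np.eye(feat_dim) / reg            # running inverse of the regularized correlation matrix

    def update(self, x: np.ndarray, y: np.ndarray) -> None:
        """One RLS step on a single (feature, one-hot task label) pair; no replay buffer needed."""
        x = x.reshape(-1, 1)                       # column vector, shape (feat_dim, 1)
        Px = self.P @ x
        k = Px / (1.0 + x.T @ Px)                  # gain vector
        self.P -= k @ Px.T                         # Sherman-Morrison update of the inverse
        err = y.reshape(1, -1) - x.T @ self.W      # residual against the one-hot task label
        self.W += k @ err                          # closed-form weight correction

    def route(self, x: np.ndarray) -> int:
        """Pick the subspace (e.g., which low-rank adapter to activate) for a new sample."""
        return int(np.argmax(x @ self.W))

# Toy usage: stream features task by task, never storing past samples.
router = RLSRouter(feat_dim=16, num_tasks=3)
rng = np.random.default_rng(0)
for task_id in range(3):
    for _ in range(200):
        x = rng.normal(size=16) + task_id          # crude task-dependent features
        router.update(x, np.eye(3)[task_id])
print(router.route(rng.normal(size=16) + 2))       # expected to route to subspace 2
```

Because each sample is folded into the running matrix P, the router adapts online without a replay buffer, which mirrors the non-forgetting property the abstract attributes to the RLS-trained router.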

Remember Past, Anticipate Future: Learning Continual Multimodal Misinformation Detectors

arXiv:2507.05939v1 Announce Type: new Abstract: Nowadays, misinformation articles, especially multimodal ones, are widely spread on social media platforms and cause serious negative effects. To control their propagation, Multimodal Misinformation Detection (MMD) has become an active topic in the community for automatically identifying misinformation. Previous MMD methods focus on supervising detectors with collected offline data. However, in real-world scenarios, new events continually emerge, making MMD models trained on offline data consistently outdated and ineffective. To address this issue, training MMD models on online data streams is an alternative, inducing an emerging task named continual MMD. Unfortunately, it is hindered by two major challenges. First, training on new data consistently decreases detection performance on past data, a problem known as past knowledge forgetting. Second, the social environment constantly evolves over time, affecting generalization to future data. To alleviate these challenges, we propose to remember past knowledge by isolating interference between event-specific parameters with a Dirichlet process-based mixture-of-experts structure, and to anticipate future environmental distributions by learning a continuous-time dynamics model. Accordingly, we derive a new continual MMD method, DAEDCMD. Extensive experiments demonstrate that DAEDCMD consistently and significantly outperforms the compared methods, including six MMD baselines and three continual learning methods.
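
The Dirichlet process-based mixture-of-experts idea can be pictured as a Chinese-restaurant-process style assignment: each incoming event either reuses an existing expert or opens a new one, which keeps event-specific parameters isolated. The toy sketch below is a hedged illustration under assumed names and a Gaussian fit score, not the DAEDCMD implementation.

```python
import numpy as np

def assign_expert(event_feat, expert_means, expert_counts, alpha=1.0, sigma=1.0):
    """Toy CRP-style assignment: reuse an existing expert or open a new one.

    event_feat: feature vector of the incoming (multimodal) event.
    expert_means / expert_counts: running mean feature and usage count per expert.
    alpha: concentration parameter (larger values spawn new experts more often).
    """
    scores = []
    for mean, count in zip(expert_means, expert_counts):
        # prior mass from past usage times a Gaussian fit of the event under this expert
        fit = np.exp(-np.sum((event_feat - mean) ** 2) / (2 * sigma ** 2))
        scores.append(count * fit)
    scores.append(alpha)                           # mass reserved for a brand-new expert
    probs = np.array(scores) / np.sum(scores)
    return int(np.random.default_rng(0).choice(len(probs), p=probs))

# A far-away event is almost certain to open a new expert (index 2 here),
# so parameters learned for earlier events are left untouched.
choice = assign_expert(np.array([5.0, 5.0]),
                       expert_means=[np.zeros(2), np.ones(2)],
                       expert_counts=[10, 4])
print(choice)
```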

Narrowing the Gap: Supervised Fine-Tuning of Open-Source LLMs as a Viable Alternative to Proprietary Models for Pedagogical Tools

arXiv:2507.05305v1 Announce Type: cross Abstract: Frontier large language models (LLMs) like ChatGPT and Gemini can decipher cryptic compiler errors for novice programmers, but their computational scale, cost, and tendency to over-assist make them problematic for widespread pedagogical adoption. This work demonstrates that smaller, specialised language models, enhanced via Supervised Fine-Tuning (SFT), present a more viable alternative for educational tools. We utilise a new dataset of 40,000 C compiler error explanations, derived from real student-generated errors in introductory programming (CS1/2) courses, which we used to fine-tune three open-source models: Qwen3-4B, Llama-3.1-8B, and Qwen3-32B. We performed a dual evaluation, combining expert human reviews with a large-scale automated analysis of 8,000 responses using a validated LLM-as-judge ensemble. Our results show that SFT significantly boosts the pedagogical quality of smaller models, achieving performance comparable to much larger models. We analyse the trade-offs between model size and quality, confirming that fine-tuning compact, efficient models on high-quality, domain-specific data is a potent strategy for creating specialised models to drive educational tools. We provide a replicable methodology to foster broader access to generative AI capabilities in educational contexts.
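
Supervised fine-tuning of this kind usually starts from instruction/response records. The sketch below shows one plausible way to shape compiler-error data into JSONL for an SFT trainer; the field names, prompt wording, and helper function are assumptions, not the paper's actual pipeline.

```python
import json

def make_sft_record(source_code: str, compiler_error: str, explanation: str) -> dict:
    """Pair a C compiler error with a pedagogical explanation as one instruction-tuning example."""
    prompt = (
        "You are a patient CS1 tutor. Explain the following C compiler error to a novice "
        "programmer without giving away the full solution.\n\n"
        f"Code:\n{source_code}\n\nCompiler output:\n{compiler_error}"
    )
    return {"prompt": prompt, "completion": explanation}

records = [
    make_sft_record(
        source_code='int main() { printf("hi")\n    return 0; }',
        compiler_error="error: expected ';' before 'return'",
        explanation="The compiler reached 'return' while still expecting the previous statement "
                    "to end, so check the line above for a missing semicolon.",
    )
]

# JSONL is the input format most supervised fine-tuning trainers accept.
with open("compiler_errors_sft.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```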

Beyond Weaponization: NLP Security for Medium and Lower-Resourced Languages in Their Own Right

arXiv:2507.03473v1 Announce Type: new Abstract: Despite mounting evidence that multilinguality can be easily weaponized against language models (LMs), works across NLP Security remain overwhelmingly English-centric. In terms of securing LMs, the NLP norm of “English first” collides with standard procedure in cybersecurity, whereby practitioners are expected to anticipate and prepare for worst-case outcomes. To mitigate worst-case outcomes in NLP Security, researchers must be willing to engage with the weakest links in LM security: lower-resourced languages. Accordingly, this work examines the security of LMs for lower- and medium-resourced languages. We extend existing adversarial attacks for up to 70 languages to evaluate the security of monolingual and multilingual LMs for these languages. Through our analysis, we find that monolingual models are often too small in total number of parameters to ensure sound security, and that while multilinguality is helpful, it does not always guarantee improved security either. Ultimately, these findings highlight important considerations for more secure deployment of LMs, for communities of lower-resourced languages.

Towards Understanding the Cognitive Habits of Large Reasoning Models

arXiv:2506.21571v2 Announce Type: replace Abstract: Large Reasoning Models (LRMs), which autonomously produce a reasoning Chain of Thought (CoT) before producing final responses, offer a promising approach to interpreting and monitoring model behaviors. Inspired by the observation that certain CoT patterns — e.g., “Wait, did I miss anything?” — consistently emerge across tasks, we explore whether LRMs exhibit human-like cognitive habits. Building on Habits of Mind, a well-established framework of cognitive habits associated with successful human problem-solving, we introduce CogTest, a principled benchmark designed to evaluate LRMs’ cognitive habits. CogTest includes 16 cognitive habits, each instantiated with 25 diverse tasks, and employs an evidence-first extraction method to ensure reliable habit identification. With CogTest, we conduct a comprehensive evaluation of 16 widely used LLMs (13 LRMs and 3 non-reasoning ones). Our findings reveal that LRMs, unlike conventional LLMs, not only exhibit human-like habits but also adaptively deploy them according to different tasks. Finer-grained analyses further uncover patterns of similarity and difference in LRMs’ cognitive habit profiles, particularly certain inter-family similarity (e.g., Qwen-3 models and DeepSeek-R1). Extending the study to safety-related tasks, we observe that certain habits, such as Taking Responsible Risks, are strongly associated with the generation of harmful responses. These findings suggest that studying persistent behavioral patterns in LRMs’ CoTs is a valuable step toward deeper understanding of LLM misbehavior. The code is available at: https://github.com/jianshuod/CogTest.

Self-Consistency Preference Optimization

arXiv:2411.04109v3 Announce Type: replace Abstract: Self-alignment, whereby models learn to improve themselves without human annotation, is a rapidly growing research area. However, existing techniques often fail to improve complex reasoning tasks due to the difficulty of assigning correct rewards. An orthogonal approach that is known to improve correctness is self-consistency, a method applied at inference time based on multiple sampling in order to find the most consistent answer. In this work, we extend the self-consistency concept to help train models. We thus introduce self-consistency preference optimization (ScPO), which iteratively trains consistent answers to be preferred over inconsistent ones on unsupervised new problems. We show ScPO leads to large improvements over conventional reward model training on reasoning tasks such as GSM8K and MATH, closing the gap with supervised training with gold answers or preferences, and that combining ScPO with standard supervised learning improves results even further. On ZebraLogic, ScPO finetunes Llama-3 8B to be superior to Llama-3 70B, Gemma-2 27B, and Claude-3 Haiku.
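
The core recipe, preferring the most self-consistent answer over the least consistent one on unlabeled problems, can be sketched as follows. The model call is stubbed out and all names are hypothetical; this illustrates the idea rather than the paper's training code.

```python
from collections import Counter

def build_consistency_preference(problem, sample_answer, n_samples=16):
    """Sample several answers, then form a (chosen, rejected) pair from vote counts.

    sample_answer(problem) -> str stands in for one stochastic model generation
    (e.g., temperature sampling followed by final-answer extraction).
    """
    answers = [sample_answer(problem) for _ in range(n_samples)]
    votes = Counter(answers)
    chosen, chosen_votes = votes.most_common(1)[0]        # most consistent answer
    rejected, rejected_votes = votes.most_common()[-1]    # least consistent answer
    margin = (chosen_votes - rejected_votes) / n_samples  # vote margin can weight the preference loss
    return {"prompt": problem, "chosen": chosen, "rejected": rejected, "weight": margin}

# Deterministic stub in place of an LLM call, just to show the plumbing.
pair = build_consistency_preference("What is 17 * 24?", sample_answer=lambda p: "408")
print(pair["chosen"], pair["weight"])
```

Pairs built this way can then be fed to a standard preference-optimization step, which is the iterative training loop the abstract describes.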

Demystifying ChatGPT: How It Masters Genre Recognition

arXiv:2507.03875v1 Announce Type: new Abstract: The introduction of ChatGPT has garnered significant attention within the NLP community and beyond. Previous studies have demonstrated ChatGPT’s substantial advancements across various downstream NLP tasks, highlighting its adaptability and potential to revolutionize language-related applications. However, its capabilities and limitations in genre prediction remain unclear. This work analyzes three Large Language Models (LLMs) using the MovieLens-100K dataset to assess their genre prediction capabilities. Our findings show that ChatGPT, without fine-tuning, outperformed other LLMs, and fine-tuned ChatGPT performed best overall. We set up zero-shot and few-shot prompts using audio transcripts/subtitles from movie trailers in the MovieLens-100K dataset, covering 1682 movies of 18 genres, where each movie can have multiple genres. Additionally, we extended our study by extracting IMDb movie posters to utilize a Vision Language Model (VLM) with prompts for poster information. This fine-grained information was used to enhance existing LLM prompts. In conclusion, our study reveals ChatGPT’s remarkable genre prediction capabilities, surpassing other language models. The integration of VLM further enhances our findings, showcasing ChatGPT’s potential for content-related applications by incorporating visual information from movie posters.
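
A zero-shot, multi-label genre prompt of the kind described might look like the sketch below; the exact wording, the truncated genre list, and the optional poster-description field are illustrative assumptions rather than the paper's prompts.

```python
from typing import Optional

GENRES = ["Action", "Comedy", "Drama", "Horror", "Romance", "Sci-Fi"]  # subset of the 18 MovieLens genres

def build_genre_prompt(trailer_transcript: str, poster_description: Optional[str] = None) -> str:
    """Zero-shot, multi-label genre classification prompt for a chat LLM."""
    context = f"Trailer transcript:\n{trailer_transcript.strip()}"
    if poster_description:                       # optional VLM-derived poster text
        context += f"\n\nPoster description:\n{poster_description.strip()}"
    return (
        "Classify the movie into one or more of the following genres: "
        + ", ".join(GENRES)
        + ".\nAnswer with a comma-separated list of genres only.\n\n"
        + context
    )

print(build_genre_prompt("A detective races against time to stop a rogue AI."))
```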

Improving Social Determinants of Health Documentation in French EHRs Using Large Language Models

arXiv:2507.03433v1 Announce Type: new Abstract: Social determinants of health (SDoH) significantly influence health outcomes, shaping disease progression, treatment adherence, and health disparities. However, their documentation in structured electronic health records (EHRs) is often incomplete or missing. This study presents an approach based on large language models (LLMs) for extracting 13 SDoH categories from French clinical notes. We trained Flan-T5-Large on annotated social history sections from clinical notes at Nantes University Hospital, France. We evaluated the model at two levels: (i) identification of SDoH categories and associated values, and (ii) extraction of detailed SDoH with associated temporal and quantitative information. Model performance was assessed across four datasets, including two that we publicly release as open resources. The model achieved strong performance for identifying well-documented categories such as living condition, marital status, descendants, job, tobacco, and alcohol use (F1 score > 0.80). Performance was lower for categories with limited training data or highly variable expressions, such as employment status, housing, physical activity, income, and education. Our model identified 95.8% of patients with at least one SDoH, compared to 2.8% for ICD-10 codes from structured EHR data. Our error analysis showed that performance limitations were linked to annotation inconsistencies, reliance on an English-centric tokenizer, and reduced generalizability due to the model being trained on social history sections only. These results demonstrate the effectiveness of NLP in improving the completeness of real-world SDoH data in a non-English EHR system.
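
Framing SDoH extraction as sequence-to-sequence generation with Flan-T5 could look roughly like this minimal inference sketch. It loads the public google/flan-t5-large checkpoint via Hugging Face Transformers; the prompt template and the 'category: value' output format are assumptions, and the study's fine-tuned weights are not what is loaded here.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Public base checkpoint; the study fine-tunes Flan-T5-Large on annotated social-history sections.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-large")

note = ("Vit seul, retraité, ancien fumeur (arrêt il y a 5 ans), "
        "consommation d'alcool occasionnelle.")
prompt = ("Extract the social determinants of health mentioned in this French clinical note "
          "as 'category: value' pairs:\n" + note)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```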

What Is Context Engineering in AI? Techniques, Use Cases, and Why It Matters

Introduction: What is Context Engineering?

Context engineering refers to the discipline of designing, organizing, and manipulating the context that is fed into large language models (LLMs) to optimize their performance. Rather than fine-tuning the model weights or architectures, context engineering focuses on the input: the prompts, system instructions, retrieved knowledge, formatting, and even the ordering of information.

Context engineering isn't about crafting better prompts. It's about building systems that deliver the right context, exactly when it's needed. Imagine an AI assistant asked to write a performance review.
- Poor context: it only sees the instruction. The result is vague, generic feedback that lacks insight.
- Rich context: it sees the instruction plus the employee's goals, past reviews, project outcomes, peer feedback, and manager notes. The result? A nuanced, data-backed review that feels informed and personalized, because it is.

This emerging practice is gaining traction due to the increasing reliance on prompt-based models like GPT-4, Claude, and Mistral. The performance of these models is often less about their size and more about the quality of the context they receive. In this sense, context engineering is the equivalent of prompt programming for the era of intelligent agents and retrieval-augmented generation (RAG).

Why Do We Need Context Engineering?

- Token Efficiency: With context windows expanding but still bounded (e.g., 128K in GPT-4-Turbo), efficient context management becomes crucial. Redundant or poorly structured context wastes valuable tokens.
- Precision and Relevance: LLMs are sensitive to noise. The more targeted and logically arranged the prompt, the higher the likelihood of accurate output.
- Retrieval-Augmented Generation (RAG): In RAG systems, external data is fetched in real time. Context engineering helps decide what to retrieve, how to chunk it, and how to present it.
- Agentic Workflows: When using tools like LangChain or OpenAgents, autonomous agents rely on context to maintain memory, goals, and tool usage. Bad context leads to failures in planning or hallucination.
- Domain-Specific Adaptation: Fine-tuning is expensive. Structuring better prompts or building retrieval pipelines lets models perform well on specialized tasks with zero-shot or few-shot learning.

Key Techniques in Context Engineering

Several methodologies and practices are shaping the field:

1. System Prompt Optimization
The system prompt is foundational: it defines the LLM's behavior and style. Techniques include:
- Role assignment (e.g., "You are a data science tutor")
- Instructional framing (e.g., "Think step-by-step")
- Constraint imposition (e.g., "Only output JSON")

2. Prompt Composition and Chaining
LangChain popularized the use of prompt templates and chains to modularize prompting. Chaining allows splitting tasks across prompts, for example decomposing a question, retrieving evidence, then answering.

3. Context Compression
With limited context windows, one can:
- Use summarization models to compress previous conversation
- Embed and cluster similar content to remove redundancy
- Apply structured formats (like tables) instead of verbose prose

4. Dynamic Retrieval and Routing
RAG pipelines (like those in LlamaIndex and LangChain) retrieve documents from vector stores based on user intent; a minimal retrieval-and-packing sketch follows this section. Advanced setups include:
- Query rephrasing or expansion before retrieval
- Multi-vector routing to choose different sources or retrievers
- Context re-ranking based on relevance and recency
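
A minimal sketch of the retrieval-and-packing step behind techniques 3 and 4: rank candidate chunks against the query, then fill a fixed token budget with the best ones. It uses a toy bag-of-words similarity and hypothetical helper names rather than any particular vector store or framework.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real pipeline would call a sentence-embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def assemble_context(query: str, chunks: list, token_budget: int = 200) -> str:
    """Rank chunks by relevance to the query, then pack them under an approximate token budget."""
    ranked = sorted(chunks, key=lambda c: cosine(embed(query), embed(c)), reverse=True)
    selected, used = [], 0
    for chunk in ranked:
        cost = len(chunk.split())               # crude token estimate
        if used + cost > token_budget:
            continue                            # skip chunks that would blow the budget
        selected.append(chunk)
        used += cost
    return "\n\n".join(selected)

docs = [
    "Refund policy: customers may return items within 30 days.",
    "Shipping times vary between 3 and 7 business days.",
    "Our office dog is named Biscuit.",
]
print(assemble_context("How long do I have to return a purchase?", docs))
```

In practice, embed would be a real embedding model, the ranking might also weigh recency, and the budget would come from the target model's context window, but the select-and-pack logic stays the same.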

5. Memory Engineering
Short-term memory (what is in the prompt) and long-term memory (retrievable history) need alignment. Techniques include:
- Context replay (injecting past relevant interactions)
- Memory summarization
- Intent-aware memory selection

6. Tool-Augmented Context
In agent-based systems, tool usage is context-aware. Practices include:
- Tool description formatting
- Tool history summarization
- Observations passed between steps

Context Engineering vs. Prompt Engineering

While related, context engineering is broader and more system-level. Prompt engineering is typically about static, handcrafted input strings, whereas context engineering encompasses dynamic context construction using embeddings, memory, chaining, and retrieval. As Simon Willison noted, "Context engineering is what we do instead of fine-tuning."

Real-World Applications

- Customer Support Agents: feeding prior ticket summaries, customer profile data, and KB docs.
- Code Assistants: injecting repo-specific documentation, previous commits, and function usage.
- Legal Document Search: context-aware querying with case history and precedents.
- Education: personalized tutoring agents with memory of learner behavior and goals.

Challenges in Context Engineering

Despite its promise, several pain points remain:
- Latency: retrieval and formatting steps introduce overhead.
- Ranking Quality: poor retrieval hurts downstream generation.
- Token Budgeting: choosing what to include or exclude is non-trivial.
- Tool Interoperability: mixing tools (LangChain, LlamaIndex, custom retrievers) adds complexity.

Emerging Best Practices

- Combine structured (JSON, tables) and unstructured text for better parsing.
- Limit each context injection to a single logical unit (e.g., one document or conversation summary).
- Use metadata (timestamps, authorship) for better sorting and scoring.
- Log, trace, and audit context injections to improve over time.

The Future of Context Engineering

Several trends suggest that context engineering will be foundational in LLM pipelines:
- Model-Aware Context Adaptation: future models may dynamically request the type or format of context they need.
- Self-Reflective Agents: agents that audit their context, revise their own memory, and flag hallucination risk.
- Standardization: just as JSON became a universal data interchange format, context templates may become standardized for agents and tools.

As Andrej Karpathy hinted in a recent post, "Context is the new weight update." Rather than retraining models, we are now programming them via their context, making context engineering the dominant software interface in the LLM era.

Conclusion

Context engineering is no longer optional: it is central to unlocking the full capabilities of modern language models. As toolkits like LangChain and LlamaIndex mature and agentic workflows proliferate, mastering context construction becomes as important as model selection. Whether you are building a retrieval system, a coding agent, or a personalized tutor, how you structure the model's context will increasingly define its intelligence.
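
To make the preceding sections concrete, here is one simplified way the pieces can come together: a system prompt, a few selected long-term memories, retrieved documents, and tool descriptions assembled into a single structured context payload for an agent turn. All field names are illustrative; this is not the API of LangChain, LlamaIndex, or any other framework.

```python
import json
from datetime import datetime, timezone

def build_context_payload(system_prompt, memories, retrieved_docs, tools, user_message, max_memories=3):
    """Assemble one structured context payload for a single agent turn.

    memories: list of (timestamp, summary) pairs from long-term storage.
    retrieved_docs: already-ranked document chunks from the retrieval step.
    tools: mapping of tool name to a one-line description.
    """
    recent = sorted(memories, key=lambda m: m[0], reverse=True)[:max_memories]  # crude recency-based selection
    return {
        "system": system_prompt,
        "memory": [{"when": ts.isoformat(), "summary": text} for ts, text in recent],
        "documents": retrieved_docs,
        "tools": [{"name": name, "description": desc} for name, desc in tools.items()],
        "user": user_message,
        "assembled_at": datetime.now(timezone.utc).isoformat(),  # metadata for later auditing
    }

payload = build_context_payload(
    system_prompt="You are a support agent. Cite the documents you use and only output JSON.",
    memories=[(datetime(2025, 6, 1, tzinfo=timezone.utc),
               "Customer previously asked about a late delivery.")],
    retrieved_docs=["Refund policy: customers may return items within 30 days."],
    tools={"create_ticket": "Open a support ticket with a summary and a priority."},
    user_message="I want to return my order from last week.",
)
print(json.dumps(payload, indent=2))
```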

Sources:
https://x.com/tobi/status/1935533422589399127
https://x.com/karpathy/status/1937902205765607626
https://blog.langchain.com/the-rise-of-context-engineering/
https://rlancemartin.github.io/2025/06/23/context_engineering/
https://www.philschmid.de/context-engineering
https://blog.langchain.com/context-engineering-for-agents/
https://www.llamaindex.ai/blog/context-engineering-what-it-is-and-techniques-to-consider
