AI, Committee, News, Uncategorized

Restoring Exploration after Post-Training: Latent Exploration Decoding for Large Reasoning Models

arXiv:2602.01698v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) have recently achieved strong mathematical and code reasoning performance through Reinforcement Learning (RL) post-training. However, we show that modern reasoning post-training induces an unintended exploration collapse: temperature-based sampling no longer increases pass@$n$ accuracy. Empirically, the final-layer posterior of post-trained LRMs exhibits sharply reduced entropy, while the entropy of intermediate layers remains relatively high. Motivated by this entropy asymmetry, we propose Latent Exploration Decoding (LED), a depth-conditioned decoding strategy. LED aggregates intermediate posteriors via cumulative sum and selects depth configurations with maximal entropy as exploration candidates. Without additional training or parameters, LED consistently improves pass@1 and pass@16 accuracy by 0.61 and 1.03 percentage points across multiple reasoning benchmarks and models. Project page: https://GitHub.com/Xiaomi-Research/LED.
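The decoding rule lends itself to a short sketch. The snippet below is a hedged illustration rather than the authors' implementation: it assumes a logit-lens readout (applying the unembedding head to intermediate hidden states), and the candidate count `k` is a hypothetical parameter; only the cumulative-sum aggregation and max-entropy selection come from the abstract.

```python
import torch

def led_next_token(hidden_states, lm_head, k=4):
    """Hedged sketch of Latent Exploration Decoding (LED), not the released code.

    hidden_states: per-layer hidden states at the current position, each [d_model]
                   (e.g. from a forward pass with output_hidden_states=True).
    lm_head:       the model's unembedding layer, reused here as a logit lens
                   to read a posterior off every intermediate depth (assumption).
    k:             number of max-entropy depth configurations to keep as
                   exploration candidates (hypothetical parameter).
    """
    # Posterior at every depth via the unembedding head.
    posteriors = torch.stack(
        [torch.softmax(lm_head(h), dim=-1) for h in hidden_states]
    )  # [L, vocab]

    # Aggregate intermediate posteriors with a cumulative sum over depth,
    # renormalizing so each depth configuration is a valid distribution.
    cums = torch.cumsum(posteriors, dim=0)
    depth_posteriors = cums / cums.sum(dim=-1, keepdim=True)

    # Entropy of each depth configuration; keep the highest-entropy ones.
    ent = -(depth_posteriors * (depth_posteriors + 1e-12).log()).sum(dim=-1)
    candidates = torch.topk(ent, k=min(k, ent.numel())).indices

    # Sample the next token from the single best exploration candidate.
    return torch.multinomial(depth_posteriors[candidates[0]], num_samples=1)
```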

Restoring Exploration after Post-Training: Latent Exploration Decoding for Large Reasoning Models Lire l’article »

AI, Committee, News, Uncategorized

Understanding QA generation: Extracting Parametric and Contextual Knowledge with CQA for Low Resource Bangla Language

arXiv:2602.01451v1 Announce Type: new Abstract: Question-Answering (QA) models for low-resource languages like Bangla face challenges due to limited annotated data and linguistic complexity. A key issue is determining whether models rely more on pre-encoded (parametric) knowledge or contextual input during answer generation, as existing Bangla QA datasets lack the structure required for such analysis. We introduce BanglaCQA, the first Counterfactual QA dataset in Bangla, by extending a Bangla dataset while integrating counterfactual passages and answerability annotations. In addition, we propose fine-tuned pipelines for language-specific and multilingual encoder-decoder baseline models, and prompting-based pipelines for decoder-only LLMs, to disentangle parametric and contextual knowledge in both factual and counterfactual scenarios. Furthermore, we apply LLM-based and human evaluation techniques that measure answer quality based on semantic similarity. We also present a detailed analysis of how models perform across different QA settings in low-resource languages, and show that Chain-of-Thought (CoT) prompting is a uniquely effective mechanism for extracting parametric knowledge in counterfactual scenarios, particularly in decoder-only LLMs. Our work not only introduces a novel framework for analyzing knowledge sources in Bangla QA but also uncovers critical findings that open up broader directions for counterfactual reasoning in low-resource language settings.
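The parametric-versus-contextual probing setup can be made concrete with a small sketch. Everything below is a hypothetical illustration of the general protocol; the dataset's actual prompts and fields are not given in the abstract.

```python
# Hedged sketch of the three probing conditions implied by the abstract.
# All names and prompt templates here are hypothetical illustrations.

def build_probes(question: str, factual_passage: str,
                 counterfactual_passage: str) -> dict:
    return {
        # Closed-book: no passage, so an answer must come from parametric
        # (pre-encoded) knowledge.
        "parametric": f"Question: {question}\nAnswer:",
        # Factual context: parametric and contextual knowledge agree.
        "contextual_factual": (
            f"Passage: {factual_passage}\nQuestion: {question}\nAnswer:"
        ),
        # Counterfactual context: the passage contradicts world knowledge,
        # so the answer source (passage vs. memory) becomes observable.
        "contextual_counterfactual": (
            f"Passage: {counterfactual_passage}\nQuestion: {question}\nAnswer:"
        ),
    }
```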

Understanding QA generation: Extracting Parametric and Contextual Knowledge with CQA for Low Resource Bangla Language Read the article »

AI, Committee, News, Uncategorized

Geometric-disentanglement Unlearning

arXiv:2511.17100v4 Announce Type: replace-cross Abstract: Large language models (LLMs) can internalize private or harmful content, motivating unlearning that removes a forget set while preserving retained knowledge. However, forgetting updates often cause collateral degradation on retained knowledge, creating a persistent trade-off. Existing LLM unlearning methods are often heuristic, and other theoretical approaches rely on offline feature constructions that do not capture update-time forget-retain interaction in LLMs. To address this limitation, we aim to develop an LLM unlearning method that reduces the forget-retain trade-off with theoretical guarantees. We take a first-principles view by formalizing “no side effects” as local retain invariance under small parameter updates, and prove an equivalence under optimizer-induced geometry: the retain loss is locally invariant if and only if the update direction is orthogonal to the subspace spanned by retain gradients. Based on this insight, we propose Geometric-disentanglement Unlearning (GU), a lightweight and theoretically grounded projection that can be applied plug-and-play to existing gradient-based unlearning methods to mitigate forget-retain side effects. Experiments on TOFU, MUSE, and WMDP-cyber show that GU strengthens forgetting while reducing retain drift. When added to SimNPO, it achieves up to 62% improved forgetting Extraction Strength (ES) and 31% higher retain ES. We open-source our code at https://github.com/Lemutisme/Geometric-Unlearning.
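The stated equivalence (local retain invariance if and only if the update is orthogonal to the span of retain gradients) has a direct mechanical reading: project the unlearning update onto the orthogonal complement of that subspace. The sketch below is a minimal PyTorch illustration under stated assumptions (flattened gradients, a QR-derived basis), not the released GU code.

```python
import torch

def project_out_retain(forget_grad, retain_grads):
    """Hedged sketch of the geometric projection behind GU.

    forget_grad:  flattened update direction from the unlearning loss, [d].
    retain_grads: list of flattened retain-loss gradients, each [d], spanning
                  the subspace the update must stay orthogonal to.

    Returns the component of forget_grad orthogonal to span(retain_grads),
    so a small step along it leaves the retain loss locally invariant
    (to first order), matching the equivalence stated in the abstract.
    """
    R = torch.stack(retain_grads, dim=1)          # [d, k]
    # Orthonormal basis of the retain subspace via QR decomposition.
    Q, _ = torch.linalg.qr(R, mode="reduced")     # [d, k]
    # Remove the retain-subspace component: g - Q (Q^T g).
    return forget_grad - Q @ (Q.T @ forget_grad)
```

In practice such a projection would be applied at each update step, composed with whatever gradient-based unlearning objective produced `forget_grad`, which is what makes it plug-and-play.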

Geometric-disentanglement Unlearning Read the article »

AI, Committee, News, Uncategorized

SignX: Continuous Sign Recognition in Compact Pose-Rich Latent Space

arXiv:2504.16315v3 Announce Type: replace-cross Abstract: The complexity of sign language data processing brings many challenges. The current approach to recognition of ASL signs aims to translate RGB sign language videos through pose information into English-based ID Glosses, which serve to uniquely identify ASL signs. This paper proposes SignX, a novel framework for continuous sign language recognition in compact pose-rich latent space. First, we construct a unified latent representation that encodes heterogeneous pose formats (SMPLer-X, DWPose, Mediapipe, PrimeDepth, and Sapiens Segmentation) into a compact, information-dense space. Second, we train a ViT-based Video2Pose module to extract this latent representation directly from raw videos. Finally, we develop a temporal modeling and sequence refinement method that operates entirely in this latent space. This multi-stage design achieves end-to-end sign language recognition while significantly reducing computational consumption. Experimental results demonstrate that SignX achieves state-of-the-art accuracy on continuous sign language recognition.

SignX: Continuous Sign Recognition in Compact Pose-Rich Latent Space Read the article »

AI, Committee, News, Uncategorized

Microbes could extract the metal needed for cleantech

In a pine forest on Michigan’s Upper Peninsula, the only active nickel mine in the US is nearing the end of its life. At a time when carmakers want the metal for electric-vehicle batteries, nickel concentration at Eagle Mine is falling and could soon drop too low to warrant digging. But earlier this year, the mine’s owner started testing a new process that could eke out a bit more nickel.

In a pair of shipping containers recently installed at the mine’s mill, a fermentation-derived broth developed by the startup Allonnia is mixed with concentrated ore to capture and remove impurities. The process allows nickel production from lower-quality ore.

Kent Sorenson, Allonnia’s chief technology officer, says this approach could help companies continue operating sites that, like Eagle Mine, have burned through their best ore. “The low-hanging fruit is to keep mining the mines that we have,” he says.

Demand for nickel, copper, and rare earth elements is rapidly increasing amid the explosive growth of metal-intensive data centers, electric cars, and renewable energy projects. But producing these metals is becoming harder and more expensive because miners have already exploited the best resources. Like the age-old technique of rolling up the end of a toothpaste tube, Allonnia’s broth is one of a number of ways that biotechnology could help miners squeeze more metal out of aging mines, mediocre ore, or piles of waste.

The mining industry has intentionally seeded copper ore with microbes for decades. At current copper bioleaching sites, miners pile crushed copper ore into heaps and add sulfuric acid. Acid-loving bacteria like Acidithiobacillus ferrooxidans colonize the mound. A chemical the organisms produce breaks the bond between sulfur and copper molecules to liberate the metal.

Until now, beyond maintaining the acidity and blowing air into the heap, there wasn’t much more miners could do to encourage microbial growth. But Elizabeth Dennett, CEO of the startup Endolith, says the decreasing cost of genetic tools is making it possible to manage the communities of microbes in a heap more actively. “The technology we’re using now didn’t exist a few years ago,” she says.

Endolith analyzes bits of DNA and RNA in the copper-rich liquid that flows out of an ore heap to characterize the microbes living inside. Combined with a suite of chemical analyses, the information helps the company determine which microbes to sprinkle on a heap to optimize extraction.

[Image: Endolith scientists use columns filled with copper ore to test the firm’s method of actively managing microbes in the ore to increase metal extraction. Credit: Endolith]

In lab tests on ore from the mining firm BHP, Endolith’s active techniques outperformed passive bioleaching approaches. In November, the company raised $16.5 million to move from its Denver lab to heaps in active mines.

Despite these promising early results, Corale Brierley, an engineer who has worked on metal bioleaching systems since the 1970s, questions whether companies like Endolith that add microbes to ore will successfully translate their processes to commercial scale. “What guarantees are you going to give the company that those organisms will actually grow?” Brierley asks.

Big mining firms that have already optimized every hose, nut, and bolt in their process won’t be easy to convince either, says Diana Rasner, an analyst covering mining technology for the research firm Cleantech Group.

“They are acutely aware of what it takes to scale these technologies because they know the industry,” she says. “They’ll be your biggest supporters, but they’re going to be your biggest critics.”

In addition to technical challenges, Rasner points out that venture-capital-backed biotechnology startups will struggle to deliver the quick returns their investors seek. Mining companies want lots of data before adopting a new process, which could take years of testing to compile. “This is not software,” Rasner says.

Nuton, a subsidiary of the mining giant Rio Tinto, is a good example. The company has been working for decades on a copper bioleaching process that uses a blend of archaea and bacteria strains, plus some chemical additives. But it started demonstrating the technology only late last year, at a mine in Arizona.

[Image: Nuton is testing an improved bioleaching process at Gunnison Copper’s Johnson Camp mine in Arizona. Credit: Nuton]

While Endolith and Nuton use naturally occurring microbes, the startup 1849 is hoping to achieve a bigger performance boost by genetically engineering microbes. “You can do what mining companies have traditionally done,” says CEO Jai Padmakumar. “Or you can try to take the moonshot bet and engineer them. If you get that, you have a huge win.”

Genetic engineering would allow 1849 to tailor its microbes to the specific challenges facing a customer. But engineering organisms can also make them harder to grow, warns Buz Barstow, a Cornell University microbiologist who studies applications for biotechnology in mining.

Other companies are trying to avoid that trade-off by applying the products of microbial fermentation rather than live organisms. Alta Resource Technologies, which closed a $28 million investment round in December, is engineering microbes that make proteins capable of extracting and separating rare earth elements. Similarly, the startup REEgen, based in Ithaca, New York, relies on the organic acids produced by an engineered strain of Gluconobacter oxydans to extract rare earth elements from ore and from waste materials like metal recycling slag, coal ash, or old electronics. “The microbes are the manufacturing,” says CEO Alexa Schmitz, an alumna of Barstow’s lab.

To make a dent in the growing demand for metal, this new wave of biotechnologies will have to go beyond copper and gold, says Barstow. In 2024, he started a project to map out genes that could be useful for extracting and separating a wider range of metals. Even with the challenges ahead, he says, biotechnology has the potential to transform mining the way fracking changed natural gas. “Biomining is one of these areas where the need … is big enough,” he says. The challenge will be moving fast enough to keep up with growing demand.

Microbes could extract the metal needed for cleantech Read the article »

AI, Committee, News, Uncategorized

Sparse or Dense? A Mechanistic Estimation of Computation Density in Transformer-based LLMs

arXiv:2601.22795v1 Announce Type: new Abstract: Transformer-based large language models (LLMs) are composed of billions of parameters arranged in deep and wide computational graphs. Several studies on LLM efficiency optimization argue that it is possible to prune a significant portion of the parameters while only marginally impacting performance. This suggests that the computation is not uniformly distributed across the parameters. Here we introduce a technique to systematically quantify computation density in LLMs. In particular, we design a density estimator drawing on mechanistic interpretability. We experimentally test our estimator and find that: (1) contrary to what has been often assumed, LLM processing generally involves dense computation; (2) computation density is dynamic, in the sense that models shift between sparse and dense processing regimes depending on the input; (3) per-input density is significantly correlated across LLMs, suggesting that the same inputs trigger either low or high density. Investigating the factors influencing density, we observe that predicting rarer tokens requires higher density, and increasing context length often decreases the density. We believe that our computation density estimator will contribute to a better understanding of the processing at work in LLMs, challenging their symbolic interpretation.

Sparse or Dense? A Mechanistic Estimation of Computation Density in Transformer-based LLMs Read the article »

AI, Committee, News, Uncategorized

NeUQI: Near-Optimal Uniform Quantization Parameter Initialization for Low-Bit LLMs

arXiv:2505.17595v3 Announce Type: replace-cross Abstract: Large language models (LLMs) achieve impressive performance across domains but face significant challenges when deployed on consumer-grade GPUs or personal devices such as laptops, due to high memory consumption and inference costs. Post-training quantization (PTQ) of LLMs offers a promising solution that reduces their memory footprint and decoding latency. In practice, PTQ with uniform quantization representation is favored due to its efficiency and ease of deployment, as uniform quantization is widely supported by mainstream hardware and software libraries. Recent studies on low-bit uniform quantization have led to noticeable improvements in post-quantization model performance; however, they mainly focus on quantization methodologies, while the initialization of quantization parameters remains underexplored and still relies on the conventional Min-Max formula. In this work, we identify the limitations of the Min-Max formula, move beyond its constraints, and propose NeUQI, a method that efficiently determines near-optimal initialization for uniform quantization. Our NeUQI simplifies the joint optimization of the scale and zero-point by deriving the zero-point for a given scale, thereby reducing the problem to a scale-only optimization. Benefiting from the improved quantization parameters, our NeUQI consistently outperforms existing methods in the experiments with the LLaMA and Qwen families on various settings and tasks. Furthermore, when combined with a lightweight distillation strategy, NeUQI even achieves superior performance to PV-tuning, a considerably more resource-intensive method.
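The scale-only reduction is easy to illustrate. The toy sketch below is not NeUQI itself: the zero-point derived for each candidate scale is a Min-Max-style stand-in for the paper's actual derivation, and the search grid is arbitrary. It only shows how fixing a rule for the zero-point turns the joint (scale, zero-point) problem into a one-dimensional search, compared against the conventional Min-Max initialization.

```python
import torch

def minmax_init(w, bits=4):
    """Conventional Min-Max initialization for uniform quantization."""
    qmax = 2**bits - 1
    scale = (w.max() - w.min()) / qmax
    zero = torch.round(-w.min() / scale).clamp(0, qmax)
    return scale, zero

def quant_dequant(w, scale, zero, bits=4):
    """Uniform quantize-dequantize round trip."""
    qmax = 2**bits - 1
    q = torch.round(w / scale + zero).clamp(0, qmax)
    return (q - zero) * scale

def scale_only_search(w, bits=4, n_grid=200):
    """Hedged illustration of the scale-only reduction: for each candidate
    scale, derive a zero-point (here a Min-Max-style rule, a simplification
    of NeUQI's derivation) and keep the pair with the lowest MSE."""
    base_scale, _ = minmax_init(w, bits)
    best = (None, None, float("inf"))
    for f in torch.linspace(0.5, 1.2, n_grid):  # arbitrary search range
        s = base_scale * f
        z = torch.round(-w.min() / s).clamp(0, 2**bits - 1)
        err = ((w - quant_dequant(w, s, z, bits)) ** 2).mean().item()
        if err < best[2]:
            best = (s, z, err)
    return best[0], best[1]
```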

NeUQI: Near-Optimal Uniform Quantization Parameter Initialization for Low-Bit LLMs Read the article »

AI, Committee, News, Uncategorized

What’s next for EV batteries in 2026

MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.

Demand for electric vehicles and the batteries that power them has never been hotter. In 2025, EVs made up over a quarter of new vehicle sales globally, up from less than 5% in 2020. Some regions are seeing even higher uptake: In China, more than 50% of new vehicle sales last year were battery electric or plug-in hybrids. In Europe, more purely electric vehicles hit the roads in December than gas-powered ones. (The US is the notable exception here, dragging down the global average with a small sales decline from 2024.)

As EVs become increasingly common on the roads, the battery world is growing too. Looking ahead, we could soon see wider adoption of new chemistries, including some that deliver lower costs or higher performance. Meanwhile, the geopolitics of batteries are shifting, and so is the policy landscape. Here’s what’s coming next for EV batteries in 2026 and beyond.

A big opportunity for sodium-ion batteries

Lithium-ion batteries are the default chemistry used in EVs, personal devices, and even stationary storage systems on the grid today. But in a tough environment in some markets like the US, there’s growing interest in cheaper alternatives. Automakers right now care largely about batteries’ cost, regardless of performance improvements, says Kara Rodby, a technical principal at Volta Energy Technologies, a venture capital firm that focuses on energy storage technology.

Sodium-ion cells have long been held up as a potentially less expensive alternative to lithium. The batteries are limited in their energy density, so they deliver a shorter range than lithium-ion. But sodium is also more abundant, so they could be cheaper.

Sodium’s growth has been cursed, however, by the very success of lithium-based batteries, says Shirley Meng, a professor of molecular engineering at the University of Chicago. A lithium-ion battery cell cost $568 per kilowatt-hour in 2013, but that cost had fallen to just $74 per kilowatt-hour by 2025, quite the moving target for cheaper alternatives to chase.

Sodium-ion batteries currently cost about $59 per kilowatt-hour on average. That’s less expensive than the average lithium-ion battery. But if you consider only lithium iron phosphate (LFP) cells, a lower-end type of lithium-ion battery that averages $52 per kilowatt-hour, sodium is still more expensive today.

We could soon see an opening for sodium-ion batteries, though. Lithium prices have been ticking up in recent months, a shift that could soon slow or reverse the steady downward march of prices for lithium-based batteries.

Sodium-ion batteries are already being used commercially, largely for stationary storage on the grid. But we’re starting to see sodium-ion cells incorporated into vehicles, too. The Chinese companies Yadea, JMEV, and HiNa Battery have all started producing sodium-ion batteries in limited numbers for EVs, including small, short-range cars and electric scooters that don’t require a battery with high energy density. CATL, a Chinese battery company that’s the world’s largest, says it recently began producing sodium-ion cells. The company plans to launch its first EV using the chemistry by the middle of this year.

Today, both production and demand for sodium-ion batteries are heavily centered in China. That’s likely to continue, especially after a cutback in tax credits and other financial support for the battery and EV industries in the US. One of the biggest sodium-battery companies in the US, Natron, ceased operations last year after running into funding issues.

We could also see progress in sodium-ion research: Companies and researchers are developing new materials for components including the electrolyte and electrodes, so the cells could become more comparable to lower-end lithium-ion cells in terms of energy density, Meng says.

Major tests for solid-state batteries

As we enter the second half of this decade, many eyes in the battery world are on big promises and claims about solid-state batteries. These batteries could pack more energy into a smaller package by removing the liquid electrolyte, the material that ions move through when a battery is charging and discharging. With a higher energy density, they could unlock longer-range EVs.

Companies have been promising solid-state batteries for years. Toyota, for example, once planned to have them in vehicles by 2020. That timeline has been delayed several times, though the company says it’s now on track to launch the new cells in cars in 2027 or 2028.

Historically, battery makers have struggled to produce solid-state batteries at the scale needed to deliver a commercially relevant supply for EVs. There’s been progress in manufacturing techniques, though, and companies could soon actually make good on their promises, Meng says.

Factorial Energy, a US-based company making solid-state batteries, provided cells for a Mercedes test vehicle that drove over 745 miles on a single charge in a real-world test in September. The company says it plans to bring its tech to market as soon as 2027. QuantumScape, another major solid-state player in the US, is testing its cells with automotive partners and plans to have its batteries in commercial production later this decade.

Before we see true solid-state batteries, we could see hybrid technologies, often referred to as semi-solid-state batteries. These commonly use materials like gel electrolytes, reducing the liquid inside cells without removing it entirely. Many Chinese companies are looking to build semi-solid-state batteries before transitioning to entirely solid-state ones, says Evelina Stoikou, head of battery technologies and supply chains at BloombergNEF, an energy consultancy.

A global patchwork

The picture for the near future of the EV industry looks drastically different depending on where you’re standing. Last year, China overtook Japan as the country with the most global auto sales. And more than one in three EVs made in 2025 had a CATL battery in it. Simply put, China is dominating the global battery industry, and that doesn’t seem likely to change anytime soon. China’s influence outside its domestic market is growing especially quickly. CATL is expected to begin production this year at its second

What’s next for EV batteries in 2026 Read the article »

AI, Committee, News, Uncategorized

NVIDIA AI Brings Nemotron-3-Nano-30B to NVFP4 with Quantization Aware Distillation (QAD) for Efficient Reasoning Inference

NVIDIA has released Nemotron-Nano-3-30B-A3B-NVFP4, a production checkpoint that runs a 30B-parameter reasoning model in 4-bit NVFP4 format while keeping accuracy close to its BF16 baseline. The model combines a hybrid Mamba2 Transformer Mixture of Experts architecture with a Quantization Aware Distillation (QAD) recipe designed specifically for NVFP4 deployment. Overall, it is an ultra-efficient NVFP4-precision version of Nemotron-3-Nano that delivers up to 4x higher throughput on Blackwell B200.

https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4

What is Nemotron-Nano-3-30B-A3B-NVFP4?

Nemotron-Nano-3-30B-A3B-NVFP4 is a quantized version of Nemotron-3-Nano-30B-A3B-BF16, trained from scratch by the NVIDIA team as a unified reasoning and chat model. It is built as a hybrid Mamba2 Transformer MoE network:

- 30B parameters in total
- 52 layers in depth
- 23 Mamba2 and MoE layers
- 6 grouped query attention layers with 2 groups
- Each MoE layer has 128 routed experts and 1 shared expert
- 6 experts are active per token, which gives about 3.5B active parameters per token

The model is pre-trained on 25T tokens using a Warmup Stable Decay learning rate schedule with a batch size of 3072, a peak learning rate of 1e-3, and a minimum learning rate of 1e-5. Post-training follows a 3-stage pipeline:

1. Supervised fine-tuning on synthetic and curated data for code, math, science, tool calling, instruction following, and structured outputs.
2. Reinforcement learning with synchronous GRPO across multi-step tool use, multi-turn chat, and structured environments, and RLHF with a generative reward model.
3. Post-training quantization to NVFP4 with FP8 KV cache and a selective high-precision layout, followed by QAD.

The NVFP4 checkpoint keeps the attention layers and the Mamba layers that feed into them in BF16, quantizes the remaining layers to NVFP4, and uses FP8 for the KV cache.

The NVFP4 format and why it matters

NVFP4 is a 4-bit floating point format designed for both training and inference on recent NVIDIA GPUs. The main properties of NVFP4:

- Compared with FP8, NVFP4 delivers 2 to 3 times higher arithmetic throughput.
- It reduces memory usage by about 1.8 times for weights and activations.
- It extends MXFP4 by reducing the block size from 32 to 16 and introduces two-level scaling.
- The two-level scaling uses E4M3-FP8 scales per block and an FP32 scale per tensor.

The smaller block size allows the quantizer to adapt to local statistics, and the dual scaling increases dynamic range while keeping quantization error low. For very large LLMs, simple post-training quantization (PTQ) to NVFP4 already gives decent accuracy across benchmarks. For smaller models, especially those with heavy post-training pipelines, the research team notes that PTQ causes non-negligible accuracy drops, which motivates a training-based recovery method.
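To make the two-level scaling concrete, here is a hedged PyTorch emulation of an NVFP4-style quantize-dequantize round trip. The E2M1 value grid, the E4M3 round-trip of block scales, and the per-tensor scale heuristic follow the description above, but the real format's exact scale-selection rules may differ.

```python
import torch

# Representable magnitudes of the FP4 E2M1 element format (plus sign bit).
FP4_GRID = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def nvfp4_quant_dequant(w, block=16):
    """Hedged emulation of NVFP4 two-level scaling on a 1-D float tensor.

    Per the article: blocks of 16 values share an E4M3-FP8 scale, and the
    whole tensor carries one FP32 scale. Assumes w.numel() % block == 0.
    """
    w = w.reshape(-1, block)
    # Second-level FP32 per-tensor scale: map the global amax into the
    # range reachable by (E4M3 block scale) * (FP4 max). 448 is the E4M3
    # max value, 6 the FP4 E2M1 max (heuristic choice for this sketch).
    tensor_scale = (w.abs().max() / (448.0 * 6.0)).clamp_min(1e-12)
    # First-level per-block scales, rounded through FP8 E4M3.
    block_scale = w.abs().amax(dim=1, keepdim=True) / (6.0 * tensor_scale)
    block_scale = block_scale.to(torch.float8_e4m3fn).to(torch.float32)
    eff = (block_scale * tensor_scale).clamp_min(1e-12)  # effective scale
    # Snap each scaled value to the nearest FP4 grid point.
    x = (w / eff).unsqueeze(-1)                          # [n, block, 1]
    idx = (x.abs() - FP4_GRID).abs().argmin(dim=-1)      # nearest magnitude
    q = FP4_GRID[idx] * x.squeeze(-1).sign()
    return (q * eff).reshape(-1)
```

The block-size-16 grouping is what lets each scale track local statistics, while the FP32 per-tensor scale keeps the FP8 block scales inside their representable range.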
From QAT to QAD

Standard Quantization Aware Training (QAT) inserts pseudo quantization into the forward pass and reuses the original task loss, such as next-token cross entropy. This works well for convolutional networks, but the research team lists 2 main issues for modern LLMs:

- Complex multi-stage post-training pipelines with SFT, RL, and model merging are hard to reproduce.
- Original training data for open models is often unavailable in public form.

Quantization Aware Distillation (QAD) changes the objective instead of the full pipeline. A frozen BF16 model acts as the teacher and the NVFP4 model as the student. Training minimizes the KL divergence between their output token distributions, not the original supervised or RL objective. The research team highlights 3 properties of QAD:

- It aligns the quantized model with the high-precision teacher more accurately than QAT.
- It stays stable even when the teacher has already gone through several stages, such as supervised fine-tuning, reinforcement learning, and model merging, because QAD only tries to match the final teacher behavior.
- It works with partial, synthetic, or filtered data, because it only needs input text to query the teacher and student, not the original labels or reward models.

Benchmarks on Nemotron-3-Nano-30B

Nemotron-3-Nano-30B-A3B is one of the RL-heavy models in the QAD research. The table in the report compares accuracy on AA-LCR, AIME25, GPQA-D, LiveCodeBench-v5, and SciCode for BF16, NVFP4-PTQ, NVFP4-QAT, and NVFP4-QAD.

https://research.nvidia.com/labs/nemotron/files/NVFP4-QAD-Report.pdf

Key Takeaways

- Nemotron-3-Nano-30B-A3B-NVFP4 is a 30B-parameter hybrid Mamba2 Transformer MoE model that runs in 4-bit NVFP4 with an FP8 KV cache and a small set of BF16 layers preserved for stability, while keeping about 3.5B active parameters per token and supporting context windows up to 1M tokens.
- NVFP4 is a 4-bit floating point format with block size 16 and two-level scaling, using E4M3-FP8 per-block scales and an FP32 per-tensor scale, which gives about 2 to 3 times higher arithmetic throughput and about 1.8 times lower memory cost than FP8 for weights and activations.
- Quantization Aware Distillation (QAD) replaces the original task loss with KL divergence to a frozen BF16 teacher, so the NVFP4 student directly matches the teacher’s output distribution without replaying the full SFT, RL, and model-merge pipeline or needing the original reward models.
- Using the new Quantization Aware Distillation method, the NVFP4 version achieves up to 99.4% of BF16 accuracy.
- On AA-LCR, AIME25, GPQA-D, LiveCodeBench, and SciCode, NVFP4-PTQ shows noticeable accuracy loss and NVFP4-QAT degrades further, while NVFP4-QAD recovers performance to near-BF16 levels, reducing the gap to only a few points across these reasoning and coding benchmarks.

Check out the paper and model weights for details.
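Conceptually, a QAD update is a single KL-matching step against the frozen teacher. The sketch below assumes Hugging-Face-style models that return `.logits`; the temperature and reduction are illustrative choices, not taken from the report.

```python
import torch
import torch.nn.functional as F

def qad_step(student, teacher, input_ids, optimizer, temperature=1.0):
    """Minimal sketch of one QAD step as described in the article: the
    frozen BF16 teacher defines the target token distribution and the
    quantized student minimizes KL divergence to it. Only input text is
    needed; no labels or reward models."""
    with torch.no_grad():
        t_logits = teacher(input_ids).logits / temperature
    s_logits = student(input_ids).logits / temperature

    # KL(teacher || student) over the vocabulary, averaged over the batch.
    loss = F.kl_div(
        F.log_softmax(s_logits, dim=-1),
        F.log_softmax(t_logits, dim=-1),
        log_target=True,
        reduction="batchmean",
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the loss only queries both models on input text, the same loop works with partial, synthetic, or filtered data, which is the practical point the article makes about QAD.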

NVIDIA AI Brings Nemotron-3-Nano-30B to NVFP4 with Quantization Aware Distillation (QAD) for Efficient Reasoning Inference Read the article »
