{"id":53315,"date":"2025-11-24T08:15:41","date_gmt":"2025-11-24T08:15:41","guid":{"rendered":"https:\/\/youzum.net\/nvidia-ai-releases-nemotron-elastic-12b-a-single-ai-model-that-gives-you-6b-9b-12b-variants-without-extra-training-cost\/"},"modified":"2025-11-24T08:15:41","modified_gmt":"2025-11-24T08:15:41","slug":"nvidia-ai-releases-nemotron-elastic-12b-a-single-ai-model-that-gives-you-6b-9b-12b-variants-without-extra-training-cost","status":"publish","type":"post","link":"https:\/\/youzum.net\/es\/nvidia-ai-releases-nemotron-elastic-12b-a-single-ai-model-that-gives-you-6b-9b-12b-variants-without-extra-training-cost\/","title":{"rendered":"NVIDIA AI Releases Nemotron-Elastic-12B: A Single AI Model that Gives You 6B\/9B\/12B Variants without Extra Training Cost"},"content":{"rendered":"<p>Why are AI dev teams still training and storing multiple large language models for different deployment needs when one elastic model can generate several sizes at the same cost? NVIDIA is collapsing the usual \u2018model family\u2019 stack into a single training job. The NVIDIA AI team has released <strong>Nemotron-Elastic-12B<\/strong>, a 12B-parameter reasoning model that embeds nested 9B and 6B variants in the same parameter space, so all three sizes come from one elastic checkpoint with no extra distillation runs per size.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Many in one model family<\/strong><\/h2>\n<p>Most production systems need several model sizes: a larger model for server-side workloads, a mid-size model for strong edge GPUs, and a smaller model for tight latency or power budgets. The usual pipeline trains or distills each size separately, so token cost and checkpoint storage scale with the number of variants.<\/p>\n<p>Nemotron Elastic takes a different route. It starts from the Nemotron Nano V2 12B reasoning model and trains an elastic hybrid Mamba-Attention network that exposes multiple nested submodels.
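The nested-submodel idea can be illustrated with a small sketch. This is a hypothetical illustration, not NVIDIA's actual slicing script: the point is that a smaller variant reuses a leading slice of each parent weight tensor, so extracting it is indexing, not training.

```python
import numpy as np

def slice_linear(weight, out_keep, in_keep):
    """Keep the leading rows and columns of a parent weight matrix.

    Elastic variants are prefix selections over ranked components,
    so a smaller model is recovered by indexing, not by retraining.
    """
    return weight[:out_keep, :in_keep]

# Toy "12B parent" projection: 8 output channels x 6 input channels.
parent_w = np.arange(48, dtype=np.float32).reshape(8, 6)

# A "6B-style" child keeps only the top-ranked 4 output and 3 input channels.
child_w = slice_linear(parent_w, out_keep=4, in_keep=3)

assert child_w.shape == (4, 3)
# The child shares the parent's weights, no extra storage or optimization.
assert np.array_equal(child_w, parent_w[:4, :3])
```

In the real model the ranking comes from learned importance scores, and slicing has to respect Mamba head and channel grouping, but the prefix-selection principle is the same.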
The released Nemotron-Elastic-12B checkpoint can be sliced into 9B and 6B variants, Nemotron-Elastic-9B and Nemotron-Elastic-6B, using a provided slicing script, without any extra optimization.<\/p>\n<p>All variants share weights and routing metadata, so training cost and deployment memory are tied to the largest model, not to the number of sizes in the family.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1782\" height=\"796\" data-attachment-id=\"76526\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/11\/23\/nvidia-ai-releases-nemotron-elastic-12b-a-single-ai-model-that-gives-you-6b-9b-12b-variants-without-extra-training-cost\/screenshot-2025-11-23-at-10-47-02-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.47.02-PM-1.png\" data-orig-size=\"1782,796\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-11-23 at 10.47.02\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.47.02-PM-1-300x134.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.47.02-PM-1-1024x457.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.47.02-PM-1.png\" alt=\"\" class=\"wp-image-76526\" \/><figcaption class=\"wp-element-caption\">https:\/\/arxiv.org\/pdf\/2511.16664v1<\/figcaption><\/figure>\n<\/div>\n<h2 class=\"wp-block-heading\"><strong>Hybrid Mamba Transformer with elastic masks<\/strong><\/h2>\n<p>Architecturally, Nemotron Elastic is a Mamba-2 
Transformer hybrid. The base network follows the Nemotron-H-style design, where most layers are Mamba-2-based state space blocks plus MLPs, and a small set of attention layers preserve a global receptive field.<\/p>\n<p><strong>Elasticity is implemented by turning this hybrid into a dynamic model controlled by masks<\/strong>:<\/p>\n<ul class=\"wp-block-list\">\n<li>Width: embedding channels, Mamba heads and head channels, attention heads, and FFN intermediate size can all be reduced through binary masks.<\/li>\n<li>Depth: layers can be dropped according to a learned importance ordering, with residual paths preserving signal flow.<\/li>\n<\/ul>\n<p>A router module outputs discrete configuration choices per budget. These choices are converted to masks with Gumbel-Softmax, then applied to embeddings, Mamba projections, attention projections, and FFN matrices. <strong>The research team adds several details to keep the SSM structure valid:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li>Group-aware SSM elastification that respects Mamba head and channel grouping.<\/li>\n<li>Heterogeneous MLP elastification where different layers can have distinct intermediate sizes.<\/li>\n<li>Normalized-MSE-based layer importance to decide which layers stay when depth is reduced.<\/li>\n<\/ul>\n<p>Smaller variants are always prefix selections in the ranked component lists, which makes the 6B and 9B models true nested subnetworks of the 12B parent.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img decoding=\"async\" width=\"1844\" height=\"1310\" data-attachment-id=\"76524\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/11\/23\/nvidia-ai-releases-nemotron-elastic-12b-a-single-ai-model-that-gives-you-6b-9b-12b-variants-without-extra-training-cost\/screenshot-2025-11-23-at-10-46-29-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.46.29-PM-1.png\" 
data-orig-size=\"1844,1310\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-11-23 at 10.46.29\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.46.29-PM-1-300x213.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.46.29-PM-1-1024x727.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.46.29-PM-1.png\" alt=\"\" class=\"wp-image-76524\" \/><figcaption class=\"wp-element-caption\">https:\/\/arxiv.org\/pdf\/2511.16664v1<\/figcaption><\/figure>\n<\/div>\n<h2 class=\"wp-block-heading\"><strong>Two stage training for reasoning workloads<\/strong><\/h2>\n<p>Nemotron Elastic is trained as a reasoning model with a frozen teacher. The teacher is the original Nemotron-Nano-V2-12B reasoning model. 
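As a rough sketch of this teacher-student objective (an illustrative NumPy stand-in with made-up shapes, not the actual training code), each step scores a student submodel against the frozen teacher with a KL distillation term plus a standard language modeling cross entropy:

```python
import numpy as np

def log_softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

def kd_loss(teacher_logits, student_logits):
    # KL(teacher || student) between output distributions, averaged over positions.
    t_log = log_softmax(teacher_logits)
    s_log = log_softmax(student_logits)
    return float((np.exp(t_log) * (t_log - s_log)).sum(axis=-1).mean())

def lm_loss(student_logits, targets):
    # Next-token cross entropy against ground-truth tokens.
    s_log = log_softmax(student_logits)
    return float(-s_log[np.arange(len(targets)), targets].mean())

rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=(4, 10))  # frozen teacher, 4 positions, 10-token vocab
targets = rng.integers(0, 10, size=4)      # ground-truth next tokens

# Every budget's nested submodel is scored against the same frozen teacher;
# the paper samples budgets per step, here we simply average over them.
total = 0.0
for budget in ("6B", "9B", "12B"):
    student_logits = rng.normal(size=(4, 10))  # stand-in for the sliced submodel's output
    total += kd_loss(teacher_logits, student_logits) + lm_loss(student_logits, targets)
joint_loss = total / 3
assert joint_loss > 0.0
```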
The elastic-12B student is optimized jointly for all three budgets (6B, 9B, and 12B) using knowledge distillation plus a language modeling loss.<\/p>\n<p><strong>Training runs in two stages<\/strong>:<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Stage 1:<\/strong> short context, sequence length 8192, batch size 1536, around 65B tokens, with uniform sampling over the three budgets.<\/li>\n<li><strong>Stage 2:<\/strong> extended context, sequence length 49152, batch size 512, around 45B tokens, with non-uniform sampling that favors the full 12B budget.<\/li>\n<\/ul>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img decoding=\"async\" width=\"1882\" height=\"786\" data-attachment-id=\"76512\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/11\/23\/nvidia-ai-releases-nemotron-elastic-12b-a-single-ai-model-that-gives-you-6b-9b-12b-variants-without-extra-training-cost\/screenshot-2025-11-23-at-10-26-00-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.26.00-PM-1.png\" data-orig-size=\"1882,786\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-11-23 at 10.26.00\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.26.00-PM-1-300x125.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.26.00-PM-1-1024x428.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.26.00-PM-1.png\" alt=\"\" class=\"wp-image-76512\" \/><figcaption 
class=\"wp-element-caption\">https:\/\/arxiv.org\/pdf\/2511.16664v1<\/figcaption><\/figure>\n<\/div>\n<p>The second stage is important for reasoning tasks. The above table shows that for AIME 2025, the 6B model improves from 56.88 to 68.13, a 19.8 percent relative gain, while the 9B model gains 9.7 percent and the 12B model gains 4.0 percent after extended-context training.<\/p>\n<p>Budget sampling is also tuned. In Stage 2, non-uniform sampling weights of 0.5, 0.3, and 0.2 for the 12B, 9B, and 6B budgets avoid degradation of the largest model and keep all variants competitive on MATH 500, AIME 2025, and GPQA.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Benchmark results<\/strong><\/h2>\n<p>Nemotron Elastic is evaluated on reasoning-heavy benchmarks: MATH 500, AIME 2024, AIME 2025, GPQA, LiveCodeBench v5, and MMLU Pro. The table below summarizes pass@1 accuracy.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1986\" height=\"580\" data-attachment-id=\"76515\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/11\/23\/nvidia-ai-releases-nemotron-elastic-12b-a-single-ai-model-that-gives-you-6b-9b-12b-variants-without-extra-training-cost\/screenshot-2025-11-23-at-10-28-10-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.28.10-PM-1.png\" data-orig-size=\"1986,580\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-11-23 at 10.28.10\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.28.10-PM-1-300x88.png\" 
data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.28.10-PM-1-1024x299.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.28.10-PM-1.png\" alt=\"\" class=\"wp-image-76515\" \/><figcaption class=\"wp-element-caption\">https:\/\/arxiv.org\/pdf\/2511.16664v1<\/figcaption><\/figure>\n<\/div>\n<p>The 12B elastic model matches the NanoV2-12B baseline on average, 77.41 versus 77.38, while also providing 9B and 6B variants from the same run. The 9B elastic model tracks the NanoV2-9B baseline closely, 75.95 versus 75.99. The 6B elastic model reaches 70.61, slightly below Qwen3-8B at 72.68, but still strong for its parameter count given that it is not trained separately.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Training token and memory savings<\/strong><\/h2>\n<p>Nemotron Elastic targets the cost problem directly. The table below compares the token budgets needed to derive 6B and 9B models from a 12B parent:<\/p>\n<ul class=\"wp-block-list\">\n<li>NanoV2 pretraining for 6B and 9B, 40T tokens total.<\/li>\n<li>NanoV2 Compression with Minitron SSM, 480B exploratory plus 270B final, 750B tokens.<\/li>\n<li>Nemotron Elastic, 110B tokens in a single elastic distillation run.<\/li>\n<\/ul>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"2004\" height=\"548\" data-attachment-id=\"76518\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/11\/23\/nvidia-ai-releases-nemotron-elastic-12b-a-single-ai-model-that-gives-you-6b-9b-12b-variants-without-extra-training-cost\/screenshot-2025-11-23-at-10-30-55-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.30.55-PM-1.png\" data-orig-size=\"2004,548\" data-comments-opened=\"1\" 
data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-11-23 at 10.30.55\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.30.55-PM-1-300x82.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.30.55-PM-1-1024x280.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.30.55-PM-1.png\" alt=\"\" class=\"wp-image-76518\" \/><figcaption class=\"wp-element-caption\">https:\/\/arxiv.org\/pdf\/2511.16664v1<\/figcaption><\/figure>\n<\/div>\n<p>The research team reports that this gives around a 360x reduction versus training the two extra models from scratch, and around a 7x reduction versus the compression baseline.<\/p>\n<p>Deployment memory is reduced as well. The table below shows that storing Nemotron Elastic 6B, 9B, and 12B together requires 24GB of BF16 weights, while storing NanoV2 9B plus 12B requires 42GB. 
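These reported ratios follow directly from the quoted numbers; a quick arithmetic check:

```python
# Token budgets reported for deriving the 6B and 9B variants from the 12B parent.
scratch_tokens = 40e12      # NanoV2 pretraining from scratch, 40T tokens total
compression_tokens = 750e9  # Minitron SSM compression baseline, 480B + 270B
elastic_tokens = 110e9      # single elastic distillation run

assert round(scratch_tokens / elastic_tokens) == 364         # reported as ~360x
assert round(compression_tokens / elastic_tokens, 1) == 6.8  # reported as ~7x

# BF16 checkpoint storage for the deployed family.
elastic_family_gb = 24  # 6B, 9B, and 12B share one elastic checkpoint
separate_gb = 42        # NanoV2 9B plus 12B stored separately

assert round((1 - elastic_family_gb / separate_gb) * 100) == 43  # ~43 percent saved
```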
This is a 43 percent memory reduction while also exposing an extra 6B size.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1012\" height=\"448\" data-attachment-id=\"76520\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/11\/23\/nvidia-ai-releases-nemotron-elastic-12b-a-single-ai-model-that-gives-you-6b-9b-12b-variants-without-extra-training-cost\/screenshot-2025-11-23-at-10-31-47-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.31.47-PM-1.png\" data-orig-size=\"1012,448\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-11-23 at 10.31.47\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.31.47-PM-1-300x133.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.31.47-PM-1.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/11\/Screenshot-2025-11-23-at-10.31.47-PM-1.png\" alt=\"\" class=\"wp-image-76520\" \/><figcaption class=\"wp-element-caption\">https:\/\/arxiv.org\/pdf\/2511.16664v1<\/figcaption><\/figure>\n<\/div>\n<h2 class=\"wp-block-heading\"><strong>Comparison<\/strong><\/h2>\n<figure class=\"wp-block-table is-style-stripes\">\n<table class=\"has-fixed-layout\">\n<thead>\n<tr>\n<th>System<\/th>\n<th>Sizes (B)<\/th>\n<th>Avg reasoning score*<\/th>\n<th>Tokens for 6B + 9B<\/th>\n<th>BF16 memory<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Nemotron Elastic<\/td>\n<td>6, 9, 12<\/td>\n<td>70.61 \/ 75.95 \/ 
77.41<\/td>\n<td>110B<\/td>\n<td>24GB<\/td>\n<\/tr>\n<tr>\n<td>NanoV2 Compression<\/td>\n<td>9, 12<\/td>\n<td>75.99 \/ 77.38<\/td>\n<td>750B<\/td>\n<td>42GB<\/td>\n<\/tr>\n<tr>\n<td>Qwen3<\/td>\n<td>8<\/td>\n<td>72.68<\/td>\n<td>n\/a<\/td>\n<td>n\/a<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n<h2 class=\"wp-block-heading\"><strong>Key Takeaways<\/strong><\/h2>\n<ol class=\"wp-block-list\">\n<li>Nemotron Elastic trains one 12B reasoning model that contains nested 9B and 6B variants, which can be extracted zero-shot without extra training.<\/li>\n<li>The elastic family uses a hybrid Mamba-2 and Transformer architecture plus a learned router that applies structured masks over width and depth to define each submodel.<\/li>\n<li>The approach needs 110B training tokens to derive 6B and 9B from the 12B parent, which is about 7 times fewer tokens than the 750B-token Minitron SSM compression baseline and about 360 times fewer than training the extra models from scratch.<\/li>\n<li>On reasoning benchmarks such as MATH 500, AIME 2024 and 2025, GPQA, LiveCodeBench, and MMLU Pro, the 6B, 9B, and 12B elastic models reach average scores of about 70.61, 75.95, and 77.41, which are on par with or close to the NanoV2 baselines and competitive with Qwen3-8B.<\/li>\n<li>All three sizes share one 24GB BF16 checkpoint, so deployment memory stays constant for the family, compared with around 42GB for separate NanoV2-9B and 12B models, which gives about 43 percent memory savings while adding a 6B option.<\/li>\n<\/ol>\n<h2 class=\"wp-block-heading\"><strong>Editorial Comments<\/strong><\/h2>\n<p>Nemotron-Elastic-12B is a practical step toward making reasoning model families cheaper to build and operate. One elastic checkpoint produces 6B, 9B, and 12B variants with a hybrid Mamba-2 and Transformer architecture, a learned router, and structured masks that preserve reasoning performance. 
The approach cuts token cost relative to separate compression or pretraining runs and keeps deployment memory at 24GB for all sizes, which simplifies fleet management for multi-tier LLM deployments. Overall, Nemotron-Elastic-12B turns multi-size reasoning LLMs into a single elastic systems-design problem.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out the\u00a0<strong><a href=\"https:\/\/arxiv.org\/pdf\/2511.16664v1\" target=\"_blank\" rel=\"noreferrer noopener\">Paper<\/a>\u00a0and\u00a0<a href=\"https:\/\/huggingface.co\/nvidia\/Nemotron-Elastic-12B\" target=\"_blank\" rel=\"noreferrer noopener\">Model weights<\/a><\/strong>.\u00a0Feel free to check out our\u00a0<strong><mark><a href=\"https:\/\/github.com\/Marktechpost\/AI-Tutorial-Codes-Included\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub Page for Tutorials, Codes and Notebooks<\/a><\/mark><\/strong>.\u00a0Also,\u00a0feel free to follow us on\u00a0<strong><a href=\"https:\/\/x.com\/intent\/follow?screen_name=marktechpost\" target=\"_blank\" rel=\"noreferrer noopener\"><mark>Twitter<\/mark><\/a><\/strong>\u00a0and don\u2019t forget to join our\u00a0<strong><a href=\"https:\/\/www.reddit.com\/r\/machinelearningnews\/\" target=\"_blank\" rel=\"noreferrer noopener\">100k+ ML SubReddit<\/a><\/strong>\u00a0and subscribe to\u00a0<strong><a href=\"https:\/\/www.aidevsignals.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">our Newsletter<\/a><\/strong>. Wait! 
Are you on Telegram?\u00a0<strong><a href=\"https:\/\/t.me\/machinelearningresearchnews\" target=\"_blank\" rel=\"noreferrer noopener\">You can now join us on Telegram as well.<\/a><\/strong><\/p>\n\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2025\/11\/23\/nvidia-ai-releases-nemotron-elastic-12b-a-single-ai-model-that-gives-you-6b-9b-12b-variants-without-extra-training-cost\/\">NVIDIA AI Releases Nemotron-Elastic-12B: A Single AI Model that Gives You 6B\/9B\/12B Variants without Extra Training Cost<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Why are AI dev teams still training and storing multiple large language models for different deployment needs when one elastic model can generate several sizes at the same cost? NVIDIA is collapsing the usual \u2018model family\u2019 stack into a single training job. The NVIDIA AI team has released Nemotron-Elastic-12B, a 12B-parameter reasoning model that embeds nested 9B and 6B variants in the same parameter space, so all three sizes come from one elastic checkpoint with no extra distillation runs per size. Many in one model family Most production systems need several model sizes: a larger model for server-side workloads, a mid-size model for strong edge GPUs, and a smaller model for tight latency or power budgets. The usual pipeline trains or distills each size separately, so token cost and checkpoint storage scale with the number of variants. Nemotron Elastic takes a different route. It starts from the Nemotron Nano V2 12B reasoning model and trains an elastic hybrid Mamba-Attention network that exposes multiple nested submodels. The released Nemotron-Elastic-12B checkpoint can be sliced into 9B and 6B variants, Nemotron-Elastic-9B and Nemotron-Elastic-6B, using a provided slicing script, without any extra optimization. 
On reasoning benchmarks such as MATH 500, AIME 2024 and 2025, GPQA, LiveCodeBench and MMLU Pro the 6B, 9B and 12B elastic models reach average scores of about 70.61, 75.95 and 77.41 which are on par<\/p>","protected":false},"author":2,"featured_media":53316,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"pmpro_default_level":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_pvb_checkbox_block_on_post":false,"footnotes":""},"categories":[52,5,7,1],"tags":[],"class_list":["post-53315","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-club","category-committee","category-news","category-uncategorized","pmpro-has-access"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>NVIDIA AI Releases Nemotron-Elastic-12B: A Single AI Model that Gives You 6B\/9B\/12B Variants without Extra Training Cost - YouZum<\/title>\n<meta name=\"description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta name=\"robots\" content=\"index, 
follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/youzum.net\/es\/nvidia-ai-releases-nemotron-elastic-12b-a-single-ai-model-that-gives-you-6b-9b-12b-variants-without-extra-training-cost\/\" \/>\n<meta property=\"og:locale\" content=\"es_ES\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"NVIDIA AI Releases Nemotron-Elastic-12B: A Single AI Model that Gives You 6B\/9B\/12B Variants without Extra Training Cost - YouZum\" \/>\n<meta property=\"og:description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta property=\"og:url\" content=\"https:\/\/youzum.net\/es\/nvidia-ai-releases-nemotron-elastic-12b-a-single-ai-model-that-gives-you-6b-9b-12b-variants-without-extra-training-cost\/\" \/>\n<meta property=\"og:site_name\" content=\"YouZum\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DroneAssociationTH\/\" \/>\n<meta property=\"article:published_time\" content=\"2025-11-24T08:15:41+00:00\" \/>\n<meta name=\"author\" content=\"admin NU\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Escrito por\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin NU\" \/>\n\t<meta name=\"twitter:label2\" content=\"Tiempo de lectura\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutos\" \/>\n<script type=\"application\/ld+json\" 