{"id":62220,"date":"2026-01-05T10:42:33","date_gmt":"2026-01-05T10:42:33","guid":{"rendered":"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/"},"modified":"2026-01-05T10:42:33","modified_gmt":"2026-01-05T10:42:33","slug":"llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression","status":"publish","type":"post","link":"https:\/\/youzum.net\/ja\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/","title":{"rendered":"LLM-Pruning Collection: A JAX Based Repo For Structured And Unstructured LLM Compression"},"content":{"rendered":"<p>Zlab Princeton researchers have released <strong><a href=\"https:\/\/github.com\/zlab-princeton\/llm-pruning-collection\" target=\"_blank\" rel=\"noreferrer noopener\">LLM-Pruning Collection<\/a><\/strong>, a JAX-based repository that consolidates major pruning algorithms for large language models into a single, reproducible framework. It targets one concrete goal: making it easy to compare block-level, layer-level and weight-level pruning methods under a consistent training and evaluation stack on both GPUs and TPUs.<\/p>\n<h3 class=\"wp-block-heading\"><strong>What Does LLM-Pruning Collection Contain?<\/strong><\/h3>\n<p>It is described as a JAX-based repo for LLM pruning. 
<strong>It is organized into three main directories:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li><code>pruning<\/code> holds implementations for several pruning methods: Minitron, ShortGPT, Wanda, SparseGPT, Magnitude, Sheared LLaMA and LLM-Pruner.<\/li>\n<li><code>training<\/code> provides integration with FMS-FSDP for GPU training and MaxText for TPU training.<\/li>\n<li><code>eval<\/code> exposes JAX-compatible evaluation scripts built around lm-eval-harness, with accelerate-based support for MaxText that gives roughly a 2x to 4x evaluation speedup.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\"><strong>Pruning Methods Covered<\/strong><\/h3>\n<p>LLM-Pruning Collection spans several families of pruning algorithms that operate at different granularities:<\/p>\n<h4 class=\"wp-block-heading\"><strong>Minitron<\/strong><\/h4>\n<p>Minitron is a practical pruning and distillation recipe developed by NVIDIA that compresses Llama 3.1 8B and Mistral NeMo 12B to 4B and 8B parameters, respectively, while preserving performance. It explores depth pruning and joint width pruning of hidden dimensions, attention heads and MLP layers, followed by distillation.<\/p>\n<p>In LLM-Pruning Collection, the <code>pruning\/minitron<\/code> folder provides scripts such as <code>prune_llama3.1-8b.sh<\/code>, which runs Minitron-style pruning on Llama 3.1 8B.<\/p>\n<h4 class=\"wp-block-heading\"><strong>ShortGPT<\/strong><\/h4>\n<p>ShortGPT is based on the observation that many Transformer layers are redundant. The method defines Block Influence, a metric that measures each layer's contribution by how much it changes the hidden states passing through it, and then deletes the layers with the lowest influence. The ShortGPT authors report that it outperforms previous pruning methods on multiple-choice and generative tasks.<\/p>\n<p>In the collection, ShortGPT is implemented through the Minitron folder, with a dedicated script, <code>prune_llama2-7b.sh<\/code>. 
<\/p>\n<h4 class=\"wp-block-heading\"><strong>Wanda, SparseGPT, Magnitude<\/strong><\/h4>\n<p>Wanda is a post-training pruning method that scores each weight by the product of its magnitude and the norm of the corresponding input activation, and compares scores on a per-output basis. It prunes the weights with the lowest scores, requires no retraining, and induces sparsity that works well even at billion-parameter scale.<\/p>\n<p>SparseGPT is another post-training method; it uses a second-order-inspired reconstruction step to prune large GPT-style models at high sparsity ratios. Magnitude pruning is the classical baseline that removes the weights with the smallest absolute values.<\/p>\n<p>In LLM-Pruning Collection, all three live under <code>pruning\/wanda<\/code> and share the same installation steps. The README includes a detailed table of Llama 2 7B results that compares Wanda, SparseGPT and Magnitude across BoolQ, RTE, HellaSwag, Winogrande, ARC-E, ARC-C and OBQA, under unstructured sparsity and structured N:M patterns such as 4:8 and 2:4.<\/p>\n<h4 class=\"wp-block-heading\"><strong>Sheared LLaMA<\/strong><\/h4>\n<p>Sheared LLaMA is a structured pruning method that learns masks for layers, attention heads and hidden dimensions, and then retrains the pruned architecture. The original release provides models at multiple scales, including 2.7B and 1.3B parameters.<\/p>\n<p>The <code>pruning\/llmshearing<\/code> directory in LLM-Pruning Collection integrates this recipe. It uses a RedPajama subset, accessed through Hugging Face, for calibration, and includes helper scripts to convert between Hugging Face and MosaicML Composer formats.<\/p>\n<h4 class=\"wp-block-heading\"><strong>LLM-Pruner<\/strong><\/h4>\n<p>LLM-Pruner is a framework for structural pruning of large language models. It removes non-critical coupled structures, such as attention heads or MLP channels, using gradient-based importance scores, and then recovers performance with a short LoRA tuning stage that uses about 50K samples. 
The collection includes LLM-Pruner under <code>pruning\/LLM-Pruner<\/code>, with scripts for LLaMA, LLaMA 2 and Llama 3.1 8B.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Key Takeaways<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li>LLM-Pruning Collection is a JAX-based, Apache-2.0 repo from zlab-princeton that unifies modern LLM pruning methods with shared pruning, training and evaluation pipelines for GPUs and TPUs.<\/li>\n<li>The codebase implements block-level, layer-level and weight-level pruning approaches, including Minitron, ShortGPT, Wanda, SparseGPT, Sheared LLaMA, Magnitude pruning and LLM-Pruner, with method-specific scripts for Llama-family models.<\/li>\n<li>Training integrates FMS-FSDP on GPU and MaxText on TPU, and evaluation uses JAX-compatible scripts built on lm-eval-harness, with accelerate-based support that makes evaluating MaxText checkpoints roughly 2x to 4x faster.<\/li>\n<li>The repository reproduces key results from prior pruning work, publishing side-by-side \u201cpaper vs. reproduced\u201d tables for methods such as Wanda, SparseGPT, Sheared LLaMA and LLM-Pruner so engineers can verify their runs against known baselines.<\/li>\n<\/ul>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out the\u00a0<strong><a href=\"https:\/\/github.com\/zlab-princeton\/llm-pruning-collection\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub Repo<\/a><\/strong>.<\/p>\n<p>The post 
<a href=\"https:\/\/www.marktechpost.com\/2026\/01\/04\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/\">LLM-Pruning Collection: A JAX Based Repo For Structured And Unstructured LLM Compression<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Zlab Princeton researchers have released LLM-Pruning Collection, a JAX-based repository that consolidates major pruning algorithms for large language models into a single, reproducible framework. It targets one concrete goal: making it easy to compare block-level, layer-level and weight-level pruning methods under a consistent training and evaluation stack on both GPUs and TPUs. What Does LLM-Pruning Collection Contain? It is described as a JAX-based repo for LLM pruning. It is organized into three main directories: pruning holds implementations for several pruning methods: Minitron, ShortGPT, Wanda, SparseGPT, Magnitude, Sheared LLaMA and LLM-Pruner. training provides integration with FMS-FSDP for GPU training and MaxText for TPU training. eval exposes JAX-compatible evaluation scripts built around lm-eval-harness, with accelerate-based support for MaxText that gives roughly a 2x to 4x evaluation speedup. Pruning Methods Covered LLM-Pruning Collection spans several families of pruning algorithms that operate at different granularities: Minitron Minitron is a practical pruning and distillation recipe developed by NVIDIA that compresses Llama 3.1 8B and Mistral NeMo 12B to 4B and 8B parameters, respectively, while preserving performance. It explores depth pruning and joint width pruning of hidden dimensions, attention heads and MLP layers, followed by distillation. 
In LLM-Pruning Collection, the pruning\/minitron folder provides scripts such as prune_llama3.1-8b.sh, which runs Minitron-style pruning on Llama 3.1 8B. ShortGPT ShortGPT is based on the observation that many Transformer layers are redundant. The method defines Block Influence, a metric that measures each layer's contribution by how much it changes the hidden states passing through it, and then deletes the layers with the lowest influence. The ShortGPT authors report that it outperforms previous pruning methods on multiple-choice and generative tasks. In the collection, ShortGPT is implemented through the Minitron folder, with a dedicated script, prune_llama2-7b.sh. Wanda, SparseGPT, Magnitude Wanda is a post-training pruning method that scores each weight by the product of its magnitude and the norm of the corresponding input activation, and compares scores on a per-output basis. It prunes the weights with the lowest scores, requires no retraining, and induces sparsity that works well even at billion-parameter scale. SparseGPT is another post-training method; it uses a second-order-inspired reconstruction step to prune large GPT-style models at high sparsity ratios. Magnitude pruning is the classical baseline that removes the weights with the smallest absolute values. In LLM-Pruning Collection, all three live under pruning\/wanda and share the same installation steps. The README includes a detailed table of Llama 2 7B results that compares Wanda, SparseGPT and Magnitude across BoolQ, RTE, HellaSwag, Winogrande, ARC-E, ARC-C and OBQA, under unstructured sparsity and structured N:M patterns such as 4:8 and 2:4. Sheared LLaMA Sheared LLaMA is a structured pruning method that learns masks for layers, attention heads and hidden dimensions, and then retrains the pruned architecture. The original release provides models at multiple scales, including 2.7B and 1.3B parameters. The pruning\/llmshearing directory in LLM-Pruning Collection integrates this recipe. It uses a RedPajama subset, accessed through Hugging Face, for calibration, and includes helper scripts to convert between Hugging Face and MosaicML Composer formats. 
LLM-Pruner LLM-Pruner is a framework for structural pruning of large language models. It removes non-critical coupled structures, such as attention heads or MLP channels, using gradient-based importance scores, and then recovers performance with a short LoRA tuning stage that uses about 50K samples. The collection includes LLM-Pruner under pruning\/LLM-Pruner, with scripts for LLaMA, LLaMA 2 and Llama 3.1 8B. Key Takeaways LLM-Pruning Collection is a JAX-based, Apache-2.0 repo from zlab-princeton that unifies modern LLM pruning methods with shared pruning, training and evaluation pipelines for GPUs and TPUs. The codebase implements block-level, layer-level and weight-level pruning approaches, including Minitron, ShortGPT, Wanda, SparseGPT, Sheared LLaMA, Magnitude pruning and LLM-Pruner, with method-specific scripts for Llama-family models. Training integrates FMS-FSDP on GPU and MaxText on TPU, and evaluation uses JAX-compatible scripts built on lm-eval-harness, with accelerate-based support that makes evaluating MaxText checkpoints roughly 2x to 4x faster. The repository reproduces key results from prior pruning work, publishing side-by-side \u201cpaper vs. reproduced\u201d tables for methods such as Wanda, SparseGPT, Sheared LLaMA and LLM-Pruner so engineers can verify their runs against known baselines. Check out the\u00a0GitHub Repo. 
The post LLM-Pruning Collection: A JAX Based Repo For Structured And Unstructured LLM Compression appeared first on MarkTechPost.<\/p>","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"pmpro_default_level":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_pvb_checkbox_block_on_post":false,"footnotes":""},"categories":[52,5,7,1],"tags":[],"class_list":["post-62220","post","type-post","status-publish","format-standard","hentry","category-ai-club","category-committee","category-news","category-uncategorized","pmpro-has-access"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>LLM-Pruning Collection: A JAX Based Repo For Structured And Unstructured LLM Compression - YouZum<\/title>\n<meta name=\"description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, 
max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/youzum.net\/ja\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/\" \/>\n<meta property=\"og:locale\" content=\"ja_JP\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"LLM-Pruning Collection: A JAX Based Repo For Structured And Unstructured LLM Compression - YouZum\" \/>\n<meta property=\"og:description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta property=\"og:url\" content=\"https:\/\/youzum.net\/ja\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/\" \/>\n<meta property=\"og:site_name\" content=\"YouZum\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DroneAssociationTH\/\" \/>\n<meta property=\"article:published_time\" content=\"2026-01-05T10:42:33+00:00\" \/>\n<meta name=\"author\" content=\"admin NU\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"\u57f7\u7b46\u8005\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin NU\" \/>\n\t<meta name=\"twitter:label2\" content=\"\u63a8\u5b9a\u8aad\u307f\u53d6\u308a\u6642\u9593\" \/>\n\t<meta name=\"twitter:data2\" content=\"4\u5206\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/\"},\"author\":{\"name\":\"admin NU\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\"},\"headline\":\"LLM-Pruning Collection: A JAX Based Repo For 
Structured And Unstructured LLM Compression\",\"datePublished\":\"2026-01-05T10:42:33+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/\"},\"wordCount\":717,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"articleSection\":[\"AI\",\"Committee\",\"News\",\"Uncategorized\"],\"inLanguage\":\"ja\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/\",\"url\":\"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/\",\"name\":\"LLM-Pruning Collection: A JAX Based Repo For Structured And Unstructured LLM Compression - YouZum\",\"isPartOf\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#website\"},\"datePublished\":\"2026-01-05T10:42:33+00:00\",\"description\":\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\",\"breadcrumb\":{\"@id\":\"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/#breadcrumb\"},\"inLanguage\":\"ja\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/youzum.net\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"LLM-Pruning Collection: A JAX 
Based Repo For Structured And Unstructured LLM Compression\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/yousum.gpucore.co\/#website\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"name\":\"YouSum\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/yousum.gpucore.co\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ja\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\",\"name\":\"Drone Association Thailand\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ja\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"width\":300,\"height\":300,\"caption\":\"Drone Association Thailand\"},\"image\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/DroneAssociationTH\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\",\"name\":\"admin NU\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ja\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"caption\":\"admin NU\"},\"url\":\"https:\/\/youzum.net\/ja\/members\/adminnu\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"LLM-Pruning Collection: A JAX Based Repo For Structured And Unstructured LLM Compression - YouZum","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/youzum.net\/ja\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/","og_locale":"ja_JP","og_type":"article","og_title":"LLM-Pruning Collection: A JAX Based Repo For Structured And Unstructured LLM Compression - YouZum","og_description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","og_url":"https:\/\/youzum.net\/ja\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/","og_site_name":"YouZum","article_publisher":"https:\/\/www.facebook.com\/DroneAssociationTH\/","article_published_time":"2026-01-05T10:42:33+00:00","author":"admin NU","twitter_card":"summary_large_image","twitter_misc":{"\u57f7\u7b46\u8005":"admin NU","\u63a8\u5b9a\u8aad\u307f\u53d6\u308a\u6642\u9593":"4\u5206"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/#article","isPartOf":{"@id":"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/"},"author":{"name":"admin NU","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c"},"headline":"LLM-Pruning Collection: A JAX Based Repo For Structured And Unstructured LLM 
Compression","datePublished":"2026-01-05T10:42:33+00:00","mainEntityOfPage":{"@id":"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/"},"wordCount":717,"commentCount":0,"publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"articleSection":["AI","Committee","News","Uncategorized"],"inLanguage":"ja","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/","url":"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/","name":"LLM-Pruning Collection: A JAX Based Repo For Structured And Unstructured LLM Compression - YouZum","isPartOf":{"@id":"https:\/\/yousum.gpucore.co\/#website"},"datePublished":"2026-01-05T10:42:33+00:00","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","breadcrumb":{"@id":"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/#breadcrumb"},"inLanguage":"ja","potentialAction":[{"@type":"ReadAction","target":["https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/youzum.net\/llm-pruning-collection-a-jax-based-repo-for-structured-and-unstructured-llm-compression\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/youzum.net\/"},{"@type":"ListItem","position":2,"name":"LLM-Pruning Collection: A JAX Based Repo For Structured And Unstructured LLM 
Compression"}]},{"@type":"WebSite","@id":"https:\/\/yousum.gpucore.co\/#website","url":"https:\/\/yousum.gpucore.co\/","name":"YouSum","description":"","publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/yousum.gpucore.co\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"ja"},{"@type":"Organization","@id":"https:\/\/yousum.gpucore.co\/#organization","name":"Drone Association Thailand","url":"https:\/\/yousum.gpucore.co\/","logo":{"@type":"ImageObject","inLanguage":"ja","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","width":300,"height":300,"caption":"Drone Association Thailand"},"image":{"@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DroneAssociationTH\/"]},{"@type":"Person","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c","name":"admin NU","image":{"@type":"ImageObject","inLanguage":"ja","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","caption":"admin NU"},"url":"https:\/\/youzum.net\/ja\/members\/adminnu\/"}]}},"rttpg_featured_image_url":null,"rttpg_author":{"display_name":"admin NU","author_link":"https:\/\/youzum.net\/ja\/members\/adminnu\/"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/youzum.net\/ja\/category\/ai-club\/\" rel=\"category tag\">AI<\/a> <a href=\"https:\/\/youzum.net\/ja\/category\/committee\/\" rel=\"category tag\">Committee<\/a> <a 
href=\"https:\/\/youzum.net\/ja\/category\/news\/\" rel=\"category tag\">News<\/a> <a href=\"https:\/\/youzum.net\/ja\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","rttpg_excerpt":"Zlab Princeton researchers have released LLM-Pruning Collection, a JAX based repository that consolidates major pruning algorithms for large language models into a single, reproducible framework. It targets one concrete goal, make it easy to compare block level, layer level and weight level pruning methods under a consistent training and evaluation stack on both GPUs and&hellip;","_links":{"self":[{"href":"https:\/\/youzum.net\/ja\/wp-json\/wp\/v2\/posts\/62220","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/youzum.net\/ja\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/youzum.net\/ja\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/youzum.net\/ja\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/youzum.net\/ja\/wp-json\/wp\/v2\/comments?post=62220"}],"version-history":[{"count":0,"href":"https:\/\/youzum.net\/ja\/wp-json\/wp\/v2\/posts\/62220\/revisions"}],"wp:attachment":[{"href":"https:\/\/youzum.net\/ja\/wp-json\/wp\/v2\/media?parent=62220"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/youzum.net\/ja\/wp-json\/wp\/v2\/categories?post=62220"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/youzum.net\/ja\/wp-json\/wp\/v2\/tags?post=62220"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}