{"id":29130,"date":"2025-08-03T05:54:23","date_gmt":"2025-08-03T05:54:23","guid":{"rendered":"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/"},"modified":"2025-08-03T05:54:23","modified_gmt":"2025-08-03T05:54:23","slug":"mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon","status":"publish","type":"post","link":"https:\/\/youzum.net\/it\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/","title":{"rendered":"MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and Muon"},"content":{"rendered":"<p><strong>Training large-scale transformers stably<\/strong> has been a longstanding challenge in <a href=\"https:\/\/www.marktechpost.com\/2025\/01\/15\/what-is-deep-learning-2\/\" target=\"_blank\">deep learning<\/a>, particularly as models grow in size and expressivity. 
<strong>MIT researchers tackle a persistent problem at its root: the <em>unstable growth of activations<\/em> and loss spikes caused by unconstrained weight and activation norms.<\/strong> Their solution is to enforce <em>provable Lipschitz bounds<\/em> on the transformer by <em>spectrally regulating the weights<\/em>, with no use of activation normalization, QK norm, or logit softcapping tricks.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"572\" data-attachment-id=\"73157\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/08\/02\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/screenshot-2025-08-02-at-1-51-26-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1.png\" data-orig-size=\"1646,920\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-08-02 at 1.51.26\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-300x168.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572.png\" alt=\"\" class=\"wp-image-73157\" \/><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>What is a Lipschitz Bound\u2014and Why Enforce It?<\/strong><\/h3>\n<p>A <strong>Lipschitz bound<\/strong> on a neural network quantifies the maximum amount by which the output can 
change in response to input (or weight) perturbations. Mathematically, a function <em>f<\/em> is <em>K<\/em>-Lipschitz if \u2225f(x<sub>1<\/sub>) \u2212 f(x<sub>2<\/sub>)\u2225 \u2264 K\u2225x<sub>1<\/sub> \u2212 x<sub>2<\/sub>\u2225 for all x<sub>1<\/sub>, x<sub>2<\/sub>.<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Lower Lipschitz bound \u21d2 greater robustness and predictability.<\/strong><\/li>\n<li>A small Lipschitz constant matters for stability, adversarial robustness, privacy, and generalization: the lower the bound, the less sensitive the network is to perturbations or adversarial noise.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\"><strong>Motivation and Problem Statement<\/strong><\/h3>\n<p>Traditionally, training stable transformers at scale has involved <em>a variety of \u201cband-aid\u201d stabilization tricks<\/em>:<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Layer normalization<\/strong><\/li>\n<li><strong>QK normalization<\/strong><\/li>\n<li><strong>Logit tanh softcapping<\/strong><\/li>\n<\/ul>\n<p>But these do not directly address the underlying spectral norm (largest singular value) growth in the weights, a root cause of exploding activations and training instability\u2014especially in large models.<\/p>\n<p>The <strong>central hypothesis<\/strong>: <strong>If we spectrally regulate the weights themselves\u2014beyond just the optimizer or activations\u2014we can maintain tight control over Lipschitzness, potentially solving instability at its source.<\/strong><\/p>\n<h2 class=\"wp-block-heading\"><strong>Key Innovations<\/strong><\/h2>\n<h3 class=\"wp-block-heading\"><strong>Weight Spectral Regulation and the Muon Optimizer<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li>The <strong>Muon<\/strong> optimizer spectrally regularizes <em>gradients<\/em>, ensuring each gradient step does not increase the spectral norm of the weights by more than a set amount.<\/li>\n<li>The researchers <strong>extend regulation to the weights<\/strong>: After each 
step, they apply operations to <em>cap the singular values<\/em> of every weight matrix. <strong>Activation norms stay remarkably small<\/strong> as a result, rarely exceeding the range that fp8 precision can represent in their GPT-2 scale transformers.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\"><strong>Removing Stability Tricks<\/strong><\/h3>\n<p>In all experiments, <strong>no layer normalization, no QK norm, and no logit tanh softcapping were used.<\/strong> Yet,<\/p>\n<ul class=\"wp-block-list\">\n<li>Maximum activation entries in <em>their GPT-2 scale transformer never exceeded ~100,<\/em> while the unconstrained baseline surpassed 148,000.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\"><strong>Table Sample (NanoGPT Experiment)<\/strong><\/h3>\n<figure class=\"wp-block-table\">\n<table class=\"has-fixed-layout\">\n<thead>\n<tr>\n<th>Model<\/th>\n<th>Max Activation<\/th>\n<th>Layer Stability Tricks<\/th>\n<th>Validation Accuracy<\/th>\n<th>Lipschitz Bound<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Baseline (Speedrun)<\/td>\n<td>148,480<\/td>\n<td>Yes<\/td>\n<td>39.4%<\/td>\n<td>\u221e<\/td>\n<\/tr>\n<tr>\n<td>Lipschitz Transformer<\/td>\n<td>160<\/td>\n<td>None<\/td>\n<td>39.5%<\/td>\n<td>10\u00b2\u2076\u2074<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n<h3 class=\"wp-block-heading\"><strong>Methods for Enforcing Lipschitz Constraints<\/strong><\/h3>\n<p>A variety of <em>weight norm constraint methods<\/em> were explored and compared for their ability to:<\/p>\n<ol class=\"wp-block-list\">\n<li><strong>Maintain high performance<\/strong>,<\/li>\n<li><strong>Guarantee a Lipschitz bound<\/strong>, and<\/li>\n<li><strong>Optimize the performance-Lipschitz tradeoff.<\/strong><\/li>\n<\/ol>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" width=\"1024\" height=\"678\" data-attachment-id=\"73159\" 
data-permalink=\"https:\/\/www.marktechpost.com\/2025\/08\/02\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/screenshot-2025-08-02-at-1-51-56-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.56-PM-1.png\" data-orig-size=\"1640,1086\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-08-02 at 1.51.56\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.56-PM-1-300x199.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.56-PM-1-1024x678.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.56-PM-1-1024x678.png\" alt=\"\" class=\"wp-image-73159\" \/><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>Techniques<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>Weight Decay<\/strong>: The standard method, but it does not strictly bound the spectral norm.<\/li>\n<li><strong>Spectral Normalization<\/strong>: Ensures the top singular value is capped, but rescales all singular values globally.<\/li>\n<li><strong>Spectral Soft Cap<\/strong>: A novel method that smoothly and efficiently applies \u03c3 \u2192 min(\u03c3<sub>max<\/sub>, \u03c3) to all singular values in parallel (using odd polynomial approximations). 
It is co-designed with Muon\u2019s high stable-rank updates to keep the bounds tight.<\/li>\n<li><strong>Spectral Hammer<\/strong>: Sets only the largest singular value to \u03c3<sub>max<\/sub>; best suited to the AdamW optimizer.<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\"><strong>Experimental Results and Insights<\/strong><\/h2>\n<h3 class=\"wp-block-heading\"><strong>Model Evaluation at Various Scales<\/strong><\/h3>\n<ol class=\"wp-block-list\">\n<li><strong>Shakespeare (Small Transformer, &lt;2-Lipschitz):<\/strong>\n<ul class=\"wp-block-list\">\n<li>Achieves 60% validation accuracy with a provable Lipschitz bound below 2.<\/li>\n<li>Outperforms the unconstrained baseline in validation loss.<\/li>\n<\/ul>\n<\/li>\n<li><strong>NanoGPT (145M Parameters):<\/strong>\n<ul class=\"wp-block-list\">\n<li>With a Lipschitz bound &lt;10, validation accuracy reaches 21.2%.<\/li>\n<li>To <em>match<\/em> the strong unconstrained baseline (39.4% accuracy), the model <strong>required a large upper bound of 10\u00b2\u2076\u2074<\/strong>. 
This highlights how, for now, strict Lipschitz constraints often trade off against expressivity at large scales.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<h3 class=\"wp-block-heading\"><strong>Weight Constraint Method Efficiency<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>Muon + Spectral Cap<\/strong>: <em>Leads the tradeoff frontier<\/em>, reaching lower Lipschitz constants at matched or better validation loss than AdamW + weight decay.<\/li>\n<li><strong>Spectral soft cap and normalization<\/strong> (under Muon) consistently give the best frontier on the loss-Lipschitz tradeoff.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\"><strong>Stability and Robustness<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>Adversarial robustness<\/strong> increases sharply at lower Lipschitz bounds.<\/li>\n<li>In experiments, models with a constrained Lipschitz constant suffered a much milder accuracy drop under adversarial attack than unconstrained baselines.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\"><strong>Activation Magnitudes<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>With spectral weight regulation:<\/strong> Maximum activations remain small (near-fp8 compatible) compared to the unbounded baselines, even at scale.<\/li>\n<li>This opens avenues for <strong>low-precision training and inference<\/strong> in hardware, where smaller activations reduce compute, memory, and power costs.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\"><strong>Limitations and Open Questions<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>Selecting the \u201ctightest\u201d tradeoff<\/strong> for weight norms, logit scaling, and attention scaling still relies on hyperparameter sweeps rather than principled rules.<\/li>\n<li><strong>Current upper-bounding is loose<\/strong>: Calculated global bounds can be astronomically large (e.g. 
10\u00b2\u2076\u2074), while real activation norms remain small.<\/li>\n<li>It\u2019s unclear whether matching unconstrained baseline performance with strictly small Lipschitz bounds is possible as scale increases; <em>more research is needed<\/em>.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h3>\n<p><strong>Spectral weight regulation\u2014especially when paired with the Muon optimizer\u2014can stably train large transformers with enforced Lipschitz bounds, without activation normalization or other band-aid tricks.<\/strong> This addresses instability at a deeper level and keeps activations in a compact, predictable range, greatly improving adversarial robustness and potentially hardware efficiency.<\/p>\n<p>This line of work points to new, efficient computational primitives for neural network regulation, with broad applications for privacy, safety, and low-precision AI deployment.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out the\u00a0<strong><a href=\"https:\/\/arxiv.org\/abs\/2507.13338\" target=\"_blank\" rel=\"noreferrer noopener\">Paper<\/a>, <a href=\"https:\/\/github.com\/Arongil\/lipschitz-transformers\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub Page<\/a> and <a href=\"https:\/\/huggingface.co\/phess2\/lipschitz-transformers\" target=\"_blank\" rel=\"noreferrer noopener\">Hugging Face Project Page<\/a><em>.<\/em><\/strong><\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2025\/08\/02\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/\">MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and Muon<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Training large-scale transformers stably has been a longstanding challenge in deep learning, particularly as models grow in size and expressivity. MIT researchers tackle a persistent problem at its root: the unstable growth of activations and loss spikes caused by unconstrained weight and activation norms. Their solution is to enforce provable Lipschitz bounds on the transformer by spectrally regulating the weights, with no use of activation normalization, QK norm, or logit softcapping tricks. What is a Lipschitz Bound\u2014and Why Enforce It? A Lipschitz bound on a neural network quantifies the maximum amount by which the output can change in response to input (or weight) perturbations. Mathematically, a function f is K-Lipschitz if \u2225f(x\u2081) \u2212 f(x\u2082)\u2225 \u2264 K\u2225x\u2081 \u2212 x\u2082\u2225 for all x\u2081, x\u2082. Lower Lipschitz bound \u21d2 greater robustness and predictability. A small Lipschitz constant matters for stability, adversarial robustness, privacy, and generalization: the lower the bound, the less sensitive the network is to perturbations or adversarial noise. 
Motivation and Problem Statement Traditionally, training stable transformers at scale has involved a variety of \u201cband-aid\u201d stabilization tricks: layer normalization, QK normalization, and logit tanh softcapping. But these do not directly address the underlying spectral norm (largest singular value) growth in the weights, a root cause of exploding activations and training instability\u2014especially in large models. The central hypothesis: If we spectrally regulate the weights themselves\u2014beyond just the optimizer or activations\u2014we can maintain tight control over Lipschitzness, potentially solving instability at its source. Key Innovations Weight Spectral Regulation and the Muon Optimizer The Muon optimizer spectrally regularizes gradients, ensuring each gradient step does not increase the spectral norm of the weights by more than a set amount. The researchers extend regulation to the weights: After each step, they apply operations to cap the singular values of every weight matrix. Activation norms stay remarkably small as a result, rarely exceeding the range that fp8 precision can represent in their GPT-2 scale transformers. Removing Stability Tricks In all experiments, no layer normalization, no QK norm, and no logit tanh softcapping were used. Yet, Maximum activation entries in their GPT-2 scale transformer never exceeded ~100, while the unconstrained baseline surpassed 148,000. Table Sample (NanoGPT Experiment) Model Max Activation Layer Stability Tricks Validation Accuracy Lipschitz Bound Baseline (Speedrun) 148,480 Yes 39.4% \u221e Lipschitz Transformer 160 None 39.5% 10\u00b2\u2076\u2074 Methods for Enforcing Lipschitz Constraints A variety of weight norm constraint methods were explored and compared for their ability to: Maintain high performance, Guarantee a Lipschitz bound, and Optimize the performance-Lipschitz tradeoff. Techniques Weight Decay: The standard method, but it does not strictly bound the spectral norm. 
Spectral Normalization: Ensures the top singular value is capped, but rescales all singular values globally. Spectral Soft Cap: A novel method that smoothly and efficiently applies \u03c3 \u2192 min(\u03c3_max, \u03c3) to all singular values in parallel (using odd polynomial approximations). It is co-designed with Muon\u2019s high stable-rank updates to keep the bounds tight. Spectral Hammer: Sets only the largest singular value to \u03c3_max; best suited to the AdamW optimizer. Experimental Results and Insights Model Evaluation at Various Scales Shakespeare (Small Transformer, &lt;2-Lipschitz): Achieves 60% validation accuracy with a provable Lipschitz bound below 2. Outperforms the unconstrained baseline in validation loss. NanoGPT (145M Parameters): With a Lipschitz bound &lt;10, validation accuracy reaches 21.2%. To match the strong unconstrained baseline (39.4% accuracy), the model required a large upper bound of 10\u00b2\u2076\u2074. This highlights how, for now, strict Lipschitz constraints often trade off against expressivity at large scales. Weight Constraint Method Efficiency Muon + Spectral Cap: Leads the tradeoff frontier, reaching lower Lipschitz constants at matched or better validation loss than AdamW + weight decay. Spectral soft cap and normalization (under Muon) consistently give the best frontier on the loss-Lipschitz tradeoff. Stability and Robustness Adversarial robustness increases sharply at lower Lipschitz bounds. In experiments, models with a constrained Lipschitz constant suffered a much milder accuracy drop under adversarial attack than unconstrained baselines. Activation Magnitudes With spectral weight regulation: Maximum activations remain small (near-fp8 compatible) compared to the unbounded baselines, even at scale. This opens avenues for low-precision training and inference in hardware, where smaller activations reduce compute, memory, and power costs. 
Limitations and Open Questions Selecting the \u201ctightest\u201d tradeoff for weight norms, logit scaling, and attention scaling still relies on hyperparameter sweeps rather than principled rules. Current upper-bounding is loose: Calculated global bounds can be astronomically large (e.g. 10\u00b2\u2076\u2074), while real activation norms remain small. It\u2019s unclear whether matching unconstrained baseline performance with strictly small Lipschitz bounds is possible as scale increases; more research is needed. Conclusion Spectral weight regulation\u2014especially when paired with the Muon optimizer\u2014can stably train large transformers with enforced Lipschitz bounds, without activation normalization or other band-aid tricks. This addresses instability at a deeper level and keeps activations in a compact, predictable range, greatly improving adversarial robustness and potentially hardware efficiency. This line of work points to new, efficient computational primitives for neural network regulation, with broad applications for privacy, safety, and low-precision AI deployment. Check out the\u00a0Paper, GitHub Page and Hugging Face Project Page. 
The post MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and Muon appeared first on MarkTechPost.<\/p>","protected":false},"author":2,"featured_media":29131,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"pmpro_default_level":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_pvb_checkbox_block_on_post":false,"footnotes":""},"categories":[52,5,7,1],"tags":[],"class_list":["post-29130","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-club","category-committee","category-news","category-uncategorized","pmpro-has-access"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and Muon - YouZum<\/title>\n<meta name=\"description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta name=\"robots\" content=\"index, follow, 
max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/youzum.net\/it\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/\" \/>\n<meta property=\"og:locale\" content=\"it_IT\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and Muon - YouZum\" \/>\n<meta property=\"og:description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta property=\"og:url\" content=\"https:\/\/youzum.net\/it\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/\" \/>\n<meta property=\"og:site_name\" content=\"YouZum\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DroneAssociationTH\/\" \/>\n<meta property=\"article:published_time\" content=\"2025-08-03T05:54:23+00:00\" \/>\n<meta name=\"author\" content=\"admin NU\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Scritto da\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin NU\" \/>\n\t<meta name=\"twitter:label2\" content=\"Tempo di lettura stimato\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minuti\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/\"},\"author\":{\"name\":\"admin 
NU\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\"},\"headline\":\"MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and Muon\",\"datePublished\":\"2025-08-03T05:54:23+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/\"},\"wordCount\":893,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png\",\"articleSection\":[\"AI\",\"Committee\",\"News\",\"Uncategorized\"],\"inLanguage\":\"it-IT\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/\",\"url\":\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/\",\"name\":\"MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and Muon - 
YouZum\",\"isPartOf\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png\",\"datePublished\":\"2025-08-03T05:54:23+00:00\",\"description\":\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\",\"breadcrumb\":{\"@id\":\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#breadcrumb\"},\"inLanguage\":\"it-IT\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"it-IT\",\"@id\":\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#primaryimage\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png\",\"width\":1024,\"height\":572},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/youzum.net\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"MIT Researchers 
Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and Muon\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/yousum.gpucore.co\/#website\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"name\":\"YouSum\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/yousum.gpucore.co\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"it-IT\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\",\"name\":\"Drone Association Thailand\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"it-IT\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"width\":300,\"height\":300,\"caption\":\"Drone Association Thailand\"},\"image\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/DroneAssociationTH\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\",\"name\":\"admin NU\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"it-IT\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"caption\":\"admin NU\"},\"url\":\"https:\/\/youzum.net\/it\/members\/adminnu\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and Muon - YouZum","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/youzum.net\/it\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/","og_locale":"it_IT","og_type":"article","og_title":"MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and Muon - YouZum","og_description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","og_url":"https:\/\/youzum.net\/it\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/","og_site_name":"YouZum","article_publisher":"https:\/\/www.facebook.com\/DroneAssociationTH\/","article_published_time":"2025-08-03T05:54:23+00:00","author":"admin NU","twitter_card":"summary_large_image","twitter_misc":{"Scritto da":"admin NU","Tempo di lettura stimato":"4 minuti"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#article","isPartOf":{"@id":"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/"},"author":{"name":"admin NU","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c"},"headline":"MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and 
Muon","datePublished":"2025-08-03T05:54:23+00:00","mainEntityOfPage":{"@id":"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/"},"wordCount":893,"commentCount":0,"publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"image":{"@id":"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png","articleSection":["AI","Committee","News","Uncategorized"],"inLanguage":"it-IT","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/","url":"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/","name":"MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and Muon - 
YouZum","isPartOf":{"@id":"https:\/\/yousum.gpucore.co\/#website"},"primaryImageOfPage":{"@id":"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#primaryimage"},"image":{"@id":"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png","datePublished":"2025-08-03T05:54:23+00:00","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","breadcrumb":{"@id":"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#breadcrumb"},"inLanguage":"it-IT","potentialAction":[{"@type":"ReadAction","target":["https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/"]}]},{"@type":"ImageObject","inLanguage":"it-IT","@id":"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#primaryimage","url":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png","width":1024,"height":572},{"@type":"BreadcrumbList","@id":"https:\/\/youzum.net\/mit-researchers-develop-methods-to-control-transformer-sensitivity-with-provable-lipschitz-bounds-and-muon\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/youzum.net\/"},{"@type":"ListItem","position":2,"name":"MIT Researchers Develop Methods to Control Transformer Sensitivity with Provable Lipschitz Bounds and 
Muon"}]},{"@type":"WebSite","@id":"https:\/\/yousum.gpucore.co\/#website","url":"https:\/\/yousum.gpucore.co\/","name":"YouSum","description":"","publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/yousum.gpucore.co\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"it-IT"},{"@type":"Organization","@id":"https:\/\/yousum.gpucore.co\/#organization","name":"Drone Association Thailand","url":"https:\/\/yousum.gpucore.co\/","logo":{"@type":"ImageObject","inLanguage":"it-IT","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","width":300,"height":300,"caption":"Drone Association Thailand"},"image":{"@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DroneAssociationTH\/"]},{"@type":"Person","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c","name":"admin NU","image":{"@type":"ImageObject","inLanguage":"it-IT","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","caption":"admin 
NU"},"url":"https:\/\/youzum.net\/it\/members\/adminnu\/"}]}},"rttpg_featured_image_url":{"full":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png",1024,572,false],"landscape":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png",1024,572,false],"portraits":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png",1024,572,false],"thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN-150x150.png",150,150,true],"medium":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN-300x168.png",300,168,true],"large":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png",1024,572,false],"1536x1536":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png",1024,572,false],"2048x2048":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN.png",1024,572,false],"trp-custom-language-flag":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN-18x10.png",18,10,true],"woocommerce_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN-300x300.png",300,300,true],"woocommerce_single":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN-600x335.png",600,335,true],"woocommerce_gallery_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-02-at-1.51.26-PM-1-1024x572-7MwCeN-100x100.png",100,100,true]},"rttpg_author":{"display_name":"admin 
NU","author_link":"https:\/\/youzum.net\/it\/members\/adminnu\/"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/youzum.net\/it\/category\/ai-club\/\" rel=\"category tag\">AI<\/a> <a href=\"https:\/\/youzum.net\/it\/category\/committee\/\" rel=\"category tag\">Committee<\/a> <a href=\"https:\/\/youzum.net\/it\/category\/news\/\" rel=\"category tag\">News<\/a> <a href=\"https:\/\/youzum.net\/it\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","rttpg_excerpt":"Training large-scale transformers stably has been a longstanding challenge in deep learning, particularly as models grow in size and expressivity. MIT researchers tackle a persistent problem at its root: the unstable growth of activations and loss spikes caused by unconstrained weight and activation norms. Their solution is to enforce provable Lipschitz bounds on the transformer&hellip;","_links":{"self":[{"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/posts\/29130","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/comments?post=29130"}],"version-history":[{"count":0,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/posts\/29130\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/media\/29131"}],"wp:attachment":[{"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/media?parent=29130"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/categories?post=29130"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/tags?post=29130"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}