<h1>Google DeepMind Introduces Decoupled DiLoCo: An Asynchronous Training Architecture Achieving 88% Goodput Under High Hardware Failure Rates</h1>
<p>Training frontier AI models is, at its core, a coordination problem. Thousands of chips must communicate continuously, synchronizing every gradient update across the network. When one chip fails or even slows down, the entire training run can stall. As models scale toward hundreds of billions of parameters, that fragility becomes increasingly untenable. Google DeepMind is now proposing a different model entirely.</p>
<p>Google DeepMind researchers introduced <strong>Decoupled DiLoCo</strong> (Distributed Low-Communication), a distributed training architecture that decouples compute into asynchronous, fault-isolated ‘islands,’ enabling large language model pre-training across geographically distant data centers without the tight synchronization that makes conventional approaches brittle at scale.</p>
<h3 class="wp-block-heading"><strong>The Problem with Traditional Distributed Training</strong></h3>
<p>To understand why Decoupled DiLoCo matters, it helps to understand how distributed training typically works. 
Standard <strong>Data-Parallel</strong> training replicates a model across many accelerators (GPUs or TPUs), each processing a different mini-batch of data. After each forward and backward pass, gradients must be averaged across every device — a process called <strong>AllReduce</strong> — before the next training step can begin. This blocking synchronization means every device must wait for the slowest one. Across thousands of chips spanning multiple data centers, that bottleneck is not just inconvenient; it makes global-scale training effectively impractical.</p>
<p>Bandwidth is another hard constraint. Conventional Data-Parallel training requires approximately 198 Gbps of inter-datacenter bandwidth across eight data centers — far beyond what standard wide-area networking (WAN) can support between geographically distributed facilities.</p>
<h3 class="wp-block-heading"><strong>How Decoupled DiLoCo Works</strong></h3>
<p>Decoupled DiLoCo builds on two prior systems from Google. The first is <strong>Pathways</strong>, a distributed AI system built around asynchronous data flow, which lets different compute resources work at their own pace without blocking on one another. The second is <strong>DiLoCo</strong>, which cut the inter-datacenter bandwidth required for distributed training by having each worker perform many local gradient steps before communicating with peers, dramatically reducing how much data flows between data centers.</p>
<p>Decoupled DiLoCo brings both ideas together. Built on top of Pathways, the system divides training across separate clusters of accelerators called <strong>learner units</strong> — the ‘islands’ of compute. Each learner unit trains semi-independently, performing many local steps before sharing a compressed gradient signal with an outer optimizer that aggregates updates across all learner units. 
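</p>
<p>The inner/outer loop just described can be sketched in a few lines of Python. This is an illustrative sketch only — the names (<code>local_sgd_step</code>, <code>OuterOptimizer</code>), the toy quadratic loss, and plain delta-averaging as the outer update are assumptions for exposition, not DeepMind's implementation (the DiLoCo line of work uses a momentum-based outer optimizer):</p>

```python
H = 20  # local steps per outer round; only one delta per H steps crosses the WAN

def local_sgd_step(params, lr=0.1):
    """One inner step on a learner's local batch (toy quadratic loss 0.5*p**2,
    whose gradient is simply p). Stand-in for a real forward/backward pass."""
    return [p - lr * p for p in params]

class OuterOptimizer:
    """Aggregates per-learner deltas into the global parameters.
    Hypothetical stand-in: real DiLoCo-style systems use outer momentum."""
    def __init__(self, params, outer_lr=1.0):
        self.params = list(params)
        self.outer_lr = outer_lr

    def apply(self, deltas):
        # Average the deltas from every learner unit, then step the globals.
        n = len(deltas)
        for i in range(len(self.params)):
            self.params[i] += self.outer_lr * sum(d[i] for d in deltas) / n
        return list(self.params)

def outer_round(global_params, num_learners=4):
    """Each learner unit trains H local steps, then ships only its delta."""
    deltas = []
    for _ in range(num_learners):
        local = list(global_params)
        for _ in range(H):
            local = local_sgd_step(local)
        # The only inter-datacenter traffic: the parameter delta.
        deltas.append([l - g for l, g in zip(local, global_params)])
    return OuterOptimizer(global_params).apply(deltas)

params = [1.0, -2.0, 0.5]
for _ in range(3):
    params = outer_round(params)
print(params)  # every coordinate has shrunk toward the optimum at zero
```

<p>The bandwidth saving is implicit in the structure: each learner exchanges one delta per <code>H</code> steps instead of one gradient per step. 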
Because this outer synchronization step is asynchronous, a chip failure or a slow learner unit in one island does not block the others from continuing to train.</p>
<p>The bandwidth savings are dramatic. Decoupled DiLoCo reduces required inter-datacenter bandwidth from 198 Gbps to just 0.84 Gbps across eight data centers — multiple orders of magnitude lower — making it compatible with standard internet-scale connectivity between facilities rather than requiring custom high-speed network infrastructure.</p>
<h3 class="wp-block-heading"><strong>Self-Healing Through Chaos Engineering</strong></h3>
<p>One of the most technically significant properties of Decoupled DiLoCo is its fault tolerance. The research team used <strong>chaos engineering</strong> — deliberately injecting artificial hardware failures into live training runs to test the system's robustness. The system continued training after the loss of entire learner units, then seamlessly reintegrated those units when they came back online. This is the behavior the research team describes as ‘self-healing’.</p>
<p>In simulations involving 1.2 million chips under high failure rates, Decoupled DiLoCo maintained a <strong>goodput</strong> (the fraction of time the system performs useful training) of 88%, compared to just 27% for standard Data-Parallel methods. 
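</p>
<p>The gap between those two numbers can be made concrete with a toy goodput calculation. The sketch below is illustrative only — the four-island downtime schedule is invented (it is not the paper's 1.2-million-chip simulation) and outages are assumed not to overlap:</p>

```python
# Hypothetical minutes of downtime per island over a 100-minute run.
downtime = {0: 10, 1: 0, 2: 25, 3: 5}
total_minutes = 100
islands = len(downtime)

# Synchronous Data-Parallel: every step blocks on AllReduce, so the whole
# job stalls whenever *any* island is down (non-overlapping outages assumed).
stalled = sum(downtime.values())
sync_goodput = (total_minutes - stalled) / total_minutes

# Asynchronous islands: only the failed island's own chip-time is lost.
useful_chip_minutes = sum(total_minutes - d for d in downtime.values())
async_goodput = useful_chip_minutes / (islands * total_minutes)

print(f"synchronous goodput:  {sync_goodput:.0%}")   # → 60%
print(f"asynchronous goodput: {async_goodput:.0%}")  # → 90%
```

<p>In this toy model the asymmetry grows with the number of islands: a synchronous job stalls on every island's outages combined, while each asynchronous island loses only its own downtime. 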
Goodput is the practical metric that matters here: a training run with high nominal compute but low goodput wastes significant resources.</p>
<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><img decoding="async" width="1386" height="844" src="https://www.marktechpost.com/wp-content/uploads/2026/04/Screenshot-2026-04-23-at-9.39.48-PM-1.png" alt="" class="wp-image-79266" /><figcaption class="wp-element-caption">Source: https://deepmind.google/blog/decoupled-diloco/</figcaption></figure>
</div>
<p>Critically, these resilience gains come with minimal degradation in model quality. In real-world experiments using <strong>Gemma 4</strong> models, Decoupled DiLoCo achieved an average ML benchmark accuracy of 64.1%, compared to 64.4% for the conventional baseline — a difference well within typical evaluation variance.</p>
<h3 class="wp-block-heading"><strong>Training a 12B Model Across Four U.S. 
Regions</strong></h3>
<p>The research team validated Decoupled DiLoCo at production scale by training a <strong>12 billion parameter model</strong> across four separate U.S. regions using just 2–5 Gbps of wide-area networking — a bandwidth level achievable with existing commercial internet infrastructure between data center facilities. The system accomplished this more than 20 times faster than conventional synchronization methods. The key reason: rather than forcing compute to pause and wait for communication to complete, Decoupled DiLoCo folds the required communication into longer periods of computation, eliminating the “blocking” bottlenecks that make conventional distributed training slow at global scale.</p>
<h3 class="wp-block-heading"><strong>Mixing Hardware Generations</strong></h3>
<p>An underappreciated implication of the architecture is its support for heterogeneous hardware. Because learner units operate asynchronously, they do not need to run on identical hardware at the same clock speed. The research team demonstrated training runs that mixed <strong>TPU v6e</strong> and <strong>TPU v5p</strong> chips — different hardware generations with different performance characteristics — in a single training job, without degrading ML performance relative to homogeneous runs.</p>
<p>This has two practical consequences. First, it extends the useful life of existing hardware, allowing older accelerators to continue contributing meaningfully to large-scale training. 
Second, because new hardware generations do not arrive everywhere at once, the ability to train across generations can ease the recurring logistical and capacity bottlenecks of hardware transition periods — a real operational challenge for organizations running large training infrastructure.</p>
<h3 class="wp-block-heading"><strong>Key Takeaways</strong></h3>
<ul class="wp-block-list">
<li><strong>Decoupled DiLoCo eliminates the single-point-of-failure problem in large-scale AI training</strong> by dividing training across asynchronous, fault-isolated “islands” of compute called learner units — so a chip or cluster failure in one island does not stall the rest of the training run.</li>
<li><strong>The architecture reduces inter-datacenter bandwidth requirements by orders of magnitude</strong> — from 198 Gbps down to 0.84 Gbps across eight data centers — making globally distributed pre-training feasible over standard wide-area networking rather than custom high-speed infrastructure.</li>
<li><strong>Decoupled DiLoCo is self-healing:</strong> using chaos engineering to simulate real hardware failures, the system maintained 88% goodput versus just 27% for standard Data-Parallel training under high failure rates, and seamlessly reintegrated offline learner units when they came back online.</li>
<li><strong>The approach was validated at production scale</strong>, successfully training a 12 billion parameter model across four U.S. 
regions — achieving this more than 20 times faster than conventional synchronization methods by folding communication into computation rather than treating it as a blocking step.</li>
<li><strong>Decoupled DiLoCo supports heterogeneous hardware in a single training run</strong>, demonstrated by mixing TPU v6e and TPU v5p chips without performance degradation — extending the useful life of older accelerators and easing capacity bottlenecks during hardware generation transitions.</li>
</ul>
<hr class="wp-block-separator aligncenter has-alpha-channel-opacity is-style-wide" />
<p>Check out the <strong><a href="https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/decoupled-diloco-a-new-frontier-for-resilient-distributed-ai-training/decoupled-diloco-for-resilient-distributed-pre-training.pdf" target="_blank" rel="noreferrer noopener">Paper</a></strong> and the <strong><a href="https://deepmind.google/blog/decoupled-diloco/" target="_blank" rel="noreferrer noopener">technical details</a></strong>.</p>
<p>The post <a href="https://www.marktechpost.com/2026/04/23/google-deepmind-introduces-decoupled-diloco-an-asynchronous-training-architecture-achieving-88-goodput-under-high-hardware-failure-rates/">Google DeepMind Introduces Decoupled DiLoCo: An Asynchronous Training Architecture Achieving 88% Goodput Under High Hardware Failure Rates</a> appeared first on <a href="https://www.marktechpost.com/">MarkTechPost</a>.</p>
put-under-high-hardware-failure-rates\/","name":"Google DeepMind Introduces Decoupled DiLoCo: An Asynchronous Training Architecture Achieving 88% Goodput Under High Hardware Failure Rates - YouZum","isPartOf":{"@id":"https:\/\/yousum.gpucore.co\/#website"},"primaryImageOfPage":{"@id":"https:\/\/youzum.net\/google-deepmind-introduces-decoupled-diloco-an-asynchronous-training-architecture-achieving-88-goodput-under-high-hardware-failure-rates\/#primaryimage"},"image":{"@id":"https:\/\/youzum.net\/google-deepmind-introduces-decoupled-diloco-an-asynchronous-training-architecture-achieving-88-goodput-under-high-hardware-failure-rates\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W.webp","datePublished":"2026-04-24T15:27:25+00:00","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","breadcrumb":{"@id":"https:\/\/youzum.net\/google-deepmind-introduces-decoupled-diloco-an-asynchronous-training-architecture-achieving-88-goodput-under-high-hardware-failure-rates\/#breadcrumb"},"inLanguage":"zh-Hans","potentialAction":[{"@type":"ReadAction","target":["https:\/\/youzum.net\/google-deepmind-introduces-decoupled-diloco-an-asynchronous-training-architecture-achieving-88-goodput-under-high-hardware-failure-rates\/"]}]},{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/youzum.net\/google-deepmind-introduces-decoupled-diloco-an-asynchronous-training-architecture-achieving-88-goodput-under-high-hardware-failure-rates\/#primaryimage","url":"https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W.webp","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W.webp","width":1386,"height":844},{"@type":"BreadcrumbList","@id":"https:\/\/youzum.net\/google-deepmind-introduces-decoupled-diloco-an-asynchronous-training-a
rchitecture-achieving-88-goodput-under-high-hardware-failure-rates\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/youzum.net\/"},{"@type":"ListItem","position":2,"name":"Google DeepMind Introduces Decoupled DiLoCo: An Asynchronous Training Architecture Achieving 88% Goodput Under High Hardware Failure Rates"}]},{"@type":"WebSite","@id":"https:\/\/yousum.gpucore.co\/#website","url":"https:\/\/yousum.gpucore.co\/","name":"YouSum","description":"","publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/yousum.gpucore.co\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"zh-Hans"},{"@type":"Organization","@id":"https:\/\/yousum.gpucore.co\/#organization","name":"Drone Association Thailand","url":"https:\/\/yousum.gpucore.co\/","logo":{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","width":300,"height":300,"caption":"Drone Association Thailand"},"image":{"@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DroneAssociationTH\/"]},{"@type":"Person","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c","name":"admin NU","image":{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","caption":"admin 
NU"},"url":"https:\/\/youzum.net\/zh\/members\/adminnu\/"}]}},"rttpg_featured_image_url":{"full":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W.webp",1386,844,false],"landscape":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W.webp",1386,844,false],"portraits":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W.webp",1386,844,false],"thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W-150x150.webp",150,150,true],"medium":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W-300x183.webp",300,183,true],"large":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W-1024x624.webp",1024,624,true],"1536x1536":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W.webp",1386,844,false],"2048x2048":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W.webp",1386,844,false],"trp-custom-language-flag":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W-18x12.webp",18,12,true],"woocommerce_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W-300x300.webp",300,300,true],"woocommerce_single":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W-600x365.webp",600,365,true],"woocommerce_gallery_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-at-9.39.48-PM-1-8oOK0W-100x100.webp",100,100,true]},"rttpg_author":{"display_name":"admin NU","author_link":"https:\/\/youzum.net\/zh\/members\/adminnu\/"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/youzum.net\/zh\/category\/ai-club\/\" rel=\"category 
tag\">AI<\/a> <a href=\"https:\/\/youzum.net\/zh\/category\/committee\/\" rel=\"category tag\">Committee<\/a> <a href=\"https:\/\/youzum.net\/zh\/category\/news\/\" rel=\"category tag\">News<\/a> <a href=\"https:\/\/youzum.net\/zh\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","rttpg_excerpt":"Training frontier AI models is, at its core, a coordination problem. Thousands of chips must communicate with each other continuously, synchronizing every gradient update across the network. When one chip fails or even slows down, the entire training run can stall. As models scale toward hundreds of billions of parameters, that fragility becomes increasingly untenable.&hellip;","_links":{"self":[{"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/posts\/85785","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/comments?post=85785"}],"version-history":[{"count":0,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/posts\/85785\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/media\/85786"}],"wp:attachment":[{"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/media?parent=85785"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/categories?post=85785"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/tags?post=85785"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}