{"id":42319,"date":"2025-10-05T06:52:31","date_gmt":"2025-10-05T06:52:31","guid":{"rendered":"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/"},"modified":"2025-10-05T06:52:31","modified_gmt":"2025-10-05T06:52:31","slug":"this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se","status":"publish","type":"post","link":"https:\/\/youzum.net\/zh\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/","title":{"rendered":"This AI Paper Proposes a Novel Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement (SE)"},"content":{"rendered":"<p>Can a speech enhancer trained only on real noisy recordings cleanly separate speech and noise\u2014without ever seeing paired data? A team of researchers from <strong>Brno University of Technology and Johns Hopkins University<\/strong> proposes<strong> <em>Unsupervised Speech Enhancement using Data-defined Priors (USE-DDP)<\/em><\/strong>, a dual-stream encoder\u2013decoder that separates any noisy input into two waveforms\u2014estimated clean speech and residual noise\u2014and learns both <em>solely<\/em> from unpaired datasets (clean-speech corpus and optional noise corpus). 
Training enforces that the <em>sum<\/em> of the two outputs reconstructs the input waveform, avoiding degenerate solutions and aligning the design with neural audio codec objectives.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1790\" height=\"934\" data-attachment-id=\"75082\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/10\/04\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/screenshot-2025-10-04-at-11-21-17-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1.png\" data-orig-size=\"1790,934\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-10-04 at 11.21.17\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-300x157.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-1024x534.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1.png\" alt=\"\" class=\"wp-image-75082\" \/><figcaption class=\"wp-element-caption\">https:\/\/arxiv.org\/pdf\/2509.22942<\/figcaption><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>Why is this important?<\/strong><\/h3>\n<p>Most learning-based speech enhancement pipelines depend on paired clean\u2013noisy recordings, which are expensive or impossible to collect at scale in real-world conditions. 
Unsupervised routes like MetricGAN-U remove the need for clean data but couple model performance to the external, non-intrusive metrics used during training. USE-DDP keeps the training <em>data-only<\/em>: it imposes priors with discriminators over independent clean-speech and noise datasets and uses reconstruction consistency to tie the estimates back to the observed mixture.<\/p>\n<h3 class=\"wp-block-heading\"><strong>How does it work?<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>Generator:<\/strong> A codec-style encoder compresses the input audio into a latent sequence; this sequence is split into two parallel transformer branches (RoFormer) that target <strong>clean speech<\/strong> and <strong>noise<\/strong>, respectively, each decoded back to a waveform by a shared decoder. The input is reconstructed as the least-squares combination of the two outputs (the scalars \u03b1 and \u03b2 compensate for amplitude errors). Reconstruction uses multi-scale mel\/STFT and SI-SDR losses, as in neural audio codecs.<\/li>\n<li><strong>Priors via adversaries:<\/strong> Three discriminator ensembles (clean, noise, and noisy) impose distributional constraints: the clean branch must resemble the clean-speech corpus; the noise branch must resemble a noise corpus; the reconstructed mixture must sound natural. Training uses LS-GAN and feature-matching losses.<\/li>\n<li><strong>Initialization:<\/strong> Initializing the encoder and decoder from a pretrained Descript Audio Codec improves convergence and final quality vs. training from scratch.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\"><strong>How does it compare?<\/strong><\/h3>\n<p>On the standard <strong>VCTK+DEMAND<\/strong> simulated setup, USE-DDP reports parity with the strongest unsupervised baselines (e.g., unSE\/unSE+ based on optimal transport) and competitive DNSMOS vs. MetricGAN-U (which directly optimizes DNSMOS). Example numbers from the paper\u2019s Table 1 (input vs. 
systems): DNSMOS improves from <strong>2.54<\/strong> (noisy) to <strong>~3.03<\/strong> (USE-DDP), PESQ from <strong>1.97<\/strong> to <strong>~2.47<\/strong>; CBAK trails some baselines due to more aggressive noise attenuation in non-speech segments\u2014consistent with the explicit noise prior. <\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img decoding=\"async\" width=\"1144\" height=\"606\" data-attachment-id=\"75084\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/10\/04\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/screenshot-2025-10-04-at-11-21-43-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.43-PM-1.png\" data-orig-size=\"1144,606\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-10-04 at 11.21.43\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.43-PM-1-300x159.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.43-PM-1-1024x542.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.43-PM-1.png\" alt=\"\" class=\"wp-image-75084\" \/><figcaption class=\"wp-element-caption\">https:\/\/arxiv.org\/pdf\/2509.22942<\/figcaption><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>Data choice is not a detail\u2014it\u2019s the result<\/strong><\/h3>\n<p>A central finding: <em>which<\/em> clean-speech corpus defines the prior can swing outcomes and even create 
<strong>over-optimistic<\/strong> results on simulated tests.<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>In-domain prior (VCTK clean) on VCTK+DEMAND<\/strong> \u2192 best scores (DNSMOS \u22483.03), but this configuration unrealistically \u201cpeeks\u201d at the target distribution used to synthesize the mixtures.<\/li>\n<li><strong>Out-of-domain prior<\/strong> \u2192 notably lower metrics (e.g., PESQ ~2.04), reflecting distribution mismatch and some noise leakage into the clean branch.<\/li>\n<li><strong>Real-world CHiME-3<\/strong>: using a \u201cclose-talk\u201d channel as <em>in-domain clean prior<\/em> actually hurts\u2014because the \u201cclean\u201d reference itself contains environment bleed; an out-of-domain truly clean corpus yields <em>higher<\/em> DNSMOS\/UTMOS on both dev and test, albeit with some intelligibility trade-off under stronger suppression.<\/li>\n<\/ul>\n<p>This clarifies discrepancies across prior unsupervised results and argues for careful, transparent prior selection when claiming SOTA on simulated benchmarks.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Our Comments<\/strong><\/h3>\n<p>The proposed dual-branch encoder-decoder architecture treats enhancement as explicit two-source estimation with data-defined priors, not metric-chasing. The reconstruction constraint (clean + noise = input) plus adversarial priors over independent clean\/noise corpora gives a clear inductive bias, and initializing from a neural audio codec is a pragmatic way to stabilize training. The results look competitive with unsupervised baselines while avoiding DNSMOS-guided objectives; the caveat is that \u201cclean prior\u201d choice materially affects reported gains, so claims should specify corpus selection. <\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out the\u00a0<strong><a href=\"https:\/\/arxiv.org\/abs\/2509.22942\" target=\"_blank\" rel=\"noreferrer noopener\">PAPER<\/a><\/strong>. 
<\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2025\/10\/04\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/\">This AI Paper Proposes a Novel Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement (SE)<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Can a speech enhancer trained only on real noisy recordings cleanly separate speech and noise\u2014without ever seeing paired data? A team of researchers from Brno University of Technology and Johns Hopkins University proposes Unsupervised Speech Enhancement using Data-defined Priors (USE-DDP), a dual-stream encoder\u2013decoder that separates any noisy input into two waveforms\u2014estimated clean speech and residual noise\u2014and learns both solely from unpaired datasets (clean-speech corpus and optional noise corpus). 
<\/p>","protected":false},"author":2,"featured_media":42320,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"pmpro_default_level":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_pvb_checkbox_block_on_post":false,"footnotes":""},"categories":[52,5,7,1],"tags":[],"class_list":["post-42319","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-club","category-committee","category-news","category-uncategorized","pmpro-has-access"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>This AI Paper Proposes a Novel Dual-Branch 
Encoder-Decoder Architecture for Unsupervised Speech Enhancement (SE) - YouZum<\/title>\n<meta name=\"description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/youzum.net\/zh\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/\" \/>\n<meta property=\"og:locale\" content=\"zh_CN\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"This AI Paper Proposes a Novel Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement (SE) - YouZum\" \/>\n<meta property=\"og:description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta property=\"og:url\" content=\"https:\/\/youzum.net\/zh\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/\" \/>\n<meta property=\"og:site_name\" content=\"YouZum\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DroneAssociationTH\/\" \/>\n<meta property=\"article:published_time\" content=\"2025-10-05T06:52:31+00:00\" \/>\n<meta name=\"author\" content=\"admin NU\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"\u4f5c\u8005\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin NU\" \/>\n\t<meta name=\"twitter:label2\" content=\"\u9884\u8ba1\u9605\u8bfb\u65f6\u95f4\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 \u5206\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/\"},\"author\":{\"name\":\"admin NU\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\"},\"headline\":\"This AI Paper Proposes a Novel Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement (SE)\",\"datePublished\":\"2025-10-05T06:52:31+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/\"},\"wordCount\":684,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE.png\",\"articleSection\":[\"AI\",\"Committee\",\"News\",\"Uncategorized\"],\"inLanguage\":\"zh-Hans\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/\",\"url\":\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/\",\"name\":\"This AI Paper Proposes a Novel 
Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement (SE) - YouZum\",\"isPartOf\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE.png\",\"datePublished\":\"2025-10-05T06:52:31+00:00\",\"description\":\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\",\"breadcrumb\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#breadcrumb\"},\"inLanguage\":\"zh-Hans\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#primaryimage\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE.png\",\"width\":1790,\"height\":934},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\
/youzum.net\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"This AI Paper Proposes a Novel Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement (SE)\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/yousum.gpucore.co\/#website\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"name\":\"YouSum\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/yousum.gpucore.co\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"zh-Hans\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\",\"name\":\"Drone Association Thailand\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"width\":300,\"height\":300,\"caption\":\"Drone Association Thailand\"},\"image\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/DroneAssociationTH\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\",\"name\":\"admin NU\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"caption\":\"admin NU\"},\"url\":\"https:\/\/youzum.net\/zh\/members\/adminnu\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"This AI Paper Proposes a Novel Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement (SE) - YouZum","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/youzum.net\/zh\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/","og_locale":"zh_CN","og_type":"article","og_title":"This AI Paper Proposes a Novel Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement (SE) - YouZum","og_description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","og_url":"https:\/\/youzum.net\/zh\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/","og_site_name":"YouZum","article_publisher":"https:\/\/www.facebook.com\/DroneAssociationTH\/","article_published_time":"2025-10-05T06:52:31+00:00","author":"admin NU","twitter_card":"summary_large_image","twitter_misc":{"\u4f5c\u8005":"admin NU","\u9884\u8ba1\u9605\u8bfb\u65f6\u95f4":"3 \u5206"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#article","isPartOf":{"@id":"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/"},"author":{"name":"admin NU","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c"},"headline":"This AI Paper Proposes a Novel Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement 
(SE)","datePublished":"2025-10-05T06:52:31+00:00","mainEntityOfPage":{"@id":"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/"},"wordCount":684,"commentCount":0,"publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"image":{"@id":"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE.png","articleSection":["AI","Committee","News","Uncategorized"],"inLanguage":"zh-Hans","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/","url":"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/","name":"This AI Paper Proposes a Novel Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement (SE) - 
YouZum","isPartOf":{"@id":"https:\/\/yousum.gpucore.co\/#website"},"primaryImageOfPage":{"@id":"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#primaryimage"},"image":{"@id":"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE.png","datePublished":"2025-10-05T06:52:31+00:00","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","breadcrumb":{"@id":"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#breadcrumb"},"inLanguage":"zh-Hans","potentialAction":[{"@type":"ReadAction","target":["https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/"]}]},{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#primaryimage","url":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE.png","width":1790,"height":934},{"@type":"BreadcrumbList","@id":"https:\/\/youzum.net\/this-ai-paper-proposes-a-novel-dual-branch-encoder-decoder-architecture-for-unsupervised-speech-enhancement-se\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/youzum.net\/"},{"@type":"ListItem","position":2,"name":"This AI Paper Proposes a Novel Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement 
(SE)"}]},{"@type":"WebSite","@id":"https:\/\/yousum.gpucore.co\/#website","url":"https:\/\/yousum.gpucore.co\/","name":"YouSum","description":"","publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/yousum.gpucore.co\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"zh-Hans"},{"@type":"Organization","@id":"https:\/\/yousum.gpucore.co\/#organization","name":"Drone Association Thailand","url":"https:\/\/yousum.gpucore.co\/","logo":{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","width":300,"height":300,"caption":"Drone Association Thailand"},"image":{"@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DroneAssociationTH\/"]},{"@type":"Person","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c","name":"admin NU","image":{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","caption":"admin 
NU"},"url":"https:\/\/youzum.net\/zh\/members\/adminnu\/"}]}},"rttpg_featured_image_url":{"full":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE.png",1790,934,false],"landscape":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE.png",1790,934,false],"portraits":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE.png",1790,934,false],"thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE-150x150.png",150,150,true],"medium":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE-300x157.png",300,157,true],"large":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE-1024x534.png",1024,534,true],"1536x1536":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE-1536x801.png",1536,801,true],"2048x2048":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE.png",1790,934,false],"trp-custom-language-flag":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE-18x9.png",18,9,true],"woocommerce_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE-300x300.png",300,300,true],"woocommerce_single":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE-600x313.png",600,313,true],"woocommerce_gallery_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-04-at-11.21.17-PM-1-Ct3TgE-100x100.png",100,100,true]},"rttpg_author":{"display_name":"admin NU","author_link":"https:\/\/youzum.net\/zh\/members\/adminnu\/"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/youzum.net\/zh\/category\/ai-club\/\" 
rel=\"category tag\">AI<\/a> <a href=\"https:\/\/youzum.net\/zh\/category\/committee\/\" rel=\"category tag\">Committee<\/a> <a href=\"https:\/\/youzum.net\/zh\/category\/news\/\" rel=\"category tag\">News<\/a> <a href=\"https:\/\/youzum.net\/zh\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","rttpg_excerpt":"Can a speech enhancer trained only on real noisy recordings cleanly separate speech and noise\u2014without ever seeing paired data? A team of researchers from Brno University of Technology and Johns Hopkins University proposes Unsupervised Speech Enhancement using Data-defined Priors (USE-DDP), a dual-stream encoder\u2013decoder that separates any noisy input into two waveforms\u2014estimated clean speech and residual&hellip;","_links":{"self":[{"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/posts\/42319","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/comments?post=42319"}],"version-history":[{"count":0,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/posts\/42319\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/media\/42320"}],"wp:attachment":[{"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/media?parent=42319"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/categories?post=42319"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/tags?post=42319"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}