{"id":79168,"date":"2026-03-26T14:41:31","date_gmt":"2026-03-26T14:41:31","guid":{"rendered":"https:\/\/youzum.net\/tencent-ai-open-sources-covo-audio-a-7b-speech-language-model-and-inference-pipeline-for-real-time-audio-conversations-and-reasoning\/"},"modified":"2026-03-26T14:41:31","modified_gmt":"2026-03-26T14:41:31","slug":"tencent-ai-open-sources-covo-audio-a-7b-speech-language-model-and-inference-pipeline-for-real-time-audio-conversations-and-reasoning","status":"publish","type":"post","link":"https:\/\/youzum.net\/fr\/tencent-ai-open-sources-covo-audio-a-7b-speech-language-model-and-inference-pipeline-for-real-time-audio-conversations-and-reasoning\/","title":{"rendered":"Tencent AI Open Sources Covo-Audio: A 7B Speech Language Model and Inference Pipeline for Real-Time Audio Conversations and Reasoning"},"content":{"rendered":"<p>Tencent AI Lab has released <strong>Covo-Audio<\/strong>, a 7B-parameter end-to-end Large Audio Language Model (LALM). The model is designed to unify speech processing and language intelligence by directly processing continuous audio inputs and generating audio outputs within a single architecture.<\/p>\n<h3 class=\"wp-block-heading\"><strong>System Architecture<\/strong><\/h3>\n<p><strong>The Covo-Audio framework consists of four primary components designed for seamless cross-modal interaction:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Audio Encoder<\/strong>: The model utilizes <strong>Whisper-large-v3<\/strong> as its primary encoder due to its robustness against background noise and varied accents. 
This component operates at a frame rate of <strong>50 Hz<\/strong>.<\/li>\n<li><strong>Audio Adapter<\/strong>: To bridge the encoder and the LLM, a specialized adapter employs three downsampling modules, integrating linear and convolution layers to reduce the frame rate from <strong>50 Hz to 6.25 Hz<\/strong>.<\/li>\n<li><strong>LLM Backbone<\/strong>: The system is built upon <strong>Qwen2.5-7B-Base<\/strong>, which has been adapted to process interleaved sequences of continuous acoustic features and textual tokens.<\/li>\n<li><strong>Speech Tokenizer and Decoder<\/strong>: The tokenizer, based on <strong>WavLM-large<\/strong>, uses a codebook size of <strong>16,384<\/strong> to produce discrete audio tokens at <strong>25 Hz<\/strong>. The decoder employs a <strong>Flow-Matching (FM)<\/strong> based framework and a <strong>BigVGAN<\/strong> vocoder to reconstruct high-fidelity <strong>24 kHz waveforms<\/strong>.<\/li>\n<\/ul>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1604\" height=\"952\" data-attachment-id=\"78612\" data-permalink=\"https:\/\/www.marktechpost.com\/2026\/03\/26\/tencent-ai-open-sources-covo-audio-a-7b-speech-language-model-and-inference-pipeline-for-real-time-audio-conversations-and-reasoning\/screenshot-2026-03-26-at-12-33-14-am-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1.png\" data-orig-size=\"1604,952\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2026-03-26 at 12.33.14\u202fAM\" data-image-description=\"\" data-image-caption=\"\" 
data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-300x178.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-1024x608.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1.png\" alt=\"\" class=\"wp-image-78612\" \/><figcaption class=\"wp-element-caption\">https:\/\/arxiv.org\/pdf\/2602.09823<\/figcaption><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>Hierarchical Tri-modal Interleaving<\/strong><\/h3>\n<p>A core contribution of this work is the <strong>Hierarchical Tri-modal Speech-Text Interleaving<\/strong> strategy. Unlike traditional methods that operate solely at the word or character level, this framework aligns continuous acoustic features (a_c), discrete speech tokens (a_d), and natural language text (t).<\/p>\n<p><strong>The model utilizes two primary patterns:<\/strong><\/p>\n<ol start=\"1\" class=\"wp-block-list\">\n<li><strong>Sequential Interleaving<\/strong> (a_c \u2192 t \u2192 a_d): Continuous features, text, and discrete tokens are arranged in a progressive chain.<\/li>\n<li><strong>Parallel Integration<\/strong> (a_c \u2192 t | a_d): Continuous features are aligned with a coupled text-discrete unit.<\/li>\n<\/ol>\n<p>The hierarchical aspect ensures structural coherence by using phrase-level interleaving for fine-grained alignment and sentence-level interleaving to preserve global semantic integrity in long-form utterances. 
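The two interleaving patterns above can be illustrated with a short sketch. The function names, segment granularity, and placeholder values here are illustrative assumptions, not the paper's implementation:

```python
# Illustrative sketch of the two tri-modal interleaving patterns.
# a_c: continuous acoustic feature chunks, t: text spans,
# a_d: discrete speech tokens. Values are placeholders.

def sequential_interleave(segments):
    """a_c -> t -> a_d: each segment contributes its three views in a chain."""
    seq = []
    for a_c, t, a_d in segments:
        seq.extend([("acoustic", a_c), ("text", t), ("discrete", a_d)])
    return seq

def parallel_interleave(segments):
    """a_c -> (t | a_d): acoustic features followed by a coupled text/discrete unit."""
    seq = []
    for a_c, t, a_d in segments:
        seq.append(("acoustic", a_c))
        seq.append(("text+discrete", (t, a_d)))
    return seq

# Two hypothetical phrase-level segments.
phrases = [("feat_0", "hello", [17, 42]), ("feat_1", "world", [7, 99])]
print(sequential_interleave(phrases))
print(parallel_interleave(phrases))
```

In a hierarchical setup, `segments` would be phrase-level units within a sentence, with sentence boundaries preserved so long-form utterances keep their global structure.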
The training process involved a two-stage pre-training pipeline processing a total of <strong>2T tokens<\/strong>.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Intelligence-Speaker Decoupling<\/strong><\/h3>\n<p>To mitigate the high cost of constructing large-scale dialogue data for specific speakers, the research team proposed an <strong>Intelligence-Speaker Decoupling<\/strong> strategy. This technique separates dialogue intelligence from voice rendering, allowing for flexible voice customization using minimal text-to-speech (TTS) data.<\/p>\n<p>The method reformats high-quality TTS recordings into pseudo-conversations with a <strong>masked text loss<\/strong>. By excluding the text response portion from the loss calculation, the model preserves its reasoning abilities while inheriting the naturalness of the TTS speaker. This enables personalized interaction without the need for extensive, speaker-specific dialogue datasets.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Full-Duplex Voice Interaction<\/strong><\/h3>\n<p>The team extended Covo-Audio into <strong>Covo-Audio-Chat-FD<\/strong>, a variant capable of simultaneous dual-stream communication. The audio encoder is reconfigured to operate in a chunk-streaming manner, and the user and model streams are chunk-interleaved in a <strong>1:4 ratio<\/strong>. 
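The 1:4 chunk-interleaving described above can be sketched as follows. The function name, chunk contents, and the exact scheduling policy are illustrative assumptions; only the one-user-chunk-to-four-model-chunks ratio comes from the article:

```python
def interleave_streams(user_chunks, model_chunks, ratio=4):
    """Chunk-interleave a user stream and a model stream in a 1:ratio
    pattern: one user chunk followed by `ratio` model chunks, repeated."""
    seq = []
    model_iter = iter(model_chunks)
    for u in user_chunks:
        seq.append(("user", u))
        for _ in range(ratio):
            nxt = next(model_iter, None)  # stop padding if the model stream runs out
            if nxt is not None:
                seq.append(("model", nxt))
    return seq

# Two user chunks and eight model chunks -> 1:4 interleaved sequence.
out = interleave_streams(["u0", "u1"], [f"m{i}" for i in range(8)])
print(out)
```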
Each chunk represents <strong>0.16s<\/strong> of audio.<\/p>\n<p><strong>The system manages conversational states through specific architectural tokens:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li><strong>THINK Token<\/strong>: Indicates a listening-only state while the model waits to respond.<\/li>\n<li><strong>SHIFT Token<\/strong>: Signifies the transition to the model\u2019s speaking turn.<\/li>\n<li><strong>BREAK Token<\/strong>: Detects interruption signals (barge-ins), triggering the model to terminate speaking immediately and switch back to listening.<\/li>\n<\/ul>\n<p>For multi-turn scenarios, the model implements a <strong>recursive context-filling strategy<\/strong>, where continuous audio features from user input and generated tokens from previous turns are prefixed as historical context.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Audio Reasoning and Reinforcement Learning<\/strong><\/h3>\n<p>To enhance complex reasoning, the model incorporates <strong>Chain-of-Thought (CoT)<\/strong> reasoning and <strong>Group Relative Policy Optimization (GRPO)<\/strong>. 
<strong>The model is optimized using a verifiable composite reward function:<\/strong><\/p>\n<div class=\"wp-block-mathml-mathmlblock\">$$R_{total} = R_{accuracy} + R_{format} + R_{consistency} + R_{thinking}$$\n<\/div>\n<p>This structure allows the model to optimize for correctness (R_{accuracy}), structured output adherence (R_{format}), logical coherence (R_{consistency}), and reasoning depth (R_{thinking}).<\/p>\n<h3 class=\"wp-block-heading\"><strong>Evaluation and Performance<\/strong><\/h3>\n<p><strong>Covo-Audio (7B) shows competitive or superior results on several benchmarks, with its strongest results coming against models of comparable scale on selected speech and audio tasks.<\/strong> On the <strong>MMAU<\/strong> benchmark, it achieved an average score of <strong>75.30%<\/strong>, the highest among evaluated 7B-scale models. It notably excelled in music understanding with a score of <strong>76.05%<\/strong>. On the <strong>MMSU<\/strong> benchmark, Covo-Audio achieved a leading <strong>66.64%<\/strong> average accuracy.<\/p>\n<p>Regarding its conversational variants, <strong>Covo-Audio-Chat<\/strong> demonstrated strong performance on <strong>URO-Bench<\/strong>, particularly in speech reasoning and spoken dialogue tasks, outperforming models like <strong>Qwen3-Omni<\/strong> on the Chinese track. 
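The composite reward $$R_{total} = R_{accuracy} + R_{format} + R_{consistency} + R_{thinking}$$ can be sketched as a simple sum of verifiable terms. The scoring rule for each individual term below is a placeholder for illustration, not the paper's definition:

```python
import re

def composite_reward(response, reference_answer):
    """Sum of verifiable reward terms, mirroring
    R_total = R_accuracy + R_format + R_consistency + R_thinking.
    Each term's scoring rule here is an illustrative stand-in."""
    m = re.fullmatch(r"<think>(.*?)</think><answer>(.*?)</answer>", response, re.S)
    r_format = 1.0 if m else 0.0  # structured output adherence
    thinking, answer = (m.group(1), m.group(2)) if m else ("", response)
    r_accuracy = 1.0 if answer.strip() == reference_answer else 0.0  # correctness
    r_thinking = min(len(thinking.split()) / 50.0, 1.0)  # reasoning depth, capped
    # logical coherence: does the reasoning actually mention the answer?
    r_consistency = 1.0 if answer.strip() and answer.strip() in thinking else 0.0
    return r_accuracy + r_format + r_consistency + r_thinking

resp = "<think>The clip contains a violin, so the answer is violin.</think><answer>violin</answer>"
print(composite_reward(resp, "violin"))
```

In a GRPO setup, a reward of this shape would be computed for each sampled response in a group, and group-relative advantages would then drive the policy update.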
For empathetic interaction on the <strong>VStyle<\/strong> benchmark, it achieved state-of-the-art results in Mandarin for anger (<strong>4.89<\/strong>), sadness (<strong>4.93<\/strong>), and anxiety (<strong>5.00<\/strong>).<\/p>\n<p><strong>The research team notes an \u2018early-response\u2019 issue in the GaokaoEval full-duplex setting, where unusually long silent pauses between vocal fragments can cause premature responses.<\/strong> This behavior correlates with the model\u2019s pause-handling success metric and is identified as a critical direction for future optimization.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Key Takeaways<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>Unified End-to-End Architecture<\/strong>: Covo-Audio is a 7B-parameter model that natively processes continuous audio inputs and generates high-fidelity audio outputs within a single, unified architecture. It eliminates the need for cascaded ASR-LLM-TTS pipelines, reducing error propagation and information loss.<\/li>\n<li><strong>Hierarchical Tri-modal Interleaving<\/strong>: The model employs a specialized strategy to align continuous acoustic features, discrete speech tokens, and natural language text. By interleaving these modalities at both phrase and sentence levels, it preserves global semantic integrity while capturing fine-grained prosodic nuances.<\/li>\n<li><strong>Intelligence-Speaker Decoupling<\/strong>: The Tencent research team introduces a technique to decouple dialogue intelligence from specific voice rendering. This allows for flexible voice customization using lightweight text-to-speech (TTS) data, significantly lowering the cost of developing personalized conversational agents.<\/li>\n<li><strong>Native Full-Duplex Interaction<\/strong>: The Covo-Audio-Chat-FD variant supports simultaneous listening and speaking. 
It utilizes specific architectural tokens\u2014THINK, SHIFT, and BREAK\u2014to manage complex real-time dynamics such as smooth turn-taking, backchanneling, and user barge-ins.<\/li>\n<li><strong>Superior Parameter Efficiency<\/strong>: Despite its compact 7B scale, Covo-Audio achieves state-of-the-art or highly competitive performance across core benchmarks, including MMAU, MMSU, and URO-Bench. It frequently matches or exceeds the performance of much larger systems, such as 32B-parameter models, in audio and speech understanding tasks.<\/li>\n<\/ul>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out\u00a0the\u00a0<strong><a href=\"https:\/\/arxiv.org\/pdf\/2602.09823\" target=\"_blank\" rel=\"noreferrer noopener\">Paper<\/a>, <a href=\"https:\/\/huggingface.co\/tencent\/Covo-Audio-Chat\" target=\"_blank\" rel=\"noreferrer noopener\">Model on HF<\/a> <\/strong>and<strong> <a href=\"https:\/\/github.com\/Tencent\/Covo-Audio\" target=\"_blank\" rel=\"noreferrer noopener\">Repo<\/a>.\u00a0<\/strong>Also,\u00a0feel free to follow us on\u00a0<strong><a href=\"https:\/\/x.com\/intent\/follow?screen_name=marktechpost\" target=\"_blank\" rel=\"noreferrer noopener\"><mark>Twitter<\/mark><\/a><\/strong>\u00a0and don\u2019t forget to join our\u00a0<strong><a href=\"https:\/\/www.reddit.com\/r\/machinelearningnews\/\" target=\"_blank\" rel=\"noreferrer noopener\">120k+ ML SubReddit<\/a><\/strong>\u00a0and Subscribe to\u00a0<strong><a href=\"https:\/\/www.aidevsignals.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">our Newsletter<\/a><\/strong>. Wait! 
are you on telegram?\u00a0<strong><a href=\"https:\/\/t.me\/machinelearningresearchnews\" target=\"_blank\" rel=\"noreferrer noopener\">now you can join us on telegram as well.<\/a><\/strong><\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2026\/03\/26\/tencent-ai-open-sources-covo-audio-a-7b-speech-language-model-and-inference-pipeline-for-real-time-audio-conversations-and-reasoning\/\">Tencent AI Open Sources Covo-Audio: A 7B Speech Language Model and Inference Pipeline for Real-Time Audio Conversations and Reasoning<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Tencent AI Lab has released Covo-Audio, a 7B-parameter end-to-end Large Audio Language Model (LALM). The model is designed to unify speech processing and language intelligence by directly processing continuous audio inputs and generating audio outputs within a single architecture. System Architecture The Covo-Audio framework consists of four primary components designed for seamless cross-modal interaction: Audio Encoder: The model utilizes Whisper-large-v3 as its primary encoder due to its robustness against background noise and varied accents. This component operates at a frame rate of 50 Hz. Audio Adapter: To bridge the encoder and the LLM, a specialized adapter employs three downsampling modules, integrating linear and convolution layers to reduce the frame rate from 50 Hz to 6.25 Hz. LLM Backbone: The system is built upon Qwen2.5-7B-Base, which has been adapted to process interleaved sequences of continuous acoustic features and textual tokens. Speech Tokenizer and Decoder: The tokenizer, based on WavLM-large, uses a codebook size of 16,384 to produce discrete audio tokens at 25 Hz. The decoder employs a Flow-Matching (FM) based framework and a BigVGAN vocoder to reconstruct high-fidelity 24K waveforms. 
<\/p>","protected":false},"author":2,"featured_media":79169,"comment_status":"open","ping_status":"open","sticky":false,"format":"standard","categories":[52,5,7,1],"tags":[]}
ing\/","name":"Tencent AI Open Sources Covo-Audio: A 7B Speech Language Model and Inference Pipeline for Real-Time Audio Conversations and Reasoning - YouZum","isPartOf":{"@id":"https:\/\/yousum.gpucore.co\/#website"},"primaryImageOfPage":{"@id":"https:\/\/youzum.net\/tencent-ai-open-sources-covo-audio-a-7b-speech-language-model-and-inference-pipeline-for-real-time-audio-conversations-and-reasoning\/#primaryimage"},"image":{"@id":"https:\/\/youzum.net\/tencent-ai-open-sources-covo-audio-a-7b-speech-language-model-and-inference-pipeline-for-real-time-audio-conversations-and-reasoning\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z.png","datePublished":"2026-03-26T14:41:31+00:00","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","breadcrumb":{"@id":"https:\/\/youzum.net\/tencent-ai-open-sources-covo-audio-a-7b-speech-language-model-and-inference-pipeline-for-real-time-audio-conversations-and-reasoning\/#breadcrumb"},"inLanguage":"fr-FR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/youzum.net\/tencent-ai-open-sources-covo-audio-a-7b-speech-language-model-and-inference-pipeline-for-real-time-audio-conversations-and-reasoning\/"]}]},{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/youzum.net\/tencent-ai-open-sources-covo-audio-a-7b-speech-language-model-and-inference-pipeline-for-real-time-audio-conversations-and-reasoning\/#primaryimage","url":"https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z.png","width":1604,"height":952},{"@type":"BreadcrumbList","@id":"https:\/\/youzum.net\/tencent-ai-open-sources-covo-audio-a-7b-speech-language-model-and-inference-pipeline-for-real-time-audio-conversations-and-reasoning\
/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/youzum.net\/"},{"@type":"ListItem","position":2,"name":"Tencent AI Open Sources Covo-Audio: A 7B Speech Language Model and Inference Pipeline for Real-Time Audio Conversations and Reasoning"}]},{"@type":"WebSite","@id":"https:\/\/yousum.gpucore.co\/#website","url":"https:\/\/yousum.gpucore.co\/","name":"YouSum","description":"","publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/yousum.gpucore.co\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"fr-FR"},{"@type":"Organization","@id":"https:\/\/yousum.gpucore.co\/#organization","name":"Drone Association Thailand","url":"https:\/\/yousum.gpucore.co\/","logo":{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","width":300,"height":300,"caption":"Drone Association Thailand"},"image":{"@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DroneAssociationTH\/"]},{"@type":"Person","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c","name":"admin NU","image":{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","caption":"admin 
NU"},"url":"https:\/\/youzum.net\/fr\/members\/adminnu\/"}]}},"rttpg_featured_image_url":{"full":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z.png",1604,952,false],"landscape":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z.png",1604,952,false],"portraits":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z.png",1604,952,false],"thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z-150x150.png",150,150,true],"medium":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z-300x178.png",300,178,true],"large":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z-1024x608.png",1024,608,true],"1536x1536":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z-1536x912.png",1536,912,true],"2048x2048":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z.png",1604,952,false],"trp-custom-language-flag":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z-18x12.png",18,12,true],"woocommerce_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z-300x300.png",300,300,true],"woocommerce_single":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z-600x356.png",600,356,true],"woocommerce_gallery_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-26-at-12.33.14-AM-1-Suu58Z-100x100.png",100,100,true]},"rttpg_author":{"display_name":"admin NU","author_link":"https:\/\/youzum.net\/fr\/members\/adminnu\/"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/youzum.net\/fr\/category\/ai-club\/\" 
rel=\"category tag\">AI<\/a> <a href=\"https:\/\/youzum.net\/fr\/category\/committee\/\" rel=\"category tag\">Committee<\/a> <a href=\"https:\/\/youzum.net\/fr\/category\/news\/\" rel=\"category tag\">News<\/a> <a href=\"https:\/\/youzum.net\/fr\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","rttpg_excerpt":"Tencent AI Lab has released Covo-Audio, a 7B-parameter end-to-end Large Audio Language Model (LALM). The model is designed to unify speech processing and language intelligence by directly processing continuous audio inputs and generating audio outputs within a single architecture. System Architecture The Covo-Audio framework consists of four primary components designed for seamless cross-modal interaction: Audio\u2026","_links":{"self":[{"href":"https:\/\/youzum.net\/fr\/wp-json\/wp\/v2\/posts\/79168","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/youzum.net\/fr\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/youzum.net\/fr\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/youzum.net\/fr\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/youzum.net\/fr\/wp-json\/wp\/v2\/comments?post=79168"}],"version-history":[{"count":0,"href":"https:\/\/youzum.net\/fr\/wp-json\/wp\/v2\/posts\/79168\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/youzum.net\/fr\/wp-json\/wp\/v2\/media\/79169"}],"wp:attachment":[{"href":"https:\/\/youzum.net\/fr\/wp-json\/wp\/v2\/media?parent=79168"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/youzum.net\/fr\/wp-json\/wp\/v2\/categories?post=79168"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/youzum.net\/fr\/wp-json\/wp\/v2\/tags?post=79168"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}