{"id":37470,"date":"2025-09-11T07:03:31","date_gmt":"2025-09-11T07:03:31","guid":{"rendered":"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/"},"modified":"2025-09-11T07:03:31","modified_gmt":"2025-09-11T07:03:31","slug":"meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models","status":"publish","type":"post","link":"https:\/\/youzum.net\/es\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/","title":{"rendered":"Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous Models"},"content":{"rendered":"<div class=\"wp-block-yoast-seo-table-of-contents yoast-table-of-contents\">\n<h3><strong>Table of contents<\/strong><\/h3>\n<ul>\n<li><a href=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#h-why-was-a-new-multilingual-encoder-needed\" data-level=\"3\">Why was a new multilingual encoder needed?<\/a><\/li>\n<li><a href=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#h-understanding-the-architecture-of-mmbert\" data-level=\"3\">Understanding the architecture of mmBERT<\/a><\/li>\n<li><a href=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#h-what-training-data-and-phases-were-used\" data-level=\"3\">What training data and 
phases were used?<\/a><\/li>\n<li><a href=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#h-what-new-training-strategies-were-introduced\" data-level=\"3\">What new training strategies were introduced?<\/a><\/li>\n<li><a href=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#h-how-does-mmbert-perform-on-benchmarks\" data-level=\"3\">How does mmBERT perform on benchmarks?<\/a><\/li>\n<li><a href=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#h-how-does-mmbert-handle-low-resource-languages\" data-level=\"3\">How does mmBERT handle low-resource languages?<\/a><\/li>\n<li><a href=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#h-what-efficiency-gains-does-mmbert-achieve\" data-level=\"3\">What efficiency gains does mmBERT achieve?<\/a><\/li>\n<li><a href=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#h-summary\" data-level=\"3\">Summary<\/a><\/li>\n<\/ul>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>Why was a new multilingual encoder needed?<\/strong><\/h3>\n<p>XLM-RoBERTa (XLM-R) has dominated multilingual NLP for more than 5 years, an unusually long reign in AI research. While encoder-only models like BERT and RoBERTa were central to early progress, most research energy shifted toward decoder-based generative models. 
Encoders, however, remain more efficient and often outperform decoders on embedding, retrieval, and classification tasks. Despite this, multilingual encoder development stalled.<\/p>\n<p>A team of researchers from Johns Hopkins University has proposed mmBERT, a modern encoder that addresses this gap: it surpasses XLM-R and rivals recent large-scale models such as OpenAI\u2019s o3 and Google\u2019s Gemini 2.5 Pro.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Understanding the architecture of mmBERT<\/strong><\/h3>\n<p><strong>mmBERT comes in two main configurations:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Base model<\/strong>: 22 transformer layers, 1152 hidden dimension, ~307M parameters (110M non-embedding).<\/li>\n<li><strong>Small model<\/strong>: ~140M parameters (42M non-embedding).<\/li>\n<\/ul>\n<p>It adopts the <strong>Gemma 2 tokenizer<\/strong> with a 256k vocabulary, rotary position embeddings (RoPE), and FlashAttention2 for efficiency. Sequence length is extended from <strong>1024 to 8192 tokens<\/strong>, using unpadded embeddings and sliding-window attention. This allows mmBERT to process contexts nearly an order of magnitude longer than XLM-R while maintaining faster inference.<\/p>\n<h3 class=\"wp-block-heading\"><strong>What training data and phases were used?<\/strong><\/h3>\n<p>mmBERT was trained on <strong>3 trillion tokens<\/strong> spanning <strong>1,833 languages<\/strong>. Data sources include FineWeb2, Dolma, MegaWika v2, ProLong, StarCoder, and others. 
English makes up only ~10\u201334% of the corpus depending on the phase.<\/p>\n<p><strong><\/strong><strong>Training was done in three stages:<\/strong><\/p>\n<ol class=\"wp-block-list\">\n<li><strong>Pre-training<\/strong>: 2.3T tokens across 60 languages and code.<\/li>\n<li><strong>Mid-training<\/strong>: 600B tokens across 110 languages, focused on higher-quality sources.<\/li>\n<li><strong>Decay phase<\/strong>: 100B tokens covering 1,833 languages, emphasizing low-resource adaptation.<\/li>\n<\/ol>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"436\" data-attachment-id=\"74463\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/image-125\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-24-scaled.png\" data-orig-size=\"2560,1089\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"image\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-24-300x128.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-24-1024x436.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-24-1024x436.png\" alt=\"\" class=\"wp-image-74463\" \/><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>What new training strategies were introduced?<\/strong><\/h3>\n<p><strong>Three main innovations drive mmBERT\u2019s performance:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Annealed 
Language Learning (ALL):<\/strong> Languages are introduced gradually (60 \u2192 110 \u2192 1833). Sampling distributions are annealed from high-resource to uniform, ensuring low-resource languages gain influence during later stages without overfitting limited data.<\/li>\n<li><strong>Inverse Masking Schedule:<\/strong> The masking ratio starts at 30% and decays to 5%, encouraging coarse-grained learning early and fine-grained refinements later.<\/li>\n<li><strong>Model Merging Across Decay Variants:<\/strong> Multiple decay-phase models (English-heavy, 110-language, and 1833-language) are combined via TIES merging, leveraging complementary strengths without retraining from scratch.<\/li>\n<\/ul>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" width=\"1024\" height=\"410\" data-attachment-id=\"74461\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/image-123\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-22.png\" data-orig-size=\"2006,804\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"image\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-22-300x120.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-22-1024x410.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-22-1024x410.png\" alt=\"\" class=\"wp-image-74461\" \/><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>How does mmBERT perform on 
benchmarks?<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>English NLU (GLUE):<\/strong> mmBERT base achieves 86.3, surpassing XLM-R (83.3) and nearly matching ModernBERT (87.4), despite allocating &gt;75% of training to non-English data.<\/li>\n<li><strong>Multilingual NLU (XTREME):<\/strong> mmBERT base scores 72.8 vs. XLM-R\u2019s 70.4, with gains in classification and QA tasks.<\/li>\n<li><strong>Embedding tasks (MTEB v2):<\/strong> mmBERT base ties ModernBERT in English (53.9 vs. 53.8) and leads in multilingual (54.1 vs. 52.4 for XLM-R).<\/li>\n<li><strong>Code retrieval (CoIR):<\/strong> mmBERT outperforms XLM-R by ~9 points, though EuroBERT remains stronger on proprietary data.<\/li>\n<\/ul>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" width=\"1024\" height=\"233\" data-attachment-id=\"74456\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/image-118\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-17-scaled.png\" data-orig-size=\"2560,582\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"image\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-17-300x68.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-17-1024x233.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-17-1024x233.png\" alt=\"\" class=\"wp-image-74456\" \/><\/figure>\n<\/div>\n<div class=\"wp-block-image\">\n<figure 
class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"180\" data-attachment-id=\"74459\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/image-121\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-20-scaled.png\" data-orig-size=\"2560,451\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"image\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-20-300x53.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-20-1024x180.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/09\/image-20-1024x180.png\" alt=\"\" class=\"wp-image-74459\" \/><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>How does mmBERT handle low-resource languages?<\/strong><\/h3>\n<p>The annealed learning schedule ensures that low-resource languages benefit during later training. On benchmarks like Faroese FoQA and Tigrinya TiQuAD, mmBERT significantly outperforms both o3 and Gemini 2.5 Pro. These results demonstrate that encoder models, if trained carefully, can generalize effectively even in extreme low-resource scenarios.<\/p>\n<h3 class=\"wp-block-heading\"><strong>What efficiency gains does mmBERT achieve?<\/strong><\/h3>\n<p>mmBERT is <strong>2\u20134\u00d7 faster<\/strong> than XLM-R and MiniLM while supporting <strong>8192-token inputs<\/strong>. Notably, it remains faster at 8192 tokens than older encoders were at 512 tokens. 
This speed boost derives from the ModernBERT training recipe, efficient attention mechanisms, and optimized embeddings.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Summary<\/strong><\/h3>\n<p>mmBERT comes as the long-overdue replacement for XLM-R, redefining what a multilingual encoder can deliver. It runs 2\u20134\u00d7 faster, handles sequences up to 8K tokens, and outperforms prior models on both high-resource benchmarks and low-resource languages that were underserved in the past. Its training recipe\u20143 trillion tokens paired with annealed language learning, inverse masking, and model merging\u2014shows how careful design can unlock broad generalization without excessive redundancy. The result is an open, efficient, and scalable encoder that not only fills the six-year gap since XLM-R but also provides a robust foundation for the next generation of multilingual NLP systems.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out the\u00a0<strong><a href=\"https:\/\/arxiv.org\/abs\/2509.06888\" target=\"_blank\" rel=\"noreferrer noopener\">Paper<\/a><\/strong>,\u00a0<strong><a href=\"https:\/\/huggingface.co\/collections\/jhu-clsp\/mmbert-a-modern-multilingual-encoder-68b725831d7c6e3acc435ed4\" target=\"_blank\" rel=\"noreferrer noopener\">Model on\u00a0Hugging Face<\/a>,\u00a0<a href=\"https:\/\/github.com\/JHU-CLSP\/mmBERT?tab=readme-ov-file\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub<\/a>\u00a0and\u00a0<a href=\"https:\/\/huggingface.co\/blog\/mmbert\" target=\"_blank\" rel=\"noreferrer noopener\">Technical details<\/a><em>.<\/em><\/strong>\u00a0Feel free to check out our\u00a0<strong><mark><a href=\"https:\/\/github.com\/Marktechpost\/AI-Tutorial-Codes-Included\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub Page for Tutorials, Codes and Notebooks<\/a><\/mark><\/strong>.\u00a0Also,\u00a0feel free to follow us on\u00a0<strong><a href=\"https:\/\/x.com\/intent\/follow?screen_name=marktechpost\" 
target=\"_blank\" rel=\"noreferrer noopener\"><mark>Twitter<\/mark><\/a><\/strong>\u00a0and don\u2019t forget to join our\u00a0<strong><a href=\"https:\/\/www.reddit.com\/r\/machinelearningnews\/\" target=\"_blank\" rel=\"noreferrer noopener\">100k+ ML SubReddit<\/a><\/strong>\u00a0and Subscribe to\u00a0<strong><a href=\"https:\/\/www.aidevsignals.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">our Newsletter<\/a><\/strong>.<\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2025\/09\/10\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/\">Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous Models<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Table of contents Why was a new multilingual encoder needed? Understanding the architecture of mmBERT What training data and phases were used? What new training strategies were introduced? How does mmBERT perform on benchmarks? How does mmBERT handle low-resource languages? What efficiency gains does mmBERT achieve? Summary Why was a new multilingual encoder needed? XLM-RoBERTa (XLM-R) has dominated multilingual NLP for more than 5 years, an unusually long reign in AI research. While encoder-only models like BERT and RoBERTa were central to early progress, most research energy shifted toward decoder-based generative models. Encoders, however, remain more efficient and often outperform decoders on embedding, retrieval, and classification tasks. Despite this, multilingual encoder development stalled. 
A team of researchers from Johns Hopkins University has proposed mmBERT, a modern encoder that addresses this gap: it surpasses XLM-R and rivals recent large-scale models such as OpenAI\u2019s o3 and Google\u2019s Gemini 2.5 Pro. Understanding the architecture of mmBERT mmBERT comes in two main configurations: Base model: 22 transformer layers, 1152 hidden dimension, ~307M parameters (110M non-embedding). Small model: ~140M parameters (42M non-embedding). It adopts the Gemma 2 tokenizer with a 256k vocabulary, rotary position embeddings (RoPE), and FlashAttention2 for efficiency. Sequence length is extended from 1024 to 8192 tokens, using unpadded embeddings and sliding-window attention. This allows mmBERT to process contexts nearly an order of magnitude longer than XLM-R while maintaining faster inference. What training data and phases were used? mmBERT was trained on 3 trillion tokens spanning 1,833 languages. Data sources include FineWeb2, Dolma, MegaWika v2, ProLong, StarCoder, and others. English makes up only ~10\u201334% of the corpus depending on the phase. Training was done in three stages: Pre-training: 2.3T tokens across 60 languages and code. Mid-training: 600B tokens across 110 languages, focused on higher-quality sources. Decay phase: 100B tokens covering 1,833 languages, emphasizing low-resource adaptation. What new training strategies were introduced? Three main innovations drive mmBERT\u2019s performance: Annealed Language Learning (ALL): Languages are introduced gradually (60 \u2192 110 \u2192 1833). Sampling distributions are annealed from high-resource to uniform, ensuring low-resource languages gain influence during later stages without overfitting limited data. Inverse Masking Schedule: The masking ratio starts at 30% and decays to 5%, encouraging coarse-grained learning early and fine-grained refinements later. 
Model Merging Across Decay Variants: Multiple decay-phase models (English-heavy, 110-language, and 1833-language) are combined via TIES merging, leveraging complementary strengths without retraining from scratch. How does mmBERT perform on benchmarks? English NLU (GLUE): mmBERT base achieves 86.3, surpassing XLM-R (83.3) and nearly matching ModernBERT (87.4), despite allocating &gt;75% of training to non-English data. Multilingual NLU (XTREME): mmBERT base scores 72.8 vs. XLM-R\u2019s 70.4, with gains in classification and QA tasks. Embedding tasks (MTEB v2): mmBERT base ties ModernBERT in English (53.9 vs. 53.8) and leads in multilingual (54.1 vs. 52.4 for XLM-R). Code retrieval (CoIR): mmBERT outperforms XLM-R by ~9 points, though EuroBERT remains stronger on proprietary data. How does mmBERT handle low-resource languages? The annealed learning schedule ensures that low-resource languages benefit during later training. On benchmarks like Faroese FoQA and Tigrinya TiQuAD, mmBERT significantly outperforms both o3 and Gemini 2.5 Pro. These results demonstrate that encoder models, if trained carefully, can generalize effectively even in extreme low-resource scenarios. What efficiency gains does mmBERT achieve? mmBERT is 2\u20134\u00d7 faster than XLM-R and MiniLM while supporting 8192-token inputs. Notably, it remains faster at 8192 tokens than older encoders were at 512 tokens. This speed boost derives from the ModernBERT training recipe, efficient attention mechanisms, and optimized embeddings. Summary mmBERT comes as the long-overdue replacement for XLM-R, redefining what a multilingual encoder can deliver. It runs 2\u20134\u00d7 faster, handles sequences up to 8K tokens, and outperforms prior models on both high-resource benchmarks and low-resource languages that were underserved in the past. 
Its training recipe\u20143 trillion tokens paired with annealed language learning, inverse masking, and model merging\u2014shows how careful design can unlock broad generalization without excessive redundancy. The result is an open, efficient, and scalable encoder that not only fills the six-year gap since XLM-R but also provides a robust foundation for the next generation of multilingual NLP systems. Check out the\u00a0Paper,\u00a0Model on\u00a0Hugging Face,\u00a0GitHub\u00a0and\u00a0Technical details.\u00a0Feel free to check out our\u00a0GitHub Page for Tutorials, Codes and Notebooks.\u00a0Also,\u00a0feel free to follow us on\u00a0Twitter\u00a0and don\u2019t forget to join our\u00a0100k+ ML SubReddit\u00a0and Subscribe to\u00a0our Newsletter. The post Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous Models appeared first on MarkTechPost.<\/p>","protected":false},"author":2,"featured_media":37471,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"pmpro_default_level":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-pos
ition":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_pvb_checkbox_block_on_post":false,"footnotes":""},"categories":[52,5,7,1],"tags":[],"class_list":["post-37470","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-club","category-committee","category-news","category-uncategorized","pmpro-has-access"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous Models - YouZum<\/title>\n<meta name=\"description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/youzum.net\/es\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/\" \/>\n<meta property=\"og:locale\" content=\"es_ES\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous Models - YouZum\" \/>\n<meta property=\"og:description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta property=\"og:url\" 
content=\"https:\/\/youzum.net\/es\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/\" \/>\n<meta property=\"og:site_name\" content=\"YouZum\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DroneAssociationTH\/\" \/>\n<meta property=\"article:published_time\" content=\"2025-09-11T07:03:31+00:00\" \/>\n<meta name=\"author\" content=\"admin NU\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Escrito por\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin NU\" \/>\n\t<meta name=\"twitter:label2\" content=\"Tiempo de lectura\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutos\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/\"},\"author\":{\"name\":\"admin NU\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\"},\"headline\":\"Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous 
Models\",\"datePublished\":\"2025-09-11T07:03:31+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/\"},\"wordCount\":741,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png\",\"articleSection\":[\"AI\",\"Committee\",\"News\",\"Uncategorized\"],\"inLanguage\":\"es\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/\",\"url\":\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/\",\"name\":\"Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous Models - 
YouZum\",\"isPartOf\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png\",\"datePublished\":\"2025-09-11T07:03:31+00:00\",\"description\":\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\",\"breadcrumb\":{\"@id\":\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#breadcrumb\"},\"inLanguage\":\"es\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"es\",\"@id\":\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#primaryimage\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png\",\"width\":1024,\"height\":436},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#breadcrumb\",\"itemListElement\":[{\"@type\":\
"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/youzum.net\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous Models\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/yousum.gpucore.co\/#website\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"name\":\"YouSum\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/yousum.gpucore.co\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"es\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\",\"name\":\"Drone Association Thailand\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"es\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"width\":300,\"height\":300,\"caption\":\"Drone Association Thailand\"},\"image\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/DroneAssociationTH\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\",\"name\":\"admin NU\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"es\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"caption\":\"admin 
NU\"},\"url\":\"https:\/\/youzum.net\/es\/members\/adminnu\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous Models - YouZum","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/youzum.net\/es\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/","og_locale":"es_ES","og_type":"article","og_title":"Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous Models - YouZum","og_description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","og_url":"https:\/\/youzum.net\/es\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/","og_site_name":"YouZum","article_publisher":"https:\/\/www.facebook.com\/DroneAssociationTH\/","article_published_time":"2025-09-11T07:03:31+00:00","author":"admin NU","twitter_card":"summary_large_image","twitter_misc":{"Escrito por":"admin NU","Tiempo de lectura":"4 
minutos"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#article","isPartOf":{"@id":"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/"},"author":{"name":"admin NU","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c"},"headline":"Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous Models","datePublished":"2025-09-11T07:03:31+00:00","mainEntityOfPage":{"@id":"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/"},"wordCount":741,"commentCount":0,"publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"image":{"@id":"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png","articleSection":["AI","Committee","News","Uncategorized"],"inLanguage":"es","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/","url":"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-mo
del-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/","name":"Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous Models - YouZum","isPartOf":{"@id":"https:\/\/yousum.gpucore.co\/#website"},"primaryImageOfPage":{"@id":"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#primaryimage"},"image":{"@id":"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png","datePublished":"2025-09-11T07:03:31+00:00","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","breadcrumb":{"@id":"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#breadcrumb"},"inLanguage":"es","potentialAction":[{"@type":"ReadAction","target":["https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/"]}]},{"@type":"ImageObject","inLanguage":"es","@id":"https:\/\/youzum.net\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#primaryimage","url":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png","width":1024,"height":436},{"@type":"BreadcrumbList","@id":"https:\/\/youzum.ne
t\/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/youzum.net\/"},{"@type":"ListItem","position":2,"name":"Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2\u20134\u00d7 Faster than Previous Models"}]},{"@type":"WebSite","@id":"https:\/\/yousum.gpucore.co\/#website","url":"https:\/\/yousum.gpucore.co\/","name":"YouSum","description":"","publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/yousum.gpucore.co\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"es"},{"@type":"Organization","@id":"https:\/\/yousum.gpucore.co\/#organization","name":"Drone Association Thailand","url":"https:\/\/yousum.gpucore.co\/","logo":{"@type":"ImageObject","inLanguage":"es","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","width":300,"height":300,"caption":"Drone Association Thailand"},"image":{"@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DroneAssociationTH\/"]},{"@type":"Person","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c","name":"admin 
NU","image":{"@type":"ImageObject","inLanguage":"es","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","caption":"admin NU"},"url":"https:\/\/youzum.net\/es\/members\/adminnu\/"}]}},"rttpg_featured_image_url":{"full":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png",1024,436,false],"landscape":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png",1024,436,false],"portraits":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png",1024,436,false],"thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ-150x150.png",150,150,true],"medium":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ-300x128.png",300,128,true],"large":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png",1024,436,false],"1536x1536":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png",1024,436,false],"2048x2048":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ.png",1024,436,false],"trp-custom-language-flag":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ-18x8.png",18,8,true],"woocommerce_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ-300x300.png",300,300,true],"woocommerce_single":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ-600x255.png",600,255,true],"woocommerce_gallery_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/09\/image-24-1024x436-gG0GTJ-100x100.png",100,100,true]},"rttpg_author":{"display_name":"admin NU","author_link":"https:\/\/youzum.net\/es\/members\/adminnu\/"},"rttpg_comment":0,"rttpg_category":"<a 
href=\"https:\/\/youzum.net\/es\/category\/ai-club\/\" rel=\"category tag\">AI<\/a> <a href=\"https:\/\/youzum.net\/es\/category\/committee\/\" rel=\"category tag\">Committee<\/a> <a href=\"https:\/\/youzum.net\/es\/category\/news\/\" rel=\"category tag\">News<\/a> <a href=\"https:\/\/youzum.net\/es\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","rttpg_excerpt":"Table of contents Why was a new multilingual encoder needed? Understanding the architecture of mmBERT What training data and phases were used? What new training strategies were introduced? How does mmBERT perform on benchmarks? How does mmBERT handle low-resource languages? What efficiency gains does mmBERT achieve? Summary Why was a new multilingual encoder needed? XLM-RoBERTa&hellip;","_links":{"self":[{"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/posts\/37470","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/comments?post=37470"}],"version-history":[{"count":0,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/posts\/37470\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/media\/37471"}],"wp:attachment":[{"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/media?parent=37470"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/categories?post=37470"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/tags?post=37470"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}