{"id":32054,"date":"2025-08-16T06:06:27","date_gmt":"2025-08-16T06:06:27","guid":{"rendered":"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/"},"modified":"2025-08-16T06:06:27","modified_gmt":"2025-08-16T06:06:27","slug":"r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch","status":"publish","type":"post","link":"https:\/\/youzum.net\/es\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/","title":{"rendered":"R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from Scratch"},"content":{"rendered":"<p>Large Language Models (LLMs) have revolutionized fields from natural language understanding to reasoning and code generation. However, pushing their reasoning ability to truly superhuman levels has been limited by the need for massive, high-quality, human-annotated datasets. A team of researchers from Tencent AI Seattle Lab, Washington University, the University of Maryland, and the University of Texas has proposed R-Zero, a framework designed to train reasoning LLMs that can self-evolve without relying on external data labels.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Beyond Human-Curated Data<\/strong><\/h3>\n<p>Most progress in LLM reasoning is tethered to datasets laboriously curated by humans, an approach that is resource-intensive and fundamentally limited by human knowledge. Even label-free methods using LLMs\u2019 own outputs for reward signals still depend on existing collections of unsolved tasks or problems. These dependencies bottleneck scalability and hinder the dream of open-ended AI reasoning beyond human capabilities.<\/p>\n<h3 class=\"wp-block-heading\"><strong>R-Zero: Self-Evolution from Zero Data<\/strong><\/h3>\n<p><strong>R-Zero<\/strong> forges a novel path by entirely removing the reliance on external tasks and labels. 
Instead, it introduces a co-evolutionary dynamic between two instances of a base model:<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Challenger<\/strong>: Responsible for creating new, challenging reasoning tasks near the edge of the Solver\u2019s capability.<\/li>\n<li><strong>Solver<\/strong>: Trained to solve increasingly difficult problems posed by the Challenger, improving iteratively.<\/li>\n<\/ul>\n<p>This synergy enables the curriculum\u2014the set of training data\u2014to be self-generated and adapted continuously to the model\u2019s evolving strengths and weaknesses. The process works as follows:<\/p>\n<ol class=\"wp-block-list\">\n<li><strong>Challenger Training<\/strong>: Trained via reinforcement learning (specifically <strong>Group Relative Policy Optimization [GRPO]<\/strong>), it generates diverse, hard-to-solve questions. The reward signal for each question is based on the Solver\u2019s uncertainty: highest when Solver\u2019s answers are maximally inconsistent (empirical accuracy approaches 50%).<\/li>\n<li><strong>Solver Training<\/strong>: Solver is fine-tuned on the Challenger\u2019s curated problems. Pseudo-labels (answers) are determined by majority vote among Solver\u2019s own responses. 
Only questions with answers neither too consistent nor too scattered (i.e., in an informative band) are used for training.<\/li>\n<li><strong>Iterative Loop<\/strong>: Challenger and Solver are trained in alternation, co-evolving over several rounds and progressively improving reasoning abilities without human intervention.<\/li>\n<\/ol>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"490\" data-attachment-id=\"73657\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/08\/15\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/screenshot-2025-08-15-at-9-16-58-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1.png\" data-orig-size=\"2060,986\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-08-15 at 9.16.58\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-300x144.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490.png\" alt=\"\" class=\"wp-image-73657\" \/><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>Key Technical Innovations<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>Group Relative Policy Optimization (GRPO)<\/strong><br \/>GRPO is a reinforcement learning algorithm that normalizes the reward for each generated answer relative to the group of responses 
for the same prompt. This method efficiently fine-tunes policy LLMs without a separate value function.<\/li>\n<li><strong>Uncertainty-Driven Curriculum<\/strong><br \/>The Challenger is rewarded for generating problems at the Solver\u2019s frontier\u2014neither too easy nor impossible. The reward function peaks for tasks where the Solver achieves 50% accuracy, maximizing learning efficiency per theoretical analysis.<\/li>\n<li><strong>Repetition Penalty and Format Checks<\/strong><br \/>To guarantee diverse and well-structured training data, a repetition penalty discourages similar questions within a batch, and strict format checks ensure data quality.<\/li>\n<li><strong>Pseudo-Label Quality Control<\/strong><br \/>Only question-answer pairs with intermediate answer consistency are used for training, filtering out ambiguous or ill-posed problems and calibrating label accuracy.<\/li>\n<\/ul>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" width=\"1024\" height=\"538\" data-attachment-id=\"73659\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/08\/15\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/screenshot-2025-08-15-at-9-17-24-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.17.24-PM-1.png\" data-orig-size=\"1640,862\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-08-15 at 9.17.24\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.17.24-PM-1-300x158.png\" 
data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.17.24-PM-1-1024x538.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.17.24-PM-1-1024x538.png\" alt=\"\" class=\"wp-image-73659\" \/><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>Empirical Performance<\/strong><\/h3>\n<h4 class=\"wp-block-heading\"><strong>Mathematical Reasoning Benchmarks<\/strong><\/h4>\n<p>R-Zero was evaluated using seven rigorous mathematical benchmarks, including AMC, Minerva, MATH-500, GSM8K, Olympiad-Bench, and AIME competitions. Compared with the base model and non-trained Challenger baseline, <strong>three iterations of R-Zero led to substantial improvements in reasoning accuracy across all model sizes and architectures<\/strong> (e.g., Qwen3-8B-Base improved from 49.18 to 54.69 average score after three iterations).<\/p>\n<h4 class=\"wp-block-heading\"><strong>General Reasoning Benchmarks<\/strong><\/h4>\n<p>Crucially, R-Zero\u2019s improvements <strong>generalize beyond math<\/strong>. 
Benchmarks including MMLU-Pro, SuperGPQA, and BIG-Bench Extra Hard (BBEH) show significant gains in general-domain reasoning accuracy (e.g., Qwen3-8B-Base\u2019s overall average jumps from 34.49 to 38.73), demonstrating strong transfer effects.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" width=\"1024\" height=\"777\" data-attachment-id=\"73661\" data-permalink=\"https:\/\/www.marktechpost.com\/2025\/08\/15\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/screenshot-2025-08-15-at-9-17-48-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.17.48-PM-1.png\" data-orig-size=\"1642,1246\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2025-08-15 at 9.17.48\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.17.48-PM-1-300x228.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.17.48-PM-1-1024x777.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.17.48-PM-1-1024x777.png\" alt=\"\" class=\"wp-image-73661\" \/><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h3>\n<p>R-Zero marks a major milestone toward self-sufficient, superhuman reasoning LLMs. Its fully autonomous co-evolutionary training pipeline offers not only strong empirical gains in reasoning but a new lens through which to view scalable, data-free AI development. 
Researchers and practitioners can experiment with this framework today, leveraging open-source tools to pioneer the next era of reasoning-centric language models.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out the <strong><a href=\"https:\/\/arxiv.org\/abs\/2508.05004\" target=\"_blank\" rel=\"noreferrer noopener\">Paper<\/a> <\/strong>and <strong><a href=\"https:\/\/github.com\/Chengsong-Huang\/R-Zero\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub Page<\/a><\/strong>.<\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2025\/08\/15\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/\">R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from Scratch<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Large Language Models (LLMs) have revolutionized fields from natural language understanding to reasoning and code generation. However, pushing their reasoning ability to truly superhuman levels has been limited by the need for massive, high-quality, human-annotated datasets. 
A team of researchers from Tencent AI Seattle Lab, Washington University, the University of Maryland, and the University of Texas have proposed R-Zero, a framework designed to train reasoning LLMs that can self-evolve without relying on external data labels. Beyond Human-Curated Data Most progress in LLM reasoning is tethered to datasets laboriously curated by humans, an approach that is resource-intensive and fundamentally limited by human knowledge. Even label-free methods using LLMs\u2019 own outputs for reward signals still depend on existing collections of unsolved tasks or problems. These dependencies bottleneck scalability and hinder the dream of open-ended AI reasoning beyond human capabilities. R-Zero: Self-Evolution from Zero Data R-Zero forges a novel path by entirely removing the reliance on external tasks and labels. Instead, it introduces a co-evolutionary dynamic between two instances of a base model: Challenger: Responsible for creating new, challenging reasoning tasks near the edge of the Solver\u2019s capability. Solver: Trained to solve increasingly difficult problems posed by the Challenger, improving iteratively. This synergy enables the curriculum\u2014the set of training data\u2014to be self-generated and adapted continuously to the model\u2019s evolving strengths and weaknesses. The process works as follows: Challenger Training: Trained via reinforcement learning (specifically Group Relative Policy Optimization [GRPO]), it generates diverse, hard-to-solve questions. The reward signal for each question is based on the Solver\u2019s uncertainty: highest when Solver\u2019s answers are maximally inconsistent (empirical accuracy approaches 50%). Solver Training: Solver is fine-tuned on the Challenger\u2019s curated problems. Pseudo-labels (answers) are determined by majority vote among Solver\u2019s own responses. Only questions with answers neither too consistent nor too scattered (i.e., in an informative band) are used for training. 
Iterative Loop: Challenger and Solver are trained in alternation, co-evolving over several rounds and progressively improving reasoning abilities without human intervention. Key Technical Innovations Group Relative Policy Optimization (GRPO): GRPO is a reinforcement learning algorithm that normalizes the reward for each generated answer relative to the group of responses for the same prompt. This method efficiently fine-tunes policy LLMs without a separate value function. Uncertainty-Driven Curriculum: The Challenger is rewarded for generating problems at the Solver\u2019s frontier\u2014neither too easy nor impossible. The reward function peaks for tasks where the Solver achieves 50% accuracy, maximizing learning efficiency per theoretical analysis. Repetition Penalty and Format Checks: To guarantee diverse and well-structured training data, a repetition penalty discourages similar questions within a batch, and strict format checks ensure data quality. Pseudo-Label Quality Control: Only question-answer pairs with intermediate answer consistency are used for training, filtering out ambiguous or ill-posed problems and calibrating label accuracy. Empirical Performance Mathematical Reasoning Benchmarks R-Zero was evaluated using seven rigorous mathematical benchmarks, including AMC, Minerva, MATH-500, GSM8K, Olympiad-Bench, and AIME competitions. Compared with the base model and non-trained Challenger baseline, three iterations of R-Zero led to substantial improvements in reasoning accuracy across all model sizes and architectures (e.g., Qwen3-8B-Base improved from 49.18 to 54.69 average score after three iterations). General Reasoning Benchmarks Crucially, R-Zero\u2019s improvements generalize beyond math. Benchmarks including MMLU-Pro, SuperGPQA, and BIG-Bench Extra Hard (BBEH) show significant gains in general-domain reasoning accuracy (e.g., Qwen3-8B-Base\u2019s overall average jumps from 34.49 to 38.73), demonstrating strong transfer effects. 
Conclusion R-Zero marks a major milestone toward self-sufficient, superhuman reasoning LLMs. Its fully autonomous co-evolutionary training pipeline offers not only strong empirical gains in reasoning but a new lens through which to view scalable, data-free AI development. Researchers and practitioners can experiment with this framework today, leveraging open-source tools to pioneer the next era of reasoning-centric language models. Check out the Paper and GitHub Page. The post R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from Scratch appeared first on MarkTechPost.<\/p>","protected":false},"author":2,"featured_media":32055,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"pmpro_default_level":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_pvb_checkbox_block_on_post":false,"footnotes":""},"categories":[52,5,7,1],"tags":[],"class_list":["post-32054","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-club","category-committee","category-news","category-uncategorized","pmpro-has-access"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from Scratch - YouZum<\/title>\n<meta name=\"description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/youzum.net\/es\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/\" \/>\n<meta property=\"og:locale\" content=\"es_ES\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from Scratch - YouZum\" \/>\n<meta property=\"og:description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta property=\"og:url\" content=\"https:\/\/youzum.net\/es\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/\" \/>\n<meta property=\"og:site_name\" content=\"YouZum\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DroneAssociationTH\/\" \/>\n<meta property=\"article:published_time\" content=\"2025-08-16T06:06:27+00:00\" 
\/>\n<meta name=\"author\" content=\"admin NU\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Escrito por\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin NU\" \/>\n\t<meta name=\"twitter:label2\" content=\"Tiempo de lectura\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutos\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/\"},\"author\":{\"name\":\"admin NU\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\"},\"headline\":\"R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from Scratch\",\"datePublished\":\"2025-08-16T06:06:27+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/\"},\"wordCount\":699,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp\",\"articleSection\":[\"AI\",\"Committee\",\"News\",\"Uncategorized\"],\"inLanguage\":\"es\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-gen
erates-its-own-training-data-from-scratch\/\",\"url\":\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/\",\"name\":\"R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from Scratch - YouZum\",\"isPartOf\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp\",\"datePublished\":\"2025-08-16T06:06:27+00:00\",\"description\":\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\",\"breadcrumb\":{\"@id\":\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#breadcrumb\"},\"inLanguage\":\"es\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"es\",\"@id\":\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#primaryimage\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp\",\"width\":1024,\"height\":490},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#breadcrumb\",\"itemListElement\":[{\
"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/youzum.net\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from Scratch\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/yousum.gpucore.co\/#website\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"name\":\"YouSum\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/yousum.gpucore.co\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"es\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\",\"name\":\"Drone Association Thailand\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"es\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"width\":300,\"height\":300,\"caption\":\"Drone Association Thailand\"},\"image\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/DroneAssociationTH\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\",\"name\":\"admin NU\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"es\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"caption\":\"admin 
NU\"},\"url\":\"https:\/\/youzum.net\/es\/members\/adminnu\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from Scratch - YouZum","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/youzum.net\/es\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/","og_locale":"es_ES","og_type":"article","og_title":"R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from Scratch - YouZum","og_description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","og_url":"https:\/\/youzum.net\/es\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/","og_site_name":"YouZum","article_publisher":"https:\/\/www.facebook.com\/DroneAssociationTH\/","article_published_time":"2025-08-16T06:06:27+00:00","author":"admin NU","twitter_card":"summary_large_image","twitter_misc":{"Escrito por":"admin NU","Tiempo de lectura":"3 minutos"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#article","isPartOf":{"@id":"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/"},"author":{"name":"admin NU","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c"},"headline":"R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from 
Scratch","datePublished":"2025-08-16T06:06:27+00:00","mainEntityOfPage":{"@id":"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/"},"wordCount":699,"commentCount":0,"publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"image":{"@id":"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp","articleSection":["AI","Committee","News","Uncategorized"],"inLanguage":"es","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/","url":"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/","name":"R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from Scratch - 
YouZum","isPartOf":{"@id":"https:\/\/yousum.gpucore.co\/#website"},"primaryImageOfPage":{"@id":"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#primaryimage"},"image":{"@id":"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp","datePublished":"2025-08-16T06:06:27+00:00","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","breadcrumb":{"@id":"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#breadcrumb"},"inLanguage":"es","potentialAction":[{"@type":"ReadAction","target":["https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/"]}]},{"@type":"ImageObject","inLanguage":"es","@id":"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#primaryimage","url":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp","width":1024,"height":490},{"@type":"BreadcrumbList","@id":"https:\/\/youzum.net\/r-zero-a-fully-autonomous-ai-framework-that-generates-its-own-training-data-from-scratch\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/youzum.net\/"},{"@type":"ListItem","position":2,"name":"R-Zero: A Fully Autonomous AI Framework that Generates Its Own Training Data from 
Scratch"}]},{"@type":"WebSite","@id":"https:\/\/yousum.gpucore.co\/#website","url":"https:\/\/yousum.gpucore.co\/","name":"YouSum","description":"","publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/yousum.gpucore.co\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"es"},{"@type":"Organization","@id":"https:\/\/yousum.gpucore.co\/#organization","name":"Drone Association Thailand","url":"https:\/\/yousum.gpucore.co\/","logo":{"@type":"ImageObject","inLanguage":"es","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","width":300,"height":300,"caption":"Drone Association Thailand"},"image":{"@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DroneAssociationTH\/"]},{"@type":"Person","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c","name":"admin NU","image":{"@type":"ImageObject","inLanguage":"es","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","caption":"admin 
NU"},"url":"https:\/\/youzum.net\/es\/members\/adminnu\/"}]}},"rttpg_featured_image_url":{"full":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp",1024,490,false],"landscape":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp",1024,490,false],"portraits":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp",1024,490,false],"thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC-150x150.webp",150,150,true],"medium":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC-300x144.webp",300,144,true],"large":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp",1024,490,false],"1536x1536":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp",1024,490,false],"2048x2048":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC.webp",1024,490,false],"trp-custom-language-flag":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC-18x9.webp",18,9,true],"woocommerce_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC-300x300.webp",300,300,true],"woocommerce_single":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC-600x287.webp",600,287,true],"woocommerce_gallery_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-15-at-9.16.58-PM-1-1024x490-Tju2ZC-100x100.webp",100,100,true]},"rttpg_author":{"display_name":"admin 
NU","author_link":"https:\/\/youzum.net\/es\/members\/adminnu\/"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/youzum.net\/es\/category\/ai-club\/\" rel=\"category tag\">AI<\/a> <a href=\"https:\/\/youzum.net\/es\/category\/committee\/\" rel=\"category tag\">Committee<\/a> <a href=\"https:\/\/youzum.net\/es\/category\/news\/\" rel=\"category tag\">News<\/a> <a href=\"https:\/\/youzum.net\/es\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","rttpg_excerpt":"Large Language Models (LLMs) have revolutionized fields from natural language understanding to reasoning and code generation. However, pushing their reasoning ability to truly superhuman levels has been limited by the need for massive, high-quality, human-annotated datasets. A team of researchers from Tencent AI Seattle Lab, Washington University, the University of Maryland, and the University of&hellip;","_links":{"self":[{"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/posts\/32054","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/comments?post=32054"}],"version-history":[{"count":0,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/posts\/32054\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/media\/32055"}],"wp:attachment":[{"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/media?parent=32054"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/categories?post=32054"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/tags?post=32054"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}