{"id":11834,"date":"2025-05-11T02:44:06","date_gmt":"2025-05-11T02:44:06","guid":{"rendered":"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/"},"modified":"2025-05-11T02:44:06","modified_gmt":"2025-05-11T02:44:06","slug":"microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use","status":"publish","type":"post","link":"https:\/\/youzum.net\/es\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/","title":{"rendered":"Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use"},"content":{"rendered":"<p>LLMs have made impressive gains in complex reasoning, primarily through innovations in architecture, scale, and training approaches like RL. RL enhances LLMs by using reward signals to guide the model towards more effective reasoning strategies, resulting in longer and more coherent thought processes that adapt dynamically to a task\u2019s complexity. Despite this, most RL-enhanced LLMs rely heavily on static internal knowledge and text-only reasoning, making them ill-suited for tasks requiring real-time information, domain-specific expertise, or precise computations. This limitation is especially evident in knowledge-intensive or open-ended problems where the inability to access and interact with external tools leads to inaccuracies or hallucinations.<\/p>\n<p>To overcome these constraints, recent work has explored agentic reasoning, where LLMs dynamically engage with external tools and environments during the reasoning process. These tools include web search, APIs, and code execution platforms, while environments range from simulated browsers to operating systems. 
Agentic reasoning enables models to plan, adapt, and solve tasks interactively, beyond static inference. However, current methods for tool integration often depend on manually designed prompts or supervised fine-tuning, both of which hinder scalability and generalization. Emerging reinforcement learning techniques like Group Relative Policy Optimization (GRPO) provide more efficient and adaptive training for tool use without step-level supervision. Yet, the intersection of RL, tool use, and agentic decision-making remains underexplored, particularly in real-world tasks that demand multi-turn reasoning, dynamic planning, and robust external interaction.<\/p>\n<p>Microsoft Research introduces ARTIST (Agentic Reasoning and Tool Integration in Self-improving Transformers), a framework that combines agentic reasoning, reinforcement learning, and dynamic tool use to enhance LLMs. ARTIST enables models to autonomously decide when, how, and which tools to use during multi-step reasoning, learning robust strategies without step-level supervision. Tool queries and their outputs are integrated directly into the reasoning process, which improves both the model\u2019s reasoning and its interaction with external environments. Evaluated on challenging math and function-calling benchmarks, ARTIST outperforms top models like GPT-4o, with gains of up to 22% over base models. It demonstrates emergent agentic behaviors, setting a new standard in generalizable and interpretable problem-solving.<\/p>\n<p>ARTIST is a flexible framework that enables LLMs to interact with external tools and environments using reinforcement learning. It alternates between reasoning and tool use, allowing the model to choose when and how to invoke tools like code interpreters or APIs. Training uses GRPO, which dispenses with a learned value function and instead relies on outcome-based rewards compared across a group of sampled rollouts. 
ARTIST structures rollouts into reasoning, tool queries, tool outputs, and final answers. A composite reward encourages correctness, proper format, and successful tool use, enabling adaptive, multi-step problem-solving.<\/p>\n<p>ARTIST outperforms various baselines, including GPT-4o and tool-augmented LLMs, on complex mathematical benchmarks like AMC, AIME, and Olympiad. It achieves higher Pass@1 accuracy, with notable gains of up to 22% over base models and over 35% compared to other tool-integrated methods. ARTIST\u2019s advantage comes from its agentic reinforcement learning, which lets it use external tools strategically and refine multi-step solutions. Compared to prompt-based tool usage, it shows superior tool invocation, response quality, and reasoning depth. While its benefits are most evident in complex tasks, ARTIST also significantly improves results on simpler datasets like MATH-500 through selective tool use.<\/p>\n<p><img fetchpriority=\"high\" decoding=\"async\" width=\"624\" height=\"232\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8?key=qEMu3aq2VkzM4hlZ5-AjnA\" \/><\/p>\n<p>In conclusion, ARTIST is a framework that combines agentic reasoning, reinforcement learning, and dynamic tool use to enhance the capabilities of LLMs. Unlike traditional prompt-based approaches, ARTIST enables models to autonomously plan, adapt, and solve complex tasks by interacting with external tools and environments. It learns effective tool-use strategies without step-by-step supervision, improving accuracy and enabling deeper reasoning. Evaluations on mathematical and function-calling benchmarks show significant performance gains. ARTIST also produces more interpretable reasoning paths and robust behaviors. 
This work highlights the potential of agentic RL as a promising direction for creating more adaptive and capable AI systems.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out the <strong><a href=\"https:\/\/arxiv.org\/abs\/2505.01441\" target=\"_blank\" rel=\"noreferrer noopener\">Paper<\/a>.<\/strong><\/p>\n<p>The post <a 
href=\"https:\/\/www.marktechpost.com\/2025\/05\/10\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/\">Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>LLMs have made impressive gains in complex reasoning, primarily through innovations in architecture, scale, and training approaches like RL. RL enhances LLMs by using reward signals to guide the model towards more effective reasoning strategies, resulting in longer and more coherent thought processes that adapt dynamically to a task\u2019s complexity. Despite this, most RL-enhanced LLMs rely heavily on static internal knowledge and text-only reasoning, making them ill-suited for tasks requiring real-time information, domain-specific expertise, or precise computations. This limitation is especially evident in knowledge-intensive or open-ended problems where the inability to access and interact with external tools leads to inaccuracies or hallucinations. To overcome these constraints, recent work has explored agentic reasoning, where LLMs dynamically engage with external tools and environments during the reasoning process. These tools include web search, APIs, and code execution platforms, while environments range from simulated browsers to operating systems. Agentic reasoning enables models to plan, adapt, and solve tasks interactively, beyond static inference. However, current methods for tool integration often depend on manually designed prompts or supervised fine-tuning, which hinder scalability and generalization. Emerging reinforcement learning techniques like Group Relative Policy Optimization (GRPO) provide more efficient and adaptive training for tool use without step-level supervision. 
Yet, the intersection of RL, tool use, and agentic decision-making remains underexplored, particularly in real-world tasks that demand multi-turn reasoning, dynamic planning, and robust external interaction. Microsoft Research introduces ARTIST (Agentic Reasoning and Tool Integration in Self-improving Transformers), a framework that combines agentic reasoning, reinforcement learning, and dynamic tool use to enhance LLMs. ARTIST enables models to autonomously decide when, how, and which tools to use during multi-step reasoning, learning robust strategies without step-level supervision. Tool queries and their outputs are integrated directly into the reasoning process, which improves both the model\u2019s reasoning and its interaction with external environments. Evaluated on challenging math and function-calling benchmarks, ARTIST outperforms top models like GPT-4o, with gains of up to 22% over base models. It demonstrates emergent agentic behaviors, setting a new standard in generalizable and interpretable problem-solving. ARTIST is a flexible framework that enables LLMs to interact with external tools and environments using reinforcement learning. It alternates between reasoning and tool use, allowing the model to choose when and how to invoke tools like code interpreters or APIs. Training uses GRPO, which dispenses with a learned value function and instead relies on outcome-based rewards compared across a group of sampled rollouts. ARTIST structures rollouts into reasoning, tool queries, tool outputs, and final answers. A composite reward encourages correctness, proper format, and successful tool use, enabling adaptive, multi-step problem-solving. ARTIST outperforms various baselines, including GPT-4o and tool-augmented LLMs, on complex mathematical benchmarks like AMC, AIME, and Olympiad. It achieves higher Pass@1 accuracy, with notable gains of up to 22% over base models and over 35% compared to other tool-integrated methods. 
ARTIST\u2019s advantage comes from its agentic reinforcement learning, which lets it use external tools strategically and refine multi-step solutions. Compared to prompt-based tool usage, it shows superior tool invocation, response quality, and reasoning depth. While its benefits are most evident in complex tasks, ARTIST also significantly improves results on simpler datasets like MATH-500 through selective tool use. In conclusion, ARTIST is a framework that combines agentic reasoning, reinforcement learning, and dynamic tool use to enhance the capabilities of LLMs. Unlike traditional prompt-based approaches, ARTIST enables models to autonomously plan, adapt, and solve complex tasks by interacting with external tools and environments. It learns effective tool-use strategies without step-by-step supervision, improving accuracy and enabling deeper reasoning. Evaluations on mathematical and function-calling benchmarks show significant performance gains. ARTIST also produces more interpretable reasoning paths and robust behaviors. This work highlights the potential of agentic RL as a promising direction for creating more adaptive and capable AI systems. Check out the Paper. 
The post Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use appeared first on MarkTechPost.<\/p>","protected":false},"author":2,"featured_media":11835,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"pmpro_default_level":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_pvb_checkbox_block_on_post":false,"footnotes":""},"categories":[52,5,7,1],"tags":[],"class_list":["post-11834","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-club","category-committee","category-news","category-uncategorized","pmpro-has-access"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Microsoft Researchers Introduce ARTIST: A 
Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use - YouZum<\/title>\n<meta name=\"description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/youzum.net\/es\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/\" \/>\n<meta property=\"og:locale\" content=\"es_ES\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use - YouZum\" \/>\n<meta property=\"og:description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta property=\"og:url\" content=\"https:\/\/youzum.net\/es\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/\" \/>\n<meta property=\"og:site_name\" content=\"YouZum\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DroneAssociationTH\/\" \/>\n<meta property=\"article:published_time\" content=\"2025-05-11T02:44:06+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1239\" \/>\n\t<meta property=\"og:image:height\" content=\"460\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"admin NU\" \/>\n<meta 
name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Escrito por\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin NU\" \/>\n\t<meta name=\"twitter:label2\" content=\"Tiempo de lectura\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutos\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/\"},\"author\":{\"name\":\"admin NU\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\"},\"headline\":\"Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool 
Use\",\"datePublished\":\"2025-05-11T02:44:06+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/\"},\"wordCount\":702,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png\",\"articleSection\":[\"AI\",\"Committee\",\"News\",\"Uncategorized\"],\"inLanguage\":\"es\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/\",\"url\":\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/\",\"name\":\"Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use - 
YouZum\",\"isPartOf\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png\",\"datePublished\":\"2025-05-11T02:44:06+00:00\",\"description\":\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\",\"breadcrumb\":{\"@id\":\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#breadcrumb\"},\"inLanguage\":\"es\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"es\",\"@id\":\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#primaryimage\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq
8-2C86aZ.png\",\"width\":1239,\"height\":460},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/youzum.net\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/yousum.gpucore.co\/#website\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"name\":\"YouSum\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/yousum.gpucore.co\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"es\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\",\"name\":\"Drone Association Thailand\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"es\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"width\":300,\"height\":300,\"caption\":\"Drone Association Thailand\"},\"image\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/DroneAssociationTH\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\",\"name\":\"admin 
NU\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"es\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"caption\":\"admin NU\"},\"url\":\"https:\/\/youzum.net\/es\/members\/adminnu\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use - YouZum","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/youzum.net\/es\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/","og_locale":"es_ES","og_type":"article","og_title":"Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use - 
YouZum","og_description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","og_url":"https:\/\/youzum.net\/es\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/","og_site_name":"YouZum","article_publisher":"https:\/\/www.facebook.com\/DroneAssociationTH\/","article_published_time":"2025-05-11T02:44:06+00:00","og_image":[{"width":1239,"height":460,"url":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png","type":"image\/png"}],"author":"admin NU","twitter_card":"summary_large_image","twitter_misc":{"Escrito por":"admin NU","Tiempo de lectura":"3 minutos"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#article","isPartOf":{"@id":"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/"},"author":{"name":"admin NU","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c"},"headline":"Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool 
Use","datePublished":"2025-05-11T02:44:06+00:00","mainEntityOfPage":{"@id":"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/"},"wordCount":702,"commentCount":0,"publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"image":{"@id":"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png","articleSection":["AI","Committee","News","Uncategorized"],"inLanguage":"es","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/","url":"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/","name":"Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use - 
YouZum","isPartOf":{"@id":"https:\/\/yousum.gpucore.co\/#website"},"primaryImageOfPage":{"@id":"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#primaryimage"},"image":{"@id":"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png","datePublished":"2025-05-11T02:44:06+00:00","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","breadcrumb":{"@id":"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#breadcrumb"},"inLanguage":"es","potentialAction":[{"@type":"ReadAction","target":["https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/"]}]},{"@type":"ImageObject","inLanguage":"es","@id":"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#primaryimage","url":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png","width":1239,"height":460},{"@type":"BreadcrumbList","@i
d":"https:\/\/youzum.net\/microsoft-researchers-introduce-artist-a-reinforcement-learning-framework-that-equips-llms-with-agentic-reasoning-and-dynamic-tool-use\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/youzum.net\/"},{"@type":"ListItem","position":2,"name":"Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use"}]},{"@type":"WebSite","@id":"https:\/\/yousum.gpucore.co\/#website","url":"https:\/\/yousum.gpucore.co\/","name":"YouSum","description":"","publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/yousum.gpucore.co\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"es"},{"@type":"Organization","@id":"https:\/\/yousum.gpucore.co\/#organization","name":"Drone Association Thailand","url":"https:\/\/yousum.gpucore.co\/","logo":{"@type":"ImageObject","inLanguage":"es","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","width":300,"height":300,"caption":"Drone Association Thailand"},"image":{"@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DroneAssociationTH\/"]},{"@type":"Person","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c","name":"admin NU","image":{"@type":"ImageObject","inLanguage":"es","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","caption":"admin 
NU"},"url":"https:\/\/youzum.net\/es\/members\/adminnu\/"}]}},"rttpg_featured_image_url":{"full":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png",1239,460,false],"landscape":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png",1239,460,false],"portraits":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png",1239,460,false],"thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ-150x150.png",150,150,true],"medium":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ-300x111.png",300,111,true],"large":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ-1024x380.png",1024,380,true],"1536x1536":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png",1239,460,false],"2048x2048":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ.png",1239,460,false],"trp-custom-la
nguage-flag":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ-18x7.png",18,7,true],"woocommerce_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ-300x300.png",300,300,true],"woocommerce_single":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ-600x223.png",600,223,true],"woocommerce_gallery_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXcAH9IJtJJCZ52yhB_maAdNn70-IEQK1MgoI73RxO3jwudF1vZ3XHJr1TVoFMXNLYg4NfcCz9d3vlqVYr6f1G1JVGzGA54qDpGK9aR2hSqtx_2kaBrxpBcJI04TpPWDg_iNjrq8-2C86aZ-100x100.png",100,100,true]},"rttpg_author":{"display_name":"admin NU","author_link":"https:\/\/youzum.net\/es\/members\/adminnu\/"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/youzum.net\/es\/category\/ai-club\/\" rel=\"category tag\">AI<\/a> <a href=\"https:\/\/youzum.net\/es\/category\/committee\/\" rel=\"category tag\">Committee<\/a> <a href=\"https:\/\/youzum.net\/es\/category\/news\/\" rel=\"category tag\">News<\/a> <a href=\"https:\/\/youzum.net\/es\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","rttpg_excerpt":"LLMs have made impressive gains in complex reasoning, primarily through innovations in architecture, scale, and training approaches like RL. RL enhances LLMs by using reward signals to guide the model towards more effective reasoning strategies, resulting in longer and more coherent thought processes that adapt dynamically to a task\u2019s complexity. 
Despite this, most RL-enhanced LLMs&hellip;","_links":{"self":[{"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/posts\/11834","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/comments?post=11834"}],"version-history":[{"count":0,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/posts\/11834\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/media\/11835"}],"wp:attachment":[{"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/media?parent=11834"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/categories?post=11834"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/youzum.net\/es\/wp-json\/wp\/v2\/tags?post=11834"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}