{"id":15453,"date":"2025-05-29T03:39:28","date_gmt":"2025-05-29T03:39:28","guid":{"rendered":"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/"},"modified":"2025-05-29T03:39:28","modified_gmt":"2025-05-29T03:39:28","slug":"this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency","status":"publish","type":"post","link":"https:\/\/youzum.net\/de\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/","title":{"rendered":"This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost Efficiency"},"content":{"rendered":"<p>Web navigation focuses on teaching machines how to interact with websites to perform tasks such as searching for information, shopping, or booking services. Building a capable web navigation agent is a complex task because it requires understanding the structure of websites, interpreting user goals, and making a series of decisions across multiple steps. These tasks are further complicated by the need for agents to adapt in dynamic web environments, where content can change frequently and where multimodal information, such as text and images, must be understood together.<\/p>\n<p>A key problem in web navigation is the absence of reliable and detailed reward models that can guide agents in real-time. Existing methods primarily rely on multimodal large language models (MLLMs) like GPT-4o and GPT-4o-mini as evaluators, which are expensive, slow, and often inaccurate, especially when handling long sequences of actions in multi-step tasks. These models use prompting-based evaluation or binary success\/failure feedback but fail to provide step-level guidance, often leading to errors such as repeated actions or missing critical steps like clicking specific buttons or filling form fields. 
This limitation reduces the practicality of deploying web agents in real-world scenarios, where efficiency, accuracy, and cost-effectiveness are crucial.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter is-resized\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A?key=0qlBNS6aPrXuXvTinSZ9vA\" alt=\"\"\/><\/figure>\n<\/div>\n<p>The research team from Yonsei University and Carnegie Mellon University introduced WEB-SHEPHERD, a process reward model specifically designed for web navigation tasks. WEB-SHEPHERD is the first model to evaluate web navigation agents at the step level, using structured checklists to guide assessments. The researchers also developed the WEBPRM COLLECTION, a dataset of 40,000 step-level annotated web navigation tasks, and the WEBREWARDBENCH benchmark for evaluating PRMs. These resources were designed to enable WEB-SHEPHERD to provide detailed feedback by breaking down complex tasks into smaller, measurable subgoals.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter is-resized\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXcXzwodSskXd0Zr1txD9jK5DcsUsa2apun-NiFBl3fK2Kwyc1zHREgOsV4oJeioGnhvXfphda-aLtlFsMBLn7SNZ07UDtXWUVOSGSfHKcV7DJ1vuIJC3iz89W9wGhUUrbBlmvev?key=0qlBNS6aPrXuXvTinSZ9vA\" alt=\"\"\/><\/figure>\n<\/div>\n<p>WEB-SHEPHERD works by generating a checklist for each task based on the user\u2019s instruction, such as \u201cSearch for product\u201d or \u201cClick on product page,\u201d and evaluates the agent\u2019s progress against these subgoals. The model uses next-token prediction to generate feedback and assigns rewards based on checklist completion. This process enables WEB-SHEPHERD to assess the correctness of each step with fine-grained judgment. 
The model estimates the reward for each step by combining the probabilities it assigns to the \u201cYes,\u201d \u201cNo,\u201d and \u201cIn Progress\u201d tokens for each checklist item and then averaging these across the checklist. This scoring gives agents targeted feedback on their progress, which helps them navigate complex websites.<\/p>\n<p>The researchers report that WEB-SHEPHERD outperforms the baselines they evaluated. On the WEBREWARDBENCH benchmark, WEB-SHEPHERD achieved a Mean Reciprocal Rank (MRR) score of 87.6% and a trajectory accuracy of 55% in the text-only setting, compared to GPT-4o-mini\u2019s 47.5% MRR and 0% trajectory accuracy without checklists. When tested in WebArena-lite using GPT-4o-mini as the policy model, WEB-SHEPHERD achieved a 34.55% success rate, which is 10.9 points higher than using GPT-4o-mini as the evaluator, while also being ten times more cost-efficient. In ablation studies, the researchers observed that WEB-SHEPHERD\u2019s performance dropped significantly when checklists or feedback were removed, which indicates that both matter for accurate reward assignment. They also found that multimodal input, surprisingly, did not always improve performance and sometimes introduced noise.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter is-resized\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXfeWbeKiwvxkSvTFQLDPyhZ9vv45OGv-i-GejMMUUGZejdA8ZRAVAH4IWi8RcWO9knktyMi7vNnL2BNfn483XNAQHUPXg5DCN5FxuzkK7IwOUbuPBcncaSInkhstrhbWBF6P5twqA?key=0qlBNS6aPrXuXvTinSZ9vA\" alt=\"\"\/><\/figure>\n<\/div>\n<p>This research highlights the role of detailed process-level rewards in building reliable web agents. The team\u2019s work addresses a core challenge of web navigation (evaluating complex, multi-step actions) and offers a solution that is both scalable and cost-effective. 
With WEB-SHEPHERD, agents can now receive accurate feedback during navigation, enabling them to make better decisions and complete tasks more effectively.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<p><strong>Check out the <a href=\"https:\/\/arxiv.org\/abs\/2505.15277\" target=\"_blank\" rel=\"noreferrer noopener\">Paper<\/a> and <a href=\"https:\/\/github.com\/kyle8581\/Web-Shepherd\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub Page<\/a>.<\/strong>\u00a0All credit for this research goes to the researchers of this project.<\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2025\/05\/28\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/\">This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost Efficiency<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Web navigation focuses on teaching machines how to interact with websites to perform tasks such as searching for information, shopping, or booking services. Building a capable web navigation agent is a complex task because it requires understanding the structure of websites, interpreting user goals, and making a series of decisions across multiple steps. 
These tasks are further complicated by the need for agents to adapt in dynamic web environments, where content can change frequently and where multimodal information, such as text and images, must be understood together. A key problem in web navigation is the absence of reliable and detailed reward models that can guide agents in real-time. Existing methods primarily rely on multimodal large language models (MLLMs) like GPT-4o and GPT-4o-mini as evaluators, which are expensive, slow, and often inaccurate, especially when handling long sequences of actions in multi-step tasks. These models use prompting-based evaluation or binary success\/failure feedback but fail to provide step-level guidance, often leading to errors such as repeated actions or missing critical steps like clicking specific buttons or filling form fields. This limitation reduces the practicality of deploying web agents in real-world scenarios, where efficiency, accuracy, and cost-effectiveness are crucial. The research team from Yonsei University and Carnegie Mellon University introduced WEB-SHEPHERD, a process reward model specifically designed for web navigation tasks. WEB-SHEPHERD is the first model to evaluate web navigation agents at the step level, using structured checklists to guide assessments. The researchers also developed the WEBPRM COLLECTION, a dataset of 40,000 step-level annotated web navigation tasks, and the WEBREWARDBENCH benchmark for evaluating PRMs. These resources were designed to enable WEB-SHEPHERD to provide detailed feedback by breaking down complex tasks into smaller, measurable subgoals. WEB-SHEPHERD works by generating a checklist for each task based on the user\u2019s instruction, such as \u201cSearch for product\u201d or \u201cClick on product page,\u201d and evaluates the agent\u2019s progress against these subgoals. The model uses next-token prediction to generate feedback and assigns rewards based on checklist completion. 
This process enables WEB-SHEPHERD to assess the correctness of each step with fine-grained judgment. The model estimates the reward for each step by combining the probabilities it assigns to the \u201cYes,\u201d \u201cNo,\u201d and \u201cIn Progress\u201d tokens for each checklist item and then averaging these across the checklist. This scoring gives agents targeted feedback on their progress, which helps them navigate complex websites. The researchers report that WEB-SHEPHERD outperforms the baselines they evaluated. On the WEBREWARDBENCH benchmark, WEB-SHEPHERD achieved a Mean Reciprocal Rank (MRR) score of 87.6% and a trajectory accuracy of 55% in the text-only setting, compared to GPT-4o-mini\u2019s 47.5% MRR and 0% trajectory accuracy without checklists. When tested in WebArena-lite using GPT-4o-mini as the policy model, WEB-SHEPHERD achieved a 34.55% success rate, which is 10.9 points higher than using GPT-4o-mini as the evaluator, while also being ten times more cost-efficient. In ablation studies, the researchers observed that WEB-SHEPHERD\u2019s performance dropped significantly when checklists or feedback were removed, which indicates that both matter for accurate reward assignment. They also found that multimodal input, surprisingly, did not always improve performance and sometimes introduced noise. This research highlights the role of detailed process-level rewards in building reliable web agents. The team\u2019s work addresses a core challenge of web navigation (evaluating complex, multi-step actions) and offers a solution that is both scalable and cost-effective. With WEB-SHEPHERD, agents can now receive accurate feedback during navigation, enabling them to make better decisions and complete tasks more effectively. Check out the Paper and GitHub Page.\u00a0All credit for this research goes to the researchers of this project. 
Also,\u00a0feel free to follow us on\u00a0Twitter\u00a0and don\u2019t forget to join our\u00a095k+ ML SubReddit and Subscribe to our Newsletter. The post This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost Efficiency appeared first on MarkTechPost.<\/p>","protected":false},"author":2,"featured_media":15454,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"pmpro_default_level":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_pvb_checkbox_block_on_post":false,"footnotes":""},"categories":[52,5,7,1],"tags":[],"class_list":["post-15453","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-club","category-committee","category-news","category-uncategorized","pmpro-has-access"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost Efficiency - YouZum<\/title>\n<meta name=\"description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta name=\"robots\" content=\"index, 
follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/youzum.net\/de\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/\" \/>\n<meta property=\"og:locale\" content=\"de_DE\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost Efficiency - YouZum\" \/>\n<meta property=\"og:description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta property=\"og:url\" content=\"https:\/\/youzum.net\/de\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/\" \/>\n<meta property=\"og:site_name\" content=\"YouZum\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DroneAssociationTH\/\" \/>\n<meta property=\"article:published_time\" content=\"2025-05-29T03:39:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1406\" \/>\n\t<meta property=\"og:image:height\" content=\"568\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"admin NU\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Verfasst von\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin NU\" \/>\n\t<meta name=\"twitter:label2\" content=\"Gesch\u00e4tzte Lesezeit\" \/>\n\t<meta name=\"twitter:data2\" content=\"3\u00a0Minuten\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/\"},\"author\":{\"name\":\"admin NU\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\"},\"headline\":\"This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost Efficiency\",\"datePublished\":\"2025-05-29T03:39:28+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/\"},\"wordCount\":652,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png\",\"articleSection\":[\"AI\",\"Committee\",\"News\",\"Uncategorized\"],\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/\",\"url\":\"https:\/\/youzum.net\/this-ai-paper-introduce
s-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/\",\"name\":\"This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost Efficiency - YouZum\",\"isPartOf\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png\",\"datePublished\":\"2025-05-29T03:39:28+00:00\",\"description\":\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\",\"breadcrumb\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#breadcrumb\"},\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#primaryimage\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2
025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png\",\"width\":1406,\"height\":568},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/youzum.net\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost Efficiency\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/yousum.gpucore.co\/#website\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"name\":\"YouSum\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/yousum.gpucore.co\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"de\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\",\"name\":\"Drone Association Thailand\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"width\":300,\"height\":300,\"caption\":\"Drone Association 
Thailand\"},\"image\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/DroneAssociationTH\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\",\"name\":\"admin NU\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"caption\":\"admin NU\"},\"url\":\"https:\/\/youzum.net\/de\/members\/adminnu\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost Efficiency - YouZum","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/youzum.net\/de\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/","og_locale":"de_DE","og_type":"article","og_title":"This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost Efficiency - 
YouZum","og_description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","og_url":"https:\/\/youzum.net\/de\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/","og_site_name":"YouZum","article_publisher":"https:\/\/www.facebook.com\/DroneAssociationTH\/","article_published_time":"2025-05-29T03:39:28+00:00","og_image":[{"width":1406,"height":568,"url":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png","type":"image\/png"}],"author":"admin NU","twitter_card":"summary_large_image","twitter_misc":{"Verfasst von":"admin NU","Gesch\u00e4tzte Lesezeit":"3\u00a0Minuten"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#article","isPartOf":{"@id":"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/"},"author":{"name":"admin NU","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c"},"headline":"This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost 
Efficiency","datePublished":"2025-05-29T03:39:28+00:00","mainEntityOfPage":{"@id":"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/"},"wordCount":652,"commentCount":0,"publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"image":{"@id":"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png","articleSection":["AI","Committee","News","Uncategorized"],"inLanguage":"de","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/","url":"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/","name":"This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost Efficiency - 
YouZum","isPartOf":{"@id":"https:\/\/yousum.gpucore.co\/#website"},"primaryImageOfPage":{"@id":"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#primaryimage"},"image":{"@id":"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png","datePublished":"2025-05-29T03:39:28+00:00","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","breadcrumb":{"@id":"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#breadcrumb"},"inLanguage":"de","potentialAction":[{"@type":"ReadAction","target":["https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/"]}]},{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#primaryimage","url":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png","width":1406,"height":568},{"@type":"BreadcrumbList","@id":"https:\/\/youzum.net\/this-ai-paper-introduces-web-shepherd-a-process-reward-mod
el-for-web-agents-with-40k-dataset-and-10x-cost-efficiency\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/youzum.net\/"},{"@type":"ListItem","position":2,"name":"This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10\u00d7 Cost Efficiency"}]},{"@type":"WebSite","@id":"https:\/\/yousum.gpucore.co\/#website","url":"https:\/\/yousum.gpucore.co\/","name":"YouSum","description":"","publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/yousum.gpucore.co\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"de"},{"@type":"Organization","@id":"https:\/\/yousum.gpucore.co\/#organization","name":"Drone Association Thailand","url":"https:\/\/yousum.gpucore.co\/","logo":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","width":300,"height":300,"caption":"Drone Association Thailand"},"image":{"@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DroneAssociationTH\/"]},{"@type":"Person","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c","name":"admin NU","image":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","caption":"admin 
NU"},"url":"https:\/\/youzum.net\/de\/members\/adminnu\/"}]}},"rttpg_featured_image_url":{"full":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png",1406,568,false],"landscape":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png",1406,568,false],"portraits":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png",1406,568,false],"thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16-150x150.png",150,150,true],"medium":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16-300x121.png",300,121,true],"large":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16-1024x414.png",1024,414,true],"1536x1536":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png",1406,568,false],"2048x2048":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16.png",1406,568,false
],"trp-custom-language-flag":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16-18x7.png",18,7,true],"woocommerce_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16-300x300.png",300,300,true],"woocommerce_single":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16-600x242.png",600,242,true],"woocommerce_gallery_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/05\/AD_4nXdFvQH2qmBM_u6BNMlbjwez3yUCWqogdQN0UePte_Iik0n_T2Z6scGMaO6PdZZH6AlwC1Xhnbz8NGCksv9OurqNMK_fJH_KAx7DmJDMPQrG7fJaWWrXbGqRh-SBA6JrtrU-Iw5b0A-rPjB16-100x100.png",100,100,true]},"rttpg_author":{"display_name":"admin NU","author_link":"https:\/\/youzum.net\/de\/members\/adminnu\/"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/youzum.net\/de\/category\/ai-club\/\" rel=\"category tag\">AI<\/a> <a href=\"https:\/\/youzum.net\/de\/category\/committee\/\" rel=\"category tag\">Committee<\/a> <a href=\"https:\/\/youzum.net\/de\/category\/news\/\" rel=\"category tag\">News<\/a> <a href=\"https:\/\/youzum.net\/de\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","rttpg_excerpt":"Web navigation focuses on teaching machines how to interact with websites to perform tasks such as searching for information, shopping, or booking services. Building a capable web navigation agent is a complex task because it requires understanding the structure of websites, interpreting user goals, and making a series of decisions across multiple steps. 
These tasks&hellip;","_links":{"self":[{"href":"https:\/\/youzum.net\/de\/wp-json\/wp\/v2\/posts\/15453","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/youzum.net\/de\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/youzum.net\/de\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/youzum.net\/de\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/youzum.net\/de\/wp-json\/wp\/v2\/comments?post=15453"}],"version-history":[{"count":0,"href":"https:\/\/youzum.net\/de\/wp-json\/wp\/v2\/posts\/15453\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/youzum.net\/de\/wp-json\/wp\/v2\/media\/15454"}],"wp:attachment":[{"href":"https:\/\/youzum.net\/de\/wp-json\/wp\/v2\/media?parent=15453"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/youzum.net\/de\/wp-json\/wp\/v2\/categories?post=15453"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/youzum.net\/de\/wp-json\/wp\/v2\/tags?post=15453"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}