{"id":92592,"date":"2026-05-24T16:58:45","date_gmt":"2026-05-24T16:58:45","guid":{"rendered":"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/"},"modified":"2026-05-24T16:58:45","modified_gmt":"2026-05-24T16:58:45","slug":"microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5","status":"publish","type":"post","link":"https:\/\/youzum.net\/zh\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/","title":{"rendered":"Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4\u2019s 33.5%"},"content":{"rendered":"<p class=\"wp-block-paragraph\">Most web agents today drive a browser one action at a time. The model receives the current page state \u2014 as a screenshot or DOM text \u2014 and predicts the next click, keypress, or scroll. This action-at-a-time design made sense when language models had limited reasoning ability. As models have become more capable at writing and debugging code, that rigid loop has become a constraint rather than a structure that helps.<\/p>\n<p class=\"wp-block-paragraph\">Microsoft Research\u2019s AI Frontiers lab built a different approach. Their new open-source framework, <strong>Webwright<\/strong>, gives the agent a terminal instead of a stateful browser session. The agent writes Playwright code to control browsers, runs bash commands, inspects logs, and iteratively refines scripts. Playwright is an open-source browser automation library, also from Microsoft, that supports programmatic control of Chromium, Firefox, and WebKit browsers. 
<\/p>\n<h2 class=\"wp-block-heading\"><strong>What Webwright Does Differently<\/strong><\/h2>\n<p class=\"wp-block-paragraph\"><strong>Webwright<\/strong> separates the agent from the browser and treats the browser as something the agent can launch, inspect, and discard while developing a program. The persistent artifact is not the browser session but the code and logs in the local workspace. <\/p>\n<p class=\"wp-block-paragraph\">This is the same model a developer uses when writing an RPA (Robotic Process Automation) script. Instead of manually clicking through a site each time, they write a script once. That script can be rerun, adapted, and shared. Webwright applies this to LLM-powered agents.<\/p>\n<p class=\"wp-block-paragraph\">The system has <strong>three core components:<\/strong> a Runner, a Model Endpoint, and a terminal Environment. The runner is about 150 lines of code, the model interface about 550 lines, and the environment about 300 lines. There is no multi-agent orchestration or complex planning hierarchy \u2014 just a single agent loop.<\/p>\n<p class=\"wp-block-paragraph\">All intermediate code, logs, screenshots, and results are stored in the workspace, making each run easy to inspect. 
<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1694\" height=\"1258\" data-attachment-id=\"80079\" data-permalink=\"https:\/\/www.marktechpost.com\/2026\/05\/24\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/screenshot-2026-05-24-at-1-42-33-am-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1.png\" data-orig-size=\"1694,1258\" data-comments-opened=\"0\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\",\"alt\":\"\"}' data-image-title=\"Screenshot 2026-05-24 at 1.42.33\u202fAM\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-1024x760.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1.png\" alt=\"\" class=\"wp-image-80079\" \/><figcaption class=\"wp-element-caption\">https:\/\/www.microsoft.com\/en-us\/research\/articles\/webwright-a-terminal-is-all-you-need-for-web-agents\/<\/figcaption><\/figure>\n<\/div>\n<h2 class=\"wp-block-heading\"><strong>The Agent Loop<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">The Runner sends the current context to the model. The model returns a thinking block and a shell command. That command runs in the Environment, which returns terminal output, logs, screenshots, or error tracebacks. 
These observations go back into context, and the loop continues.<\/p>\n<p class=\"wp-block-paragraph\">Rather than issuing one primitive action at a time, a coding agent can naturally express multi-step interactions \u2014 such as selecting a date or filling out an entire form \u2014 as a compact program. Loops, functions, and abstractions allow the agent to generalize across similar tasks without repeatedly predicting similar sequences of low-level steps.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Two Engineering Challenges<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">The design raises two core issues: premature \u2018done\u2019 signals and context explosion. With open-ended bash actions, the model must self-report completion, and it often claims success without actually finishing. To address this, the researchers added a gate: before emitting <code>done: true<\/code>, the agent must generate a self-reflection config, run a final script in a fresh folder with logs and screenshots, and pass its own self-reflection judgement, which outputs success or failure. If the check fails, the flag is dropped and the agent retries.<\/p>\n<p class=\"wp-block-paragraph\">Long coding trajectories also quickly exceed context limits, so the history is compacted into a single summary every 20 steps.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Benchmark Results<\/strong><\/h2>\n<p class=\"wp-block-paragraph\"><strong>Webwright was evaluated on two benchmarks: Online-Mind2Web and Odysseys.<\/strong><\/p>\n<p class=\"wp-block-paragraph\">Online-Mind2Web contains 300 tasks across 136 widely used sites and uses an automated LLM-as-a-Judge evaluation framework. With GPT-5.4 and a 100-step budget, Webwright achieves 86.67% overall accuracy, the highest among all open-source harness recipes in the AutoEval category of the Online-Mind2Web benchmark. Claude Opus 4.7 reached 84.7% overall but performed better on hard tasks at N=100 steps: 80.5% versus 76.6% for GPT-5.4. 
<\/p>\n<p class=\"wp-block-paragraph\">They also reproduced a GPT-5.4 baseline in a conventional screenshot-based agent setting, where the model predicts x,y coordinates for clicks and typing actions. Using the same underlying model, Webwright achieves substantial gains across all three difficulty categories, highlighting the benefit of the code-driven terminal-based approach over step-by-step coordinate prediction.<\/p>\n<p class=\"wp-block-paragraph\">Odysseys evaluates long-horizon browsing tasks spanning multiple websites. Tasks average 272.3 words of instructions. In the April 2026 leaderboard, the best-performing model was Opus 4.6, with a top score of 44.5. Webwright powered by GPT-5.4 reaches 60.1%, a 35.1% relative improvement over the previous state of the art. Compared to the base GPT-5.4 performance of 33.5%, this corresponds to a 79.4% relative improvement \u2014 or 26.6 absolute points.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Cost Analysis<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">Claude Opus 4.7 is more efficient in the number of steps to solve each task (mean 21.9 steps) compared to GPT-5.4 (mean 26.3 steps). However, Claude Opus 4.7 is priced significantly higher compared to GPT-5.4 ($5 vs. $2.50 per 1M input tokens, and $25 vs. $15.00 per 1M output tokens, April 2026), which makes the average per-task cost higher compared to GPT-5.4 ($2.37 vs. $6.09). The first 50 steps deliver 82% accuracy, and the next 50 steps deliver 3\u20134 additional points. <\/p>\n<h2 class=\"wp-block-heading\"><strong>Small Model Performance<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">The research team also tested Qwen3.5-9B on the hard split of Online-Mind2Web. When tasks are augmented with pre-built reusable tool scripts, Qwen3.5-9B achieves 66.2% on Online-Mind2Web websites with more than five tools. This shows that smaller, lower-cost models can handle complex web tasks when paired with a pre-built tool library. 
<\/p>\n<h2 class=\"wp-block-heading\"><strong>Marktechpost\u2019s Visual Explainer<\/strong><\/h2>\n<div>\n<div class=\"wh\">\n  <span class=\"wh-t\">Webwright<\/span><br \/>\n  <span class=\"wh-b\">Quick Start Guide<\/span>\n<\/div>\n<div class=\"wo\">\n<div class=\"wt\">\n<p>  <!-- SLIDE 1 --><\/p>\n<div class=\"ws\">\n    <span class=\"wsl\">01 \/ 05 \u2014 Overview<\/span><br \/>\n    <span class=\"wst\">What Is Webwright?<\/span><br \/>\n    <span class=\"wd\">Webwright is an open-source, terminal-native web agent framework from <b>Microsoft Research<\/b>. Instead of predicting one browser click at a time, the agent writes <b>Playwright<\/b> code, runs bash commands, and stores reusable scripts in a local workspace.<\/span>\n<div class=\"wdiv\"><\/div>\n<ul class=\"wl\">\n<li><b>~1,000 lines<\/b> of harness code across 3 modules \u2014 no hidden orchestration<\/li>\n<li><b>Single agent loop<\/b>: Runner, Model Endpoint, and terminal Environment<\/li>\n<li><b>86.7%<\/b> on Online-Mind2Web \u00a0|\u00a0 <b>60.1%<\/b> on Odysseys with GPT-5.4<\/li>\n<li>Backends: <b>OpenAI, Anthropic, OpenRouter<\/b><\/li>\n<li>Scripts reusable in <b>Claude Code, Codex, OpenClaw<\/b><\/li>\n<\/ul>\n<div class=\"wc\">\n<pre><span class=\"cm\"># GitHub repository<\/span>\ngithub.com\/microsoft\/Webwright<\/pre>\n<\/div><\/div>\n<p>  <!-- SLIDE 2 --><\/p>\n<div class=\"ws\">\n    <span class=\"wsl\">02 \/ 05 \u2014 Prerequisites<\/span><br \/>\n    <span class=\"wst\">What You Need Before Installing<\/span><br \/>\n    <span class=\"wd\">Confirm the following are ready before running any install commands.<\/span>\n<div class=\"wdiv\"><\/div>\n<ul class=\"wl\">\n<li><b>Python 3.10+<\/b> \u2014 required minimum runtime<\/li>\n<li><b>Chromium<\/b> \u2014 installed via Playwright in the next step<\/li>\n<li><b>API key<\/b> \u2014 OpenAI, Anthropic, or OpenRouter<\/li>\n<li><b>Git<\/b> \u2014 to clone the repository<\/li>\n<\/ul>\n<div class=\"wc\">\n<pre><span class=\"cm\"># Check your 
Python version<\/span>\n<span class=\"kw\">python<\/span> --version\n<span class=\"cm\"># Must return Python 3.10 or higher<\/span><\/pre>\n<\/div><\/div>\n<p>  <!-- SLIDE 3 --><\/p>\n<div class=\"ws\">\n    <span class=\"wsl\">03 \/ 05 \u2014 Installation<\/span><br \/>\n    <span class=\"wst\">Clone and Install Webwright<\/span><br \/>\n    <span class=\"wd\">Clone the repo, install in editable mode, then install Chromium for Playwright browser control.<\/span>\n<div class=\"wc\">\n<pre><span class=\"cm\"># 1. Clone the repository<\/span>\n<span class=\"kw\">git<\/span> clone https:\/\/github.com\/microsoft\/Webwright\n<span class=\"kw\">cd<\/span> Webwright\n\n<span class=\"cm\"># 2. Install the package in editable mode<\/span>\n<span class=\"kw\">pip<\/span> install -e .\n\n<span class=\"cm\"># 3. Install Chromium for Playwright<\/span>\n<span class=\"kw\">playwright<\/span> install chromium<\/pre>\n<\/div>\n<p>    <span class=\"wd\">The <b>-e<\/b> flag means local source edits apply immediately without reinstalling.<\/span>\n  <\/p><\/div>\n<p>  <!-- SLIDE 4 --><\/p>\n<div class=\"ws\">\n    <span class=\"wsl\">04 \/ 05 \u2014 Running a Task<\/span><br \/>\n    <span class=\"wst\">Run Your First Web Task<\/span><br \/>\n    <span class=\"wd\">Export your API key, then pass a task instruction and start URL to the CLI.<\/span>\n<div class=\"wc\">\n<pre><span class=\"cm\"># Export your key<\/span>\n<span class=\"kw\">export<\/span> OPENAI_API_KEY=<span class=\"st\">\"sk-...\"<\/span>\n<span class=\"kw\">export<\/span> ANTHROPIC_API_KEY=<span class=\"st\">\"sk-ant-...\"<\/span>\n\n<span class=\"cm\"># Run a task<\/span>\n<span class=\"kw\">python<\/span> -m webwright.run.cli \n  -c base.yaml -c model_openai.yaml \n  -t <span class=\"st\">\"Find cheapest economy flight SEA to JFK on 2026-05-15\"<\/span> \n  --start-url https:\/\/www.google.com\/flights \n  --task-id demo_openai \n  -o outputs\/default<\/pre>\n<\/div>\n<table 
class=\"wtb\">\n<thead>\n<tr>\n<th>Flag<\/th>\n<th>Description<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>-c<\/td>\n<td>Config file from src\/webwright\/config\/ \u2014 stackable<\/td>\n<\/tr>\n<tr>\n<td>-t<\/td>\n<td>Task instruction in plain English<\/td>\n<\/tr>\n<tr>\n<td>\u2013start-url<\/td>\n<td>Initial URL for the browser session<\/td>\n<\/tr>\n<tr>\n<td>\u2013task-id<\/td>\n<td>Output subfolder name<\/td>\n<\/tr>\n<tr>\n<td>-o<\/td>\n<td>Root output directory for logs and scripts<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/div>\n<p>  <!-- SLIDE 5 --><\/p>\n<div class=\"ws\">\n    <span class=\"wsl\">05 \/ 05 \u2014 Claude Code Integration<\/span><br \/>\n    <span class=\"wst\">Use Webwright as a Claude Code Skill<\/span><br \/>\n    <span class=\"wd\">Webwright ships a built-in Claude Code skill. No separate LLM API key is needed beyond your Claude Code subscription. Claude Code reads PNG screenshots natively.<\/span>\n<div class=\"wc\">\n<pre><span class=\"cm\"># Project-scoped (inside this repo only)<\/span>\n<span class=\"kw\">mkdir<\/span> -p .claude\/skills .claude\/commands\n<span class=\"kw\">ln<\/span> -s \"$PWD\/skills\/webwright\" .claude\/skills\/webwright\n<span class=\"kw\">ln<\/span> -s \"$PWD\/skills\/webwright\/commands\" .claude\/commands\/webwright\n\n<span class=\"cm\"># User-scoped (all projects)<\/span>\n<span class=\"kw\">mkdir<\/span> -p ~\/.claude\/skills ~\/.claude\/commands\n<span class=\"kw\">ln<\/span> -s \"$PWD\/skills\/webwright\" ~\/.claude\/skills\/webwright\n<span class=\"kw\">ln<\/span> -s \"$PWD\/skills\/webwright\/commands\" ~\/.claude\/commands\/webwright<\/pre>\n<\/div>\n<p>    <span class=\"wd\">Restart Claude Code after installing, then use slash commands:<\/span><\/p>\n<div class=\"wc\">\n<pre><span class=\"cm\"># One-shot task<\/span>\n\/webwright:run search Google Flights SEA to JFK 2026-05-15\n\n<span class=\"cm\"># Reusable parameterized CLI tool<\/span>\n\/webwright:craft search a ticket from LAX to SFO depart 
June 7<\/pre>\n<\/div><\/div>\n<\/div>\n<\/div>\n<p><!-- FOOTER --><\/p>\n<div class=\"wf\">\n  <span class=\"wf-tag\">\u00a9 <a href=\"https:\/\/www.marktechpost.com\/\" target=\"_blank\">Marktechpost<\/a> \u2014 AI &amp; ML Research for Practitioners<\/span><br \/>\n  <span class=\"wf-src\">Source: github.com\/microsoft\/Webwright<\/span>\n<\/div>\n<\/div>\n<h2 class=\"wp-block-heading\"><strong>Key Takeaways<\/strong><\/h2>\n<ul class=\"wp-block-list\">\n<li>Webwright uses a terminal loop where the agent writes and runs Playwright code instead of predicting one browser action at a time.<\/li>\n<li>With GPT-5.4, Webwright reached 86.7% on Online-Mind2Web (100-step budget) and 60.1% on Odysseys, 26.6 points above the base GPT-5.4 score of 33.5%.<\/li>\n<li>The harness is ~1,000 lines across three modules with no multi-agent orchestration.<\/li>\n<li>Qwen3.5-9B reached 66.2% on the hard split of Online-Mind2Web when augmented with pre-built tool scripts.<\/li>\n<li>Task scripts are packaged as reusable CLIs, shareable across Claude Code, Codex, and OpenClaw.<\/li>\n<\/ul>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p class=\"wp-block-paragraph\">Check out\u00a0the\u00a0<strong><a href=\"https:\/\/github.com\/microsoft\/Webwright\" target=\"_blank\" rel=\"noreferrer noopener\">Repo<\/a> <\/strong>and<strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/articles\/webwright-a-terminal-is-all-you-need-for-web-agents\/\" target=\"_blank\" rel=\"noreferrer 
Technical details<\/a>">
noopener\">Technical details<\/a>.<\/strong><\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2026\/05\/24\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/\">Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4\u2019s 33.5%<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"","protected":false},"author":2,"featured_media":92593,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"pmpro_default_level":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_pvb_checkbox_block_on_post":false,"footnotes":""},"categories":[52,5,7,1],"tags":[],"class_list":["post-92592","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-club","category-committee","category-news","category-uncategorized","pmpro-has-access"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4\u2019s 33.5% - YouZum<\/title>\n<meta name=\"description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/youzum.net\/zh\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/\" \/>\n<meta property=\"og:locale\" content=\"zh_CN\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4\u2019s 33.5% - YouZum\" \/>\n<meta property=\"og:description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta property=\"og:url\" content=\"https:\/\/youzum.net\/zh\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/\" \/>\n<meta property=\"og:site_name\" content=\"YouZum\" 
\/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DroneAssociationTH\/\" \/>\n<meta property=\"article:published_time\" content=\"2026-05-24T16:58:45+00:00\" \/>\n<meta name=\"author\" content=\"admin NU\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"\u4f5c\u8005\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin NU\" \/>\n\t<meta name=\"twitter:label2\" content=\"\u9884\u8ba1\u9605\u8bfb\u65f6\u95f4\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 \u5206\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/\"},\"author\":{\"name\":\"admin NU\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\"},\"headline\":\"Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4\u2019s 
33.5%\",\"datePublished\":\"2026-05-24T16:58:45+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/\"},\"wordCount\":1248,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x.png\",\"articleSection\":[\"AI\",\"Committee\",\"News\",\"Uncategorized\"],\"inLanguage\":\"zh-Hans\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/\",\"url\":\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/\",\"name\":\"Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4\u2019s 33.5% - 
YouZum\",\"isPartOf\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x.png\",\"datePublished\":\"2026-05-24T16:58:45+00:00\",\"description\":\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\",\"breadcrumb\":{\"@id\":\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#breadcrumb\"},\"inLanguage\":\"zh-Hans\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#primaryimage\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x.png\",\"width\":1694,\"height\":1258},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":
1,\"name\":\"Home\",\"item\":\"https:\/\/youzum.net\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4\u2019s 33.5%\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/yousum.gpucore.co\/#website\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"name\":\"YouSum\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/yousum.gpucore.co\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"zh-Hans\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\",\"name\":\"Drone Association Thailand\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"width\":300,\"height\":300,\"caption\":\"Drone Association Thailand\"},\"image\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/DroneAssociationTH\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\",\"name\":\"admin NU\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"caption\":\"admin 
NU\"},\"url\":\"https:\/\/youzum.net\/zh\/members\/adminnu\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4\u2019s 33.5% - YouZum","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/youzum.net\/zh\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/","og_locale":"zh_CN","og_type":"article","og_title":"Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4\u2019s 33.5% - YouZum","og_description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","og_url":"https:\/\/youzum.net\/zh\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/","og_site_name":"YouZum","article_publisher":"https:\/\/www.facebook.com\/DroneAssociationTH\/","article_published_time":"2026-05-24T16:58:45+00:00","author":"admin NU","twitter_card":"summary_large_image","twitter_misc":{"\u4f5c\u8005":"admin NU","\u9884\u8ba1\u9605\u8bfb\u65f6\u95f4":"7 
\u5206"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#article","isPartOf":{"@id":"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/"},"author":{"name":"admin NU","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c"},"headline":"Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4\u2019s 33.5%","datePublished":"2026-05-24T16:58:45+00:00","mainEntityOfPage":{"@id":"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/"},"wordCount":1248,"commentCount":0,"publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"image":{"@id":"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x.png","articleSection":["AI","Committee","News","Uncategorized"],"inLanguage":"zh-Hans","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/","url":"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-
33-5\/","name":"Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4\u2019s 33.5% - YouZum","isPartOf":{"@id":"https:\/\/yousum.gpucore.co\/#website"},"primaryImageOfPage":{"@id":"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#primaryimage"},"image":{"@id":"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x.png","datePublished":"2026-05-24T16:58:45+00:00","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","breadcrumb":{"@id":"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#breadcrumb"},"inLanguage":"zh-Hans","potentialAction":[{"@type":"ReadAction","target":["https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/"]}]},{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5\/#primaryimage","url":"https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x.png","width":1694,"height":1258},{"@type":"BreadcrumbList","@id":"https:\/\/youzum.net\/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-
33-5\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/youzum.net\/"},{"@type":"ListItem","position":2,"name":"Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4\u2019s 33.5%"}]},{"@type":"WebSite","@id":"https:\/\/yousum.gpucore.co\/#website","url":"https:\/\/yousum.gpucore.co\/","name":"YouSum","description":"","publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/yousum.gpucore.co\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"zh-Hans"},{"@type":"Organization","@id":"https:\/\/yousum.gpucore.co\/#organization","name":"Drone Association Thailand","url":"https:\/\/yousum.gpucore.co\/","logo":{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","width":300,"height":300,"caption":"Drone Association Thailand"},"image":{"@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DroneAssociationTH\/"]},{"@type":"Person","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c","name":"admin NU","image":{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","caption":"admin 
NU"},"url":"https:\/\/youzum.net\/zh\/members\/adminnu\/"}]}},"rttpg_featured_image_url":{"full":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x.png",1694,1258,false],"landscape":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x.png",1694,1258,false],"portraits":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x.png",1694,1258,false],"thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x-150x150.png",150,150,true],"medium":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x-300x223.png",300,223,true],"large":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x-1024x760.png",1024,760,true],"1536x1536":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x-1536x1141.png",1536,1141,true],"2048x2048":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x.png",1694,1258,false],"trp-custom-language-flag":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x-16x12.png",16,12,true],"woocommerce_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x-300x300.png",300,300,true],"woocommerce_single":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x-600x446.png",600,446,true],"woocommerce_gallery_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-24-at-1.42.33-AM-1-4QAJ0x-100x100.png",100,100,true]},"rttpg_author":{"display_name":"admin NU","author_link":"https:\/\/youzum.net\/zh\/members\/adminnu\/"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/youzum.net\/zh\/category\/ai-club\/\" rel=\"category 
tag\">AI<\/a> <a href=\"https:\/\/youzum.net\/zh\/category\/committee\/\" rel=\"category tag\">Committee<\/a> <a href=\"https:\/\/youzum.net\/zh\/category\/news\/\" rel=\"category tag\">News<\/a> <a href=\"https:\/\/youzum.net\/zh\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","rttpg_excerpt":"Most web agents today drive a browser one action at a time. The model receives the current page state \u2014 as a screenshot or DOM text \u2014 and predicts the next click, keypress, or scroll. This action-at-a-time design made sense when language models had limited reasoning ability. As models have become more capable at writing&hellip;","_links":{"self":[{"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/posts\/92592","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/comments?post=92592"}],"version-history":[{"count":0,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/posts\/92592\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/media\/92593"}],"wp:attachment":[{"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/media?parent=92592"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/categories?post=92592"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/youzum.net\/zh\/wp-json\/wp\/v2\/tags?post=92592"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}