{"id":27065,"date":"2025-07-24T06:24:32","date_gmt":"2025-07-24T06:24:32","guid":{"rendered":"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/"},"modified":"2025-07-24T06:24:32","modified_gmt":"2025-07-24T06:24:32","slug":"this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks","status":"publish","type":"post","link":"https:\/\/youzum.net\/it\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/","title":{"rendered":"This AI Paper Introduces PyVision: A Python-Centric Framework Where AI Writes Tools as It Thinks"},"content":{"rendered":"<p>Visual reasoning tasks challenge artificial intelligence models to interpret and process visual information using both perception and logical reasoning. These tasks span a wide range of applications, including medical diagnostics, visual math, symbolic puzzles, and image-based question answering. Success in this field requires more than object recognition\u2014it demands dynamic adaptation, abstraction, and contextual inference. Models must analyze images, identify relevant features, and often generate explanations or solutions that require a sequence of reasoning steps tied to the visual input.<\/p>\n<p>The limitation becomes evident when models are expected to apply reasoning or modify their strategies for varied visual tasks. Many current models lack flexibility, often defaulting to pattern matching or hardcoded routines. These systems struggle to break down unfamiliar problems or create solutions beyond their preset toolkits. They also fail when tasks involve abstract reasoning or require models to look beyond surface-level features in visual content. 
The lack of a system that can autonomously adapt and construct new tools for reasoning has become a significant bottleneck.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter is-resized\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-?key=FJhywyqmzgU368vWujCM-A\" alt=\"\" \/><\/figure>\n<\/div>\n<p>Previous models typically rely on fixed toolsets and rigid single-turn processing. Solutions like Visual ChatGPT, HuggingGPT, or ViperGPT integrate tools like segmentation or detection models, but they are constrained to predefined workflows. This setup limits creativity and adaptability. These models cannot modify or expand their toolset during a task. They process tasks linearly, which limits their usefulness in domains that require iterative reasoning. Multi-turn capabilities are either missing or severely limited, preventing models from engaging in deeper analytical reasoning.<\/p>\n<p>Researchers introduced PyVision to overcome these issues. Developed by teams from Shanghai AI Lab, Rice University, CUHK, NUS, and SII, this framework enables large multimodal language models (MLLMs) to autonomously create and execute Python-based tools tailored to specific visual reasoning problems. Unlike previous approaches, PyVision is not bound by static modules. It uses Python as its primary language and builds tools dynamically in a multi-turn loop. 
This allows the system to adapt its approach mid-task, enabling the model to make decisions, reflect on results, and refine its code or reasoning across several steps.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter is-resized\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXfqXIMYGKvX_OWfGwO9KQGPUoimUrJC_JOOb5P66aSqZuvqahN7UiBe65Adbrlnx7Yft40w1p0VlASCdBCqQKUMqbRukzM8WecU4hDd5Lu-XF3sILuL200Yh_rBDk_WtcLkr69kQg?key=FJhywyqmzgU368vWujCM-A\" alt=\"\" \/><\/figure>\n<\/div>\n<p>In practice, PyVision starts from a user query and the corresponding visual input. An MLLM such as GPT-4.1 or Claude-4.0-Sonnet generates Python code from the prompt, and that code is executed in an isolated environment. The results (textual, visual, or numerical) are fed back into the model. Using this feedback, the model can revise its plan, generate new code, and iterate until it produces a solution. The system supports cross-turn persistence, which means variable states are maintained between interactions, allowing sequential reasoning. PyVision includes internal safety features, such as process isolation and structured I\/O, ensuring robust performance even under complex reasoning loads. It uses Python libraries such as OpenCV, NumPy, and Pillow to perform operations like segmentation, OCR, image enhancement, and statistical analysis.<\/p>\n<p>Quantitative benchmarks validate PyVision\u2019s effectiveness. On the visual search benchmark V*, PyVision improved GPT-4.1\u2019s performance from 68.1% to 75.9%, a gain of 7.8 percentage points. On the symbolic visual reasoning benchmark VLMsAreBlind-mini, Claude-4.0-Sonnet\u2019s accuracy increased from 48.1% to 79.2%, a gain of 31.1 percentage points. Additional gains were observed on other tasks: +2.4% on MMMU and +2.5% on VisualPuzzles for GPT-4.1; +4.8% on MathVista and +8.3% on VisualPuzzles for Claude-4.0-Sonnet. 
The improvements vary depending on the underlying model\u2019s strengths\u2014models that excel in perception benefit more from PyVision in perception-heavy tasks, while reasoning-strong models gain more in abstract challenges. PyVision amplifies the base model\u2019s abilities rather than masking or replacing them.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter is-resized\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXcawMlo7e9UnvwtVok5Sia_P_g2G21D8qBFJS7upozvHtk7Pxgf39hIo7xQ4btQQJcK7Sp7y2qxovhr3zr7vTNUFFKO367pnBWXAJ_9IXHulPI5WWblumStAOgM_fsZFEwdU4L7?key=FJhywyqmzgU368vWujCM-A\" alt=\"\" \/><\/figure>\n<\/div>\n<p>This research highlights a substantial advancement in visual reasoning. PyVision addresses a fundamental limitation by enabling models to create problem-specific tools in real-time. The approach transforms static models into agentic systems capable of thoughtful, iterative problem-solving. By dynamically linking perception and reasoning, PyVision takes a critical step toward building intelligent, adaptable AI for complex real-world visual challenges.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out the<strong>\u00a0<mark><a href=\"https:\/\/arxiv.org\/abs\/2507.07998\" target=\"_blank\" rel=\"noreferrer noopener\">Paper<\/a>, <a href=\"https:\/\/github.com\/agents-x-project\/PyVision\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub Page<\/a> and <a href=\"https:\/\/agent-x.space\/pyvision\/\" target=\"_blank\" rel=\"noreferrer noopener\">Project<\/a><\/mark>.<\/strong>\u00a0All credit for this research goes to the researchers of this project.<\/p>\n
<p>The post <a href=\"https:\/\/www.marktechpost.com\/2025\/07\/23\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/\">This AI Paper Introduces PyVision: A Python-Centric Framework Where AI Writes Tools as It Thinks<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Visual reasoning tasks challenge artificial intelligence models to interpret and process visual information using both perception and logical reasoning. These tasks span a wide range of applications, including medical diagnostics, visual math, symbolic puzzles, and image-based question answering. Success in this field requires more than object recognition\u2014it demands dynamic adaptation, abstraction, and contextual inference. Models must analyze images, identify relevant features, and often generate explanations or solutions that require a sequence of reasoning steps tied to the visual input. The limitation becomes evident when models are expected to apply reasoning or modify their strategies for varied visual tasks. Many current models lack flexibility, often defaulting to pattern matching or hardcoded routines. These systems struggle to break down unfamiliar problems or create solutions beyond their preset toolkits. They also fail when tasks involve abstract reasoning or require models to look beyond surface-level features in visual content. The need for a system that can autonomously adapt and construct new tools for reasoning has become a significant bottleneck. Previous models typically rely on fixed toolsets and rigid single-turn processing. Solutions like Visual ChatGPT, HuggingGPT, or ViperGPT integrate tools like segmentation or detection models, but they are constrained to predefined workflows. This setup limits creativity and adaptability. 
These models cannot modify or expand their toolset during a task. They process tasks linearly, which limits their usefulness in domains that require iterative reasoning. Multi-turn capabilities are either missing or severely limited, preventing models from engaging in deeper analytical reasoning. Researchers introduced PyVision to overcome these issues. Developed by teams from Shanghai AI Lab, Rice University, CUHK, NUS, and SII, this framework enables large multimodal language models (MLLMs) to autonomously create and execute Python-based tools tailored to specific visual reasoning problems. Unlike previous approaches, PyVision is not bound by static modules. It uses Python as its primary language and builds tools dynamically in a multi-turn loop. This allows the system to adapt its approach mid-task, enabling the model to make decisions, reflect on results, and refine its code or reasoning across several steps. In practice, PyVision starts from a user query and the corresponding visual input. An MLLM such as GPT-4.1 or Claude-4.0-Sonnet generates Python code from the prompt, and that code is executed in an isolated environment. The results (textual, visual, or numerical) are fed back into the model. Using this feedback, the model can revise its plan, generate new code, and iterate until it produces a solution. The system supports cross-turn persistence, which means variable states are maintained between interactions, allowing sequential reasoning. PyVision includes internal safety features, such as process isolation and structured I\/O, ensuring robust performance even under complex reasoning loads. It uses Python libraries such as OpenCV, NumPy, and Pillow to perform operations like segmentation, OCR, image enhancement, and statistical analysis. Quantitative benchmarks validate PyVision\u2019s effectiveness. 
On the visual search benchmark V*, PyVision improved GPT-4.1\u2019s performance from 68.1% to 75.9%, a gain of 7.8 percentage points. On the symbolic visual reasoning benchmark VLMsAreBlind-mini, Claude-4.0-Sonnet\u2019s accuracy increased from 48.1% to 79.2%, a gain of 31.1 percentage points. Additional gains were observed on other tasks: +2.4% on MMMU and +2.5% on VisualPuzzles for GPT-4.1; +4.8% on MathVista and +8.3% on VisualPuzzles for Claude-4.0-Sonnet. The improvements vary depending on the underlying model\u2019s strengths\u2014models that excel in perception benefit more from PyVision in perception-heavy tasks, while reasoning-strong models gain more in abstract challenges. PyVision amplifies the base model\u2019s abilities rather than masking or replacing them. This research highlights a substantial advancement in visual reasoning. PyVision addresses a fundamental limitation by enabling models to create problem-specific tools in real-time. The approach transforms static models into agentic systems capable of thoughtful, iterative problem-solving. By dynamically linking perception and reasoning, PyVision takes a critical step toward building intelligent, adaptable AI for complex real-world visual challenges. Check out the\u00a0Paper, GitHub Page and Project.\u00a0All credit for this research goes to the researchers of this project. 
The post This AI Paper Introduces PyVision: A Python-Centric Framework Where AI Writes Tools as It Thinks appeared first on MarkTechPost.<\/p>","protected":false},"author":2,"featured_media":27066,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"pmpro_default_level":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_pvb_checkbox_block_on_post":false,"footnotes":""},"categories":[52,5,7,1],"tags":[],"class_list":["post-27065","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-club","category-committee","category-news","category-uncategorized","pmpro-has-access"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>This AI Paper Introduces PyVision: A Python-Centric Framework Where AI Writes Tools as It Thinks - YouZum<\/title>\n<meta name=\"description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, 
max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/youzum.net\/it\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/\" \/>\n<meta property=\"og:locale\" content=\"it_IT\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"This AI Paper Introduces PyVision: A Python-Centric Framework Where AI Writes Tools as It Thinks - YouZum\" \/>\n<meta property=\"og:description\" content=\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\" \/>\n<meta property=\"og:url\" content=\"https:\/\/youzum.net\/it\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/\" \/>\n<meta property=\"og:site_name\" content=\"YouZum\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DroneAssociationTH\/\" \/>\n<meta property=\"article:published_time\" content=\"2025-07-24T06:24:32+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-?key=FJhywyqmzgU368vWujCM-A\" \/>\n<meta name=\"author\" content=\"admin NU\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Scritto da\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin NU\" \/>\n\t<meta name=\"twitter:label2\" content=\"Tempo di lettura stimato\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minuti\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/\"},\"author\":{\"name\":\"admin NU\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\"},\"headline\":\"This AI Paper Introduces PyVision: A Python-Centric Framework Where AI Writes Tools as It Thinks\",\"datePublished\":\"2025-07-24T06:24:32+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/\"},\"wordCount\":712,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png\",\"articleSection\":[\"AI\",\"Committee\",\"News\",\"Uncategorized\"],\"inLanguage\":\"it-IT\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/\",\"url\":\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/\",\"name\":\"This AI Paper Introduces PyVision: A Python-Centric Framework 
Where AI Writes Tools as It Thinks - YouZum\",\"isPartOf\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png\",\"datePublished\":\"2025-07-24T06:24:32+00:00\",\"description\":\"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19\",\"breadcrumb\":{\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#breadcrumb\"},\"inLanguage\":\"it-IT\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"it-IT\",\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#primaryimage\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png\",\"width\":1472,\"height\":614},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-cent
ric-framework-where-ai-writes-tools-as-it-thinks\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/youzum.net\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"This AI Paper Introduces PyVision: A Python-Centric Framework Where AI Writes Tools as It Thinks\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/yousum.gpucore.co\/#website\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"name\":\"YouSum\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/yousum.gpucore.co\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"it-IT\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/yousum.gpucore.co\/#organization\",\"name\":\"Drone Association Thailand\",\"url\":\"https:\/\/yousum.gpucore.co\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"it-IT\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png\",\"width\":300,\"height\":300,\"caption\":\"Drone Association Thailand\"},\"image\":{\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/DroneAssociationTH\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c\",\"name\":\"admin 
NU\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"it-IT\",\"@id\":\"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"contentUrl\":\"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png\",\"caption\":\"admin NU\"},\"url\":\"https:\/\/youzum.net\/it\/members\/adminnu\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"This AI Paper Introduces PyVision: A Python-Centric Framework Where AI Writes Tools as It Thinks - YouZum","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/youzum.net\/it\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/","og_locale":"it_IT","og_type":"article","og_title":"This AI Paper Introduces PyVision: A Python-Centric Framework Where AI Writes Tools as It Thinks - YouZum","og_description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","og_url":"https:\/\/youzum.net\/it\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/","og_site_name":"YouZum","article_publisher":"https:\/\/www.facebook.com\/DroneAssociationTH\/","article_published_time":"2025-07-24T06:24:32+00:00","og_image":[{"url":"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-?key=FJhywyqmzgU368vWujCM-A","type":"","width":"","height":""}],"author":"admin NU","twitter_card":"summary_large_image","twitter_misc":{"Scritto da":"admin NU","Tempo di lettura 
stimato":"3 minuti"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#article","isPartOf":{"@id":"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/"},"author":{"name":"admin NU","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c"},"headline":"This AI Paper Introduces PyVision: A Python-Centric Framework Where AI Writes Tools as It Thinks","datePublished":"2025-07-24T06:24:32+00:00","mainEntityOfPage":{"@id":"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/"},"wordCount":712,"commentCount":0,"publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"image":{"@id":"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png","articleSection":["AI","Committee","News","Uncategorized"],"inLanguage":"it-IT","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/","url":"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/","name":"This AI Paper Introduces PyVision: A Python-Centric Framework Where AI Writes Tools as It Thinks - 
YouZum","isPartOf":{"@id":"https:\/\/yousum.gpucore.co\/#website"},"primaryImageOfPage":{"@id":"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#primaryimage"},"image":{"@id":"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#primaryimage"},"thumbnailUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png","datePublished":"2025-07-24T06:24:32+00:00","description":"\u0e01\u0e34\u0e08\u0e01\u0e23\u0e23\u0e21\u0e40\u0e01\u0e35\u0e48\u0e22\u0e27\u0e01\u0e31\u0e1a\u0e42\u0e14\u0e23\u0e19","breadcrumb":{"@id":"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#breadcrumb"},"inLanguage":"it-IT","potentialAction":[{"@type":"ReadAction","target":["https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/"]}]},{"@type":"ImageObject","inLanguage":"it-IT","@id":"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#primaryimage","url":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png","width":1472,"height":614},{"@type":"BreadcrumbList","@id":"https:\/\/youzum.net\/this-ai-paper-introduces-pyvision-a-python-centric-framework-where-ai-writes-tools-as-it-thinks\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"nam
e":"Home","item":"https:\/\/youzum.net\/"},{"@type":"ListItem","position":2,"name":"This AI Paper Introduces PyVision: A Python-Centric Framework Where AI Writes Tools as It Thinks"}]},{"@type":"WebSite","@id":"https:\/\/yousum.gpucore.co\/#website","url":"https:\/\/yousum.gpucore.co\/","name":"YouSum","description":"","publisher":{"@id":"https:\/\/yousum.gpucore.co\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/yousum.gpucore.co\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"it-IT"},{"@type":"Organization","@id":"https:\/\/yousum.gpucore.co\/#organization","name":"Drone Association Thailand","url":"https:\/\/yousum.gpucore.co\/","logo":{"@type":"ImageObject","inLanguage":"it-IT","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/2024\/11\/tranparent-logo.png","width":300,"height":300,"caption":"Drone Association Thailand"},"image":{"@id":"https:\/\/yousum.gpucore.co\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DroneAssociationTH\/"]},{"@type":"Person","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/97fa48242daf3908e4d9a5f26f4a059c","name":"admin NU","image":{"@type":"ImageObject","inLanguage":"it-IT","@id":"https:\/\/yousum.gpucore.co\/#\/schema\/person\/image\/","url":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","contentUrl":"https:\/\/youzum.net\/wp-content\/uploads\/avatars\/2\/1746849356-bpfull.png","caption":"admin 
NU"},"url":"https:\/\/youzum.net\/it\/members\/adminnu\/"}]}},"rttpg_featured_image_url":{"full":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png",1472,614,false],"landscape":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png",1472,614,false],"portraits":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png",1472,614,false],"thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17-150x150.png",150,150,true],"medium":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17-300x125.png",300,125,true],"large":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17-1024x427.png",1024,427,true],"1536x1536":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png",1472,614,false],"2048x2048":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17.png",1472,614,false],"trp-custom-language-f
lag":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17-18x8.png",18,8,true],"woocommerce_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17-300x300.png",300,300,true],"woocommerce_single":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17-600x250.png",600,250,true],"woocommerce_gallery_thumbnail":["https:\/\/youzum.net\/wp-content\/uploads\/2025\/07\/AD_4nXepL7DwRHMyLszpIf6BYSAWtgDD-KGWmtk__st0I3nNEdic4Xn0F5Et-B0wO5HQg4QYBtq887Z5Yu7CboPNxPxl3C4KOnFfw0hre3DEg3z428ubgleE1BEtXgMHORPMlWXuwBV-CUEh17-100x100.png",100,100,true]},"rttpg_author":{"display_name":"admin NU","author_link":"https:\/\/youzum.net\/it\/members\/adminnu\/"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/youzum.net\/it\/category\/ai-club\/\" rel=\"category tag\">AI<\/a> <a href=\"https:\/\/youzum.net\/it\/category\/committee\/\" rel=\"category tag\">Committee<\/a> <a href=\"https:\/\/youzum.net\/it\/category\/news\/\" rel=\"category tag\">News<\/a> <a href=\"https:\/\/youzum.net\/it\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","rttpg_excerpt":"Visual reasoning tasks challenge artificial intelligence models to interpret and process visual information using both perception and logical reasoning. These tasks span a wide range of applications, including medical diagnostics, visual math, symbolic puzzles, and image-based question answering. Success in this field requires more than object recognition\u2014it demands dynamic adaptation, abstraction, and contextual inference. 
Models&hellip;","_links":{"self":[{"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/posts\/27065","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/comments?post=27065"}],"version-history":[{"count":0,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/posts\/27065\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/media\/27066"}],"wp:attachment":[{"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/media?parent=27065"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/categories?post=27065"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/youzum.net\/it\/wp-json\/wp\/v2\/tags?post=27065"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}