Committee Archives - Page 7 of 98

AI, Committee, News, Uncategorized

Revolutionizing MLOps: Enhanced BigQuery ML UI for Seamless Model Creation and Management

admin NU / October 18, 2025

Exciting news for BigQuery ML (BQML) users.

AI, Committee, News, Uncategorized

Developers can now add live Google Maps data to Gemini-powered AI app outputs

admin NU / October 18, 2025

Google is adding a new feature for third-party developers building atop its Gemini AI models that rivals like OpenAI’s ChatGPT, Anthropic’s Claude, and the growing array of Chinese open source options are unlikely to get anytime soon: grounding with Google Maps.

This addition allows developers to connect Google’s Gemini AI models’ reasoning capabilities with live geospatial data from Google Maps, enabling applications to deliver detailed, location-relevant responses to user queries, such as business hours, reviews, or the atmosphere of a specific venue. By tapping into data from over 250 million places, developers can now build more intelligent and responsive location-aware experiences. This is particularly useful for applications where proximity, real-time availability, or location-specific personalization matter, such as local search, delivery services, real estate, and travel planning.

When the user’s location is known, developers can pass latitude and longitude into the request to enhance the response quality. By tightly integrating real-time and historical Maps data into the Gemini API, Google enables applications to generate grounded, location-specific responses with factual accuracy and contextual depth that are uniquely possible through its mapping infrastructure.

Merging AI and Geospatial Intelligence

The new feature is accessible in Google AI Studio, where developers can try a live demo powered by the Gemini Live API. Models that support grounding with Google Maps include Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash-Lite, and Gemini 2.0 Flash.

In one demonstration, a user asked for Italian restaurant recommendations in Chicago. The assistant, leveraging Maps data, retrieved top-rated options and clarified a misspelled restaurant name before locating the correct venue with accurate business details.

Developers can also retrieve a context token to embed a Google Maps widget in their app’s user interface. This interactive component displays photos, reviews, and other familiar content typically found in Google Maps.

Integration is handled via the generateContent method in the Gemini API, where developers include googleMaps as a tool. They can also enable a Maps widget by setting a parameter in the request. The widget, rendered using a returned context token, can provide a visual layer alongside the AI-generated text.

Use Cases Across Industries

The Maps grounding tool is designed to support a wide range of practical use cases:

Itinerary generation: Travel apps can create detailed daily plans with routing, timing, and venue information.
Personalized local recommendations: Real estate platforms can highlight listings near kid-friendly amenities like schools and parks.
Detailed location queries: Applications can provide specific information, such as whether a cafe offers outdoor seating, using community reviews and Maps metadata.

Developers are encouraged to only enable the tool when geographic context is relevant, to optimize both performance and cost. According to the developer documentation, pricing starts at $25 per 1,000 grounded prompts, a steep sum for those trafficking in numerous queries.

Combining Search and Maps for Enhanced Context

Developers can use Grounding with Google Maps alongside Grounding with Google Search in the same request. While the Maps tool contributes factual data (like addresses, hours, and ratings), the Search tool adds broader context from web content, such as news or event listings. For example, when asked about live music on Beale Street, the combined tools provide venue details from Maps and event times from Search.

According to Google, internal testing shows that using both tools together leads to significantly improved response quality. Unfortunately, it doesn’t appear that the Google Maps grounding includes live vehicular traffic data, at least not yet.

Customization and Developer Flexibility

The experience is built for customization. Developers can tweak system prompts, choose from different Gemini models, and configure voice settings to tailor interactions. The demo app in Google AI Studio is also remixable, enabling developers to test ideas, add features, and iterate on designs within a flexible development environment.

The API returns structured metadata (including source links, place IDs, and citation spans) that developers can use to build inline citations or verify the AI-generated outputs. This supports transparency and enhances trust in user-facing applications. Google also requires that Maps-based sources be attributed clearly and linked back to the source using their URI.

Implementation Considerations for AI Builders

For technical teams integrating this capability, Google recommends:

Passing user location context when known, for better results.
Displaying Google Maps source links directly beneath the relevant content.
Only enabling the tool when the query clearly involves geographic context.
Monitoring latency and disabling grounding when performance is critical.

Grounding with Google Maps is currently available globally, though prohibited in several territories (including China, Iran, North Korea, and Cuba), and not permitted for emergency response use cases.

Availability and Access

Grounding with Google Maps is now generally available through the Gemini API. With this release, Google continues to expand the capabilities of the Gemini API, empowering developers to build AI-driven applications that understand and respond to the world around them.
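To make the integration described above more concrete, here is a minimal Python sketch of what a generateContent request body with the googleMaps tool might look like, plus the cost arithmetic implied by the quoted $25 per 1,000 grounded prompts. The article confirms only the generateContent method, the googleMaps tool name, and the ability to pass latitude and longitude and to enable a widget; the field names `contents`, `parts`, `toolConfig`, `retrievalConfig`, `latLng`, and `enableWidget`, as well as both function names, are assumptions for illustration and should be checked against the Gemini API reference. The sketch only builds the payload and does not call the API.

```python
import json

# Price quoted in the article: $25 per 1,000 grounded prompts.
PRICE_PER_1000_GROUNDED_PROMPTS_USD = 25.0


def build_maps_grounded_request(prompt, latitude=None, longitude=None,
                                enable_widget=False):
    """Build a JSON-serializable body for a Maps-grounded generateContent call.

    Hypothetical helper: every field name except `googleMaps` is an
    assumption and is not confirmed by the article.
    """
    maps_tool = {}
    if enable_widget:
        # Assumed name of the parameter that requests a widget context token.
        maps_tool["enableWidget"] = True

    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        "tools": [{"googleMaps": maps_tool}],
    }

    # Pass the user's location only when both coordinates are known.
    if latitude is not None and longitude is not None:
        body["toolConfig"] = {
            "retrievalConfig": {
                "latLng": {"latitude": latitude, "longitude": longitude}
            }
        }
    return body


def estimated_grounding_cost_usd(grounded_prompts):
    """Estimate cost at the quoted starting price for grounded prompts."""
    return grounded_prompts * PRICE_PER_1000_GROUNDED_PROMPTS_USD / 1000.0


if __name__ == "__main__":
    request = build_maps_grounded_request(
        "Recommend Italian restaurants nearby.",
        latitude=41.8781,    # Chicago, matching the article's demo scenario
        longitude=-87.6298,
        enable_widget=True,
    )
    print(json.dumps(request, indent=2))
    print(estimated_grounding_cost_usd(40_000))
```

At the quoted rate, 40,000 grounded prompts would come to $1,000, which is why the recommendation to enable the tool only for clearly geographic queries matters for budgeting.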

AI, Committee, News, Uncategorized

Readers Prefer Outputs of AI Trained on Copyrighted Books over Expert Human Writers

admin NU / October 17, 2025

arXiv:2510.13939v1 Announce Type: new Abstract: The use of copyrighted books for training AI models has led to numerous lawsuits from authors concerned about AI’s ability to generate derivative content. Yet it’s unclear whether these models can generate high quality literary text while emulating authors’ styles. To answer this, we conducted a preregistered study comparing MFA-trained expert writers with three frontier AI models (ChatGPT, Claude, and Gemini) in writing up to 450 word excerpts emulating 50 award-winning authors’ diverse styles. In blind pairwise evaluations by 159 representative expert and lay readers, AI-generated text from in-context prompting was strongly disfavored by experts for both stylistic fidelity (OR=0.16, p

AI, Committee, News, Uncategorized

Interpreting the Latent Structure of Operator Precedence in Language Models

admin NU / October 17, 2025

arXiv:2510.13908v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated impressive reasoning capabilities but continue to struggle with arithmetic tasks. Prior works largely focus on outputs or prompting strategies, leaving open the question of the internal structure through which models perform arithmetic computation. In this work, we investigate whether LLMs encode operator precedence in their internal representations, using the open-source instruction-tuned LLaMA 3.2-3B model. We constructed a dataset of arithmetic expressions with three operands and two operators, varying the order and placement of parentheses. Using this dataset, we trace whether intermediate results appear in the model’s residual stream. We apply interpretability techniques such as logit lens, linear classification probes, and UMAP geometric visualization. Our results show that intermediate computations are present in the residual stream, particularly after MLP blocks. We also find that the model linearly encodes precedence in each operator’s embeddings after the attention layer. We introduce partial embedding swap, a technique that modifies operator precedence by exchanging high-impact embedding dimensions between operators.
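The dataset described in the abstract (three operands, two operators, varied parenthesization) can be illustrated with a short Python sketch. This is not the authors' code: the operator set (+, -, *), the three parenthesization patterns, and the names `evaluate` and `build_dataset` are assumptions chosen for illustration. Each row records the intermediate result, the kind of value the paper looks for in the residual stream, alongside the final answer.

```python
import itertools
import operator

# Assumed operator set; the abstract does not list the operators used.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
PRECEDENCE = {"+": 1, "-": 1, "*": 2}

TEMPLATES = {
    "none": "{a} {op1} {b} {op2} {c}",
    "left": "({a} {op1} {b}) {op2} {c}",
    "right": "{a} {op1} ({b} {op2} {c})",
}


def evaluate(a, op1, b, op2, c, parens="none"):
    """Return (intermediate, result) for `a op1 b op2 c`.

    `parens` is "none", "left" for (a op1 b) op2 c, or "right" for
    a op1 (b op2 c). Without parentheses, the higher-precedence operator
    is applied first; ties are resolved left to right.
    """
    if parens == "left":
        left_first = True
    elif parens == "right":
        left_first = False
    else:
        left_first = PRECEDENCE[op1] >= PRECEDENCE[op2]

    if left_first:
        intermediate = OPS[op1](a, b)
        return intermediate, OPS[op2](intermediate, c)
    intermediate = OPS[op2](b, c)
    return intermediate, OPS[op1](a, intermediate)


def build_dataset(operand_triples):
    """Enumerate every operator pair and parenthesization for each triple."""
    rows = []
    for a, b, c in operand_triples:
        for op1, op2 in itertools.product(OPS, repeat=2):
            for parens, template in TEMPLATES.items():
                intermediate, result = evaluate(a, op1, b, op2, c, parens)
                rows.append({
                    "expression": template.format(a=a, op1=op1, b=b, op2=op2, c=c),
                    "intermediate": intermediate,
                    "result": result,
                })
    return rows


if __name__ == "__main__":
    for row in build_dataset([(2, 3, 4)])[:6]:
        print(row)
```

Keeping the intermediate value as a separate label is the design choice that matters here: a probe or logit-lens readout needs a target for each expression, and that target differs between `2 + 3 * 4` (intermediate 12) and `(2 + 3) * 4` (intermediate 5).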

AI, Committee, News, Uncategorized

Knowledge Reasoning Language Model: Unifying Knowledge and Language for Inductive Knowledge Graph Reasoning

admin NU / October 17, 2025

arXiv:2510.13909v1 Announce Type: new Abstract: Inductive Knowledge Graph Reasoning (KGR) aims to discover facts in open-domain KGs containing unknown entities and relations, which poses a challenge for KGR models in comprehending uncertain KG components. Existing studies have proposed Knowledge Graph Foundation Models (KGFMs) that learn structural invariances across KGs to handle this uncertainty. Recently, Large Language Models (LLMs) have demonstrated strong capabilities for open-domain knowledge reasoning. As a result, the latest research has focused on LLM-based KGFMs that integrate LLM knowledge with KG context for inductive KGR. However, the intrinsic knowledge of LLMs may be overshadowed by sparse KG context, leading to LLM knowledge distortion, which can cause irreversible damage to model reasoning. Moreover, existing LLM-based KGR methods still struggle to fully constrain generative hallucinations in LLMs, severely limiting the credibility of reasoning results. To address these limitations, we propose a Knowledge Reasoning Language Model (KRLM) that achieves unified coordination between LLM knowledge and KG context throughout the KGR process. Specifically, we design a Knowledge Reasoning Language (KRL) instruction format and a KRL tokenizer to align LLM knowledge with KG representations. Then, we propose a KRL attention layer that coordinates intrinsic LLM knowledge with additional KG context through a dynamic knowledge memory mechanism. Finally, a structure-aware next-entity predictor is proposed, which strictly constrains the reasoning results within a trustworthy knowledge domain. Extensive experimental results on 25 real-world inductive KGR datasets demonstrate the significant superiority of the proposed KRLM in both zero-shot reasoning and fine-tuning scenarios. Our source code is available at https://anonymous.4open.science/r/KRLM-EA36

AI, Committee, News, Uncategorized

Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents

admin NU / October 17, 2025

arXiv:2509.06917v2 Announce Type: replace-cross Abstract: We introduce Paper2Agent, an automated framework that converts research papers into AI agents. Paper2Agent transforms research output from passive artifacts into active systems that can accelerate downstream use, adoption, and discovery. Conventional research papers require readers to invest substantial effort to understand and adapt a paper’s code, data, and methods to their own work, creating barriers to dissemination and reuse. Paper2Agent addresses this challenge by automatically converting a paper into an AI agent that acts as a knowledgeable research assistant. It systematically analyzes the paper and the associated codebase using multiple agents to construct a Model Context Protocol (MCP) server, then iteratively generates and runs tests to refine and robustify the resulting MCP. These paper MCPs can then be flexibly connected to a chat agent (e.g. Claude Code) to carry out complex scientific queries through natural language while invoking tools and workflows from the original paper. We demonstrate Paper2Agent’s effectiveness in creating reliable and capable paper agents through in-depth case studies. Paper2Agent created an agent that leverages AlphaGenome to interpret genomic variants and agents based on ScanPy and TISSUE to carry out single-cell and spatial transcriptomics analyses. We validate that these paper agents can reproduce the original paper’s results and can correctly carry out novel user queries. Paper2Agent automatically created an AI co-scientist that identified a new splicing variant associated with ADHD risk. By turning static papers into dynamic, interactive AI agents, Paper2Agent introduces a new paradigm for knowledge dissemination and a foundation for the collaborative ecosystem of AI co-scientists.

AI, Committee, News, Uncategorized

Echoes of BERT: Do Modern Language Models Rediscover the Classical NLP Pipeline?

admin NU / October 17, 2025

arXiv:2506.02132v4 Announce Type: replace Abstract: Large transformer-based language models dominate modern NLP, yet our understanding of how they encode linguistic information relies primarily on studies of early models like BERT and GPT-2. Building on classic BERTology work, we analyze 25 models spanning from classical architectures (BERT, DeBERTa, GPT-2) to modern large language models (Pythia, OLMo-2, Gemma-2, Qwen2.5, Llama-3.1), probing layer-by-layer representations across eight linguistic tasks in English. Consistent with earlier findings, we find that hierarchical organization persists in modern models: early layers capture syntax, middle layers handle semantics and entity-level information, and later layers encode discourse phenomena. We dive deeper, conducting an in-depth multilingual analysis of two specific linguistic properties – lexical identity and inflectional morphology – that help disentangle form from meaning. We find that lexical information concentrates linearly in early layers but becomes increasingly nonlinear deeper in the network, while inflectional information remains linearly accessible throughout all layers. Additional analyses of attention mechanisms, steering vectors, and pretraining checkpoints reveal where this information resides within layers, how it can be functionally manipulated, and how representations evolve during pretraining. Taken together, our findings suggest that, even with substantial advances in LLM technologies, transformer models learn to organize linguistic information in similar ways, regardless of model architecture, size, or training regime, indicating that these properties are important for next token prediction. Our code is available at https://github.com/ml5885/model_internal_sleuthing

AI, Committee, News, Uncategorized

A Matter of Representation: Towards Graph-Based Abstract Code Generation

admin NU / October 16, 2025

arXiv:2510.13163v1 Announce Type: new Abstract: Most large language models (LLMs) today excel at generating raw, sequential code with minimal abstractions and custom structures. However, there has been little work on graph-based abstract code generation, where significant logic is encapsulated in predefined nodes and execution flow is determined by edges. This is relevant for visual programming languages, and in cases where raw source code is inaccessible to users and LLM training sets. In this work, we propose and evaluate JSON representations for graphs to enable high accuracy graph-based abstract code generation. We evaluate these representations on ScratchTest, a mini-benchmark based on our custom Python re-implementation of Scratch, which tests the LLM in code graph space. Our findings demonstrate that LLMs can indeed perform the aforementioned generation task in a single pass without relying on specialized or complex pipelines, given the correct graph representations. We also show that different representations induce significantly different accuracies, highlighting the instrumental role of representations in this generation task. All in all, this work establishes the first steps towards representation learning for graph-based abstract code generation.
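As a rough illustration of what "logic encapsulated in predefined nodes and execution flow determined by edges" can mean, the Python sketch below defines one possible JSON graph and a tiny interpreter for it. The schema (`nodes`, `edges`, `entry`), the node types, and `run_graph` are all hypothetical; they are not the representations evaluated in the paper, nor part of ScratchTest.

```python
import json

# Hypothetical JSON representation: predefined node types carry the logic,
# and edges give the order of execution.
GRAPH_JSON = """
{
  "entry": "n1",
  "nodes": [
    {"id": "n1", "type": "set_variable", "args": {"name": "x", "value": 2}},
    {"id": "n2", "type": "change_variable", "args": {"name": "x", "by": 3}},
    {"id": "n3", "type": "say", "args": {"name": "x"}}
  ],
  "edges": [
    {"from": "n1", "to": "n2"},
    {"from": "n2", "to": "n3"}
  ]
}
"""


def run_graph(graph):
    """Execute a linear node graph and return the list of 'say' outputs."""
    nodes = {node["id"]: node for node in graph["nodes"]}
    next_node = {edge["from"]: edge["to"] for edge in graph["edges"]}

    variables = {}
    output = []
    visited = set()
    current = graph["entry"]

    while current is not None:
        if current in visited:
            raise ValueError(f"cycle detected at node {current!r}")
        visited.add(current)

        node = nodes[current]
        args = node["args"]
        if node["type"] == "set_variable":
            variables[args["name"]] = args["value"]
        elif node["type"] == "change_variable":
            variables[args["name"]] += args["by"]
        elif node["type"] == "say":
            output.append(str(variables[args["name"]]))
        else:
            raise ValueError(f"unknown node type {node['type']!r}")

        # Follow the outgoing edge; stop when the node has none.
        current = next_node.get(current)

    return output


if __name__ == "__main__":
    print(run_graph(json.loads(GRAPH_JSON)))
```

A model generating code in this setting would emit the JSON rather than Python source, so choices such as whether edges are listed separately or nested inside nodes are exactly the kind of representation difference the abstract says affects accuracy.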

AI, Committee, News, Uncategorized

Toward LLM-Supported Automated Assessment of Critical Thinking Subskills

admin NU / October 16, 2025

arXiv:2510.12915v1 Announce Type: cross Abstract: Critical thinking represents a fundamental competency in today’s education landscape. Developing critical thinking skills through timely assessment and feedback is crucial; however, there has not been extensive work in the learning analytics community on defining, measuring, and supporting critical thinking. In this paper, we investigate the feasibility of measuring core “subskills” that underlie critical thinking. We ground our work in an authentic task where students operationalize critical thinking: student-written argumentative essays. We developed a coding rubric based on an established skills progression and completed human coding for a corpus of student essays. We then evaluated three distinct approaches to automated scoring: zero-shot prompting, few-shot prompting, and supervised fine-tuning, implemented across three large language models (GPT-5, GPT-5-mini, and ModernBERT). GPT-5 with few-shot prompting achieved the strongest results and demonstrated particular strength on subskills with separable, frequent categories, while lower performance was observed for subskills that required detection of subtle distinctions or rare categories. Our results underscore critical trade-offs in automated critical thinking assessment: proprietary models offer superior reliability at higher cost, while open-source alternatives provide practical accuracy with reduced sensitivity to minority categories. Our work represents an initial step toward scalable assessment of higher-order reasoning skills across authentic educational contexts.

AI, Committee, News, Uncategorized

What Does Neuro Mean to Cardio? Investigating the Role of Clinical Specialty Data in Medical LLMs

admin NU / October 16, 2025

arXiv:2505.10113v3 Announce Type: replace Abstract: In this paper, we introduce S-MedQA, an English medical question-answering (QA) dataset for benchmarking large language models (LLMs) in fine-grained clinical specialties. S-MedQA has over 20k examples, covers 15 medical specialties, and QA pairs can have multiple specialty annotations (e.g., when a question is cross-disciplinary), constructed with both machine and expert verification to maximize data availability. We use S-MedQA to investigate the role of clinical specialty data in the knowledge-intensive scenario of medical QA. Our results show that 1) training on data from a clinical specialty does not necessarily lead to best performance on that specialty, and 2) regardless of the specialty the LLM was fine-tuned on, token probabilities of clinically relevant terms increase consistently across all specialties. Thus, we hypothesize improvement gains are derived mostly from domain shifting (e.g., general to medical) rather than specialty-specific knowledge injection, and suggest rethinking the role of fine-tuning data in the medical domain.
