YouZum


CUPID: Evaluating Personalized and Contextualized Alignment of LLMs from Interactions

arXiv:2508.01674v1 Announce Type: new Abstract: Personalization of Large Language Models (LLMs) often assumes users hold static preferences that apply uniformly across all tasks. In reality, humans hold dynamic preferences that change depending on the context. As users interact with an LLM in various contexts, they naturally reveal their contextual preferences, which a model must infer and apply in future contexts to ensure alignment. To assess this, we introduce CUPID, a benchmark of 756 human-curated interaction session histories between users and LLM-based chat assistants. In each interaction session, the user provides a request in a specific context and expresses their preference through multi-turn feedback. Given a new user request and prior interaction sessions, our benchmark assesses whether LLMs can infer the preference relevant to this request and generate a response that satisfies this preference. With CUPID, we evaluated 10 open and proprietary LLMs, revealing that state-of-the-art LLMs struggle to infer preferences from multi-turn interactions and fail to discern what previous context is relevant to a new request, achieving under 50% precision and under 65% recall. Our work highlights the need to advance LLM capabilities for more contextually personalized interactions and proposes CUPID as a resource to drive these improvements.
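To make the reported metrics concrete, here is a minimal sketch of the precision/recall computation over which prior sessions a model judges relevant to a new request. It is illustrative only, with hypothetical data, and is not the benchmark's actual evaluation code.

```python
# Schematic sketch (not CUPID's harness): set-based precision/recall over
# the IDs of prior sessions a model retrieves as relevant to a new request.

def precision_recall(predicted: set, gold: set):
    if not predicted or not gold:
        return 0.0, 0.0
    tp = len(predicted & gold)
    return tp / len(predicted), tp / len(gold)

# Hypothetical example: the model retrieved sessions {1, 2}, but only
# session 1 encodes the contextual preference relevant to the new request.
gold_relevant = {1}
model_retrieved = {1, 2}
p, r = precision_recall(model_retrieved, gold_relevant)
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.50, recall=1.00
```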


FinCoT: Grounding Chain-of-Thought in Expert Financial Reasoning

arXiv:2506.16123v2 Announce Type: replace Abstract: This paper presents FinCoT, a structured chain-of-thought (CoT) prompting framework that embeds domain-specific expert financial reasoning blueprints to guide the behavior of large language models. We identify three main prompting styles in financial NLP (FinNLP): (1) standard prompting (zero-shot), (2) unstructured CoT (free-form reasoning), and (3) structured CoT (with explicitly structured reasoning steps). Prior work has mainly focused on the first two, while structured CoT remains underexplored and lacks incorporation of domain expertise. Therefore, we evaluate all three prompting approaches across ten CFA-style financial domains and introduce FinCoT as the first structured finance-specific prompting approach to incorporate blueprints from domain experts. FinCoT improves the accuracy of a general-purpose model, Qwen3-8B-Base, from 63.2% to 80.5%, and boosts Fin-R1 (7B), a finance-specific model, from 65.7% to 75.7%, while reducing output length by up to 8.9x and 1.16x, respectively, compared to structured CoT methods. We find that FinCoT proves most effective for models lacking financial post-training. Our findings show that FinCoT not only improves performance and reduces inference costs but also yields more interpretable and expert-aligned reasoning traces.
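As a rough illustration of the three prompting styles the abstract distinguishes, here is a hedged sketch. The blueprint text below is invented for illustration and is not FinCoT's actual expert blueprint.

```python
question = "A bond's price rises as market yields fall. What is this relationship called?"

# (1) Standard (zero-shot) prompting
standard = f"Answer the question.\n\nQ: {question}\nA:"

# (2) Unstructured CoT: free-form reasoning
unstructured_cot = f"Q: {question}\nLet's think step by step."

# (3) Structured CoT: reasoning constrained by an expert blueprint
# (a made-up placeholder blueprint, not FinCoT's actual one)
blueprint = """\
Follow these steps, in order:
1. Identify the financial concept being tested.
2. Recall the relevant definition or formula.
3. Apply it to the question's specifics.
4. State the final answer in one sentence."""
structured_cot = f"{blueprint}\n\nQ: {question}"

for name, p in [("standard", standard), ("unstructured CoT", unstructured_cot),
                ("structured CoT", structured_cot)]:
    print(f"--- {name} ---\n{p}\n")
```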


One Trigger Token Is Enough: A Defense Strategy for Balancing Safety and Usability in Large Language Models

arXiv:2505.07167v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have been extensively used across diverse domains, including virtual assistants, automated code generation, and scientific research. However, they remain vulnerable to jailbreak attacks, which manipulate the models into generating harmful responses despite safety alignment. Recent studies have shown that current safety-aligned LLMs often exhibit shallow safety alignment, where the first few tokens largely determine whether the response will be harmful. Through comprehensive observations, we find that safety-aligned LLMs and various defense strategies generate highly similar initial tokens in their refusal responses, which we define as safety trigger tokens. Building on this insight, we propose D-STT, a simple yet effective defense algorithm that identifies and explicitly decodes the safety trigger tokens of the given safety-aligned LLM to trigger the model's learned safety patterns. In this process, the safety trigger is constrained to a single token, which effectively preserves model usability by introducing minimal intervention into the decoding process. Extensive experiments across diverse jailbreak attacks and benign prompts demonstrate that D-STT significantly reduces output harmfulness while preserving model usability and incurring negligible response-time overhead, outperforming ten baseline methods.
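The single-token intervention is easy to picture in code. The sketch below forces a chosen first token at decoding time with Hugging Face transformers; it only illustrates the mechanics, not the paper's procedure for identifying a model's actual safety trigger token, and gpt2 (used here for convenience) is not safety-aligned.

```python
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          LogitsProcessor, LogitsProcessorList)

class ForceFirstToken(LogitsProcessor):
    """Force one chosen token at the first decoding step, then decode normally."""
    def __init__(self, token_id: int, prompt_len: int):
        self.token_id, self.prompt_len = token_id, prompt_len

    def __call__(self, input_ids, scores):
        if input_ids.shape[1] == self.prompt_len:  # first generated position
            forced = torch.full_like(scores, float("-inf"))
            forced[:, self.token_id] = 0.0
            return forced
        return scores

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in model; not safety-aligned
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("Tell me how to pick a lock.", return_tensors="pt")
trigger_id = tok.encode(" Sorry")[0]  # hypothetical refusal-prefix token

out = model.generate(
    **inputs,
    max_new_tokens=40,
    logits_processor=LogitsProcessorList(
        [ForceFirstToken(trigger_id, inputs["input_ids"].shape[1])]
    ),
)
print(tok.decode(out[0], skip_special_tokens=True))
```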


AgentArmor: Enforcing Program Analysis on Agent Runtime Trace to Defend Against Prompt Injection

arXiv:2508.01249v1 Announce Type: cross Abstract: Large Language Model (LLM) agents offer a powerful new paradigm for solving various problems by combining natural language reasoning with the execution of external tools. However, their dynamic and non-transparent behavior introduces critical security risks, particularly in the presence of prompt injection attacks. In this work, we propose a novel insight that treats agent runtime traces as structured programs with analyzable semantics. Building on this, we present AgentArmor, a program analysis framework that converts agent traces into graph-based intermediate representations of program dependencies (e.g., CFG, DFG, and PDG) and enforces security policies via a type system. AgentArmor consists of three key components: (1) a graph constructor that reconstructs the agent's working traces as graph-based intermediate representations capturing both control flow and data flow; (2) a property registry that attaches security-relevant metadata to the tools and data the agent interacts with; and (3) a type system that performs static inference and checking over the intermediate representation. By representing agent behavior as structured programs, AgentArmor enables program analysis over sensitive data flows, trust boundaries, and policy violations. We evaluate AgentArmor on the AgentDojo benchmark; the results show that AgentArmor achieves a true positive rate of 95.75% with a false positive rate of only 3.66%. Our results demonstrate AgentArmor's ability to detect prompt injection vulnerabilities and enforce fine-grained security constraints.
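To give a flavor of trace-as-program analysis (a toy sketch, not AgentArmor's implementation), the snippet below models a trace as a data-flow graph and propagates a simple trusted/untrusted label, flagging flows from untrusted sources into sensitive tools:

```python
# Toy sketch: propagate an "untrusted" label along data-flow edges recovered
# from an agent trace, and flag policy violations at sensitive sinks.
from collections import deque

# Nodes: tool name -> security-relevant properties (a stand-in property registry)
nodes = {
    "read_webpage": {"untrusted_source": True,  "sensitive_sink": False},
    "summarize":    {"untrusted_source": False, "sensitive_sink": False},
    "send_email":   {"untrusted_source": False, "sensitive_sink": True},
}
# Data-flow edges reconstructed from the runtime trace
edges = {"read_webpage": ["summarize"], "summarize": ["send_email"]}

def find_violations(nodes, edges):
    tainted = set()
    queue = deque(n for n, p in nodes.items() if p["untrusted_source"])
    while queue:
        n = queue.popleft()
        if n in tainted:
            continue
        tainted.add(n)
        queue.extend(edges.get(n, []))
    return [n for n in tainted if nodes[n]["sensitive_sink"]]

print(find_violations(nodes, edges))  # ['send_email']: untrusted data reaches a sensitive tool
```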


Google AI Releases LangExtract: An Open Source Python Library that Extracts Structured Data from Unstructured Text Documents

In today's data-driven world, valuable insights are often buried in unstructured text: clinical notes, lengthy legal contracts, or customer feedback threads. Extracting meaningful, traceable information from these documents is both a technical and practical challenge. Google AI's new open-source Python library, LangExtract, is designed to address this gap directly, using LLMs like Gemini to deliver powerful, automated extraction with traceability and transparency at its core.

Key Innovations of LangExtract

1. Declarative and Traceable Extraction

LangExtract lets users define custom extraction tasks using natural language instructions and high-quality "few-shot" examples. This empowers developers and analysts to specify exactly which entities, relationships, or facts to extract, and in what structure. Crucially, every extracted piece of information is tied directly back to its source text, enabling validation, auditing, and end-to-end traceability.

2. Domain Versatility

The library works not just in tech demos but in critical real-world domains, including health (clinical notes, medical reports), finance (summaries, risk documents), law (contracts), research literature, and even the arts (analyzing Shakespeare). Original use cases include automatic extraction of medications, dosages, and administration details from clinical documents, as well as relationships and emotions from plays or literature.

3. Schema Enforcement with LLMs

Powered by Gemini and compatible with other LLMs, LangExtract enables enforcement of custom output schemas (like JSON), so results are not just accurate but immediately usable in downstream databases, analytics, or AI pipelines. It mitigates traditional LLM weaknesses around hallucination and schema drift by grounding outputs in both the user's instructions and the actual source text.

4. Scalability and Visualization

Handles Large Volumes: LangExtract efficiently processes long documents by chunking, parallelizing, and aggregating results.
Interactive Visualization: Developers can generate interactive HTML reports, viewing each extracted entity in context by highlighting its location in the original document, making auditing and error analysis seamless.
Smooth Integration: Works in Google Colab, Jupyter, or as standalone HTML files, supporting a rapid feedback loop for developers and researchers.

5. Installation and Usage

Install easily with pip:

```
pip install langextract
```

Example Workflow (Extracting Character Info from Shakespeare):

```python
import langextract as lx
import textwrap

# 1. Define your prompt
prompt = textwrap.dedent("""
    Extract characters, emotions, and relationships in order of appearance.
    Use exact text for extractions. Do not paraphrase or overlap entities.
    Provide meaningful attributes for each entity to add context.
""")

# 2. Give a high-quality example
examples = [
    lx.data.ExampleData(
        text=(
            "ROMEO. But soft! What light through yonder window breaks? "
            "It is the east, and Juliet is the sun."
        ),
        extractions=[
            lx.data.Extraction(
                extraction_class="character",
                extraction_text="ROMEO",
                attributes={"emotional_state": "wonder"},
            ),
            lx.data.Extraction(
                extraction_class="emotion",
                extraction_text="But soft!",
                attributes={"feeling": "gentle awe"},
            ),
            lx.data.Extraction(
                extraction_class="relationship",
                extraction_text="Juliet is the sun",
                attributes={"type": "metaphor"},
            ),
        ],
    )
]

# 3. Extract from new text
input_text = "Lady Juliet gazed longingly at the stars, her heart aching for Romeo"
result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-pro",
)

# 4. Save and visualize results
lx.io.save_annotated_documents([result], output_name="extraction_results.jsonl")
html_content = lx.visualize("extraction_results.jsonl")
with open("visualization.html", "w") as f:
    f.write(html_content)
```

This results in structured, source-anchored JSON outputs, plus an interactive HTML visualization for easy review and demonstration.
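Scaling this up to long inputs (innovation 4 above), the project README exposes chunking and parallelism as keyword arguments. Below is a hedged sketch reusing the prompt and examples defined above; the parameter names (extraction_passes, max_workers, max_char_buffer) and the URL-input behavior are taken from the README and may vary across versions, so verify them against your installed release.

```python
# Hedged sketch for long documents, reusing `prompt` and `examples` from above.
result = lx.extract(
    text_or_documents="https://www.gutenberg.org/files/1513/1513-0.txt",  # full Romeo & Juliet
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    extraction_passes=3,    # multiple passes improve recall on long texts
    max_workers=20,         # parallel chunk processing
    max_char_buffer=1000,   # smaller chunks keep each LLM call focused
)
```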
Specialized & Real-World Applications

Medicine: Extracts medications, dosages, and timing, and links them back to source sentences. Powered by insights from research on accelerating medical information extraction, LangExtract's approach is directly applicable to structuring clinical and radiology reports, improving clarity and supporting interoperability.
Finance & Law: Automatically pulls relevant clauses, terms, or risks from dense legal or financial text, ensuring every output can be traced back to its context.
Research & Data Mining: Streamlines high-throughput extraction from thousands of scientific papers.

The team even provides a demonstration called RadExtract for structuring radiology reports, highlighting not just what was extracted, but exactly where the information appeared in the original input.

How LangExtract Compares

| Feature | Traditional Approaches | LangExtract Approach |
| --- | --- | --- |
| Schema Consistency | Often manual/error-prone | Enforced via instructions & few-shot examples |
| Result Traceability | Minimal | All output linked to input text |
| Scaling to Long Texts | Windowed, lossy | Chunked + parallel extraction, then aggregation |
| Visualization | Custom, usually absent | Built-in, interactive HTML reports |
| Deployment | Rigid, model-specific | Gemini-first, open to other LLMs & on-premises |

In Summary

LangExtract presents a new way to extract structured, actionable data from text, delivering:

Declarative, explainable extraction
Traceable results backed by source context
Instant visualization for rapid iteration
Easy integration into any Python workflow

Check out the GitHub Page and Technical Blog.

The post Google AI Releases LangExtract: An Open Source Python Library that Extracts Structured Data from Unstructured Text Documents appeared first on MarkTechPost.


Team “better_call_claude”: Style Change Detection using a Sequential Sentence Pair Classifier

arXiv:2508.00675v1 Announce Type: new Abstract: Style change detection (identifying the points in a document where writing style shifts) remains one of the most important and challenging problems in computational authorship analysis. At PAN 2025, the shared task challenges participants to detect style switches at the most fine-grained level: individual sentences. The task spans three datasets, each designed with controlled and increasing thematic variety within documents. We propose to address this problem by modeling the content of each problem instance (that is, a series of sentences) as a whole, using a Sequential Sentence Pair Classifier (SSPC). The architecture leverages a pre-trained language model (PLM) to obtain representations of individual sentences, which are then fed into a bidirectional LSTM (BiLSTM) to contextualize them within the document. The BiLSTM-produced vectors of adjacent sentences are concatenated and passed to a multi-layer perceptron, which makes a prediction for each adjacent sentence pair. Building on the work of previous PAN participants and on classical text segmentation, the approach is relatively conservative and lightweight. Nevertheless, it proves effective in leveraging contextual information and addressing what is arguably the most challenging aspect of this year's shared task: the notorious problem of "stylistically shallow", short sentences that are prevalent in the proposed benchmark data. Evaluated on the official PAN 2025 test datasets, the model achieves strong macro-F1 scores of 0.923, 0.828, and 0.724 on the EASY, MEDIUM, and HARD data, respectively, outperforming not only the official random baselines but also a much more challenging one: claude-3.7-sonnet's zero-shot performance.
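A schematic PyTorch sketch of the described architecture follows; the dimensions and the upstream sentence encoder are placeholders, and this is not the team's released code. PLM sentence embeddings feed a BiLSTM, and each adjacent pair of contextualized vectors is concatenated and scored by an MLP.

```python
import torch
import torch.nn as nn

class SSPC(nn.Module):
    """Sequential Sentence Pair Classifier (schematic; dims are placeholders)."""
    def __init__(self, emb_dim=768, hidden=256):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(4 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, sent_embs):              # (batch, n_sents, emb_dim) from a PLM
        ctx, _ = self.bilstm(sent_embs)        # (batch, n_sents, 2*hidden)
        pairs = torch.cat([ctx[:, :-1], ctx[:, 1:]], dim=-1)  # adjacent pairs
        return self.mlp(pairs).squeeze(-1)     # (batch, n_sents-1) style-change logits

logits = SSPC()(torch.randn(2, 10, 768))       # 2 documents, 10 sentences each
print(logits.shape)                            # torch.Size([2, 9])
```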


IFEvalCode: Controlled Code Generation

arXiv:2507.22462v2 Announce Type: replace Abstract: Code large language models (Code LLMs) have made significant progress in code generation by translating natural language descriptions into functional code; however, real-world applications often demand stricter adherence to detailed requirements such as coding style, line count, and structural constraints, beyond mere correctness. To address this, the paper introduces forward and backward constraint generation to improve the instruction-following capabilities of Code LLMs in controlled code generation, ensuring outputs align more closely with human-defined guidelines. The authors further present IFEvalCode, a multilingual benchmark comprising 1.6K test samples across seven programming languages (Python, Java, JavaScript, TypeScript, Shell, C++, and C#), with each sample featuring both Chinese and English queries. Unlike existing benchmarks, IFEvalCode decouples evaluation into two metrics: correctness (Corr.) and instruction-following (Instr.), enabling a more nuanced assessment. Experiments on over 40 LLMs reveal that closed-source models outperform open-source ones in controllable code generation and highlight a significant gap between the models' ability to generate correct code versus code that precisely follows instructions.
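The decoupled scoring is simple to sketch. Both checkers below are made up for illustration and are not the benchmark's code: correctness comes from executing a test case, while instruction-following checks a human-specified constraint such as a line budget.

```python
# Hedged sketch of decoupled scoring: Corr. from tests, Instr. from constraints.
code = "def add(a, b):\n    return a + b\n"

def correctness(code: str) -> bool:
    ns = {}
    exec(code, ns)                  # run the candidate solution
    return ns["add"](2, 3) == 5     # a stand-in test case

def instruction_following(code: str, max_lines: int = 3) -> bool:
    # e.g., a line-count constraint from the prompt
    return len(code.strip().splitlines()) <= max_lines

print(f"Corr.: {correctness(code)}, Instr.: {instruction_following(code)}")
```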


Loss Landscape Degeneracy and Stagewise Development in Transformers

arXiv:2402.02364v3 Announce Type: replace-cross Abstract: Deep learning involves navigating a high-dimensional loss landscape over the neural network parameter space. Over the course of training, complex computational structures form and re-form inside the neural network, leading to shifts in input/output behavior. It is a priority for the science of deep learning to uncover principles governing the development of neural network structure and behavior. Drawing on the framework of singular learning theory, we propose that model development is deeply linked to degeneracy in the local geometry of the loss landscape. We investigate this link by monitoring loss landscape degeneracy throughout training, as quantified by the local learning coefficient, for a transformer language model and an in-context linear regression transformer. We show that training can be divided into distinct periods of change in loss landscape degeneracy, and that these changes in degeneracy coincide with significant changes in the internal computational structure and the input/output behavior of the transformers. This finding provides suggestive evidence that degeneracy and development are linked in transformers, underscoring the potential of a degeneracy-based perspective for understanding modern deep learning.
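For orientation, the local learning coefficient mentioned here is commonly estimated in the singular learning theory literature by sampling a tempered posterior localized around the trained parameter w*. A standard form of the estimator, reproduced as a reference point rather than from this paper (verify the exact normalization against the works it cites), is:

```latex
\hat{\lambda}(w^*) = n\beta \left( \mathbb{E}^{\beta}_{w \mid w^*}\big[ L_n(w) \big] - L_n(w^*) \right),
\qquad \beta = \frac{1}{\log n},
```

where $L_n$ is the empirical loss over $n$ samples and the expectation is taken under a posterior tempered at inverse temperature $\beta$ and localized near $w^*$ (approximated in practice with SGLD). More degenerate loss-landscape geometry corresponds to a lower $\hat{\lambda}$.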


Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training

arXiv:2508.00414v1 Announce Type: cross Abstract: General AI Agents are increasingly recognized as foundational frameworks for the next generation of artificial intelligence, enabling complex reasoning, web interaction, coding, and autonomous research capabilities. However, current agent systems are either closed-source or heavily reliant on a variety of paid APIs and proprietary tools, limiting accessibility and reproducibility for the research community. In this work, we present Cognitive Kernel-Pro, a fully open-source and (to the maximum extent) free multi-module agent framework designed to democratize the development and evaluation of advanced AI agents. Within Cognitive Kernel-Pro, we systematically investigate the curation of high-quality training data for Agent Foundation Models, focusing on the construction of queries, trajectories, and verifiable answers across four key domains: web, file, code, and general reasoning. Furthermore, we explore novel strategies for agent test-time reflection and voting to enhance agent robustness and performance. We evaluate Cognitive Kernel-Pro on GAIA, achieving state-of-the-art results among open-source and free agents. Notably, our 8B-parameter open-source model surpasses previous leading systems such as WebDancer and WebSailor, establishing a new performance standard for accessible, high-capability AI agents. Code is available at https://github.com/Tencent/CognitiveKernel-Pro
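Test-time voting of the kind the abstract mentions can be sketched generically. The agent function below is a stub, not Cognitive Kernel-Pro's actual interface: run the agent several times, then majority-vote the final answers.

```python
# Generic sketch of test-time voting over multiple agent rollouts.
from collections import Counter
import random

def run_agent(task: str, seed: int) -> str:
    random.seed(seed)  # stand-in for a stochastic agent rollout
    return random.choice(["Paris", "Paris", "Lyon"])

def vote(task: str, k: int = 5) -> str:
    answers = [run_agent(task, seed=i) for i in range(k)]
    best, _ = Counter(answers).most_common(1)[0]
    print(f"votes: {dict(Counter(answers))} -> {best}")
    return best

vote("What is the capital of France?")
```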


Tutorial: Exploring SHAP-IQ Visualizations

In this tutorial, we'll explore a range of SHAP-IQ visualizations that provide insights into how a machine learning model arrives at its predictions. These visuals help break down complex model behavior into interpretable components, revealing both the individual and interactive contributions of features to a specific prediction.

Installing the dependencies

```python
!pip install shapiq overrides scikit-learn pandas numpy seaborn
```

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from tqdm.asyncio import tqdm

import shapiq

print(f"shapiq version: {shapiq.__version__}")
```

Importing the dataset

In this tutorial, we'll use the MPG (Miles Per Gallon) dataset, which we'll load directly from the Seaborn library. This dataset contains information about various car models, including features like horsepower, weight, and origin.

```python
import seaborn as sns

df = sns.load_dataset("mpg")
df
```

Processing the dataset

We use label encoding to convert the categorical column(s) into numeric format, making them suitable for model training.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Drop rows with missing values
df = df.dropna()

# Encode the origin column
le = LabelEncoder()
df.loc[:, "origin"] = le.fit_transform(df["origin"])
df["origin"].unique()
```

```python
for i, label in enumerate(le.classes_):
    print(f"{label} → {i}")
```

Splitting the data into training & test subsets

```python
# Select features and target
X = df.drop(columns=["mpg", "name"])
y = df["mpg"]

feature_names = X.columns.tolist()
x_data, y_data = X.values, y.values

# Train-test split
x_train, x_test, y_train, y_test = train_test_split(
    x_data, y_data, test_size=0.2, random_state=42
)
```

Model Training

We train a Random Forest Regressor with a maximum depth of 10 and 10 decision trees (n_estimators=10). A fixed random_state ensures reproducibility.

```python
# Train model
model = RandomForestRegressor(random_state=42, max_depth=10, n_estimators=10)
model.fit(x_train, y_train)
```

Model Evaluation

```python
# Evaluate
mse = mean_squared_error(y_test, model.predict(x_test))
r2 = r2_score(y_test, model.predict(x_test))
print(f"Mean Squared Error: {mse:.2f}")
print(f"R2 Score: {r2:.2f}")
```

Explaining a Local Instance

We choose a specific test instance (instance_id = 7) to explore how the model arrived at its prediction. We'll print the true value, predicted value, and the feature values for this instance.

```python
# Select a local instance to be explained
instance_id = 7
x_explain = x_test[instance_id]
y_true = y_test[instance_id]
y_pred = model.predict(x_explain.reshape(1, -1))[0]
print(f"Instance {instance_id}, True Value: {y_true}, Predicted Value: {y_pred}")

for i, feature in enumerate(feature_names):
    print(f"{feature}: {x_explain[i]}")
```

Generating Explanations for Multiple Interaction Orders

We generate Shapley-based explanations for different interaction orders using the shapiq package. Specifically, we compute:

Order 1 (Standard Shapley Values): individual feature contributions
Order 2 (Pairwise Interactions): combined effects of feature pairs
Order N (Full Interaction): all interactions up to the total number of features

```python
# Create explanations for different orders
feature_names = list(X.columns)  # get the feature names
n_features = len(feature_names)

si_order: dict[int, shapiq.InteractionValues] = {}
for order in tqdm([1, 2, n_features]):
    index = "k-SII" if order > 1 else "SV"  # will also be set automatically by the explainer
    explainer = shapiq.TreeExplainer(model=model, max_order=order, index=index)
    si_order[order] = explainer.explain(x=x_explain)
si_order
```

1. Force Chart

The force plot is a powerful visualization tool that helps us understand how a machine learning model arrived at a specific prediction. It displays the baseline prediction (i.e., the expected value of the model before seeing any features), and then shows how each feature "pushes" the prediction higher or lower. In this plot:

Red bars represent features or interactions that increase the prediction.
Blue bars represent those that decrease it.
The length of each bar corresponds to the magnitude of its effect.

When using Shapley interaction values, the force plot can visualize not just individual contributions but also interactions between features. This makes it especially insightful when interpreting complex models, as it visually decomposes how combinations of features work together to influence the outcome.

```python
sv = si_order[1]           # get the SV (first-order Shapley values)
si = si_order[2]           # get the 2-SII values
mi = si_order[n_features]  # get the Moebius transform

sv.plot_force(feature_names=feature_names, show=True)
si.plot_force(feature_names=feature_names, show=True)
mi.plot_force(feature_names=feature_names, show=True)
```

From the first plot, we can see that the base value is 23.5. Features like Weight, Cylinders, Horsepower, and Displacement have a positive influence on the prediction, pushing it above the baseline. On the other hand, Model Year and Acceleration have a negative impact, pulling the prediction downward.

2. Waterfall Chart

Similar to the force plot, the waterfall plot is another popular way to visualize Shapley values, originally introduced with the shap library. It shows how different features push the prediction higher or lower relative to the baseline. One key advantage of the waterfall plot is that it automatically groups features with very small impacts into an "other" category, making the chart cleaner and easier to understand.

```python
sv.plot_waterfall(feature_names=feature_names, show=True)
si.plot_waterfall(feature_names=feature_names, show=True)
mi.plot_waterfall(feature_names=feature_names, show=True)
```

3. Network Plot

The network plot shows how features interact with each other using first- and second-order Shapley interactions. Node size reflects individual feature impact, while edge width and color show interaction strength and direction. It's especially helpful when dealing with many features, revealing complex interactions that simpler plots might miss.

```python
si.plot_network(feature_names=feature_names, show=True)
mi.plot_network(feature_names=feature_names, show=True)
```

4. SI Graph Plot

The SI graph plot extends the network plot by visualizing all higher-order interactions as hyper-edges connecting multiple features. Node size shows individual feature impact, while edge width, color, and transparency reflect the strength and direction of interactions. It provides a comprehensive view of how features jointly influence the model's prediction.

```python
# Abbreviate the feature names, since they are plotted inside the nodes
abbrev_feature_names = shapiq.plot.utils.abbreviate_feature_names(feature_names)

sv.plot_si_graph(
    feature_names=abbrev_feature_names,
    show=True,
    size_factor=2.5,
    node_size_scaling=1.5,
    plot_original_nodes=True,
)
si.plot_si_graph(
    feature_names=abbrev_feature_names,
    show=True,
    size_factor=2.5,
    node_size_scaling=1.5,
    plot_original_nodes=True,
)
mi.plot_si_graph(
    feature_names=abbrev_feature_names,
    show=True,
    size_factor=2.5,
    node_size_scaling=1.5,
    plot_original_nodes=True,
)
```

5. Bar Plot

The bar plot is tailored for global explanations.
