Fast, Slow, and Tool-augmented Thinking for LLMs: A Review

arXiv:2508.12265v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated remarkable progress in reasoning across diverse domains. However, effective reasoning in real-world tasks requires adapting the reasoning strategy to the demands of the problem, ranging from fast, intuitive responses to deliberate, step-by-step reasoning and tool-augmented thinking. Drawing inspiration from cognitive psychology, we propose a novel taxonomy of LLM reasoning strategies along two knowledge boundaries: a fast/slow boundary separating intuitive from deliberative processes, and an internal/external boundary distinguishing reasoning grounded in the model’s parameters from reasoning augmented by external tools. We systematically survey recent work on adaptive reasoning in LLMs and categorize methods based on key decision factors. We conclude by highlighting open challenges and future directions toward more adaptive, efficient, and reliable LLMs.

Model Interpretability and Rationale Extraction by Input Mask Optimization

arXiv:2508.11388v1 Announce Type: new Abstract: Concurrent with the rapid progress in the development of neural-network-based models in areas like natural language processing and computer vision, the need for creating explanations for the predictions of these black-box models has risen steadily. We propose a new method to generate extractive explanations for predictions made by neural networks, based on masking the parts of the input that the model does not consider indicative of the respective class. The masking is done using gradient-based optimization combined with a new regularization scheme that enforces sufficiency, comprehensiveness, and compactness of the generated explanation, three properties that are known to be desirable from the related field of rationale extraction in natural language processing. In this way, we bridge the gap between model interpretability and rationale extraction, thereby showing that the latter can be performed without training a specialized model, only on the basis of a trained classifier. We further apply the same method to image inputs and obtain high-quality explanations for image classifications, which indicates that the conditions proposed for rationale extraction in natural language processing are more broadly applicable to different input types.
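
As a rough illustration of the core idea (a minimal sketch under stated assumptions, not the authors' implementation), the snippet below optimizes a soft mask over token embeddings by gradient descent, combining a sufficiency term (the masked input should preserve the classifier's original prediction) with a compactness penalty (the mask should be sparse). The comprehensiveness term is omitted for brevity, and the toy classifier, loss weights, and mask parameterization are assumptions.

# Sketch: gradient-based input-mask optimization against a frozen toy classifier.
# Not the paper's exact objective; only sufficiency and compactness terms are shown.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
seq_len, emb_dim, n_classes = 12, 32, 2

# Frozen "black-box" classifier over mean-pooled embeddings (stand-in for a trained model).
classifier = nn.Sequential(nn.Linear(emb_dim, 64), nn.ReLU(), nn.Linear(64, n_classes))
for p in classifier.parameters():
    p.requires_grad_(False)

x = torch.randn(seq_len, emb_dim)                       # one input as token embeddings
with torch.no_grad():
    target = classifier(x.mean(0)).softmax(-1)          # original prediction to preserve

mask_logits = torch.zeros(seq_len, requires_grad=True)  # one scalar per token
opt = torch.optim.Adam([mask_logits], lr=0.1)

for step in range(200):
    mask = torch.sigmoid(mask_logits)                   # soft mask in [0, 1]
    masked_pred = classifier((mask[:, None] * x).mean(0)).softmax(-1)
    sufficiency = F.kl_div(masked_pred.log(), target, reduction="sum")  # keep the prediction
    compactness = mask.mean()                           # prefer masking most tokens out
    loss = sufficiency + 0.1 * compactness
    opt.zero_grad(); loss.backward(); opt.step()

rationale = (torch.sigmoid(mask_logits) > 0.5).nonzero().flatten().tolist()
print("tokens kept as rationale:", rationale)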

Feedback Indicators: The Alignment between Llama and a Teacher in Language Learning

arXiv:2508.11364v1 Announce Type: new Abstract: Automated feedback generation has the potential to enhance students' learning progress by providing timely and targeted feedback. Moreover, it can assist teachers in optimizing their time, allowing them to focus on more strategic and personalized aspects of teaching. To generate high-quality, information-rich formative feedback, it is essential first to extract relevant indicators, as these serve as the foundation upon which the feedback is constructed. Teachers often employ feedback criteria grids composed of various indicators that they evaluate systematically. This study examines the initial phase of extracting such indicators from students' submissions in a language learning course using the large language model Llama 3.1. Accordingly, the alignment between indicators generated by the LLM and human ratings across various feedback criteria is investigated. The findings demonstrate statistically significant strong correlations, even in cases involving unanticipated combinations of indicators and criteria. The methodology employed in this paper offers a promising foundation for extracting indicators from students' submissions using LLMs. Such indicators can potentially be utilized to auto-generate explainable and transparent formative feedback in future research.
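
The alignment analysis described above can be sketched in a few lines, assuming the LLM-extracted indicator scores and the teacher's ratings for one criterion are already available as numeric arrays (the values below are made-up placeholders, not data from the study):

# Sketch: correlating LLM-derived indicator scores with teacher ratings for one criterion.
from scipy.stats import spearmanr

llm_indicator = [3, 4, 2, 5, 4, 1, 3, 5, 2, 4]   # hypothetical scores extracted by Llama 3.1
teacher_rating = [3, 5, 2, 5, 3, 1, 3, 4, 2, 4]  # hypothetical ratings by the teacher

rho, p_value = spearmanr(llm_indicator, teacher_rating)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.4f}")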

Retrieval-augmented reasoning with lean language models

arXiv:2508.11386v1 Announce Type: new Abstract: This technical report details a novel approach to combining reasoning and retrieval augmented generation (RAG) within a single, lean language model architecture. While existing RAG systems typically rely on large-scale models and external APIs, our work addresses the increasing demand for performant and privacy-preserving solutions deployable in resource-constrained or secure environments. Building on recent developments in test-time scaling and small-scale reasoning models, we develop a retrieval augmented conversational agent capable of interpreting complex, domain-specific queries using a lightweight backbone model. Our system integrates a dense retriever with fine-tuned Qwen2.5-Instruct models, using synthetic query generation and reasoning traces derived from frontier models (e.g., DeepSeek-R1) over a curated corpus, in this case, the NHS A-to-Z condition pages. We explore the impact of summarisation-based document compression, synthetic data design, and reasoning-aware fine-tuning on model performance. Evaluation against both non-reasoning and general-purpose lean models demonstrates that our domain-specific fine-tuning approach yields substantial gains in answer accuracy and consistency, approaching frontier-level performance while remaining feasible for local deployment. All implementation details and code are publicly released to support reproducibility and adaptation across domains.
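
The report's pipeline is not reproduced here, but its overall shape (dense retrieval feeding a small instruction-tuned generator) can be sketched as follows; the checkpoint names, toy corpus, and prompt format are assumptions, and the actual system additionally uses reasoning traces and domain fine-tuning on the NHS corpus.

# Sketch: dense retrieval + a small instruction-tuned generator (assumed checkpoints).
from sentence_transformers import SentenceTransformer, util
from transformers import AutoModelForCausalLM, AutoTokenizer

corpus = [
    "Chickenpox causes an itchy, spotty rash and usually clears up on its own.",
    "Migraines are severe headaches often accompanied by nausea and sensitivity to light.",
]

retriever = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
doc_emb = retriever.encode(corpus, convert_to_tensor=True)

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
lm = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")

def answer(question: str) -> str:
    # Retrieve the single closest document by cosine similarity.
    q_emb = retriever.encode(question, convert_to_tensor=True)
    context = corpus[int(util.cos_sim(q_emb, doc_emb).argmax())]
    messages = [{"role": "user",
                 "content": f"Answer using only this context:\n{context}\n\nQuestion: {question}"}]
    inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
    output = lm.generate(inputs, max_new_tokens=128)
    return tok.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)

print(answer("What are the symptoms of chickenpox?"))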

ToxiFrench: Benchmarking and Enhancing Language Models via CoT Fine-Tuning for French Toxicity Detection

arXiv:2508.11281v1 Announce Type: new Abstract: Detecting toxic content using language models is crucial yet challenging. While substantial progress has been made in English, toxicity detection in French remains underdeveloped, primarily due to the lack of culturally relevant, large-scale datasets. In this work, we introduce TOXIFRENCH, a new public benchmark of 53,622 French online comments, constructed via a semi-automated annotation pipeline that reduces manual labeling to only 10% through high-confidence LLM-based pre-annotation and human verification. Then, we benchmark a broad range of models and uncover a counterintuitive insight: Small Language Models (SLMs) outperform many larger models in robustness and generalization under the toxicity detection task. Motivated by this finding, we propose a novel Chain-of-Thought (CoT) fine-tuning strategy using a dynamic weighted loss that progressively emphasizes the model's final decision, significantly improving faithfulness. Our fine-tuned 4B model achieves state-of-the-art performance, improving its F1 score by 13% over its baseline and outperforming LLMs such as GPT-4o and Gemini-2.5. Further evaluation on a cross-lingual toxicity benchmark demonstrates strong multilingual ability, suggesting that our methodology can be effectively extended to other languages and safety-critical classification tasks.
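
One plausible reading of the dynamic weighted loss (an assumption for illustration, since the exact schedule is not given in the abstract) is a per-token cross-entropy whose weights ramp up toward the tokens that encode the final toxicity decision:

# Sketch: per-token weighted cross-entropy emphasizing the final-decision tokens of a CoT
# sequence. The weight schedule, shapes, and random tensors are illustrative assumptions.
import torch
import torch.nn.functional as F

batch, seq_len, vocab = 2, 16, 1000
logits = torch.randn(batch, seq_len, vocab)          # model outputs over CoT + decision tokens
labels = torch.randint(0, vocab, (batch, seq_len))   # target tokens

# Linearly increasing weights so later (decision) tokens dominate; a training schedule
# could sharpen this emphasis over epochs, which is what "dynamic" suggests.
weights = torch.linspace(0.2, 1.0, seq_len)

per_token = F.cross_entropy(
    logits.reshape(-1, vocab), labels.reshape(-1), reduction="none"
).reshape(batch, seq_len)

loss = (per_token * weights).sum() / (weights.sum() * batch)
print(float(loss))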

What is AI Inference? A Technical Deep Dive and Top 9 AI Inference Providers (2025 Edition)

Artificial Intelligence (AI) has evolved rapidly, especially in how models are deployed and operated in real-world systems. The core function that connects model training to practical applications is "inference". This article offers a technical deep dive into AI inference as of 2025, covering its distinction from training, latency challenges for modern models, and optimization strategies such as quantization, pruning, and hardware acceleration.

Inference vs. Training: The Critical Difference
AI model deployment consists of two primary phases:
- Training is the process where a model learns patterns from massive, labeled datasets, using iterative algorithms (typically backpropagation on neural networks). This phase is computation-heavy and generally done offline, leveraging accelerators like GPUs.
- Inference is the model's "in action" phase: making predictions on new, unseen data. Here, the trained network is fed input and the output is produced via a forward pass only. Inference happens in production environments, often requiring rapid responses and lower resource use.

Aspect           | Training                                | Inference
Purpose          | Learn patterns, optimize weights        | Make predictions on new data
Computation      | Heavy, iterative, uses backpropagation  | Lighter, forward pass only
Time Sensitivity | Offline, can take hours/days/weeks      | Real-time or near-real-time
Hardware         | GPUs/TPUs, datacenter-scale             | CPUs, GPUs, FPGAs, edge devices

Inference Latency: Challenges for 2025
Latency, the time from input to output, is one of the top technical challenges in deploying AI, especially for large language models (LLMs) and real-time applications (autonomous vehicles, conversational bots, etc.).

Key Sources of Latency
- Computational Complexity: Modern architectures like transformers have quadratic computational cost due to self-attention, i.e. O(n^2 d) for sequence length n and embedding dimension d.
- Memory Bandwidth: Large models (with billions of parameters) require tremendous data movement, which often bottlenecks on memory speed and system I/O.
- Network Overhead: For cloud inference, network latency and bandwidth become critical, especially for distributed and edge deployments.
- Predictable vs. Unpredictable Latency: Some delays can be designed for (e.g., batch inference), while others, such as hardware contention and network jitter, cause unpredictable delays.

Real-World Impact
Latency directly affects user experience (voice assistants, fraud detection), system safety (driverless cars), and operational cost (cloud compute resources). As models grow, optimizing latency becomes increasingly complex and essential.

Quantization: Lightening the Load
Quantization reduces model size and computational requirements by lowering numerical precision (e.g., converting 32-bit floats to 8-bit integers).
- How It Works: Quantization replaces high-precision parameters with lower-precision approximations, decreasing memory and compute needs.
- Types: Uniform and non-uniform quantization; Post-Training Quantization (PTQ); Quantization-Aware Training (QAT).
- Trade-offs: While quantization can dramatically speed up inference, it might slightly reduce model accuracy; careful application keeps performance within acceptable bounds.
- LLMs & Edge Devices: Especially valuable for LLMs and battery-powered devices, allowing for fast, low-cost inference.
A short code sketch covering both quantization and pruning follows the pruning section below.

Pruning: Model Simplification
Pruning is the process of removing redundant or non-essential model components, such as neural network weights or decision tree branches.
- Techniques: L1 regularization (penalizes large weights, shrinking less useful ones toward zero); magnitude pruning (removes the lowest-magnitude weights or neurons); Taylor expansion (estimates the least impactful weights and prunes them); SVM pruning (reduces support vectors to simplify decision boundaries).
- Benefits: Lower memory use, faster inference, reduced overfitting, and easier deployment to resource-constrained environments.
- Risks: Aggressive pruning may degrade accuracy; balancing efficiency and accuracy is key.
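
Both techniques can be exercised directly in a framework like PyTorch. The following is a minimal sketch (not taken from the article) that applies post-training dynamic quantization and L1 magnitude pruning to a toy feed-forward classifier; the layer sizes, pruning ratio, and model are illustrative assumptions.

# Sketch: post-training dynamic quantization and magnitude pruning on a toy PyTorch model.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy feed-forward classifier standing in for a real model.
model = nn.Sequential(
    nn.Linear(256, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

# Post-training dynamic quantization: Linear weights are stored as int8 and
# dequantized on the fly, shrinking the model and speeding up CPU inference.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# Magnitude (L1) pruning: zero out the 30% smallest-magnitude weights in each
# Linear layer of the original float model, then make the sparsity permanent.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")

# Inference (forward pass only) with both variants on a dummy batch.
x = torch.randn(4, 256)
with torch.no_grad():
    print("quantized logits:", quantized(x).shape)
    print("pruned logits:", model(x).shape)

In practice, the pruning ratio and the set of quantized layers are tuned against an accuracy budget, reflecting the trade-offs noted above.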

Hardware Acceleration: Speeding Up Inference
Specialized hardware is transforming AI inference in 2025:
- GPUs: Offer massive parallelism, ideal for matrix and vector operations.
- NPUs (Neural Processing Units): Custom processors optimized for neural network workloads.
- FPGAs (Field-Programmable Gate Arrays): Configurable chips for targeted, low-latency inference in embedded/edge devices.
- ASICs (Application-Specific Integrated Circuits): Purpose-built for the highest efficiency and speed in large-scale deployments.

Trends:
- Real-time, energy-efficient processing: Essential for autonomous systems, mobile devices, and IoT.
- Versatile deployment: Hardware accelerators now span cloud servers to edge devices.
- Reduced cost and energy: Emerging accelerator architectures slash operational costs and carbon footprints.

Here are the top 9 AI inference providers in 2025:
1. Together AI: Specializes in scalable LLM deployments, offering fast inference APIs and unique multi-model routing for hybrid cloud setups.
2. Fireworks AI: Renowned for ultra-fast multi-modal inference and privacy-oriented deployments, leveraging optimized hardware and proprietary engines for low latency.
3. Hyperbolic: Delivers serverless inference for generative AI, integrating automated scaling and cost optimization for high-volume workloads.
4. Replicate: Focuses on model hosting and deployment, allowing developers to run and share AI models rapidly in production with easy integrations.
5. Hugging Face: The go-to platform for transformer and LLM inference, providing robust APIs, customization options, and community-backed open-source models.
6. Groq: Known for custom Language Processing Unit (LPU) hardware that achieves very low-latency, high-throughput inference for large models.
7. DeepInfra: Offers a dedicated cloud for high-performance inference, catering especially to startups and enterprise teams with customizable infrastructure.
8. OpenRouter: Aggregates multiple LLM engines, providing dynamic model routing and cost transparency for enterprise-grade inference orchestration.
9. Lepton (acquired by NVIDIA): Specializes in compliance-focused, secure AI inference with real-time monitoring and scalable edge/cloud deployment options.

Conclusion
Inference is where AI meets the real world, turning data-driven learning into actionable predictions. Its technical challenges (latency, resource constraints) are being met by innovations in quantization, pruning, and hardware acceleration. As AI models scale and diversify, mastering inference efficiency is the frontier for competitive, impactful deployment in 2025. Whether deploying conversational LLMs, real-time computer vision systems, or on-device diagnostics, understanding and optimizing inference will be central for technologists and enterprises aiming to lead in the AI era.

Meet dots.ocr: A New 1.7B Vision-Language Model that Achieves SOTA Performance on Multilingual Document Parsing

dots.ocr is an open-source vision-language transformer model developed for multilingual document layout parsing and optical character recognition (OCR). It performs both layout detection and content recognition within a single architecture, supporting over 100 languages and a wide variety of structured and unstructured document types.

Architecture
- Unified Model: dots.ocr combines layout detection and content recognition into a single transformer-based neural network. This eliminates the complexity of separate detection and OCR pipelines, allowing users to switch tasks by adjusting input prompts.
- Parameters: The model contains 1.7 billion parameters, balancing computational efficiency with performance for most practical scenarios.
- Input Flexibility: Inputs can be image files or PDF documents. The model features preprocessing options (such as fitz_preprocess) for optimizing quality on low-resolution or dense multi-page files.

Capabilities
- Multilingual: dots.ocr is trained on datasets spanning more than 100 languages, including major world languages and less common scripts, reflecting broad multilingual support.
- Content Extraction: The model extracts plain text, tabular data, and mathematical formulas (in LaTeX), and preserves reading order within documents. Output formats include structured JSON, Markdown, and HTML, depending on the layout and content type.
- Preserves Structure: dots.ocr maintains document structure, including table boundaries, formula regions, and image placements, ensuring extracted data remains faithful to the original document.

Benchmark Performance
dots.ocr has been evaluated against modern document AI systems, with results summarized below:

Benchmark           | dots.ocr | Gemini2.5-Pro
Table TEDS accuracy | 88.6%    | 85.8%
Text edit distance  | 0.032    | 0.055

- Tables: Outperforms Gemini2.5-Pro in table parsing accuracy.
- Text: Demonstrates a lower text edit distance, indicating higher precision.
- Formulas and Layout: Matches or exceeds leading models in formula recognition and document structure reconstruction.

https://github.com/rednote-hilab/dots.ocr/blob/master/assets/blog.md

Deployment and Integration
- Open-Source: Released under the MIT license, with source, documentation, and pre-trained models available on GitHub. The repository provides installation instructions for pip, Conda, and Docker-based deployments.
- API and Scripting: Supports flexible task configuration via prompt templates. The model can be used interactively or within automated pipelines for batch document processing.
- Output Formats: Extracted results are supplied in structured JSON for programmatic use, with options for Markdown and HTML where appropriate. Visualization scripts enable inspection of detected layouts.

Conclusion
dots.ocr provides a technical solution for high-accuracy, multilingual document parsing by unifying layout detection and content recognition in a single, open-source model. It is particularly suited for scenarios requiring robust, language-agnostic document analysis and structured information extraction in resource-constrained or production environments.
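
For reference, the "text edit distance" row above is the kind of metric usually computed as a normalized Levenshtein distance between predicted and ground-truth text (lower is better); a small self-contained sketch for illustration, with made-up strings:

# Sketch: normalized Levenshtein (edit) distance, illustrating the metric family behind
# the "text edit distance" row. Pure Python; the example strings are placeholders.
def normalized_edit_distance(pred: str, truth: str) -> float:
    m, n = len(pred), len(truth)
    dp = list(range(n + 1))                    # dp[j] holds the previous row's distances
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,             # deletion
                        dp[j - 1] + 1,         # insertion
                        prev + (pred[i - 1] != truth[j - 1]))  # substitution
            prev = cur
    return dp[n] / max(m, n, 1)

print(normalized_edit_distance("dots.ocr parses documents", "dots.ocr parses document"))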

A Coding Guide to Build and Validate End-to-End Partitioned Data Pipelines in Dagster with Machine Learning Integration

In this tutorial, we implement an advanced data pipeline using Dagster. We set up a custom CSV-based IOManager to persist assets, define partitioned daily data generation, and process synthetic sales data through cleaning, feature engineering, and model training. Along the way, we add a data-quality asset check to validate nulls, ranges, and categorical values, and we ensure that metadata and outputs are stored in a structured way. The focus throughout is on hands-on implementation, showing how to integrate raw data ingestion, transformations, quality checks, and machine learning into a single reproducible workflow.

import sys, subprocess, json, os
subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", "dagster", "pandas", "scikit-learn"])

import numpy as np, pandas as pd
from pathlib import Path
from dagster import (
    asset, AssetCheckResult, asset_check, Definitions, materialize, Output,
    DailyPartitionsDefinition, IOManager, io_manager
)
from sklearn.linear_model import LinearRegression

BASE = Path("/content/dagstore"); BASE.mkdir(parents=True, exist_ok=True)
START = "2025-08-01"

We begin by installing the required libraries (Dagster, Pandas, and scikit-learn) so that we have the full toolset available in Colab. We then import the essential modules, set up NumPy and Pandas for data handling, and define a base directory along with a start date to organize our pipeline outputs.

class CSVIOManager(IOManager):
    def __init__(self, base: Path):
        self.base = base

    def _path(self, key, ext):
        return self.base / f"{'_'.join(key.path)}.{ext}"

    def handle_output(self, context, obj):
        if isinstance(obj, pd.DataFrame):
            p = self._path(context.asset_key, "csv"); obj.to_csv(p, index=False)
            context.log.info(f"Saved {context.asset_key} -> {p}")
        else:
            p = self._path(context.asset_key, "json"); p.write_text(json.dumps(obj, indent=2))
            context.log.info(f"Saved {context.asset_key} -> {p}")

    def load_input(self, context):
        k = context.upstream_output.asset_key; p = self._path(k, "csv")
        df = pd.read_csv(p); context.log.info(f"Loaded {k} <- {p} ({len(df)} rows)"); return df

@io_manager
def csv_io_manager(_):
    return CSVIOManager(BASE)

daily = DailyPartitionsDefinition(start_date=START)

We define a custom CSVIOManager to save asset outputs as CSV or JSON files and reload them when needed. We then register it with Dagster as csv_io_manager and set up a daily partitioning scheme so that our pipeline can process data for each date independently.

@asset(partitions_def=daily, description="Synthetic raw sales with noise & occasional nulls.")
def raw_sales(context) -> Output[pd.DataFrame]:
    rng = np.random.default_rng(42)
    n = 200; day = context.partition_key
    x = rng.normal(100, 20, n); promo = rng.integers(0, 2, n); noise = rng.normal(0, 10, n)
    sales = 2.5 * x + 30 * promo + noise + 50
    x[rng.choice(n, size=max(1, n // 50), replace=False)] = np.nan
    df = pd.DataFrame({"date": day, "units": x, "promo": promo, "sales": sales})
    meta = {"rows": n, "null_units": int(df["units"].isna().sum()), "head": df.head().to_markdown()}
    return Output(df, metadata=meta)

@asset(description="Clean nulls, clip outliers for robust downstream modeling.")
def clean_sales(context, raw_sales: pd.DataFrame) -> Output[pd.DataFrame]:
    df = raw_sales.dropna(subset=["units"]).copy()
    lo, hi = df["units"].quantile([0.01, 0.99]); df["units"] = df["units"].clip(lo, hi)
    meta = {"rows": len(df), "units_min": float(df.units.min()), "units_max": float(df.units.max())}
    return Output(df, metadata=meta)

@asset(description="Feature engineering: interactions & standardized columns.")
def features(context, clean_sales: pd.DataFrame) -> Output[pd.DataFrame]:
    df = clean_sales.copy()
    df["units_sq"] = df["units"] ** 2; df["units_promo"] = df["units"] * df["promo"]
    for c in ["units", "units_sq", "units_promo"]:
        mu, sigma = df[c].mean(), df[c].std(ddof=0) or 1.0
        df[f"z_{c}"] = (df[c] - mu) / sigma
    return Output(df, metadata={"rows": len(df), "cols": list(df.columns)})

We create three core assets for the pipeline. First, raw_sales generates synthetic daily sales data with noise and occasional missing values, simulating real-world imperfections. Next, clean_sales removes nulls and clips outliers to stabilize the dataset, while logging metadata about ranges and row counts. Finally, features performs feature engineering by adding interaction and standardized variables, preparing the data for downstream modeling.

@asset_check(asset=clean_sales, description="No nulls; promo in {0,1}; units within clipped bounds.")
def clean_sales_quality(clean_sales: pd.DataFrame) -> AssetCheckResult:
    nulls = int(clean_sales.isna().sum().sum())
    promo_ok = bool(set(clean_sales["promo"].unique()).issubset({0, 1}))
    units_ok = bool(clean_sales["units"].between(clean_sales["units"].min(), clean_sales["units"].max()).all())
    passed = bool((nulls == 0) and promo_ok and units_ok)
    return AssetCheckResult(
        passed=passed,
        metadata={"nulls": nulls, "promo_ok": promo_ok, "units_ok": units_ok},
    )

@asset(description="Train a tiny linear regressor; emit R^2 and coefficients.")
def tiny_model_metrics(context, features: pd.DataFrame) -> dict:
    X = features[["z_units", "z_units_sq", "z_units_promo", "promo"]].values
    y = features["sales"].values
    model = LinearRegression().fit(X, y)
    return {"r2_train": float(model.score(X, y)),
            **{n: float(c) for n, c in zip(["z_units", "z_units_sq", "z_units_promo", "promo"], model.coef_)}}

We strengthen the pipeline with validation and modeling. The clean_sales_quality asset check enforces data integrity by verifying that there are no nulls, that the promo field only contains 0/1 values, and that the cleaned units remain within valid bounds. After that, tiny_model_metrics trains a simple linear regression on the engineered features and returns key metrics such as the training R^2 and the learned coefficients, giving us a lightweight but complete modeling step within the Dagster workflow.

defs = Definitions(
    assets=[raw_sales, clean_sales, features, tiny_model_metrics, clean_sales_quality],
    resources={"io_manager": csv_io_manager}
)

if __name__ == "__main__":
    run_day = os.environ.get("RUN_DATE") or START
    print("Materializing everything for:", run_day)
    result = materialize(
        [raw_sales, clean_sales, features, tiny_model_metrics, clean_sales_quality],
        partition_key=run_day,
        resources={"io_manager": csv_io_manager},
    )
    print("Run success:", result.success)
    for fname in ["raw_sales.csv", "clean_sales.csv", "features.csv", "tiny_model_metrics.json"]:
        f = BASE / fname
        if f.exists():
            print(fname, "->", f.stat().st_size, "bytes")
            if fname.endswith(".json"):
                print("Metrics:", json.loads(f.read_text()))

We register our assets and the IO manager in Definitions, then materialize the entire DAG for a selected partition key in one run. We persist CSV/JSON artifacts to /content/dagstore and print a quick success flag, plus the saved file sizes and model metrics, for immediate verification.

In conclusion, we materialize all assets and checks in a single Dagster run, confirm data quality, and train a regression model whose metrics are stored for inspection. We keep the pipeline modular, with each asset producing and persisting its outputs as CSV or JSON, and ensure compatibility by explicitly converting metadata values to supported types. This tutorial demonstrates how we can combine partitioning, asset definitions, and checks to build a technically robust and reproducible workflow, giving us a practical framework to extend toward more complex real-world pipelines.

The Download: Taiwan’s silicon shield, and ChatGPT’s personality misstep

This is today's edition of The Download, our weekday newsletter that provides a daily dose of what's going on in the world of technology.

Taiwan's "silicon shield" could be weakening
Taiwanese politics increasingly revolves around one crucial question: Will China invade? China's ruling party has wanted to seize Taiwan for more than half a century. But in recent years, China's leader, Xi Jinping, has placed greater emphasis on the idea of "taking back" the island (which the Chinese Communist Party, or CCP, has never controlled).
Many in Taiwan and elsewhere think one major deterrent has to do with the island's critical role in semiconductor manufacturing. Taiwan produces the majority of the world's semiconductors and more than 90% of the most advanced chips needed for AI applications. But now some Taiwan specialists and some of the island's citizens are worried that this "silicon shield," if it ever existed, is cracking. Read the full story.
—Johanna M. Costigan
This story is from our forthcoming print issue, which is all about security. If you haven't already, subscribe now to receive future issues once they land.

Why there's a big backlash against ChatGPT's new 'personality'
When OpenAI made the switch to its new GPT-5 model last week, a number of people reacted with shock, frustration, sadness, or anger to previous model 4o's sudden disappearance from ChatGPT.
Despite its awareness that people are developing emotional bonds with the model, OpenAI appears to have been caught flat-footed by the fervor of users' pleas for its return. Within a day, the company made 4o available again to its paying customers (free users are stuck with GPT-5).
MIT Technology Review spoke with several ChatGPT users who were deeply affected by the loss of 4o. All are women between the ages of 20 and 40, and all bar one considered 4o to be a romantic partner. Read the full story.
—Grace Huckins

Why US federal health agencies are abandoning mRNA vaccines
This time five years ago, we were in the throes of the covid-19 pandemic. Then came the vaccines. The first mRNA vaccines for covid were authorized for use in December 2020. The US government played an important role in the introduction of these vaccines, providing $18 billion to support their development.
But now, that government is turning its back on the technology. Funding is being withdrawn. Partnerships are being canceled. Leaders of US health agencies are casting doubt on the vaccines' effectiveness and safety. And this week, the director of the National Institutes of Health implied that the reversal was due to a lack of public trust in the technology. Plenty of claims are being thrown about. So let's consider the evidence. Read the full story.
—Jessica Hamzelou
This article first appeared in The Checkup, MIT Technology Review's weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

The must-reads
I've combed the internet to find you today's most fun/important/scary/fascinating stories about technology.

1 The Trump administration is in talks to buy a stake in Intel
Just weeks after Trump called for the CEO to step down. (Bloomberg $)
+ It's part of its plan to increase US market share in chip manufacturing. (WSJ $)
+ Intel is probably hoping such a deal could help its beleaguered Ohio factory. (TechCrunch)

2 Meta's AI rules allowed its chatbots to flirt with children
And it only recently amended the guidelines after being questioned about it. (Reuters)
+ We don't know how long the policies were in place. (The Verge)
+ An AI companion site is hosting sexually charged conversations with underage celebrity bots. (MIT Technology Review)

3 Erin is America's first real test of hurricane readiness under Trump
It looks like it'll become the season's first hurricane. (Vox)
+ Trackers are uncertain about where the storm will head. (NYT $)
+ Here's what we know about hurricanes and climate change. (MIT Technology Review)

4 xAI lost a major US government contract after Grok praised Hitler
Leaving the government to partner with OpenAI, Anthropic, and Gemini instead. (Wired $)
+ xAI's 'Grok for Government' site doesn't appear to reflect this. (Ars Technica)

5 Tech leaders are upping their security
As public hostility towards corporate executives deepens. (FT $)

6 These TikTokers are documenting their lives after deportation
They're sharing their realities and creating new communities. (NY Mag $)
+ ICE added a random person to a highly sensitive group chat. (404 Media)

7 We may soon be able to hear some patients' inner voices
New research has successfully guessed words imagined by people unable to speak. (NYT $)
+ Motor neuron diseases took their voices. AI is bringing them back. (MIT Technology Review)

8 China's plug-in hybrids are everywhere
And they're likely to dominate exports for the next three years at least. (Rest of World)
+ China's EV giants are betting big on humanoid robots. (MIT Technology Review)

9 The UK is working with TikTok influencers to tackle medical tourism
It's a bid to raise awareness of the risks of undertaking cosmetic surgery abroad. (BBC)

10 AI may experience the passage of time differently to us
What does this mean for our future? (IEEE Spectrum)
+ What is AI? (MIT Technology Review)

Quote of the day
"We've realized the best way to get them is when they're scrolling social media."
—Ryan Odendahl, president and CEO of construction company Kwest Group, tells the Washington Post how his company is getting young people interested in learning traditional trades.

One more thing
The next generation of neural networks could live in hardware
Networks programmed directly into computer chip hardware can identify images faster, and use much less energy, than the traditional neural networks that underpin most modern AI systems.
Neural networks, from GPT-4 to Stable Diffusion, are built by wiring together perceptrons, which are highly simplified simulations of the neurons in our brains. In very large numbers, perceptrons are powerful, but they also consume enormous volumes of energy. Part of the trouble is that perceptrons are just software abstractions; running a perceptron network on a GPU requires translating that network into the language of hardware, which takes
