YouZum


AI, Committee, News, Uncategorized

Google Cloud Introduces Open Knowledge Format (OKF): A Vendor-Neutral Markdown Spec for Giving AI Agents Curated Context

Foundation models keep getting stronger, yet they still stall on the same thing: context. A model can write code or analyze a dataset, but only with the right internal knowledge. That knowledge includes table schemas, metric definitions, runbooks, and join paths, and it lives scattered across catalogs, wikis, and a few senior engineers’ heads.

Google Cloud introduced the Open Knowledge Format (OKF), an open specification that formalizes the LLM-wiki pattern into a portable, interoperable format. It is a vendor-neutral, agent- and human-friendly standard for the context modern AI systems need.

Open Knowledge Format (OKF)

OKF is a format, not a service or a platform. OKF v0.1 represents knowledge as a directory of markdown files with YAML frontmatter. A small set of agreed-upon conventions lets wikis written by one producer be consumed by a different agent without translation.

That is the whole idea. There is no compression scheme, no new runtime, and no required SDK. A bundle of OKF documents is just markdown, just files, and just YAML frontmatter. It renders on GitHub, ships as a tarball, and mounts on any filesystem. If you have used Obsidian, Notion, or Hugo, the shape will feel familiar. OKF only formalizes the conventions needed to make those patterns interoperable.

The Fragmented Context Problem

In most organizations, model context is overwhelmingly internal knowledge. Today it sits in incompatible silos: metadata catalogs with their own APIs, wikis, shared drives, code comments, and docstrings. Ask an agent, ‘How do I compute weekly active users from our event stream?’ and it must assemble that answer from scattered, mutually incompatible surfaces.

Every vendor offers its own catalog, SDK, and knowledge-graph schema. None of the knowledge is portable across products or organizations. The result is duplicated effort. Every agent builder solves the same context-assembly problem from scratch. Every catalog vendor reinvents the same data models.
Andrej Karpathy articulated the underlying idea in his April 2026 LLM Wiki gist. His point: LLMs do not get bored, do not forget to update cross-references, and can edit many files in one pass. The bookkeeping that makes humans abandon personal wikis is exactly what LLMs handle well.

The same pattern keeps reappearing under different names. Examples include Obsidian vaults wired to coding agents, the AGENTS.md and CLAUDE.md convention files, and ‘metadata as code’ repos. Each instance is bespoke, so none of them interoperate. OKF standardizes that interoperability layer so agents can do the heavy lifting.

How OKF Works: The Design in One Screen

An OKF bundle is a directory of markdown files representing concepts: tables, datasets, metrics, playbooks, runbooks, or APIs. Each concept is one file, and the file path is its identity.

```
sales/
├── index.md
├── datasets/
│   ├── index.md
│   └── orders_db.md
├── tables/
│   ├── index.md
│   ├── orders.md
│   └── customers.md
└── metrics/
    ├── index.md
    └── weekly_active_users.md
```

Each concept carries a small YAML front-matter block, then a markdown body for everything else.

```
---
type: BigQuery Table
title: Orders
description: One row per completed customer order.
resource: https://console.cloud.google.com/bigquery?p=acme&d=sales&t=orders
tags: [sales, revenue]
timestamp: 2026-05-28T14:30:00Z
---

# Schema

| Column        | Type   | Description                              |
|---------------|--------|------------------------------------------|
| `order_id`    | STRING | Globally unique order identifier.        |
| `customer_id` | STRING | FK to [customers](/tables/customers.md). |
```

The reserved structured fields are type, title, description, resource, tags, and timestamp. Concepts link to each other with normal markdown links. Those links turn the directory into a graph that is richer than file-system parent/child relationships. Bundles can optionally include index.md files for progressive disclosure and log.md files for change history.
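Because concepts reference each other through ordinary markdown links, a bundle can be checked for links that point at files it does not contain. The following is a minimal sketch using only the Python standard library, not something defined by the OKF spec. The function name `find_dangling_links`, the in-memory `bundle` dict, and the rule that a link starting with `/` resolves against the bundle root are all assumptions for illustration, inferred from the `/tables/customers.md` link in the example concept.

```python
import re

# Assumption: links that start with "/" are bundle-absolute, as in the
# [customers](/tables/customers.md) link from the example concept.
LINK_RE = re.compile(r"\]\((/[^)\s]+\.md)\)")


def find_dangling_links(bundle):
    """Return sorted (source, target) pairs whose target is not in the bundle.

    `bundle` maps bundle-relative paths such as 'tables/orders.md' to the
    markdown text of that concept (a hypothetical in-memory representation).
    """
    known = set(bundle)
    dangling = []
    for source, text in bundle.items():
        for target in set(LINK_RE.findall(text)):
            # Strip the leading "/" to compare against bundle-relative paths.
            if target.lstrip("/") not in known:
                dangling.append((source, target))
    return sorted(dangling)


# Hypothetical two-file bundle: the regions table is linked but never defined.
example_bundle = {
    "tables/orders.md": (
        "FK to [customers](/tables/customers.md) "
        "and [regions](/tables/regions.md)."
    ),
    "tables/customers.md": "One row per customer.",
}
print(find_dangling_links(example_bundle))
```

A check like this could run in a pull-request pipeline next to the bundle, which fits the article's point that the files live in version control beside the code they describe.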
Three Principles Behind the Design

Minimally opinionated: OKF requires exactly one field on every concept: type. Everything else is left to the producer. The spec defines the interoperability surface, not the content model.

Producer/consumer independence: A human-written bundle can be read by an agent. A pipeline-generated bundle can be browsed in a visualizer. The format is the contract; tooling at each end is swappable.

Format, not platform: OKF is tied to no cloud, database, model provider, or agent framework. It will never require a proprietary account to read, write, or serve.

Use Cases, With Examples

Data team metadata-as-code: Export BigQuery table and metric definitions as a bundle. Commit it next to the SQL it describes, and review changes through pull requests.

Incident runbooks for agents: Store each runbook as a concept. An on-call agent reads index.md, follows cross-links, and resolves the join path it needs.

Cross-org knowledge exchange: A vendor ships a catalog export as OKF. Your agent consumes it directly, with no integration work.

Developer-team wiki: Replace a stale Notion or Obsidian space with versioned markdown that an agent keeps current.

How OKF Compares

| Approach | Storage | Schema required | Portable | SDK/registry | Agent-readable |
|---|---|---|---|---|---|
| OKF v0.1 | Markdown + YAML files | Only type | Yes | No | Yes, no translation |
| Notion | Proprietary DB | Per-workspace | Export-only | API needed | Via API |
| Obsidian vault | Markdown files | None enforced | Yes | No | Bespoke conventions |
| Metadata catalog | Vendor store | Vendor schema | Export-only | Vendor SDK | Vendor-specific |
| RAG index | Vector store | Embedding model | No | Yes | Chunks, not concepts |

The distinction from RAG is useful for developers. RAG re-derives knowledge at query time from raw chunks. An OKF bundle stores curated, cross-linked concepts that an agent reads and updates directly.

A Minimal OKF Consumer

OKF is parseable with standard tools. This reads a bundle and builds its link graph.
```python
import pathlib
import re

import yaml  # PyYAML


def load_bundle(root):
    """Return (concepts, links) for every markdown file under root."""
    concepts, links = {}, []
    for path in pathlib.Path(root).rglob("*.md"):
        text = path.read_text(encoding="utf-8")
        meta = {}
        if text.startswith("---"):
            # Front matter sits between the first two '---' delimiters.
            _, fm, body = text.split("---", 2)
            meta = yaml.safe_load(fm) or {}
        else:
            body = text
        concepts[str(path)] = meta  # type, title, tags, etc.
        # Collect links of the form [text](/path/to/concept.md).
        for target in set(re.findall(r"\]\((/[^)]+\.md)\)", body)):
            links.append((str(path), target))  # markdown cross-links
    return concepts, links


concepts, graph = load_bundle("sales/")
```

No backend or install is needed to read or serve a bundle. The same files live in version control beside the code they describe.

Key Takeaways

Google’s Open Knowledge Format (OKF) v0.1 formalizes the LLM-wiki pattern into a portable, vendor-neutral spec. A bundle is



Want to get a data center online quickly? Give it some flex.

At the end of a tense and scoreless first half of a soccer match between the English men’s team and rival Germany, millions of Brits let out a collective sigh and did what they so often do in moments of stress: They made tea. That wave of electric kettles clicking on, however, caused a different kind of stress: a huge and sudden increase in demand for electricity. But National Grid, which operates the local transmission network, was ready. Just as those kettles started heating up, an AI program sent instructions to a data center in London to slow down some of the facility’s power-hungry chips. This reduction helped make sure there was enough supply to match demand, staving off potential blackouts or damage to electrical hardware. For data centers, which normally guzzle power without consideration for anyone or anything else’s needs, it was a radical departure. It was also a simulation. In December 2025, engineers sought to test a new breed of data center built to be flexible about its electricity needs, so they re-created the energy demand facing the UK’s grid during a match from the 2020 Euro tournament. They wanted to see how their software, called Conductor, would have responded had it been online at the time. Conductor is the signature product of Emerald AI, a firm based in Washington, DC, that’s part of a wave of companies trying to figure out whether data centers can work within the confines of the existing electric grid. This year, Emerald is set to deploy Conductor in a new facility in the part of Virginia known as Data Center Alley, this time connected to the live grid. When overall demand spikes, Conductor will turn down the power used by the data center, while making sure its servers still carry out their timeliest and most important jobs. 
Emerald’s partners on the project—which include Nvidia and the giant data-center operator Digital Realty—bill it as one of the world’s first “power-flexible AI factories.” Demonstrating that data centers can participate in this kind of give-and-take could ease what many tech leaders identify as the bottleneck in getting facilities online: It takes far longer to get approval for, construct, and connect new power plants than to build data centers. PJM, the grid operator in Virginia and the largest one in the US, for instance, needs eight years to bring new generation online, according to RMI, an energy research and advocacy group. “We need to solve the energy equation,” says Josh Parker, head of sustainability at Nvidia. “AI factory flexibility is the bridge between the incredible demand for AI and the immediate limitations of our energy grid.” Speed, though, is only one of the issues. Once facilities do plug in, neighbors often criticize them for drawing too much electricity and contributing to rising prices. They say the data centers generate more noise than they do long-term jobs, contribute to pollution, and threaten to put people out of work. Organizers stalled over $150 billion worth of projects in 2025, according to Data Center Watch, and policymakers alert to the public mood are starting to impose limitations on development. More than a dozen states are considering bans, and local moratoriums are in effect in places like Minneapolis and DeKalb County in Georgia. At the federal level, the GRID Act, a bipartisan bill in the US Senate, proposes to sever new data centers from public grids entirely. Some operators are already moving that way by trying to develop their own power generation. Rather than rushing to build new power plants, companies could find part of the solution to the crunch right under our noses—or, more precisely, in the transmission lines under our feet and above our heads. 
The existing system operates near its full capacity during only a small number of high-demand hours throughout the year. This means, some grid experts argue, that if data centers can limit the power they draw during those stretches, they won’t need to wait for big infrastructure upgrades or build their own off-grid generation.  Indeed, a growing number of studies have shown there could be plenty of power available for data centers that can flex. A widely discussed 2025 report from researchers at Duke University found that the US grid could offer an additional 76 gigawatts—about 5% of its entire capacity, and about enough to accommodate projected data-center growth in the US through 2030—to facilities that are willing to reduce their usage just 0.25% of the time. That’s about 22 hours a year. And when researchers from Princeton University and two grid-modernization companies looked at locations for new data centers in the PJM region, their report, which was funded by Google, found that a 500-megawatt facility capable of flexing for less than 1% of the year could reach full operation three to five years faster than one that’s inflexible.  Flexible power connections could also help data centers address some of their PR problems. By decreasing their draw at times of grid stress, for instance, they could avoid diverting power from where it’s most needed, thus boosting stability. By using existing capacity, they might be able to reduce the need for new fossil-fuel power plants and spread fixed costs over more electricity users, pushing prices down.  The AI power pinch is attracting resources and research into strategies for grid flexibility overall, which could help negotiate a tricky period: Taken together with electric vehicles, air-conditioning, and other sectors, data centers are helping drive what analysts predict will be a 25% increase in US electricity demand by 2030 compared with 2023 levels. 
Ideally, flexibility gives grid operators more control over the flow of electrons, making them leaders of a harmonious ensemble rather than hostages to inflexible electricity requirements. That will help them manage demand spikes across the entire system and deal more effectively with the intermittent nature of renewables like wind and solar. “Demand flexibility is incredibly useful for power grids,” says Johanna Mathieu, a grid expert at the University of Michigan. “It helps reduce electricity



Building an End-to-End Sentiment Analysis Pipeline with Scikit-LLM

Traditional machine learning pipelines for predictive tasks like text classification usually rely on extracting structured, numerical features from raw text — for instance, TF-IDF frequencies or token embeddings — to feed into classical models such as logistic regression, ensembles, or support vector machines.



The Download: the first brain implant power user and South Korea’s AI obsession

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

This man with ALS is the first “power user” of a brain implant that lets him speak

Casey Harrell has had a set of electrodes embedded in his brain for almost three years. Harrell, who has ALS and is paralyzed, first used his brain-computer interface (BCI) to “speak” in 2023. Since then, he’s clocked thousands of hours of use.

Harrell can now use the device largely independently. His team has added new features to it, and he also uses it to surf the web and perform his job. “Living with a disease like ALS, you are supposed to have diminished dreams. I do not,” Harrell told MIT Technology Review.

The team behind the device call Harrell “the first power user of a speech BCI.” They now plan to add further enhancements to the device.

Dive into the groundbreaking impact of Casey Harrell’s BCI.

By Jessica Hamzelou

Why do South Koreans love AI so much?

While a public backlash against AI brews across the US, South Koreans are optimistic. Only 16% say they are more concerned than excited about AI, the lowest of the 25 countries surveyed by the Pew Research Center, while 50% of Americans were more worried than excited.

South Koreans share a deep conviction that embracing technology is integral to modernizing the country and cementing its place in the global order. Their fascination with AI is just the latest incarnation of that ethos, and it’s making them anxious to stay ahead.

Read the full story on South Korea’s AI fervour.

By Michelle Kim

This story is from The Algorithm, our weekly newsletter giving you the inside track on all things AI. Sign up to receive it in your inbox every Monday.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.
1 The US says it restricted Anthropic AI over foreign intelligence risks
Commerce chief Lutnick said he acted over national security fears. (Reuters $)
+ Following the ban, Anthropic disabled access to its new models. (BBC)
+ Both sides are increasingly desperate for a resolution. (WSJ $)

2 DeepSeek just became China’s most valuable startup
It raised $7 billion, the largest-ever first-round funding for an AI startup. (The Information $)
+ The deal values DeepSeek at over $50 billion. (WSJ $)
+ Its unusual structure preserves founder control. (Reuters $)
+ DeepSeek’s new flagship model has caused a stir. (MIT Technology Review)

3 Alibaba has unveiled AI models for robots amid a shift from chatbots
It’s joined a global race to move AI into the physical world. (SCMP)
+ AI is learning to understand its surroundings. (MIT Technology Review)

4 Fox is buying streaming giant Roku for $22 billion
The deal creates the third-largest player in US TV by viewing share. (BBC)
+ Fox is making a big bet on free streaming. (Washington Post $)

5 EA has launched a new way to advertise “directly into gameplay”
EA Advertising allows brands to become part of the game itself. (CNBC)
+ Xbox’s new chief strategy officer is also eyeing in-game ads. (PC Gamer)
+ GenAI could reinvent what it means to play. (MIT Technology Review)

6 It’s trivially easy to use Reddit to manipulate AI search
A tiny snippet of text can trick ChatGPT and Google’s AI search. (404 Media)
+ AI search is being manipulated to generate dangerous biases. (BBC)

7 Sperm have been made magnetic to allow IVF inside the body
The technique enables remote guidance towards an egg. (New Scientist $)
+ Automation and AI are transforming IVF. (MIT Technology Review)

8 The world’s leading deepfake expert no longer trusts his own eyes
He’s struggling to prove what’s real before the internet decides. (NYT $)

9 Meta’s CTO admits its AI reorganisation was “atrocious”
He’s promised staff better communication, and snacks. (Wired $)

10 Silicon Valley billionaires are pretending to kill each other for fun
In a new game show from Peter Thiel’s Founders Fund. (WSJ $)

Quote of the day

“There was a speeding ticket, and they gave Fable the death penalty.”

Alex Stamos, the former chief security officer of Facebook, tells the Washington Post that banning foreign access to Anthropic’s leading model is a disproportionate punishment.

One More Thing

Inside effective altruism, where the far future counts a lot more than the present

Since its birth in the late 2000s, effective altruism has aimed to answer a deceptively simple question: “How can those with means have the greatest impact?” Directing money to evidence-based approaches is EA’s best-known technique. But as it’s expanded from an academic philosophy into a community and a movement, its ideas of the “best” way to change the world have evolved as well.

Find out how effective altruism became one of the most influential (and contested) forces in philanthropy.

By Rebecca Ackermann

We can still have nice things

A place for comfort, fun, and distraction to brighten up your day. (Got any ideas? Drop me a line.)

+ The humble table has been reimagined as an unconventional public artifact.
+ Take a visual tour of the weird, centuries-old history of architecture’s most gruesome gargoyles.
+ A colorful parakeet unseen for an entire century was triumphantly rediscovered in an unexplored Indonesian forest.
+ This shimmering Southern Lights timelapse filmed by an astronaut on the SpaceX Dragon is stunning.



Meet Qwen-RobotSuite: Three Embodied AI Models for VLA Manipulation, Video World Modeling, and Navigation

The Qwen team has released three embodied AI models, grouped as Qwen-Robot-Suite. The three are Qwen-RobotManip, Qwen-RobotWorld, and Qwen-RobotNav. Each is built on a Qwen vision-language backbone and targets a different robotics problem.

Qwen-RobotManip is a Vision-Language-Action model for manipulation, built on Qwen3.5-4B. Qwen-RobotWorld is a language-conditioned video world model with a 60-layer MMDiT and a frozen Qwen2.5-VL encoder. Qwen-RobotNav is a navigation model built on Qwen3-VL, available at 2B, 4B, and 8B sizes.

Qwen-Robot-Suite

Qwen-Robot-Suite is not a single model. It is a suite of three independent foundation models. Two of them, RobotManip and RobotNav, ship with public GitHub repositories.

Robotics data is fragmented across hardware and tasks. Different robots use incompatible observation and action formats. A policy trained on one arm rarely transfers to another.

The three research reports address this fragmentation in different ways. RobotManip aligns action representations so manipulation data scales. RobotWorld uses language as a unified action interface for video prediction. RobotNav exposes a controllable observation interface for navigation tasks.

Here is the core split between the three releases:

| Model | Problem | Backbone | Output |
|---|---|---|---|
| Qwen-RobotManip | Robotic manipulation | Qwen3.5-4B (Qwen-VL) | Continuous robot actions |
| Qwen-RobotWorld | Embodied world modeling | Frozen Qwen2.5-VL | Predicted future video |
| Qwen-RobotNav | Mobile navigation | Qwen3-VL (2B/4B/8B) | Waypoint trajectories |

Qwen-RobotManip: Alignment Unlocks Scale for Manipulation

Qwen-RobotManip is a Vision-Language-Action (VLA) foundation model. It is built on Qwen-VL and predicts continuous robot actions.

A VLA model takes camera views and a language instruction. It then outputs low-level robot actions. The challenge is that manipulation data is heterogeneous by nature. Different robots record states and actions in incompatible formats.
When demonstrations arrive with mismatched representations, scaling data produces interference. RobotManip solves this with a unified alignment framework.

The Unified Alignment Framework

The framework has three complementary mechanisms.

First is a canonical state-action representation. It is an 80-dimensional vector with per-dimension binary masking. This vector holds two 29-dimensional per-arm blocks plus 22 reserved dimensions. Each block stores joint positions, end-effector pose, gripper state, and dexterous hand joints. Robots populate only the dimensions they have.

Second is a camera-frame delta pose parameterization. End-effector actions are expressed as deltas in the camera frame. This makes visually similar motions numerically proximate across embodiments.

Third is an in-context policy adaptation mechanism. It reads recent execution history as an implicit embodiment identifier. The policy adjusts behavior at deployment time without parameter updates.

A dual-stream co-training strategy runs alongside this. It jointly optimizes manipulation data and a vision-language stream. This prevents the backbone’s perception and reasoning from eroding.

The Data Engine

RobotManip assembles roughly 38,100 hours of manipulation data. It uses only open-source datasets and human videos. No proprietary data collection was used.

A human-to-robot synthesis pipeline produces most of this scale. It converts egocentric hand demonstrations into robot trajectories. The pipeline renders across 15 robot platforms. This synthesis alone yields about 24,808 hours of demonstrations. The egocentric source data is about 1,933 hours. Open-source robot datasets contribute over 11,000 hours.

The pipeline separates action alignment from visual alignment. Action alignment retargets hand keypoints to gripper poses. Visual alignment uses SAM3 masking, ProPainter inpainting, and MuJoCo inverse kinematics. A five-stage curation pipeline then filters the combined corpus.
It catches sudden changes, temporal misalignment, and extreme values. One check found 81% of episodes in a subset failed state-action alignment.

Benchmark Results

The research report argues standard benchmarks fail to measure generalization. Models without robot pretraining match pretrained ones on in-distribution tests. RobotManip therefore focuses on out-of-distribution (OOD) settings.

| Benchmark (OOD) | Prev. SOTA (π0.5) | Qwen-RobotManip |
|---|---|---|
| LIBERO-Plus | 84.4 | 91.4 |
| RoboTwin-C2R Hard | 47.9 | 69.4 |
| EBench | 27.1 | 45.6 |
| RoboCasa365 | 16.9 | 35.9 |
| RoboTwin-IF | 49.6 | 72.2 |

The largest reported gap is on cross-embodiment transfer. RobotManip reaches 23.9% using camera-frame EEF actions. That is 3.2× the 7.5% achieved by π0.5.

The model also ranks 1st on the RoboChallenge Table30-v1 generalist track. It scores a 20% relative improvement over the prior best. Real-robot validation covers AgileX ALOHA, Franka, UR, and ARX platforms.

Qwen-RobotWorld: Language as a Universal Action Interface

Qwen-RobotWorld is a language-conditioned video world model. It predicts future visual trajectories from a current observation. Natural language serves as the unified action interface.

A world model learns environment dynamics. Given a current state and an action, it predicts the next state. RobotWorld represents states as video frames and actions as text.

This is important because language is embodiment-agnostic. One instruction encodes the action sequence, goal, and constraints. It works across a Franka gripper, an Aloha dual-arm system, or a humanoid.

The Double-Stream MMDiT Architecture

The model uses a 60-layer double-stream Multimodal Diffusion Transformer. An understanding stream processes a frozen Qwen2.5-VL encoder’s features. A generation stream processes video-VAE latents. The two streams interact via joint attention at every layer.

Using an MLLM as the action encoder gives two advantages. It parses compositional instructions and constrains physically plausible transitions.
The MMDiT has 20B parameters. The VAE adopts the Wan-VAE architecture. The context length supports up to 48,360 video tokens.

A Scene2Robot mechanism reuses this backbone for cross-embodiment synthesis. It processes scene, robot reference, and generation segments together. This enables human-to-robot video transfer without robot-specific prompting.

The Embodied World Knowledge Dataset

Training uses the Embodied World Knowledge (EWK) dataset. It contains roughly 8.6M video-text pairs. That spans over 200M observation frames.

The corpus covers four embodied domains plus general video. Manipulation provides about 5.9M samples across 20+ morphologies. Driving, navigation, and human-to-robot transfer fill out the rest.

An action-language mapping framework standardizes everything. It converts 20+ embodiment types and 500+ action categories into language. A hierarchical five-layer annotation pipeline produces the captions.

Benchmark Results

RobotWorld was evaluated on four established benchmarks. It ranks 1st overall on two of them:

| Benchmark | Result | Ranking |
|---|---|---|
| EWMBench | 4.60 | 1st overall |
| DreamGen Bench | 4.952 | 1st overall |
| WorldModelBench | 8.99 | 1st open-source (3rd overall) |
| PBench | 0.804 | 1st open-source |

On EWMBench it leads motion fidelity with an HSD of 0.566. That is a 33% gain over the runner-up. Scene consistency reaches 0.914.

On WorldModelBench it scores 1.00 on four physics-adherence categories. These are Newton’s laws, mass conservation, fluid dynamics, and gravity. Penetration scores 0.94, and



These new solid-state ACs promise a cool future. Scientists aren’t so sure.

After three years of record-breaking heat, this one is set to be yet another scorcher. Air-conditioning? Not going anywhere. The International Energy Agency projects that the number of AC units will triple by 2050. That’s good for health (one Lancet study estimated that AC prevented nearly 200,000 premature deaths in 2019 alone) but bad for the planet. Artificial chill already accounts for 7% of global electricity use and 3% of greenhouse-gas emissions, and if improperly disposed of, the units can leak refrigerants with more global-warming potential than carbon dioxide.

Feeling the heat, a number of scientists and startups are hoping to amp up solid-state cooling, which is currently used at a small scale for things like mini fridges, EV batteries, and some high-end gaming computers. Traditional ACs transfer heat by using a compressor and a fan to circulate a refrigerant and turn it from liquid to gas. Solid-state systems, on the other hand, move heat through conductive materials like gadolinium and bismuth telluride, which could theoretically cool spaces and surfaces with fewer messy side effects.

The catch is whether they can match the efficiency of conventional AC. “One of the key questions that remain is why are the solid-state coolers not as efficient as typical thermodynamic cycles?” says Pramod Reddy, a professor of mechanical engineering at the University of Michigan who studies heat transfer.

Research and pilot programs are underway to test a range of approaches. Brooklyn-based Mimic Systems uses thermoelectric cooling, which passes a current through semiconductive materials to shift heat from one side to another. Its room-scale climate control system is being piloted in an apartment in Vancouver. The German company Magnotherm is set to test its system, which relies on a magnetocaloric setup that transfers heat by magnetizing and demagnetizing materials, in a chain of supermarkets.
A team in Hong Kong has announced that its elastocaloric device, whose material heats and cools as it expands and contracts, can dip below 0 °C. And the UK’s Barocal is betting on barocaloric systems, which change temperature in response to shifts in pressure.

But experts, especially in thermoelectrics, have doubts about how well any solid-state scheme can compete. For most modern HVAC systems, the coefficient of performance (COP) is 3, explains Jeff Snyder, a professor at Northwestern University who studies electrical and thermal conductivity. That essentially means the system moves three units of heat for every unit of energy that goes into it. Thermoelectrics in particular tend to have a much lower performance at high levels of temperature change, Snyder says, which means they’re best suited for niche uses such as cooling the back of a car seat.

Mimic’s room-scale thermoelectric HVAC unit is being tested in a Vancouver apartment. (Courtesy of Mimic Systems, Inc.)

Efficiency, however, isn’t everything, argues Lindsay Rasmussen, a manager at the Rocky Mountain Institute’s climate tech accelerator Third Derivative, which supports both Magnotherm and Mimic. In the US, most ACs currently in use employ a refrigerant called R410A, which has a global-warming potential more than 2,000 times that of carbon dioxide. Plus, their moving parts can make them less durable, especially compared with a solid-state model that’s less mechanically complex. Still, a dearth of units makes it hard to answer the efficiency question. To understand how well alternatives work, says Rasmussen, researchers need to compare their long-term energy consumption with that of conventional models instead of simply looking at COP. Mimic claims, for example, that its room-scale model should match the draw of a typical AC unit over the course of a year. Elastocaloric and barocaloric systems also have promise, Rasmussen adds, but room-scale prototypes are probably two to three years away.
In the end, the likelihood that solid-state cooling could replace compressor-based AC is slim. But as the planet warms and places like India install tens of millions of new AC units over the next decade, supplanting even a small number could make a dent. “If [solid-state] could take over even a 5% market share,” Rasmussen says, “that is a really large potential impact.”

Sara Kiley Watson is a science journalist specializing in climate and sustainability. She’s based in The Hague.



Meet Flash-KMeans: An IO-Aware, Exact K-Means That Runs Over 200× Faster Than FAISS on GPUs

k-means has been an offline tool for decades. You run it once to preprocess data, then move on. A team of researchers from UC Berkeley and UT Austin released Flash-KMeans, a new open-source library that targets a different setting. Modern AI pipelines now call k-means inside training and inference loops. At that frequency, latency per call matters more than theoretical FLOPs.

Flash-KMeans is an IO-aware implementation of standard Lloyd’s k-means. It does not change the math, and it does not approximate. It only restructures how the algorithm moves data on a GPU. On an NVIDIA H200, the research team reported up to 17.9× end-to-end speedup over the best baseline. Against NVIDIA cuML they report 33×. Against FAISS they report over 200×.

What is Flash-KMeans

Flash-KMeans is a batched k-means library written in Triton GPU kernels. It ships under Apache 2.0 and installs with pip install flash-kmeans. The output is mathematically identical to standard Lloyd’s k-means. The speedup comes from kernel-level dataflow, not from skipping work. That separates it from algorithmic methods like triangle-inequality pruning or coreset sampling.

A standard Lloyd iteration has two stages. The assignment stage computes each point’s distance to every centroid, then picks the nearest. The update stage averages the points in each cluster to form new centroids. Both stages are simple arithmetic. On GPUs, both are bottlenecked by memory, not compute.

The Two Bottlenecks It Attacks

The first bottleneck is the assignment stage. Standard code builds a full distance matrix D of shape N×K in High Bandwidth Memory (HBM). It writes the matrix, then reads it back to run argmin. For N=65536, K=1024, d=128, B=32, the distance math takes 2.6ms. Writing and consuming D takes about 23ms. The matrix is the cost, not the arithmetic.

Flash-KMeans replaces this with FlashAssign. The design borrows from FlashAttention. FlashAssign streams tiles of points and centroids from HBM into on-chip SRAM.
It fuses distance computation with an online argmin. The full N×K matrix is never materialized. This cuts the dominant IO complexity from O(NK) to O(Nd + Kd). At the kernel level, FlashAssign reaches up to 21.2× speedup. In one case it cut assignment from 122.5ms to 5.8ms.

The second bottleneck is the centroid update stage. Standard code uses scatter-style atomic adds. Each thread adds its point into a shared sum buffer keyed by cluster id. Many threads hit the same ‘hot’ cluster at once. That causes atomic contention and hardware serialization. The research team measured only 50 GB/s effective bandwidth here on an H200.

Flash-KMeans replaces this with Sort-Inverse Update. It sorts the 1D assignment vector by cluster id using argsort. Identical cluster ids then form contiguous segments. Each thread block reduces a segment on-chip, then issues one atomic add per segment. The heavy point matrix is never physically permuted. Atomic operations drop from O(Nd) to O((K + N/B_N)d). The kernel reaches up to 6.3× speedup.

Benchmark

The research team tested it on an H200 with CUDA 12.8, FP16 data, and d=128. They swept N, K, and batch size B, and compared against four optimized baselines: fast_pytorch_kmeans, fastkmeans, cuML, and FAISS.

Comparison | Reported speedup | Workload context
End-to-end vs best baseline | up to 17.9× | N=8M, K=1024 (large N, small K)
vs NVIDIA cuML | 33× | industry library
vs FAISS | over 200× | industry library
FlashAssign kernel | up to 21.2× | N=1M, K=8192 (assignment)
Sort-Inverse Update kernel | up to 6.3× | N=33M, K=4096 (update)
Out-of-core, large scale | up to 10.5× | N=400M, K=16384 vs fastkmeans

One failure mode gives these numbers context. Standard PyTorch implementations run out of memory in large-K regimes because they cannot materialize the N×K matrix. FAISS is the industry-standard library under many production vector-search systems.

Flash-KMeans also runs out-of-core. On one billion points (K=32768, d=128), it finishes an iteration in 41.4s, against 261.8s for the baseline.
It uses chunked stream overlap to hide PCIe transfer behind compute. A cache-aware compile heuristic also cuts tuning overhead by up to 175×, within 0.3% of tuned speed.

Speedups: Flash-KMeans paper (arXiv:2603.09229), NVIDIA H200.
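The two dataflow changes described above can be illustrated on the CPU. What follows is a minimal pure-Python sketch under stated assumptions, not the library's Triton kernels and not its API: the function names (`assign_naive`, `assign_streaming`, `update_scatter`, `update_sort_inverse`, `make_data`) and the `tile_k` and `block` parameters are hypothetical, chosen only for illustration. The sketch contrasts materializing the full distance matrix with a running argmin, and per-point accumulation with sorted, per-segment accumulation. The `adds` counter stands in for atomic adds.

```python
import random


def sq_dist(p, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def assign_naive(points, centroids):
    """Build the full N x K distance matrix, then take argmin per row."""
    dist = [[sq_dist(p, c) for c in centroids] for p in points]
    return [min(range(len(row)), key=row.__getitem__) for row in dist]


def assign_streaming(points, centroids, tile_k=4):
    """Visit centroids tile by tile, keeping only a running (best distance,
    best index) per point. No N x K matrix is ever stored."""
    best_d = [float("inf")] * len(points)
    best_i = [-1] * len(points)
    for start in range(0, len(centroids), tile_k):
        tile = centroids[start:start + tile_k]
        for i, p in enumerate(points):
            for j, c in enumerate(tile):
                d = sq_dist(p, c)
                if d < best_d[i]:  # online argmin; strict < keeps the first minimum
                    best_d[i], best_i[i] = d, start + j
    return best_i


def update_scatter(points, assign, k):
    """Scatter-style update: one add into the shared buffers per point."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    adds = 0
    for p, cid in zip(points, assign):
        for t in range(dim):
            sums[cid][t] += p[t]
        counts[cid] += 1
        adds += 1
    return sums, counts, adds


def _flush(sums, counts, cid, seg_sum, seg_n):
    """Merge one locally reduced segment into the shared buffers."""
    for t, v in enumerate(seg_sum):
        sums[cid][t] += v
    counts[cid] += seg_n
    return 1


def update_sort_inverse(points, assign, k, block=64):
    """Sort the 1D assignment vector, reduce each contiguous segment locally,
    then flush once per segment. Points are read through the sorted indices
    and are never physically permuted."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    adds = 0
    order = sorted(range(len(assign)), key=assign.__getitem__)  # argsort
    for start in range(0, len(order), block):
        seg_id, seg_sum, seg_n = None, None, 0
        for idx in order[start:start + block]:
            if assign[idx] != seg_id:
                if seg_id is not None:
                    adds += _flush(sums, counts, seg_id, seg_sum, seg_n)
                seg_id, seg_sum, seg_n = assign[idx], [0.0] * dim, 0
            for t in range(dim):
                seg_sum[t] += points[idx][t]
            seg_n += 1
        if seg_id is not None:
            adds += _flush(sums, counts, seg_id, seg_sum, seg_n)
    return sums, counts, adds


def make_data(n, k, dim, seed=0):
    """Integer-valued toy data, so float sums are exact in any order."""
    rng = random.Random(seed)
    pts = [[float(rng.randint(-9, 9)) for _ in range(dim)] for _ in range(n)]
    cen = [[float(rng.randint(-9, 9)) for _ in range(dim)] for _ in range(k)]
    return pts, cen
```

On the same inputs, `assign_streaming` should return the same labels as `assign_naive`, and `update_sort_inverse` should produce the same sums and counts as `update_scatter` while flushing at most K + ceil(N / block) times instead of once per point. The real kernels do this work in parallel in on-chip memory; this sketch only mirrors the order of operations.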
Project repository: github.com/svg-project/flash-kmeans

Use Cases

Faster exact k-means changes what you can run online, not just offline.

Vector search indexing: FAISS builds its search indices with k-means. Faster k-means lets you re-index as data shifts, instead of rebuilding overnight.

Sparse attention routing: Routing Transformers and Tactic cluster tokens to route attention. Millisecond k-means makes this viable inside the inference loop.

KV-cache compression: ClusterKV clusters tokens in semantic space to compress the cache. Cheaper clustering makes per-layer, per-step compression practical.

Low-bit KV quantization: Recent methods cluster KV entries into codebooks, repeatedly. Faster


The Download: cutting AC emissions, and nature’s drug designer

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

These new solid-state ACs promise a cool future. Scientists aren’t so sure.

After three years of record-breaking heat and another scorcher underway, air-conditioning isn’t going anywhere. That’s good for our health, but bad for the planet: it already accounts for 7% of global electricity use and 3% of greenhouse-gas emissions.

Feeling the heat, scientists and startups are hoping to amp up solid-state cooling. These systems move heat through conductive materials, which could cool spaces and surfaces with fewer messy side effects. The catch is whether they can match the efficiency of traditional AC.

Find out how the unconventional coolers aim to dial down AC emissions.

—Sara Kiley Watson

This story is from the next edition of our magazine, which is all about engineering. Subscribe now to get a copy when it lands!

Job titles of the future: nature’s drug designer

In 2018, after nearly two decades working in Big Pharma, chemist Tim Cernak was ready to put his skills to a new use.

As a lifelong nature lover, he had become concerned that animals are often treated with human pharmaceuticals that can be harmful or even lethal. He decided to address this with a new approach: “conservation chemistry.”

Using AI tools and robots, he’s now rapidly designing and testing drugs for animals. Here’s what it takes to treat nature’s patients.

—Anna Gibbs

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Anthropic has shut down access to its top models after a US directive
The US barred foreigners from using Fable 5 and Mythos 5 on Friday. (NYT $)
+ Anthropic disabled access globally as it can’t filter users in real time. (BBC)
+ Talks with Amazon’s CEO apparently prompted the ban. (WSJ $)
+ Cybersecurity experts have called for the ban to end. (Axios)
+ But the White House’s war against Anthropic has previously backfired. (MIT Technology Review)

2 The UK is banning social media for under-16s
Details are scant, but the measure is due to take effect in early 2027. (The Guardian)
+ The ban covers Snapchat, TikTok, YouTube, Instagram, Facebook, and X. (BBC)
+ Many countries are curbing children’s social media access. (Reuters $)

3 New space data suggests black holes formed before galaxies
It could resolve cosmology’s chicken-and-egg dilemma. (New Scientist $)
+ Odd tricks have formed a massive black hole. (MIT Technology Review)

4 Skepticism around AI layoffs is increasing
There are growing doubts that AI is really the culprit. (TechCrunch)
+ We need a reality check on AI jobs hysteria. (MIT Technology Review)

5 A coalition of states has opened an investigation into OpenAI
Over matters including user data, child safety and advertising. (NYT $)

6 Tesla has been accused of misleading regulators over “full self-driving”
By exaggerating its safety statistics. (Reuters $)

7 NASA’s “quiet supersonic” plane has hit critical new milestones
The X-59 reached 924 mph and 55,000 feet. (Scientific American)
+ Which are essential for flying over populated areas. (Engadget)
+ It’s designed to take the boom out of supersonic travel. (BBC)

8 Deepfakes are getting harder to spot—and weirder—in the midterms
Thanks to improvements in free AI tools. (WSJ $)

9 AI is revealing the secret lives of animals
By tracing their movements, landmarks, and social practices. (Nature)

10 Where did Earth get its oceans? Maybe it made them itself.
Scientists now suspect that Earth’s waters are homegrown. (Quanta)

Quote of the day

“This action has taken the best models away from defenders, created market uncertainty, and risked America’s AI leadership without any real risk to justify it.”

—Cybersecurity leaders urge the Trump administration to reverse restrictions on Anthropic’s most advanced AI models in an open letter.
One More Thing

CHRISTIE HEMM KLOK

How scientists want to make you young again

A little over 15 years ago, scientists at Kyoto University made a remarkable discovery. When they added just four proteins to a skin cell and waited about two weeks, some of the cells underwent an unexpected and astounding transformation: they became young again.

Now, after more than a decade of developing this cellular reprogramming, biotech companies and research labs have tantalising hints that the process could be the gateway to an unprecedented new technology for human age reversal.

Read the full story on their efforts to “reprogram” aging bodies back to youth.

—Antonio Regalado

We can still have nice things

A place for comfort, fun, and distraction to brighten up your day. (Got any ideas? Drop me a line.)

+ Evolutionary biologists may have figured out why the T-Rex had such tiny arms.
+ This beautifully sustainable bento box design is engineered to eliminate single-use takeout waste.
+ Search across 5.8 million museum artworks spanning from 3000 BC to today at The Last Museum.
+ Here’s a sharp cosmic snapshot of Thor’s Helmet, an interstellar gas bubble sitting 15,000 light-years away.


This man with ALS is “the first power user” of a brain implant that lets him speak

Casey Harrell has had a set of electrodes embedded in his brain for almost three years. Harrell, who has amyotrophic lateral sclerosis (ALS) and is paralyzed, first used his brain-computer interface (BCI) to “speak” sentences with the help of a research team in 2023.

Since then, Harrell has clocked thousands of hours of use. He can use the device largely independently, once he’s been “plugged in” with the help of a carer. His team has added new features to it, and Harrell also uses it to surf the web and perform his job.

“Living with a disease like ALS, you are supposed to have diminished dreams. I do not,” Harrell tells MIT Technology Review. “Any one of these things would be an absolute godsend of improvement. To have all of them, and many, many more, is truly revolutionary.”

Within the first 22.6 months after the device was implanted, Harrell had used it for more than 3,800 hours at home without any researchers present, the team reported today in the journal Nature Medicine. “He’s the first power user of a speech BCI,” says team member Sergey Stavisky, a neuroengineer at the University of California, Davis.

Decoding speech

Three years ago, Harrell entrusted David Brandman, an associate professor of neurological surgery at the University of California, Davis, and his colleagues with his brain. Harrell, who was 45 at the time, had already been diagnosed with ALS, a degenerative disease that robs people of the use of their muscles.

Harrell was dependent on others to control his wheelchair and to dress and feed him. He had difficulty speaking; people struggled to understand what he was saying. Then Brandman and his colleagues asked if he’d like to trial a brain implant that might help him communicate.

“The industry was [on the] cusp of a transformation, and I wanted to be part of it,” says Harrell. He signed up.

In July 2023, during a five-hour operation, doctors implanted four arrays of 64 electrodes each into his brain.
Each pair of arrays was wired to a “pedestal” connection point—creating two docking locations on the exterior of his skull to connect the electrodes to a computer.

The team had long been working on developing algorithms to decode brain activity into speech. Their system works by recording activity from the speech motor cortex—a region of the brain responsible for the movements that allow us to speak.

“There are 39 phonemes that make up all the sounds in the [American] English language,” says Nicholas Card, a neuroengineer at UC Davis and member of the team. Mapping neural activity related to producing each of those phonemes can allow the team to create a personalized speech decoder and software that can “speak” those words. “We first go from brain data to phonemes, and then from phonemes to words,” he says.

They started using the device around a month after the surgery. The team got Harrell’s speech decoder working on the first day, says Card. On that day in August, Harrell used the device to speak with a 50-word vocabulary, and 99.6% of the words were as he’d intended. That vocabulary was later expanded to 125,000 words with 97.5% accuracy.

At the time, it was unclear how long the device might last. Brain-computer interfaces are still new—not many people have had them implanted for long periods of time. Scar tissue can form around electrodes in a person’s brain, interfering with their ability to pick up neural activity, for example. But that doesn’t seem to be the case for Harrell.

Power user

In another advance, Harrell is now able to use the device more independently. In 2023, members of the research team would have to visit Harrell at his home and physically connect and disconnect him from the device on the days he wanted to use it.

Not anymore. The team has since automated more of the system—today, Harrell’s care partner can don and doff it for him. “He’ll wake up, get plugged in, and just get going,” says Stavisky.
This is important, says Mariska Vansteesel, a BCI researcher at Utrecht Medical Center who was not involved in the trial. “For these technologies to be relevant for patients, we really need to test them in settings in which they will eventually be used … to demonstrate that it has value, that it’s usable, and that it functions well without the constant involvement of a research team,” she says.

Casey Harrell uses his BCI to speak in “private mode.”

The team has also worked to improve the system itself. It is now 99% accurate, says Stavisky. Harrell can also control a cursor—a game changer that enables him to use his personal computer to send text messages and emails, surf the web, and keep up with his job as an environmental activist.

Over the years, the team has updated the system to accommodate specific requests from Harrell. He is now able to switch on a “privacy mode”—when active, any decoded text will be automatically deleted. He can also opt to use a “profanity filter” while he’s talking to his young daughter.

“We have been able to add on to the software side of the device … improving the accuracy and adding more bells and whistles to enable me to be more independent when using the device,” says Harrell. “We are making the road as we walk it, or roll it, so to speak.”

Nothing short of revolutionary

Vansteesel cautions that while the device is working well for Harrell, there’s no guarantee it will work as well, or as long, for other people with ALS. Over the last decade, she has worked with a woman with ALS who used a fully implanted device to communicate using “brain clicks”—cursor clicks made using brain activity. The woman used her BCI for seven years, but it stopped working toward the end of that period, apparently due to brain degeneration.

At any rate, not everyone with ALS will be willing to undergo invasive brain surgery, says Jane Huggins, who
