YouZum

Committee

AI, Committee, News, Uncategorized

Cross-Domain Data Selection and Augmentation for Automatic Compliance Detection

arXiv:2604.21469v1 Announce Type: new Abstract: Automating the detection of regulatory compliance remains a challenging task due to the complexity and variability of legal texts. Models trained on one regulation often fail to generalise to others. This limitation underscores the need for principled methods to improve cross-domain transfer. We study data selection as a strategy to mitigate negative transfer in compliance detection framed as a natural language inference (NLI) task. Specifically, we evaluate four approaches for selecting augmentation data from a larger source domain: random sampling, Moore-Lewis cross-entropy difference, importance weighting, and embedding-based retrieval. We systematically vary the proportion of selected data to analyse its effect on cross-domain adaptation. Our findings demonstrate that targeted data selection substantially reduces negative transfer, offering a practical path toward scalable and reliable compliance automation across heterogeneous regulations.
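As an illustration of the second selection method, here is a minimal sketch of Moore-Lewis cross-entropy difference scoring using add-one-smoothed unigram language models. The corpora, tokenization, and smoothing below are illustrative stand-ins, not the paper's setup:

```python
import math
from collections import Counter

def avg_neg_logprob(tokens, counts, total, vocab_size):
    # Add-one smoothed unigram cross-entropy (nats per token)
    return -sum(math.log((counts[t] + 1) / (total + vocab_size))
                for t in tokens) / len(tokens)

def moore_lewis_rank(candidates, in_domain, general):
    """Rank candidate sentences by H_in(s) - H_gen(s); the most
    negative scores look most like the in-domain (target) data."""
    in_counts = Counter(t for s in in_domain for t in s.split())
    gen_counts = Counter(t for s in general for t in s.split())
    vocab = len(set(in_counts) | set(gen_counts))
    in_total, gen_total = sum(in_counts.values()), sum(gen_counts.values())
    scored = []
    for s in candidates:
        toks = s.split()
        score = (avg_neg_logprob(toks, in_counts, in_total, vocab)
                 - avg_neg_logprob(toks, gen_counts, gen_total, vocab))
        scored.append((score, s))
    return sorted(scored)  # lowest (most in-domain-like) first

# Toy corpora: the "in-domain" data is regulatory, the source pool is not
in_domain = ["the regulation requires compliance",
             "firms must comply with the regulation"]
general = ["the soup tastes good", "add salt to the soup"]
ranked = moore_lewis_rank(["regulation compliance", "soup salt"],
                          in_domain, general)
```

Selecting the lowest-scoring fraction of the source pool gives the augmentation set; varying that fraction is exactly the proportion sweep the abstract describes.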

Cross-Domain Data Selection and Augmentation for Automatic Compliance Detection Read the article »


Learning Dynamic Representations and Policies from Multimodal Clinical Time-Series with Informative Missingness

arXiv:2604.21235v1 Announce Type: cross Abstract: Multimodal clinical records contain structured measurements and clinical notes recorded over time, offering rich temporal information about the evolution of patient health. Yet these observations are sparse, and whether they are recorded depends on the patient’s latent condition. Observation patterns also differ across modalities, as structured measurements and clinical notes arise under distinct recording processes. While prior work has developed methods that accommodate missingness in clinical time series, how to extract and use the information carried by the observation process itself remains underexplored. We therefore propose a patient representation learning framework for multimodal clinical time series that explicitly leverages informative missingness. The framework combines (1) a multimodal encoder that captures signals from structured and textual data together with their observation patterns, (2) a Bayesian filtering module that updates a latent patient state over time from observed multimodal signals, and (3) downstream modules for offline treatment policy learning and patient outcome prediction based on the learned patient state. We evaluate the framework on ICU sepsis cohorts from MIMIC-III, MIMIC-IV, and eICU. It improves both offline treatment policy learning and adverse outcome prediction, achieving FQE 0.679 versus 0.528 for clinician behavior and AUROC 0.886 for post-72-hour mortality prediction on MIMIC-III.
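The core idea of treating the observation pattern itself as evidence can be sketched with a toy two-state discrete Bayes filter. All states, probabilities, and rates below are invented for illustration; the paper's actual model is multimodal and learned:

```python
import numpy as np

# Hypothetical two-state latent patient condition: 0 = stable, 1 = deteriorating
T = np.array([[0.95, 0.05],    # transition probabilities per time step
              [0.10, 0.90]])
p_obs = np.array([0.2, 0.7])   # P(a lab is ordered | state): ordering is informative
p_abn = np.array([0.1, 0.8])   # P(abnormal result | state), given the lab was ordered

def filter_step(belief, observed, abnormal=None):
    """One predict/update step of a discrete Bayes filter that treats
    the observation indicator itself as evidence (informative missingness)."""
    pred = belief @ T                                   # predict
    if observed:
        like = p_obs * (p_abn if abnormal else (1 - p_abn))
    else:
        like = 1 - p_obs                                # absence of a lab is a signal too
    post = pred * like
    return post / post.sum()                            # normalize

belief = np.array([0.9, 0.1])
# Three steps of frequent, abnormal labs should sharply raise P(deteriorating)
for _ in range(3):
    belief = filter_step(belief, observed=True, abnormal=True)
```

The same update with `observed=False` shifts belief toward the stable state, which is the informative-missingness effect the abstract highlights.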

Learning Dynamic Representations and Policies from Multimodal Clinical Time-Series with Informative Missingness Read the article »


Health-care AI is here. We don’t know if it actually helps patients.

I don’t need to tell you that AI is everywhere. Or that it is being used, increasingly, in hospitals. Doctors are using AI to help them with notetaking. AI-based tools are trawling through patient records, flagging people who may require certain support or treatments. They are also used to interpret medical exam results and X-rays. A growing number of studies suggest that many of these tools can deliver accurate results. But there’s a bigger question here: Does using them actually translate into better health outcomes for patients? We don’t yet have a good answer. That’s what Jenna Wiens, a computer scientist at the University of Michigan, and Anna Goldenberg of the University of Toronto, argue in a paper published in the journal Nature Medicine this week. Wiens tells me she has spent years investigating how AI might benefit health care. For the first decade of her career she tried to pitch the technology to clinicians. Over the last few years, she says, it’s as though “a switch flipped.” Health-care providers not only appear much more interested in the promise of these technologies, they have also begun rapidly deploying them. The problem is that many providers aren’t rigorously assessing how well they actually work. Take “ambient AI” tools, for example. Also known as AI scribes, they “listen” to conversations between doctors and patients, then transcribe and summarize them. Multiple tools are available, and they are already being widely adopted by health-care providers. A few months ago, a staffer at a major New York medical center who develops AI tools for doctors told me that, anecdotally, medics are “overjoyed” by the technology—it allows them to focus all their attention on their patients during appointments, and it saves them from a lot of time-consuming paperwork. Early studies support these anecdotes and suggest that the tools can reduce clinician burnout. That’s all well and good. But what about patient health outcomes? 
“[Researchers] have evaluated provider or clinician and patient satisfaction, but not really how these tools are affecting clinical decision-making,” says Wiens. “We just don’t know.” The same holds true for other AI-based technologies used in health-care settings. Some are used to predict patients’ health trajectories, others to recommend treatments. They are designed to make health care more effective and efficient. But even a tool that is “accurate” won’t necessarily improve health outcomes. AI might speed up the interpretation of a chest X-ray, for example. But how much will a doctor rely on its analysis? How will that tool affect the way a doctor interacts with patients or recommends treatment? And ultimately: What will this mean for those patients? The answers to those questions might vary between hospitals or departments and could depend on clinical workflows, says Wiens. They might also differ between doctors at various stages of their careers. Take the AI scribes, as another example. Some research on AI use in education suggests that such tools can impact the way people cognitively process information. Could they affect the way a doctor processes a patient’s information? Will the tools affect the way medical students think about patient data in a way that impacts care? These questions need to be explored, says Wiens. “We like things that save us time, but we have to think about the unintended consequences of this,” she says. In a study published in January 2025, Paige Nong at the University of Minnesota and her colleagues found that around 65% of US hospitals used AI-assisted predictive tools. Only two-thirds of those hospitals evaluated their accuracy. Even fewer assessed them for bias. The number of hospitals using these tools has probably increased since then, says Wiens. Those hospitals, or entities other than the companies developing the tools, need to evaluate how much they help in specific settings. 
There’s a possibility that they could leave patients worse off, although it’s more likely that AI tools just aren’t as beneficial as health-care providers might assume they are, says Wiens. “I do believe in the potential of AI to really improve clinical care,” says Wiens, who stresses that she doesn’t want to stop the adoption of AI tools in health care. She just wants more information about how they are affecting people. “I have to believe that in the future it’s not all AI or no AI,” she says. “It’s somewhere in between.” This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here. 

Health-care AI is here. We don’t know if it actually helps patients. Read the article »


Google DeepMind Introduces Decoupled DiLoCo: An Asynchronous Training Architecture Achieving 88% Goodput Under High Hardware Failure Rates

Training frontier AI models is, at its core, a coordination problem. Thousands of chips must communicate with each other continuously, synchronizing every gradient update across the network. When one chip fails or even slows down, the entire training run can stall. As models scale toward hundreds of billions of parameters, that fragility becomes increasingly untenable. Google DeepMind is now proposing a different model entirely. Google DeepMind researchers introduced Decoupled DiLoCo (Distributed Low-Communication), a distributed training architecture that decouples compute into asynchronous, fault-isolated ‘islands,’ enabling large language model pre-training across geographically distant data centers without requiring the tight synchronization that makes conventional approaches brittle at scale.

The Problem with Traditional Distributed Training

To understand why Decoupled DiLoCo is important, it helps to understand how distributed training typically works. Standard Data-Parallel training replicates a model across many accelerators (GPUs or TPUs), each processing a different mini-batch of data. After each forward and backward pass, gradients must be averaged across every device — a process called AllReduce — before the next training step can begin. This blocking synchronization step means every device must wait for the slowest one. Across thousands of chips spanning multiple data centers, that bottleneck is not just inconvenient; it makes global-scale training effectively impractical. Bandwidth is another hard constraint. Conventional Data-Parallel training requires approximately 198 Gbps of inter-datacenter bandwidth across eight data centers — far beyond what standard wide-area networking (WAN) can support between geographically distributed facilities.

How Decoupled DiLoCo Works

Decoupled DiLoCo builds on two prior systems from Google.
The first is Pathways, which introduced a distributed AI system based on asynchronous data flow, allowing different compute resources to work at their own pace without blocking on one another. The second is DiLoCo, which dramatically reduced the inter-datacenter bandwidth required for distributed training by having each worker perform many local gradient steps before communicating with peers, sharply cutting how much data needs to flow between data centers. Decoupled DiLoCo brings both ideas together. Built on top of Pathways, the system divides training across separate clusters of accelerators called learner units — the ‘islands’ of compute. Each learner unit trains semi-independently, performing many local steps, before sharing a compressed gradient signal with an outer optimizer that aggregates updates across all learner units. Because this outer synchronization step is asynchronous, a chip failure or slow learner unit in one island does not block the others from continuing to train. The bandwidth savings are dramatic. Decoupled DiLoCo reduces required inter-datacenter bandwidth from 198 Gbps to just 0.84 Gbps across eight data centers — multiple orders of magnitude lower — making it compatible with standard internet-scale connectivity between datacenter facilities rather than requiring custom high-speed network infrastructure.

Self-Healing Through Chaos Engineering

One of the most technically significant properties of Decoupled DiLoCo is its fault tolerance. The research team used chaos engineering, a method that deliberately introduces artificial hardware failures into a running system to test its robustness during training runs. The system continued training after the loss of entire learner units, and then seamlessly reintegrated those units when they came back online. This behavior is what the research team describes as ‘self-healing’.
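The inner/outer structure described above can be sketched on a toy least-squares problem. All hyperparameters are illustrative, and the outer step below is synchronous for simplicity (the real system runs it asynchronously across islands):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares objective; each "island" (learner unit) owns one data shard
def local_grad(theta, X, y):
    return X.T @ (X @ theta - y) / len(y)

def diloco_round(theta, shards, inner_steps=20, inner_lr=0.05, outer_lr=0.7):
    """One DiLoCo-style outer round: each island takes many cheap local
    steps, and only a single parameter delta per island is communicated,
    not a gradient per step."""
    deltas = []
    for X, y in shards:
        local = theta.copy()
        for _ in range(inner_steps):
            local -= inner_lr * local_grad(local, X, y)
        deltas.append(theta - local)  # "pseudo-gradient" sent to the outer optimizer
    # Outer optimizer: plain SGD on the averaged pseudo-gradient
    return theta - outer_lr * np.mean(deltas, axis=0)

# Four islands holding slightly noisy shards of the same regression problem
theta_star = np.array([1.0, -2.0])
shards = []
for _ in range(4):
    X = rng.normal(size=(50, 2))
    shards.append((X, X @ theta_star + 0.01 * rng.normal(size=50)))

theta = np.zeros(2)
for _ in range(8):  # 8 outer rounds = only 8 communication events per island
    theta = diloco_round(theta, shards)
```

With 20 inner steps per round, the islands exchange one vector per round instead of one gradient per step, which is the source of the bandwidth savings the article reports.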
In simulations involving 1.2 million chips under high failure rates, Decoupled DiLoCo maintained a goodput (the fraction of time the system is performing useful training) of 88%, compared to just 27% for standard Data-Parallel methods. Goodput is the practical metric that matters here: a training run with high nominal compute but low goodput wastes significant resources. Critically, these resilience gains come with minimal degradation in model quality. In real-world experiments using Gemma 4 models, Decoupled DiLoCo achieved an average ML benchmark accuracy of 64.1%, compared to 64.4% for the conventional baseline — a difference well within the noise of typical evaluation variance.

Training a 12B Model Across Four U.S. Regions

The research team validated Decoupled DiLoCo at production scale by successfully training a 12 billion parameter model across four separate U.S. regions using just 2–5 Gbps of wide-area networking, a bandwidth level achievable with existing commercial internet infrastructure between data center facilities. The system accomplished this more than 20 times faster than conventional synchronization methods. The key reason: rather than forcing compute to pause and wait for communication to complete, Decoupled DiLoCo incorporates required communication into longer periods of computation, eliminating the “blocking” bottlenecks that make conventional distributed training slow at global scale.

Mixing Hardware Generations

An underappreciated implication of the architecture is its support for heterogeneous hardware. Because learner units operate asynchronously, they do not need to run on identical hardware at the same clock speed. The research team demonstrated training runs that mixed TPU v6e and TPU v5p chips — different hardware generations with different performance characteristics — in a single training job, without degrading ML performance relative to homogeneous runs.
This has two practical consequences worth noting. First, it extends the useful life of existing hardware, allowing older accelerators to continue contributing meaningfully to large-scale training. Second, because new hardware generations do not arrive everywhere at once, being able to train across generations can alleviate the recurring logistical and capacity bottlenecks that arise during hardware transition periods — a real operational challenge at organizations running large training infrastructure.

Key Takeaways

Decoupled DiLoCo eliminates the single-point-of-failure problem in large-scale AI training by dividing training across asynchronous, fault-isolated “islands” of compute called learner units — so a chip or cluster failure in one island does not stall the rest of the training run.

The architecture reduces inter-datacenter bandwidth requirements by orders of magnitude — from 198 Gbps down to 0.84 Gbps across eight data centers — making globally distributed pre-training feasible over standard wide-area networking rather than requiring custom high-speed infrastructure.

Decoupled DiLoCo is self-healing: using chaos engineering to simulate real hardware failures, the system maintained 88% goodput compared to just 27% for standard Data-Parallel training under high failure rates, and seamlessly reintegrated offline learner units when they came back online.

The approach was validated at production scale, successfully training a 12 billion parameter model across four separate U.S. regions more than 20 times faster than conventional synchronization methods.

Google DeepMind Introduces Decoupled DiLoCo: An Asynchronous Training Architecture Achieving 88% Goodput Under High Hardware Failure Rates Read the article »


The Download: supercharged scams and studying AI healthcare

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

We’re in a new era of AI-driven scams

When ChatGPT was released in late 2022, it showed how easily generative AI could create human-like text. This quickly caught the eye of cybercriminals, who began using LLMs to compose malicious emails. Since then, they’ve adopted AI for everything from turbocharged phishing and hyperrealistic deepfakes to automated vulnerability scans. Many organizations are now struggling to cope with the sheer volume of cyberattacks. AI is making them faster, cheaper, and easier to carry out, a problem set to worsen as more cybercriminals adopt these tools—and their capabilities improve. Read the full story on how AI is reshaping cybercrime. —Rhiannon Williams

“Supercharged scams” is one of the 10 Things That Matter in AI Right Now, our essential guide to what’s really worth your attention in the field. Subscribers can watch an exclusive roundtable unveiling the technologies and trends on the list, with analysis from MIT Technology Review’s AI reporter Grace Huckins and executive editors Amy Nordrum and Niall Firth.

Healthcare AI is here. We don’t know if it actually helps patients.

Doctors are using AI to help them with notetaking. AI-based tools are trawling through patient records, flagging people who may require certain support or treatments. They are also used to interpret medical exam results and X-rays. A growing number of studies suggest that many of these tools can deliver accurate results. But there’s a bigger question here: Does using them actually translate into better health outcomes for patients? We don’t yet have a good answer—here’s why. —Jessica Hamzelou

The story is from The Checkup, our weekly newsletter that gives you the latest from the worlds of health and biotech. Sign up to receive it in your inbox every Thursday.
The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 DeepSeek has unveiled its long-awaited new AI model
The Chinese company has just launched preview versions of DeepSeek-V4. (CNN)
+ It says V4 is the most powerful open-source platform. (Bloomberg $)
+ And rivals top closed-source models from OpenAI and DeepMind. (SCMP)
+ The model is adapted for Huawei chip technology. (Reuters $)

2 More countries are curbing children’s social media access
Norway is set to enforce the latest ban. (Reuters $)
+ The Philippines could follow soon. (Bloomberg $)
+ Americans are pushing to get AI out of schools. (The New Yorker)

3 The US has accused China of mass AI theft as tensions rise
A White House memo claims Chinese firms are exploiting American models. (BBC)
+ Beijing calls the accusations “slander.” (Ars Technica)

4 OpenAI set itself apart from Anthropic by widely releasing its new model
It’s releasing GPT-5.5 to all ChatGPT users, despite cybersecurity concerns. (NYT $)
+ OpenAI says the new model is better at coding and more efficient. (The Verge)

5 Meta is cutting 10% of jobs to offset AI spending
Roughly 8,000 layoffs are set to be announced on May 20. (QZ)
+ Anti-AI protests are growing. (MIT Technology Review)

6 Palantir is facing a backlash from employees
Thanks to its work with ICE and the Trump administration. (Wired $)
+ Surveillance tech is reshaping the fight for privacy. (MIT Technology Review)

7 The era of free access to advanced AI is coming to an end
AI labs are under mounting pressure to start turning profits. (The Verge)

8 Elon Musk’s feud with Sam Altman is heading to court
The case has already revealed several unflattering secrets. (WP $)

9 A new movement is encouraging people to ditch their smartphones for a month
“Month Offline” is like a Dry January for smartphones. (The Atlantic)

10 Spotify has revealed its most-streamed music of the last 20 years
Featuring Taylor Swift, Bad Bunny, and The Weeknd. (Gizmodo)

Quote of the day

“We want a childhood where children get to be children. Play, friendships, and everyday life must not be taken over by algorithms and screens.”

—Norwegian Prime Minister Jonas Gahr Støre announces age restrictions for social media.

One More Thing

The search for extraterrestrial life is targeting Jupiter’s icy moon Europa
As astronomers have discovered more about Europa over the past few decades, Jupiter’s fourth-largest moon has excited planetary scientists interested in the geophysics of alien worlds. All that water and energy—and hints of elements essential for building organic molecules—point to an extraordinary possibility. In the depths of its ocean, or perhaps crowded in subsurface lakes or below icy surface vents, Jupiter’s big, bright moon could host life. To find further evidence, NASA is now searching for signs of alien existence on Europa. Read the full story on the mission. —Stephen Ornes

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line.)
+ Here’s a fun look at the secret collaborations of pop history.
+ Meet the mannequins showing how the “ideal” body has evolved.
+ A photographer has cataloged all 12,795 objects in her home into an archive of a life.
+ Slime molds are unexpectedly beautiful when viewed through these high-detail macro shots.

The Download: supercharged scams and studying AI healthcare Read the article »


Meta-Tool: Efficient Few-Shot Tool Adaptation for Small Language Models

arXiv:2604.20148v1 Announce Type: new Abstract: Can small language models achieve strong tool-use performance without complex adaptation mechanisms? This paper investigates this question through Meta-Tool, a controlled empirical study comparing hypernetwork-based LoRA adaptation against carefully designed few-shot prompting. Using a Llama-3.2-3B-Instruct backbone, we evaluate four adaptation mechanisms (few-shot prompting, documentation encoding, hypernetwork-generated LoRA weights, and value-guided beam search) across four diverse benchmarks: Gorilla APIBench, Spider 2.0, WebArena, and InterCode. Our central finding is a well-supported negative result: despite generating non-trivial weight matrices, the 227.8M-parameter hypernetwork provides no measurable improvement over few-shot prompting alone. Comprehensive ablation studies reveal that few-shot examples contribute +21.5% to performance and documentation contributes +5.0%, while the hypernetwork adds 0%. A 3B model with well-designed prompts achieves 79.7% of GPT-5’s average performance at 10× lower latency. Error analysis across 722 failure cases spanning all shot counts (0–5) shows that at the 5-shot configuration (106 failures), failure modes are task-dependent: schema-heavy tasks (Spider 2.0, WebArena) show near-zero format errors with remaining failures semantic, while format errors dominate on Gorilla (100%) and InterCode (70%). These findings redirect practitioners toward prompt engineering and example curation rather than complex adaptation architectures.
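For context, the winning baseline amounts to assembling tool documentation plus a handful of worked examples into the prompt. A minimal sketch of that assembly (the format, field names, and example tool are hypothetical, not the paper's):

```python
def build_tool_prompt(doc, examples, query, shots=5):
    """Assemble a documentation + few-shot tool-use prompt of the kind
    the study finds most effective for small models."""
    parts = ["# Tool documentation", doc.strip(), "", "# Examples"]
    for ex in examples[:shots]:           # cap at the requested shot count
        parts += [f"User: {ex['query']}", f"Call: {ex['call']}", ""]
    parts += [f"User: {query}", "Call:"]  # model completes the final call
    return "\n".join(parts)

prompt = build_tool_prompt(
    "translate(text, lang) -> str  # returns text translated into lang",
    [{"query": "say hi in French", "call": "translate('hi', 'fr')"}],
    "say bye in German",
)
```

The abstract's ablations suggest that curating the examples and documentation fed into a template like this matters far more than adding adaptation machinery around the model.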

Meta-Tool: Efficient Few-Shot Tool Adaptation for Small Language Models Read the article »


Google Cloud AI Research Introduces ReasoningBank: A Memory Framework that Distills Reasoning Strategies from Agent Successes and Failures

Most AI agents today have a fundamental amnesia problem. Deploy one to browse the web, resolve GitHub issues, or navigate a shopping platform, and it approaches every single task as if it has never seen anything like it before. No matter how many times it has stumbled on the same type of problem, it repeats the same mistakes. Valuable lessons evaporate the moment a task ends. A team of researchers from Google Cloud AI, the University of Illinois Urbana-Champaign, and Yale University introduces ReasoningBank, a memory framework that doesn’t just record what an agent did — it distills why something worked or failed into reusable, generalizable reasoning strategies. (Paper: https://arxiv.org/pdf/2509.25140)

The Problem with Existing Agent Memory

To understand why ReasoningBank is important, you need to understand what existing agent memory actually does. Two popular approaches are trajectory memory (used in a system called Synapse) and workflow memory (used in Agent Workflow Memory, or AWM). Trajectory memory stores raw action logs — every click, scroll, and typed query an agent executed. Workflow memory goes a step further and extracts reusable step-by-step procedures from successful runs only. Both have critical blind spots. Raw trajectories are noisy and too long to be directly useful for new tasks. Workflow memory only mines successful attempts, which means the rich learning signal buried in every failure — and agents fail a lot — gets completely discarded.

How ReasoningBank Works

ReasoningBank operates as a closed-loop memory process with three stages that run around every completed task: memory retrieval, memory extraction, and memory consolidation. Before an agent starts a new task, it queries ReasoningBank using embedding-based similarity search to retrieve the top-k most relevant memory items. Those items get injected directly into the agent’s system prompt as additional context.
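The retrieval step can be sketched as cosine-similarity search over a small store of memory items. The bag-of-words embedder and example memories below are stand-ins; the real system uses learned embeddings and LLM-distilled items:

```python
import numpy as np

def embed(text, vocab):
    # Stand-in bag-of-words embedder, L2-normalized so that a dot
    # product equals cosine similarity; a real system uses a learned model
    v = np.array([text.lower().split().count(w) for w in vocab], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

def retrieve(query, memory, vocab, k=1):
    """Return the top-k memory items by cosine similarity to the query."""
    q = embed(query, vocab)
    ranked = sorted(memory, key=lambda item: -float(q @ item["emb"]))
    return ranked[:k]

# Tiny illustrative store of distilled strategy memories
texts = [
    ("Confirm cart contents", "confirm cart contents before checkout on shopping sites"),
    ("Use the merge request tab", "open the merge requests tab to find gitlab reviews"),
]
vocab = sorted({w for _, c in texts for w in c.split()})
memory = [{"title": t, "content": c, "emb": embed(c, vocab)} for t, c in texts]

best = retrieve("how do I checkout my shopping cart", memory, vocab, k=1)
```

The retrieved item's title and content would then be prepended to the agent's system prompt, mirroring the injection step described above.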
Importantly, the default is k=1, a single retrieved memory item per task. Ablation experiments show that retrieving more memories actually hurts performance: success rate drops from 49.7% at k=1 to 44.4% at k=4. The quality and relevance of retrieved memory matter far more than quantity. Once the task is finished, a Memory Extractor — powered by the same backbone LLM as the agent — analyzes the trajectory and distills it into structured memory items. Each item has three components: a title (a concise strategy name), a description (a one-sentence summary), and content (1–3 sentences of distilled reasoning steps or operational insights). Crucially, the extractor treats successful and failed trajectories differently: successes contribute validated strategies, while failures supply counterfactual pitfalls and preventative lessons. To decide whether a trajectory was successful or not — without access to ground-truth labels at test time — the system uses an LLM-as-a-Judge, which outputs a binary “Success” or “Failure” verdict given the user query, the trajectory, and the final page state. The judge doesn’t need to be perfect; ablation experiments show ReasoningBank remains robust even when judge accuracy drops to around 70%. New memory items are then appended directly to the ReasoningBank store, maintained as JSON with pre-computed embeddings for fast cosine similarity search, completing the loop.

MaTTS: Pairing Memory with Test-Time Scaling

The research team goes further and introduces memory-aware test-time scaling (MaTTS), which links ReasoningBank with test-time compute scaling — a technique that has already proven powerful in math reasoning and coding tasks. The insight is simple but important: scaling at test time generates multiple trajectories for the same task. Instead of just picking the best answer and discarding the rest, MaTTS uses the full set of trajectories as rich contrastive signals for memory extraction. MaTTS comes in two variants.
Parallel scaling generates k independent trajectories for the same query, then uses self-contrast — comparing what went right and wrong across all trajectories — to extract higher-quality, more reliable memory items. Sequential scaling iteratively refines a single trajectory using self-refinement, capturing intermediate corrections and insights as memory signals. The result is a positive feedback loop: better memory guides the agent toward more promising rollouts, and richer rollouts forge even stronger memory. The paper notes that at k=5, parallel scaling (55.1% SR) edges out sequential scaling (54.5% SR) on WebArena-Shopping — sequential gains saturate quickly once the model reaches a decisive success or failure, while parallel scaling keeps providing diverse rollouts that the agent can contrast and learn from.

Results Across Three Benchmarks

Tested on WebArena (a web navigation benchmark spanning shopping, admin, GitLab, and Reddit tasks), Mind2Web (which tests generalization across cross-task, cross-website, and cross-domain settings), and SWE-Bench-Verified (a repository-level software engineering benchmark with 500 verified instances), ReasoningBank consistently outperforms all baselines across all three datasets and all tested backbone models. On WebArena with Gemini-2.5-Flash, ReasoningBank improved overall success rate by +8.3 percentage points over the memory-free baseline (40.5% → 48.8%), while reducing average interaction steps by up to 1.4 compared to no-memory and up to 1.6 compared to other memory baselines. The efficiency gains are sharpest on successful trajectories — on the Shopping subset, for example, ReasoningBank cut 2.1 steps from successful task completions (a 26.9% relative reduction). The agent reaches solutions faster because it knows the right path, not simply because it gives up on failed attempts sooner.
On Mind2Web, ReasoningBank delivers consistent gains across cross-task, cross-website, and cross-domain evaluation splits, with the most pronounced improvements in the cross-domain setting — where the highest degree of strategy transfer is required and where competing methods like AWM actually degrade relative to the no-memory baseline. On SWE-Bench-Verified, results vary meaningfully by backbone model. With Gemini-2.5-Pro, ReasoningBank achieves a 57.4% resolve rate versus 54.0% for the no-memory baseline, saving 1.3 steps per task. With Gemini-2.5-Flash, the step savings are more dramatic — 2.8 fewer steps per task (30.3 → 27.5) alongside a resolve rate improvement from 34.2% to 38.8%. Adding MaTTS (parallel scaling, k=5) pushes results further. ReasoningBank with MaTTS reaches 56.3% overall SR on WebArena with Gemini-2.5-Pro — compared to 46.7% for the no-memory baseline — while also reducing average steps from 8.8 to 7.1 per task.

Emergent Strategy Evolution

One of the most striking findings is that ReasoningBank’s memory doesn’t

Google Cloud AI Research Introduces ReasoningBank: A Memory Framework that Distills Reasoning Strategies from Agent Successes and Failures Read the article »


Will fusion power get cheap? Don’t count on it.

Fusion power could provide a steady, zero-emissions source of electricity in the future—if companies can get plants built and running. But a new study suggests that even if that future arrives, it might not come cheap. Technologies tend to get less expensive over time. Lithium-ion batteries are now about 90% cheaper than they were in 2013. But historically, different technologies tend to go through this curve at different rates. And the cost of fusion might not sink as quickly as the prices of batteries or solar. It’s tricky to make any predictions about the cost of a technology that doesn’t exist yet. But when there’s billions of dollars of public and private funding on the line, it’s worth considering what assumptions we’re making about our future energy mix and its cost. One crucial measure is a metric called experience rate—the percentage by which an energy technology’s cost declines every time capacity doubles. A higher figure means a quicker price drop and better economic gains with scaling. Historically, the experience rate is 12% for onshore wind power, 20% for lithium-ion batteries, and 23% for solar modules. Other energy technologies haven’t gotten cheap quite as quickly—fission is at just 2%. In the new study, published in Nature Energy, researchers aimed to improve predictions of fusion’s future price by estimating the technology’s experience rate. The team looked at three key characteristics that can correlate with experience rate: unit size, design complexity, and the need for customization. The larger and more complex a technology is, and/or the more it needs to be customized for different use cases, the lower the experience rate. The researchers interviewed fusion experts, including public-sector researchers and those working at companies in the private sector. They had the experts evaluate fusion power plants on those characteristics and used that info to predict the experience rate. 
(One note here: The study focused only on magnetic confinement and laser inertial confinement, two of the leading fusion approaches, which together receive the vast majority of funding today. Other approaches could come with different cost benefits.) Fusion plants will likely be relatively large, similar to other types of facilities (like coal and fission power plants) that rely on generating heat. They will probably need less customization than fission plants—largely because regulations and safety considerations should be simpler—but more than technologies like solar panels. And as for complexity, “there was almost unanimous agreement that fusion is incredibly complex,” says Lingxi Tang, a PhD candidate in the energy and technology policy group at ETH Zurich in Switzerland and one of the authors of the study. (Some experts said it was literally off the scale the researchers gave them.) The final figure the researchers suggest for fusion’s experience rate is between 2% and 8%, meaning it will see a faster price reduction than nuclear power but not as dramatic an improvement as many common energy technologies being deployed today. That means that it would take a lot of deployment—and likely quite a long time—for the price of building a fusion reactor to drop significantly, so electricity produced by fusion plants could be expensive for a while. And it’s a much slower rate than the 8% to 20% that many modeling studies assume today. “On the whole, I think questions should be raised about current investment levels in fusion,” Tang says. (The US allocated over $1 billion to fusion in the 2024 fiscal year, and private-sector funding totaled $2.2 billion between July 2024 and July 2025.) 
"If you're talking about decarbonization of the energy system, is this really the best use of public money?"

But some experts say that looking to the past to understand the future of energy prices might be misleading. "It's a good exercise, but we have to be humble about how much we don't know," says Egemen Kolemen, a professor at the Princeton Plasma Physics Laboratory.

In 2000, many analysts predicted that solar power would remain expensive—but then production exploded and prices came crashing down, largely because China went all in, he says. "People weren't exactly wrong then," he adds. "They were just extrapolating what they saw into the future."

How fast prices drop depends on regulations, geopolitical dynamics, and labor cost, he says: "We haven't built the thing yet, so we don't know."

This article is from The Spark, MIT Technology Review's weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.
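The experience-rate arithmetic described above is easy to sketch. The short Python snippet below is illustrative only—the function name and the ten-doublings horizon are assumptions for the example, not figures from the study—but it shows why the gap between solar's 23% rate and fusion's projected 2%–8% range matters so much.

```python
def projected_cost(initial_cost, experience_rate, doublings):
    """Unit cost after a number of capacity doublings, given an
    experience rate (the fractional cost decline per doubling)."""
    return initial_cost * (1 - experience_rate) ** doublings

# Compare solar's historical ~23% rate with the study's 2%-8% range
# for fusion, over ten doublings of installed capacity:
for rate in (0.23, 0.08, 0.02):
    remaining = projected_cost(1.0, rate, 10)  # fraction of initial cost
    print(f"{rate:.0%} experience rate -> {remaining:.2f}x initial cost")
```

At a 23% rate, ten doublings cut costs to roughly 7% of the starting price; at 2%, more than 80% of the original cost remains, which is why the study's low range implies expensive fusion electricity for a long time.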



The Download: introducing the Nature issue

This is today's edition of The Download, our weekday newsletter that provides a daily dose of what's going on in the world of technology.

Introducing: the Nature issue

When we talk about "nature," we usually mean something untouched by humans. But little of that world exists today. From microplastics in rainforest wildlife to artificial light in the Arctic Ocean, human influence now reaches every corner of Earth. In this context, what even is nature? And should we employ technology to try to make the world more "natural"?

In our new Nature issue, MIT Technology Review grapples with these questions. We investigate birds that can't sing, wolves that aren't wolves, and grass that isn't grass. We look for the meaning of life under Arctic ice, within ourselves, and in the far future on a distant world, courtesy of new fiction by the renowned author Jeff VanderMeer. Together, these stories examine how technology has altered our planet—and how it might be used to repair it.

Subscribe now to read the full print issue.

What's next for large language models?

After ChatGPT launched in late 2022, the OpenAI chatbot became an everyday everything app for hundreds of millions of people. It led to LLMs being heralded as the new future. The entire tech industry was consumed by the inferno, with companies racing to spin up rival products. But what's the next big thing after LLMs? More LLMs—but better. Let's call them LLMs+. Find out how they're set to become cheaper, more efficient, and more powerful.

—Will Douglas Heaven

LLMs+ is on our list of the 10 Things That Matter in AI Right Now, MIT Technology Review's guide to what's really worth your attention in the busy, buzzy world of AI. We'll be unpacking one item from the list each day here in The Download, so stay tuned.

Will fusion power get cheap? Don't count on it.

Fusion power could provide a steady, zero-emissions source of electricity in the future—if companies can get plants built and running.
But a new study published in Nature Energy suggests that even if that future arrives, it might not come cheap. The research team aimed to improve predictions of fusion's future price by estimating the technology's experience rate—the percentage by which its cost declines every time capacity doubles. Their findings offer new clues on the technology's path to deployment. Read the full story.

—Casey Crownhart

This story is from The Spark, our weekly climate newsletter. Sign up to receive it in your inbox every Wednesday.

The must-reads

I've combed the internet to find you today's most fun/important/scary/fascinating stories about technology.

1 Trump signaled he's open to reversing the Anthropic ban
What that really means in practice remains to be seen. (Reuters $)
+ Anthropic says there's no "kill switch" for its AI. (Axios)
+ "Humans in the loop" in AI warfare is an illusion. (MIT Technology Review)

2 SpaceX plans to manufacture its own GPUs
To support the company's growing AI ambitions. (Reuters $)
+ Musk is shifting SpaceX's focus from Mars to AI ahead of its IPO. (NYT $)
+ SpaceX and Tesla may be on a collision course. (FT $)

3 Chinese tech giant Tencent has unveiled its first flagship AI model
A former OpenAI researcher is at the helm. (SCMP)
+ Chinese open models are spreading fast. (MIT Technology Review)

4 High earners are racing ahead on AI, deepening workplace divides
The division in adoption risks widening inequality. (FT $)
+ Startups are bragging they spend more on AI than staff. (404 Media)

5 Thousands of Samsung workers are demanding a new share of AI profits
Chip-division employees want 15% of the operating profit. (Bloomberg $)
+ Here's why opinion on AI is so divided. (MIT Technology Review)

6 AI is helping mediocre Korean hackers steal millions
They're vibe coding their malware. (Wired $)
+ AI is making online crimes easier. (MIT Technology Review)

7 Kalshi suspended three political candidates for betting on their own races
Including a Democrat and a Republican running for Congress. (CNN)
+ And an independent candidate who said he did it to make a point. (Gizmodo)
+ Lawmakers argue that prediction markets are a loophole for gambling. (NPR)

8 A ping-pong robot is beating elite human players for the first time
The Sony AI system was trained with reinforcement learning. (New Scientist)
+ Just days earlier, a humanoid smashed the human half-marathon record. (AP)

9 Crypto scammers are luring ships into the Strait of Hormuz
By falsely promising safe passage. (Ars Technica)

10 'Age tech' could help us grow old comfortably at home
Apps, wearables, and remote monitoring could fill caregiving gaps. (NYT $)

Quote of the day

"It's a hallucinogenic business plan."

—Ross Gerber, the chief executive of Gerber Kawasaki, an investment firm that owns SpaceX shares, tells the New York Times that he's unimpressed by Musk's changing goals for the aerospace company.

One More Thing

AP PHOTO/LINDSEY WASSON

This grim but revolutionary DNA technology is changing how we respond to mass disasters

After hundreds went missing in Maui's deadly fires, victims were identified with rapid DNA analysis—an increasingly vital tool for putting names to the dead in mass-casualty events. The technology helped identify victims within just a few hours and bring families some closure more quickly than ever before. But it also previews a dark future marked by the rising frequency of catastrophic events. Find out how this forensic breakthrough is preparing us for a more volatile world.

—Erika Hayasaki

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line.)
+ This fascinating dive into botanical history reveals the origins of the first true plants.
+ Here's how to use Google's reference desk to find what ordinary search engines miss.
+ Watch duct tape get deconstructed to reveal the physics behind its legendary stickiness.
+ When Radiohead covers Joy Division, the result is a beautiful intersection of two legendary musical eras.

