YouZum

AI

AI, Committee, Nachrichten, Uncategorized

Large Language Model Post-Training: A Unified View of Off-Policy and On-Policy Learning

arXiv:2604.07941v1 Announce Type: new Abstract: Post-training has become central to turning pretrained large language models (LLMs) into aligned and deployable systems. Recent progress spans supervised fine-tuning (SFT), preference optimization, reinforcement learning (RL), process supervision, verifier-guided methods, distillation, and multi-stage pipelines. Yet these methods are often discussed in fragmented ways, organized by labels or objective families rather than by the behavioral bottlenecks they address. This survey argues that LLM post-training is best understood as structured intervention on model behavior. We organize the field first by trajectory provenance, which defines two primary learning regimes: off-policy learning on externally supplied trajectories, and on-policy learning on learner-generated rollouts. We then interpret methods through two recurring roles — effective support expansion, which makes useful behaviors more reachable, and policy reshaping, which improves behavior within already reachable regions — together with a complementary systems-level role, behavioral consolidation, which preserves, transfers, and amortizes behavior across stages and model transitions. This perspective yields a unified reading of major paradigms. SFT may serve either support expansion or policy reshaping, whereas preference-based methods are usually off-policy reshaping. On-policy RL often improves behavior on learner-generated states, though under stronger guidance it can also make hard-to-reach reasoning paths reachable. Distillation is often best understood as consolidation rather than only compression, and hybrid pipelines emerge as coordinated multi-stage compositions. Overall, the framework helps diagnose post-training bottlenecks and reason about stage composition, suggesting that progress in LLM post-training increasingly depends on coordinated system design rather than any single dominant objective.
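The survey's central split between off-policy and on-policy learning can be made concrete in a toy sketch (my illustration, not code from the paper; the function names and the REINFORCE-style estimator are my own choices, assuming a simple categorical policy):

```python
import torch
import torch.nn.functional as F

def off_policy_loss(logits, expert_tokens):
    """Off-policy regime (e.g. SFT): cross-entropy on externally supplied tokens."""
    return F.cross_entropy(logits, expert_tokens)

def on_policy_loss(logits, reward):
    """On-policy regime (REINFORCE-style): sample from the learner itself,
    then reweight the log-probabilities of its own rollouts by reward."""
    dist = torch.distributions.Categorical(logits=logits)
    actions = dist.sample()  # learner-generated rollout
    return -(reward(actions) * dist.log_prob(actions)).mean()
```

The key difference is the trajectory provenance: the first loss scores tokens someone else produced, the second scores tokens the model itself sampled.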


Symbiotic-MoE: Unlocking the Synergy between Generation and Understanding

arXiv:2604.07753v1 Announce Type: cross Abstract: Empowering Large Multimodal Models (LMMs) with image generation often leads to catastrophic forgetting in understanding tasks due to severe gradient conflicts. While existing paradigms like Mixture-of-Transformers (MoT) mitigate this conflict through structural isolation, they fundamentally sever cross-modal synergy and suffer from capacity fragmentation. In this work, we present Symbiotic-MoE, a unified pre-training framework that resolves task interference within a native multimodal Mixture-of-Experts (MoE) Transformer architecture with zero-parameter overhead. We first identify that standard MoE tuning leads to routing collapse, where generative gradients dominate expert utilization. To address this, we introduce Modality-Aware Expert Disentanglement, which partitions experts into task-specific groups while utilizing shared experts as a multimodal semantic bridge. Crucially, this design allows shared experts to absorb fine-grained visual semantics from generative tasks to enrich textual representations. To optimize this, we propose a Progressive Training Strategy featuring differential learning rates and early-stage gradient shielding. This mechanism not only shields pre-trained knowledge from early volatility but eventually transforms generative signals into constructive feedback for understanding. Extensive experiments demonstrate that Symbiotic-MoE achieves rapid generative convergence while unlocking cross-modal synergy, boosting inherent understanding with remarkable gains on MMLU and OCRBench.
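The Modality-Aware Expert Disentanglement described above can be pictured as a routing mask: task-specific experts are only reachable by tokens of their own task group, while shared experts stay reachable to all tokens. The following is a speculative sketch based only on the abstract, not the paper's code; the function, the tensor shapes, and the convention of marking shared experts with task id -1 are my assumptions:

```python
import torch

def masked_topk_routing(router_logits, token_task, expert_task, num_shared, k=2):
    """Illustrative modality-aware top-k routing.
    router_logits: (tokens, experts); token_task: (tokens,); expert_task: (experts,).
    Experts 0..num_shared-1 are shared; the rest are task-specific."""
    # a token may use experts whose task id matches its own...
    mask = expert_task.unsqueeze(0) == token_task.unsqueeze(1)
    # ...plus the shared experts, which act as the cross-modal bridge
    mask[:, :num_shared] = True
    logits = router_logits.masked_fill(~mask, float("-inf"))
    topv, topi = logits.topk(k, dim=1)
    weights = torch.softmax(topv, dim=1)  # normalized over the selected experts
    return topi, weights
```

Because disallowed experts are masked to -inf before top-k, generative tokens can never crowd understanding tokens out of their expert group, which is one way routing collapse could be avoided.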


HyperMem: Hypergraph Memory for Long-Term Conversations

arXiv:2604.08256v1 Announce Type: new Abstract: Long-term memory is essential for conversational agents to maintain coherence, track persistent tasks, and provide personalized interactions across extended dialogues. However, existing approaches such as Retrieval-Augmented Generation (RAG) and graph-based memory mostly rely on pairwise relations, which can hardly capture high-order associations, i.e., joint dependencies among multiple elements, causing fragmented retrieval. To this end, we propose HyperMem, a hypergraph-based hierarchical memory architecture that explicitly models such associations using hyperedges. Particularly, HyperMem structures memory into three levels: topics, episodes, and facts, and groups related episodes and their facts via hyperedges, unifying scattered content into coherent units. Leveraging this structure, we design a hybrid lexical-semantic index and a coarse-to-fine retrieval strategy, supporting accurate and efficient retrieval of high-order associations. Experiments on the LoCoMo benchmark show that HyperMem achieves state-of-the-art performance with 92.73% LLM-as-a-judge accuracy, demonstrating the effectiveness of HyperMem for long-term conversations.
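The core idea of a hyperedge grouping a topic with several episodes and facts at once can be illustrated with a toy structure. This is purely illustrative, not HyperMem's implementation; the class, the dict layout, and the crude keyword index are my own, standing in for the paper's hybrid lexical-semantic index:

```python
from collections import defaultdict

class HypergraphMemory:
    """Toy hyperedge memory: one lookup returns a whole high-order
    association (topic + episodes + facts), not an isolated pairwise fact."""

    def __init__(self):
        self.hyperedges = []                # each: {"topic", "episodes", "facts"}
        self.by_keyword = defaultdict(set)  # crude lexical index over all texts

    def add(self, topic, episodes, facts):
        idx = len(self.hyperedges)
        self.hyperedges.append({"topic": topic, "episodes": episodes, "facts": facts})
        for text in [topic, *episodes, *facts]:
            for word in text.lower().split():
                self.by_keyword[word].add(idx)

    def retrieve(self, query):
        """Coarse lexical pass: return whole hyperedges that share any keyword."""
        hits = set()
        for word in query.lower().split():
            hits |= self.by_keyword.get(word, set())
        return [self.hyperedges[i] for i in sorted(hits)]
```

Even in this toy form, retrieval returns the grouped unit rather than fragments, which is the property the abstract argues pairwise graphs lack.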


How Knowledge Distillation Compresses Ensemble Intelligence into a Single Deployable AI Model

Complex prediction problems often lead to ensembles because combining multiple models improves accuracy by reducing variance and capturing diverse patterns. However, these ensembles are impractical in production due to latency constraints and operational complexity. Instead of discarding them, knowledge distillation offers a smarter approach: keep the ensemble as a teacher and train a smaller student model using its soft probability outputs. This allows the student to inherit much of the ensemble’s performance while being lightweight and fast enough for deployment. In this article, we build this pipeline from scratch — training a 12-model teacher ensemble, generating soft targets with temperature scaling, and distilling it into a student that recovers 53.8% of the ensemble’s accuracy edge at 160× compression.

What is Knowledge Distillation?

Knowledge distillation is a model compression technique in which a large, pre-trained “teacher” model transfers its learned behavior to a smaller “student” model. Instead of training solely on ground-truth labels, the student is trained to mimic the teacher’s predictions—capturing not just final outputs but the richer patterns embedded in its probability distributions. This approach enables the student to approximate the performance of complex models while remaining significantly smaller and faster. Originating from early work on compressing large ensemble models into single networks, knowledge distillation is now widely used across domains like NLP, speech, and computer vision, and has become especially important in scaling down massive generative AI models into efficient, deployable systems.
Knowledge Distillation: From Ensemble Teacher to Lean Student

Setting up the dependencies

```
pip install torch scikit-learn numpy
```

```
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np
```

```
torch.manual_seed(42)
np.random.seed(42)
```

Creating the dataset

This block creates and prepares a synthetic dataset for a binary classification task (like predicting whether a user clicks an ad). First, make_classification generates 5,000 samples with 20 features, of which some are informative and some redundant to simulate real-world data complexity. The dataset is then split into training and testing sets to evaluate model performance on unseen data. Next, StandardScaler normalizes the features so they have a consistent scale, which helps neural networks train more efficiently. The data is then converted into PyTorch tensors so it can be used in model training. Finally, a DataLoader is created to feed the data in mini-batches (size 64) during training, improving efficiency and enabling stochastic gradient descent.
```
X, y = make_classification(
    n_samples=5000,
    n_features=20,
    n_informative=10,
    n_redundant=5,
    random_state=42
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Convert to tensors
X_train_t = torch.tensor(X_train, dtype=torch.float32)
y_train_t = torch.tensor(y_train, dtype=torch.long)
X_test_t = torch.tensor(X_test, dtype=torch.float32)
y_test_t = torch.tensor(y_test, dtype=torch.long)

train_loader = DataLoader(
    TensorDataset(X_train_t, y_train_t),
    batch_size=64,
    shuffle=True
)
```

Model Architecture

This section defines two neural network architectures: a TeacherModel and a StudentModel. The teacher represents one of the large models in the ensemble—it has multiple layers, wider dimensions, and dropout for regularization, making it highly expressive but computationally expensive during inference. The student model, on the other hand, is a smaller and more efficient network with fewer layers and parameters. Its goal is not to match the teacher’s complexity, but to learn its behavior through distillation. Importantly, the student still retains enough capacity to approximate the teacher’s decision boundaries—too small, and it won’t be able to capture the richer patterns learned by the ensemble.

```
class TeacherModel(nn.Module):
    """Represents one heavy model inside the ensemble."""

    def __init__(self, input_dim=20, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(256, 128), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, num_classes)
        )

    def forward(self, x):
        return self.net(x)


class StudentModel(nn.Module):
    """
    The lean production model that learns from the ensemble.
    Two hidden layers — enough capacity to absorb distilled knowledge,
    still ~30x smaller than the full ensemble.
    """

    def __init__(self, input_dim=20, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, num_classes)
        )

    def forward(self, x):
        return self.net(x)
```

Helpers

This section defines two utility functions for training and evaluation. train_one_epoch handles one full pass over the training data. It puts the model in training mode, iterates through mini-batches, computes the loss, performs backpropagation, and updates the model weights using the optimizer. It also tracks and returns the average loss across all batches to monitor training progress. evaluate is used to measure model performance. It switches the model to evaluation mode (disabling dropout and gradients), makes predictions on the input data, and computes the accuracy by comparing predicted labels with true labels.

```
def train_one_epoch(model, loader, optimizer, criterion):
    model.train()
    total_loss = 0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(loader)


def evaluate(model, X, y):
    model.eval()
    with torch.no_grad():
        preds = model(X).argmax(dim=1)
    return (preds == y).float().mean().item()
```

Training the Ensemble

This section trains the teacher ensemble, which serves as the source of knowledge for distillation. Instead of a single model, 12 teacher models are trained independently with different random initializations, allowing each one to learn slightly different patterns from the data. This diversity is what makes ensembles powerful. Each teacher is trained for multiple epochs until convergence, and their individual test accuracies are printed.
Once all models are trained, their predictions are combined using soft voting—by averaging their output logits rather than taking a simple majority vote. This produces a stronger, more stable final prediction, giving you a high-performing ensemble that will act as the “teacher” in the next step.

```
print("=" * 55)
print("STEP 1: Training the 12-model Teacher Ensemble")
print("   (this happens offline, not in production)")
print("=" * 55)

NUM_TEACHERS = 12
teachers = []

for i in range(NUM_TEACHERS):
    torch.manual_seed(i)  # different init per teacher
    model = TeacherModel()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()
    for epoch in range(30):  # train until convergence
        train_one_epoch(model, train_loader, optimizer, criterion)
    acc = evaluate(model, X_test_t, y_test_t)
```
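The article is cut off before the soft-target and distillation steps it promises. A minimal sketch of the usual approach, reusing the teacher models and helpers defined above, might look like the following; the temperature T=4.0 and mixing weight alpha=0.7 are illustrative defaults, not values taken from the article:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ensemble_logits(teachers, x):
    """Soft voting: average the raw logits of all teacher models."""
    with torch.no_grad():
        return torch.stack([t(x) for t in teachers]).mean(dim=0)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend temperature-scaled KL against the teacher's soft targets
    with ordinary cross-entropy on the hard labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the hard-label term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

In a training loop, each batch would compute `teacher_logits = ensemble_logits(teachers, xb)` once, then backpropagate `distillation_loss(student(xb), teacher_logits, yb)` through the student only.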


The Art of (Mis)alignment: How Fine-Tuning Methods Effectively Misalign and Realign LLMs in Post-Training

arXiv:2604.07754v1 Announce Type: cross Abstract: The deployment of large language models (LLMs) raises significant ethical and safety concerns. While LLM alignment techniques are adopted to improve model safety and trustworthiness, adversaries can exploit these techniques to undermine safety for malicious purposes, resulting in “misalignment”. Misaligned LLMs may be published on open platforms to magnify harm. To address this, additional safety alignment, referred to as “realignment”, is necessary before deploying untrusted third-party LLMs. This study explores the efficacy of fine-tuning methods in terms of misalignment, realignment, and the effects of their interplay. By evaluating four Supervised Fine-Tuning (SFT) and two Preference Fine-Tuning (PFT) methods across four popular safety-aligned LLMs, we reveal a mechanism asymmetry between attack and defense. While Odds Ratio Preference Optimization (ORPO) is most effective for misalignment, Direct Preference Optimization (DPO) excels in realignment, albeit at the expense of model utility. Additionally, we identify model-specific resistance, residual effects of multi-round adversarial dynamics, and other noteworthy findings. These findings highlight the need for robust safeguards and customized safety alignment strategies to mitigate potential risks in the deployment of LLMs. Our code is available at https://github.com/zhangrui4041/The-Art-of-Mis-alignment.


SealQA: Raising the Bar for Reasoning in Search-Augmented Language Models

arXiv:2506.01062v4 Announce Type: replace Abstract: We introduce SealQA, a new challenge benchmark for evaluating SEarch-Augmented Language models on fact-seeking questions where web search yields conflicting, noisy, or unhelpful results. SealQA comes in three flavors: (1) Seal-0 (main) and (2) Seal-Hard, which assess factual accuracy and reasoning capabilities, with Seal-0 focusing on the most challenging questions where chat models (e.g., GPT-4.1) typically achieve near-zero accuracy; and (3) LongSeal, which extends SealQA to test long-context, multi-document reasoning in “needle-in-a-haystack” settings. Our evaluation reveals critical limitations in current models: Even frontier LLMs perform poorly across all SealQA flavors. On Seal-0, frontier agentic models equipped with tools like o3 and o4-mini achieve only 17.1% and 6.3% accuracy, respectively, at their best reasoning efforts. We find that advanced reasoning models such as DeepSeek-R1-671B and o3-mini are highly vulnerable to noisy search results. Notably, increasing test-time compute does not yield reliable gains across o3-mini, o4-mini, and o3, with performance often plateauing or even declining early. Additionally, while recent models are less affected by the “lost-in-the-middle” issue, they still fail to reliably identify relevant documents in LongSeal when faced with numerous distractors. To facilitate future work, we release SealQA at huggingface.co/datasets/vtllms/sealqa.


Constellations

I. We had crash-landed on the planet. We were far from home. The spaceship could not be repaired, and the rescue beacon had failed. Besides me, only the astrogator, part of the captain, and the ship’s AI mind were left.  Outside, the atmosphere registered as hostile to most organisms. We huddled in the lifeboat, which was inoperable but still held air. Vast storms buffeted our cockleshell shelter, although we knew from prior readings that other areas remained calm. All that remained to us was to explore, if we wanted to live. The captain gave me the sole weapon. She tasked the astrogator with carrying some tools that would not unduly weigh him down. Little existed on the planet except deserts of snow. But alien artifacts lay in an area near us. We were an exploration team, so this discovery had oddly comforted us, even though we had been on our way elsewhere. The massive systems failure had no discernible source, and the planet had been our only choice for landfall. The artifacts took the form of 13 domes, spread out over that hostile terrain. The domes had been linked by cables just below shoulder level, threaded through the tops of metal posts at irregular intervals. Whether intended or not, these cables and rods formed a series of paths between the domes.  Before our instruments failed, the AI had reported that the domes appeared to have a heat signature. The cables pulsed under our grip in a way that teased promised warmth far ahead. It took some time to get used to the feeling. The shortest path between domes was a thousand miles long. The longest path was 10 thousand miles long. Our suit technology was good: A suit could recycle water, generate food, create oxygen. It could push us into various states of near hibernation while motors in the legs drove us forward. For the captain, the suit would compensate for having lost her legs and ease her pain. We estimated we could reach the nearest path and follow it to the nearest dome … and that was it. 
If the dome had life support capabilities, or even just a way to replenish our suits, we would live. Otherwise, we would probably die. We revised the estimate of our survival downward when we reached the path and soon encountered the skeletons of dead astronauts littering the way. In all shapes and sizes, cocooned within their suits. Their huddled forms under the snow displayed a serenity at odds with their fate. But when I wiped the frost from face plates, we saw the extremity of their suffering. It is difficult to explain how we felt walking among so many fatalities. So many dead first contacts.  We no longer had to puzzle over the systems failure. Spaceships came here to crash, and intelligent entities came here to die, for whatever reason. We could not presume our fate would be any different, and adjusted our expectations accordingly. The AI’s platitudes about courage did not raise morale. There were too many lost there in the frozen wastes.  The number of the bodies and their haphazard positioning hampered our ability to make progress to the dome. The AI estimated our chances of survival at below 50% for the first time. We would starve in our suits as the motors propelled us forward. We would become desiccated and exist in an elongation of our thoughts that made us weak and stupid until the light winked out. But still, we had no choice. So even in places where the dead in their suits were piled high, we would simply plunge forward, over and through them, headed for the dome.  What we would find there, as I have said, we did not know. But we were in an area of the galaxy where ancient civilizations had died out millions of years ago. We had been on our way to a major site, an ancient city on a moon with no atmosphere in a wilderness of stars.  Although our emotions fluctuated, a professional awe and curiosity about the dead eventually came over us.
This created much debate over the comms. We had made a discovery for the ages, but our satisfaction was bittersweet. Even if we lived longer than expected, we would never return home, never see our friends or family again. The AI might continue on after we were dead, but I doubt it envied being the one to report on our discovery centuries hence. And to who? Here were the ghastly emissaries of hundreds of spacefaring species we had never before encountered. Their suits displayed an extraordinary range, although our examination was cursory. Some even appeared to be made out of scales and other biological substances from their home worlds, giving us further clues as to their origins.  The burial of the suits by snow and the lack of access to anything other than a screaming face or faces, often distorted by time and ice, worked against recording much usable data. This issue was compounded in those cases where the suit was part of the organism and they had not needed any “artificial skin,” as the AI put it, to survive harsh conditions. That many had died despite appearing well-prepared for the planet’s environment sobered us up even before our own suits dispensed drugs to help our mental states.  After a time, each face seemed to express some aspect of our own stress and terror at the seriousness of our situation. After a time, the sheer welter of detail defeated us and caused us extreme distress. The captain made the observation that even one instance of alien contact might cause physiological and mental conditions, including anxiety, stress, fatigue. Here, we were constantly encountering the alien dead of what seemed at times an infinite number of civilizations.  We stopped recording. We recommitted ourselves to the slog toward the nearest dome.  The captain’s


The Download: an exclusive Jeff VanderMeer story and AI models too scary to release

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

Constellations

—Constellations is a short story by Jeff VanderMeer, the author of the critically acclaimed, bestselling Southern Reach series.

A spacecraft has crash-landed on a hostile planet. The only survivors are three members of the exploration team and the ship’s AI mind.

Little exists on the planet except deserts of snow. But alien artifacts lie nearby, in the form of 13 domes, spread across the terrain. Linked by cables threaded through metal posts, the domes form a series of paths—the only hope for life support.

As the team treks across the frozen hellscape, they discover the remains of countless astronauts from unknown species who followed the same route before them. Is their trail a path to salvation, or a cosmic trap? Read the rest of this short story in full.

This story is from the next issue of our print magazine, packed with stories all about nature. Subscribe now to read the full thing when it lands on Wednesday, April 22.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 OpenAI has joined Anthropic in curbing an AI release over security fears Only select partners will get its new cybersecurity tool. (Axios)
+ Anthropic said only yesterday that its new AI is too dangerous for the public. (NBC News)
+ Top models may not be so public going forward. (Bloomberg $)
+ The US has summoned bank CEOs to discuss the risks. (FT $)

2 Florida is investigating OpenAI over an alleged role in a shooting ChatGPT may have helped someone plan a mass shooting in Florida. (WSJ $)
+ OpenAI has backed a bill that would limit AI liability for deaths. (Wired $)
+ The family of a victim plans to sue the company. (Guardian)
+ AI’s role in delusions is dividing opinion.
(MIT Technology Review)

3 Volkswagen is ditching EV production for more gasoline models The carmaker will stop making its top electric vehicle in the US. (NYT $)
+ Instead, it will concentrate on developing a new SUV. (Ars Technica)
+ Western carmakers are retreating from electric vehicles. (Guardian)

4 Elon Musk’s xAI has sued Colorado over an AI anti-discrimination law It’s the first state bill of its kind. (Bloomberg $)
+ xAI says it will force the firm to “promote the state’s ideological views.” (FT $)

5 A fifth of US employees say AI now does parts of their job The survey found half of US adults used AI in the past week. (NBC News)
+ Missing data could shed light on AI’s job impact. (MIT Technology Review)

6 Google DeepMind’s CEO wants to automate drug design He hopes to develop AI capable of curing all diseases. (The Economist)
+ A scientist is using AI to hunt for antibiotics. (MIT Technology Review)

7 China’s Unitree is launching a viral robot on the international market R1, its cheapest humanoid, will go on sale outside China next week. (SCMP)
+ Gig workers are training humanoids at home. (MIT Technology Review)

8 An experiment on Artemis II astronauts could reshape space medicine Chips containing their cells will model spaceflight’s effects. (WP $)

9 A pro-Iran meme machine is trolling Trump with AI Lego cartoons The videos have racked up millions of views. (Wired $)
+ You can learn to love AI slop. (MIT Technology Review)

10 Short breaks could erase 10 years of social media brain damage Studies show that a two-week detox could have a dramatic benefit. (WP $)

Quote of the day

“AI should advance mankind, not destroy it. We’re demanding answers on OpenAI’s activities that have hurt kids, endangered Americans, and facilitated the recent FSU mass shooting.”

—Florida Attorney General James Uthmeier explains on X why he’s probing OpenAI.
One More Thing

TOM HUMBERSTONE

It’s time to retire the term “user”

People have been called “users” for a long time. Often, it’s the right word to describe people who use software. But “users” is also unspecific enough to refer to just about everyone. It can accommodate almost any big idea or long-term vision.

We use—and are used by—computers and platforms and companies. The label “user” suggests these interactions are deeply transactional, but they’re frequently quite personal. Is it time for a more human vocabulary? Read the full story.

—Taylor Majewski

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line.)

+ This flawless levitation trick will leave you questioning the laws of physics.
+ The World Press Photo winners expose the beauty (and brutality) of our planet.
+ Over 3 million pink flamingos gathered to create a stunning pink horizon.
+ Behold the galaxy’s enormity in this comparison of its largest known star to Earth


What’s in a name? Moderna’s “vaccine” vs. “therapy” dilemma

Is it the Department of Defense or the Department of War? The Gulf of Mexico or the Gulf of America? A vaccine—or an “individualized neoantigen treatment”? That’s the Trump-era vocabulary paradox facing Moderna, the covid-19 shot maker whose plans for next-generation mRNA vaccines against flus and emerging pathogens have been dashed by vaccine skeptics in the federal government. Canceled contracts and unfriendly regulators have pushed the Massachusetts-based biotech firm to a breaking point. Last year, Robert F. Kennedy Jr., head of the Department of Health and Human Services, zeroed in on mRNA, unwinding support for dozens of projects—including a $776 million award to Moderna for a bird flu vaccine. By January, the company was warning it might have to stop late-stage programs to develop vaccines against infections altogether. That raises the stakes for a second area of Moderna’s research. In a partnership with Merck, it’s been using its mRNA technology to destroy tumors through a very, very promising technique known as a cancer vacc— “It’s not a vaccine,” a spokesperson for Merck jumped in before the V-word could leave my mouth. “It’s an individualized neoantigen therapy.” Oh, but it is a vaccine. And here’s how it works. Moderna sequences a patient’s cancer cells to find the ugliest, most peculiar molecules on their surface. Then it packages the genetic code for those same molecules, called neoantigens, into a shot. The patient’s immune system has its orders: Kill any cells with those yucky surface markers. Mechanistically, it’s similar to the covid-19 vaccines. What’s different, of course, is that the patient is being immunized against a cancer, not a virus. And it looks like a possible breakthrough. This year, Moderna and Merck showed that such shots halved the chance that patients with the deadliest form of skin cancer would die from a recurrence after surgery. 
In its formal communications, like regulatory filings, Moderna hasn’t called the shot a cancer vaccine since 2023. That’s when it partnered up with Merck and rebranded the tech as individualized neoantigen therapy, or INT. Moderna’s CEO said at the time that the renaming was to “better describe the goal of the program.” (BioNTech, the European vaccine maker that’s also working in cancer, has shifted its language too, moving from “neoantigen vaccine” in 2021 to “mRNA cancer immunotherapies” in its latest report.) The logic of casting it as a therapy is that patients already have cancer—so it’s a treatment as opposed to a preventive measure. But it’s no secret what the other goal is: to distance important innovation from vaccine fearmongering, which has been inflamed by high-ranking US officials. “Vaccines are maybe a dirty word nowadays, but we still believe in the science and harnessing our immune system to not only fight infections, but hopefully to also fight … cancers,” Kyle Holen, head of Moderna’s cancer program, said last summer during BIO 2025, a big biotech event in Boston. Not everyone is happy with the word games. Take Ryan Sullivan, a physician at Massachusetts General Hospital who has enrolled patients in Moderna’s trials. He says the change raises questions over whether trial volunteers are being properly informed. “There is some concern that there will be patients who decline to treat their cancer because it is a vaccine,” Sullivan told me. “But I also felt it was important, as many of my colleagues did, that you have to call it what it is.” But is it worth going to the mat for a word? Lillian Siu, a medical oncologist at the Princess Margaret Cancer Centre, in Toronto, who has played a role in safety testing for the new shots, watches US politics from a distance. 
She believes the name change is acceptable “if it allows the research to continue.” Holen told me the doctors complaining to Moderna were basically motivated by a desire to defend vaccines—which are, of course, among the greatest public health interventions of all time. They wanted the company to stand strong.  But that’s not what’s happening. When Moderna’s latest results were published in February, the paper’s main text didn’t use the word “vaccine” at all. It was only in the footnotes that you could see the term—in the titles of old papers and patents. All this could be a sign that Kennedy’s strategy is working. His agencies often appear to make mRNA vaccines a focus of people’s worries, impede their reach, devalue them for companies, and sideline their defenders.  Still, Moderna’s strategy may be working too. So far, at least, the government hasn’t had much to say about the company’s cancer vacc— I mean, its individualized neoantigen therapy. This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.
