OpenAI updates Operator to o3, making its $200 monthly ChatGPT Pro subscription more enticing
Operator remains a research preview and is accessible only to ChatGPT Pro users. The Responses API version will continue to use GPT-4o. Read More
A prominent area of exploration involves enabling large language models (LLMs) to function collaboratively. Multi-agent systems powered by LLMs are now being examined for their potential to coordinate on challenging problems by splitting tasks and working simultaneously. This direction has gained attention for its potential to increase efficiency and reduce latency in real-time applications.

A common issue in collaborative LLM systems is agents' sequential, turn-based communication: each agent must wait for the others to complete their reasoning steps before proceeding. This slows down processing, especially in situations demanding rapid responses. Moreover, agents often duplicate effort or generate inconsistent outputs, because they cannot see the evolving thoughts of their peers during generation. This latency and redundancy limit the practicality of deploying multi-agent LLMs, particularly when time and computation are constrained, such as on edge devices.

Most current solutions rely on sequential or independently parallel sampling techniques to improve reasoning. Methods like Chain-of-Thought prompting help models solve problems in a structured way but often increase inference time. Approaches such as Tree-of-Thoughts and Graph-of-Thoughts expand on this by branching reasoning paths, yet they still do not allow real-time mutual adaptation among agents. Multi-agent setups have explored collaborative methods, but mostly through alternating message exchanges, which again introduces delays. Some advanced systems propose complex dynamic scheduling or role-based configurations, which are not optimized for efficient inference.

Research from MediaTek Research introduced a new method called Group Think. This approach enables multiple reasoning agents within a single LLM to operate concurrently, observing each other's partial outputs at the token level. Each reasoning thread adapts to the evolving thoughts of the others mid-generation. This mechanism reduces duplication and lets agents shift direction if another thread is better positioned to continue a specific line of reasoning. Group Think is implemented through a token-level attention mechanism that lets each agent attend to previously generated tokens from all agents, supporting real-time collaboration.

The method works by assigning each agent its own sequence of token indices, allowing their outputs to be interleaved in memory. These interleaved tokens are stored in a shared cache accessible to all agents during generation. This design allows efficient attention across reasoning threads without architectural changes to the transformer model. The implementation works both on personal devices and in data centers: on local devices, it makes effective use of idle compute by batching multiple agent outputs, even with a batch size of one; in data centers, Group Think allows multiple requests to be processed together, interleaving tokens across agents while maintaining correct attention dynamics.
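A minimal sketch of what this interleaved, step-wise attention could look like in PyTorch follows; the round-robin token layout, the function name, and the synchronization granularity (each token sees all tokens from strictly earlier decoding steps) are illustrative assumptions, not MediaTek's released implementation.

```python
# Sketch: a Group Think-style attention mask for N agents decoding in lockstep.
import torch

def groupthink_attention_mask(num_agents: int, num_steps: int) -> torch.Tensor:
    """Tokens from all agents are interleaved round-robin in one sequence,
    so position t belongs to agent t % num_agents and was emitted at
    decoding step t // num_agents. Each token may attend to every token
    emitted at a strictly earlier step -- its own thread's and its peers' --
    plus itself, which is what lets threads adapt to each other mid-generation."""
    total = num_agents * num_steps
    step = torch.arange(total) // num_agents           # decoding step per position
    visible = step.unsqueeze(0) < step.unsqueeze(1)    # [i, j]: key j at earlier step than query i
    visible |= torch.eye(total, dtype=torch.bool)      # each token attends to itself
    return visible

# Example: 3 agents, 4 decoding steps -> a 12x12 boolean mask a transformer
# can consume in place of its usual causal mask, with all agents' KV entries
# living in one shared cache.
mask = groupthink_attention_mask(3, 4)
```

Because the mask only relaxes which cached tokens are visible, this kind of scheme needs no architectural change to the transformer itself, matching the claim above.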
Performance tests demonstrate that Group Think significantly improves latency and output quality. In enumeration tasks, such as listing 100 distinct names, it achieved near-complete results more rapidly than conventional Chain-of-Thought approaches, with acceleration proportional to the number of thinkers: four thinkers, for example, cut latency by a factor of about four. In divide-and-conquer problems, using the Floyd–Warshall algorithm on a graph of five nodes, four thinkers halved the completion time relative to a single agent. On code generation tasks, Group Think solved programming challenges more effectively than baseline models; with four or more thinkers, it produced correct code segments much faster than traditional reasoning models.

This research shows that existing LLMs, though not explicitly trained for collaboration, already exhibit emergent group reasoning behaviors under the Group Think setup. In experiments, agents naturally diversified their work to avoid redundancy, often dividing tasks by topic or focus area. These findings suggest that Group Think's efficiency and sophistication could be enhanced further with dedicated training on collaborative data.

Check out the Paper. The post This AI Paper Introduces Group Think: A Token-Level Multi-Agent Reasoning Paradigm for Faster and Collaborative LLM Inference appeared first on MarkTechPost.
“I’m feeling blue today” versus “I painted the fence blue.”
A Gentle Introduction to Word Embedding and Text Vectorization Read more »
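To make the teaser above concrete, here is a minimal sketch of why that distinction matters for vectorization, assuming the Hugging Face transformers and torch packages are installed; the model choice (bert-base-uncased) is an illustrative assumption. A static word embedding assigns "blue" a single vector regardless of context, while a contextual model produces different vectors for the emotional sense and the color sense:

```python
# Sketch: contextual embeddings distinguish the two senses of "blue".
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def blue_vector(sentence: str) -> torch.Tensor:
    """Return the contextual embedding of the token 'blue' in `sentence`."""
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, 768)
    idx = enc.input_ids[0].tolist().index(tok.convert_tokens_to_ids("blue"))
    return hidden[idx]

v_mood = blue_vector("I'm feeling blue today")
v_color = blue_vector("I painted the fence blue")
# A static embedding would give 'blue' identical vectors in both sentences;
# a contextual model yields different ones (cosine similarity below 1).
print(torch.cosine_similarity(v_mood, v_color, dim=0))
```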
We have entered a new era of agentic IDEs like Windsurf and Cursor AI.
Understanding OpenAI Codex CLI Commands Read more »
This week, two new leaders at the US Food and Drug Administration announced plans to limit access to covid vaccines, arguing that there is not much evidence to support the value of annual shots in healthy people. New vaccines will be made available only to the people who are most vulnerable—namely, those over 65 and others with conditions that make them more susceptible to severe disease. Anyone else will have to wait. Covid vaccines will soon be required to go through more rigorous trials to ensure that they really are beneficial for people who aren’t at high risk.

The plans have been met with fear and anger in some quarters. But they weren’t all that shocking to me. In the UK, where I live, covid boosters have been offered only to vulnerable groups for a while now. And the immunologists I spoke to agree: the plans make sense.

They are still controversial, though. Covid hasn’t gone away. And while most people are thought to have some level of immunity to the virus, some of us still stand to get very sick if infected. The threat of long covid lingers, too. Given that people respond differently to both the virus and the vaccine, perhaps individuals should be able to choose whether they get a vaccine or not.

I should start by saying that covid vaccines have been a remarkable success story. The drugs were developed at record-breaking speed—they were given to people in clinical trials just 69 days after the virus had been identified. They are, on the whole, very safe. And they work remarkably well. They have saved millions of lives. And they rescued many of us from lockdowns.

But while many of us have benefited hugely from covid vaccinations in the past, there are questions over how useful continuing annual booster doses might be. That’s the argument being made by FDA head Marty Makary and Vinay Prasad, director of the agency’s Center for Biologics Evaluation and Research.

Both men have been critical of the FDA in the past. Makary has long been accused of downplaying the benefits of covid vaccines. He made incorrect assumptions about the coronavirus responsible for covid-19 and predicted that the disease would be “mostly gone” by April 2021. Most recently, he also testified in Congress that the theory that the virus came from a lab in China was a “no-brainer.” (The strongest evidence suggests the virus jumped from animals to humans in a market in Wuhan.) Prasad has said “the FDA is a failure” and has called annual covid boosters “a public health disaster the likes of which we’ve never seen before,” because of a perceived lack of clinical evidence to support their use.

Makary and Prasad’s plans, which were outlined in the New England Journal of Medicine on Tuesday, don’t include such inflammatory language or unfounded claims, thankfully. In fact, they seem pretty measured: annual covid booster shots will continue to be approved for vulnerable people but will have to be shown to benefit others before people outside the approved groups can access them. There are still concerns being raised, though. Let’s address a few of the biggest ones.

Shouldn’t I get an annual covid booster alongside my flu vaccine?

At the moment, a lot of people in the US opt to get a covid vaccination around the time they get their annual flu jab. Each year, a flu vaccine is developed to protect against what scientists predict will be the dominant strain of virus circulating come flu season, which tends to run from October through March.
But covid doesn’t seem to stick to the same seasonal patterns, says Susanna Dunachie, a clinical doctor and professor of infectious diseases at the University of Oxford in the UK. “We seem to be getting waves of covid year-round,” she says.

And an annual shot might not offer the best protection against covid anyway, says Fikadu Tafesse, an immunologist and virologist at Oregon Health & Science University in Portland. His own research suggests that leaving more than a year between booster doses could enhance their effectiveness. “One year is really a random time,” he says. It might be better to wait five or 10 years between doses instead, he adds. “If you are at risk [of a serious covid infection] you may actually need [a dose] every six months,” says Tafesse. “But for healthy individuals, it’s a very different conversation.”

What about children—shouldn’t we be protecting them?

There are reports that pediatricians are concerned about the impact on children, some of whom can develop serious cases of covid. “If we have safe and effective vaccines that prevent illness, we think they should be available,” James Campbell, vice chair of the committee on infectious diseases at the American Academy of Pediatrics, told STAT.

This question has been on my mind for a while. My two young children, who were born in the UK, have never been eligible for a covid vaccine in this country. I found this incredibly distressing when the virus started tearing through child-care centers—especially given that at the time, the US was vaccinating babies from the age of six months. My kids were eventually offered a vaccine in the US, when we temporarily moved there a couple of years ago. But by that point, the equation had changed. They’d both had covid by then. I had a better idea of the general risks of the virus to children. I turned it down.

I was relieved to hear that Tafesse had made the same decision for his own children. “There are always exceptions, but in general, [covid] is not severe in kids,” he says. The UK’s Joint Committee on Vaccination and Immunisation found that the benefits of vaccination are much smaller for children than they are for adults. “Of course there are children with health problems who should definitely have it,” says Dunachie. “But for healthy children in healthy households, the benefits probably are quite marginal.”

Shouldn’t healthy people get vaccinated to help protect more vulnerable people?
The FDA plans to limit access to covid vaccines. Here’s why that’s not all bad. Read more »
The effectiveness of language models relies on their ability to simulate human-like, step-by-step deduction. However, these reasoning sequences are resource-intensive and wasteful for simple questions that do not require elaborate computation. This lack of awareness of task complexity is one of the core challenges for these models: they often default to detailed reasoning even for queries that could be answered directly, inflating token usage, response latency, and memory consumption. As a result, there is a pressing need to equip language models with a mechanism for deciding autonomously whether to think deeply or respond succinctly.

Current tools attempting to solve this issue rely on manually set heuristics or prompt engineering to switch between short and long responses. Some methods use separate models and route questions based on complexity estimates, but these external routing systems often lack insight into the target model’s strengths and fail to make optimal decisions. Other techniques fine-tune models with prompt-based cues like “reasoning on/off,” but these rely on static rules rather than dynamic understanding. Despite some improvements, these approaches fail to enable fully autonomous, context-sensitive control within a single model.

Researchers from the National University of Singapore introduced a framework called Thinkless, which equips a language model with the ability to decide dynamically between short and long-form reasoning. The framework is built on reinforcement learning and introduces two special control tokens: <short> for concise answers and <think> for detailed responses. By incorporating a novel algorithm called Decoupled Group Relative Policy Optimization (DeGRPO), Thinkless separates the training focus between selecting the reasoning mode and improving the accuracy of the generated response. This design prevents the model from collapsing into one-dimensional behavior and enables adaptive reasoning tailored to each query.

The methodology involves two stages: warm-up distillation and reinforcement learning. In the distillation phase, Thinkless is trained on outputs from two expert models, one specializing in short responses and the other in detailed reasoning; this stage establishes a firm link between each control token and its reasoning format. The reinforcement learning stage then fine-tunes the model’s ability to decide which mode to use. DeGRPO decomposes the learning into two separate objectives, one for the control token and another for the response tokens. This avoids the gradient imbalance of earlier approaches, where longer responses would overpower the learning signal and collapse reasoning diversity; Thinkless ensures that both <short> and <think> tokens receive balanced updates, promoting stable learning across response types.
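To make the decoupling concrete, here is a minimal sketch of the two-term objective described above, assuming per-token log-probabilities and a group-relative advantage have already been computed; the function name, the `alpha` weighting constant, and the exact normalization are illustrative assumptions rather than the paper's released code.

```python
# Sketch: a decoupled policy-gradient loss in the spirit of DeGRPO.
import torch

def degrpo_loss(ctrl_logp: torch.Tensor,
                resp_logps: torch.Tensor,
                advantage: float,
                alpha: float = 1.0) -> torch.Tensor:
    """ctrl_logp: log-prob of the single mode token (<short> or <think>).
    resp_logps: log-probs of the answer tokens that followed, shape (T,).
    advantage: group-relative advantage of this rollout (reward relative
    to the group mean, as in GRPO).
    Normalizing the two terms separately keeps a long <think> rollout
    from drowning out the one-token mode-selection signal -- the gradient
    imbalance the decoupling is meant to avoid."""
    mode_loss = -advantage * ctrl_logp          # trains which mode to pick
    resp_loss = -advantage * resp_logps.mean()  # trains answer quality,
                                                # length-normalized over T
    return alpha * mode_loss + resp_loss
```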
When evaluated, Thinkless significantly reduced long-form reasoning while preserving high accuracy. On the Minerva Algebra benchmark, the model used the <think> token in only 25.88% of cases while achieving 94.59% accuracy. On the AIME 2024 dataset, Thinkless reached 27.33% accuracy with 100% usage of the reasoning mode, showing that it maintains full reasoning when it is genuinely needed. On the GSM8K dataset, it used <think> only 13.31% of the time, yet still achieved 84.18% accuracy. These results reflect the model’s ability to handle simple and complex queries with appropriate reasoning depth, cutting unnecessary token generation by as much as 90% in some tasks.

Overall, this study from the National University of Singapore presents a compelling solution to the inefficiencies of uniform reasoning in large language models. By introducing a mechanism that lets models judge task complexity and adjust their inference strategy accordingly, Thinkless optimizes both accuracy and efficiency, balancing reasoning depth and response precision without relying on fixed rules.

Check out the Paper and GitHub Page. The post Researchers from the National University of Singapore Introduce ‘Thinkless,’ an Adaptive Framework that Reduces Unnecessary Reasoning by up to 90% Using DeGRPO appeared first on MarkTechPost.
Since the Chinese biophysicist He Jiankui was released from prison in 2022, he has sought to make a scientific comeback and to repair his reputation after a three-year incarceration for illegally creating the world’s first gene-edited children. While he has bounced between cities, jobs, and meetings with investors, one area of visible success on his comeback trail has been his X.com account, @Jiankui_He, which has become his main way of spreading his ideas to the world.

Starting in September 2022, when he joined the platform, the account stuck to the scientist’s main themes, including promising a more careful approach to his dream of creating more gene-edited children. “I will do it, only after society has accepted it,” he posted in August 2024. He also shared mundane images of his daily life, including golf games and his family. But over time, it evolved and started to go viral. First came a series of selfies accompanied by grandiose statements (“Every pioneer or prophet must suffer”). Then, in April of this year, it became particularly outrageous and even troll-like, blasting out bizarre messages (“Good morning bitches. How many embryos have you gene edited today?”). This has left observers unsure what to take seriously.

Last month, in reply to MIT Technology Review’s questions about who was responsible for the account’s transformation into a font of clever memes, He emailed us back: “It’s thanks to Cathy Tie.”

You may not be familiar with Tie, but she’s no stranger to the public spotlight. A former Thiel fellow, she is a partner in the attention-grabbing Los Angeles Project, which promised to create glow-in-the-dark pets. Over the past several weeks, though, the 29-year-old Canadian entrepreneur has started to get more and more attention as the new wife to (and apparent social media mastermind behind) He Jiankui.

On April 15, He announced a new venture, Cathy Medicine, that would take up his mission of editing human embryos to create people resistant to diseases like Alzheimer’s or cancer. Just a few days later, on April 18, He and Tie announced that they had married, posting pictures of themselves in traditional Chinese wedding attire.

But now Tie says that just a month after she married “the most controversial scientist in the world,” her plans to relocate from Los Angeles to Beijing to be with He are in disarray; she says she’s been denied entry to China and the two “may never see each other again,” as He’s passport is being held by Chinese authorities and he can’t leave the country.

Reached by phone in Manila, Tie said authorities in the Philippines had intercepted her during a layover on May 17 and told her she couldn’t board a plane to China, where she was born and where she says she has a valid 10-year visa. She claims they didn’t say why but told her she is likely “on a watch list.” (MIT Technology Review could not independently confirm Tie’s account.) “While I’m concerned about my marriage, I am more concerned about what this means for humanity and the future of science,” Tie posted to her own X account.

A match made in gene-editing heaven

The romance between He and Tie has been playing out in public over the past several weeks through a series of reveals on He’s X feed, which had already started going viral late last year thanks to his style of posting awkward selfies alongside maxims about the untapped potential of heritable gene editing, which involves changing people’s DNA when they’re just embryos in an IVF dish.
“Human [sic] will no longer be controlled by Darwin’s evolution,” He wrote in March. That post, which showed him standing in an empty lab, gazing into the distance, garnered 9.7 million views. And then, a week later, he collected 13.3 million for this one: “Ethics is holding back scientific innovation and progress.”

In April, the feed started to change even more drastically. He’s posts became increasingly provocative, with better English and a unique sensibility reflecting online culture. “Stop asking for cat girls. I’m trying to cure disease,” the account posted on April 15. Two days later, it followed up: “I literally went to prison for this shit.”

This shift coincided with the development of his romance with Tie. Tie told us she has visited China three times this year, including a three-week stint in April when she and He got married after a whirlwind romance. She bought him a silver wedding ring made up of intertwined DNA strands.

The odd behavior on He’s X feed and the sudden marriage have left followers wondering if they are watching a love story, a new kind of business venture, or performance art. It might be all three.

A wedding photo posted by Tie on the Chinese social media platform Rednote shows the couple sitting at a table in a banquet hall with a small number of guests. MIT Technology Review has been able to identify several people who attended: Cai Xilei, He’s criminal attorney; Liu Haiyan, an investor and former business partner of He; and Darren Zhu, an artist and Thiel fellow who is making a “speculative” documentary about the biophysicist that will blur the boundaries of fiction and reality.

In the phone interview, Tie declined to say if she and He are legally married. She also confirmed she celebrated a wedding with someone else in California less than a year ago, in July of 2024, but said they broke up after a few months; she declined to describe the legal status of that marriage as well. In the phone call, Tie emphasized that her relationship with He is genuine: “I wouldn’t marry him if I wasn’t in love with him.”

An up-and-comer

Years before Tie got into a relationship with He, she was getting plenty of attention in her own right. She became a Thiel fellow in 2015, when she was just 18. That program, started by the billionaire Peter Thiel, gave her a grant of $100,000 to drop out of the University of
Meet Cathy Tie, Bride of “China’s Frankenstein” Read more »
Google’s “sufficient context” helps refine RAG systems, reduce LLM hallucinations, and boost AI reliability for business applications. Read More
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

Meet Cathy Tie, Bride of “China’s Frankenstein”

Since the Chinese biophysicist He Jiankui was released from prison in 2022, he has sought to make a scientific comeback and to repair his reputation after a three-year incarceration for illegally creating the world’s first gene-edited children. One area of visible success on his comeback trail has been his X.com account.

Over the past few years, his account has evolved from sharing mundane images of his daily life to spreading outrageous, antagonistic messages. This has left observers unsure what to take seriously. Last month, in reply to MIT Technology Review’s questions about who was responsible for the account’s transformation into a font of clever memes, He emailed us back: “It’s thanks to Cathy Tie.”

Tie is no stranger to the public spotlight. A former Thiel fellow, she is a partner in a project which promised to create glow-in-the-dark pets. Over the past several weeks, though, the Canadian entrepreneur has started to get more and more attention as the new wife to He Jiankui. Read the full story.

—Caiwei Chen & Antonio Regalado

Anthropic’s new hybrid AI model can work on tasks autonomously for hours at a time

Anthropic has announced two new AI models that it claims represent a major step toward making AI agents truly useful. AI agents trained on Claude Opus 4, the company’s most powerful model to date, raise the bar for what such systems are capable of by tackling difficult tasks over extended periods of time and responding more usefully to user instructions, the company says.

They’ve achieved some impressive results: Opus 4 created a guide for the video game Pokémon Red while playing it for more than 24 hours straight. The company’s previously most powerful model was capable of playing for just 45 minutes. Read the full story.

—Rhiannon Williams

The FDA plans to limit access to covid vaccines. Here’s why that’s not all bad.

This week, two new leaders at the US Food and Drug Administration announced plans to limit access to covid vaccines, arguing that there is not much evidence to support the value of annual shots in healthy people. New vaccines will be made available only to the people who are most vulnerable—namely, those over 65 and others with conditions that make them more susceptible to severe disease.

The plans have been met with fear and anger in some quarters. But they weren’t all that shocking to me. In the UK, where I live, covid boosters have been offered only to vulnerable groups for a while now. And the immunologists I spoke to agree: the plans make sense. Read the full story.

—Jessica Hamzelou

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Thousands of Americans are facing extreme weather
But help from the federal government may never arrive. (Slate $)
+ States struck by tornadoes and floods are begging the Trump administration for aid. (Scientific American $)

2 Spain’s grid operator has accused power plants of not doing their job
It claims they failed to control the system’s voltage shortly before the blackout. (FT $)
+ Did solar power cause Spain’s blackout? (MIT Technology Review)
3 Google is facing a DoJ probe over its AI chatbot deal
It will probe whether Google’s deal with Character.AI gives it an unfair advantage. (Bloomberg $)
+ It may not lead to enforcement action, though. (Reuters)

4 DOGE isn’t bad news for everyone
These smaller US government IT contractors say it’s good for business—for now. (WSJ $)
+ It appears that DOGE used a Meta AI model to review staff emails, not Grok. (Wired $)
+ Can AI help DOGE slash government budgets? It’s complex. (MIT Technology Review)

5 Google’s new shopping tool adds breasts to minors
Try it On distorts uploaded photos to clothing models’ proportions, even when they’re children. (The Atlantic $)
+ It feels like this could have easily been avoided. (Axios)
+ An AI companion site is hosting sexually charged conversations with underage celebrity bots. (MIT Technology Review)

6 Apple is reportedly planning a smart glasses product launch
By the end of next year. (Bloomberg $)
+ It’s playing catch-up with Meta and Google, among others. (Engadget)
+ What’s next for smart glasses. (MIT Technology Review)

7 What it’s like to live in Elon Musk’s corner of Texas
Complete with an ugly bust and furious locals. (The Guardian)
+ West Lake Hills residents are pushing back against his giant fences. (Architectural Digest $)

8 Our solar system may contain a hidden ninth planet
A possible dwarf planet has been spotted orbiting beyond Neptune. (New Scientist $)

9 Wikipedia does swag now
How else will you let everyone know you love the open web? (Fast Company $)

10 One of the last good apps is shutting down
Mozilla is closing Pocket, its article-saving app, and the internet is worse for it. (404 Media)
+ Parent company Mozilla said the way people use the web has changed. (The Verge)

Quote of the day

“This is like the Mount Everest of corruption.”

—Senator Jeff Merkley protests outside Donald Trump’s exclusive dinner for the highest-paying customers of his personal cryptocurrency, the New York Times reports.

One more thing

The iPad was meant to revolutionize accessibility. What happened?

On April 3, 2010, Steve Jobs debuted the iPad. What for most people was basically a more convenient form factor was something far more consequential for non-speakers: a life-changing revolution in access to a portable, powerful communication device for just a few hundred dollars.

But a piece of hardware, however impressively designed and engineered, is only as valuable as what a person can do with it. After the iPad’s release, the flood of new, easy-to-use augmentative and alternative communication apps that users were in desperate need of never came. Today, there are only
The Download: meet Cathy Tie, and Anthropic’s new AI models Read more »
If you’ve been into machine learning for a while, you’ve probably noticed that the same books get recommended over and over again.
10 Underrated Books for Mastering Machine Learning Read more »