AI, Committee, News, Uncategorized

Innovation abounds in device charging

The changes may be less perceptible than in smartphones, tablets, or wearables, but chargers have also been quietly reinvented over the last decade. Once a bulky mix of tangled cables and connectors, slow to perform and prone to overheating, they're now smaller, safer, and faster, thanks to a slew of technological advances. These advances include a switch to gallium nitride (GaN), which has now usurped silicon as the preferred semiconductor, capable of handling higher voltages, faster switching, and more efficient conduction. Multi-port chargers, coupled with an industry-wide shift toward USB-C standardization, mean a single charger can handle multiple devices. And early smart chargers are also trickling onto the market, able to dynamically distribute power and carry out autonomous safety checks. Combined, these advances have repositioned chargers as differentiated standalone devices rather than peripheral accessories.

But manufacturers say there is much further to go if chargers are to accommodate the demands of a connected ecosystem now made up of an estimated 20 billion devices, according to IoT Analytics. "Charging products are undergoing a fundamental identity shift—from accessory to primary component," says Mario Wu, general manager for North America at Anker Innovations. "This is not simply a functional upgrade; it is a repositioning of charging's role within the broader digital lifestyle ecosystem. As charging becomes normalized, the charger is no longer an appendage to your devices—it is the infrastructure underlying every digital experience."

Pillars of performance

If this vision for the future of charging sounds ambitious, there are concrete advancements to back it up. Newly refined semiconductors are already bolstering power and performance, building on the gains delivered by GaN with sweeping changes to systems architecture. To take advantage of the fast-moving technology, Anker launched GaNPrime 2.0, which combines GaN materials with higher-frequency controllers and other power devices, achieving higher power output and lower heat generation, explains Wu. For example, the addition of a multi-level buck converter steps voltage down in multiple smaller increments rather than a binary on/off pattern, creating smoother transitions and reducing stress on components. Combined with Anker's proprietary control algorithm, this simultaneously achieves a more compact product design and reduced energy loss. Changes such as this mean secondary-stage power conversion efficiency now exceeds 99.5%, says Wu, and some products can maintain 140 watts on a single port without falling below optimal levels.

"In traditional setups, you might use three separate chargers—adding up to roughly 210 watts combined," says Wu. "But Anker's Prime 160W Charger with PowerIQ 5.0 can charge those same three devices in roughly the same time because it dynamically reallocates unused capacity instead of locking it in place."

But if GaNPrime 2.0 represents where the architecture stands today, it's by no means the end point. Says Wu, "The next phase of GaN development focuses on higher frequency switching: When paired with breakthroughs in materials and control technology, higher switching frequency enables lower energy loss, improved conversion efficiency, and even more compact designs." Other third-generation semiconductors like silicon carbide (SiC) will also have a role to play. Already deployed at scale in EV inverters and industrial power systems, SiC can deliver what Wu calls "exceptional, high-temperature stability and reliable support for high-voltage, high-power applications." Making SiC circuit designs compact and cost-effective enough for smaller devices has proven a stumbling block until now, but Wu is hopeful that as manufacturing scales up, the material will become "an increasingly credible direction."

Without constraints

Consumers also demand portability in their device chargers. They want chargers without the spatial constraints of wires or surface-to-surface connection—or what's known as imperceptible charging. Wireless charging innovations today go part of the way, but they're based on the principle of magnetic coupling: energy transfer is efficient and stable only when transmitter and receiver coils are aligned, which means devices must be in contact with the charging pad surface. But research into technologies that use magnetic resonance and infrared is moving the dial.

Best known for enabling non-invasive imaging in health care via MRIs, magnetic resonance uses magnetic fields to allow energy transfer over greater distances by tuning transmitter and receiver coils to the same resonant frequency. Transmitters emit an oscillating magnetic field from which the receiver can extract energy even if coils are not perfectly aligned. This "significantly relaxes placement requirements for users, [but currently] the trade-off is reduced transmission efficiency," says Wu.

Infrared wireless charging also represents a meaningful area ripe for exploration, Wu adds. Here, infrared beams deliver energy to photovoltaic receivers on devices, with transmitters installable at any location so long as there is a clear line of sight to the device. This enables wireless power delivery across meters rather than centimetres. He explains, "The core challenge it currently faces is further increasing power levels, and related research is ongoing." Wu says Anker is engaged in technical exchanges with both universities and industry associations to find workarounds for these trade-offs. "Our strategy is to remain at the forefront: continuously tracking, conducting in-depth evaluations, and delivering the next generation of wireless charging technology to users the moment it matures and becomes viable."

Levelling up intelligence

If the power, performance, and portability of chargers have made incremental gains in the last decade, then imbuing devices with smart capabilities is arguably more of a step change in what users might expect. Wu defines smart charging as "the shift from passive power delivery to active, adaptive energy management." In short, if conventional chargers supply fixed current, then smart chargers can read device signals, monitor conditions, and adjust their output accordingly to optimize speed, safety, and efficiency. Some products on the market already hint at these possibilities. Next-generation chargers already deliver dynamic power allocation, for example, recognizing individual device IDs to adapt the distribution of power to multiple devices simultaneously. But in 10 years' time, the goal is to create chargers that go much further, says Wu, capable of autonomously managing energy across multiple connected devices, communicating with users, and adaptively optimizing performance. "Smart charging will feel less like a feature and more

Innovation abounds in device charging Read Post »

AI, Committee, News, Uncategorized

The Download: the hantavirus outbreak and Musk v. Altman week 2

This is today's edition of The Download, our weekday newsletter that provides a daily dose of what's going on in the world of technology.

Here's what you need to know about the cruise ship hantavirus outbreak

Last week, eight passengers aboard a Dutch-flagged cruise ship contracted a type of hantavirus transmitted by rats. Three have since died. But health experts stress that this situation is nothing like the coronavirus outbreak in 2020.

The Andes virus is known to spread between people, and there are no specific antiviral treatments or vaccines. Yet transmission appears to require a specific form of contact that the cruise ship fostered. Here's what you need to know about the outbreak—and why experts believe it can be contained.

—Jessica Hamzelou

This story is part of MIT Technology Review Explains, our series untangling the complex, messy world of technology to help you understand what's coming next. You can read more from the series here.

Musk v. Altman week 2: OpenAI fires back, and Shivon Zilis reveals that Musk tried to poach Sam Altman

In the second week of the landmark trial between Elon Musk and OpenAI, Musk's motivations for bringing the suit came under intense scrutiny. OpenAI president Greg Brockman testified that Musk had pushed for the company to create a for-profit entity, while Shivon Zilis, a former board member, revealed that the Tesla tycoon had sought to lure Sam Altman to a new AI venture.

The courtroom also heard about Brockman's private journals, Musk's abandoned plans for a rival AI lab, and the moment he stormed out of a pivotal meeting carrying a painting of a Tesla. Here's what happened in the second week of the trial—and what's coming next.

—Michelle Kim

Michelle Kim, who's also a lawyer, has been in court on each day of the Musk v. Altman trial. To keep up with her ongoing coverage of their legal showdown, follow @techreview or @michelletomkim on X.

How LLMs could supercharge mass surveillance in the US: 10 Things That Matter in AI Right Now

There are pieces of your life scattered all over the internet, and some of them are for sale. Data brokers collect web searches, financial records, and location data from millions of people and sell them to various clients, including the US government.

While gathering that data has become easier in the smartphone era, making use of it at scale has remained difficult. But researchers are beginning to show that LLM agents can connect anonymized data to real people quickly, cheaply, and at a massive scale. Find out why privacy experts fear AI could remove the friction that has long protected the public from mass surveillance.

—Grace Huckins

"How LLMs could supercharge mass surveillance in the US" is a feature accompanying MIT Technology Review's 10 Things That Matter in AI Right Now, our guide to what's really worth your attention in the busy, buzzy world of AI. Check out the full list of the big ideas, trends, and advances in the field here.

The must-reads

I've combed the internet to find you today's most fun/important/scary/fascinating stories about technology.

1 Meta's embrace of AI is making employees miserable
Workers feel pressured to use the tech while fearing AI-driven layoffs. (NYT $)
+ They're also unhappy about Meta tracking them to train AI. (The Verge)
+ AI's rise has been described as "the most joyless tech revolution ever." (WSJ $)
+ Gen-Z is particularly fed up with it. (NYT $)
+ We've entered the era of AI malaise. (MIT Technology Review)

2 South Korea's military wants robots to fill gaps in troop numbers
It's in talks with Hyundai to bring robotics to the front lines. (Bloomberg $)
+ They could include Boston Dynamics' Spot and a new exoskeleton. (SCMP)
+ South Korea's military has shrunk by 20% over six years. (BBC)

3 OpenAI is being sued over ChatGPT's alleged role in guiding a mass shooter
A lawsuit claims the bot said targeting children would bring more attention. (NBC)
+ Florida's AG has opened a criminal investigation into the case. (NPR)
+ Does AI cause or amplify delusions? (MIT Technology Review)

4 The Canvas hack was the biggest-ever student data privacy disaster
It exposes the risks of centralizing the data of millions of students. (404 Media)
+ While the platform is back online, the hack disrupted university exams. (NPR)
+ The breach is part of a trend of edtech vulnerabilities. (WP $)

5 Alibaba has joined China's "chat to buy" shopping craze
By integrating AI assistant Qwen into its e-commerce platforms. (Reuters $)
+ Companies are betting that chat is the future of online shopping. (SCMP)
+ OpenClaw is a driving force behind the trend. (MIT Technology Review)

6 Cybercrime increasingly comes with threats of physical violence
In the US, physical threats rose more than twofold last year. (BBC)

7 AI's next phase plays into TSMC's hands
Taiwan's chip-making giant stands to gain from the supply squeeze. (WSJ $)

8 Europe is confronting life without American tech
Dependence on Silicon Valley is a growing geopolitical concern. (FT $)

9 The US, UK, and China top new rankings for AI in life sciences
Switzerland and Germany follow in the AI Competitiveness Index. (SCMP)

10 The Pentagon has released a massive trove of declassified UFO files
Including newly declassified documents, images, and footage. (New Scientist)
+ The files contain reports of "orbs," "saucers," and lunar "flashes." (Wired $)
+ Here's how to spot an alien. (MIT Technology Review)

Quote of the day

"There's a real sense where 'safety' isn't a bad word anymore."

—Nathan Calvin, general counsel at Encode, a nonprofit AI advocacy group, tells the Washington Post that Anthropic's Mythos has forced a White House reset on AI safety.

One More Thing

This computer-generated image of Mars was built with laser altimeter data from NASA's Mars Global Surveyor, which operated for nine years in orbit around the planet. NASA/JPL-CALTECH

Inside NASA's bid to make spacecraft as small as possible

As NASA's InSight lander descended to Mars in November 2018, two tiny spacecraft tracked its progress. InSight had touched down, they reported, and survived its treacherous journey. The mission offered a pathway to cheaper space exploration, with small,

The Download: the hantavirus outbreak and Musk v. Altman week 2 Read Post »

AI, Committee, News, Uncategorized

Fostering breakthrough AI innovation through customer-back engineering

Despite years of digitization, organizations capture less than one-third of the value expected from digital investments, according to McKinsey research. That's because most big companies begin with technological capabilities and bolt applications onto them, rather than starting with customer needs and working backward to technology solutions. Failing to prioritize the customer can create fragmented solutions, disjointed customer experiences, and, ultimately, failed transformations.

Organizations that achieve outsized results from AI flip the script. They adopt a "customer-back engineering" mindset, putting customers at the heart of technology transformation. It's a strategy in which products and services are developed with the customer experience first in mind, including the customers' challenges, needs, and expectations. Product development teams then work backward in a nimble and agile way to find the steps necessary to design and build solutions that achieve the desired experience.

"When you get your engineers closer to customers, you get a lot more sideways innovation," says Ashish Agrawal, managing vice president of business cards and payments tech at Capital One. "That leads to a multiplier effect, because engineers can approach a problem from a different dimension that can be unique to the sales or product perspective."

The case for customer-centricity in engineering

Engineers are problem-solvers by nature, says Agrawal. When they hear about challenges customers are experiencing, or how they are using products and services in the real world, they can devise ways to efficiently address customer needs, since they are naturally closer to systems and data than many other teams across the company.

"Fostering a customer-centric culture has a motivational effect on engineers when they actually start seeing how the core changes they're making, or the features they're adding, are having a direct impact on the lives of customers," says Agrawal.

It also takes discipline. Agrawal explains that Capital One has set a goal for every engineer in his organization to establish several touchpoints with customers throughout the year in different forms, including:

- Digital empathy sessions to observe user journeys and identify where users hit friction
- Embedded customer support for periods of time to deepen understanding of servicing needs
- Engineering ride-alongs, in which engineers join customer success, sales, and support staff on calls or on-site visits
- Hackathon competitions to build solutions around real customer problems

The AI opportunities with customer-centricity

"The biggest challenge engineers within large companies face is a lack of direct access to customers," says Agrawal. "This can make it harder for technologists to work with customers to identify problems and innovate solutions."

AI has accelerated the challenges as well as the opportunities. The lifecycle of launching products has become significantly faster. But the good news is that engineers are closer to the data that feeds into AI, so they can more rapidly apply AI-informed data techniques to solve customer problems. Agrawal outlines a recent scenario: in customer servicing, conversations can be instantly summarized to give a customer agent context on the member's original request and remaining action points. Agentic AI can also ask pointed follow-up questions about the interaction, sparing human agents from reading through the entire thread.

"A solution would have been a lot harder in an ecosystem without a lot of high-quality data," says Agrawal. "But when you combine a rich data ecosystem with agentic tools, you move from incremental fixes to high-velocity transformation."

By investing in AI data and tools and focusing on rapid experimentation, Agrawal says, the cycle of deploying solutions can be accelerated. Teams learn that if they meet customer needs and iterate on a wider range of solutions much faster, then the entire innovation cycle speeds up. For example, Capital One used customer insights to build a state-of-the-art, multi-agent AI framework called Chat Concierge to enhance the customer experience for car buyers and dealers. In a single conversation, Chat Concierge can perform tasks like comparing vehicles to help car buyers decide on the best choice and scheduling test drives or appointments with salespeople. Agrawal explains that car buyers can engage with Chat Concierge directly through participating dealer websites, and dealers can access and take over the chat through the Navigator Platform. The AI assistant consists of multiple logical agents that work together to mimic human reasoning, allowing it to provide information and take action based on the customer's requests.

The elements of an AI-first mindset

According to a recent MIT Technology Review Insights survey, 70% of leaders say their firm uses agentic AI to some degree. Roughly half of executives say agentic AI systems are highly capable of improving fraud detection (56%) and security (51%), reducing cost and increasing efficiency (41%), and improving the customer experience (41%). Looking into the future, these outcomes appear even more likely: more than half of the banking executives surveyed expect continued improvements in fraud detection (75%), security (64%), and the customer experience (51%). Agentic AI use cases that show strong potential to transform the customer experience in financial services include responding to customer service requests, adjusting bill payments to align with regular paychecks, and extracting key terms and conditions from financial agreements.

Placing the customer at the center of a transformation requires an AI-first mindset. Companies must shift from simply augmenting an existing product to fundamentally reimagining the problem and the user's needs through the lens of AI's capabilities. A few best practices that Agrawal recommends include:

- Reimagine the core function of AI to solve a user's problem: "The true value isn't in chasing the AI hype; it's in solving meaningful customer problems. By focusing on impact, we ensure that our innovation isn't just fast; it's transformative," says Agrawal.
- Start with high-quality, well-governed data as the foundation: "Data readiness and unified information across systems are the non-negotiable foundations of AI. A clean data layer is what orchestrates the agentic loop—enabling the perception, reasoning, and execution required to solve a customer's problem before they even have to ask," explains Agrawal.
- Rebuild workflows with AI embedded from the start: "People treat models as black boxes, but

Fostering breakthrough AI innovation through customer-back engineering Read Post »

AI, Committee, News, Uncategorized

Implementing advanced AI technologies in finance

In finance departments that have long been defined by precision and control, AI has arrived less as a neatly managed upgrade than as a quiet insurgency. Employees are already using it while leadership races to impose structure, governance, and strategy after the fact. The result is a paradox: one of the most tightly regulated functions in the enterprise is now among the most experimentally transformed.

What's emerging is a layered shift in how work gets done. From variance commentary and fraud detection to contract review and close narrative drafting, AI is embedding itself across workflows, particularly where unstructured data once slowed everything down. Yet, as Glenn Hopper, head of AI and managing director at VAi Consulting, puts it, "the proliferation of AI happened kind of before governance and before a real plan came about." That bottom-up adoption is forcing a recalibration at the top, where executives must now reconcile productivity gains with oversight, risk, and accountability.

Just as critical is reframing AI's role. Ranga Bodla, VP of industry and field marketing at Oracle NetSuite, frames it as "AI as a means to an end, as opposed to AI being the end," underscoring a growing consensus: the technology is most effective when it disappears into existing processes rather than outright replaces them. Embedded systems, seamless integrations, and tools like model context protocol (MCP) are accelerating this shift, making AI an ambient capability. Notably, ease of integration, not cost savings or new features, has become the strongest driver of adoption.

Still, the real constraint may be neither data nor technology, but people. "Talent is the actual root cause," Hopper argues, pointing to a widening gap between domain expertise and AI fluency. Even as concerns about data security and model opacity persist, the more pressing risk may be misunderstanding the tools altogether, or restricting them so tightly that employees look for workarounds beyond leadership control. "The auditability of it, I think, is critical," Bodla notes.

Looking ahead, the trajectory is clear but variable. AI agents capable of executing complex, multi-step tasks are beginning to materialize, while expanding context windows and interoperable systems promise deeper, more persistent intelligence. But the real transformation may be a gradual shift toward systems that bolster judgement, automate routines, and allow finance teams to spend less time reconciling the past and more time shaping what comes next.

This webcast is produced in partnership with Oracle NetSuite. Register to watch the webcast.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review's editorial staff. It was researched, designed, and written by human writers, editors, analysts, and illustrators. This includes the writing of surveys and collection of data for surveys. AI tools that may have been used were limited to secondary production processes that passed thorough human review.

Implementing advanced AI technologies in finance Read Post »

AI, Committee, News, Uncategorized

NVIDIA AI Releases Star Elastic: One Checkpoint that Contains 30B, 23B, and 12B Reasoning Models with Zero-Shot Slicing

Training a family of large language models (LLMs) has always come with a painful multiplier: every model variant in the family—whether 8B, 30B, or 70B—typically requires its own full training run, its own storage, and its own deployment stack. For a dev team running inference at scale, this means multiplying compute costs by the number of model sizes they want to support.

NVIDIA researchers are now proposing a different approach called Star Elastic. Star Elastic is a post-training method that embeds multiple nested submodels—at different parameter budgets—inside a single parent reasoning model, using a single training run. Applied to Nemotron Nano v3 (a hybrid Mamba–Transformer–MoE model with 30B total parameters and 3.6B active parameters), Star Elastic produces 23B (2.8B active) and 12B (2.0B active) nested variants trained with approximately 160B tokens. All three variants live in one checkpoint and can be extracted without any additional fine-tuning.

What Does "Nested" Actually Mean Here?

If you haven't encountered elastic or nested architectures before, the idea is this: instead of training three separate 30B, 23B, and 12B models, you train one model that contains the smaller ones as subsets of itself. The smaller submodels reuse the most important weights from the parent, identified through a process called importance estimation. Star Elastic scores each model component (embedding channels, attention heads, Mamba SSM heads, MoE experts, and FFN channels) by how much it contributes to model accuracy. Components are then ranked and sorted, so smaller-budget submodels always use the highest-ranked contiguous subset of components from the larger model. This property is called nested weight-sharing.

The method supports nesting along multiple axes: the SSM (State Space Model) dimension, embedding channels, attention heads, Mamba heads and head channels, MoE expert count, and FFN intermediate dimension. For MoE layers specifically, Star Elastic uses Router-Weighted Expert Activation Pruning (REAP), which ranks experts by both routing gate values and expert output magnitudes—a more principled signal than naive frequency-based pruning, which ignores how much each expert actually contributes to the layer output.

A Learnable Router, Not a Fixed Compression Recipe

A key distinction from prior compression methods like Minitron is that Star Elastic uses an end-to-end trainable router to determine the nested submodel architectures. The router takes a target budget (e.g., "give me a 2.8B active parameter model") as a one-hot input and outputs differentiable masks that select which components are active at that budget level. These masks are trained jointly with the model through Gumbel-Softmax, which allows gradient flow through discrete architectural decisions. The loss function combines knowledge distillation (KD), where the non-elastified parent model acts as the teacher, with a router loss that penalizes deviation from the target resource budget (parameter count, memory, or latency). This means the router learns to make architecture choices that actually improve accuracy under KD, rather than just minimizing a proxy metric.
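To make the router mechanics concrete, here is a minimal PyTorch sketch of budget-conditioned masking with Gumbel-Softmax keep/drop decisions. Everything in it (class names, tensor shapes, the form of the budget loss) is an illustrative assumption, not the released Star Elastic implementation:

# Minimal PyTorch sketch of budget-conditioned nested masking (illustrative
# only; names, shapes, and loss weights are assumptions, not NVIDIA's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BudgetRouter(nn.Module):
    """Maps a one-hot budget ID to per-component keep-masks via Gumbel-Softmax."""
    def __init__(self, num_budgets: int, num_components: int):
        super().__init__()
        # One (keep, drop) logit pair per component, conditioned on the budget.
        self.logits = nn.Parameter(torch.zeros(num_budgets, num_components, 2))

    def forward(self, budget_id: int, tau: float = 1.0) -> torch.Tensor:
        # Gumbel-Softmax gives differentiable, near-discrete keep decisions.
        probs = F.gumbel_softmax(self.logits[budget_id], tau=tau, hard=True)
        return probs[..., 0]  # keep-channel, shape: (num_components,)

def importance_order(component_scores: torch.Tensor) -> torch.Tensor:
    # Rank components (e.g., FFN channels or experts) by an accuracy-impact
    # score so every budget keeps a contiguous, highest-ranked prefix.
    return torch.argsort(component_scores, descending=True)

# Toy usage: one FFN layer with 8 "channels", three budgets (30B/23B/12B analog).
scores = torch.rand(8)                # stand-in importance estimates
order = importance_order(scores)      # budgets keep a prefix of this ranking
router = BudgetRouter(num_budgets=3, num_components=8)
mask = router(budget_id=2)            # smallest budget
x = torch.randn(4, 8)                 # activations for the 8 channels
y = x * mask                          # masked-out channels contribute nothing

# Training signal (sketch): distill from the full parent, and penalize the
# router when the kept-parameter fraction misses the requested budget.
target_fraction = 0.4                 # e.g., ~12B active out of 30B
budget_loss = (mask.mean() - target_fraction).pow(2)

In the actual method, the learned masks select contiguous, importance-ranked prefixes of components, which is what keeps all three submodels nested inside a single checkpoint.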
Training uses a two-stage curriculum: a short-context phase (sequence length 8,192 tokens) with uniform budget sampling, followed by an extended-context phase (sequence length 49,152 tokens) with non-uniform sampling that prioritizes the full 30B model (p(30B)=0.5, p(23B)=0.3, p(12B)=0.2). The extended-context phase is critical for reasoning performance. The research team's ablations on Nano v2 (explicitly cited as the empirical basis for the same curriculum choice on Nano v3) show gains of up to 19.8% on AIME-2025 for the 6B variant and 4.0 percentage points for the 12B variant from Stage 2 alone, motivating its use here.

Elastic Budget Control: Different Models for Different Reasoning Phases

Existing budget control in reasoning models, including Nemotron Nano v3's own default behavior, works by capping the number of tokens generated during a <think> phase before forcing a final answer. This approach uses the same model throughout. Star Elastic unlocks a different strategy: using different nested submodels for the thinking phase versus the answering phase.

The researchers evaluated four configurations. The optimal one, called ℳS → ℳL (small model for thinking, large model for answering), allocates a cheaper model to generate extended reasoning traces and reserves the full-capacity model for synthesizing the final answer. The 23B → 30B configuration in particular advances the accuracy–latency Pareto frontier, achieving up to 16% higher accuracy and 1.9× lower latency compared to default Nemotron Nano v3 budget control. The intuition: reasoning tokens are high-volume but tolerant of some capacity reduction; the final answer requires higher precision.

Quantization Without Breaking the Nested Structure

A naive approach to deploying a quantized elastic model would be to quantize each variant separately after slicing. That breaks the nested weight-sharing property and requires a separate quantization pass per size. Instead, Star Elastic applies Quantization-Aware Distillation (QAD) directly on the elastic checkpoint, preserving the nested mask hierarchy throughout. For FP8 (E4M3 format), post-training quantization (PTQ) is sufficient, recovering 98.69% of BF16 accuracy on the 30B variant. For NVFP4 (NVIDIA's 4-bit floating-point format), PTQ alone causes a 4.12% average accuracy drop, so a short nested QAD phase (~5B tokens at 48K context) brings recovery back to 97.79% for the 30B variant. In both cases, zero-shot slicing of the 23B and 12B variants from the single quantized checkpoint is preserved.

The memory implications are significant. Storing separate 12B, 23B, and 30B BF16 checkpoints requires 126.1 GB; the single elastic checkpoint requires 58.9 GB. The 30B NVFP4 elastic checkpoint fits in 18.7 GB, enabling the 12B NVFP4 variant to run on an RTX 5080, where every BF16 configuration runs out of memory. On an RTX Pro 6000, the 12B NVFP4 variant reaches 7,426 tokens/s, a 3.4× throughput improvement over the 30B BF16 baseline.

Depth vs. Width: Why Star Elastic Compresses Width

One design choice worth calling out explicitly: the research team compared two compression strategies—removing layers entirely (depth compression) versus reducing internal dimensions like hidden size, expert count, and head count (width compression). With a 15% parameter reduction and 25B tokens of knowledge distillation, width compression recovered 98.1% of baseline performance while depth compression recovered only 95.2%, with noticeable degradation on HumanEval and MMLU-Pro. As a result, Star Elastic prioritizes width-based elasticity for its main results, though
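To make the ℳS → ℳL pattern from the budget-control section concrete, here is a minimal Python sketch of the two-phase control flow. The model objects and generate method are toy stand-ins, not the released Nemotron API; only the small-think/large-answer structure is taken from the article:

# Illustrative control flow for small-think / large-answer decoding (M_S -> M_L).
from dataclasses import dataclass

@dataclass
class ToyModel:
    name: str
    def generate(self, prompt: str, max_new_tokens: int, stop: str = "") -> str:
        # Placeholder decode loop; a real system would run the sliced submodel.
        return f"[{self.name} generated up to {max_new_tokens} tokens]"

def budgeted_generate(small: ToyModel, large: ToyModel,
                      prompt: str, think_budget: int = 2048) -> str:
    # Phase 1: the cheaper nested submodel produces the high-volume,
    # capacity-tolerant reasoning trace.
    trace = small.generate(prompt + "<think>", max_new_tokens=think_budget,
                           stop="</think>")
    # Phase 2: the full-capacity model synthesizes the precision-critical
    # final answer, conditioned on the prompt plus the trace.
    return large.generate(prompt + "<think>" + trace + "</think>",
                          max_new_tokens=512)

print(budgeted_generate(ToyModel("23B-slice"), ToyModel("30B-full"),
                        "Prove the bound."))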

NVIDIA AI Releases Star Elastic: One Checkpoint that Contains 30B, 23B, and 12B Reasoning Models with Zero-Shot Slicing Read Post »

AI, Committee, News, Uncategorized

How to Build a Cost-Aware LLM Routing System with NadirClaw Using Local Prompt Classification and Gemini Model Switching

In this tutorial, we explore NadirClaw as an intelligent routing layer that classifies prompts into simple and complex tiers before sending them to the most suitable model. We start by installing the required packages, setting up an optional Gemini API key, and testing the local classifier through the NadirClaw CLI without making any live LLM calls. We then inspect the centroid vectors that power the routing decision, embed our own prompts, visualize how similarity scores separate simple and complex tasks, and experiment with confidence thresholds.

After understanding the local routing logic, we move into live routing by launching the NadirClaw proxy server, sending OpenAI-compatible requests through it, comparing routed model behavior, and estimating cost savings against an always-Pro baseline.

import subprocess, sys

def _pip(*pkgs):
    subprocess.run([sys.executable, "-m", "pip", "install", "-q", *pkgs], check=True)

_pip("nadirclaw", "openai", "sentence-transformers", "matplotlib",
     "scikit-learn", "pandas", "requests")

import os, json, time, signal, shutil, getpass
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import requests

GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY", "").strip()
if not GEMINI_API_KEY:
    print("Paste your Gemini API key (input hidden), or press Enter to skip:")
    try:
        GEMINI_API_KEY = getpass.getpass(prompt="GEMINI_API_KEY: ").strip()
    except (EOFError, KeyboardInterrupt):
        GEMINI_API_KEY = ""

LIVE_ROUTING = bool(GEMINI_API_KEY)
if LIVE_ROUTING:
    os.environ["GEMINI_API_KEY"] = GEMINI_API_KEY
    print(f"✓ key captured ({len(GEMINI_API_KEY)} chars) — sections 8–11 enabled.")
else:
    print("no key entered — sections 3–7 still run; live routing skipped.")

We install NadirClaw and the supporting Python libraries required for routing, embeddings, plotting, API calls, and data handling. We then import all required modules and securely capture the Gemini API key through the environment or a hidden prompt. We also decide whether live routing sections should run, while still allowing the local classifier sections to work without an API key.
def classify(prompt: str) -> dict:
    r = subprocess.run(
        ["nadirclaw", "classify", "--format", "json", prompt],
        capture_output=True, text=True, timeout=180,
    )
    if r.returncode != 0:
        return {"prompt": prompt, "error": (r.stderr or r.stdout).strip()}
    return json.loads(r.stdout.strip())

prompts = [
    "What is 2+2?",
    'Format this JSON: {"a":1,"b":2}',
    "Read the file at src/main.py",
    "Add a docstring to the foo function",
    "What does this function do?",
    "Refactor the auth module to use dependency injection without breaking existing callers",
    "Design a distributed event-sourced order pipeline that handles 50k req/s with strict ordering",
    "Analyze the tradeoffs between actor-model and CSP-style concurrency for our codebase",
    "Debug why this asyncio.gather call deadlocks under high load and provide a fix",
    "Prove that this scheduling algorithm is optimal step by step and derive the worst-case bound",
]

print("\n[3] Classifying 10 prompts (first call warms the encoder)...")
rows = [classify(p) for p in prompts]
df = pd.DataFrame(rows)
cols = [c for c in ["tier", "score", "confidence", "model", "prompt"] if c in df.columns]
print(df[cols].to_string(index=False))

import nadirclaw

PKG = Path(nadirclaw.__file__).parent
SIMPLE_C = np.load(PKG / "simple_centroid.npy").astype(np.float32).flatten()
COMPLEX_C = np.load(PKG / "complex_centroid.npy").astype(np.float32).flatten()

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

print(f"\n[4] simple_centroid shape={SIMPLE_C.shape} ‖·‖={np.linalg.norm(SIMPLE_C):.3f}")
print(f"    complex_centroid shape={COMPLEX_C.shape} ‖·‖={np.linalg.norm(COMPLEX_C):.3f}")
print(f"    cosine(simple,complex) = {cosine(SIMPLE_C, COMPLEX_C):.4f} "
      "← if this were 1.0 the classifier couldn't distinguish them.")

We define a reusable classify() function that sends prompts to the NadirClaw CLI and returns structured JSON results. We create a mixed set of simple and complex prompts, classify them, and display the routing tier, score, confidence, model, and prompt text in a table. We then load the simple and complex centroid vectors from the NadirClaw package and compare their shapes, norms, and cosine similarity.
from sentence_transformers import SentenceTransformer

print("\n[5] Loading the same encoder NadirClaw uses (all-MiniLM-L6-v2)...")
encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
embs = encoder.encode(prompts, normalize_embeddings=True)

sim_simple = np.array([cosine(e, SIMPLE_C) for e in embs])
sim_complex = np.array([cosine(e, COMPLEX_C) for e in embs])

fig, ax = plt.subplots(figsize=(8.5, 6))
colors = ["tab:blue"] * 5 + ["tab:red"] * 5
ax.scatter(sim_simple, sim_complex, c=colors, s=110, edgecolor="k", linewidth=0.5)
for i, _ in enumerate(prompts):
    ax.annotate(str(i + 1), (sim_simple[i], sim_complex[i]), xytext=(6, 4),
                textcoords="offset points", fontsize=10)
xs = np.linspace(min(sim_simple.min(), sim_complex.min()),
                 max(sim_simple.max(), sim_complex.max()), 50)
ax.plot(xs, xs, "k--", alpha=0.4, label="cos(simple) = cos(complex)")
ax.set_xlabel("cosine similarity to SIMPLE centroid")
ax.set_ylabel("cosine similarity to COMPLEX centroid")
ax.set_title("Routing decision boundary\n(blue = expected simple, red = expected complex)")
ax.legend(loc="lower right")
ax.grid(alpha=0.25)
plt.tight_layout()
plt.savefig("centroid_decision_plot.png", dpi=120)
plt.show()
print("Legend: prompts above the dashed line route to COMPLEX, below to SIMPLE.")

print("\n[6] Prompts sorted by complexity score:")
sdf = df.sort_values("score").reset_index(drop=True)
for _, row in sdf.iterrows():
    bar = "█" * int(round(float(row["score"]) * 30))
    print(f"  score={float(row['score']):.2f} conf={float(row['confidence']):.2f} "
          f"{row['tier']:7s} |{bar:<30s}| {row['prompt'][:55]}")

print("\n[6] Confidence-threshold sweep (low confidence → forced complex):")
print("    NadirClaw default threshold is 0.06.")
for thr in [0.02, 0.06, 0.10, 0.20, 0.30]:
    forced_complex = sum(1 for r in rows if float(r["confidence"]) < thr)
    natural_complex = sum(1 for r in rows if float(r["score"]) >= 0.5)
    print(f"  threshold={thr:.2f} → {forced_complex} prompts force-complex "
          f"(low-confidence), {natural_complex} naturally complex by score")

modifier_demos = [
    ("agentic — text-only marker",
     "You are a coding agent that can execute commands. Now add tests for the new endpoint."),
    ("reasoning — chain-of-thought markers",
     "Step by step, derive the closed form and prove correctness mathematically. "
     "Compare and contrast both approaches."),
    ("vision — would arrive with image_url part (only text shown)",
     "Describe the screenshot."),
]

print("\n[7] Modifier-marker scan:")
for label, p in modifier_demos:
    r = classify(p)
    print(f"  {label}")
    print(f"    prompt='{p[:65]}...'")
    print(f"    tier={r['tier']} score={float(r['score']):.2f} conf={float(r['confidence']):.2f}")
print("    NB: agentic & vision routing also trigger from request shape "
      "(tools=[...], image_url parts) — see live calls below.")

We use the same SentenceTransformer encoder as NadirClaw and embed all tutorial prompts locally. We compare each prompt embedding against the simple and complex centroids, then visualize the routing boundary with a scatter plot. We also sort prompts by complexity score, test confidence thresholds, and inspect routing modifier examples for agentic, reasoning, and vision-style requests.
PORT = 8856
server_proc = None

if LIVE_ROUTING:
    print(f"\n[8] Starting `nadirclaw serve` on :{PORT} (background subprocess)...")
    env = os.environ.copy()
    env.update({
        "GEMINI_API_KEY": GEMINI_API_KEY,
        "NADIRCLAW_SIMPLE_MODEL": "gemini-2.5-flash",
        "NADIRCLAW_COMPLEX_MODEL": "gemini-2.5-pro",
        "NADIRCLAW_PORT": str(PORT),
    })
    server_proc = subprocess.Popen(
        ["nadirclaw", "serve", "--verbose"],
        env=env, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        preexec_fn=os.setsid if hasattr(os, 'setsid') else None,
    )
    ready = False
    for _ in range(60):
        if server_proc.poll() is not None:
            break
        try:
            if requests.get(f"http://localhost:{PORT}/health", timeout=1).ok:
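Once the proxy reports healthy, requests can be sent through it with the standard OpenAI client. The following is a minimal sketch under the assumption that NadirClaw exposes an OpenAI-compatible /v1/chat/completions route on the port above; the base URL path, the "auto" model placeholder, and the routed-model response field are illustrative assumptions, not verified NadirClaw behavior:

# Minimal sketch of sending requests through the proxy with the OpenAI client.
# Assumes an OpenAI-compatible /v1 surface on localhost:PORT; the "model"
# value and response metadata are illustrative, not confirmed NadirClaw fields.
from openai import OpenAI

client = OpenAI(base_url=f"http://localhost:{PORT}/v1", api_key="not-needed")

for prompt in ["What is 2+2?",
               "Design a distributed event-sourced order pipeline that handles 50k req/s"]:
    resp = client.chat.completions.create(
        model="auto",  # assumed placeholder; the router picks flash vs. pro
        messages=[{"role": "user", "content": prompt}],
    )
    print(prompt[:50], "→", resp.model, "→", resp.choices[0].message.content[:80])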

How to Build a Cost-Aware LLM Routing System with NadirClaw Using Local Prompt Classification and Gemini Model Switching Read Post »

AI, Committee, News, Uncategorized

NVIDIA AI Just Released cuda-oxide: An Experimental Rust-to-CUDA Compiler Backend that Compiles SIMT GPU Kernels Directly to PTX

NVIDIA AI researchers recently released cuda-oxide, an experimental compiler that allows developers to write CUDA SIMT (Single Instruction, Multiple Threads) GPU kernels in standard Rust code. The project compiles Rust directly to PTX (Parallel Thread Execution)—the assembly-like intermediate representation that CUDA uses to target NVIDIA GPUs—without requiring domain-specific languages, foreign function interface bindings, or C/C++ code.

What This Changes

Writing GPU kernels today typically means writing C++ and using the CUDA programming model directly, or relying on Python-level abstractions like Triton that generate CUDA under the hood. The Rust GPU ecosystem has had projects attempting to bridge this gap: Rust-GPU targets SPIR-V for Vulkan/graphics compute, rust-cuda uses a rustc codegen backend targeting NVVM IR, CubeCL uses an embedded DSL with a JIT runtime that cross-compiles to CUDA/ROCm/WGPU, and std::offload uses LLVM's implicit offload path.

cuda-oxide occupies a specific position in this space. Its stated design center is "bringing CUDA into Rust": kernel authoring, device intrinsics, the SIMT execution model, and the CUDA programming model expressed natively in safe Rust—closer in spirit to writing a __global__ function in C++ than to writing a generic Rust function that happens to run on a GPU. By contrast, the closest neighbor, rust-cuda, focuses on "bringing Rust to NVIDIA GPUs": Rust ergonomics like async/.await, parts of the standard library running on-device, and a Rust-first programming model that abstracts over CUDA concepts. The NVlabs team notes it has been coordinating with rust-cuda maintainers and considers the two projects complementary.

The Compilation Pipeline

At the core of cuda-oxide is a custom rustc codegen backend—the layer in the Rust compiler responsible for generating machine code. Instead of emitting native CPU code, the rustc-codegen-cuda crate intercepts the compiler at the CodegenBackend::codegen_crate() entry point and runs a separate pipeline for device code:

Rust Source → rustc frontend → rustc_public (Stable MIR) → dialect-mir → mem2reg → dialect-llvm → LLVM IR (.ll) → PTX (.ptx)

Here are the important elements:

- Why rustc_public? The raw internal MIR representation in rustc changes between nightly versions with no stability guarantees. cuda-oxide uses rustc_public—also known as Stable MIR—which is Rust's official versioned, stable API over the compiler's internals. This lets the backend read MIR without breaking on every nightly update.
- What is Pliron? The middle stages use Pliron, an MLIR-like IR framework written entirely in Rust. Choosing Pliron instead of upstream MLIR means the entire compiler builds with cargo: no C++ toolchain, no CMake, no tablegen. cuda-oxide defines three custom Pliron dialects: dialect-mir (modeling Rust MIR semantics: places, projections, rvalues, terminators), dialect-llvm (modeling LLVM IR with textual .ll export), and dialect-nvvm (NVIDIA GPU intrinsics like thread indexing, barriers, and TMA).
- What does llc do? After the dialect-llvm printer serializes the IR into a textual .ll file, the external llc binary (the LLVM static compiler with the NVPTX backend) compiles it to PTX assembly. This is the one stage outside pure Rust.

The resulting .ptx file is written next to the host binary—for example, target/debug/vecadd.ptx—and loaded by the CUDA driver at runtime.
As a developer, you can observe each stage with:

cargo oxide pipeline vecadd

This prints the full trace from Rust MIR through each dialect down to PTX output.

Single-Source Compilation and the Host/Device Split

Host and device code live in the same .rs source file. cargo oxide sets -Z codegen-backend=librustc_codegen_cuda.so, which routes code generation through cuda-oxide's backend. The backend then scans compiled code for monomorphized functions whose names carry the reserved cuda_oxide_kernel_<hash>_<name> prefix—the namespace that the #[kernel] proc macro creates. Functions matching that prefix go through the cuda-oxide pipeline to produce PTX; all other host code is delegated to rustc's standard LLVM backend. The result of a single cargo oxide build is a host binary plus a .ptx file.

cargo oxide run vecadd
cargo oxide debug vecadd --tui   # debug with cuda-gdb

Device code from library dependencies is compiled lazily: the backend reads their Stable MIR from .rlib metadata on demand, only compiling functions a kernel actually calls.

What You Can Write in a Kernel

cuda-oxide supports a meaningful subset of Rust in GPU kernel functions, marked with the #[kernel] attribute macro. This includes:

- Generic functions with monomorphization: fn scale<T: Copy>(…) is compiled to a concrete PTX kernel per type used at the call site.
- Closures with captures: closures passed from the host are scalarized and passed as PTX kernel parameters automatically.
- User-defined structs and enums: standard Rust data structures work inside kernels.
- Pattern matching: match, if let, and related constructs work in device code.
- Full GPU intrinsics: the cuda-device crate provides wrappers for thread indexing, warp operations (shfl_sync, ballot_sync, etc.), shared memory, barriers, TMA (Tensor Memory Accelerator), Thread Block Clusters, and scoped atomics (6 types × 3 scopes × 5 orderings).

One important GPU-specific compiler detail: rustc's JumpThreading MIR optimization, which duplicates function calls into both branches of an if-statement, is disabled for device code in cuda-oxide. On CPUs this is a safe optimization, but on GPUs it breaks barrier semantics: all threads in a block must converge at the same bar.sync instruction, and duplicating it across branches violates that requirement. Additionally, sync primitives are marked convergent in the emitted LLVM IR so that LLVM's optimization passes cannot move or duplicate them across control flow.

How to Use NVlabs cuda-oxide — Step-by-Step Guide

(Pipeline: Rust → Stable MIR → Pliron IR → LLVM IR → PTX; current release v0.1.0.)

Prerequisites: what you need before you start. cuda-oxide has specific version requirements for each dependency, so verify your system meets all of them before installing anything. The project is currently Linux-only (tested on Ubuntu 24.04). You will need:

- Linux (Ubuntu 24.04)
- Rust nightly
- CUDA Toolkit 12.x+
- LLVM 21+
- Clang 21 / libclang-common-21-dev
- Git

Why LLVM 21? Simple kernels may work on LLVM 20, but anything targeting Hopper or Blackwell—TMA, tcgen05, WGMMA—requires llc from LLVM 21

NVIDIA AI Just Released cuda-oxide: An Experimental Rust-to-CUDA Compiler Backend that Compiles SIMT GPU Kernels Directly to PTX Read Post »

AI, Committee, News, Uncategorized

How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery

In this tutorial, we perform an advanced single-cell RNA-seq analysis workflow using Scanpy on the PBMC-3k benchmark dataset. We start by loading the dataset, inspecting its structure, and applying quality control checks to evaluate gene counts, total counts, mitochondrial content, and ribosomal gene signals. We then filter low-quality cells and genes, detect potential doublets with Scrublet, normalize the data, apply log transformation, and identify highly variable genes for downstream analysis.

Also, we score cell-cycle phases, regress out unwanted technical variation, scale the data, and reduce dimensionality using PCA, UMAP, and t-SNE. We cluster cells with the Leiden algorithm, identify marker genes, annotate cell populations using canonical PBMC markers, explore trajectory structure with PAGA and diffusion pseudotime, calculate a custom interferon-response score, and finally save the fully analyzed AnnData object for future use.

!pip install -q scanpy leidenalg python-igraph scrublet

import scanpy as sc
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings

warnings.filterwarnings("ignore")
sc.settings.verbosity = 3
sc.settings.set_figure_params(dpi=80, facecolor="white", figsize=(5, 5))
sc.logging.print_header()

adata = sc.datasets.pbmc3k()
adata.var_names_make_unique()
print(adata)

adata.var["mt"] = adata.var_names.str.startswith("MT-")
adata.var["ribo"] = adata.var_names.str.startswith(("RPS", "RPL"))
sc.pp.calculate_qc_metrics(
    adata, qc_vars=["mt", "ribo"], percent_top=None, log1p=False, inplace=True
)
sc.pl.violin(
    adata,
    ["n_genes_by_counts", "total_counts", "pct_counts_mt"],
    jitter=0.4,
    multi_panel=True,
)
sc.pl.scatter(adata, x="total_counts", y="pct_counts_mt")
sc.pl.scatter(adata, x="total_counts", y="n_genes_by_counts")

We install the required single-cell analysis libraries and import Scanpy, NumPy, Pandas, Matplotlib, and warning controls. We load the PBMC-3k benchmark dataset, make gene names unique, and inspect the AnnData object structure. We then calculate quality control metrics for mitochondrial and ribosomal genes and visualize count-level quality patterns using violin and scatter plots.

sc.pp.filter_cells(adata, min_genes=200)
sc.pp.filter_genes(adata, min_cells=3)
adata = adata[adata.obs.n_genes_by_counts < 2500, :].copy()
adata = adata[adata.obs.pct_counts_mt < 5, :].copy()

sc.pp.scrublet(adata)
print("Predicted doublets:", int(adata.obs["predicted_doublet"].sum()))
adata = adata[~adata.obs["predicted_doublet"], :].copy()

adata.layers["counts"] = adata.X.copy()
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5)
sc.pl.highly_variable_genes(adata)

adata.raw = adata
adata = adata[:, adata.var.highly_variable].copy()

We filter out low-quality cells and rarely detected genes to improve the reliability of the dataset. We use Scrublet through Scanpy to identify predicted doublets and remove them before deeper analysis. We then preserve raw counts, normalize expression values, apply log transformation, select highly variable genes, and keep only the most informative features.
s_genes = ["MCM5","PCNA","TYMS","FEN1","MCM2","MCM4","RRM1","UNG","GINS2",
           "MCM6","CDCA7","DTL","PRIM1","UHRF1","HELLS","RFC2","NASP",
           "RAD51AP1","GMNN","WDR76","SLBP","CCNE2","UBR7","POLD3","MSH2",
           "ATAD2","RAD51","RRM2","CDC45","CDC6","EXO1","TIPIN","DSCC1",
           "BLM","CASP8AP2","USP1","CLSPN","POLA1","CHAF1B","E2F8"]
g2m_genes = ["HMGB2","CDK1","NUSAP1","UBE2C","BIRC5","TPX2","TOP2A","NDC80",
             "CKS2","NUF2","CKS1B","MKI67","TMPO","CENPF","TACC3","SMC4",
             "CCNB2","CKAP2L","CKAP2","AURKB","BUB1","KIF11","ANP32E",
             "TUBB4B","GTSE1","KIF20B","HJURP","CDCA3","CDC20","TTK",
             "CDC25C","KIF2C","RANGAP1","NCAPD2","DLGAP5","CDCA2","CDCA8",
             "ECT2","KIF23","HMMR","AURKA","PSRC1","ANLN","LBR","CKAP5",
             "CENPE","NEK2","G2E3","CBX5","CENPA"]

s_genes = [g for g in s_genes if g in adata.var_names]
g2m_genes = [g for g in g2m_genes if g in adata.var_names]
sc.tl.score_genes_cell_cycle(adata, s_genes=s_genes, g2m_genes=g2m_genes)

sc.pp.regress_out(adata, ["total_counts", "pct_counts_mt"])
sc.pp.scale(adata, max_value=10)

sc.tl.pca(adata, svd_solver="arpack")
sc.pl.pca_variance_ratio(adata, log=True, n_pcs=50)
sc.pp.neighbors(adata, n_neighbors=10, n_pcs=40)
sc.tl.umap(adata)
sc.tl.tsne(adata, n_pcs=40)

We define S-phase and G2/M-phase marker genes and retain only those present in the dataset. We score each cell for cell-cycle phase, regress out unwanted variation from total counts and mitochondrial percentage, and scale the data for downstream modeling. We then run PCA, inspect explained variance, construct the neighborhood graph, and generate UMAP and t-SNE embeddings.

sc.tl.leiden(adata, resolution=0.5, flavor="igraph", n_iterations=2, directed=False)
sc.pl.umap(adata, color="leiden", legend_loc="on data", title="Leiden clusters")
sc.pl.tsne(adata, color="leiden", legend_loc="on data", title="t-SNE clusters")

sc.tl.rank_genes_groups(adata, "leiden", method="wilcoxon")
sc.pl.rank_genes_groups(adata, n_genes=20, sharey=False)

result = adata.uns["rank_genes_groups"]
groups = result["names"].dtype.names
top_df = pd.DataFrame({g: result["names"][g][:10] for g in groups})
print("\nTop 10 markers per cluster:\n", top_df)

marker_genes = {
    "B-cell": ["CD79A", "MS4A1"],
    "CD8 T-cell": ["CD8A", "CD8B"],
    "CD4 T-cell": ["IL7R", "CD4"],
    "NK": ["GNLY", "NKG7"],
    "CD14 Monocyte": ["CD14", "LYZ"],
    "FCGR3A Monocyte": ["FCGR3A", "MS4A7"],
    "Dendritic": ["FCER1A", "CST3"],
    "Megakaryocyte": ["PPBP"],
}
sc.pl.dotplot(adata, marker_genes, groupby="leiden", standard_scale="var")
sc.pl.stacked_violin(adata, marker_genes, groupby="leiden", swap_axes=True)

We apply Leiden clustering to group cells based on the neighborhood graph and visualize the clusters on UMAP and t-SNE plots. We perform differential expression analysis using the Wilcoxon test to identify the top marker genes for each cluster. We then use canonical PBMC marker genes to support cell-type annotation through dot plots and stacked violin plots.
sc.tl.paga(adata, groups="leiden")
sc.pl.paga(adata, color="leiden", threshold=0.1)
sc.tl.umap(adata, init_pos="paga")
sc.pl.umap(adata, color="leiden", legend_loc="on data")

sc.tl.diffmap(adata)
sc.pp.neighbors(adata, n_neighbors=10, use_rep="X_diffmap")
adata.uns["iroot"] = np.flatnonzero(
    adata.obs["leiden"] == adata.obs["leiden"].cat.categories[0]
)[0]
sc.tl.dpt(adata)
sc.pl.umap(adata, color=["leiden", "dpt_pseudotime"], legend_loc="on data")

ifn_genes = ["ISG15", "IFI6", "IFIT1", "IFIT3", "MX1", "OAS1", "STAT1", "IRF7"]
ifn_genes = [g for g in ifn_genes if g in adata.raw.var_names]
sc.tl.score_genes(adata, gene_list=ifn_genes, score_name="IFN_score")
sc.pl.umap(adata, color="IFN_score", cmap="viridis")

adata.write("pbmc3k_analyzed.h5ad")
print("\nAnalysis complete — saved to pbmc3k_analyzed.h5ad")
print(adata)

We run PAGA to model connectivity between Leiden clusters and reinitialize UMAP using the PAGA graph to obtain a clearer trajectory structure. We compute diffusion maps and diffusion pseudotime to explore possible progression patterns across cell states. We also calculate an interferon-response gene-set score, visualize it on UMAP, and save the final analyzed object as an .h5ad file.

In conclusion, we built an end-to-end Scanpy pipeline for single-cell RNA-seq analysis, transforming raw PBMC data into interpretable biological insights. We cleaned and preprocessed the dataset, removed noisy cells and doublets, selected informative genes, and generated meaningful embeddings to visualize cellular structure. We then used Leiden clustering and differential expression analysis to discover marker genes and connect clusters to known immune cell types. By adding PAGA, diffusion pseudotime, and custom gene-set scoring, we extended the workflow beyond basic clustering and showed how Scanpy supports deeper biological interpretation. In the end, we have a saved .h5ad object that contains the processed data, annotations, scores, clusters, and visual analysis results, ready for downstream exploration or reporting.
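Since the pipeline ends by writing pbmc3k_analyzed.h5ad, a minimal follow-up sketch shows how we can reload that object in a later session and confirm the stored clusters, pseudotime, and scores survived the round trip (the file name and column names match those created above):

# Follow-up sketch: reload the saved AnnData object in a fresh session and
# check that the stored annotations are intact.
import scanpy as sc

adata = sc.read_h5ad("pbmc3k_analyzed.h5ad")
print(adata.obs[["leiden", "dpt_pseudotime", "IFN_score"]].head())
sc.pl.umap(adata, color="leiden")  # embeddings persist in adata.obsm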

How to Build a Single-Cell RNA-seq Analysis Pipeline with Scanpy for PBMC Clustering, Annotation, and Trajectory Discovery Read Post »
