YouZum

AI

AI, Committee, News, Uncategorized

Google DeepMind Unveils AlphaGenome: A Unified Sequence-to-Function Model Using Hybrid Transformers and U-Nets to Decode the Human Genome

Google DeepMind is expanding its biological toolkit beyond protein folding. After the success of AlphaFold, Google's research team has introduced AlphaGenome, a unified deep learning model designed for sequence-to-function genomics. This represents a major shift in how we model the human genome. AlphaGenome does not treat DNA as simple text. Instead, it processes 1,000,000 base pair windows of raw DNA to predict the functional state of a cell.

Bridging the Scale Gap with Hybrid Architectures

The complexity of the human genome comes from its scale. Most existing models struggle to see the big picture while keeping track of fine details. AlphaGenome solves this by using a hybrid architecture that combines a U-Net backbone with Transformer blocks. This allows the model to capture long-range interactions across 1 megabase of sequence while maintaining base pair resolution. This is like building a system that can read a thousand-page book and still remember the exact location of a single comma.

Mapping Sequences to Functional Biological Modalities

AlphaGenome is a sequence-to-function model, meaning its primary goal is to map DNA sequences directly to biological activities, which are measured as genomic tracks. The research team trained AlphaGenome to predict 11 different genomic modalities, including RNA-seq, CAGE, ATAC-seq, ChIP-seq for various transcription factors, and chromatin contact maps. By predicting all these tracks at once, the model gains a holistic understanding of how DNA regulates the cell.

The Power of Multi-Task Learning in Genomics

The technical advance of AlphaGenome lies in its ability to handle 11 distinct types of data simultaneously. In the past, researchers often built separate models for each task. AlphaGenome uses a multi-task learning approach, which helps the model learn shared features across different biological processes. If the model understands how a protein binds to DNA, it can better predict how that DNA will be expressed as RNA. This unified approach reduces the need for multiple specialized models.

Advancing Variant Effect Prediction via Distillation

One of the most critical applications for AlphaGenome is variant effect prediction, or VEP: determining how a single mutation in DNA affects the body. Mutations can lead to diseases such as cancer or heart disease. AlphaGenome excels at this by using a training method called teacher-student distillation. The research team first created an ensemble of 'all folds' teacher models trained on vast amounts of genomic data, then distilled that knowledge into a single student model.

Compressing Knowledge for Precision Medicine

This distillation process makes the model both faster and more robust. Distillation is a standard way to compress knowledge, but applying it to genomics at this scale is a new milestone. The student model learns to replicate the high-quality predictions of the teacher ensemble, which allows it to identify harmful mutations with high accuracy. The model can even predict how a mutation in a distant regulatory element might impact a gene far away on the DNA strand.

High-Performance Computing with JAX and TPUs

The architecture is implemented in JAX, a high-performance numerical computing library often used for large-scale machine learning at Google. Using JAX allows AlphaGenome to run efficiently on Tensor Processing Units, or TPUs.
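To make the down-sample / attend / up-sample idea concrete, here is a minimal, illustrative JAX sketch of that pattern with a skip connection. The average pooling, single-head attention, toy shapes, and the factor of 128 are placeholder assumptions for illustration only, not AlphaGenome's actual layers or configuration.

```python
import jax
import jax.numpy as jnp

def downsample(x, factor):
    # Average-pool along the sequence axis: (L, C) -> (L // factor, C).
    L, C = x.shape
    return x.reshape(L // factor, factor, C).mean(axis=1)

def upsample(x, factor):
    # Nearest-neighbour upsampling back to base-pair resolution.
    return jnp.repeat(x, factor, axis=0)

def self_attention(x, w_q, w_k, w_v):
    # Single-head scaled dot-product attention over the coarse sequence.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / jnp.sqrt(q.shape[-1])
    return jax.nn.softmax(scores, axis=-1) @ v

def hybrid_forward(x, params, factor=128):
    # U-Net-style skip: the fine-grained input is added back after the
    # attention block has seen the globally pooled view of the window.
    coarse = downsample(x, factor)
    mixed = self_attention(coarse, *params)
    return x + upsample(mixed, factor)

key = jax.random.PRNGKey(0)
L, C = 4096, 16                       # toy window; the real model uses ~1 Mb
x = jax.random.normal(key, (L, C))
params = [jax.random.normal(k, (C, C)) for k in jax.random.split(key, 3)]
y = hybrid_forward(x, params)         # (4096, 16): resolution is preserved
```

The point of the sketch is only the shape flow: the attention layer operates on a sequence 128 times shorter than the input, while the skip path keeps per-base detail.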
The research team used sequence parallelism to handle the massive 1 megabase input windows, which ensures that memory requirements do not explode as the sequence length increases. This shows the importance of selecting the right framework for large-scale biological data (a toy JAX sketch at the end of this article illustrates the idea).

Transfer Learning for Data-Scarce Cell Types

AlphaGenome also addresses the challenge of data scarcity in certain cell types. Because it is a foundation model, it can be fine-tuned for specific tasks. The model learns general biological rules from large public datasets, and these rules can then be applied to rare diseases or specific tissues where data is hard to find. This transfer learning capability is one of the reasons AlphaGenome is so versatile: it can predict how a gene will behave in a brain cell even if it was primarily trained on liver cell data.

Toward a New Era of Personalized Care

In the future, AlphaGenome could lead to a new era of personalized medicine. Doctors could use the model to scan a patient's entire genome in 1,000,000 base pair chunks and identify exactly which variants are likely to cause health issues. This would allow for treatments tailored to a person's specific genetic code. AlphaGenome moves us closer to this reality by providing a clear and accurate map of the functional genome.

Setting the Standard for Biological AI

AlphaGenome also marks a turning point for AI in genomics. It proves that we can model the most complex biological systems using the same principles used in modern AI. By combining U-Net structures with Transformers and using teacher-student distillation, the Google DeepMind team has set a new standard.

Key Takeaways

Hybrid Sequence Architecture: AlphaGenome uses a specialized hybrid design that combines a U-Net backbone with Transformer blocks. This allows the model to process massive windows of 1,000,000 base pairs while maintaining the high resolution needed to identify single mutations.

Multi-Modal Functional Prediction: The model is trained to predict 11 different genomic modalities simultaneously, including RNA-seq, CAGE, and ATAC-seq. By learning these biological tracks together, the system gains a holistic understanding of how DNA regulates cellular activity across different tissues.

Teacher-Student Distillation: To achieve industry-leading accuracy in variant effect prediction (VEP), researchers used a distillation method. They transferred the knowledge from an ensemble of high-performing 'teacher' models into a single, efficient 'student' model that is faster and more robust for identifying disease-causing mutations.

Built for High-Performance Computing: The framework is implemented in JAX and optimized for TPUs. By using sequence parallelism, AlphaGenome can handle the computational load of analyzing megabase-scale DNA sequences without exceeding memory limits, making it a powerful tool for large-scale research.

Check out the Paper and Repo.
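The sequence-parallelism idea referenced earlier can be pictured with a minimal JAX sketch: the long window is split along the sequence axis into one shard per device, and each shard is processed in parallel. This is a toy illustration of the concept, not AlphaGenome's implementation; real sequence parallelism also needs cross-shard communication for attention, which is omitted here.

```python
import jax
import jax.numpy as jnp

n_dev = jax.local_device_count()
L, C = 8192, 4
x = jnp.zeros((L, C))

# Split the sequence axis into one contiguous shard per device.
shards = x.reshape(n_dev, L // n_dev, C)

@jax.pmap
def local_features(seq_shard):
    # Purely local work (for example convolutions) runs independently per
    # shard; attention across shards would additionally need a collective
    # such as jax.lax.all_gather, which this toy sketch leaves out.
    return jnp.tanh(seq_shard)

y = local_features(shards)   # (n_dev, L // n_dev, C)
```

Because each device only ever holds L // n_dev positions, peak memory stays flat as the window length grows with the device count.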

Google DeepMind Unveils AlphaGenome: A Unified Sequence-to-Function Model Using Hybrid Transformers and U-Nets to Decode the Human Genome Read the post »

AI, Committee, News, Uncategorized

TRIM: Token-wise Attention-Derived Saliency for Data-Efficient Instruction Tuning

arXiv:2510.07118v2 Announce Type: replace Abstract: Instruction tuning is essential for aligning large language models (LLMs) to downstream tasks and commonly relies on large, diverse corpora. However, small, high-quality subsets, known as coresets, can deliver comparable or superior results, though curating them remains challenging. Existing methods often rely on coarse, sample-level signals like gradients, an approach that is computationally expensive and overlooks fine-grained features. To address this, we introduce TRIM (Token Relevance via Interpretable Multi-layer Attention), a forward-only, token-centric framework. Instead of using gradients, TRIM operates by matching underlying representational patterns identified via attention-based “fingerprints” from a handful of target samples. Such an approach makes TRIM highly efficient and uniquely sensitive to the structural features that define a task. Coresets selected by our method consistently outperform state-of-the-art baselines by up to 9% on downstream tasks and even surpass the performance of full-data fine-tuning in some settings. By avoiding expensive backward passes, TRIM achieves this at a fraction of the computational cost. These findings establish TRIM as a scalable and efficient alternative for building high-quality instruction-tuning datasets.
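TRIM's actual fingerprint construction is defined in the paper; the snippet below only illustrates, in generic terms, how a forward-only, attention-derived token saliency score and a fingerprint comparison might look. The helpers token_saliency and fingerprint_match, and the choice of mean-received attention, are hypothetical, not the authors' API.

```python
import jax.numpy as jnp

def token_saliency(attn):
    # attn: (num_layers, num_heads, T, T) attention probabilities collected
    # during a single forward pass (no backward pass required).
    # One generic saliency: how much attention each token receives,
    # averaged over layers, heads, and query positions.
    return attn.mean(axis=(0, 1, 2))                 # -> (T,)

def fingerprint_match(saliency_a, saliency_b):
    # Cosine similarity between two token-saliency profiles of equal length.
    a = saliency_a / (jnp.linalg.norm(saliency_a) + 1e-8)
    b = saliency_b / (jnp.linalg.norm(saliency_b) + 1e-8)
    return jnp.dot(a, b)
```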

TRIM: Token-wise Attention-Derived Saliency for Data-Efficient Instruction Tuning Read the post »

AI, Committee, News, Uncategorized

HeuriGym: An Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization

arXiv:2506.07972v2 Announce Type: replace-cross Abstract: While Large Language Models (LLMs) have demonstrated significant advancements in reasoning and agent-based problem-solving, current evaluation methodologies fail to adequately assess their capabilities: existing benchmarks either rely on closed-ended questions prone to saturation and memorization, or subjective comparisons that lack consistency and rigor. In this work, we introduce HeuriGym, an agentic framework designed for evaluating heuristic algorithms generated by LLMs for combinatorial optimization problems, characterized by clearly defined objectives and expansive solution spaces. HeuriGym empowers LLMs to propose heuristics, receive evaluative feedback via code execution, and iteratively refine their solutions. We evaluate nine state-of-the-art models on nine problems across domains such as computer systems, logistics, and biology, exposing persistent limitations in tool use, planning, and adaptive reasoning. To quantify performance, we propose the Quality-Yield Index (QYI), a metric that captures both solution pass rate and quality. Even top models like GPT-o4-mini-high and Gemini-2.5-Pro attain QYI scores of only 0.6, well below the expert baseline of 1. Our open-source benchmark aims to guide the development of LLMs toward more effective and realistic problem-solving in scientific and engineering domains.

HeuriGym: An Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization Read the post »

AI, Committee, News, Uncategorized

LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization

arXiv:2510.13907v2 Announce Type: replace Abstract: Large language models (LLMs) are highly sensitive to prompts, but most automatic prompt optimization (APO) methods assume access to ground-truth references (e.g., labeled validation data) that are costly to obtain. We propose the Prompt Duel Optimizer (PDO), a sample-efficient framework for label-free prompt optimization based on pairwise preference feedback from an LLM judge. PDO casts prompt selection as a dueling-bandit problem and combines (i) Double Thompson Sampling to prioritize informative comparisons under a fixed judge budget, with (ii) top-performer guided mutation to expand the candidate pool while pruning weak prompts. Experiments on BIG-bench Hard (BBH) and MS MARCO show that PDO consistently identifies stronger prompts than label-free baselines, while offering favorable quality–cost trade-offs under constrained comparison budgets.
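The abstract names Double Thompson Sampling over pairwise preferences. The sketch below is a simplified, self-contained variant of that dueling-bandit loop under a fixed judge budget; the judge() stand-in, the Copeland-style first draw, and the final selection rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 5                                   # candidate prompts
wins = np.ones((K, K))                  # Beta(1, 1) priors on pairwise wins
losses = np.ones((K, K))

def judge(i, j):
    # Placeholder for the LLM judge: True if prompt i beats prompt j.
    quality = np.linspace(0.2, 0.8, K)
    return rng.random() < quality[i] / (quality[i] + quality[j])

for _ in range(200):                    # fixed judge budget
    # First Thompson draw: pick the prompt with the best sampled Copeland score.
    theta = rng.beta(wins, losses)
    np.fill_diagonal(theta, 0.5)
    i = int(np.argmax((theta > 0.5).sum(axis=1)))
    # Second Thompson draw: pick the opponent most likely to beat prompt i.
    theta2 = rng.beta(wins[:, i], losses[:, i])
    theta2[i] = -np.inf
    j = int(np.argmax(theta2))
    if judge(i, j):
        wins[i, j] += 1; losses[j, i] += 1
    else:
        wins[j, i] += 1; losses[i, j] += 1

best = int(np.argmax((wins / (wins + losses)).mean(axis=1)))
```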

LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization Read the post »

AI, Committee, News, Uncategorized

Meet the Vitalists: the hardcore longevity enthusiasts who believe death is “wrong”

“Who here believes involuntary death is a good thing?”  Nathan Cheng has been delivering similar versions of this speech over the last couple of years, so I knew what was coming. He was about to try to convince the 80 or so people in the audience that death is bad. And that defeating it should be humanity’s number one priority—quite literally, that it should come above all else in the social and political hierarchy. “If you believe that life is good and there’s inherent moral value to life,” he told them, “it stands to reason that the ultimate logical conclusion here is that we should try to extend lifespan indefinitely.”  Solving aging, he added, is “a problem that has an incredible moral duty for all of us to get involved in.” It was the end of April, and the crowd—with its whoops and yeahs—certainly seemed convinced. They’d gathered at a compound in Berkeley, California, for a three-day event called the Vitalist Bay Summit. It was part of a longer, two-month residency (simply called Vitalist Bay) that hosted various events to explore tools—from drug regulation to cryonics—that might be deployed in the fight against death. One of the main goals, though, was to spread the word of Vitalism, a somewhat radical movement established by Cheng and his colleague Adam Gries a few years ago. No relation to the lowercase vitalism of old, this Vitalism has a foundational philosophy that’s deceptively simple: to acknowledge that death is bad and life is good. The strategy for executing it, though, is far more obviously complicated: to launch a longevity revolution.  Interest in longevity has certainly taken off in recent years, but as the Vitalists see it, it has a branding problem. The term “longevity” has been used to sell supplements with no evidence behind them, “anti-aging” has been used by clinics to sell treatments, and “transhumanism” relates to ideas that go well beyond the scope of defeating death. Not everyone in the broader longevity space shares Vitalists’ commitment to actually making death obsolete. As Gries, a longtime longevity devotee who has largely become the enthusiastic public face of Vitalism, said in an online presentation about the movement in 2024, “We needed some new word.” “Vitalism” became a clean slate: They would start a movement to defeat death, and make that goal the driving force behind the actions of individuals, societies, and nations. Longevity could no longer be a sideshow. For Vitalism to succeed, budgets would need to change. Policy would need to change. Culture would need to change. Consider it longevity for the most hardcore adherents—a sweeping mission to which nothing short of total devotion will do. “The idea is to change the systems and the priorities of society at the highest levels,” Gries said in the presentation. To be clear, the effective anti-aging treatments the Vitalists are after don’t yet exist. But that’s sort of the point: They believe they could exist if Vitalists are able to spread their gospel, influence science, gain followers, get cash, and ultimately reshape government policies and priorities.  
For the past few years, Gries and Cheng have been working to recruit lobbyists, academics, biotech CEOs, high-net-worth individuals, and even politicians into the movement, and they’ve formally established a nonprofit foundation “to accelerate Vitalism.” Today, there’s a growing number of Vitalists (some paying foundation members, others more informal followers, and still others who support the cause but won’t publicly admit as much), and the foundation has started “certifying” qualifying biotech companies as Vitalist organizations. Perhaps most consequentially, Gries, Cheng, and their peers are also getting involved in shaping US state laws that make unproven, experimental treatments more accessible. They hope to be able to do the same at the national level.

VITALISMFOUNDATION.ORG
Vitalism cofounders Nathan Cheng and Adam Gries want to launch a longevity revolution.

All this is helping Vitalists grow in prominence, if not also power. In the past, people who have spoken of living forever or making death “optional” have been dismissed by their academic colleagues. I’ve been covering the broader field of aging science for a decade, and I’ve seen scientists roll their eyes, shrug their shoulders, and turn their backs on people who have talked this way. That’s not the case for the Vitalists. Even the scientists who think that Vitalist ideas of defeating death are wacky, unattainable ones, with the potential to discredit their field, have shown up on stage with Vitalism’s founders, and these serious researchers provide a platform for them at more traditionally academic events. I saw this collegiality firsthand at Vitalist Bay. Faculty members from Harvard, Stanford, and the University of California, Berkeley, all spoke at events. Eric Verdin, the prominent researcher who directs the Buck Institute for Research on Aging in Novato, California, had also planned to speak, although a scheduling clash meant he couldn’t make it in the end. “I have very different ideas in terms of what’s doable,” he told me. “But that’s part of the [longevity] movement—there’s freedom for people to say whatever they want.”

Many other well-respected scientists attended, including representatives of ARPA-H, the US federal agency for health research and breakthrough technologies. And as I left for a different event on longevity in Washington, DC, just after the Vitalist Bay Summit, a sizable group of Vitalist Bay attendees headed that way too, to make the case for longevity to US lawmakers. The Vitalists feel that momentum is building, not just for the science of aging and the development of lifespan-extending therapies, but for the acceptance of their philosophy that defeating death should be humanity’s top concern.

This, of course, sparks some pretty profound questions. What would a society without death look like—and would we even want it? After all, death has become an important part of human culture the world over. And even if Vitalists aren’t destined to realize their lofty goal, their growing influence could still have implications for us all. As they run more labs and companies, and insert themselves into the

Meet the Vitalists: the hardcore longevity enthusiasts who believe death is “wrong” Read the post »

AI, Committee, News, Uncategorized

Propaganda AI: An Analysis of Semantic Divergence in Large Language Models

arXiv:2504.12344v2 Announce Type: replace Abstract: Large language models (LLMs) can exhibit concept-conditioned semantic divergence: common high-level cues (e.g., ideologies, public figures) elicit unusually uniform, stance-like responses that evade token-trigger audits. This behavior falls in a blind spot of current safety evaluations, yet carries major societal stakes, as such concept cues can steer content exposure at scale. We formalize this phenomenon and present RAVEN (Response Anomaly Vigilance), a black-box audit that flags cases where a model is simultaneously highly certain and atypical among peers by coupling semantic entropy over paraphrastic samples with cross-model disagreement. In a controlled LoRA fine-tuning study, we implant a concept-conditioned stance using a small biased corpus, demonstrating feasibility without rare token triggers. Auditing five LLM families across twelve sensitive topics (360 prompts per model) and clustering via bidirectional entailment, RAVEN surfaces recurrent, model-specific divergences in 9/12 topics. Concept-level audits complement token-level defenses and provide a practical early-warning signal for release evaluation and post-deployment monitoring against propaganda-like influence.
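Semantic entropy over paraphrastic samples is the published idea this audit builds on. The toy function below assumes the semantic cluster labels are already available (in the paper they come from bidirectional entailment between sampled answers) and only shows the entropy computation and the intuition behind "certain but atypical"; it is not RAVEN's implementation.

```python
import math
from collections import Counter

def semantic_entropy(cluster_ids):
    # cluster_ids: one semantic-cluster label per sampled response.
    n = len(cluster_ids)
    return -sum((c / n) * math.log(c / n) for c in Counter(cluster_ids).values())

# A model that answers paraphrases of the same prompt almost identically has
# low semantic entropy; combined with high disagreement against peer models,
# that "certain but atypical" pattern is what the audit flags.
print(semantic_entropy([0, 0, 0, 0, 0, 1]))   # low entropy, suspiciously uniform
print(semantic_entropy([0, 1, 2, 0, 3, 1]))   # higher entropy, more typical
```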

Propaganda AI: An Analysis of Semantic Divergence in Large Language Models Read the post »

AI, Committee, News, Uncategorized

Post-LayerNorm Is Back: Stable, Expressive, and Deep

arXiv:2601.19895v1 Announce Type: cross Abstract: Large language model (LLM) scaling is hitting a wall. Widening models yields diminishing returns, and extending context length does not improve fundamental expressivity. In contrast, depth scaling offers theoretically superior expressivity, yet current Transformer architectures struggle to train reliably at extreme depths. We revisit the Post-LayerNorm (Post-LN) formulation, whose instability at scale caused its replacement by Pre-LN in modern LLMs. We show that the central failure mode of Post-LN arises from the ResNet-style residual pathway, which introduces gradient vanishing in deep networks. We present Keel, a Post-LN Transformer that replaces this residual path with a Highway-style connection. This modification preserves the gradient flow through the residual branch, preventing signal vanishing from the top layers to the bottom. Unlike prior methods, Keel enables stable training at extreme depths without requiring specialized initialization or complex optimization tricks. Keel trains robustly at depths exceeding 1000 layers and consistently improves perplexity and depth-scaling characteristics over Pre-LN. These findings indicate that Post-LN, when paired with a Highway-style connection, provides a simple and effective foundation for building deeply scalable LLMs, opening the possibility for future infinite-depth architectures.
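Assuming the standard Highway-network gating formulation, here is a minimal numpy contrast between the two block types the abstract discusses: a ResNet-style residual followed by LayerNorm versus a gated Highway-style path followed by LayerNorm. The layer sizes and the inner transform f are placeholders, and Keel's exact formulation may differ from this sketch.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def post_ln_resnet_block(x, f):
    # Classic Post-LN block: normalize after the ResNet-style addition.
    # The abstract attributes deep-network gradient vanishing to this pathway.
    return layer_norm(x + f(x))

def post_ln_highway_block(x, f, w_gate, b_gate):
    # Highway-style alternative: a learned sigmoid gate interpolates between
    # the transformed branch and the identity path before the LayerNorm.
    gate = 1.0 / (1.0 + np.exp(-(x @ w_gate + b_gate)))
    return layer_norm(gate * f(x) + (1.0 - gate) * x)

d = 8
x = np.random.randn(4, d)
w_gate, b_gate = np.random.randn(d, d) * 0.1, np.zeros(d)
f = lambda h: np.tanh(h @ (np.random.randn(d, d) * 0.1))
y_resnet = post_ln_resnet_block(x, f)
y_highway = post_ln_highway_block(x, f, w_gate, b_gate)
```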

Post-LayerNorm Is Back: Stable, Expressive, and Deep Read the post »

AI, Committee, News, Uncategorized

Complex Logical Instruction Generation

arXiv:2508.09125v2 Announce Type: replace Abstract: Instruction following has catalyzed the recent era of Large Language Models (LLMs) and is the foundational skill underpinning more advanced capabilities such as reasoning and agentic behaviors. As tasks grow more challenging, the logic structures embedded in natural language instructions become increasingly intricate. However, how well LLMs perform on such logic-rich instructions remains under-explored. We propose LogicIFGen and LogicIFEval. LogicIFGen is a scalable, automated framework for generating verifiable instructions from code functions, which can naturally express rich logic such as conditions, loops, and function calls. We further curate a collection of complex code functions and use LogicIFGen to construct LogicIFEval, a benchmark comprising 426 verifiable logic-rich instructions. Our experiments demonstrate that current state-of-the-art LLMs still struggle to correctly follow the instructions in LogicIFEval. Most LLMs can only follow fewer than 60% of the instructions, revealing significant deficiencies in their instruction-following ability. Code and Benchmark: https://github.com/mianzhang/LogicIF

Complex Logical Instruction Generation Read the post »

AI, Committee, News, Uncategorized

Tencent Hunyuan Releases HPC-Ops: A High Performance LLM Inference Operator Library

Tencent Hunyuan has open sourced HPC-Ops, a production-grade operator library for large language model inference on NVIDIA SM90 architecture devices. HPC-Ops focuses on low-level CUDA kernels for core operators such as Attention, Grouped GEMM, and Fused MoE, and exposes them through a compact C and Python API for integration into existing inference stacks.

HPC-Ops runs in large-scale internal services. In those deployments it delivers about a 30 percent queries-per-minute improvement for Tencent-HY models and about a 17 percent improvement for DeepSeek models on mainstream inference cards. These gains are reported at the service level, so they reflect the cumulative effect of faster kernels inside a real inference pipeline.

Scope and design of HPC-Ops

HPC-Ops is a production-grade, high-performance, and easy-to-use operator library for LLM inference, developed by the Tencent Hunyuan AI Infra team. The project does not try to replace serving frameworks. Instead it provides kernels and clean APIs that can be called from systems that already handle scheduling, KV cache management, batching, and transport. The API is designed for seamless use inside popular inference frameworks such as vLLM and SGLang, which means a framework team can swap in HPC-Ops kernels behind their own abstractions without changing the external behavior of their servers. HPC-Ops uses C++ and CUDA with CuTe and CUTLASS as building blocks, and the kernels are written as relatively small examples that also serve as a modern CUDA tutorial.

Kernel performance characteristics

The project publishes maximum observed speedup numbers for each operator relative to established baselines. These are microbenchmarks, and the research team stresses that performance varies across shapes and workloads, but they show the optimization ceiling.

For Attention in bf16, compared with FlashInfer, FlashAttention-2, FlashAttention-3, and TensorRT-LLM, HPC-Ops reports up to 1.33 times speedup in prefill and up to 2.22 times in decode. For Attention in fp8, compared with FlashInfer, FlashAttention-3, and TensorRT-LLM, it reports up to 1.12 times in prefill and up to 2.0 times in decode. For FusedMoE in fp8, compared with TensorRT-LLM and vLLM, the maximum observed speedup is up to 1.49 times in prefill and 1.14 times in decode. For GroupGEMM in fp8, compared with DeepGEMM, the reported gains are up to 1.1 times in prefill and 1.88 times in decode.

These numbers matter because decode is usually the latency bottleneck in autoregressive generation, where batch sizes shrink and memory traffic dominates. The fact that Attention and GroupGEMM show the largest relative gains in decode suggests that HPC-Ops focuses on the part of the pipeline that users notice most.

Supported kernels and precision

The current release groups its functionality into three operator families.

Attention kernels cover both prefill and decode and include support for paged attention. Paged attention is the memory layout that frameworks like vLLM use to place key and value cache blocks in a paged structure, which improves memory reuse for long sequences.

Grouped GEMM is implemented as quantized GroupGEMM with fp8 weights. HPC-Ops supports block-wise and per-tensor scaling, so teams can trade off quantization granularity against parameter storage and calibration cost.

Fused MoE combines mixture-of-experts routing and expert computation in a single quantized operator. It also uses fp8 expert weights and supports block-wise and per-tensor scaling strategies.
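HPC-Ops implements these scaling schemes inside CUDA kernels; the Python sketch below only illustrates the arithmetic difference between per-tensor and block-wise scaling. The e4m3 maximum of 448, the 128-wide tiles, and the omission of the actual fp8 rounding step are simplifying assumptions for illustration.

```python
import numpy as np

FP8_E4M3_MAX = 448.0   # largest finite magnitude representable in fp8 e4m3

def per_tensor_scale(w):
    # One scale for the whole matrix: a single extra float to store, but one
    # large outlier forces coarse resolution everywhere.
    scale = np.abs(w).max() / FP8_E4M3_MAX + 1e-12
    return w / scale, scale

def block_wise_scale(w, block=128):
    # One scale per (block x block) tile: more scales to store, but an
    # outlier only hurts resolution inside its own tile.
    rows, cols = w.shape[0] // block, w.shape[1] // block
    scales = np.empty((rows, cols))
    scaled = np.empty_like(w)
    for bi in range(rows):
        for bj in range(cols):
            r = slice(bi * block, (bi + 1) * block)
            c = slice(bj * block, (bj + 1) * block)
            scales[bi, bj] = np.abs(w[r, c]).max() / FP8_E4M3_MAX + 1e-12
            scaled[r, c] = w[r, c] / scales[bi, bj]
    return scaled, scales

w = np.random.randn(256, 256).astype(np.float32)
w_pt, s_pt = per_tensor_scale(w)       # 1 scale for the tensor
w_bw, s_bw = block_wise_scale(w)       # a (2, 2) grid of per-tile scales
```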
Across these kernels, HPC-Ops provides native support for the bf16 and fp8 data types. That matches the current production trend of moving inference toward lower-precision formats that preserve accuracy while reducing memory bandwidth and improving tensor core utilization.

Key Takeaways

Tencent Hunyuan open-sourced HPC-Ops as a production-grade operator library for LLM inference on NVIDIA SM90 GPUs, including the H20, with C++ and CUDA kernels built on CuTe and CUTLASS.

In production deployments HPC-Ops reports about a 30 percent QPM gain for Tencent-HY models and about a 17 percent QPM gain for DeepSeek models on mainstream inference cards.

Operator microbenchmarks show maximum speedups of up to 2.22 times for bf16 Attention decode, up to 2.0 times for fp8 Attention decode, up to 1.49 times for fp8 FusedMoE prefill, and up to 1.88 times for fp8 GroupGEMM decode compared with strong baselines such as FlashInfer, FlashAttention, TensorRT-LLM, and DeepGEMM.

The library focuses on three operator families: Attention with paged attention support, quantized GroupGEMM with fp8 weights, and quantized Fused MoE with fp8 expert weights, with both block-wise and per-tensor scaling and native bf16 plus fp8 precision support.

HPC-Ops is designed as an operator layer that integrates into existing inference frameworks such as vLLM and SGLang, and the roadmap targets sparse attention for long-context LLMs, extended quantization including 4-bit and 8-bit strategies, and kernels that better overlap computation with multi-GPU communication.

Check out the Repo here. The post Tencent Hunyuan Releases HPC-Ops: A High Performance LLM Inference Operator Library appeared first on MarkTechPost.
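The paged attention layout mentioned above can be pictured with a toy block table. This is a generic illustration of the vLLM-style paging scheme, not HPC-Ops kernel code; key_pool, block_table, and the block size of 16 are illustrative names and values.

```python
import numpy as np

BLOCK = 16                   # tokens per KV-cache block
HEAD_DIM = 8

# A shared physical pool of key blocks: (num_physical_blocks, BLOCK, HEAD_DIM).
key_pool = np.random.randn(32, BLOCK, HEAD_DIM).astype(np.float32)

# One sequence's block table maps logical block positions to physical blocks;
# the physical blocks need not be contiguous, which is what lets the server
# allocate and free KV-cache memory page by page.
block_table = np.array([7, 2, 19])
seq_len = 40                 # only 40 of the 48 mapped token slots are in use

def gather_keys(block_table, seq_len):
    blocks = key_pool[block_table]                    # (3, BLOCK, HEAD_DIM)
    return blocks.reshape(-1, HEAD_DIM)[:seq_len]     # (seq_len, HEAD_DIM)

k = gather_keys(block_table, seq_len)
```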

Tencent Hunyuan Releases HPC-Ops: A High Performance LLM Inference Operator Library Read the post »

AI, Committee, News, Uncategorized

Cross-Examination Framework: A Task-Agnostic Diagnostic for Information Fidelity in Text-to-Text Generation

arXiv:2601.19350v1 Announce Type: new Abstract: Traditional metrics like BLEU and BERTScore fail to capture semantic fidelity in generative text-to-text tasks. We adapt the Cross-Examination Framework (CEF) for a reference-free, multi-dimensional evaluation by treating the source and candidate as independent knowledge bases. CEF generates verifiable questions from each text and performs a cross-examination to derive three interpretable scores: Coverage, Conformity, and Consistency. Validated across translation, summarization, and clinical note-generation, our framework identifies critical errors, such as content omissions and factual contradictions, that standard metrics miss. A key contribution is a systematic robustness analysis to select a stable judge model. Crucially, the strong correlation between our reference-free and with-reference modes validates CEF's reliability without gold references. Furthermore, human expert validation demonstrates that CEF's mismatching questions align more closely with meaning-altering semantic errors than with non-semantic errors, particularly excelling at identifying entity-based and relational distortions.

Cross-Examination Framework: A Task-Agnostic Diagnostic for Information Fidelity in Text-to-Text Generation Read the post »
