YouZum

Uncategorized

AI, Committee, News, Uncategorized

SQLong: Enhanced NL2SQL for Longer Contexts with LLMs

arXiv:2502.16747v2. Abstract: Open-weight large language models (LLMs) have significantly advanced performance in the Natural Language to SQL (NL2SQL) task. However, their effectiveness diminishes when dealing with large database schemas, as the context length increases. To address this limitation, we present SQLong, a novel and efficient data augmentation framework designed to enhance LLM performance in long-context scenarios for the NL2SQL task. SQLong generates augmented datasets by extending existing database schemas with additional synthetic CREATE TABLE commands and corresponding data rows, sampled from diverse schemas in the training data. This approach effectively simulates long-context scenarios during finetuning and evaluation. Through experiments on the Spider and BIRD datasets, we demonstrate that LLMs finetuned with SQLong-augmented data significantly outperform those trained on standard datasets. These results suggest that SQLong is practical to implement and can improve NL2SQL capabilities in real-world settings with complex database schemas.
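The schema-extension idea described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the function names `table_name` and `augment_schema`, the character budget `max_chars`, and the final shuffle are all assumptions made for illustration, and the sketch handles only CREATE TABLE statements, not the accompanying data rows.

```python
import random
import re


def table_name(create_stmt):
    """Extract the table name from a CREATE TABLE statement (hypothetical helper)."""
    match = re.match(
        r'\s*CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?["`\[]?(\w+)',
        create_stmt,
        re.IGNORECASE,
    )
    if match is None:
        raise ValueError("not a CREATE TABLE statement: " + create_stmt)
    return match.group(1).lower()


def augment_schema(target_schema, schema_pool, max_chars, seed=0):
    """Pad a schema with tables sampled from other schemas.

    target_schema: list of CREATE TABLE strings for the real database.
    schema_pool:   list of schemas (each a list of CREATE TABLE strings)
                   to sample distractor tables from.
    max_chars:     assumed context budget, measured in characters.
    """
    rng = random.Random(seed)
    used_names = {table_name(s) for s in target_schema}
    candidates = [s for schema in schema_pool for s in schema]
    rng.shuffle(candidates)

    augmented = list(target_schema)
    length = sum(len(s) for s in augmented)
    for stmt in candidates:
        name = table_name(stmt)
        # Skip tables that would clash with a name already in the schema.
        if name in used_names:
            continue
        # Skip tables that would push the schema past the budget.
        if length + len(stmt) > max_chars:
            continue
        augmented.append(stmt)
        used_names.add(name)
        length += len(stmt)

    # Assumption: mix the original tables in with the sampled ones.
    rng.shuffle(augmented)
    return augmented
```

Calling `augment_schema(schema, pool, max_chars=8000)` returns the original tables plus as many non-clashing sampled tables as fit in the budget; a token-based budget would be a natural substitute for the character count used here.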

Forensic deepfake audio detection using segmental speech features

arXiv:2505.13847v1. Abstract: This study explores the potential of using acoustic features of segmental speech sounds to detect deepfake audio. These features are highly interpretable because of their close relationship with human articulatory processes and are expected to be more difficult for deepfake models to replicate. The results demonstrate that certain segmental features commonly used in forensic voice comparison are effective in identifying deepfakes, whereas some global features provide little value. These findings underscore the need to approach audio deepfake detection differently for forensic voice comparison and offer a new perspective on leveraging segmental features for this purpose.

Multi2: Multi-Agent Test-Time Scalable Framework for Multi-Document Processing

arXiv:2502.20592v3. Abstract: Recent advances in test-time scaling have shown promising results in improving Large Language Model (LLM) performance through strategic computation allocation during inference. While this approach has demonstrated strong improvements in logical and mathematical reasoning tasks, its application to natural language generation (NLG), particularly summarization, remains unexplored. Multi-Document Summarization (MDS), a fundamental task in NLG, presents unique challenges by requiring models to extract and synthesize essential information across multiple lengthy documents. Unlike reasoning tasks, MDS demands a more nuanced approach to prompt design and ensemble methods, as no single “best” prompt can satisfy diverse summarization requirements. We propose a novel framework leveraging test-time scaling for MDS. Our approach employs prompt ensemble techniques to generate multiple candidate summaries using various prompts, then combines them with an aggregator to produce a refined summary. To evaluate our method effectively, we also introduce two new LLM-based metrics: the Consistency-Aware Preference (CAP) score and the LLM Atom-Content-Unit (LLM-ACU) score, which assess summary quality while addressing the positional bias inherent in traditional automatic evaluation. Our extensive experiments demonstrate that this framework significantly enhances summary quality while also revealing the practical scaling boundaries for MDS tasks.
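The prompt-ensemble-plus-aggregator pipeline in the abstract can be sketched roughly as follows. This is an illustration only, not the paper's code: `ensemble_summarize`, the `llm` callable, and both prompt templates are hypothetical, and `llm` stands in for whatever text-generation function is available.

```python
from typing import Callable, List

# Hypothetical aggregation template; the paper's actual prompts are not given here.
DEFAULT_AGGREGATE_PROMPT = (
    "Combine the candidate summaries below into one refined summary.\n"
    "{candidates}"
)


def ensemble_summarize(
    documents: List[str],
    prompts: List[str],
    llm: Callable[[str], str],
    aggregate_prompt: str = DEFAULT_AGGREGATE_PROMPT,
) -> str:
    """Generate one candidate summary per prompt, then aggregate them.

    Each entry in `prompts` is a template containing a `{documents}` slot.
    """
    joined = "\n\n".join(documents)
    # Step 1: one candidate summary per prompt variant.
    candidates = [llm(p.format(documents=joined)) for p in prompts]
    # Step 2: number the candidates and ask the aggregator for a refined summary.
    numbered = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(candidates))
    return llm(aggregate_prompt.format(candidates=numbered))
```

Under this sketch, scaling test-time compute amounts to lengthening `prompts`: each extra prompt costs one more generation call plus a longer aggregation input.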

By putting AI into everything, Google wants to make it invisible 

If you want to know where AI is headed, this year’s Google I/O has you covered. The company’s annual showcase of next-gen products, which kicked off yesterday, has all of the pomp and pizzazz, the sizzle reels and celebrity walk-ons, that you’d expect from a multimillion-dollar marketing event. But it also shows us just how fast this still-experimental technology is being subsumed into a line-up designed to sell phones and subscription tiers. Never before have I seen this thing we call artificial intelligence appear so normal.

Yes, Google’s line-up of consumer-facing products is the slickest on offer. The firm is bundling most of its multimodal models into its Gemini app, including the new Imagen 4 image generator and the new Veo 3 video generator. That means you can now access Google’s full range of generative models via a single chatbot. It also announced Gemini Live, a feature that lets you share your phone’s screen or your camera’s view with the chatbot and ask it about what it can see.

Those features were previously only seen in demos of Project Astra, a “universal AI assistant” that Google DeepMind is working on. Now, Google is inching towards putting Project Astra into the hands of anyone with a smartphone.

Google is also rolling out AI Mode, an LLM-powered front-end to search. This can now pull in personal information from Gmail or Google Docs to tailor searches to users. It will include Deep Search, which can break a query down into hundreds of individual searches and then summarize the results; a version of Project Mariner, Google DeepMind’s browser-using agent; and Search Live, which lets you hold up your camera and ask it what it sees.

This is the new frontier. It’s no longer about who has the most powerful models, but who can spin them into the best products. OpenAI’s ChatGPT includes many similar features to Gemini. But with its existing ecosystem of consumer services and billions of existing users, Google has a clear advantage. Power users wanting access to the latest versions of everything on display can now sign up for Google AI Ultra for $250 a month.

When OpenAI released ChatGPT in late 2022, Google was caught on the back foot and had to jump into a higher gear to catch up. With this year’s product line-up, it feels like Google has stuck its landing.

On a preview call, Google’s CEO Sundar Pichai claimed that AI Overviews, a precursor to AI Mode that provides LLM-generated summaries of search results, had turned out to be popular with hundreds of millions of users. He speculated that many of them may not even know (or care) whether or not they were using AI; it was just a cool new feature.

Google I/O gives a broader glimpse of that future, one where AI is invisible. “More intelligence is available, for everyone, everywhere,” Pichai told his audience. I think we are expected to marvel. But by putting AI in everything, Google is turning AI into a technology we won’t notice and may not even bother to name.

Sampling Without Data is Now Scalable: Meta AI Releases Adjoint Sampling for Reward-Driven Generative Modeling

Data Scarcity in Generative Modeling

Generative models traditionally rely on large, high-quality datasets to produce samples that replicate the underlying data distribution. However, in fields like molecular modeling or physics-based inference, acquiring such data can be computationally infeasible or even impossible. Instead of labeled data, only a scalar reward, typically derived from a complex energy function, is available to judge the quality of generated samples. This presents a significant challenge: how can one train generative models effectively without direct supervision from data?

Meta AI Introduces Adjoint Sampling, a New Learning Algorithm Based on Scalar Rewards

Meta AI tackles this challenge with Adjoint Sampling, a novel learning algorithm designed for training generative models using only scalar reward signals. Built on the theoretical framework of stochastic optimal control (SOC), Adjoint Sampling reframes the training process as an optimization task over a controlled diffusion process. Unlike standard generative models, it does not require explicit data. Instead, it learns to generate high-quality samples by iteratively refining them using a reward function, often derived from physical or chemical energy models.

Adjoint Sampling excels in scenarios where only an unnormalized energy function is accessible. It produces samples that align with the target distribution defined by this energy, bypassing the need for corrective methods like importance sampling or MCMC, which are computationally intensive.

Source: https://arxiv.org/abs/2504.11713

Technical Details

The foundation of Adjoint Sampling is a stochastic differential equation (SDE) that models how sample trajectories evolve. The algorithm learns a control drift u(x, t) such that the final state of these trajectories approximates a desired distribution (e.g., Boltzmann).
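To make the idea of a controlled SDE concrete, here is a minimal one-dimensional sketch that simulates a trajectory with the Euler-Maruyama scheme. It assumes the simple form dX = u(X, t) dt + sigma dW with constant noise scale `sigma` on the time interval [0, 1]; the exact SDE used by Adjoint Sampling is not given in the article, and the hand-written `toy_control` below only stands in for the learned drift, which would typically be a neural network.

```python
import math
import random


def simulate(control, x0, sigma, n_steps, rng):
    """Euler-Maruyama simulation of dX = u(X, t) dt + sigma dW on t in [0, 1].

    control: function (x, t) -> drift value, a stand-in for a learned u(x, t).
    Returns the terminal state X_1.
    """
    dt = 1.0 / n_steps
    x = x0
    for k in range(n_steps):
        t = k * dt
        noise = rng.gauss(0.0, 1.0)
        # Deterministic push from the control plus a Brownian increment.
        x += control(x, t) * dt + sigma * math.sqrt(dt) * noise
    return x


def toy_control(x, t):
    """Hypothetical drift that pulls samples toward x = 2."""
    return 4.0 * (2.0 - x)


if __name__ == "__main__":
    rng = random.Random(0)
    finals = [simulate(toy_control, 0.0, 0.5, 200, rng) for _ in range(1000)]
    print("mean terminal state:", sum(finals) / len(finals))
```

Training would adjust the control so that the distribution of terminal states matches the target defined by the energy function; this sketch covers only the forward simulation.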
A key innovation is its use of Reciprocal Adjoint Matching (RAM), a loss function that enables gradient-based updates using only the initial and final states of sample trajectories. This sidesteps the need to backpropagate through the entire diffusion path, greatly improving computational efficiency. By sampling from a known base process and conditioning on terminal states, Adjoint Sampling constructs a replay buffer of samples and gradients, allowing multiple optimization steps per sample. This on-policy training method provides scalability unmatched by previous approaches, making it suitable for high-dimensional problems like molecular conformer generation.

Moreover, Adjoint Sampling supports geometric symmetries and periodic boundary conditions, enabling models to respect molecular invariances like rotation, translation, and torsion. These features are crucial for physically meaningful generative tasks in chemistry and physics.

Performance Insights and Benchmark Results

Adjoint Sampling achieves state-of-the-art results in both synthetic and real-world tasks. On synthetic benchmarks such as the Double-Well (DW-4) and Lennard-Jones (LJ-13 and LJ-55) potentials, it significantly outperforms baselines like DDS and PIS, especially in the number of energy evaluations required. For example, where DDS and PIS require 1000 evaluations per gradient update, Adjoint Sampling uses only three, with similar or better performance in Wasserstein distance and effective sample size (ESS).

In a practical setting, the algorithm was evaluated on large-scale molecular conformer generation using the eSEN energy model trained on the SPICE-MACE-OFF dataset. Adjoint Sampling, especially its Cartesian variant with pretraining, achieved up to 96.4% recall and 0.60 Å mean RMSD, surpassing RDKit ETKDG, a widely used chemistry-based baseline, across all metrics. The method generalizes well to the GEOM-DRUGS dataset, showing substantial improvements in recall while maintaining competitive precision.
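For readers unfamiliar with the effective sample size (ESS) metric mentioned above, the sketch below computes the common normalized (Kish) form from log importance weights. The article does not say which ESS definition the paper uses, so treat this as one standard choice rather than the benchmark's exact metric; `normalized_ess` is a hypothetical helper name.

```python
import math


def normalized_ess(log_weights):
    """Normalized Kish ESS: (sum w)^2 / (N * sum w^2), a value in (0, 1].

    Takes log-weights and subtracts the maximum before exponentiating
    so that very large or very small weights do not overflow or underflow
    all at once.
    """
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return (total * total) / (len(weights) * total_sq)
```

A value near 1 means the samples are weighted almost equally, so the sampler matches the target well; a value near 1/N means a single sample dominates.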
The algorithm’s ability to explore the configuration space broadly, aided by its stochastic initialization and reward-based learning, results in greater conformer diversity, which is critical for drug discovery and molecular design.

Conclusion: A Scalable Path Forward for Reward-Driven Generative Models

Adjoint Sampling represents a major step forward in generative modeling without data. By leveraging scalar reward signals and an efficient on-policy training method grounded in stochastic control, it enables scalable training of diffusion-based samplers with minimal energy evaluations. Its integration of geometric symmetries and its ability to generalize across diverse molecular structures position it as a foundational tool in computational chemistry and beyond.

The post Sampling Without Data is Now Scalable: Meta AI Releases Adjoint Sampling for Reward-Driven Generative Modeling appeared first on MarkTechPost.

Google just leapfrogged every competitor with mind-blowing AI that can think deeper, shop smarter, and create videos with dialogue

Google unveiled major AI advancements at I/O 2025, including Gemini 2.5 with Deep Think, AI Mode in Search, Veo 3 for video with audio, and a $249 Ultra plan aimed at power users and enterprises.
