YouZum

News

AI, Committee, News, Uncategorized

A Simple Ensemble Strategy for LLM Inference: Towards More Stable Text Classification

arXiv:2504.18884v2 Announce Type: replace Abstract: With the advance of large language models (LLMs), LLMs have been utilized for various tasks. However, the variability and reproducibility of results across LLM trials have been largely overlooked in the existing literature, even though human annotation routinely uses majority voting to resolve disagreements among annotators. This study therefore introduces a straightforward ensemble strategy to sentiment analysis using LLMs. The results demonstrate that an ensemble of multiple inferences from medium-sized LLMs produces more robust and accurate results than a single attempt with a large model, reducing RMSE by 18.6%.
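As a rough illustration of the ensemble idea in this abstract, the sketch below aggregates repeated runs of the same prompt. The abstract does not say how the runs are combined, so both aggregators here (a majority vote for categorical labels and a median for numeric scores) are assumptions, and the run outputs are made-up placeholders rather than results from the paper.

```python
from collections import Counter
from statistics import median


def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_score(scores):
    """Aggregate numeric sentiment scores from repeated runs with the median."""
    return median(scores)


# Hypothetical outputs from five repeated runs of one prompt (not real data).
label_runs = ["positive", "negative", "positive", "positive", "neutral"]
score_runs = [4, 5, 4, 2, 4]

final_label = majority_vote(label_runs)
final_score = ensemble_score(score_runs)
print(final_label, final_score)
```

A single run that happens to return "negative" or a score of 2 no longer decides the outcome, which is the stabilizing effect the abstract attributes to ensembling.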

AI, Committee, News, Uncategorized

JTCSE: Joint Tensor-Modulus Constraints and Cross-Attention for Unsupervised Contrastive Learning of Sentence Embeddings

arXiv:2505.02366v2 Announce Type: replace Abstract: Unsupervised contrastive learning has become a hot research topic in natural language processing. Existing works usually aim at constraining the orientation distribution of the representations of positive and negative samples in the high-dimensional semantic space in contrastive learning, but the semantic representation tensor possesses both modulus and orientation features, and the existing works ignore the modulus feature of the representations and cause insufficient contrastive learning. Therefore, we first propose a training objective that is designed to impose modulus constraints on the semantic representation tensor, to strengthen the alignment between positive samples in contrastive learning. Then, the BERT-like model suffers from the phenomenon of sinking attention, leading to a lack of attention to CLS tokens that aggregate semantic information. In response, we propose a cross-attention structure among the twin-tower ensemble models to enhance the model’s attention to CLS token and optimize the quality of CLS Pooling. Combining the above two motivations, we propose a new Joint Tensor representation modulus constraint and Cross-attention unsupervised contrastive learning Sentence Embedding representation framework, JTCSE, which we evaluate in seven semantic text similarity computation tasks, and the experimental results show that JTCSE’s twin-tower ensemble model and single-tower distillation model outperform the other baselines and become the current SOTA. In addition, we have conducted an extensive zero-shot downstream task evaluation, which shows that JTCSE outperforms other baselines overall on more than 130 tasks.

AI, Committee, News, Uncategorized

Bringing legal knowledge to the public by constructing a legal question bank using large-scale pre-trained language model

arXiv:2505.04132v1 Announce Type: new Abstract: Access to legal information is fundamental to access to justice. Yet accessibility refers not only to making legal documents available to the public, but also to rendering legal information comprehensible to them. A vexing problem in bringing legal information to the public is how to turn formal legal documents such as legislation and judgments, which are often highly technical, into easily navigable and comprehensible knowledge for those without legal education. In this study, we formulate a three-step approach for bringing legal knowledge to laypersons, tackling the issues of navigability and comprehensibility. First, we translate selected sections of the law into snippets (called CLIC-pages), each a short article that explains a particular technical legal concept in layperson’s terms. Second, we construct a Legal Question Bank (LQB), which is a collection of legal questions whose answers can be found in the CLIC-pages. Third, we design an interactive CLIC Recommender (CRec). Given a user’s verbal description of a legal situation that requires a legal solution, CRec interprets the user’s input, shortlists the questions from the question bank that are most likely relevant to that situation, and recommends the corresponding CLIC-pages where relevant legal knowledge can be found. In this paper we focus on the technical aspects of creating an LQB. We show how large-scale pre-trained language models, such as GPT-3, can be used to generate legal questions. We compare machine-generated questions (MGQs) against human-composed questions (HCQs) and find that MGQs are more scalable, cost-effective, and diversified, while HCQs are more precise. We also show a prototype of CRec and illustrate through an example how our three-step approach effectively brings relevant legal knowledge to the public.

AI, Committee, News, Uncategorized

Sentiment-Aware Recommendation Systems in E-Commerce: A Review from a Natural Language Processing Perspective

arXiv:2505.03828v1 Announce Type: cross Abstract: E-commerce platforms generate vast volumes of user feedback, such as star ratings, written reviews, and comments. However, most recommendation engines rely primarily on numerical scores, often overlooking the nuanced opinions embedded in free text. This paper comprehensively reviews sentiment-aware recommendation systems from a natural language processing perspective, covering advancements from 2023 to early 2025. It highlights the benefits of integrating sentiment analysis into e-commerce recommenders to enhance prediction accuracy and explainability through detailed opinion extraction. Our survey categorizes recent work into four main approaches: deep learning classifiers that combine sentiment embeddings with user-item interactions, transformer-based methods for nuanced feature extraction, graph neural networks that propagate sentiment signals, and conversational recommenders that adapt in real time to user feedback. We summarize model architectures and demonstrate how sentiment flows through recommendation pipelines, impacting dialogue-based suggestions. Key challenges include handling noisy or sarcastic text, dynamic user preferences, and bias mitigation. Finally, we outline research gaps and provide a roadmap for developing smarter, fairer, and more user-centric recommendation tools.

AI, Committee, News, Uncategorized

On the generalization of language models from in-context learning and finetuning: a controlled study

arXiv:2505.00661v2 Announce Type: replace Abstract: Large language models exhibit exciting capabilities, yet can show surprisingly narrow generalization from finetuning. E.g. they can fail to generalize to simple reversals of relations they are trained on, or fail to make simple logical deductions based on trained information. These failures to generalize from fine-tuning can hinder practical application of these models. On the other hand, language models’ in-context learning shows different inductive biases, and can generalize better in some cases. Here, we explore these differences in generalization between in-context- and fine-tuning-based learning. To do so, we constructed several novel datasets to evaluate and improve models’ abilities to generalize from finetuning data. The datasets are designed to create clean tests of generalization, by isolating the knowledge in the dataset from that in pretraining. We expose pretrained large models to controlled subsets of the information in these datasets — either in context, or through fine-tuning — and evaluate their performance on test sets that require various types of generalization. We find overall that in data-matched settings, in-context learning can generalize more flexibly than fine-tuning (though we also find some qualifications of prior findings, such as cases when fine-tuning can generalize to reversals embedded in a larger structure of knowledge). We build on these findings to propose a method to enable improved generalization from fine-tuning: adding in-context inferences to finetuning data. We show that this method improves generalization across various splits of our datasets and other benchmarks. Our results have implications for understanding the inductive biases of different modes of learning in language models, and practically improving their performance.

AI, Committee, News, Uncategorized

Quiet Feature Learning in Algorithmic Tasks

arXiv:2505.03997v1 Announce Type: cross Abstract: We train Transformer-based language models on ten foundational algorithmic tasks and observe pronounced phase transitions in their loss curves that deviate from established power-law scaling trends. Over large ranges of compute, the validation loss barely improves, then abruptly decreases. Probing the models’ internal representations reveals the learning of quiet features during the stagnant phase, followed by sudden acquisition of loud features that coincide with the sharp drop in loss. Our ablation experiments show that disrupting a single learned feature can dramatically degrade performance, providing evidence of their causal role in task performance. These findings challenge the prevailing assumption that next-token predictive loss reliably tracks incremental progress; instead, key internal features may be developing below the surface until they coalesce, triggering a rapid performance gain.

AI, Committee, News, Uncategorized

Estimating LLM Uncertainty with Logits

arXiv:2502.00290v4 Announce Type: replace Abstract: Over the past few years, Large Language Models (LLMs) have developed rapidly and are widely applied in various domains. However, LLMs face the issue of hallucinations, generating responses that may be unreliable when the models lack relevant knowledge. To be aware of potential hallucinations, uncertainty estimation methods have been introduced, and most of them have confirmed that reliability lies in critical tokens. However, probability-based methods perform poorly in identifying token reliability, limiting their practical utility. In this paper, we reveal that the probability-based method fails to estimate token reliability due to the loss of evidence strength information which is accumulated in the training stage. Therefore, we present Logits-induced token uncertainty (LogTokU), a framework for estimating decoupled token uncertainty in LLMs, enabling real-time uncertainty estimation without requiring multiple sampling processes. We employ evidence modeling to implement LogTokU and use the estimated uncertainty to guide downstream tasks. The experimental results demonstrate that LogTokU has significant effectiveness and promise.
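One standard way to see how normalized probabilities can discard magnitude information is that softmax is invariant to adding a constant to every logit. The sketch below is only that textbook property, not the LogTokU method itself; the two logit vectors are invented for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two hypothetical next-token logit vectors (made up for this example).
# They differ only by a constant shift of +10 on every entry.
weak_evidence = [1.0, 0.0, -1.0]
strong_evidence = [11.0, 10.0, 9.0]

p_weak = softmax(weak_evidence)
p_strong = softmax(strong_evidence)

# The probabilities are identical even though the raw logits are much larger
# in the second case, so a probability-only method cannot tell them apart.
print(p_weak == p_strong)
print(max(weak_evidence), max(strong_evidence))
```

Any uncertainty estimate computed only from `p_weak` or `p_strong` gives the same answer for both inputs, whereas an estimate that reads the logits directly can still distinguish them.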

AI, Committee, News, Uncategorized

Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding

arXiv:2505.03788v1 Announce Type: new Abstract: We introduce a novel approach for calibrating uncertainty quantification (UQ) tailored for multi-modal large language models (LLMs). Existing state-of-the-art UQ methods rely on consistency among multiple responses generated by the LLM on an input query under diverse settings. However, these approaches often report higher confidence in scenarios where the LLM is consistently incorrect. This leads to a poorly calibrated confidence with respect to accuracy. To address this, we leverage cross-modal consistency in addition to self-consistency to improve the calibration of the multi-modal models. Specifically, we ground the textual responses to the visual inputs. The confidence from the grounding model is used to calibrate the overall confidence. Given that using a grounding model adds its own uncertainty in the pipeline, we apply temperature scaling – a widely accepted parametric calibration technique – to calibrate the grounding model’s confidence in the accuracy of generated responses. We evaluate the proposed approach across multiple multi-modal tasks, such as medical question answering (Slake) and visual question answering (VQAv2), considering multi-modal models such as LLaVA-Med and LLaVA. The experiments demonstrate that the proposed framework achieves significantly improved calibration on both tasks.
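Temperature scaling, mentioned in this abstract, divides a model's logits by a single scalar T fitted on held-out data. The sketch below shows the general technique under simplifying assumptions: a coarse grid search stands in for a proper optimizer, and the validation logits and labels are toy values, not data from the paper or its grounding model.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits divided by a temperature (T > 1 softens, T < 1 sharpens)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)


def fit_temperature(logit_rows, labels):
    """Pick the temperature on a 0.5..5.0 grid that minimizes validation NLL."""
    grid = [0.5 + 0.1 * i for i in range(46)]
    return min(grid, key=lambda t: nll(logit_rows, labels, t))


# Toy held-out set: the model is very confident every time but wrong once.
val_logits = [[4.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 0.0]]
val_labels = [0, 0, 1, 1]

t = fit_temperature(val_logits, val_labels)
print(t, softmax([4.0, 0.0])[0], softmax([4.0, 0.0], t)[0])
```

Because the toy model is overconfident, the fitted temperature comes out above 1 and the calibrated confidence for the same logits is lower than the raw one.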

AI, Committee, News, Uncategorized

The Power of Stories: Narrative Priming Shapes How LLM Agents Collaborate and Compete

arXiv:2505.03961v1 Announce Type: cross Abstract: According to Yuval Noah Harari, large-scale human cooperation is driven by shared narratives that encode common beliefs and values. This study explores whether such narratives can similarly nudge LLM agents toward collaboration. We use a finitely repeated public goods game in which LLM agents choose either cooperative or egoistic spending strategies. We prime agents with stories highlighting teamwork to different degrees and test how this influences negotiation outcomes. Our experiments explore four questions: (1) How do narratives influence negotiation behavior? (2) What differs when agents share the same story versus different ones? (3) What happens when the number of agents grows? (4) Are agents resilient against self-serving negotiators? We find that story-based priming significantly affects negotiation strategies and success rates. Common stories improve collaboration, benefiting each agent. By contrast, priming agents with different stories reverses this effect, and those agents primed toward self-interest prevail. We hypothesize that these results carry implications for multi-agent system design and AI alignment.
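For readers unfamiliar with the public goods game, the sketch below computes one round of payoffs using the textbook formulation: each agent keeps whatever it does not contribute, and the pooled contributions are multiplied and split equally. The endowment of 10 and multiplier of 1.5 are assumed values chosen for illustration; the abstract does not state the paper's parameters.

```python
def public_goods_payoffs(contributions, endowment=10.0, multiplier=1.5):
    """Payoffs for one round of a standard public goods game.

    Each agent keeps (endowment - contribution) and receives an equal share
    of the multiplied common pot. Parameter values are assumptions.
    """
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]


# Four agents all cooperate fully.
all_cooperate = public_goods_payoffs([10, 10, 10, 10])

# Three cooperate; the fourth keeps its endowment (the egoistic strategy).
one_defects = public_goods_payoffs([10, 10, 10, 0])

print(all_cooperate)
print(one_defects)
```

With these numbers full cooperation leaves everyone better off than keeping their endowment, yet a lone free rider earns more than the cooperators around it, which is the tension the narrative priming is meant to influence.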

AI, Committee, News, Uncategorized

Bryan Johnson wants to start a new religion in which “the body is God”

Bryan Johnson is on a mission to not die. The 47-year-old multimillionaire has already applied his slogan “Don’t Die” to events, merchandise, and a Netflix documentary. Now he’s founding a Don’t Die religion.

Johnson, who famously spends millions of dollars on scans, tests, supplements, and a lifestyle routine designed to slow or reverse the aging process, has enjoyed extensive media coverage, and a huge social media following. For many people, he has become the face of the longevity field.

I sat down with Johnson at an event for people interested in longevity in Berkeley, California, in late April. We spoke on the sidelines after lunch (conference plastic-lidded container meal for me; what seemed to be a plastic-free, compostable box of chicken and vegetables for him), and he sat with an impeccable posture, his expression neutral.

Earlier that morning, Johnson, in worn trainers and the kind of hoodie that is almost certainly deceptively expensive, had told the audience about what he saw as the end of humanity. Specifically, he was worried about AI: that we face an “event horizon,” a point at which superintelligent AI escapes human understanding and control. He had come to Berkeley to persuade people who are interested in longevity to focus their efforts on AI.

It is this particular concern that ultimately underpins his Don’t Die mission. First, humans must embrace the Don’t Die ideology. Then we must ensure AI is aligned with preserving human existence. Were it not for AI, he says, he wouldn’t be doing any of his anti-death activities and regimens. “I am convinced that we are at an existential moment as a species,” says Johnson, who was raised Mormon but has since left the church.

Solving aging will take decades, he says; we’ll survive that long only if we make sure that AI is aligned with human survival.

The following Q&A has been lightly edited for length and clarity.

Why are you creating a new religion?

We’re in this new phase where [because of advances in AI] we’re trying to reimagine what it means to be human. It requires imagination and creativity and open-mindedness, and that’s a big ask. Approaching that conversation as a community, or a lifestyle, doesn’t carry enough weight or power. Religions have proven, over the past several thousand years, to be the most efficacious form to organize human efforts. It’s just a tried-and-true methodology.

How do you go about founding a new religion?

It’s a good question. If you look at historical [examples], Buddha went through his own self-exploratory process and came up with a framework. And Muhammad had a story. Jesus had an origin story … You might even say Satoshi [Nakamoto, the mysterious creator of bitcoin] is like [the founder of] a modern-day religion, [launched] with the white paper. Adam Smith launched capitalism with his book. The question is: What is a modern-day religion, and how does it convince? It’s an open question for me. I don’t know yet.

Your goal is to align AI with Don’t Die, or, in other words, ensure that AI models prioritize and protect human life. How will you do that?

I’m talking to a lot of AI researchers about this. Communities of AIs could be instilled with values of conflict resolution that do not end in the death of a human. Or an AI. Or the planet.

Would you say that Don’t Die is “your” religion?

No, I think it’s humanity’s religion. It’s different from other religions, which are very founder-centric. I think this is going to be decentralized, and it will be something that everybody can make their own.

So there’s no God?

We’re playing with the idea that the body is God. We’ve been experimenting with this format of a Don’t Die fam, where eight to 12 people get together on a weekly basis. It’s patterned off of other groups like Alcoholics Anonymous. We structure an opening ritual. We have a mantra. And then there’s a part where people apologize to their body for something they’ve done that has inflicted harm upon themselves.

It’s reframing our relationship to body and to mind. It is also a way for people to have deep friendships, to explore emotionally vulnerable topics, and to support each other in health practices.

What we’re really trying to say is: Existence is the virtue. Existence is the objective. If someone believes in God, that’s fine. People can be Christian and do this; they can be Muslim and do this. Don’t Die is a “yes, and” to all groups.

So it’s a different way of thinking about religion?

Yeah. Right now, religion doesn’t hold the highest status in society. A lot of people look down on it in some way. I think as AI progresses, it’s going to create additional questions on who we are: What is our identity? What do we believe about our existence in the future? People are going to want some kind of framework that helps them make sense of the moment. So I think there’s going to be a shift toward religion in the coming years. People might say that [founding a religion now] is kind of a weird move, and that [religion] turns people off. But I think that’s fine. I think we’re ahead.

Does the religion incorporate, or make reference to, AI in any way?

Yeah. AI is going to be omnipresent. And this is why we’ve been contemplating “the body is God.” Over the past couple of years … I’ve been testing the hypothesis that if I get a whole bunch of data about my body, and I give it to an algorithm, and feed that algorithm updates with scientific evidence, then it would eventually do a better job than a doctor. So I gave myself over to an algorithm.

It really is in my best interest to let it tell me what to eat, tell me when to sleep and exercise, because it would do a better job of making me happy. Instead of my mind haphazardly deciding what it
