YouZum

Uncategorized

AI, Committee, News, Uncategorized

Sentiment-Aware Recommendation Systems in E-Commerce: A Review from a Natural Language Processing Perspective

arXiv:2505.03828v1 Announce Type: cross Abstract: E-commerce platforms generate vast volumes of user feedback, such as star ratings, written reviews, and comments. However, most recommendation engines rely primarily on numerical scores, often overlooking the nuanced opinions embedded in free text. This paper comprehensively reviews sentiment-aware recommendation systems from a natural language processing perspective, covering advancements from 2023 to early 2025. It highlights the benefits of integrating sentiment analysis into e-commerce recommenders to enhance prediction accuracy and explainability through detailed opinion extraction. Our survey categorizes recent work into four main approaches: deep learning classifiers that combine sentiment embeddings with user-item interactions, transformer-based methods for nuanced feature extraction, graph neural networks that propagate sentiment signals, and conversational recommenders that adapt in real time to user feedback. We summarize model architectures and demonstrate how sentiment flows through recommendation pipelines, impacting dialogue-based suggestions. Key challenges include handling noisy or sarcastic text, dynamic user preferences, and bias mitigation. Finally, we outline research gaps and provide a roadmap for developing smarter, fairer, and more user-centric recommendation tools.

AI, Committee, News, Uncategorized

A Simple Ensemble Strategy for LLM Inference: Towards More Stable Text Classification

arXiv:2504.18884v2 Announce Type: replace Abstract: As large language models (LLMs) have advanced, they have been applied to a wide variety of tasks. However, the variability and reproducibility of results across individual LLM trials have been largely overlooked in the existing literature, even though human annotation routinely uses majority voting to resolve disagreements among annotators. This study therefore applies a straightforward ensemble strategy to sentiment analysis with LLMs. The results show that an ensemble of multiple inferences from medium-sized LLMs produces more robust and accurate results than a single attempt with a large model, reducing RMSE by 18.6%.
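
The ensemble idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how the repeated inferences are combined, so the majority vote and median below are assumptions, and the scores in `runs` and `gold` are hypothetical values made up for illustration, not data from the paper.

```python
from collections import Counter
from math import sqrt
from statistics import median


def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets))


# Hypothetical data: five repeated runs of one model scoring four texts
# on a 1-5 sentiment scale, plus hypothetical gold scores.
runs = [
    [5, 2, 3, 1],
    [4, 2, 4, 1],
    [5, 1, 3, 2],
    [5, 2, 3, 1],
    [3, 2, 5, 1],
]
gold = [5, 2, 3, 1]

# Regroup so that each tuple holds all runs' scores for one text.
per_text = list(zip(*runs))
voted = [majority_vote(scores) for scores in per_text]
median_scores = [median(scores) for scores in per_text]

single_run_rmse = [rmse(run, gold) for run in runs]
print("single-run RMSE:", [round(r, 3) for r in single_run_rmse])
print("majority-vote RMSE:", rmse(voted, gold))
print("median RMSE:", rmse(median_scores, gold))
```

In this toy data the individual runs disagree with each other, while the aggregated scores are stable; it is not meant to reproduce the 18.6% figure reported in the abstract.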

AI, Committee, News, Uncategorized

On the generalization of language models from in-context learning and finetuning: a controlled study

arXiv:2505.00661v2 Announce Type: replace Abstract: Large language models exhibit exciting capabilities, yet can show surprisingly narrow generalization from fine-tuning. For example, they can fail to generalize to simple reversals of relations they are trained on, or fail to make simple logical deductions based on trained information. These failures to generalize from fine-tuning can hinder practical application of these models. On the other hand, language models’ in-context learning shows different inductive biases, and can generalize better in some cases. Here, we explore these differences in generalization between in-context- and fine-tuning-based learning. To do so, we constructed several novel datasets to evaluate and improve models’ abilities to generalize from fine-tuning data. The datasets are designed to create clean tests of generalization, by isolating the knowledge in the dataset from that in pretraining. We expose pretrained large models to controlled subsets of the information in these datasets, either in context or through fine-tuning, and evaluate their performance on test sets that require various types of generalization. We find overall that in data-matched settings, in-context learning can generalize more flexibly than fine-tuning (though we also find some qualifications of prior findings, such as cases when fine-tuning can generalize to reversals embedded in a larger structure of knowledge). We build on these findings to propose a method to enable improved generalization from fine-tuning: adding in-context inferences to fine-tuning data. We show that this method improves generalization across various splits of our datasets and other benchmarks. Our results have implications for understanding the inductive biases of different modes of learning in language models, and practically improving their performance.

AI, Committee, News, Uncategorized

Quiet Feature Learning in Algorithmic Tasks

arXiv:2505.03997v1 Announce Type: cross Abstract: We train Transformer-based language models on ten foundational algorithmic tasks and observe pronounced phase transitions in their loss curves that deviate from established power-law scaling trends. Over large ranges of compute, the validation loss barely improves, then abruptly decreases. Probing the models’ internal representations reveals the learning of quiet features during the stagnant phase, followed by sudden acquisition of loud features that coincide with the sharp drop in loss. Our ablation experiments show that disrupting a single learned feature can dramatically degrade performance, providing evidence of their causal role in task performance. These findings challenge the prevailing assumption that next-token predictive loss reliably tracks incremental progress; instead, key internal features may be developing below the surface until they coalesce, triggering a rapid performance gain.

AI, Committee, News, Uncategorized

Estimating LLM Uncertainty with Logits

arXiv:2502.00290v4 Announce Type: replace Abstract: Over the past few years, Large Language Models (LLMs) have developed rapidly and are widely applied in various domains. However, LLMs face the issue of hallucinations, generating responses that may be unreliable when the models lack relevant knowledge. To be aware of potential hallucinations, uncertainty estimation methods have been introduced, and most of them have confirmed that reliability lies in critical tokens. However, probability-based methods perform poorly in identifying token reliability, limiting their practical utility. In this paper, we reveal that the probability-based method fails to estimate token reliability due to the loss of evidence strength information which is accumulated in the training stage. Therefore, we present Logits-induced token uncertainty (LogTokU), a framework for estimating decoupled token uncertainty in LLMs, enabling real-time uncertainty estimation without requiring multiple sampling processes. We employ evidence modeling to implement LogTokU and use the estimated uncertainty to guide downstream tasks. The experimental results demonstrate that LogTokU has significant effectiveness and promise.
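
One plausible reading of the "loss of evidence strength information" mentioned in this abstract is that softmax normalization discards the overall magnitude of the logits. The sketch below only demonstrates that general property (softmax is unchanged when every logit is shifted by the same constant); it is not an implementation of LogTokU, and the logit values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two hypothetical logit vectors that differ only by a constant shift.
weak = [1.0, 0.0, -1.0]
strong = [x + 20.0 for x in weak]  # every logit raised by 20

p_weak = softmax(weak)
p_strong = softmax(strong)

# The probabilities match even though the raw logits are far larger in the
# second case, so the magnitude is not recoverable from probabilities alone.
print("probabilities (weak):  ", [round(p, 4) for p in p_weak])
print("probabilities (strong):", [round(p, 4) for p in p_strong])
print("largest logit (weak, strong):", max(weak), max(strong))
```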

AI, Committee, News, Uncategorized

Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding

arXiv:2505.03788v1 Announce Type: new Abstract: We introduce a novel approach for calibrating uncertainty quantification (UQ) tailored for multi-modal large language models (LLMs). Existing state-of-the-art UQ methods rely on consistency among multiple responses generated by the LLM on an input query under diverse settings. However, these approaches often report higher confidence in scenarios where the LLM is consistently incorrect. This leads to a poorly calibrated confidence with respect to accuracy. To address this, we leverage cross-modal consistency in addition to self-consistency to improve the calibration of the multi-modal models. Specifically, we ground the textual responses to the visual inputs. The confidence from the grounding model is used to calibrate the overall confidence. Given that using a grounding model adds its own uncertainty in the pipeline, we apply temperature scaling – a widely accepted parametric calibration technique – to calibrate the grounding model’s confidence in the accuracy of generated responses. We evaluate the proposed approach across multiple multi-modal tasks, such as medical question answering (Slake) and visual question answering (VQAv2), considering multi-modal models such as LLaVA-Med and LLaVA. The experiments demonstrate that the proposed framework achieves significantly improved calibration on both tasks.
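
Temperature scaling, which the abstract names as its calibration technique, divides the logits by a single scalar T fitted on held-out data. The following is a minimal sketch under stated assumptions: the validation logits and labels are hypothetical, and a coarse grid search stands in for the gradient-based optimizer that is more commonly used. How the paper applies this to its grounding model is not shown here.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits divided by a temperature (T > 1 softens the output)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    total = 0.0
    for logits, label in zip(logit_rows, labels):
        total -= math.log(softmax(logits, temperature)[label])
    return total / len(labels)


def fit_temperature(logit_rows, labels, grid=None):
    """Pick the temperature on a grid that minimizes validation NLL."""
    grid = grid or [0.25 * k for k in range(1, 41)]  # 0.25, 0.5, ..., 10.0
    return min(grid, key=lambda t: nll(logit_rows, labels, t))


# Hypothetical held-out set: the model is very confident on every example
# but wrong on one of the four, i.e. overconfident.
val_logits = [[4.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 0.0]]
val_labels = [0, 1, 1, 0]

best_t = fit_temperature(val_logits, val_labels)
print("fitted temperature:", best_t)
print("confidence before:", round(max(softmax(val_logits[0])), 3))
print("confidence after: ", round(max(softmax(val_logits[0], best_t)), 3))
```

Because the scaling is monotonic, the predicted class never changes; only the reported confidence moves closer to the observed accuracy.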

AI, Committee, News, Uncategorized

The Power of Stories: Narrative Priming Shapes How LLM Agents Collaborate and Compete

arXiv:2505.03961v1 Announce Type: cross Abstract: According to Yuval Noah Harari, large-scale human cooperation is driven by shared narratives that encode common beliefs and values. This study explores whether such narratives can similarly nudge LLM agents toward collaboration. We use a finitely repeated public goods game in which LLM agents choose either cooperative or egoistic spending strategies. We prime agents with stories highlighting teamwork to different degrees and test how this influences negotiation outcomes. Our experiments explore four questions: (1) How do narratives influence negotiation behavior? (2) What differs when agents share the same story versus different ones? (3) What happens when the agent numbers grow? (4) Are agents resilient against self-serving negotiators? We find that story-based priming significantly affects negotiation strategies and success rates. Common stories improve collaboration, benefiting each agent. By contrast, priming agents with different stories reverses this effect, and those agents primed toward self-interest prevail. We hypothesize that these results carry implications for multi-agent system design and AI alignment.
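
For readers unfamiliar with the public goods game, the sketch below shows the textbook payoff rule for a single round: each agent keeps whatever it does not contribute and receives an equal share of the multiplied common pot. The endowment, multiplier, and number of agents are hypothetical; the abstract does not state the parameters used in the study.

```python
def public_goods_payoffs(contributions, endowment=10.0, multiplier=1.5):
    """Payoff per agent for one round of a standard public goods game.

    Each agent keeps (endowment - contribution) and receives an equal share
    of the pot, which is the sum of contributions times the multiplier.
    The default endowment and multiplier are illustrative assumptions.
    """
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]


all_cooperate = public_goods_payoffs([10, 10, 10, 10])
one_defector = public_goods_payoffs([10, 10, 10, 0])
all_defect = public_goods_payoffs([0, 0, 0, 0])

print("all cooperate:", all_cooperate)
print("one defector: ", one_defector)
print("all defect:   ", all_defect)
```

With a multiplier between 1 and the number of agents, everyone does better under full cooperation than under full defection, yet a lone defector earns the most, which is the tension the cooperative and egoistic strategies in the abstract play out.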

AI, Committee, News, Uncategorized

Bryan Johnson wants to start a new religion in which “the body is God”

Bryan Johnson is on a mission to not die. The 47-year-old multimillionaire has already applied his slogan “Don’t Die” to events, merchandise, and a Netflix documentary. Now he’s founding a Don’t Die religion.

Johnson, who famously spends millions of dollars on scans, tests, supplements, and a lifestyle routine designed to slow or reverse the aging process, has enjoyed extensive media coverage, and a huge social media following. For many people, he has become the face of the longevity field.

I sat down with Johnson at an event for people interested in longevity in Berkeley, California, in late April. We spoke on the sidelines after lunch (conference plastic-lidded container meal for me; what seemed to be a plastic-free, compostable box of chicken and vegetables for him), and he sat with an impeccable posture, his expression neutral.

Earlier that morning, Johnson, in worn trainers and the kind of hoodie that is almost certainly deceptively expensive, had told the audience about what he saw as the end of humanity. Specifically, he was worried about AI: that we face an “event horizon,” a point at which superintelligent AI escapes human understanding and control. He had come to Berkeley to persuade people who are interested in longevity to focus their efforts on AI.

It is this particular concern that ultimately underpins his Don’t Die mission. First, humans must embrace the Don’t Die ideology. Then we must ensure AI is aligned with preserving human existence. Were it not for AI, he says, he wouldn’t be doing any of his anti-death activities and regimens. “I am convinced that we are at an existential moment as a species,” says Johnson, who was raised Mormon but has since left the church. Solving aging will take decades, he says; we’ll survive that long only if we make sure that AI is aligned with human survival.

The following Q&A has been lightly edited for length and clarity.

Why are you creating a new religion?

We’re in this new phase where [because of advances in AI] we’re trying to reimagine what it means to be human. It requires imagination and creativity and open-mindedness, and that’s a big ask. Approaching that conversation as a community, or a lifestyle, doesn’t carry enough weight or power. Religions have proven, over the past several thousand years, to be the most efficacious form to organize human efforts. It’s just a tried-and-true methodology.

How do you go about founding a new religion?

It’s a good question. If you look at historical [examples], Buddha went through his own self-exploratory process and came up with a framework. And Muhammad had a story. Jesus had an origin story … You might even say Satoshi [Nakamoto, the mysterious creator of bitcoin] is like [the founder of] a modern-day religion, [launched] with the white paper. Adam Smith launched capitalism with his book. The question is: What is a modern-day religion, and how does it convince? It’s an open question for me. I don’t know yet.

Your goal is to align AI with Don’t Die—or, in other words, ensure that AI models prioritize and protect human life. How will you do that?

I’m talking to a lot of AI researchers about this. Communities of AIs could be instilled with values of conflict resolution that do not end in the death of a human. Or an AI. Or the planet.

Would you say that Don’t Die is “your” religion?

No, I think it’s humanity’s religion. It’s different from other religions, which are very founder-centric. I think this is going to be decentralized, and it will be something that everybody can make their own.

So there’s no God?

We’re playing with the idea that the body is God. We’ve been experimenting with this format of a Don’t Die fam, where eight to 12 people get together on a weekly basis. It’s patterned off of other groups like Alcoholics Anonymous. We structure an opening ritual. We have a mantra. And then there’s a part where people apologize to their body for something they’ve done that has inflicted harm upon themselves.

It’s reframing our relationship to body and to mind. It is also a way for people to have deep friendships, to explore emotionally vulnerable topics, and to support each other in health practices.

What we’re really trying to say is: Existence is the virtue. Existence is the objective. If someone believes in God, that’s fine. People can be Christian and do this; they can be Muslim and do this. Don’t Die is a “yes, and” to all groups.

So it’s a different way of thinking about religion?

Yeah. Right now, religion doesn’t hold the highest status in society. A lot of people look down on it in some way. I think as AI progresses, it’s going to create additional questions on who we are: What is our identity? What do we believe about our existence in the future? People are going to want some kind of framework that helps them make sense of the moment. So I think there’s going to be a shift toward religion in the coming years. People might say that [founding a religion now] is kind of a weird move, and that [religion] turns people off. But I think that’s fine. I think we’re ahead.

Does the religion incorporate, or make reference to, AI in any way?

Yeah. AI is going to be omnipresent. And this is why we’ve been contemplating “the body is God.” Over the past couple of years … I’ve been testing the hypothesis that if I get a whole bunch of data about my body, and I give it to an algorithm, and feed that algorithm updates with scientific evidence, then it would eventually do a better job than a doctor. So I gave myself over to an algorithm.

It really is in my best interest to let it tell me what to eat, tell me when to sleep and exercise, because it would do a better job of making me happy. Instead of my mind haphazardly deciding what it

AI, Committee, News, Uncategorized

A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage

arXiv:2504.21035v2 Announce Type: replace-cross Abstract: Sanitizing sensitive text data typically involves removing personally identifiable information (PII) or generating synthetic data under the assumption that these methods adequately protect privacy; however, their effectiveness is often only assessed by measuring the leakage of explicit identifiers but ignoring nuanced textual markers that can lead to re-identification. We challenge the above illusion of privacy by proposing a new framework that evaluates re-identification attacks to quantify individual privacy risks upon data release. Our approach shows that seemingly innocuous auxiliary information, such as routine social activities, can be used to infer sensitive attributes like age or substance use history from sanitized data. For instance, we demonstrate that Azure’s commercial PII removal tool fails to protect 74% of information in the MedQA dataset. Although differential privacy mitigates these risks to some extent, it significantly reduces the utility of the sanitized text for downstream tasks. Our findings indicate that current sanitization techniques offer a false sense of privacy, highlighting the need for more robust methods that protect against semantic-level information leakage.

AI, Committee, News, Uncategorized

FinBERT-QA: Financial Question Answering with pre-trained BERT Language Models

arXiv:2505.00725v1 Announce Type: new Abstract: Motivated by the emerging demand in the financial industry for the automatic analysis of unstructured and structured data at scale, Question Answering (QA) systems can provide lucrative and competitive advantages to companies by facilitating the decision making of financial advisers. Consequently, we propose a novel financial QA system using the transformer-based pre-trained BERT language model to address the limitations of data scarcity and language specificity in the financial domain. Our system focuses on financial non-factoid answer selection, which retrieves a set of passage-level texts and selects the most relevant as the answer. To increase efficiency, we formulate the answer selection task as a re-ranking problem, in which our system consists of an Answer Retriever using BM25, a simple information retrieval approach, to first return a list of candidate answers, and an Answer Re-ranker built with variants of pre-trained BERT language models to re-rank and select the most relevant answers. We investigate various learning, further pre-training, and fine-tuning approaches for BERT. Our experiments suggest that FinBERT-QA, a model built from applying the Transfer and Adapt further fine-tuning and pointwise learning approach, is the most effective, improving the state-of-the-art results of task 2 of the FiQA dataset by 16% on MRR, 17% on NDCG, and 21% on Precision@1.
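
The retrieve-then-re-rank pipeline described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the FinBERT-QA implementation: the BM25 function uses the common Okapi formula with typical default parameters, `overlap_reranker` is a hypothetical stand-in for the fine-tuned BERT re-ranker, and the documents and query are made up.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document for a whitespace-tokenized query."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    doc_freq = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def overlap_reranker(query, doc):
    """Hypothetical stand-in for a learned relevance model (Jaccard overlap)."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d)


def retrieve_then_rerank(query, docs, reranker, top_k=2):
    """Stage 1: keep the top_k BM25 candidates. Stage 2: re-order them."""
    scores = bm25_scores(query, docs)
    candidates = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:top_k]
    return sorted(candidates, key=lambda i: reranker(query, docs[i]), reverse=True)


def reciprocal_rank(ranked_ids, relevant_id):
    """1 / rank of the relevant document, or 0.0 if it was not retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0


# Hypothetical toy corpus and query.
docs = [
    "an index fund tracks a market index",
    "a bond pays fixed interest until maturity",
    "dividends are payments a company makes to shareholders",
]
query = "what is an index fund"

ranked = retrieve_then_rerank(query, docs, overlap_reranker)
print("ranked document ids:", ranked)
print("reciprocal rank of doc 0:", reciprocal_rank(ranked, 0))
```

Averaging `reciprocal_rank` over many queries gives MRR, one of the metrics the abstract reports. The cheap first stage keeps the expensive re-ranker from having to score the whole collection.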
