YouZum


The Prompting Brain: Neurocognitive Markers of Expertise in Guiding Large Language Models

arXiv:2508.14869v1 Announce Type: cross Abstract: Prompt engineering has rapidly emerged as a critical skill for effective interaction with large language models (LLMs). However, the cognitive and neural underpinnings of this expertise remain largely unexplored. This paper presents findings from a cross-sectional pilot fMRI study investigating differences in brain functional connectivity and network activity between experts and intermediate prompt engineers. Our results reveal distinct neural signatures associated with higher prompt engineering literacy, including increased functional connectivity in brain regions such as the left middle temporal gyrus and the left frontal pole, as well as altered power-frequency dynamics in key cognitive networks. These findings offer initial insights into the neurobiological basis of prompt engineering proficiency. We discuss the implications of these neurocognitive markers in Natural Language Processing (NLP). Understanding the neural basis of human expertise in interacting with LLMs can inform the design of more intuitive human-AI interfaces, contribute to cognitive models of LLM interaction, and potentially guide the development of AI systems that better align with human cognitive workflows. This interdisciplinary approach aims to bridge the gap between human cognition and machine intelligence, fostering a deeper understanding of how humans learn and adapt to complex AI systems.


EEG-MedRAG: Enhancing EEG-based Clinical Decision-Making via Hierarchical Hypergraph Retrieval-Augmented Generation

arXiv:2508.13735v1 Announce Type: new Abstract: With the widespread application of electroencephalography (EEG) in neuroscience and clinical practice, efficiently retrieving and semantically interpreting large-scale, multi-source, heterogeneous EEG data has become a pressing challenge. We propose EEG-MedRAG, a three-layer hypergraph-based retrieval-augmented generation framework that unifies EEG domain knowledge, individual patient cases, and a large-scale repository into a traversable n-ary relational hypergraph, enabling joint semantic-temporal retrieval and causal-chain diagnostic generation. Concurrently, we introduce the first cross-disease, cross-role EEG clinical QA benchmark, spanning seven disorders and five authentic clinical perspectives. This benchmark allows systematic evaluation of disease-agnostic generalization and role-aware contextual understanding. Experiments show that EEG-MedRAG significantly outperforms TimeRAG and HyperGraphRAG in answer accuracy and retrieval, highlighting its strong potential for real-world clinical decision support. Our data and code are publicly available at https://github.com/yi9206413-boop/EEG-MedRAG.


Ask Good Questions for Large Language Models

arXiv:2508.14025v1 Announce Type: new Abstract: Recent advances in large language models (LLMs) have significantly improved the performance of dialog systems, yet current approaches often fail to provide accurate topic guidance because they cannot discern user confusion about related concepts. To address this, we introduce the Ask-Good-Question (AGQ) framework, which features an improved Concept-Enhanced Item Response Theory (CEIRT) model to better identify users’ knowledge levels. Our contributions include applying the CEIRT model together with LLMs to directly generate guiding questions from the inspiring text, greatly improving information retrieval efficiency during the question-and-answer process. In comparisons with baseline methods, our approach significantly enhances users’ information retrieval experiences.
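The CEIRT model builds on Item Response Theory (IRT). Purely as an illustration of the underlying idea (the paper's concept-enhanced formulation is not reproduced here), a standard two-parameter logistic IRT model estimates the probability that a user with latent ability theta answers an item of difficulty b and discrimination a correctly:

# Minimal two-parameter logistic IRT sketch; illustrative only, not the paper's CEIRT model.
import math

def p_correct(theta: float, a: float, b: float) -> float:
    """Probability that a user with latent ability `theta` answers an item
    with discrimination `a` and difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A user slightly above the item's difficulty answers correctly about 65% of the time.
print(round(p_correct(theta=0.5, a=1.2, b=0.0), 2))  # 0.65

Ability estimates of this kind are what allow a system like AGQ to target guiding questions at the user's current knowledge level.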


ReviewGraph: A Knowledge Graph Embedding Based Framework for Review Rating Prediction with Sentiment Features

arXiv:2508.13953v1 Announce Type: new Abstract: In the hospitality industry, understanding the factors that drive customer review ratings is critical for improving guest satisfaction and business performance. This work proposes ReviewGraph for Review Rating Prediction (RRP), a novel framework that transforms textual customer reviews into knowledge graphs by extracting (subject, predicate, object) triples and associating sentiment scores. Using graph embeddings (Node2Vec) and sentiment features, the framework predicts review rating scores with machine learning classifiers. We compare ReviewGraph against traditional NLP baselines (such as Bag of Words, TF-IDF, and Word2Vec) and large language models (LLMs), evaluating them on the HotelRec dataset. Compared with the state-of-the-art literature, our proposed model performs similarly to the best-performing model but at lower computational cost (without ensembling). While ReviewGraph achieves comparable predictive performance to LLMs and outperforms baselines on agreement-based metrics such as Cohen’s Kappa, it offers additional advantages in interpretability, visual exploration, and potential integration into Retrieval-Augmented Generation (RAG) systems. This work highlights the potential of graph-based representations for enhancing review analytics and lays the groundwork for future research integrating advanced graph neural networks and fine-tuned LLM-based extraction methods. We will open-source the ReviewGraph outputs and platform on our GitHub page: https://github.com/aaronlifenghan/ReviewGraph
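As a rough sketch of the kind of pipeline the abstract describes (triple extraction, Node2Vec graph embeddings, sentiment features, and a standard classifier), the following is illustrative only: the triples, ratings, feature construction, and classifier are placeholders, not the authors' implementation, and it assumes the third-party networkx, node2vec, and scikit-learn packages.

# Rough, illustrative sketch of a ReviewGraph-style pipeline; not the authors' code.
import networkx as nx
import numpy as np
from node2vec import Node2Vec
from sklearn.linear_model import LogisticRegression

# Hypothetical (subject, predicate, object, sentiment) tuples extracted from two reviews.
reviews = {
    "review_1": [("room", "was", "clean", 0.8), ("staff", "were", "friendly", 0.6)],
    "review_2": [("room", "was", "dirty", -0.7), ("staff", "were", "rude", -0.6)],
}
ratings = {"review_1": 5, "review_2": 2}  # hypothetical star ratings

# Build one graph over all triples; predicates are kept as edge attributes.
g = nx.Graph()
for triples in reviews.values():
    for subj, pred, obj, _ in triples:
        g.add_edge(subj, obj, predicate=pred)

# Node2Vec embeddings of the review graph.
emb = Node2Vec(g, dimensions=16, walk_length=10, num_walks=50, workers=1).fit(window=5, min_count=1)

# Feature vector per review: mean embedding of its subject/object nodes plus mean sentiment.
def review_features(triples):
    nodes = [t[0] for t in triples] + [t[2] for t in triples]
    vecs = [emb.wv[n] for n in nodes]
    return np.concatenate([np.mean(vecs, axis=0), [np.mean([t[3] for t in triples])]])

X = np.stack([review_features(t) for t in reviews.values()])
y = np.array([ratings[r] for r in reviews])
clf = LogisticRegression(max_iter=1000).fit(X, y)  # stand-in for the classifiers compared in the paper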


SEA-LION: Southeast Asian Languages in One Network

arXiv:2504.05747v3 Announce Type: replace Abstract: Recently, Large Language Models (LLMs) have dominated much of the artificial intelligence scene with their ability to process and generate natural languages. However, the majority of LLM research and development remains English-centric, leaving low-resource languages such as those in the Southeast Asian (SEA) region under-represented. To address this representation gap, we introduce Llama-SEA-LION-v3-8B-IT and Gemma-SEA-LION-v3-9B-IT, two cutting-edge multilingual LLMs designed for SEA languages. The SEA-LION family of LLMs supports 11 SEA languages, namely English, Chinese, Indonesian, Vietnamese, Malay, Thai, Burmese, Lao, Filipino, Tamil, and Khmer. Our work leverages large-scale multilingual continued pre-training with a comprehensive post-training regime involving multiple stages of instruction fine-tuning, alignment, and model merging. Evaluation results on multilingual benchmarks indicate that our models achieve state-of-the-art performance across LLMs supporting SEA languages. We open-source the models to benefit the wider SEA community.


A Coding Implementation to Build a Complete Self-Hosted LLM Workflow with Ollama, REST API, and Gradio Chat Interface

In this tutorial, we implement a fully functional Ollama environment inside Google Colab to replicate a self-hosted LLM workflow. We begin by installing Ollama directly on the Colab VM using the official Linux installer and then launch the Ollama server in the background to expose the HTTP API on localhost:11434. After verifying the service, we pull lightweight models such as qwen2.5:0.5b-instruct or llama3.2:1b, which balance resource constraints with usability in a CPU-only environment. To interact with these models programmatically, we use the /api/chat endpoint via Python’s requests module with streaming enabled, which allows token-level output to be captured incrementally. Finally, we layer a Gradio-based UI on top of this client so we can issue prompts, maintain multi-turn history, configure parameters like temperature and context size, and view results in real time.

import os, sys, subprocess, time, json, requests, textwrap
from pathlib import Path

def sh(cmd, check=True):
    """Run a shell command, stream output."""
    p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT, text=True)
    for line in p.stdout:
        print(line, end="")
    p.wait()
    if check and p.returncode != 0:
        raise RuntimeError(f"Command failed: {cmd}")

if not Path("/usr/local/bin/ollama").exists() and not Path("/usr/bin/ollama").exists():
    print("Installing Ollama ...")
    sh("curl -fsSL https://ollama.com/install.sh | sh")
else:
    print("Ollama already installed.")

try:
    import gradio
except Exception:
    print("Installing Gradio ...")
    sh("pip -q install gradio==4.44.0")

We first check if Ollama is already installed on the system, and if not, we install it using the official script. At the same time, we ensure Gradio is available by importing it or installing the required version when missing. This way, we prepare our Colab environment for running the chat interface smoothly.

def start_ollama():
    try:
        requests.get("http://127.0.0.1:11434/api/tags", timeout=1)
        print("Ollama server already running.")
        return None
    except Exception:
        pass
    print("Starting Ollama server ...")
    proc = subprocess.Popen(["ollama", "serve"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    for _ in range(60):
        time.sleep(1)
        try:
            r = requests.get("http://127.0.0.1:11434/api/tags", timeout=1)
            if r.ok:
                print("Ollama server is up.")
                break
        except Exception:
            pass
    else:
        raise RuntimeError("Ollama did not start in time.")
    return proc

server_proc = start_ollama()

We start the Ollama server in the background and keep checking its health endpoint until it responds successfully. By doing this, we ensure the server is running and ready before sending any API requests.

MODEL = os.environ.get("OLLAMA_MODEL", "qwen2.5:0.5b-instruct")
print(f"Using model: {MODEL}")

try:
    tags = requests.get("http://127.0.0.1:11434/api/tags", timeout=5).json()
    have = any(m.get("name") == MODEL for m in tags.get("models", []))
except Exception:
    have = False

if not have:
    print(f"Pulling model {MODEL} (first time only) ...")
    sh(f"ollama pull {MODEL}")

We define the default model to use, check if it is already available on the Ollama server, and if not, we automatically pull it. This ensures that the chosen model is ready before we start running any chat sessions.
OLLAMA_URL = "http://127.0.0.1:11434/api/chat"

def ollama_chat_stream(messages, model=MODEL, temperature=0.2, num_ctx=None):
    """Yield streaming text chunks from Ollama /api/chat."""
    payload = {
        "model": model,
        "messages": messages,
        "stream": True,
        "options": {"temperature": float(temperature)},
    }
    if num_ctx:
        payload["options"]["num_ctx"] = int(num_ctx)
    with requests.post(OLLAMA_URL, json=payload, stream=True) as r:
        r.raise_for_status()
        for line in r.iter_lines():
            if not line:
                continue
            data = json.loads(line.decode("utf-8"))
            if "message" in data and "content" in data["message"]:
                yield data["message"]["content"]
            if data.get("done"):
                break

We create a streaming client for the Ollama /api/chat endpoint, where we send messages as JSON payloads and yield tokens as they arrive. This lets us handle responses incrementally, so we see the model’s output in real time instead of waiting for the full completion.

def smoke_test():
    print("\nSmoke test:")
    sys_msg = {"role": "system", "content": "You are concise. Use short bullets."}
    user_msg = {"role": "user", "content": "Give 3 quick tips to sleep better."}
    out = []
    for chunk in ollama_chat_stream([sys_msg, user_msg], temperature=0.3):
        print(chunk, end="")
        out.append(chunk)
    print("\nDone.\n")

try:
    smoke_test()
except Exception as e:
    print("Smoke test skipped:", e)

We run a quick smoke test by sending a simple prompt through our streaming client to confirm that the model responds correctly. This helps us verify that Ollama is installed, the server is running, and the chosen model is working before we build the full chat UI.

import gradio as gr

SYSTEM_PROMPT = "You are a helpful, crisp assistant. Prefer bullets when helpful."

def chat_fn(message, history, temperature, num_ctx):
    msgs = [{"role": "system", "content": SYSTEM_PROMPT}]
    for u, a in history:
        if u:
            msgs.append({"role": "user", "content": u})
        if a:
            msgs.append({"role": "assistant", "content": a})
    msgs.append({"role": "user", "content": message})
    acc = ""
    try:
        for part in ollama_chat_stream(msgs, model=MODEL, temperature=temperature,
                                       num_ctx=num_ctx or None):
            acc += part
            yield acc
    except Exception as e:
        yield f"Error: {e}"

with gr.Blocks(title="Ollama Chat (Colab)", fill_height=True) as demo:
    gr.Markdown("# Ollama Chat (Colab)\nSmall local-ish LLM via Ollama + Gradio.\n")
    with gr.Row():
        temp = gr.Slider(0.0, 1.0, value=0.3, step=0.1, label="Temperature")
        num_ctx = gr.Slider(512, 8192, value=2048, step=256, label="Context Tokens (num_ctx)")
    chat = gr.Chatbot(height=460)
    msg = gr.Textbox(label="Your message", placeholder="Ask anything...", lines=3)
    clear = gr.Button("Clear")

    def user_send(m, h):
        m = (m or "").strip()
        if not m:
            return "", h
        return "", h + [[m, None]]

    def bot_reply(h, temperature, num_ctx):
        u = h[-1][0]
        stream = chat_fn(u, h[:-1], temperature, int(num_ctx))
        acc = ""
        for partial in stream:
            acc = partial
            h[-1][1] = acc
            yield h

    msg.submit(user_send, [msg, chat], [msg, chat]).then(
        bot_reply, [chat, temp, num_ctx], [chat]
    )
    clear.click(lambda: None, None, chat)

print("Launching Gradio ...")
demo.launch(share=True)

We integrate Gradio to build an interactive chat UI on top of the Ollama server, where user input and conversation history are converted into the correct message format and streamed back as model responses.
The sliders let us adjust parameters like temperature and context length, while the chat box and clear button provide a simple, real-time interface for testing different prompts. In conclusion, we establish a reproducible pipeline for running Ollama in Colab: installation, server startup, model management, API access, and user interface integration. The system uses Ollama’s REST API as the core interaction layer, providing both command-line and Python streaming access, while Gradio handles session persistence and chat rendering. This approach preserves the “self-hosted” design described in the original guide but adapts it to the constraints of a hosted Colab VM.
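As a final usage note, the same REST endpoint can also be called without streaming; a minimal request that reuses the MODEL pulled above returns the whole reply as a single JSON object:

# Minimal non-streaming call to the Ollama REST API (reuses MODEL from the setup above).
import requests

resp = requests.post(
    "http://127.0.0.1:11434/api/chat",
    json={
        "model": MODEL,
        "messages": [{"role": "user", "content": "Summarize what Ollama does in one sentence."}],
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])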


What do Speech Foundation Models Learn? Analysis and Applications

arXiv:2508.12255v1 Announce Type: new Abstract: Speech foundation models (SFMs) are designed to serve as general-purpose representations for a wide range of speech-processing tasks. The last five years have seen an influx of increasingly successful self-supervised and supervised pre-trained models with impressive performance on various downstream tasks. Although the zoo of SFMs continues to grow, our understanding of the knowledge they acquire lags behind. This thesis presents a lightweight analysis framework using statistical tools and training-free tasks to investigate the acoustic and linguistic knowledge encoded in SFM layers. We conduct a comparative study across multiple SFMs and statistical tools. Our study also shows that the analytical insights have concrete implications for downstream task performance. The effectiveness of an SFM is ultimately determined by its performance on speech applications. Yet it remains unclear whether the benefits extend to spoken language understanding (SLU) tasks that require a deeper understanding than widely studied ones, such as speech recognition. The limited exploration of SLU is primarily due to a lack of relevant datasets. To alleviate that, this thesis contributes tasks, specifically spoken named entity recognition (NER) and named entity localization (NEL), to the Spoken Language Understanding Evaluation benchmark. We develop SFM-based approaches for NER and NEL, and find that end-to-end (E2E) models leveraging SFMs can surpass traditional cascaded (speech recognition followed by a text model) approaches. Further, we evaluate E2E SLU models across SFMs and adaptation strategies to assess the impact on task performance. Collectively, this thesis tackles previously unanswered questions about SFMs, providing tools and datasets to further our understanding and to enable the community to make informed design choices for future model development and adoption.


Generative Medical Event Models Improve with Scale

arXiv:2508.12104v1 Announce Type: cross Abstract: Realizing personalized medicine at scale calls for methods that distill insights from longitudinal patient journeys, which can be viewed as a sequence of medical events. Foundation models pretrained on large-scale medical event data represent a promising direction for scaling real-world evidence generation and generalizing to diverse downstream tasks. Using Epic Cosmos, a dataset with medical events from de-identified longitudinal health records for 16.3 billion encounters over 300 million unique patient records from 310 health systems, we introduce the Cosmos Medical Event Transformer (CoMET) models, a family of decoder-only transformer models pretrained on 118 million patients representing 115 billion discrete medical events (151 billion tokens). We present the largest scaling-law study for medical event data, establishing a methodology for pretraining and revealing power-law scaling relationships for compute, tokens, and model size. Based on this, we pretrained a series of compute-optimal models with up to 1 billion parameters. Conditioned on a patient’s real-world history, CoMET autoregressively generates the next medical event, simulating patient health timelines. We studied 78 real-world tasks, including diagnosis prediction, disease prognosis, and healthcare operations. Remarkably for a foundation model with generic pretraining and simulation-based inference, CoMET generally outperformed or matched task-specific supervised models on these tasks, without requiring task-specific fine-tuning or few-shot examples. CoMET’s predictive power consistently improves as the model and pretraining scale. Our results show that CoMET, a generative medical event foundation model, can effectively capture complex clinical dynamics, providing an extensible and generalizable framework to support clinical decision-making, streamline healthcare operations, and improve patient outcomes.
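The scaling-law analysis mentioned above rests on power-law fits. Purely to illustrate the general technique (the data and coefficients below are hypothetical and are not taken from the paper), a power law of loss against compute can be fit as a straight line in log-log space:

# Generic power-law fit of loss vs. compute; illustrative only, with made-up values.
import numpy as np

compute = np.array([1e18, 1e19, 1e20, 1e21, 1e22])  # hypothetical training-compute budgets (FLOPs)
loss = np.array([3.20, 2.90, 2.65, 2.45, 2.30])     # hypothetical evaluation losses

# A power law loss = a * C**(-alpha) becomes linear after taking logs of both sides.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
alpha, a = -slope, float(np.exp(intercept))
print(f"fitted: loss ~ {a:.2f} * C^(-{alpha:.4f})")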


Improving Detection of Watermarked Language Models

arXiv:2508.13131v1 Announce Type: new Abstract: Watermarking has recently emerged as an effective strategy for detecting the generations of large language models (LLMs). The strength of a watermark typically depends strongly on the entropy afforded by the language model and the set of input prompts. However, entropy can be quite limited in practice, especially for models that are post-trained, for example via instruction tuning or reinforcement learning from human feedback (RLHF), which makes detection based on watermarking alone challenging. In this work, we investigate whether detection can be improved by combining watermark detectors with non-watermark ones. We explore a number of hybrid schemes that combine the two, observing performance gains over either class of detector under a wide range of experimental conditions.
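As a generic illustration of the hybrid idea, one can combine a watermark detector's score with a non-watermark detector's score before thresholding; the weighting and calibration below are placeholder assumptions, not the specific schemes studied in the paper.

# Minimal sketch of a hybrid detector combining two scores; illustrative only.
def hybrid_detect(watermark_score: float, classifier_score: float,
                  weight: float = 0.5, threshold: float = 0.5) -> bool:
    """Flag text as model-generated when a weighted combination of a watermark-detector
    score and a non-watermark (classifier) score, both assumed calibrated to [0, 1],
    crosses a threshold."""
    combined = weight * watermark_score + (1.0 - weight) * classifier_score
    return combined >= threshold

# Low-entropy text may give a weak watermark signal yet a confident classifier score.
print(hybrid_detect(watermark_score=0.35, classifier_score=0.90))  # True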


Fast, Slow, and Tool-augmented Thinking for LLMs: A Review

arXiv:2508.12265v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated remarkable progress in reasoning across diverse domains. However, effective reasoning in real-world tasks requires adapting the reasoning strategy to the demands of the problem, ranging from fast, intuitive responses to deliberate, step-by-step reasoning and tool-augmented thinking. Drawing inspiration from cognitive psychology, we propose a novel taxonomy of LLM reasoning strategies along two knowledge boundaries: a fast/slow boundary separating intuitive from deliberative processes, and an internal/external boundary distinguishing reasoning grounded in the model’s parameters from reasoning augmented by external tools. We systematically survey recent work on adaptive reasoning in LLMs and categorize methods based on key decision factors. We conclude by highlighting open challenges and future directions toward more adaptive, efficient, and reliable LLMs.
