News
AgentCompass: Towards Reliable Evaluation of Agentic Workflows in Production
arXiv:2509.14647v1 Announce Type: cross Abstract: With the growing adoption of Large Language Models (LLMs) in...
AgentArmor: Enforcing Program Analysis on Agent Runtime Trace to Defend Against Prompt Injection
arXiv:2508.01249v1 Announce Type: cross Abstract: Large Language Model (LLM) agents offer a powerful new paradigm...
Agentar-Fin-R1: Enhancing Financial Intelligence through Domain Expertise, Training Efficiency, and Advanced Reasoning
arXiv:2507.16802v2 Announce Type: replace Abstract: Large Language Models (LLMs) exhibit considerable promise in financial applications;...
Agent-based computing is outgrowing the web as we know it
AI agents are moving from passive assistants to active participants. Today, we ask them to...
Agent Learning via Early Experience
arXiv:2510.08558v2 Announce Type: replace-cross Abstract: A long-term goal of language agents is to learn and...
AegisLLM: Scaling LLM Security Through Adaptive Multi-Agent Systems at Inference Time
The Growing Threat Landscape for LLMs: LLMs are key targets for fast-evolving attacks, including prompt...
Adversarial Topic-aware Prompt-tuning for Cross-topic Automated Essay Scoring
arXiv:2508.05987v1 Announce Type: new Abstract: Cross-topic automated essay scoring (AES) aims to develop a transferable...
Advancing Single and Multi-task Text Classification through Large Language Model Fine-tuning
arXiv:2412.08587v2 Announce Type: replace Abstract: Both encoder-only models (e.g., BERT, RoBERTa) and large language models...
Adopting agentic AI? Build AI fluency, redesign workflows, don’t neglect supervision
How can organizations decide how to use human-in-the-loop mechanisms and collaborative frameworks with AI agents?
Achieving Tokenizer Flexibility in Language Models through Heuristic Adaptation and Supertoken Learning
arXiv:2505.09738v1 Announce Type: new Abstract: Pretrained language models (LLMs) are often constrained by their fixed...
Accenture Research Introduce MCP-Bench: A Large-Scale Benchmark that Evaluates LLM Agents in Complex Real-World Tasks via MCP Servers
Modern large language models (LLMs) have moved far beyond simple text generation. Many of the...
Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking
arXiv:2510.09528v1 Announce Type: new Abstract: Pre-trained transformer-based models have significantly advanced automatic speech recognition (ASR)...