News
Benchmarking the Pedagogical Knowledge of Large Language Models
arXiv:2506.18710v3 Announce Type: replace Abstract: Benchmarks like Massive Multitask Language Understanding (MMLU) have played a...
Benchmarking Chinese Commonsense Reasoning with a Multi-hop Reasoning Perspective
arXiv:2510.08800v1 Announce Type: new Abstract: While Large Language Models (LLMs) have demonstrated advanced reasoning capabilities...
BEAT: Visual Backdoor Attacks on VLM-based Embodied Agents via Contrastive Trigger Learning
arXiv:2510.27623v3 Announce Type: replace-cross Abstract: Recent advances in Vision-Language Models (VLMs) have propelled embodied agents...
Batch-Max: Higher LLM Throughput using Larger Batch Sizes and KV Cache Compression
arXiv:2412.05693v3 Announce Type: replace Abstract: Several works have developed eviction policies to remove key-value (KV)...
Base Models Beat Aligned Models at Randomness and Creativity
arXiv:2505.00047v2 Announce Type: replace Abstract: Alignment has quickly become a default ingredient in LLM development...
BanglaIPA: Towards Robust Text-to-IPA Transcription with Contextual Rewriting in Bengali
arXiv:2601.01778v1 Announce Type: new Abstract: Despite its widespread use, Bengali lacks a robust automated International...
Baidu Releases ERNIE-4.5-VL-28B-A3B-Thinking: An Open-Source and Compact Multimodal Reasoning Model Under the ERNIE-4.5 Family
How can we get large model level multimodal reasoning for documents, charts and videos while...
Aya Vision: Advancing the Frontier of Multilingual Multimodality
arXiv:2505.08751v1 Announce Type: new Abstract: Building multimodal language models is fundamentally challenging: it requires aligning...
AWED-FiNER: Agents, Web applications, and Expert Detectors for Fine-grained Named Entity Recognition across 36 Languages for 6.6 Billion Speakers
arXiv:2601.10161v2 Announce Type: replace Abstract: Named Entity Recognition (NER) is a foundational task in Natural...
AWARE, Beyond Sentence Boundaries: A Contextual Transformer Framework for Identifying Cultural Capital in STEM Narratives
arXiv:2510.04983v3 Announce Type: replace Abstract: Identifying cultural capital (CC) themes in student reflections can offer...
Avoiding Knowledge Edit Skipping in Multi-hop Question Answering with Guided Decomposition
arXiv:2509.07555v1 Announce Type: new Abstract: In a rapidly evolving world where information updates swiftly, knowledge...
AutoSpec: An Agentic Framework for Automatically Drafting Patent Specification
arXiv:2509.19640v1 Announce Type: new Abstract: Patents play a critical role in driving technological innovation by...
