News
Ask Good Questions for Large Language Models
arXiv:2508.14025v1 Announce Type: new Abstract: Recent advances in large language models (LLMs) have significantly improved...
Are Your LLMs Capable of Stable Reasoning?
arXiv:2412.13147v5 Announce Type: replace-cross Abstract: The rapid advancement of large language models (LLMs) has shown...
Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark
arXiv:2510.26802v1 Announce Type: cross Abstract: Recent video generation models can produce high-fidelity, temporally coherent videos...
Are LLMs Truly Multilingual? Exploring Zero-Shot Multilingual Capability of LLMs for Information Retrieval: An Italian Healthcare Use Case
arXiv:2512.04834v1 Announce Type: cross Abstract: Large Language Models (LLMs) have become a key topic in...
ArcMemo: Abstract Reasoning Composition with Lifelong LLM Memory
arXiv:2509.04439v1 Announce Type: cross Abstract: While inference-time scaling enables LLMs to carry out increasingly long...
Architectural Fusion Through Contextual Partitioning in Large Language Models: A Novel Approach to Parameterized Knowledge Integration
arXiv:2501.12901v2 Announce Type: replace Abstract: Contextual Partitioning introduces an innovative approach to enhancing the architectural...
Apple Researchers Reveal Structural Failures in Large Reasoning Models Using Puzzle-Based Evaluation
Artificial intelligence has undergone a significant transition from basic language models to advanced models that...
Apple makes major AI advance with image generation technology rivaling DALL-E and Midjourney
Apple researchers develop STARFlow, a breakthrough AI image generation system that challenges diffusion models used...
Apple and Duke Researchers Present a Reinforcement Learning Approach That Enables LLMs to Provide Intermediate Answers, Enhancing Speed and Accuracy
Long CoT reasoning improves large language models’ performance on complex tasks but comes with drawbacks...
Antidistillation Sampling
arXiv:2504.13146v4 Announce Type: replace-cross Abstract: Frontier models that generate extended reasoning traces inadvertently produce rich...
Anthropic’s New Research Shows Claude can Detect Injected Concepts, but only in Controlled Layers
How do you tell whether a model is actually noticing its own internal state instead...
Anthropic study: Leading AI models show up to 96% blackmail rate against executives
Anthropic research reveals AI models from OpenAI, Google, Meta and others chose blackmail, corporate espionage...



