News
Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs
arXiv:2505.11277v3 Announce Type: replace Abstract: Large language models have demonstrated impressive reasoning capabilities but are...
Scientists’ First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning
arXiv:2506.10521v2 Announce Type: replace-cross Abstract: Scientific discoveries increasingly rely on complex multimodal reasoning based on...
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
arXiv:2505.13227v2 Announce Type: replace-cross Abstract: Graphical user interface (GUI) grounding, the ability to map natural...
ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing
arXiv:2506.19848v1 Announce Type: cross Abstract: This paper presents ScaleCap, an inference-time scalable image captioning strategy...
SAS: Simulated Attention Score
arXiv:2507.07694v1 Announce Type: new Abstract: The attention mechanism is a core component of the Transformer...
Samsung Researchers Introduced ANSE (Active Noise Selection for Generation): A Model-Aware Framework for Improving Text-to-Video Diffusion Models through Attention-Based Uncertainty Estimation
Video generation models have become a core technology for creating dynamic content by transforming text...
Sampling Without Data is Now Scalable: Meta AI Releases Adjoint Sampling for Reward-Driven Generative Modeling
Data Scarcity in Generative Modeling Generative models traditionally rely on large, high-quality datasets to produce...
Sam Altman calls for ‘AI privilege’ as OpenAI clarifies court order to retain temporary and deleted ChatGPT sessions
Should talking to an AI chatbot be protected and privileged information, like talking to a...
Salesforce AI Releases BLIP3-o: A Fully Open-Source Unified Multimodal Model Built with CLIP Embeddings and Flow Matching for Image Understanding and Generation
Multimodal modeling focuses on building systems to understand and generate content across visual and textual...
Salesforce AI Introduces CRMArena-Pro: The First Multi-Turn and Enterprise-Grade Benchmark for LLM Agents
AI agents powered by LLMs show great promise for handling complex business tasks, especially in...
Sakana AI Introduces Text-to-LoRA (T2L): A Hypernetwork that Generates Task-Specific LLM Adapters (LoRAs) based on a Text Description of the Task
Transformer models have significantly influenced how AI systems approach tasks in natural language understanding, translation...
s3: The new RAG framework that trains search agents with minimal data
s3 decouples RAG search from generation, boosting efficiency and generalization for enterprise LLM applications with...