News
NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining
arXiv:2507.14119v1 (cross-list). Abstract: Recent advances in generative modeling enable image editing assistants that...
New embedding model leaderboard shakeup: Google takes #1 while Alibaba’s open source alternative closes gap
Google’s new Gemini Embedding model now leads the MTEB benchmark. But it is facing fierce...
New AI Method From Meta and NYU Boosts LLM Alignment Using Semi-Online Reinforcement Learning
Optimizing LLMs for Human Alignment Using Reinforcement Learning: Large language models often require a further...
NeuralOS: A Generative Framework for Simulating Interactive Operating System Interfaces
Transforming Human-Computer Interaction with Generative Interfaces: Recent advances in generative models are transforming the way...
Neural at ArchEHR-QA 2025: Agentic Prompt Optimization for Evidence-Grounded Clinical Question Answering
arXiv:2506.10751v1 (cross-list). Abstract: Automated question answering (QA) over electronic health records (EHRs) can...
Natural language processing for African languages
arXiv:2507.00297v1 (new). Abstract: Recent advances in word embeddings and language models use large-scale...
National University of Singapore Researchers Introduce Dimple: A Discrete Diffusion Multimodal Language Model for Efficient and Controllable Text Generation
In recent months, there has been growing interest in applying diffusion models—originally designed for continuous...
Narrowing the Gap: Supervised Fine-Tuning of Open-Source LLMs as a Viable Alternative to Proprietary Models for Pedagogical Tools
arXiv:2507.05305v1 (cross-list). Abstract: Frontier large language models (LLMs) like ChatGPT and Gemini can...
Multimodal LLMs Without Compromise: Researchers from UCLA, UW–Madison, and Adobe Introduce X-Fusion to Add Vision to Frozen Language Models Without Losing Language Capabilities
LLMs have made significant strides in language-related tasks such as conversational AI, reasoning, and code...
Multimodal Foundation Models Fall Short on Physical Reasoning: PHYX Benchmark Highlights Key Limitations in Visual and Symbolic Integration
State-of-the-art models show human-competitive accuracy on AIME, GPQA, MATH-500, and OlympiadBench, solving Olympiad-level problems. Recent...
Multimodal Assessment of Classroom Discourse Quality: A Text-Centered Attention-Based Multi-Task Learning Approach
arXiv:2505.07902v1 (cross-list). Abstract: Classroom discourse is an essential vehicle through which teaching and...
Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You
arXiv:2401.16092v4 (replacement). Abstract: Text-to-image generation models have recently achieved astonishing results in image...