YouZum

News

AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs

arXiv:2506.14562v3 (announce type: replace). Abstract: Weight decay is a standard regularization technique for training large...

ALOPE: Adaptive Layer Optimization for Translation Quality Estimation using Large Language Models

arXiv:2508.07484v1 (announce type: new). Abstract: Large Language Models (LLMs) have shown remarkable performance across a...

ALIGNS: Unlocking nomological networks in psychological measurement through a large language model

arXiv:2509.09723v1 (announce type: new). Abstract: Psychological measurement is critical to many disciplines. Despite advances in...

Alibaba’s ‘ZeroSearch’ lets AI learn to google itself — slashing training costs by 88 percent

Alibaba’s ZeroSearch trains large language models to beat Google Search and slash API costs by...

Alibaba Tongyi Lab Releases MAI-UI: A Foundation GUI Agent Family that Surpasses Gemini 2.5 Pro, Seed1.8 and UI-Tars-2 on AndroidWorld

Alibaba Tongyi Lab has released MAI-UI, a family of foundation GUI agents. It natively integrates MCP...

Alibaba Qwen Unveils Qwen3-4B-Instruct-2507 and Qwen3-4B-Thinking-2507: Refreshing the Importance of Small Language Models

Smaller Models with Smarter Performance and 256K Context Support. Alibaba’s Qwen team has introduced two...

Alibaba Qwen Team Releases Qwen3-Embedding and Qwen3-Reranker Series – Redefining Multilingual Embedding and Ranking Standards

Text embedding and reranking are foundational to modern information retrieval systems, powering applications such as...

Alibaba Qwen Team Releases Qwen-VLo: A Unified Multimodal Understanding and Generation Model

The Alibaba Qwen team has introduced Qwen-VLo, a new addition to its Qwen model family...

Alibaba AI Unveils Qwen3-Max Preview: A Trillion-Parameter Qwen Model with Super Fast Speed and Quality

Alibaba’s Qwen Team unveiled Qwen3-Max-Preview (Instruct), a new flagship large language model with over one...

ALARB: An Arabic Legal Argument Reasoning Benchmark

arXiv:2510.00694v1 (announce type: new). Abstract: We introduce ALARB, a dataset and suite of tasks designed...

AIG Essay #7: Co-conspiratorial Optimization

The Anatomy of Language Models’ Obsession with Questions
