News
T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation
arXiv:2501.12612v3 Announce Type: replace Abstract: Text-to-image (T2I) models have rapidly advanced, enabling the generation of...
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
arXiv:2505.00703v2 Announce Type: replace-cross Abstract: Recent advancements in large language models have demonstrated how chain-of-thought...
System Report for CCL25-Eval Task 10: SRAG-MAV for Fine-Grained Chinese Hate Speech Recognition
arXiv:2507.18580v1 Announce Type: new Abstract: This paper presents our system for CCL25-Eval Task 10, addressing...
SynPref-40M and Skywork-Reward-V2: Scalable Human-AI Alignment for State-of-the-Art Reward Models
Understanding Limitations of Current Reward Models Although reward models play a crucial role in Reinforcement...
SynapseRoute: An Auto-Route Switching Framework on Dual-State Large Language Model
arXiv:2507.02822v1 Announce Type: new Abstract: With the widespread adoption of large language models (LLMs) in...
SWE-Bench Performance Reaches 50.8% Without Tool Use: A Case for Monolithic State-in-Context Agents
Recent advancements in LM agents have shown promising potential for automating intricate real-world tasks. These...
Surprise Calibration for Better In-Context Learning
arXiv:2506.12796v1 Announce Type: new Abstract: In-context learning (ICL) has emerged as a powerful paradigm for...
Superintelligence: Unlocking the Mysteries of the Future
Imagine a future where machines don’t just outperform humans in specific tasks but fundamentally outthink...
Stop guessing why your LLMs break: Anthropic’s new tool shows you exactly what goes wrong
Anthropic’s open-source circuit tracing tool can help developers debug, optimize, and control AI for reliable...
StepFun AI Releases Step-Audio 2 Mini: An Open-Source 8B Speech-to-Speech AI Model that Surpasses GPT-4o-Audio
The StepFun AI team has released Step-Audio 2 Mini, an 8B parameter speech-to-speech large audio...
SRA-MCTS: Self-driven Reasoning Augmentation with Monte Carlo Tree Search for Code Generation
arXiv:2411.11053v5 Announce Type: replace Abstract: Large language models demonstrate exceptional performance in simple code generation...
SQLong: Enhanced NL2SQL for Longer Contexts with LLMs
arXiv:2502.16747v2 Announce Type: replace Abstract: Open-weight large language models (LLMs) have significantly advanced performance in...