News
Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers
arXiv:2507.10787v1 Announce Type: new Abstract: This paper introduces MISS-QA, the first benchmark specifically designed to...
Can LLMs Really Judge with Reasoning? Microsoft and Tsinghua Researchers Introduce Reward Reasoning Models to Dynamically Scale Test-Time Compute for Better Alignment
Reinforcement learning (RL) has emerged as a fundamental approach in LLM post-training, utilizing supervision signals...
Can LLMs Generate Reliable Test Case Generators? A Study on Competition-Level Programming Problems
arXiv:2506.06821v3 Announce Type: replace Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in code...
Can Large Language Models Express Uncertainty Like Human?
arXiv:2509.24202v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used in high-stakes settings...
Can crowdsourced fact-checking curb misinformation on social media?
In a 2019 speech at Georgetown University, Mark Zuckerberg famously declared that he didn’t want...
Can a Small Language Model Predict Kernel Latency, Memory, and Model Accuracy from Code? A New Regression Language Model (RLM) Says Yes
Researchers from Cornell and Google introduce a unified Regression Language Model (RLM) that predicts numeric...
Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance
arXiv:2504.19811v2 Announce Type: replace Abstract: Accurately forecasting the performance of Large Language Models (LLMs) before...
Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding
arXiv:2505.03788v1 Announce Type: new Abstract: We introduce a novel approach for calibrating uncertainty quantification (UQ)...
C-VARC: A Large-Scale Chinese Value Rule Corpus for Value Alignment of Large Language Models
arXiv:2506.01495v5 Announce Type: replace Abstract: Ensuring that Large Language Models (LLMs) align with mainstream human...
ByteDance Researchers Introduce DetailFlow: A 1D Coarse-to-Fine Autoregressive Framework for Faster, Token-Efficient Image Generation
Autoregressive image generation has been shaped by advances in sequential modeling, originally seen in natural...
By putting AI into everything, Google wants to make it invisible
If you want to know where AI is headed, this year’s Google I/O has you...
Busted by the em dash — AI’s favorite punctuation mark, and how it’s blowing your cover
AI is brilliant at polishing and rephrasing. But like a child with glitter glue, you...




