News

July 16, 2025 | admin | NUAI, Committee, News, Uncategorized

Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers

arXiv:2507.10787v1 Announce Type: new Abstract: This paper introduces MISS-QA, the first benchmark specifically designed to...

May 27, 2025 | admin | NUAI, Committee, News, Uncategorized

Can LLMs Really Judge with Reasoning? Microsoft and Tsinghua Researchers Introduce Reward Reasoning Models to Dynamically Scale Test-Time Compute for Better Alignment

Reinforcement learning (RL) has emerged as a fundamental approach in LLM post-training, utilizing supervision signals...

July 23, 2025 | admin | NUAI, Committee, News, Uncategorized

Can LLMs Generate Reliable Test Case Generators? A Study on Competition-Level Programming Problems

arXiv:2506.06821v3 Announce Type: replace Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in code...

September 30, 2025 | admin | NUAI, Committee, News, Uncategorized

Can Large Language Models Express Uncertainty Like Human?

arXiv:2509.24202v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used in high-stakes settings...

May 19, 2025 | admin | NUAI, Committee, News, Uncategorized

Can crowdsourced fact-checking curb misinformation on social media?

In a 2019 speech at Georgetown University, Mark Zuckerberg famously declared that he didn’t want...

October 4, 2025 | admin | NUAI, Committee, News, Uncategorized

Can a Small Language Model Predict Kernel Latency, Memory, and Model Accuracy from Code? A New Regression Language Model (RLM) Says Yes

Researchers from Cornell and Google introduce a unified Regression Language Model (RLM) that predicts numeric...

August 11, 2025 | admin | NUAI, Committee, News, Uncategorized

Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance

arXiv:2504.19811v2 Announce Type: replace Abstract: Accurately forecasting the performance of Large Language Models (LLMs) before...

May 8, 2025 | admin | NUAI, Committee, News, Uncategorized

Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding

arXiv:2505.03788v1 Announce Type: new Abstract: We introduce a novel approach for calibrating uncertainty quantification (UQ)...

January 5, 2026 | admin | NUAI, Committee, News, Uncategorized

C-VARC: A Large-Scale Chinese Value Rule Corpus for Value Alignment of Large Language Models

arXiv:2506.01495v5 Announce Type: replace Abstract: Ensuring that Large Language Models (LLMs) align with mainstream human...

June 8, 2025 | admin | NUAI, Committee, News, Uncategorized

ByteDance Researchers Introduce DetailFlow: A 1D Coarse-to-Fine Autoregressive Framework for Faster, Token-Efficient Image Generation

Autoregressive image generation has been shaped by advances in sequential modeling, originally seen in natural...

May 21, 2025 | admin | NUAI, Committee, News, Uncategorized

By putting AI into everything, Google wants to make it invisible

If you want to know where AI is headed, this year’s Google I/O has you...

August 24, 2025 | admin | NUAI, Committee, News, Uncategorized

Busted by the em dash — AI’s favorite punctuation mark, and how it’s blowing your cover

AI is brilliant at polishing and rephrasing. But like a child with glitter glue, you...

