News
LLM-as-a-Judge: Can Language Models Be Trusted to Evaluate Other Models?
Exploring the promise, pitfalls, and practical applications of using LLMs to automate AI evaluation — from synthetic...
LLM-as-a-Judge for Reference-less Automatic Code Validation and Refinement for Natural Language to Bash in IT Automation
arXiv:2506.11237v1 Announce Type: cross Abstract: In an effort to automatically evaluate and select the best...
LLM one-shot style transfer for Authorship Attribution and Verification
arXiv:2510.13302v1 Announce Type: new Abstract: Computational stylometry analyzes writing style through quantitative patterns in text...
LLaPa: A Vision-Language Model Framework for Counterfactual-Aware Procedural Planning
arXiv:2507.08496v1 Announce Type: new Abstract: While large language models (LLMs) have advanced procedural planning for...
Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs
arXiv:2505.09338v1 Announce Type: new Abstract: We observe a novel phenomenon, contextual entrainment, across a wide...
Liquid AI’s LFM2-VL-3B Brings a 3B Parameter Vision Language Model (VLM) to Edge-Class Devices
Liquid AI released LFM2-VL-3B, a 3B parameter vision language model for image text to text...
Liquid AI Releases LFM2-ColBERT-350M: A New Small Model that brings Late Interaction Retrieval to Multilingual and Cross-Lingual RAG
Can a compact late interaction retriever index once and deliver accurate cross lingual search with...
Liquid AI Releases LFM2-8B-A1B: An On-Device Mixture-of-Experts with 8.3B Params and a 1.5B Active Params per Token
How much capability can a sparse 8.3B-parameter MoE with a ~1.5B active path deliver on...
Like humans, AI is forcing institutions to rethink their purpose
Like people undergoing cognitive migration, institutions must reassess what they were made for in this...
Let’s Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM’s Math Capability
arXiv:2505.23703v4 Announce Type: replace-cross Abstract: Enhancing the mathematical reasoning capabilities of LLMs has garnered significant...
LegalEval-Q: A New Benchmark for The Quality Evaluation of LLM-Generated Legal Text
arXiv:2505.24826v1 Announce Type: new Abstract: As large language models (LLMs) are increasingly used in legal...
Learning to Interpret Weight Differences in Language Models
arXiv:2510.05092v3 Announce Type: replace-cross Abstract: Finetuning (pretrained) language models is a standard approach for updating...



