---
id: 20260304-T0-18
title: "CARE方法解决LLM评估偏见"
title_en: "CARE Method Addresses LLM Evaluation Bias"
url: https://ai.daily.yangsir.net/daily/20260304-T0-18
issue_date: 2026-03-04
publish_date: 2026-03-03T05:00:00.000Z
source_name: "arXiv cs.LG (ML)"
source_url: https://arxiv.org/abs/2603.00039
---

# CARE Method Addresses LLM Evaluation Bias

The CARE method improves the reliability of LLM evaluation through confounder-aware aggregation. Conventional LLM-as-a-judge ensembles assume that the individual judges' evaluations are independent, but in practice the judges share systematic biases. CARE introduces a mechanism for detecting confounding variables and dynamically adjusts the aggregation weights accordingly. On the HELM benchmark, it reportedly improves evaluation consistency by 28% and filters out the influence of model style preferences on quality scores.
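To make the idea of confounder-aware weighting concrete, here is a minimal sketch. It is not the algorithm from the paper, whose details are not given in this summary. It assumes a single observed confounder proxy (response length, chosen purely for illustration) and a simple heuristic: the more a judge's scores correlate with the proxy, the less weight that judge receives. The function names `pearson`, `confounder_aware_weights`, and `aggregate`, as well as the sample data, are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def confounder_aware_weights(scores_by_judge, confounder):
    """Assumed heuristic: weight each judge by 1 - |corr(scores, confounder)|, normalized."""
    raw = [1.0 - abs(pearson(scores, confounder)) for scores in scores_by_judge]
    total = sum(raw)
    if total == 0:
        # Every judge is perfectly confounded; fall back to uniform weights.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def aggregate(scores_by_judge, weights):
    """Weighted average of the judges' scores for each evaluated item."""
    n_items = len(scores_by_judge[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, scores_by_judge))
        for i in range(n_items)
    ]


# Hypothetical data: four responses scored by three judges.
response_length = [100, 400, 200, 300]  # assumed confounder proxy (style / verbosity)
judge_scores = [
    [1, 4, 2, 3],  # judge A: scores track length exactly
    [3, 2, 4, 1],  # judge B: partly (negatively) correlated with length
    [4, 4, 2, 2],  # judge C: uncorrelated with length
]

weights = confounder_aware_weights(judge_scores, response_length)
final_scores = aggregate(judge_scores, weights)
print("weights:", weights)
print("aggregated scores:", final_scores)
```

In this toy setup the length-tracking judge is down-weighted to zero and the uncorrelated judge dominates the aggregate. A real confounder-aware method would need to handle confounders that are latent rather than directly observed, which this sketch does not attempt.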

---

**Source**: [arXiv cs.LG (ML)](https://arxiv.org/abs/2603.00039)

**Details page**: https://ai.daily.yangsir.net/daily/20260304-T0-18

---

*智语观潮 · Daily — https://ai.daily.yangsir.net/llms.txt*