---
id: 20260318-T0-06
title: "智能编程评估中的基础设施噪音量化研究"
title_en: "Study Quantifies Infrastructure Noise in AI Coding Evaluations"
url: https://ai.daily.yangsir.net/daily/20260318-T0-06
issue_date: 2026-03-18
publish_date: 2026-03-17T18:26:38.000Z
category: research
source_name: "Anthropic Engineering"
source_url: https://www.anthropic.com/engineering/infrastructure-noise
---

# Study Quantifies Infrastructure Noise in AI Coding Evaluations

Anthropic's research team has quantified, for the first time, how infrastructure noise affects AI coding evaluations. The study finds that differences in the test environment can shift evaluation results significantly, which complicates both model comparisons and performance optimization. The researchers recommend standardizing test environments to make evaluations more accurate and fair.

---

**Source**: [Anthropic Engineering](https://www.anthropic.com/engineering/infrastructure-noise)

**Details page**: https://ai.daily.yangsir.net/daily/20260318-T0-06

---

*智语观潮 · Daily — https://ai.daily.yangsir.net/llms.txt*