---
id: 20260311-T0-14
title: "大语言模型注意力汇聚现象可解释性研究"
title_en: "Attention Concentration Explained in LLMs"
url: https://ai.daily.yangsir.net/daily/20260311-T0-14
issue_date: 2026-03-11
publish_date: 2026-03-10T04:00:00.000Z
source_name: "arXiv cs.LG (ML)"
source_url: https://arxiv.org/abs/2603.06591
---

# Attention Concentration Explained in LLMs

An arXiv paper analyzes attention concentration in large language models from an interpretability perspective. The study finds that models tend to concentrate attention excessively on specific tokens, although in some cases this concentration improves task performance. In experiments, attention concentration raised F1 scores by 15% on question-answering tasks, while in creative writing it may limit output diversity. The researchers propose dynamically adjusting attention distribution weights to reduce bias while preserving task performance. In tests on the GLM-4 model, the method reduced harmful outputs by 40%.
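This summary does not give the paper's formulas, so the following is only an illustrative sketch of the two ideas it mentions: measuring how concentrated an attention distribution is, and redistributing weight when one token dominates. Temperature scaling is used here as a stand-in technique, not as the authors' method. The function names (`concentration`, `soften`), the `max_share` threshold, and the example scores are all hypothetical.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw attention scores into a probability distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def concentration(weights):
    """Return (largest single weight, normalized entropy in [0, 1]).

    A large top weight and a low entropy both indicate that attention
    is concentrated on very few tokens.
    """
    entropy = -sum(w * math.log(w) for w in weights if w > 0)
    max_entropy = math.log(len(weights))
    return max(weights), entropy / max_entropy

def soften(scores, max_share=0.5, step=0.25, max_temperature=8.0):
    """Hypothetical adjustment (an assumption, not the paper's method):
    raise the softmax temperature until no single token receives more
    than `max_share` of the attention, or the temperature cap is hit."""
    temperature = 1.0
    weights = softmax(scores, temperature)
    while max(weights) > max_share and temperature < max_temperature:
        temperature += step
        weights = softmax(scores, temperature)
    return weights, temperature

# Made-up scores for five tokens; token 0 dominates.
scores = [6.0, 1.0, 0.5, 0.2, 0.1]
before = softmax(scores)
after, temperature = soften(scores)

print("before:", concentration(before))
print("after: ", concentration(after), "temperature:", temperature)
```

Raising the temperature flattens the distribution without changing which token ranks highest, which is one simple way to trade concentration for diversity. A real intervention inside a model would act per head and per layer and would need to be evaluated against task metrics such as F1.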

---

**Source**: [arXiv cs.LG (ML)](https://arxiv.org/abs/2603.06591)

**Details page**: https://ai.daily.yangsir.net/daily/20260311-T0-14

---

*智语观潮 · Daily — https://ai.daily.yangsir.net/llms.txt*