---
id: 20260611-T0-13
title: "Early-Token Confidence Predicts Reasoning Quality in Multi-Agent Debate"
title_en: "Early-Token Confidence Predicts Debate Reasoning Quality"
url: https://ai.daily.yangsir.net/daily/20260611-T0-13
issue_date: 2026-06-11
publish_date: 2026-06-10T04:00:00.000Z
category: research
source_name: "arXiv cs.CL (NLP)"
source_url: https://arxiv.org/abs/2606.10307
---

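# Early-Token Confidence Predicts Reasoning Quality in Multi-Agent Debate

A study finds that the reasoning quality of multi-agent LLM debates can be predicted from early-token confidence. To address the lack of ground-truth answers in open-ended tasks, it uses the token-level log-probabilities produced during decoding as an intrinsic confidence signal. Experiments show that high early confidence generally corresponds to high-quality reasoning, which offers a new metric for evaluating complex systems. The code has been open-sourced.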

## English Version

**Early-Token Confidence Predicts Debate Reasoning Quality**

A study finds that early-token confidence can predict reasoning quality in multi-agent LLM debates. Because open-ended tasks lack ground truth, the study uses token-level log-probabilities from decoding as an intrinsic confidence signal. In its experiments, high confidence on early tokens correlates with higher-quality reasoning, which suggests a new metric for evaluating complex systems. The code has been released as open source.
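
As a rough illustration of the idea, the sketch below scores each debate response by the mean log-probability of its first `k` generated tokens and ranks agents by that score. The summary does not specify the paper's exact metric, so the window size `k`, the use of a simple mean, and the function names `early_token_confidence` and `rank_by_early_confidence` are all assumptions made for illustration, and the log-probability values are made-up sample data, not results from the study.

```python
from typing import Sequence


def early_token_confidence(token_logprobs: Sequence[float], k: int = 32) -> float:
    """Mean log-probability of the first k generated tokens.

    Hypothetical metric: the paper's actual definition may differ.
    Values closer to 0 mean the model was more confident early on.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    window = list(token_logprobs[:k])
    if not window:
        raise ValueError("token_logprobs is empty")
    return sum(window) / len(window)


def rank_by_early_confidence(
    responses: dict[str, Sequence[float]], k: int = 32
) -> list[tuple[str, float]]:
    """Sort agents from most to least confident on their first k tokens."""
    scored = [(name, early_token_confidence(lp, k)) for name, lp in responses.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up per-token log-probabilities for two debate agents (illustrative only).
debate = {
    "agent_a": [-0.1, -0.2, -0.3, -2.0],
    "agent_b": [-1.0, -1.2, -0.8, -0.1],
}

ranking = rank_by_early_confidence(debate, k=3)
for name, score in ranking:
    print(f"{name}: mean early log-prob = {score:.3f}")
```

With `k=3`, only the first three tokens of each response count, so `agent_a` ranks first even though its fourth token has a low log-probability. Whether such a ranking tracks reasoning quality is the empirical claim the study reports; the sketch only shows how the signal could be computed from log-probabilities that a decoding API already returns.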

---

**Source**: [arXiv cs.CL (NLP)](https://arxiv.org/abs/2606.10307)

**Details page**: https://ai.daily.yangsir.net/daily/20260611-T0-13

---

*智语观潮 · Daily — https://ai.daily.yangsir.net/llms.txt*