---
id: 20260515-T0-11
title: "可验证过程监督：让LLM既答对又推理好"
title_en: "Verifiable Process Supervision: LLMs Answer Correctly + Reason Well"
url: https://ai.daily.yangsir.net/daily/20260515-T0-11
issue_date: 2026-05-15
publish_date: 2026-05-14T04:00:00.000Z
category: research
source_name: "arXiv cs.CL (NLP)"
source_url: https://arxiv.org/abs/2605.12519
---

# Verifiable Process Supervision: LLMs That Answer Correctly and Reason Well

This arXiv paper proposes a supervision method that checks not only whether the final answer is correct but also whether the reasoning behind it is sound. Conventional outcome-based training rewards the result and ignores the path taken to reach it; the proposed method uses verifiable reinforcement learning to supervise the reasoning process as well. The paper reports strong performance on math and coding tasks.
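To make the contrast between outcome-only and process-aware rewards concrete, here is a minimal Python sketch. It is not the paper's method: the step format (`"expression = value"`), the arithmetic checker, the function names `step_is_valid` and `combined_reward`, and the `outcome_weight` parameter are all assumptions chosen for illustration.

```python
import ast
import operator

# Hypothetical toy setup: each reasoning step is a string "expression = value".
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _eval(node):
    """Evaluate a tiny arithmetic AST (non-negative integers with +, -, *)."""
    if isinstance(node, ast.Expression):
        return _eval(node.body)
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    raise ValueError("unsupported expression")


def step_is_valid(step):
    """A step counts as verified if its left side evaluates to its right side."""
    try:
        lhs, rhs = step.split("=")
        return _eval(ast.parse(lhs.strip(), mode="eval")) == int(rhs.strip())
    except (ValueError, SyntaxError):
        # Malformed or unsupported steps are treated as unverified.
        return False


def combined_reward(steps, final_answer, reference, outcome_weight=0.5):
    """Blend an outcome reward with the fraction of steps that verify."""
    outcome = 1.0 if final_answer == reference else 0.0
    process = sum(step_is_valid(s) for s in steps) / len(steps) if steps else 0.0
    return outcome_weight * outcome + (1.0 - outcome_weight) * process


sound = ["2 + 3 = 5", "5 * 4 = 20"]
flawed = ["2 + 3 = 6", "5 * 4 = 20"]  # right final answer, one unsound step
print(combined_reward(sound, 20, 20))
print(combined_reward(flawed, 20, 20))
```

Under an outcome-only reward both traces would score the same, since both end at the reference answer. With the blended reward in this sketch, the trace containing an unverifiable step scores lower, which is the kind of distinction the summary attributes to process supervision.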

---

**Source**: [arXiv cs.CL (NLP)](https://arxiv.org/abs/2605.12519)

**Details page**: https://ai.daily.yangsir.net/daily/20260515-T0-11

---

*智语观潮 · Daily — https://ai.daily.yangsir.net/llms.txt*