---
id: 20260522-T0-16
title: "GROW：开放世界VLM代理的状态-动作对齐方法"
title_en: "GROW: Aligns VLM Agents with State-Action Modeling"
url: https://ai.daily.yangsir.net/daily/20260522-T0-16
issue_date: 2026-05-22
publish_date: 2026-05-21T04:00:00.000Z
category: research
source_name: "arXiv cs.LG (ML)"
source_url: https://arxiv.org/abs/2605.20246
---

# GROW: A State-Action Alignment Method for Open-World VLM Agents

The arXiv paper GROW proposes a new alignment method for open-world vision-language model (VLM) agents. Existing VLM agents perform poorly on tasks that require multi-turn visual perception and action execution, and they rely mainly on reward-function optimization. GROW augments the GRPO algorithm with state-action modeling, so that the agent selects the optimal action based on the current environment state. In robot navigation tests, the task completion rate improved by 25%. The method is particularly suited to real-world scenarios that require dynamic strategy adjustment.
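
For readers unfamiliar with GRPO, the sketch below illustrates the group-relative advantage step that standard GRPO uses: several rollouts are sampled for the same task, and each rollout's reward is normalized against the mean and standard deviation of its group. This shows only the baseline algorithm, not GROW's state-action modeling, which the summary above does not describe in detail. The function name, the `eps` constant, the use of population standard deviation, and the reward values are all illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own sampling group.

    Assumption: population standard deviation is used; some
    implementations use the sample standard deviation instead.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every rollout gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four rollouts sampled from the same task.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Rollouts that score above the group mean receive positive advantages and those below receive negative ones, so no separate learned value function is needed to form the baseline.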

---

**Source**: [arXiv cs.LG (ML)](https://arxiv.org/abs/2605.20246)

**Details page**: https://ai.daily.yangsir.net/daily/20260522-T0-16

---

*智语观潮 · Daily — https://ai.daily.yangsir.net/llms.txt*