---
id: 20260319-T0-18
title: "引导冻结LLM：动态对齐新方法"
title_en: "Steering Frozen LLMs: Dynamic Social Alignment"
url: https://ai.daily.yangsir.net/daily/20260319-T0-18
issue_date: 2026-03-19
publish_date: 2026-03-18T04:00:00.000Z
category: research
source_name: "arXiv cs.LG (ML)"
source_url: https://arxiv.org/abs/2603.15647
---

# Steering Frozen LLMs: A New Method for Dynamic Alignment

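The University of Washington proposes an 'online prompt routing' technique that enables dynamic social alignment of frozen LLMs. Traditional methods stay static after deployment and cannot adapt to new scenarios. The new method adjusts the prompt strategy in real time, improving safety while preserving model performance. Experiments show a 45% reduction in harmful outputs with no impact on model performance. AI safety researchers can use this technique to improve the safety of existing models.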

## English Version

**Steering Frozen LLMs: Dynamic Social Alignment**

University of Washington researchers propose 'online prompt routing', a technique for dynamic social alignment of frozen LLMs. Conventional alignment methods stay static after deployment and cannot adapt to new scenarios. The new technique adjusts the prompt strategy in real time rather than updating model weights, so safety improves without retraining. Experiments show a 45% reduction in harmful outputs with no loss in model performance. AI safety researchers can apply it to improve the safety of models that are already deployed.
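
To make the idea of routing prompts online more concrete, here is a minimal sketch. It is not the paper's algorithm: the summary does not say how routing decisions are made, so this assumes an epsilon-greedy bandit that picks among a few candidate system prompts and learns from a per-response safety score. The class `PromptRouter`, its methods, and the reward signal are all hypothetical names introduced for illustration. The frozen model itself never appears in the code, because only the prompt choice adapts.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PromptRouter:
    """Hypothetical epsilon-greedy router over candidate system prompts.

    Assumption: each response receives a scalar reward in [0, 1]
    (for example 1.0 for a safe answer, 0.0 for a harmful one).
    """

    prompts: list
    epsilon: float = 0.1
    rng: random.Random = field(default_factory=lambda: random.Random(0))
    counts: list = field(init=False)
    values: list = field(init=False)

    def __post_init__(self) -> None:
        # One pull counter and one running mean reward per prompt.
        self.counts = [0] * len(self.prompts)
        self.values = [0.0] * len(self.prompts)

    def select(self) -> int:
        """Explore a random prompt with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.prompts))
        return max(range(len(self.prompts)), key=lambda i: self.values[i])

    def update(self, arm: int, reward: float) -> None:
        """Fold the observed reward into the running mean for that prompt."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

    def build_input(self, arm: int, user_message: str) -> str:
        """Prepend the chosen system prompt; the model weights stay untouched."""
        return f"{self.prompts[arm]}\n\nUser: {user_message}"
```

In use, a serving loop would call `select()`, send `build_input(...)` to the frozen model, score the response with whatever safety signal is available, and pass that score to `update()`. The bandit is a stand-in chosen for brevity; any online learner that maps feedback to a prompt choice would fit the same shape.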

---

**Source**: [arXiv cs.LG (ML)](https://arxiv.org/abs/2603.15647)

**Details page**: https://ai.daily.yangsir.net/daily/20260319-T0-18

---

*智语观潮 · Daily — https://ai.daily.yangsir.net/llms.txt*