---
id: 20260321-T0-23
title: "FaithSteer-BENCH：新基准测试评估LLM行为控制可靠性"
title_en: "FaithSteer-BENCH: New Benchmark for LLM Steering Reliability"
url: https://ai.daily.yangsir.net/daily/20260321-T0-23
issue_date: 2026-03-21
publish_date: 2026-03-20T04:00:00.000Z
category: research
source_name: "arXiv cs.AI"
source_url: https://arxiv.org/abs/2603.18329
---

# FaithSteer-BENCH: New Benchmark for LLM Steering Reliability

Researchers have released FaithSteer-BENCH, a benchmark that evaluates how reliable inference-time steering techniques are in LLMs. It targets activation-level interventions used for behavior control and is presented as the first deployment-aligned stress-testing standard for such methods, giving AI safety work a new evaluation tool.
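
For readers unfamiliar with the term, an activation-level intervention is commonly implemented by adding a scaled direction vector to a model's hidden activations at inference time. The sketch below illustrates that general idea on a toy vector. It is not taken from FaithSteer-BENCH, and `steer`, `readout`, and every numeric value are hypothetical names and numbers chosen only for illustration.

```python
from typing import List

Vector = List[float]


def steer(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Add alpha * direction to a hidden activation vector (hypothetical helper)."""
    if len(hidden) != len(direction):
        raise ValueError("hidden and direction must have the same length")
    return [h + alpha * d for h, d in zip(hidden, direction)]


def readout(hidden: Vector, weights: Vector) -> float:
    """Toy linear readout standing in for the rest of the model (assumption)."""
    return sum(h * w for h, w in zip(hidden, weights))


# Illustrative values only; a real model would supply the activations.
hidden = [0.2, -0.5, 1.0]
direction = [1.0, 0.0, -1.0]
weights = [0.5, 0.5, -0.5]

baseline = readout(hidden, weights)
steered = readout(steer(hidden, direction, alpha=2.0), weights)
print(baseline, steered)
```

With `alpha=0.0` the activations are unchanged, so the strength parameter gives a simple dial between the unsteered and steered behavior. How reliably that dial behaves under realistic conditions is the kind of question a steering benchmark is meant to probe.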

---

**Source**: [arXiv cs.AI](https://arxiv.org/abs/2603.18329)

**Details page**: https://ai.daily.yangsir.net/daily/20260321-T0-23

---

*智语观潮 · Daily — https://ai.daily.yangsir.net/llms.txt*