---
id: 20260407-T0-16
title: "研究：WebGPU在LLM推理中存在显著调度开销"
title_en: "WebGPU Shows Significant Dispatch Overhead in LLM Inference"
url: https://ai.daily.yangsir.net/daily/20260407-T0-16
issue_date: 2026-04-07
publish_date: 2026-04-06T04:00:00.000Z
category: research
source_name: "arXiv cs.LG (ML)"
source_url: https://arxiv.org/abs/2604.02344
---

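# Study: WebGPU Shows Significant Dispatch Overhead in LLM Inference

An arXiv study systematically analyzes the dispatch overhead of WebGPU in LLM inference. WebGPU's security design requires every operation to be validated, and across the large number of small dispatches in neural network inference this adds up to a significant performance loss. The study tested four GPU vendors, three backends, and three browsers, and found WebGPU's overhead to be 2-5x higher than that of traditional approaches. This limits its prospects in high-performance LLM inference and calls for optimized dispatch strategies.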

## English Version

**WebGPU Shows Significant Dispatch Overhead in LLM Inference**

An arXiv study systematically analyzes WebGPU's dispatch overhead in LLM inference. WebGPU's security-focused design requires validating every operation, and that cost accumulates into a significant performance penalty across the many small dispatches that neural network inference issues. Testing across four GPU vendors, three backends, and three browsers, the study finds WebGPU's overhead to be 2-5x higher than that of traditional approaches, which limits its prospects for high-performance LLM inference and points to a need for optimized dispatch strategies.
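
To make the accumulation argument concrete, here is a back-of-the-envelope sketch in Python. It assumes a simple linear model in which every dispatch pays a fixed overhead. The dispatch count per token and the baseline per-dispatch cost are hypothetical placeholders, not figures from the paper; only the 2-5x range comes from the summary.

```python
def overhead_per_token_us(dispatches_per_token: int,
                          per_dispatch_overhead_us: float) -> float:
    """Total dispatch overhead for one generated token, in microseconds.

    Assumes a linear model: each dispatch pays the same fixed overhead.
    """
    return dispatches_per_token * per_dispatch_overhead_us


def webgpu_overhead_range_us(dispatches_per_token: int,
                             baseline_overhead_us: float,
                             low_ratio: float = 2.0,
                             high_ratio: float = 5.0) -> tuple[float, float]:
    """Low and high WebGPU overhead estimates per token, in microseconds.

    The default ratios reflect the 2-5x range reported in the summary.
    """
    baseline = overhead_per_token_us(dispatches_per_token, baseline_overhead_us)
    return baseline * low_ratio, baseline * high_ratio


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the arithmetic.
    dispatches = 1000      # assumed GPU dispatches per generated token
    baseline_us = 10.0     # assumed per-dispatch overhead of a native API

    baseline_total = overhead_per_token_us(dispatches, baseline_us)
    low, high = webgpu_overhead_range_us(dispatches, baseline_us)
    print(f"baseline: {baseline_total / 1000:.1f} ms per token")
    print(f"WebGPU:   {low / 1000:.1f} to {high / 1000:.1f} ms per token")
```

Under these assumed inputs, a per-dispatch cost that looks negligible on its own grows into tens of milliseconds per token, which is why many small dispatches are the problematic case.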

---

**Source**: [arXiv cs.LG (ML)](https://arxiv.org/abs/2604.02344)

**Details page**: https://ai.daily.yangsir.net/daily/20260407-T0-16

---

*智语观潮 · Daily — https://ai.daily.yangsir.net/llms.txt*