---
id: 20260425-T0-02
title: "FairyFuse: Multiplication-Free LLM Inference on CPUs with Ternary Kernels"
title_en: "FairyFuse: No-Multiplication LLM Inference on CPUs"
url: https://ai.daily.yangsir.net/daily/20260425-T0-02
issue_date: 2026-04-25
publish_date: 2026-04-24T04:00:00.000Z
category: research
source_name: "arXiv cs.LG (ML)"
source_url: https://arxiv.org/abs/2604.20913
---

# FairyFuse: Multiplication-Free LLM Inference on CPUs with Ternary Kernels

An MIT team introduces FairyFuse, a framework that uses fused ternary kernels to run 4-bit quantized LLM inference on CPUs. The approach avoids multiplication operations entirely and cuts memory bandwidth requirements by 40%, which makes real-time LLM inference feasible on standard CPUs. In the reported experiments, FairyFuse runs 2.3x faster than existing systems while keeping accuracy loss below 1%.
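
To make the "no multiplication" idea concrete, here is a minimal Python sketch of a matrix-vector product with ternary weights. It is not FairyFuse's kernel: the function name `ternary_matvec`, the plain list-of-lists layout, and the absence of any scaling factor, weight packing, or kernel fusion are all assumptions made for illustration. It only shows why weights restricted to {-1, 0, +1} let each dot product reduce to additions and subtractions.

```python
from typing import List, Sequence


def ternary_matvec(weights: Sequence[Sequence[int]], x: Sequence[float]) -> List[float]:
    """Compute y = W @ x for W with entries in {-1, 0, +1}.

    Hypothetical illustration only: each output element is accumulated
    with additions and subtractions, never a multiplication.
    """
    out: List[float] = []
    for row in weights:
        if len(row) != len(x):
            raise ValueError("row length must match input length")
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi      # +1 weight: add the activation
            elif w == -1:
                acc -= xi      # -1 weight: subtract the activation
            elif w != 0:
                raise ValueError(f"non-ternary weight: {w}")
            # 0 weight: the activation is skipped entirely
        out.append(acc)
    return out


def reference_matvec(weights: Sequence[Sequence[int]], x: Sequence[float]) -> List[float]:
    """Ordinary multiply-accumulate version, used only to check the result."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


if __name__ == "__main__":
    W = [[1, 0, -1, 1], [-1, -1, 0, 1]]
    x = [0.5, 2.0, -1.0, 3.0]
    print(ternary_matvec(W, x))  # [4.5, 0.5]
```

A production CPU kernel would presumably operate on packed weights with SIMD instructions rather than a Python loop; the sketch only captures the arithmetic.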

---

**Source**: [arXiv cs.LG (ML)](https://arxiv.org/abs/2604.20913)

**Details page**: https://ai.daily.yangsir.net/daily/20260425-T0-02

---

*智语观潮 · Daily — https://ai.daily.yangsir.net/llms.txt*