---
id: 20260313-T0-22
title: "AraModernBERT：采用跨token初始化的阿拉伯语长上下文编码器"
title_en: "AraModernBERT: Arabic Long-Context Encoder with Cross-Token Initialization"
url: https://ai.daily.yangsir.net/daily/20260313-T0-22
issue_date: 2026-03-13
publish_date: 2026-03-12T04:00:00.000Z
source_name: "arXiv cs.CL (NLP)"
source_url: https://arxiv.org/abs/2603.09982
---

# AraModernBERT: Arabic Long-Context Encoder with Cross-Token Initialization

Researchers introduced AraModernBERT, which adapts the ModernBERT encoder architecture to Arabic using cross-token initialization and long-context modeling techniques. The model is optimized for Arabic linguistic features and outperforms baseline models on multiple NLP tasks.

---

**Source**: [arXiv cs.CL (NLP)](https://arxiv.org/abs/2603.09982)

**Details page**: https://ai.daily.yangsir.net/daily/20260313-T0-22

---

*智语观潮 · Daily — https://ai.daily.yangsir.net/llms.txt*