---
id: 20260324-T0-21
title: "LeWorldModel：首个端到端联合嵌入预测架构实现稳定世界模型"
title_en: "LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture"
url: https://ai.daily.yangsir.net/daily/20260324-T0-21
issue_date: 2026-03-24
publish_date: 2026-03-23T04:00:00.000Z
category: research
source_name: "arXiv cs.LG (ML)"
source_url: https://arxiv.org/abs/2603.19312
---

# LeWorldModel: The First End-to-End Joint-Embedding Predictive Architecture to Achieve a Stable World Model

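LeWorldModel, a recent arXiv paper, proposes an end-to-end joint-embedding predictive architecture (JEPA) that learns directly from pixels. The method builds its world model in a compact latent space and addresses the reliance of existing methods on complex multi-term loss functions and pre-trained encoders. According to the paper, end-to-end training markedly improves model stability, offering a new direction for self-supervised visual learning. Developers can use the architecture as a basis for more efficient vision pre-training models.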

## English Version

**LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture**

LeWorldModel introduces an end-to-end joint-embedding predictive architecture (JEPA) that learns world models directly from pixels, in a compact latent space. The work addresses two limitations of existing methods, their reliance on complex multi-term losses and on pre-trained encoders, and reports improved stability from end-to-end training. Developers may be able to use the architecture as a basis for more efficient vision pre-training models.
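To make "predicting in latent space" concrete, here is a minimal sketch of a generic JEPA-style objective. It is not LeWorldModel's actual loss or architecture, which this summary does not describe. The linear encoder and predictor, the sizes `PIXELS` and `LATENT`, and the helper names are all hypothetical, chosen only to keep the example dependency-free.

```python
import random

def linear(weights, vec):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def init_matrix(rows, cols, rng):
    """Small random weights; the shapes are illustrative only."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def latent_prediction_loss(enc_w, pred_w, frame_t, frame_next):
    """Mean squared error between the predicted and the encoded next-frame latent."""
    z_t = linear(enc_w, frame_t)        # encode the current pixels
    z_next = linear(enc_w, frame_next)  # encode the next pixels with the same encoder
    z_pred = linear(pred_w, z_t)        # predict the next latent from the current one
    return sum((p - z) ** 2 for p, z in zip(z_pred, z_next)) / len(z_next)

rng = random.Random(0)
PIXELS, LATENT = 16, 4                  # hypothetical sizes, not taken from the paper
enc_w = init_matrix(LATENT, PIXELS, rng)
pred_w = init_matrix(LATENT, LATENT, rng)
frame_t = [rng.random() for _ in range(PIXELS)]
frame_next = [rng.random() for _ in range(PIXELS)]

loss = latent_prediction_loss(enc_w, pred_w, frame_t, frame_next)
print(loss)
```

The loss compares embeddings rather than reconstructed pixels, which is the defining trait of the JEPA family. A bare objective like this can collapse to a constant embedding when the encoder is trained jointly; how LeWorldModel keeps training stable is not covered in this summary.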

---

**Source**: [arXiv cs.LG (ML)](https://arxiv.org/abs/2603.19312)

**Details page**: https://ai.daily.yangsir.net/daily/20260324-T0-21

---

*智语观潮 · Daily — https://ai.daily.yangsir.net/llms.txt*