Publications

You can also find my articles on my Google Scholar profile.

Conference Papers


Workshop Papers


arXiv Preprints


Beyond Language Modeling: An Exploration of Multimodal Pretraining

Shengbang Tong, David Fan, John Nguyen, Ellis Brown, Gaoyue Zhou, Shengyi Qian, Boyang Zheng, Théophane Vallaeys, Junlin Han, Rob Fergus, Naila Murray, Marjan Ghazvininejad, Mike Lewis, Nicolas Ballas, Amir Bar, Michael Rabbat, Jakob Verbeek, Luke Zettlemoyer, Koustuv Sinha, Yann LeCun, Saining Xie

arXiv preprint, 2026

An empirical study of native multimodal pretraining using the Transfusion framework, with insights on visual representation, data synergy, world modeling, and MoE scaling.


VUGEN: Visual Understanding priors for GENeration

Xiangyi Chen, Théophane Vallaeys, Maha Elbayad, John Nguyen, Jakob Verbeek

arXiv preprint, 2025

A framework that leverages the pretrained visual understanding priors of a VLM for efficient, high-quality image generation while preserving the model's understanding capabilities.
