J Zhan, J Dai, J Ye, Y Zhou, D Zhang, Z Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce AnyGPT, an any-to-any multimodal language model that utilizes discrete
representations for the unified processing of various modalities, including speech, text …