A Hurst, A Lerer, AP Goucher, A Perelman… - arXiv preprint arXiv …, 2024 - arxiv.org
GPT-4o is an autoregressive omni model that accepts as input any combination of text,
audio, image, and video, and generates any combination of text, audio, and image outputs …