Transformers can optimally learn regression mixture models

R Pathak, R Sen, W Kong, A Das - arXiv preprint arXiv:2311.08362, 2023 - arxiv.org
Mixture models arise in many regression problems, but most methods have seen limited
adoption partly due to these algorithms' highly tailored and model-specific nature. On the …