S Liu, N Zheng, H Kang, X Simmons, J Zhang… - Proceedings of the 18th …, 2024 - dl.acm.org
Training large-scale deep learning recommendation models (DLRMs) with embedding
tables stretching across multiple GPUs in a cluster presents a unique challenge, demanding …