Authors
Saeed Rashidi, Pallavi Shurpali, Srinivas Sridharan, Naader Hassani, Dheevatsa Mudigere, Krishnakumar Nair, Misha Smelyanskiy, Tushar Krishna
Publication date
2020/8/19
Conference
2020 IEEE Symposium on High-Performance Interconnects (HOTI)
Pages
33-42
Publisher
IEEE
Description
Recommendation-model DNNs have gained significant attention because of their vital role in recommending the best content to users. However, to further increase accuracy, these DNNs are becoming more complex and are trained on more data, making training on a single node infeasible. Distributed training tackles this problem by employing multiple nodes. The importance of recommendation models necessitates designing customized HW/SW platforms for training such networks in order to minimize communication overheads among nodes. However, exploring this design space is difficult because of the many HW/SW parameters involved and the limited ability to change HW parameters in real systems. In this paper, we port the previously proposed ASTRA-SIM simulation platform on top of the versatile NS3 network simulator by introducing a portable network interface …
Total citations
Citations per year: 2021 (3), 2022 (3), 2023 (2), 2024 (2)