image classification, speech recognition and synthesis, and health care, designing efficient
hardware for these models has attracted considerable attention. While most research in
this area focuses on the efficient deployment of machine learning models (i.e., inference), this
work concentrates on the challenges of training these models in hardware. In particular, this
paper presents a high-performance, scalable, reconfigurable solution for both training and …