Normalization techniques in training DNNs: Methodology, analysis and application

L Huang, J Qin, Y Zhou, F Zhu, L Liu… - IEEE transactions on …, 2023 - ieeexplore.ieee.org
Normalization techniques are essential for accelerating the training and improving the
generalization of deep neural networks (DNNs), and have successfully been used in various …

Accelerating transformer-based deep learning models on FPGAs using column balanced block pruning

H Peng, S Huang, T Geng, A Li, W Jiang… - … on Quality Electronic …, 2021 - ieeexplore.ieee.org
Although Transformer-based language representations achieve state-of-the-art accuracy on
various natural language processing (NLP) tasks, the large model size has been …

Parallelizing DNN training on GPUs: Challenges and opportunities

W Xu, Y Zhang, X Tang - … Proceedings of the Web Conference 2021, 2021 - dl.acm.org
In recent years, Deep Neural Networks (DNNs) have emerged as a widely adopted
approach in many application domains. Training DNN models is also becoming a significant …

MiCS: near-linear scaling for training gigantic model on public cloud

Z Zhang, S Zheng, Y Wang, J Chiu, G Karypis… - arXiv preprint arXiv …, 2022 - arxiv.org
Existing general purpose frameworks for gigantic model training, i.e., dense models with
billions of parameters, cannot scale efficiently on cloud environment with various networking …

SLAMB: accelerated large batch training with sparse communication

H Xu, W Zhang, J Fei, Y Wu, TW Xie… - International …, 2023 - proceedings.mlr.press
Distributed training of large deep neural networks requires frequent exchange of massive
data between machines, thus communication efficiency is a major concern. Existing …

Communication-compressed adaptive gradient method for distributed nonconvex optimization

Y Wang, L Lin, J Chen - International Conference on Artificial …, 2022 - proceedings.mlr.press
Due to the explosion in the size of the training datasets, distributed learning has received
growing interest in recent years. One of the major bottlenecks is the large communication …

Language models for the prediction of SARS-CoV-2 inhibitors

AE Blanchard, J Gounley, D Bhowmik… - … Journal of High …, 2022 - journals.sagepub.com
The COVID-19 pandemic highlights the need for computational tools to automate and
accelerate drug design for novel protein targets. We leverage deep learning language …

Transformer-Based Named Entity Recognition in Construction Supply Chain Risk Management in Australia

MB Shishehgarkhaneh, RC Moehler, Y Fang… - IEEE …, 2024 - ieeexplore.ieee.org
In the Australian construction industry, effective supply chain risk management (SCRM) is
critical due to its complex networks and susceptibility to various risks. This study explores the …

Web-scale semantic product search with large language models

A Muhamed, S Srinivasan, CH Teo, Q Cui… - Pacific-Asia Conference …, 2023 - Springer
Dense embedding-based semantic matching is widely used in e-commerce product search
to address the shortcomings of lexical matching such as sensitivity to spelling variants. The …

Optimizing Data Layout for Training Deep Neural Networks

B Li, Q Xue, G Yuan, S Li, X Ma, Y Wang… - … Proceedings of the Web …, 2022 - dl.acm.org
The widespread popularity of deep neural networks (DNNs) has made it an important
workload in modern datacenters. Training DNNs is both computation-intensive and memory …