Sparse progressive distillation: Resolving overfitting under pretrain-and-finetune paradigm

S Huang, D Xu, IEH Yen, Y Wang, SE Chang… - arXiv preprint arXiv …, 2021 - arxiv.org
Conventional wisdom in pruning Transformer-based language models is that pruning
reduces the model expressiveness and thus is more likely to underfit rather than overfit …

Binary complex neural network acceleration on FPGA

H Peng, S Zhou, S Weitze, J Li, S Islam… - 2021 IEEE 32nd …, 2021 - ieeexplore.ieee.org
Being able to learn from complex data with phase information is imperative for many signal
processing applications. Today's real-valued deep neural networks (DNNs) have shown …

Optimizing FPGA-based accelerator design for large-scale molecular similarity search (special session paper)

H Peng, S Chen, Z Wang, J Yang… - 2021 IEEE/ACM …, 2021 - ieeexplore.ieee.org
Molecular similarity search has been widely used in drug discovery to identify structurally
similar compounds from large molecular databases rapidly. With the increasing size of …