Normalization techniques in training DNNs: Methodology, analysis and application

L Huang, J Qin, Y Zhou, F Zhu, L Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Normalization techniques are essential for accelerating the training and improving the
generalization of deep neural networks (DNNs), and have successfully been used in various …
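
The family of methods this survey covers shares one core transform: standardize activations with statistics computed over some set of axes, then restore capacity with a learned affine map. A minimal NumPy sketch of that shared pattern (function and variable names are illustrative, not from the survey):

```python
import numpy as np

def normalize(x, axes, gamma, beta, eps=1e-5):
    """Standardize x over the given axes, then apply a learned affine map.

    On NCHW input, axes=(0, 2, 3) gives batch-norm-style statistics,
    axes=(1, 2, 3) layer-norm-style, and axes=(2, 3) instance-norm-style.
    """
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, unit variance
    return gamma * x_hat + beta               # learned scale and shift

x = np.random.randn(8, 4, 16, 16)             # N, C, H, W
gamma = np.ones((1, 4, 1, 1))
beta = np.zeros((1, 4, 1, 1))
y = normalize(x, axes=(0, 2, 3), gamma=gamma, beta=beta)  # batch-norm flavor
```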

Sparse invariant risk minimization

X Zhou, Y Lin, W Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press
Invariant Risk Minimization (IRM) is an emerging technique for extracting invariant features to improve generalization under distribution shift. However, we find that there exists a …
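
The objective this line of work builds on is IRMv1 (Arjovsky et al.), which penalizes the gradient of each environment's risk with respect to a fixed scalar "dummy" classifier. A hedged PyTorch sketch of that baseline objective (the sparsity mechanism this particular paper adds is not shown; `model` and the per-environment batches are placeholders):

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits, y):
    """IRMv1 penalty: squared gradient of the risk with respect to a scalar
    dummy classifier w = 1.0 placed in front of the logits."""
    w = torch.tensor(1.0, requires_grad=True)
    loss = F.binary_cross_entropy_with_logits(logits * w, y)
    (grad,) = torch.autograd.grad(loss, [w], create_graph=True)
    return grad.pow(2)

def irm_loss(model, envs, lam=1.0):
    """Sum of per-environment risks plus the IRM invariance penalty.
    `envs` is assumed to yield one (x, y) batch per training environment."""
    total = 0.0
    for x, y in envs:
        logits = model(x).squeeze(-1)
        risk = F.binary_cross_entropy_with_logits(logits, y)
        total = total + risk + lam * irm_penalty(logits, y)
    return total
```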

Learning to optimize domain specific normalization for domain generalization

S Seo, Y Suh, D Kim, G Kim, J Han, B Han - Computer Vision–ECCV 2020 …, 2020 - Springer
We propose a simple but effective multi-source domain generalization technique based on
deep neural networks by incorporating optimized normalization layers that are specific to …
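
The core mechanism is a normalization layer that is both domain-specific and learned. A rough PyTorch sketch of one plausible reading, a per-domain blend of batch and instance normalization with a learnable mixing weight (class and parameter names are invented for illustration, not the authors' code):

```python
import torch
import torch.nn as nn

class DomainSpecificNorm2d(nn.Module):
    """Each source domain keeps its own normalization statistics, and each
    layer learns how to blend BN and IN for that domain."""
    def __init__(self, num_features, num_domains):
        super().__init__()
        self.bn = nn.ModuleList(
            [nn.BatchNorm2d(num_features, affine=False) for _ in range(num_domains)])
        self.inorm = nn.InstanceNorm2d(num_features, affine=False)
        self.mix = nn.Parameter(torch.zeros(num_domains))  # one weight per domain
        self.gamma = nn.Parameter(torch.ones(1, num_features, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, num_features, 1, 1))

    def forward(self, x, domain):
        w = torch.sigmoid(self.mix[domain])   # squashed to (0, 1)
        x_hat = w * self.bn[domain](x) + (1 - w) * self.inorm(x)
        return self.gamma * x_hat + self.beta
```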

A reversible automatic selection normalization (RASN) deep network for predicting in the smart agriculture system

X Jin, J Zhang, J Kong, T Su, Y Bai - Agronomy, 2022 - mdpi.com
Due to their nonlinear modeling capabilities, deep learning prediction networks have become widely used in smart agriculture. Because the sensing data has noise and complex …

Adaptively sparse transformers

GM Correia, V Niculae, AFT Martins - arXiv preprint arXiv:1909.00015, 2019 - arxiv.org
Attention mechanisms have become ubiquitous in NLP. Recent architectures, notably the
Transformer, learn powerful context-aware word representations through layered, multi …
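
The paper's replacement for softmax attention is α-entmax, a family of mappings that can assign exactly zero probability to some tokens. Exact α-entmax needs a sorting or bisection solver; the α = 2 member (sparsemax, a Euclidean projection onto the probability simplex) is simple enough to sketch in PyTorch:

```python
import torch

def sparsemax(z, dim=-1):
    """Sparsemax (the alpha=2 case of entmax): project scores onto the
    simplex. Scores far below the rest receive exactly zero weight."""
    z_sorted, _ = torch.sort(z, dim=dim, descending=True)
    k = torch.arange(1, z.size(dim) + 1, device=z.device, dtype=z.dtype)
    shape = [1] * z.dim()
    shape[dim] = -1
    k = k.view(shape)
    cssv = z_sorted.cumsum(dim) - 1.0
    support = (k * z_sorted > cssv).to(z.dtype)      # which entries survive
    k_star = support.sum(dim=dim, keepdim=True)      # support size
    tau = cssv.gather(dim, k_star.long() - 1) / k_star
    return torch.clamp(z - tau, min=0.0)

scores = torch.tensor([[1.5, 1.0, -1.0]])
print(sparsemax(scores))   # tensor([[0.7500, 0.2500, 0.0000]])
```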

Adversarially adaptive normalization for single domain generalization

X Fan, Q Wang, J Ke, F Yang… - Proceedings of the …, 2021 - openaccess.thecvf.com
Single domain generalization aims to learn a model that performs well on many unseen domains with data from only one domain available for training. Existing works focus on studying the …

Towards understanding regularization in batch normalization

P Luo, X Wang, W Shao, Z Peng - arXiv preprint arXiv:1809.00846, 2018 - arxiv.org
Batch Normalization (BN) improves both convergence and generalization in training neural networks. This work explains these phenomena theoretically. We analyze BN by using a …
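
One intuition behind the analysis: BN normalizes with mini-batch rather than population statistics, so the same activation is mapped differently depending on which batch it lands in, and this statistic noise acts as an implicit regularizer that weakens as the batch grows. A small NumPy simulation of that noise (the paper's treatment is analytical; this is only an illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.normal(loc=2.0, scale=3.0, size=100_000)
x0 = population[0]                       # one fixed activation value

def bn_output(x, batch, eps=1e-5):
    """Normalize a single value with the statistics of its mini-batch."""
    return (x - batch.mean()) / np.sqrt(batch.var() + eps)

# The same x0 normalized inside many random mini-batches of size 32:
outs = []
for _ in range(1000):
    batch = rng.choice(population, size=32)
    outs.append(bn_output(x0, np.append(batch, x0)))
outs = np.array(outs)
print(outs.mean(), outs.std())  # spread around the population-normalized value
# Larger batches shrink this spread, consistent with BN's weaker
# regularization effect at large batch sizes.
```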

Differentiable learning-to-normalize via switchable normalization

P Luo, J Ren, Z Peng, R Zhang, J Li - arXiv preprint arXiv:1806.10779, 2018 - arxiv.org
We address a learning-to-normalize problem by proposing Switchable Normalization (SN),
which learns to select different normalizers for different normalization layers of a deep neural …
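
The mechanism is concrete enough to sketch: compute instance, layer, and batch statistics in one layer and blend them with softmax-normalized learnable weights. A simplified PyTorch version (it omits the running statistics SN keeps for inference; names are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchNorm2d(nn.Module):
    """Blend IN, LN, and BN statistics with learned softmax weights."""
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.mean_w = nn.Parameter(torch.ones(3))   # weights over {IN, LN, BN}
        self.var_w = nn.Parameter(torch.ones(3))
        self.gamma = nn.Parameter(torch.ones(1, num_features, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, num_features, 1, 1))

    def forward(self, x):                            # x: (N, C, H, W)
        mu_in = x.mean((2, 3), keepdim=True)         # per sample, per channel
        var_in = x.var((2, 3), keepdim=True, unbiased=False)
        mu_ln = x.mean((1, 2, 3), keepdim=True)      # per sample
        var_ln = x.var((1, 2, 3), keepdim=True, unbiased=False)
        mu_bn = x.mean((0, 2, 3), keepdim=True)      # per channel, over batch
        var_bn = x.var((0, 2, 3), keepdim=True, unbiased=False)
        wm = F.softmax(self.mean_w, dim=0)
        wv = F.softmax(self.var_w, dim=0)
        mu = wm[0] * mu_in + wm[1] * mu_ln + wm[2] * mu_bn
        var = wv[0] * var_in + wv[1] * var_ln + wv[2] * var_bn
        return self.gamma * (x - mu) / torch.sqrt(var + self.eps) + self.beta
```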

Micro-batch training with batch-channel normalization and weight standardization

S Qiao, H Wang, C Liu, W Shen, A Yuille - arXiv preprint arXiv:1903.10520, 2019 - arxiv.org
Batch Normalization (BN) has become an out-of-the-box technique for improving deep network training. However, its effectiveness is limited for micro-batch training, i.e., each GPU typically …
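
Weight Standardization itself is a small change: standardize each convolutional filter over its fan-in before applying it, so the normalization does not depend on batch statistics at all. A PyTorch sketch under that reading (the paper's batch-channel normalization is omitted):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WSConv2d(nn.Conv2d):
    """Conv2d with Weight Standardization: give each output filter zero mean
    and unit variance over its fan-in before convolving. This targets
    micro-batch training, where BN statistics are unreliable."""
    def forward(self, x):
        w = self.weight                                   # (out, in, kH, kW)
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        std = w.std(dim=(1, 2, 3), keepdim=True) + 1e-5
        w_hat = (w - mean) / std
        return F.conv2d(x, w_hat, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

conv = WSConv2d(3, 16, kernel_size=3, padding=1)
y = conv(torch.randn(2, 3, 32, 32))   # works even with tiny batches
```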

Cross-iteration batch normalization

Z Yao, Y Cao, S Zheng, G Huang… - Proceedings of the …, 2021 - openaccess.thecvf.com
A well-known issue of Batch Normalization is its significantly reduced effectiveness in the
case of small mini-batch sizes. When a mini-batch contains few examples, the statistics upon …
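
The proposed fix aggregates statistics across recent iterations. A deliberately naive sketch of that idea in PyTorch, averaging the last k mini-batches' statistics; the paper's key addition, compensating stale statistics for intervening weight updates via a Taylor expansion, is omitted here:

```python
from collections import deque
import torch

class NaiveCrossIterStats:
    """Average per-iteration channel statistics over a sliding window of
    recent mini-batches (simplified illustration, not the paper's method)."""
    def __init__(self, window=4):
        self.means = deque(maxlen=window)
        self.vars = deque(maxlen=window)

    def update(self, x):                         # x: (N, C, H, W)
        self.means.append(x.mean((0, 2, 3)).detach())
        self.vars.append(x.var((0, 2, 3), unbiased=False).detach())
        mu = torch.stack(list(self.means)).mean(0)       # (C,)
        var = torch.stack(list(self.vars)).mean(0)
        return mu.view(1, -1, 1, 1), var.view(1, -1, 1, 1)

stats = NaiveCrossIterStats(window=4)
for _ in range(10):                              # tiny batches of size 2
    mu, var = stats.update(torch.randn(2, 8, 7, 7))
```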