Normalization techniques in training DNNs: Methodology, analysis and application

L Huang, J Qin, Y Zhou, F Zhu, L Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Normalization techniques are essential for accelerating the training and improving the
generalization of deep neural networks (DNNs), and have successfully been used in various …
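
The family of methods this survey covers shares one core transform: standardize activations with statistics computed over some set of axes, then restore capacity with a learned affine map. A minimal NumPy sketch of that shared pattern (function and variable names are illustrative, not from the survey):

```python
import numpy as np

def normalize(x, axes, gamma, beta, eps=1e-5):
    """Standardize x over the given axes, then apply a learned affine map.

    On NCHW input, axes=(0, 2, 3) gives batch-norm-style statistics,
    axes=(1, 2, 3) layer-norm-style, and axes=(2, 3) instance-norm-style.
    """
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, unit variance
    return gamma * x_hat + beta               # learned scale and shift

x = np.random.randn(8, 4, 16, 16)             # N, C, H, W
gamma = np.ones((1, 4, 1, 1))
beta = np.zeros((1, 4, 1, 1))
y = normalize(x, axes=(0, 2, 3), gamma=gamma, beta=beta)  # batch-norm flavor
```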

Sparse invariant risk minimization

X Zhou, Y Lin, W Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press
Invariant Risk Minimization (IRM) is an emerging technique for extracting invariant features to improve generalization under distribution shift. However, we find that there exists a …
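
The objective this line of work builds on is IRMv1 (Arjovsky et al.), which penalizes the gradient of each environment's risk with respect to a fixed scalar "dummy" classifier. A hedged PyTorch sketch of that baseline objective (the sparsity mechanism this particular paper adds is not shown; `model` and the per-environment batches are placeholders):

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits, y):
    """IRMv1 penalty: squared gradient of the risk with respect to a scalar
    dummy classifier w = 1.0 placed in front of the logits."""
    w = torch.tensor(1.0, requires_grad=True)
    loss = F.binary_cross_entropy_with_logits(logits * w, y)
    (grad,) = torch.autograd.grad(loss, [w], create_graph=True)
    return grad.pow(2)

def irm_loss(model, envs, lam=1.0):
    """Sum of per-environment risks plus the IRM invariance penalty.
    `envs` is assumed to yield one (x, y) batch per training environment."""
    total = 0.0
    for x, y in envs:
        logits = model(x).squeeze(-1)
        risk = F.binary_cross_entropy_with_logits(logits, y)
        total = total + risk + lam * irm_penalty(logits, y)
    return total
```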

Learning to optimize domain specific normalization for domain generalization

S Seo, Y Suh, D Kim, G Kim, J Han, B Han - Computer Vision–ECCV 2020 …, 2020 - Springer
We propose a simple but effective multi-source domain generalization technique based on
deep neural networks by incorporating optimized normalization layers that are specific to …
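
The core mechanism is a normalization layer that is both domain-specific and learned. A rough PyTorch sketch of one plausible reading, a per-domain blend of batch and instance normalization with a learnable mixing weight (class and parameter names are invented for illustration, not the authors' code):

```python
import torch
import torch.nn as nn

class DomainSpecificNorm2d(nn.Module):
    """Each source domain keeps its own normalization statistics, and each
    layer learns how to blend BN and IN for that domain."""
    def __init__(self, num_features, num_domains):
        super().__init__()
        self.bn = nn.ModuleList(
            [nn.BatchNorm2d(num_features, affine=False) for _ in range(num_domains)])
        self.inorm = nn.InstanceNorm2d(num_features, affine=False)
        self.mix = nn.Parameter(torch.zeros(num_domains))  # one weight per domain
        self.gamma = nn.Parameter(torch.ones(1, num_features, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, num_features, 1, 1))

    def forward(self, x, domain):
        w = torch.sigmoid(self.mix[domain])   # squashed to (0, 1)
        x_hat = w * self.bn[domain](x) + (1 - w) * self.inorm(x)
        return self.gamma * x_hat + self.beta
```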

A reversible automatic selection normalization (RASN) deep network for predicting in the smart agriculture system

X Jin, J Zhang, J Kong, T Su, Y Bai - Agronomy, 2022 - mdpi.com
Due to their nonlinear modeling capabilities, deep learning prediction networks have become widely used in smart agriculture. Because the sensing data has noise and complex …

Adaptively sparse transformers

GM Correia, V Niculae, AFT Martins - arXiv preprint arXiv:1909.00015, 2019 - arxiv.org
Attention mechanisms have become ubiquitous in NLP. Recent architectures, notably the
Transformer, learn powerful context-aware word representations through layered, multi …
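
The paper's replacement for softmax attention is α-entmax, a family of mappings that can assign exactly zero probability to some tokens. Exact α-entmax needs a sorting or bisection solver; the α = 2 member (sparsemax, a Euclidean projection onto the probability simplex) is simple enough to sketch in PyTorch:

```python
import torch

def sparsemax(z, dim=-1):
    """Sparsemax (the alpha=2 case of entmax): project scores onto the
    simplex. Scores far below the rest receive exactly zero weight."""
    z_sorted, _ = torch.sort(z, dim=dim, descending=True)
    k = torch.arange(1, z.size(dim) + 1, device=z.device, dtype=z.dtype)
    shape = [1] * z.dim()
    shape[dim] = -1
    k = k.view(shape)
    cssv = z_sorted.cumsum(dim) - 1.0
    support = (k * z_sorted > cssv).to(z.dtype)      # which entries survive
    k_star = support.sum(dim=dim, keepdim=True)      # support size
    tau = cssv.gather(dim, k_star.long() - 1) / k_star
    return torch.clamp(z - tau, min=0.0)

scores = torch.tensor([[1.5, 1.0, -1.0]])
print(sparsemax(scores))   # tensor([[0.7500, 0.2500, 0.0000]])
```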

Adversarially adaptive normalization for single domain generalization

X Fan, Q Wang, J Ke, F Yang… - Proceedings of the …, 2021 - openaccess.thecvf.com
Single domain generalization aims to learn a model that performs well on many unseen domains with data from only one domain available for training. Existing works focus on studying the …

Towards understanding regularization in batch normalization

P Luo, X Wang, W Shao, Z Peng - arXiv preprint arXiv:1809.00846, 2018 - arxiv.org
Batch Normalization (BN) improves both convergence and generalization in training neural networks. This work explains these phenomena theoretically. We analyze BN by using a …
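
One intuition behind the analysis: BN normalizes with mini-batch rather than population statistics, so the same activation is mapped differently depending on which batch it lands in, and this statistic noise acts as an implicit regularizer that weakens as the batch grows. A small NumPy simulation of that noise (the paper's treatment is analytical; this is only an illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.normal(loc=2.0, scale=3.0, size=100_000)
x0 = population[0]                       # one fixed activation value

def bn_output(x, batch, eps=1e-5):
    """Normalize a single value with the statistics of its mini-batch."""
    return (x - batch.mean()) / np.sqrt(batch.var() + eps)

# The same x0 normalized inside many random mini-batches of size 32:
outs = []
for _ in range(1000):
    batch = rng.choice(population, size=32)
    outs.append(bn_output(x0, np.append(batch, x0)))
outs = np.array(outs)
print(outs.mean(), outs.std())  # spread around the population-normalized value
# Larger batches shrink this spread, consistent with BN's weaker
# regularization effect at large batch sizes.
```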

Differentiable learning-to-normalize via switchable normalization

P Luo, J Ren, Z Peng, R Zhang, J Li - arXiv preprint arXiv:1806.10779, 2018 - arxiv.org
We address a learning-to-normalize problem by proposing Switchable Normalization (SN),
which learns to select different normalizers for different normalization layers of a deep neural …
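
The mechanism is concrete enough to sketch: compute instance, layer, and batch statistics in one layer and blend them with softmax-normalized learnable weights. A simplified PyTorch version (it omits the running statistics SN keeps for inference; names are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchNorm2d(nn.Module):
    """Blend IN, LN, and BN statistics with learned softmax weights."""
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.mean_w = nn.Parameter(torch.ones(3))   # weights over {IN, LN, BN}
        self.var_w = nn.Parameter(torch.ones(3))
        self.gamma = nn.Parameter(torch.ones(1, num_features, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, num_features, 1, 1))

    def forward(self, x):                            # x: (N, C, H, W)
        mu_in = x.mean((2, 3), keepdim=True)         # per sample, per channel
        var_in = x.var((2, 3), keepdim=True, unbiased=False)
        mu_ln = x.mean((1, 2, 3), keepdim=True)      # per sample
        var_ln = x.var((1, 2, 3), keepdim=True, unbiased=False)
        mu_bn = x.mean((0, 2, 3), keepdim=True)      # per channel, over batch
        var_bn = x.var((0, 2, 3), keepdim=True, unbiased=False)
        wm = F.softmax(self.mean_w, dim=0)
        wv = F.softmax(self.var_w, dim=0)
        mu = wm[0] * mu_in + wm[1] * mu_ln + wm[2] * mu_bn
        var = wv[0] * var_in + wv[1] * var_ln + wv[2] * var_bn
        return self.gamma * (x - mu) / torch.sqrt(var + self.eps) + self.beta
```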

Micro-batch training with batch-channel normalization and weight standardization

S Qiao, H Wang, C Liu, W Shen, A Yuille - arXiv preprint arXiv:1903.10520, 2019 - arxiv.org
Batch Normalization (BN) has become an out-of-the-box technique for improving deep network training. However, its effectiveness is limited for micro-batch training, i.e., each GPU typically …
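
Weight Standardization itself is a small change: standardize each convolutional filter over its fan-in before applying it, so the normalization does not depend on batch statistics at all. A PyTorch sketch under that reading (the paper's batch-channel normalization is omitted):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WSConv2d(nn.Conv2d):
    """Conv2d with Weight Standardization: give each output filter zero mean
    and unit variance over its fan-in before convolving. This targets
    micro-batch training, where BN statistics are unreliable."""
    def forward(self, x):
        w = self.weight                                   # (out, in, kH, kW)
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        std = w.std(dim=(1, 2, 3), keepdim=True) + 1e-5
        w_hat = (w - mean) / std
        return F.conv2d(x, w_hat, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

conv = WSConv2d(3, 16, kernel_size=3, padding=1)
y = conv(torch.randn(2, 3, 32, 32))   # works even with tiny batches
```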

Cross-iteration batch normalization

Z Yao, Y Cao, S Zheng, G Huang… - Proceedings of the …, 2021 - openaccess.thecvf.com
A well-known issue of Batch Normalization is its significantly reduced effectiveness in the
case of small mini-batch sizes. When a mini-batch contains few examples, the statistics upon …
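
The proposed fix aggregates statistics across recent iterations. A deliberately naive sketch of that idea in PyTorch, averaging the last k mini-batches' statistics; the paper's key addition, compensating stale statistics for intervening weight updates via a Taylor expansion, is omitted here:

```python
from collections import deque
import torch

class NaiveCrossIterStats:
    """Average per-iteration channel statistics over a sliding window of
    recent mini-batches (simplified illustration, not the paper's method)."""
    def __init__(self, window=4):
        self.means = deque(maxlen=window)
        self.vars = deque(maxlen=window)

    def update(self, x):                         # x: (N, C, H, W)
        self.means.append(x.mean((0, 2, 3)).detach())
        self.vars.append(x.var((0, 2, 3), unbiased=False).detach())
        mu = torch.stack(list(self.means)).mean(0)       # (C,)
        var = torch.stack(list(self.vars)).mean(0)
        return mu.view(1, -1, 1, 1), var.view(1, -1, 1, 1)

stats = NaiveCrossIterStats(window=4)
for _ in range(10):                              # tiny batches of size 2
    mu, var = stats.update(torch.randn(2, 8, 7, 7))
```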