A comprehensive survey of image augmentation techniques for deep learning

M Xu, S Yoon, A Fuentes, DS Park - Pattern Recognition, 2023 - Elsevier
Although deep learning has achieved satisfactory performance in computer vision, a large
volume of images is required. However, collecting images is often expensive and …
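
Surveys like this catalog geometric and photometric transform families; a minimal sketch of such a pipeline using torchvision (the specific transforms and parameters below are illustrative assumptions, not the survey's taxonomy):

```python
import torchvision.transforms as T

# Illustrative augmentation pipeline mixing geometric and photometric
# transforms; choices and parameters are assumptions for demonstration.
augment = T.Compose([
    T.RandomResizedCrop(224, scale=(0.8, 1.0)),  # random crop + rescale
    T.RandomHorizontalFlip(p=0.5),               # geometric
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),  # photometric
    T.ToTensor(),
])

# Usage: augmented = augment(pil_image)  # returns a CHW float tensor
```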

Data augmentation: A comprehensive survey of modern approaches

A Mumuni, F Mumuni - Array, 2022 - Elsevier
To ensure good performance, modern machine learning models typically require large
amounts of quality annotated data. Meanwhile, the data collection and annotation processes …
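
One widely used modern approach of the kind such surveys cover is mixup (Zhang et al., 2018), which synthesizes training pairs by convex combination; a minimal PyTorch sketch (alpha=0.2 is an illustrative default):

```python
import torch

def mixup(x, y, alpha=0.2):
    """Mix a batch with a shuffled copy of itself (mixup, Zhang et al. 2018).

    x: (B, ...) input batch; y: (B, C) one-hot labels.
    alpha is the Beta concentration; 0.2 is a common choice for images.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mixed = lam * x + (1 - lam) * x[perm]
    y_mixed = lam * y + (1 - lam) * y[perm]
    return x_mixed, y_mixed
```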

Scaling vision transformers to 22 billion parameters

M Dehghani, J Djolonga, B Mustafa… - International …, 2023 - proceedings.mlr.press
The scaling of Transformers has driven breakthrough capabilities for language models. At
present, the largest large language models (LLMs) contain upwards of 100B parameters …
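
For a sense of scale: transformer block parameters grow roughly as 12·d²·L (4d² for the attention projections plus 8d² for a 4x MLP). A back-of-the-envelope check against ViT-22B's reported width (6144) and depth (48), ignoring embeddings, biases, and the model's parallel-block variations:

```python
def approx_transformer_params(d_model: int, n_layers: int) -> int:
    """Rough count: 4*d^2 (QKV + output proj) + 8*d^2 (4x MLP) per block."""
    return 12 * d_model**2 * n_layers

print(f"{approx_transformer_params(6144, 48) / 1e9:.1f}B")  # ~21.7B, close to 22B
```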

When and why vision-language models behave like bags-of-words, and what to do about it?

M Yuksekgonul, F Bianchi, P Kalluri, D Jurafsky… - arXiv preprint arXiv …, 2022 - arxiv.org
Despite the success of large vision and language models (VLMs) in many downstream
applications, it is unclear how well they encode compositional information. Here, we create …
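
A hedged sketch of the kind of order-sensitivity probe this work motivates: score an image against a caption and a word-shuffled version of it; a truly compositional model should prefer the original, while a "bag-of-words" model scores both similarly. The model name and this specific probe are illustrative assumptions, not the paper's benchmark:

```python
import random
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")          # any test image
caption = "the horse is eating the grass"
words = caption.split()
random.shuffle(words)
shuffled = " ".join(words)               # word-order perturbation

inputs = processor(text=[caption, shuffled], images=image,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    scores = model(**inputs).logits_per_image  # (1, 2) image-text similarities
print(scores)  # near-equal scores would indicate bag-of-words behavior
```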

Scaling up your kernels to 31x31: Revisiting large kernel design in CNNs

X Ding, X Zhang, J Han, G Ding - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by
recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few …
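
The core design move is a handful of very large depthwise convolutions; a minimal PyTorch sketch of a 31x31 depthwise layer (the paper additionally uses structural reparameterization with a parallel small kernel, omitted here):

```python
import torch
import torch.nn as nn

channels = 64
# Depthwise 31x31 convolution: groups == channels keeps the cost linear in C,
# which is what makes such large kernels affordable.
large_kernel = nn.Conv2d(channels, channels, kernel_size=31,
                         padding=15, groups=channels)

x = torch.randn(1, channels, 56, 56)
print(large_kernel(x).shape)  # torch.Size([1, 64, 56, 56])
```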

Beyond neural scaling laws: beating power law scaling via data pruning

B Sorscher, R Geirhos, S Shekhar… - Advances in …, 2022 - proceedings.neurips.cc
Widely observed neural scaling laws, in which error falls off as a power of the training set
size, model size, or both, have driven substantial performance improvements in deep …
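
The core operation here is ranking examples by a difficulty metric and discarding a fraction of them (the easiest ones, when data is plentiful); a minimal sketch using per-example loss as the metric, an illustrative stand-in for the self-supervised prototype metric the paper proposes:

```python
import torch

def prune_by_difficulty(losses: torch.Tensor, keep_frac: float):
    """Keep the hardest `keep_frac` of examples, ranked by per-example loss.

    losses: (N,) precomputed losses from a reference model. With abundant
    data the prescription is to drop the easiest examples, as done here.
    """
    n_keep = int(keep_frac * losses.numel())
    return torch.topk(losses, n_keep).indices

# Usage: subset_idx = prune_by_difficulty(losses, keep_frac=0.7)
```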

Fast high-resolution image synthesis with latent adversarial diffusion distillation

A Sauer, F Boesel, T Dockhorn, A Blattmann… - SIGGRAPH Asia 2024 …, 2024 - dl.acm.org
Diffusion models are the main driver of progress in image and video synthesis, but suffer
from slow inference speed. Distillation methods, like the recently introduced adversarial …
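
Distilled models of this family sample in one to four steps instead of dozens; a sketch using the related SDXL-Turbo checkpoint (trained with adversarial diffusion distillation, the precursor to this paper's latent variant) via diffusers, as an assumption for illustration:

```python
import torch
from diffusers import AutoPipelineForText2Image

# SDXL-Turbo: an adversarially distilled model; one step, no CFG.
pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="a photo of a red fox in the snow",
    num_inference_steps=1,   # distillation makes single-step sampling viable
    guidance_scale=0.0,      # turbo models are trained without CFG
).images[0]
image.save("fox.png")
```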

Self-supervised contrastive pre-training for time series via time-frequency consistency

X Zhang, Z Zhao, T Tsiligkaridis… - Advances in Neural …, 2022 - proceedings.neurips.cc
Pre-training on time series poses a unique challenge due to the potential mismatch between
pre-training and target domains, such as shifts in temporal dynamics, fast-evolving trends …
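
The premise is that time-domain and frequency-domain views of the same series should embed nearby; a minimal sketch of that consistency term (the encoders, temperature, and InfoNCE-style loss form are illustrative assumptions, not the paper's exact architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

L = 128                                   # series length
time_enc = nn.Sequential(nn.Linear(L, 64), nn.ReLU(), nn.Linear(64, 32))
freq_enc = nn.Sequential(nn.Linear(L // 2 + 1, 64), nn.ReLU(), nn.Linear(64, 32))

x = torch.randn(16, L)                    # batch of univariate series
z_t = F.normalize(time_enc(x), dim=-1)    # time-domain embedding
z_f = F.normalize(freq_enc(torch.fft.rfft(x).abs()), dim=-1)  # spectrum embedding

# InfoNCE-style consistency: each series' time view should match its own
# frequency view against the rest of the batch.
logits = z_t @ z_f.T / 0.1                # temperature 0.1 (assumed)
loss = F.cross_entropy(logits, torch.arange(x.size(0)))
```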

DAFormer: Improving network architectures and training strategies for domain-adaptive semantic segmentation

L Hoyer, D Dai, L Van Gool - Proceedings of the IEEE/CVF …, 2022 - openaccess.thecvf.com
As acquiring pixel-wise annotations of real-world images for semantic segmentation is a
costly process, a model can instead be trained with more accessible synthetic data and …
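
A central ingredient in this line of work is self-training on the unlabeled real domain with confidence-filtered pseudo-labels; a minimal sketch of that step (the threshold is illustrative, and DAFormer's full recipe adds a mean teacher, rare-class sampling, and more):

```python
import torch
import torch.nn.functional as F

IGNORE_INDEX = 255  # conventional ignore label in semantic segmentation

@torch.no_grad()
def pseudo_labels(logits: torch.Tensor, threshold: float = 0.9):
    """Turn target-domain predictions (B, C, H, W) into training targets,
    ignoring pixels whose max class probability falls below `threshold`."""
    probs = F.softmax(logits, dim=1)
    conf, labels = probs.max(dim=1)            # (B, H, W)
    labels[conf < threshold] = IGNORE_INDEX    # mask low-confidence pixels
    return labels

# Usage: loss = F.cross_entropy(student_logits, pseudo_labels(teacher_logits),
#                               ignore_index=IGNORE_INDEX)
```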

Leakage and the reproducibility crisis in machine-learning-based science

S Kapoor, A Narayanan - Patterns, 2023 - cell.com
Machine-learning (ML) methods have gained prominence in the quantitative sciences.
However, there are many known methodological pitfalls, including data leakage, in ML …
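
One of the commonest leakage pitfalls this paper catalogs is fitting preprocessing on the full dataset before splitting; a minimal scikit-learn sketch of the fix, with the leaky variant noted in a comment (the dataset and estimator are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# LEAKY: StandardScaler().fit(X) before the split lets test-set statistics
# influence training, inflating reported performance.

# Correct: split first, then let a Pipeline fit the scaler on training data only.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_tr, y_tr)
print(model.score(X_te, y_te))
```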