A comprehensive survey of image augmentation techniques for deep learning

M Xu, S Yoon, A Fuentes, DS Park - Pattern Recognition, 2023 - Elsevier
Although deep learning has achieved satisfactory performance in computer vision, a large
volume of images is required. However, collecting images is often expensive and …

Shifting machine learning for healthcare from development to deployment and from models to data

A Zhang, L Xing, J Zou, JC Wu - Nature Biomedical Engineering, 2022 - nature.com
In the past decade, the application of machine learning (ML) to healthcare has helped drive
the automation of physician tasks as well as enhancements in clinical capabilities and …

Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning

E Tiu, E Talius, P Patel, CP Langlotz, AY Ng… - Nature Biomedical …, 2022 - nature.com
In tasks involving the interpretation of medical images, suitably trained machine-learning
models often exceed the performance of medical experts. Yet such a high level of …

Quantifying memorization across neural language models

N Carlini, D Ippolito, M Jagielski, K Lee… - arXiv preprint arXiv …, 2022 - arxiv.org
Large language models (LMs) have been shown to memorize parts of their training data,
and when prompted appropriately, they will emit the memorized training data verbatim. This …

Better diffusion models further improve adversarial training

Z Wang, T Pang, C Du, M Lin… - … on Machine Learning, 2023 - proceedings.mlr.press
It has been recognized that the data generated by the denoising diffusion probabilistic
model (DDPM) improves adversarial training. After two years of rapid development in …

Erasing concepts from diffusion models

R Gandikota, J Materzynska… - Proceedings of the …, 2023 - openaccess.thecvf.com
Motivated by concerns that large-scale diffusion models can produce undesirable output
such as sexually explicit content or copyrighted artistic styles, we study erasure of specific …

Fine-tuning aligned language models compromises safety, even when users do not intend to!

X Qi, Y Zeng, T Xie, PY Chen, R Jia, P Mittal… - arXiv preprint arXiv …, 2023 - arxiv.org
Optimizing large language models (LLMs) for downstream use cases often involves the
customization of pre-trained LLMs through further fine-tuning. Meta's open release of Llama …

Membership inference attacks from first principles

N Carlini, S Chien, M Nasr, S Song… - … IEEE Symposium on …, 2022 - ieeexplore.ieee.org
A membership inference attack allows an adversary to query a trained machine learning
model to predict whether or not a particular example was contained in the model's training …

Mind the gap: Understanding the modality gap in multi-modal contrastive representation learning

VW Liang, Y Zhang, Y Kwon… - Advances in Neural …, 2022 - proceedings.neurips.cc
We present the modality gap, an intriguing geometric phenomenon of the representation space
of multi-modal models. Specifically, we show that different data modalities (e.g., images and …

Memorization without overfitting: Analyzing the training dynamics of large language models

K Tirumala, A Markosyan… - Advances in …, 2022 - proceedings.neurips.cc
Despite their wide adoption, the underlying training and memorization dynamics of very
large language models are not well understood. We empirically study exact memorization in …