Despite being responsible for state-of-the-art results in several computer vision and natural language processing tasks, neural networks have faced harsh criticism due to some of their …
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …
The availability of large-scale image captioning and visual question answering datasets has contributed significantly to recent successes in vision-and-language pre-training. However …
State-of-the-art models often make use of superficial patterns in the data that do not generalize well to out-of-domain or adversarial settings. For example, textual entailment …
Federated learning is a method of training models on private data distributed over multiple devices. To keep device data private, the global model is trained by only communicating …
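The truncated sentence above describes the core idea of federated learning: clients train locally and only model updates, never raw data, are communicated. A minimal sketch of one federated-averaging round on a toy linear least-squares model follows; the model, learning rate, and synthetic client data are illustrative assumptions, not details from the snippet.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    # One client's local training: a few gradient steps on a linear
    # least-squares objective (hypothetical toy model for illustration).
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_averaging(global_w, clients, lr=0.1, epochs=5):
    # One communication round: each client trains on its private data;
    # only the resulting weights are sent back and averaged, weighted
    # by client dataset size. The raw (X, y) data never leaves a client.
    updates, sizes = [], []
    for X, y in clients:
        updates.append(local_update(global_w, X, y, lr, epochs))
        sizes.append(len(y))
    total = float(sum(sizes))
    return sum(w * (n / total) for w, n in zip(updates, sizes))

# Toy usage: two clients whose private data share the relation y = 2x.
rng = np.random.default_rng(0)
clients = []
for _ in range(2):
    X = rng.normal(size=(20, 1))
    clients.append((X, 2.0 * X[:, 0]))

w = np.zeros(1)
for _ in range(50):  # 50 communication rounds
    w = federated_averaging(w, clients)
print(w)  # converges toward [2.]
```

Weighting each client's update by its dataset size is the standard FedAvg aggregation rule; real deployments add compression, client sampling, and secure aggregation on top of this skeleton.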
Text-to-image generative models often reflect the biases of the training data, leading to unequal representations of underrepresented groups. This study investigates inclusive text …
Z Wang, K Qinami, IC Karakozis… - Proceedings of the …, 2020 - openaccess.thecvf.com
Computer vision models learn to perform a task by capturing relevant statistics from training data. It has been shown that models learn spurious age, gender, and race correlations when …
Gender is widely discussed in the context of language tasks and when examining the stereotypes propagated by language models. However, current discussions primarily treat …
J Cho, A Zala, M Bansal - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Recently, DALL-E, a multimodal transformer language model, and its variants, including diffusion models, have shown high-quality text-to-image generation capabilities. However …