As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines everywhere because of its ability to analyze and create text, images, and beyond. With such …
Recent large-scale text-guided diffusion models provide powerful image generation capabilities. Currently, a massive effort is given to enable the modification of these images …
Reinforcement learning algorithms typically struggle in the absence of a dense, well-shaped reward function. Intrinsically motivated exploration methods address this limitation by …
Image captioning is a fundamental task in vision-language understanding, where the model predicts an informative textual caption for a given input image. In this paper, we present a …
Recent breakthroughs in text-to-4D generation rely on pre-trained text-to-image and text-to-video models to generate dynamic 3D scenes. However, current text-to-4D methods face a …
We present $\textbf{MolT5}$, a self-supervised learning framework for pretraining models on a vast amount of unlabeled natural language text and molecule strings. $\textbf …
Web-crawled datasets have enabled remarkable generalization capabilities in recent image-text models such as CLIP (Contrastive Language-Image pre-training) or Flamingo, but little …
The remarkable performance of the Transformer architecture in natural language processing has recently also triggered broad interest in Computer Vision. Among other merits …
T Talaei Khoei, H Ould Slimane… - Neural Computing and …, 2023 - Springer
The current development in deep learning is witnessing an exponential transition into automation applications. This automation transition can provide a promising framework for …