Grounded text-to-image synthesis with attention refocusing

Q Phung, S Ge, JB Huang - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Driven by the scalable diffusion models trained on large-scale datasets, text-to-image
synthesis methods have shown compelling results. However, these models still fail to …

Maggie: Masked guided gradual human instance matting

C Huynh, SW Oh, A Shrivastava… - Proceedings of the …, 2024 - openaccess.thecvf.com
Human matting is a foundation task in image and video processing where human
foreground pixels are extracted from the input. Prior works either improve the accuracy by …

DQG: Database Question Generation for Exact Text-based Image Retrieval

R Yanagi, R Togo, T Ogawa, M Haseyama - ACM Multimedia 2024, 2024 - openreview.net
Screening similar but non-target images in text-based image retrieval is crucial for
pinpointing the user's desired images accurately. However, conventional methods mainly …