Image Inpainting on the Sketch-Pencil Domain with Vision Transformers

JLF Campana, LGL Decker, MR e Souza… - Proceedings …, 2024 - scitepress.org
Abstract
Image inpainting aims to realistically fill missing regions in images, which requires both structural and textural understanding. Traditionally, methods in the literature have employed Convolutional Neural Networks (CNNs), especially in Generative Adversarial Networks (GANs), to restore missing regions in a coherent and reliable manner. However, the limited receptive fields of CNNs can lead to unreliable outcomes because they fail to capture the broader context of the image. Transformer-based models, on the other hand, can learn long-range dependencies through self-attention mechanisms. To generate more consistent results, some approaches additionally incorporate auxiliary information to guide the model's understanding of structure. In this work, we propose a new method for image inpainting that uses sketch-pencil information to guide the restoration of both structural and textural elements. Unlike previous works that employ edges, lines, or segmentation maps, we leverage the sketch-pencil domain together with the ability of Transformers to learn long-range dependencies in order to properly match structural and textural information, which yields more consistent results. Experimental results show the effectiveness of our approach, which achieves superior or competitive performance compared to existing methods, especially in scenarios involving complex images and large missing areas.