InFusion: Inject and attention fusion for multi concept zero-shot text-based video editing

A Khandelwal - … Conference on Computer Vision, 2023 - openaccess.thecvf.com
… Our framework is a low-cost alternative to one-shot tuned models for editing since it does …
This ICCV workshop paper is the Open Access version, provided by the Computer Vision

Shifted diffusion for text-to-image generation

Y Zhou, B Liu, Y Zhu, X Yang… - … on computer vision …, 2023 - openaccess.thecvf.com
… , provided by the Computer Vision Foundation. … text to image generation with attentional
generative adversarial networks. In Proceedings of the IEEE conference on computer vision

CoCa: Contrastive captioners are image-text foundation models

J Yu, Z Wang, V Vasudevan, L Yeung… - arXiv preprint arXiv …, 2022 - arxiv.org
… models is of significant interest in computer vision because these models can be quickly …
web-scale alt-text data and annotated images by treating all labels simply as text, seamlessly …

Let's Talk about X: Combining image recognition and eye gaze to support conversation for people with ALS

SK Kane, MR Morris - Proceedings of the 2017 Conference on …, 2017 - dl.acm.org
… based AAC system that uses computer vision to identify objects in … Alternative text entry
methods such as Dasher [24,25] may offer improved performance, but learning an alternative text

[BOOK][B] Computer vision: models, learning, and inference

SJD Prince - 2012 - books.google.com
… That book too is marked by an enormously comprehensive view of computer vision, by …
summary of the state of the art in computer vision, the frontier of our knowledge and abilities, …

Vision-language pre-training: Basics, recent advances, and future trends

Z Gan, L Li, C Li, L Wang, Z Liu… - … Trends® in Computer …, 2022 - nowpublishers.com
text inputs. In Section 4, we discuss how core computer vision tasks can be viewed as image-text
… by contrastively pre-trained image-text models (such as CLIP [326]), and further enable …

Multi-modal representation learning with text-driven soft masks

J Park, B Han - … IEEE/CVF Conference on Computer Vision …, 2023 - openaccess.thecvf.com
… believed to harm vision-language models [16, 39] by breaking the semantic consistency
of the image-text pair, but we argue that this does not hold for the large-scale vision-language …

LD-ZNet: A latent diffusion approach for text-based image segmentation

K Pnvr, B Singh, P Ghosh… - … on Computer Vision, 2023 - openaccess.thecvf.com
… have received less attention compared to alternative unsupervised methods [4], because of
… the internal features of a text-to-image LDM [38] for text based image segmentation, which is …

Transform and tell: Entity-aware news image captioning

A Tran, A Mathews, L Xie - … conference on computer vision …, 2020 - openaccess.thecvf.com
We propose an end-to-end model which generates captions for images embedded in news
articles. News images present two key challenges: they rely on real-world knowledge, …

[BOOK][B] Markov random field modeling in computer vision

SZ Li - 2012 - books.google.com
… in image processing and computer vision is to capture the … in image processing, computer
vision, applied statistics, and … of Markov random fields to computer vision problems such as …