alt text computer vision - Academic Resource Search

Thinking fast and slow: Efficient text-to-visual retrieval with transformers

A Miech, JB Alayrac, I Laptev, J Sivic… - … on Computer Vision …, 2021 - openaccess.thecvf.com

Our objective is language-based search of large-scale image and video datasets. For this
task, the approach that consists of independently mapping text and vision to a joint embedding …

Cited by 141 · Related articles · All 9 versions

[PDF] thecvf.com

SciOL and MuLMS-Img: Introducing A Large-Scale Multimodal Scientific Dataset and Models for Image-Text Tasks in the Scientific Domain

T Tarsi, H Adel, JH Metzen, D Zhang… - … of Computer Vision, 2024 - openaccess.thecvf.com

… to the HCI alt-text dataset [8] for our evaluation. It consists of 3386 scientific figures with alt-text
descriptions extracted from publications on Human-Computer Interaction and accessibility. …

Cited by 1 · Related articles · All 3 versions

[PDF] arxiv.org

Locvtp: Video-text pre-training for temporal localization

M Cao, T Yang, J Weng, C Zhang, J Wang… - … on Computer Vision, 2022 - Springer

… 3) The recent work CLIP [43] provides a stronger vision encoder and we also evaluate the
performance based on it. It is shown that the CLIP’s weights greatly improve the performance …

Cited by 55 · Related articles · All 6 versions

[PDF] cmu.edu

Improving accessibility of the web with a computer game

L Von Ahn, S Ginosar, M Kedia, R Liu… - Proceedings of the …, 2006 - dl.acm.org

… In essence, we solve a typical computer vision problem with … the images on the Web have
an HTML ALT caption). Today, it is the … Rather than designing a computer vision algorithm that …

Cited by 242 · Related articles · All 17 versions

[PDF] psu.edu

WebInSight: making web images accessible

JP Bigham, RS Kaminsky, RE Ladner… - Proceedings of the 8th …, 2006 - dl.acm.org

… alternative text. To ameliorate this problem, we introduce WebInSight, a system that
automatically creates and inserts alternative text … image using computer vision techniques is …

Cited by 221 · Related articles · All 16 versions

Unleash the Potential of Upstream Data Using Search, AI and Computer Vision

HM Asfoor, DA Alharbi - Abu Dhabi International Petroleum Exhibition …, 2022 - onepetro.org

… Enterprise Search, AI and Computer Vision to construct a single … Fourth, applying Computer
Vision techniques to extract … Figure 7, that utilizes Computer Vision to detect and extract …

Cited by 1 · Related articles

[PDF] arxiv.org

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks

X Li, X Yin, C Li, P Zhang, X Hu, L Zhang… - Computer Vision–ECCV …, 2020 - Springer

… modal representations on image-text pairs are becoming popular for vision-language tasks.
… text features as input to the model to be pre-trained and use self-attention to learn image-text …

Cited by 1972 · Related articles · All 6 versions

Comparison of computer vision approaches in application to the electricity and gas meter reading

M Spichkova, J Van Zyl, S Sachdev, A Bhardwaj… - Evaluation of Novel …, 2020 - Springer

… convenient alternative method for their current meter reading updating system. The proposed
solution is to use computer vision techniques for capturing readings. One of the alternative …

Cited by 15 · Related articles · All 4 versions

[PDF] researchgate.net

[PDF] Relational Learning in Computer Vision.

N Messina, F Falchi, G Amato, M Avvenuti, J Lokoc… - 2022 - researchgate.net

… This framework overturned many computer science fields, like Computer Vision and Natural
Language Processing, obtaining astonishing results. Nevertheless, many challenges are …

Cited by 1 · Related articles · All 2 versions

[PDF] thecvf.com

Groupvit: Semantic segmentation emerges from text supervision

J Xu, S De Mello, S Liu, W Byeon… - … Computer Vision …, 2022 - openaccess.thecvf.com

… Inspired by the success of Transformers in NLP [20, 76], the Vision Transformer (ViT) [22]
was recently proposed and has been successfully applied to multiple computer vision tasks, …

Cited by 416 · Related articles · All 6 versions
