object detector image captioning - Academic Resource Search

Image captioning: Transforming objects into words

S Herdade, A Kappeler, K Boakye… - Advances in neural …, 2019 - proceedings.neurips.cc

… the use of object spatial relationship modeling for image captioning, specifically within the
… For image captioning, our architecture uses the feature vectors from the object detector as …

Cited by 539 · Related articles · All 7 versions

[PDF] arxiv.org

Image captioning with object detection and localization

Z Yang, YJ Zhang, S Rehman, Y Huang - Image and Graphics: 9th …, 2017 - Springer

… of objects in an image. Due to the aforementioned reason, we combine object detection with
image captioning to … Initially, we use an object detection model to detect objects in the image …

Cited by 56 · Related articles · All 6 versions

[PDF] thecvf.com

Beyond a pre-trained object detector: Cross-modal textual and visual context for image captioning

CW Kuo, Z Kira - Proceedings of the IEEE/CVF conference …, 2022 - openaccess.thecvf.com

… In this paper, we validate our proposed method on the VL task of image captioning. By …
-trained object detector described above, our method improves one of the SoTA image captioning …

Cited by 51 · Related articles · All 5 versions

[PDF] mdpi.com

Image captioning using motion-CNN with object detection

K Iwamura, JY Louhi Kasahara, A Moro, A Yamashita… - Sensors, 2021 - mdpi.com

… object region, not all the motion features. Therefore, object detection that detects an object
region in an image … our proposed method improves image captioning performance. Previous …

Cited by 16 · Related articles · All 9 versions

[PDF] arxiv.org

Image captioning with unseen objects

B Demirel, RG Cinbis, N Ikizler-Cinbis - arXiv preprint arXiv:1908.00047, 2019 - arxiv.org

… purely as a language modeling problem and presume the availability of a pre-trained fully-supervised
object detector over all object classes of interest, to the best of our knowledge. …

Cited by 17 · Related articles · All 3 versions

[PDF] springer.com

Image captioning model using attention and object features to mimic human image understanding

MA Al-Malla, A Jafar, N Ghneim - Journal of Big Data, 2022 - Springer

… In order to take advantage of the image classification features and the object detection
features, we add this concatenation step, where we attach the output of the YOLOv4 subsystem as …

Cited by 51 · Related articles · All 10 versions

[PDF] thecvf.com

Capsal: Leveraging captioning to boost semantics for salient object detection

L Zhang, J Zhang, Z Lin, H Lu… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com

… propose CapSal, a salient object detection framework that exploits image captioning to pro-
… Inspired by the success of these works, we propose to leverage image captioning as an …

Cited by 127 · Related articles · All 4 versions

[PDF] thecvf.com

Nocaps: Novel object captioning at scale

H Agrawal, K Desai, Y Wang, X Chen… - Proceedings of the …, 2019 - openaccess.thecvf.com

… Dataset Analysis In this section, we compare our nocaps benchmark to COCO Captions [5]
in terms of both image content and caption diversity. Based on ground-truth object detection …

Cited by 265 · Related articles · All 11 versions

[PDF] thecvf.com

Image captioning through image transformer

S He, W Liao, HR Tavakoli, M Yang… - Proceedings of the …, 2020 - openaccess.thecvf.com

… Two stage attention models consists of bottom-up attention and top-down attention, where
bottom-up attention first uses object detection models to detect multiple informative regions in …

Cited by 125 · Related articles · All 13 versions

[PDF] thecvf.com

Open-vocabulary object detection using captions

A Zareian, KD Rosa, DH Hu… - Proceedings of the …, 2021 - openaccess.thecvf.com

… object detectors using bounding box annotations for a limited set of object categories, as well
as image… problem of knowledge transfer from image-caption pretraining to object detection. …

Cited by 334 · Related articles · All 6 versions
