Adaptive path selection for dynamic image captioning

Enhancement, integration, expansion: Activating representation of detailed features for occluded person re-identification

E Ning, Y Wang, C Wang, H Zhang, X Ning - Neural Networks, 2024 - Elsevier

A proposed method, Enhancement, integration, and Expansion, aims to activate the
representation of detailed features for occluded person re-identification. Region and context …

Cited by 33 | Related articles | All 4 versions

A comprehensive survey of 3d dense captioning: Localizing and describing objects in 3d scenes

T Yu, X Lin, S Wang, W Sheng… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Three-Dimensional (3D) dense captioning is an emerging vision-language bridging task that
aims to generate multiple detailed and accurate descriptions for 3D scenes. It presents …

Cited by 9 | Related articles | All 4 versions

Mining graph-based dynamic relationships for object detection

X Yang, Z Li, X Zhong, C Zhang, H Ma - Engineering Applications of …, 2023 - Elsevier

Since the propagation of deep neural networks results in the loss of detailed feature
information, the performance of most object detection methods is limited due to their …

Cited by 17 | Related articles | All 2 versions

Cross on cross attention: Deep fusion transformer for image captioning

J Zhang, Y Xie, W Ding, Z Wang - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Numerous studies have shown that in-depth mining of correlations between multi-modal
features can help improve the accuracy of cross-modal data analysis tasks. However, the …

Cited by 43 | Related articles | All 2 versions

Fast RF-UIC: A fast unsupervised image captioning model

R Yang, X Cui, Q Qin, Z Deng, R Lan, X Luo - Displays, 2023 - Elsevier

For visually impaired individuals, image captioning is a crucial task that utilizes deep
learning models to recognize an image and generate a descriptive sentence, enabling them …

Cited by 13 | Related articles

Unsupervised cross-modal hashing retrieval via Dynamic Contrast and Optimization

X Xie, Z Li, B Li, C Zhang, H Ma - Engineering Applications of Artificial …, 2024 - Elsevier

Cross-modal hashing encodes multimodal data into a common binary space, which can
efficiently measure correlations between cross-modal instances. However, most existing …

Cited by 3 | Related articles | All 2 versions

Dual attention transformer network for hyperspectral image classification

Z Shu, Y Wang, Z Yu - Engineering Applications of Artificial Intelligence, 2024 - Elsevier

Hyperspectral image classification (HSIC) has been a significant topic in the field of remote
sensing in the past few years. Convolutional neural networks have shown promising …

Cited by 22 | Related articles | All 2 versions

Mining core information by evaluating semantic importance for unpaired image captioning

J Wei, Z Li, C Zhang, H Ma - Neural Networks, 2024 - Elsevier

Recently, exciting progress has been made in the research of supervised image captioning.
However, manually annotated image-annotation pair data is difficult and expensive to …

Cited by 2 | Related articles | All 3 versions

Modeling graph-structured contexts for image captioning

Z Li, J Wei, F Huang, H Ma - Image and Vision Computing, 2023 - Elsevier

The performance of image captioning has been significantly improved recently through deep
neural network architectures combining with attention mechanisms and reinforcement …

Cited by 19 | Related articles | All 2 versions

DualSyn: A dual-level feature interaction method to predict synergistic drug combinations

Z Chen, Z Li, X Shen, Y Liu, X Lin, D Zeng… - Expert Systems with …, 2024 - Elsevier

Drug combination therapy can reduce drug resistance and improve treatment efficacy,
making it an increasingly promising cancer treatment method. Although existing …

Cited by 1 | Related articles | All 2 versions
