Fine-grained image analysis with deep learning: A survey

XS Wei, YZ Song, O Mac Aodha, J Wu… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer
vision and pattern recognition, and underpins a diverse set of real-world applications. The …

Frequency-aware discriminative feature learning supervised by single-center loss for face forgery detection

J Li, H Xie, J Li, Z Wang… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Face forgery detection is raising ever-increasing interest in computer vision since facial
manipulation technologies cause serious worries. Though recent works have reached …

Task-adaptive attention for image captioning

C Yan, Y Hao, L Li, J Yin, A Liu, Z Mao… - … on Circuits and …, 2021 - ieeexplore.ieee.org
Attention mechanisms are now widely used in image captioning models. However, most
attention models only focus on visual features. When generating syntax related words, little …

Pooling methods in deep neural networks, a review

H Gholamalinezhad, H Khosravi - arXiv preprint arXiv:2009.07485, 2020 - arxiv.org
Nowadays, Deep Neural Networks are among the main tools used in various sciences.
Convolutional Neural Network is a special type of DNN consisting of several convolution …

Progressive spatio-temporal prototype matching for text-video retrieval

P Li, CW Xie, L Zhao, H Xie, J Ge… - Proceedings of the …, 2023 - openaccess.thecvf.com
The performance of text-video retrieval has been significantly improved by vision-language
cross-modal learning schemes. The typical solution is to directly align the global video-level …

Large scale visual food recognition

W Min, Z Wang, Y Liu, M Luo, L Kang… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Food recognition plays an important role in food choice and intake, which is essential to the
health and well-being of humans. It is thus of importance to the computer vision community …

Bridging the gap between vision transformers and convolutional neural networks on small datasets

Z Lu, H Xie, C Liu, Y Zhang - Advances in Neural …, 2022 - proceedings.neurips.cc
There still remains an extreme performance gap between Vision Transformers (ViTs) and
Convolutional Neural Networks (CNNs) when training from scratch on small datasets, which …

Fine-grained recognition with learnable semantic data augmentation

Y Pu, Y Han, Y Wang, J Feng, C Deng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Fine-grained image recognition is a longstanding computer vision challenge that focuses on
differentiating objects belonging to multiple subordinate categories within the same meta …

Adversarial learning for robust deep clustering

X Yang, C Deng, K Wei, J Yan… - Advances in Neural …, 2020 - proceedings.neurips.cc
Deep clustering integrates embedding and clustering together to obtain the optimal
nonlinear embedding space, which is more effective in real-world scenarios compared with …

Improving fine-grained visual recognition in low data regimes via self-boosting attention mechanism

Y Shu, B Yu, H Xu, L Liu - European Conference on Computer Vision, 2022 - Springer
The challenge of fine-grained visual recognition often lies in discovering the key
discriminative regions. While such regions can be automatically identified from a large-scale …