A survey of the vision transformers and their CNN-transformer based variants

A Khan, Z Rauf, A Sohail, AR Khan, H Asif… - Artificial Intelligence …, 2023 - Springer
Vision transformers have become popular as a possible substitute to convolutional neural
networks (CNNs) for a variety of computer vision applications. These transformers, with their …

A review of recent advances on deep learning methods for audio-visual speech recognition

D Ivanko, D Ryumin, A Karpov - Mathematics, 2023 - mdpi.com
This article provides a detailed review of recent advances in audio-visual speech
recognition (AVSR) methods that have been developed over the last decade (2013–2023) …

3D human pose estimation and action recognition using fisheye cameras: A survey and benchmark

Y Zhang, S You, S Karaoglu, T Gevers - Pattern Recognition, 2025 - Elsevier
3D human pose estimation based on visual information aims to predict 3D poses of
humans in images or videos. The aim of human action recognition is to classify what kind of …

Comparative Analysis of the Clustering Quality in Self-Organizing Maps for Human Posture Classification

LE Ekemeyong Awong, T Zielinska - Sensors, 2023 - mdpi.com
The objective of this article is to develop a methodology for selecting the appropriate number
of clusters to group and identify human postures using neural networks with unsupervised …

Self-supervised random mask attention GAN in tackling pose-invariant face recognition

J Liao, T Guha, V Sanchez - Pattern Recognition, 2025 - Elsevier
Pose Invariant Face Recognition (PIFR) has significantly advanced with Generative
Adversarial Networks (GANs), which rotate face images acquired at any angle to a frontal …

Bottom-up 2D pose estimation via dual anatomical centers for small-scale persons

Y Cheng, Y Ai, B Wang, X Wang, RT Tan - Pattern Recognition, 2023 - Elsevier
In multi-person 2D pose estimation, the bottom-up methods simultaneously predict poses for
all persons, and unlike the top-down methods, do not rely on human detection. However, the …

Study on deep learning models for human pose estimation and its real time application

J Jangade, KS Babulal - 2023 6th International Conference on …, 2023 - ieeexplore.ieee.org
In computer vision, human pose estimation details the posture of the person's body structure
that can be Kinematic, Planar, and Volumetric in an image or video. However, pose …

NRPose: Towards noise resistance for multi-person pose estimation

J He, J Sun, Q Liu, S Peng - Pattern Recognition, 2023 - Elsevier
The high signal-to-noise ratio is one of the main challenges of multi-person pose estimation
(MPE) and receives little attention. In this work, we find that MPE suffers from two types of …

Kinematics modeling network for video-based human pose estimation

Y Dang, J Yin, S Zhang, J Liu, Y Hu - Pattern Recognition, 2024 - Elsevier
Estimating human poses from videos is critical in human–computer interaction. Joints
cooperate rather than move independently during human movement. There are both spatial …

CWPR: An optimized transformer-based model for construction worker pose estimation on construction robots

J Zhou, W Zhou, Y Wang - Advanced Engineering Informatics, 2024 - Elsevier
Estimating construction workers' poses is critically important for recognizing unsafe
behaviors, conducting ergonomic analyses, and assessing productivity. Recently, utilizing …