Exploring attention on faces: similarities between humans and transformers

M Cadoni, S Nixon, A Lagorio… - 2022 18th IEEE …, 2022 - ieeexplore.ieee.org
Attention in Machine Learning allows a model to selectively up-weight informative parts of an input relative to others. The Vision Transformer (ViT) is based entirely on attention. ViTs have shown state-of-the-art performance in multiple fields, including person re-identification, presentation attack detection and object recognition. Several works have shown that embedding human attention into a Machine Learning pipeline can improve performance or compensate for a lack of data. However, the correlation between computer vision models and human attention has not yet been investigated. In this paper we explore the intersection of human and Transformer attention. For this we collect a new dataset of human fixations, the University of Sassari Face Fixation Dataset (Uniss-FFD), and show through a quantitative analysis that correlations exist between these two modalities. The dataset described in this paper is available at https://github.com/CVLab-Uniss/Uniss-FFD.
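The abstract describes attention as selectively up-weighting informative parts of an input relative to others. As an illustrative sketch (not code from the paper), the scaled dot-product self-attention at the core of the ViT can be written in a few lines of NumPy: each output token is a weighted average of all tokens, with weights derived from query-key similarity.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: outputs are weighted averages of V,
    with weights given by a softmax over query-key similarities."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax: rows sum to 1
    return weights @ V, weights

# Toy example: 3 tokens with 4-dimensional embeddings (self-attention)
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(X, X, X)
print(w.sum(axis=-1))  # each token's attention weights sum to 1
```

In a ViT, the rows of `X` would be patch embeddings, and the weight matrix `w` is the kind of attention map that the paper compares against human fixations.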