A comprehensive survey of image augmentation techniques for deep learning

M Xu, S Yoon, A Fuentes, DS Park - Pattern Recognition, 2023 - Elsevier
Although deep learning has achieved satisfactory performance in computer vision, a large
volume of images is required. However, collecting images is often expensive and …

Weakly supervised object localization and detection: A survey

D Zhang, J Han, G Cheng… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
As an emerging and challenging problem in the computer vision community, weakly
supervised object localization and detection plays an important role for developing new …

Multi-class token transformer for weakly supervised semantic segmentation

L Xu, W Ouyang, M Bennamoun… - Proceedings of the …, 2022 - openaccess.thecvf.com
This paper proposes a new transformer-based framework to learn class-specific object
localization maps as pseudo labels for weakly supervised semantic segmentation (WSSS) …

LayerCAM: Exploring hierarchical class activation maps for localization

PT Jiang, CB Zhang, Q Hou… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
The class activation maps are generated from the final convolutional layer of CNN. They can
highlight discriminative object regions for the class of interest. These discovered object …

Regional semantic contrast and aggregation for weakly supervised semantic segmentation

T Zhou, M Zhang, F Zhao, J Li - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Learning semantic segmentation from weakly-labeled (e.g., image tags only) data is
challenging since it is hard to infer dense object regions from sparse semantic tags. Despite …

Self-challenging improves cross-domain generalization

Z Huang, H Wang, EP Xing, D Huang - … Glasgow, UK, August 23–28, 2020 …, 2020 - Springer
Convolutional Neural Networks (CNN) conduct image classification by activating
dominant features that correlated with labels. When the training and testing data are under …

TN-ZSTAD: Transferable network for zero-shot temporal activity detection

L Zhang, X Chang, J Liu, M Luo, Z Li… - … on Pattern Analysis …, 2022 - ieeexplore.ieee.org
An integral part of video analysis and surveillance is temporal activity detection, which
means to simultaneously recognize and localize activities in long untrimmed videos …

BANMo: Building animatable 3D neural models from many casual videos

G Yang, M Vo, N Neverova… - Proceedings of the …, 2022 - openaccess.thecvf.com
Prior work for articulated 3D shape reconstruction often relies on specialized multi-view and
depth sensors or pre-built deformable 3D models. Such methods do not scale to diverse sets …

Semi-supervised semantic segmentation with cross-consistency training

Y Ouali, C Hudelot, M Tami - Proceedings of the IEEE/CVF …, 2020 - openaccess.thecvf.com
In this paper, we present a novel cross-consistency based semi-supervised approach for
semantic segmentation. Consistency training has proven to be a powerful semi-supervised …

Masked discrimination for self-supervised learning on point clouds

H Liu, M Cai, YJ Lee - European Conference on Computer Vision, 2022 - Springer
Masked autoencoding has achieved great success for self-supervised learning in the image
and language domains. However, mask based pretraining has yet to show benefits for point …