HyNet: Hyper-scale object detection network framework for multiple spatial resolution remote sensing imagery

Z Zheng, Y Zhong, A Ma, X Han, J Zhao, Y Liu… - ISPRS Journal of …, 2020 - Elsevier
ISPRS Journal of Photogrammetry and Remote Sensing, 2020, Elsevier
Abstract
Geospatial object detection in multiple spatial resolution (MSR) remote sensing imagery is challenging because object scale varies widely. To address the scale problem, current convolutional neural network (CNN) based object detectors use multi-scale structures at the convolutional layer level, in which convolutional layers at different scales provide different receptive fields for capturing objects of different sizes. Examples of such structures are the image pyramid, the pyramidal feature hierarchy, and the feature pyramid network. However, in MSR imagery, the existing multi-scale structures still struggle to model the large scale variation of geospatial objects, because their receptive fields are limited by the fixed number of layers. In this paper, a hyper-scale object detection framework for MSR imagery, namely HyNet, is proposed to alleviate the extreme scale-variation problem by learning a hyper-scale feature representation. Whereas previous multi-scale structures operate at the level of the convolutional layer, HyNet uses a hyper-scale block, the HyBlock, as its core structure, which operates at the sub-layer group level. In the HyBlock, each convolutional layer in the multi-scale structure is first divided into sub-layer groups of equal size. At the sub-layer group level, hyper-scale features are obtained by a multi-scale sub-layer group operation with pyramidal receptive fields in the convolutional layers of each scale, which makes the HyBlock a fine-grained multi-scale structure. To aggregate the hyper-scale features effectively, group connection at the sub-layer level is used for intra-layer message passing. By promoting intra-layer message passing to capture the scale invariance of the hyper-scale features, the group connection alleviates the scale-variation issue for object detection in MSR imagery. To make better use of the hyper-scale features, adaptive feature selection is proposed, which selects the more effective hyper-scale features by adaptively weighting them. Experimental results on three object detection datasets demonstrate that HyNet learns a robust scale-invariant feature representation and outperforms the previous algorithms, and hence provides an effective new option for object detection in MSR remote sensing imagery.
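
The abstract names three steps inside a HyBlock: splitting a layer into equal sub-layer groups, applying operations with pyramidal receptive fields linked by group connection, and adaptively weighting the resulting features. The toy sketch below illustrates that flow on 1D channels in plain Python. It is not the authors' implementation: the box filter standing in for a convolution, the additive form of the group connection, the kernel sizes, the softmax weighting, and every function name are assumptions made for illustration, since the abstract does not specify them.

```python
import math


def box_filter(signal, kernel_size):
    """Zero-padded moving average; an assumed stand-in for a learned convolution."""
    half = kernel_size // 2
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[j] for j in range(i - half, i + half + 1) if 0 <= j < n]
        out.append(sum(window) / kernel_size)
    return out


def split_into_groups(channels, num_groups):
    """Divide the channels of one layer into sub-layer groups of equal size."""
    if len(channels) % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    size = len(channels) // num_groups
    return [channels[g * size:(g + 1) * size] for g in range(num_groups)]


def hyper_scale_block(channels, num_groups=4):
    """Hypothetical HyBlock-like pass: one receptive field per sub-layer group."""
    groups = split_into_groups(channels, num_groups)
    outputs = []
    previous = None
    for g, group in enumerate(groups):
        if previous is not None:
            # Assumed group connection: add the previous group's output
            # to this group's input (intra-layer message passing).
            group = [[a + b for a, b in zip(ch, prev_ch)]
                     for ch, prev_ch in zip(group, previous)]
        kernel_size = 2 * g + 1  # 1, 3, 5, 7: assumed pyramidal receptive fields
        previous = [box_filter(ch, kernel_size) for ch in group]
        outputs.append(previous)
    return outputs


def adaptive_select(group_outputs, logits):
    """Fuse group outputs with softmax weights (assumed form of the selection)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    num_channels = len(group_outputs[0])
    length = len(group_outputs[0][0])
    fused = [[0.0] * length for _ in range(num_channels)]
    for weight, group in zip(weights, group_outputs):
        for c, channel in enumerate(group):
            for i, value in enumerate(channel):
                fused[c][i] += weight * value
    return fused, weights


# Eight toy channels of length six, split into four sub-layer groups.
channels = [[float(c + i) for i in range(6)] for c in range(8)]
group_outputs = hyper_scale_block(channels, num_groups=4)
fused, weights = adaptive_select(group_outputs, logits=[0.0, 0.0, 0.0, 0.0])
```

In a trained detector the weights would come from learned parameters rather than fixed logits; equal logits are used here only so that the fusion reduces to a plain average.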