A binary convolutional encoder-decoder network for real-time natural scene text processing

Z Liu, Y Li, F Ren, H Yu - arXiv preprint arXiv:1612.03630, 2016 - arxiv.org
In this paper, we develop a binary convolutional encoder-decoder network (B-CEDNet) for natural scene text processing (NSTP). It converts a text image into a class-distinguished salience map that reveals the categorical, spatial, and morphological information of characters. Existing solutions consume too much memory or too much run-time to be used in real-time applications on resource-constrained devices such as advanced driver assistance systems. The developed network can process multiple regions containing characters in a single forward pass, and it is trained to have binary weights and binary feature maps, which leads to both a remarkable inference run-time speedup and a reduction in memory usage. Trained on over 200,000 synthetic scene text images (size of …), it achieves … and … pixel-wise accuracy on the ICDAR-03 and ICDAR-13 datasets. It consumes only … of inference run-time on a GPU with a small network size of 2.14 MB, which is up to … faster and … smaller than its full-precision version.
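The abstract attributes the speedup and memory savings to binary weights and binary feature maps. As a minimal sketch of why that helps, and not the authors' implementation, the following assumes values are binarized to {-1, +1} with a sign function and stored one bit per element, so that a dot product reduces to an XOR followed by a bit count. The function names `binarize`, `pack_bits`, and `binary_dot` are hypothetical and chosen only for illustration.

```python
import random


def binarize(values):
    """Map real values to {-1, +1} by sign (assumption: zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a {-1, +1} list into one Python int: +1 -> bit 1, -1 -> bit 0."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(word_a, word_b, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    Positions where the bits differ contribute -1, matching positions
    contribute +1, so the result is n - 2 * (number of differing bits).
    """
    mismatches = bin(word_a ^ word_b).count("1")
    return n - 2 * mismatches


# Compare the bitwise result with an ordinary integer dot product.
random.seed(0)
weights = binarize([random.gauss(0, 1) for _ in range(64)])
features = binarize([random.gauss(0, 1) for _ in range(64)])

reference = sum(w * f for w, f in zip(weights, features))
fast = binary_dot(pack_bits(weights), pack_bits(features), len(weights))
```

In this sketch each weight occupies one bit instead of a 32-bit float, and the multiply-accumulate loop is replaced by two bitwise operations per word. Both effects are in line with the kind of savings the abstract describes, although the network's actual kernels may differ.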