Modern recognition methods based on deep learning place high demands on the size of training data. However, such data is not always publicly available and is often undersized or limited in the number of classes it covers. Preparing ground-truth data is expensive, time-consuming, and error-prone during both collection and annotation for many applications, particularly optical character recognition and handwriting recognition. In applications such as the recognition of 2-dimensional languages (diagrams, charts, mathematical formulas), annotation is further complicated by the fact that, in addition to the large number of symbol classes that vary by application, the spatial relations between symbols must also be annotated. In this work, we propose an approach for the automatic annotation of online handwritten mathematical expressions. Starting from an LSTM-based recognition model and a small annotated dataset, this iterative approach produces hierarchical annotations and gradually expands the alphabet, improving recognition accuracy on new symbol classes. The proposed approach does not require prior verification of the gathered dataset and comprises three main stages: training recognition models, automatic annotation using recognition and matching algorithms, and automatic verification. These stages are repeated until the number of newly recognized and automatically annotated samples becomes small enough. Samples that fail automatic verification are flagged as suspicious and require manual verification or correction, which is performed at the final stage. In our experiment, more than 85% of the samples were annotated automatically, with a symbol-level annotation accuracy above 99%. Experimental results demonstrate that the proposed approach saves up to 90% of the time spent on manual operations. The approach can also be applied to high-noise datasets.
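The iterative train → annotate → verify loop described above can be sketched as follows. This is a minimal illustration only: `train_model`, `recognize`, and the prefix-matching heuristic are hypothetical stand-ins, not the authors' LSTM-based recognizer or matching algorithm.

```python
# Illustrative sketch of the iterative annotation loop.
# All function names and the toy recognition heuristic are assumptions,
# not the paper's actual implementation.

def train_model(labeled):
    """Stand-in for training the recognizer; the 'model' here simply
    memorizes the label of each seen sample."""
    return dict(labeled)

def recognize(model, sample):
    """Stand-in for recognition + matching: returns (label, confidence)."""
    if sample in model:
        return model[sample], 1.0
    # Toy heuristic: label unseen samples via a shared prefix with known ones.
    for known, label in model.items():
        if sample[0] == known[0]:
            return label, 0.9
    return None, 0.0

def iterative_annotation(labeled, unlabeled, threshold=0.85, min_new=1):
    """Repeat train -> annotate -> verify until few new samples are added.
    Returns (annotated, suspicious); suspicious samples need manual review."""
    labeled = dict(labeled)
    pending = set(unlabeled)
    while True:
        model = train_model(labeled)
        newly = {}
        for s in list(pending):
            label, conf = recognize(model, s)
            # Automatic verification: keep only confident annotations.
            if label is not None and conf >= threshold:
                newly[s] = label
        for s in newly:
            pending.discard(s)
        labeled.update(newly)
        # Stopping criterion: too few new samples were annotated.
        if len(newly) < min_new:
            break
    return labeled, pending
```

In each iteration, the model is retrained on the growing annotated set, so samples rejected earlier may be accepted later once similar samples enter the training data; whatever remains in `pending` at convergence corresponds to the suspicious samples routed to manual verification.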