Authors
Adnan Ul-Hasan, Saad Bin Ahmed, Faisal Rashid, Faisal Shafait, Thomas M Breuel
Publication date
2013/8/25
Conference
2013 12th International Conference on Document Analysis and Recognition
Pages
1061-1065
Publisher
IEEE
Description
Recurrent neural networks (RNNs) have been successfully applied to the recognition of cursive handwritten documents in both English and Arabic scripts. The ability of RNNs to model context in sequence data such as speech and text makes them suitable candidates for developing OCR systems for printed Nabataean scripts (including Nastaleeq, for which no OCR system is available to date). In this work, we present the results of applying RNNs to printed Urdu text in Nastaleeq script. A Bidirectional Long Short-Term Memory (BLSTM) architecture with a Connectionist Temporal Classification (CTC) output layer was employed to recognize printed Urdu text. We evaluated BLSTM networks for two cases: one ignoring the character's shape variations and one considering them. The character-level recognition error rate is 5.15% for the first case and 13.6% for the second. These results were obtained on synthetically …
Total citations
[Bar chart: citations per year, 2013 to 2024]
Scholar articles
A Ul-Hasan, SB Ahmed, F Rashid, F Shafait, TM Breuel - 2013 12th international conference on document …, 2013