Authors
AS Atallah, Khairuddin Omar
Publication date
2008/10
Source
IJCSNS
Volume
8
Issue
10
Pages
137
Description
Preprocessing is the most important stage in an Arabic OCR system; it has a direct effect on the reliability and efficiency of the segmentation and feature extraction stages. Arabic is written cursively, and each of its characters has between two and four shapes. An Arabic word typically consists of two or more characters connected along an imaginary line called the baseline. Detecting the baseline is one of the main tasks in preprocessing for an Arabic OCR system, since the baseline can be used for both skew normalization and character segmentation. This paper aims to provide a comprehensive review of the methods researchers have proposed for detecting the Arabic baseline. These methods fall into four categories: (a) horizontal projection methods, (b) word skeleton methods, (c) contour tracing methods, and (d) principal component analysis methods. Each category has its own advantages and drawbacks.
Total citations
[Bar chart: citations per year, 2009 to 2024]