Text segmentation techniques: a critical review

I Pak, PL Teh - … , Optimization and Its Applications: Modelling and …, 2018 - Springer
Text segmentation is a method of splitting a document into smaller parts, which are usually
called segments. It is widely used in text processing. Each segment has its relevant …

Text summarization using topic-based vector space model and semantic measure

RC Belwal, S Rai, A Gupta - Information Processing & Management, 2021 - Elsevier
The primary shortcoming associated with extractive text summarization is redundancy,
where more than one sentence representing a similar type of information is incorporated in …

Multi-scale multi-task FCN for semantic page segmentation and table detection

D He, S Cohen, B Price, D Kifer… - 2017 14th IAPR …, 2017 - ieeexplore.ieee.org
Page segmentation and table detection play an important role in understanding the structure
of documents. We present a page segmentation algorithm that incorporates state-of-the-art …

Multilevel color image segmentation based on GLCM and improved salp swarm algorithm

Z Xing, H Jia - IEEE Access, 2019 - ieeexplore.ieee.org
The gray-level co-occurrence matrix (GLCM) can be adapted to segment the image
according to the pixels, but the segmentation effect becomes worse as the number of …

Artificial neural networks in classification of steel grades based on non-destructive tests

A Beskopylny, A Lyapin, H Anysz, B Meskhi… - Materials, 2020 - mdpi.com
Assessment of the mechanical properties of structural steels characterizing their strength
and deformation parameters is an essential problem in the monitoring of structures that have …

Text and non-text separation in offline document images: a survey

S Bhowmik, R Sarkar, M Nasipuri… - International Journal on …, 2018 - Springer
Separation of text and non-text is an essential processing step for any document analysis
system. Therefore, it is important to have a clear understanding of the state-of-the-art of …

Page segmentation using a convolutional neural network with trainable co-occurrence features

J Lee, H Hayashi, W Ohyama… - … conference on document …, 2019 - ieeexplore.ieee.org
In document analysis, page segmentation is a fundamental task that divides a document
image into semantic regions. In addition to local features, such as pixel-wise information, co …

Utilization of relative context for text non-text region classification in offline documents using multi-scale dilated convolutional neural network

S Bhowmik - Multimedia Tools and Applications, 2024 - Springer
Identification of text and non-text regions in a document image is necessary before feeding it
to an optical character recognition (OCR) engine to generate an editable version. This …

DoT-Net: Document layout classification using texture-based CNN

SC Kosaraju, M Masum, NZ Tsaku… - 2019 International …, 2019 - ieeexplore.ieee.org
Document Layout Analysis (DLA) is a segmentation process that decomposes a scanned
document image into its blocks of interest and classifies them. DLA is essential in a large …

A simple and practical review of over-fitting in neural network learning

OK Oyedotun, EO Olaniyi… - International Journal of …, 2017 - inderscienceonline.com
Training a neural network involves the adaptation of its internal parameters for modelling a
specific task. The states of the internal parameters during training describe how much …