Unsupervised multilingual sentence boundary detection

T Kiss, J Strunk - Computational linguistics, 2006 - direct.mit.edu
In this article, we present a language-independent, unsupervised approach to sentence
boundary detection. It is based on the assumption that a large number of ambiguities in the …

Elephant: Sequence labeling for word and sentence segmentation

K Evang, V Basile, G Chrupała, J Bos - EMNLP 2013, 2013 - hal.science
Tokenization is widely regarded as a solved problem due to the high accuracy that
rule-based tokenizers achieve. But rule-based tokenizers are hard to maintain and their rules …

Fully convolutional networks for handwriting recognition

FP Such, D Peri, F Brockler, H Paul… - 2018 16th International …, 2018 - ieeexplore.ieee.org
Handwritten text recognition is challenging because of the virtually infinite ways a human
can write the same message. Our fully convolutional handwriting model takes in a …

Automatic text summarization with genetic algorithm-based attribute selection

CN Silla Jr, GL Pappa, AA Freitas… - … -American Conference on …, 2004 - Springer
The task of automatic text summarization consists of generating a summary of the original
text that allows the user to obtain the main pieces of information available in that text, but …

Sentence segmentation in narrative transcripts from neuropsychological tests using recurrent convolutional neural networks

MV Treviso, C Shulby, SM Aluísio - arXiv preprint arXiv:1610.00211, 2016 - arxiv.org
Automated discourse analysis tools based on Natural Language Processing (NLP) aiming at
the diagnosis of language-impairing dementias generally extract several textual metrics of …

Evaluating word embeddings for sentence boundary detection in speech transcripts

MV Treviso, CD Shulby, SM Aluisio - arXiv preprint arXiv:1708.04704, 2017 - arxiv.org
This paper is motivated by the automation of neuropsychological tests involving discourse
analysis in the retellings of narratives by patients with potential cognitive impairment. In this …

iSentenizer‐μ: Multilingual Sentence Boundary Detection Model

DF Wong, LS Chao, X Zeng - The Scientific World Journal, 2014 - Wiley Online Library
Sentence boundary detection (SBD) system is normally quite sensitive to genres of data that
the system is trained on. The genres of data are often referred to the shifts of text topics and …

Experiments on sentence boundary detection in user-generated web content

R López, TAS Pardo - … Linguistics and Intelligent Text Processing: 16th …, 2015 - Springer
Sentence Boundary Detection (SBD) is a very important prerequisite for proper
sentence analysis in different Natural Language Processing tasks. During the last years …

Indonesian Sentence Boundary Detection using Deep Learning Approaches

J Santoso, EI Setiawan, CN Purwanto… - … . Eng. Data Sci., 2021 - pdfs.semanticscholar.org
Sentence segmentation or tokenization is a primary text processing in natural language
processing [1]. To begin processing each token of words, we need to detect whether those …

Suggesting alternative query phrases in query results

H Shaw - US Patent 9,183,323, 2015 - Google Patents
Field of Classification Search: 707/766. Attorney, Agent, or Firm: Fish & Richardson PC.
See application file for complete search …