A brief survey of text mining: Classification, clustering and extraction techniques

M Allahyari, S Pouriyeh, M Assefi, S Safaei… - arXiv preprint arXiv …, 2017 - arxiv.org
The amount of text that is generated every day is increasing dramatically. This tremendous
volume of mostly unstructured text cannot be simply processed and perceived by computers …

Bidirectional LSTM-CRF models for sequence tagging

Z Huang, W Xu, K Yu - arXiv preprint arXiv:1508.01991, 2015 - arxiv.org
In this paper, we propose a variety of Long Short-Term Memory (LSTM) based models for
sequence tagging. These models include LSTM networks, bidirectional LSTM (BI-LSTM) …

Natural language processing (almost) from scratch

R Collobert, J Weston, L Bottou, M Karlen… - Journal of machine …, 2011 - jmlr.org
We propose a unified neural network architecture and learning algorithm that can be applied
to various natural language processing tasks including part-of-speech tagging, chunking …

Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition

EF Sang, F De Meulder - arXiv preprint cs/0306050, 2003 - arxiv.org
We describe the CoNLL-2003 shared task: language-independent named entity recognition.
We give background information on the data sets (English and German) and the evaluation …

Design challenges and misconceptions in named entity recognition

L Ratinov, D Roth - Proceedings of the thirteenth conference on …, 2009 - aclanthology.org
We analyze some of the fundamental design challenges and misconceptions that underlie
the development of an efficient and robust NER system. In particular, we address issues …

An encoding strategy based word-character LSTM for Chinese NER

W Liu, T Xu, Q Xu, J Song, Y Zu - … of the 2019 Conference of the …, 2019 - aclanthology.org
A recently proposed lattice model has demonstrated that words in character sequence can
provide rich word boundary information for character-based Chinese NER model. In this …

An introduction to conditional random fields

C Sutton, A McCallum - Foundations and Trends® in Machine …, 2012 - nowpublishers.com
Many tasks involve predicting a large number of variables that depend on each other as well
as on other observed variables. Structured prediction methods are essentially a combination …

A framework for learning predictive structures from multiple tasks and unlabeled data.

RK Ando, T Zhang, P Bartlett - Journal of machine learning research, 2005 - jmlr.org
One of the most important issues in machine learning is whether one can improve the
performance of a supervised learning algorithm by including unlabeled data. Methods that …

Chinese mineral named entity recognition based on BERT model

Y Yu, Y Wang, J Mu, W Li, S Jiao, Z Wang, P Lv… - Expert Systems with …, 2022 - Elsevier
Mineral named entity recognition (MNER) is the extraction for the specific types of entities
from unstructured Chinese mineral text, which is a prerequisite for building a mineral …

Introduction to the CoNLL-2005 shared task: Semantic role labeling

X Carreras, L Màrquez - Proceedings of the ninth conference on …, 2005 - aclanthology.org
In this paper we describe the CoNLL-2005 shared task on Semantic Role Labeling. We
introduce the specification and goals of the task, describe the data sets and evaluation …