Named entity recognition in Vietnamese free-text and web documents using conditional random fields

Authors

Nguyen Cam Tu, Tran Thi Oanh, Phan Xuan Hieu, Ha Quang Thuy

Publication date

2005

Journal

The 8th Conference on Some selection problems of Information Technology and Telecommunication

Description

Named entity recognition (NER) is the process of identifying entities of different types (e.g., person, location, organization, or date/time) mentioned in natural language documents. It is an important task in information extraction and a necessary precursor to higher-level natural language processing and understanding tasks such as text mining, text summarization, question answering, and machine translation. Furthermore, the automated recognition of named entities is essential to meeting the growing need to search, extract, and track relevant information on the web, and especially to building emerging semantic web technology.

This paper presents a machine learning approach to detecting named entities in Vietnamese free-text and web documents, based on conditional random fields (CRFs), a novel and powerful discriminative sequential learning model. A notable advantage of CRFs is their flexibility to incorporate a variety of arbitrary, overlapping, and non-independent features at different levels of granularity from the training data. As a result, our NER system can predict named entity types accurately by relying on various kinds of contextual evidence, ranging from linguistic information (i.e., words or phrases) and text format to a rich set of regular expressions. The experimental results (precision of 83.69%, recall of 87.41%, F1 score of 85.51%) on a moderate number of web documents show that our method not only achieves significant accuracy but also deals effectively with potential ambiguity in Vietnamese.
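To make the idea of overlapping contextual features concrete, the following is a minimal Python sketch, not the authors' implementation. It builds a feature dictionary per token from the three kinds of evidence the abstract names (words, text format, regular expressions) and checks that the reported F1 score is consistent with the harmonic mean of the reported precision and recall. The function names, the feature names, the regular expressions, and the sample sentence are all assumptions made for illustration; the paper's actual feature set is not given here.

```python
import re

# Hypothetical regular-expression features; the paper's real patterns are not listed here.
REGEX_FEATURES = {
    "is_number": re.compile(r"^\d+$"),
    "is_date": re.compile(r"^\d{1,2}/\d{1,2}(/\d{2,4})?$"),
    "is_acronym": re.compile(r"^[A-Z]{2,}$"),
}


def token_features(tokens, i):
    """Return overlapping, non-independent features for tokens[i]."""
    word = tokens[i]
    feats = {
        "word.lower": word.lower(),      # linguistic evidence: the word itself
        "word.istitle": word.istitle(),  # text-format evidence: capitalisation
        "word.isupper": word.isupper(),
    }
    # Regular-expression evidence
    for name, pattern in REGEX_FEATURES.items():
        feats[name] = bool(pattern.match(word))
    # Contextual evidence: neighbouring words, with boundary markers
    feats["prev.lower"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next.lower"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return feats


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Illustrative sentence (an assumption, not taken from the paper's data)
tokens = ["Ông", "Nguyễn", "Văn", "A", "đến", "Hà", "Nội", "ngày", "12/05/2005"]
features = [token_features(tokens, i) for i in range(len(tokens))]

# Consistency check on the reported figures
print(round(f1_score(83.69, 87.41), 2))  # 85.51
```

Feature dictionaries of this shape are a common input format for CRF toolkits, which learn one weight per feature and label combination; because a CRF is discriminative, features such as `word.istitle` and `is_acronym` may overlap freely without violating any independence assumption.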

Total citations

Cited by 17

[Chart: citations per year, 2007 to 2022]

Scholar articles

Named entity recognition in Vietnamese free-text and web documents using conditional random fields

NC Tu, TT Oanh, PX Hieu, HQ Thuy - The 8th Conference on Some selection problems of …, 2005
