Chinese new word identification: a latent discriminative model with global features

X Sun, DG Huang, HY Song, FJ Ren - Journal of computer science and …, 2011 - Springer
Chinese new words are particularly problematic in Chinese natural language processing.
With the fast development of Internet and information explosion, it is impossible to get a …

Detecting new words from Chinese text using latent semi-CRF models

X Sun, D Huang, F Ren - IEICE transactions on information and …, 2010 - search.ieice.org
Chinese new words and their part-of-speech (POS) are particularly problematic in Chinese
natural language processing. With the fast development of internet and information …

Domain-specific new words detection in Chinese

A Chen, M Sun - Proceedings of the 6th joint conference on …, 2017 - aclanthology.org
With the explosive growth of Internet, more and more domain-specific environments appear,
such as forums, blogs, MOOCs, etc. Domain-specific words appear in these areas and …

Out-domain Chinese new word detection with statistics-based character embedding

Y Liang, M Yang, J Zhu, SM Yiu - Natural Language Engineering, 2019 - cambridge.org
Unlike English and other Western languages, many Asian languages such as Chinese and
Japanese do not delimit words by space. Word segmentation and new word detection are …

Modeling word concepts without convention: linguistic and computational issues in Chinese word identification

CR Huang, N Xue - 2015 - academic.oup.com
In The Oxford Handbook of Chinese Linguistics.

Automatic microblog‐oriented unknown word recognition with unsupervised method

D Huang, J Zhang, K Huang - Chinese Journal of Electronics, 2018 - Wiley Online Library
As a prerequisite task in Natural language processing (NLP), Chinese word segmentation
(CWS), is challenged by unknown words. Aiming to effectively detect Chinese unknown …

Automatic recognition of Chinese unknown word for single-character and affix models

X Jiang, L Wang, Y Cao, Z Lu - … and Management: Proceedings of the Sixth …, 2011 - Springer
This paper presents a novel method to recognize Chinese unknown word from short texts
corpus, which is based on our observation of both single-character and affix models of Chinese …

A pragmatic model for new Chinese word extraction

H Zhang, H Huang, C Zhu, S Shi - Proceedings of the 6th …, 2010 - ieeexplore.ieee.org
This paper proposed a pragmatic model for repeat-based Chinese New Word Extraction
(NWE). It contains two innovations. The first is a formal description for the process of NWE …

Based on support vector and word features new word discovery research

L Chengcheng, X Yuanfang - … , ISCTCS 2012, Beijing, China, May 28–June …, 2013 - Springer
Chinese word segmentation is difficult to deal with ambiguity and unknown words
recognition, this paper proposes the new word mode features as well as various word …

The use of SVM for Chinese new word identification

H Li, CN Huang, J Gao, X Fan - … , Hainan Island, China, March 22-24, 2004 …, 2005 - Springer
We present a study of new word identification (NWI) to improve the performance of a
Chinese word segmenter. In this paper the distribution and types of new words are …