Lexicon enhanced Chinese sequence labeling using BERT adapter

W Liu, X Fu, Y Zhang, W Xiao - arXiv preprint arXiv:2105.07148, 2021 - arxiv.org
Lexicon information and pre-trained models, such as BERT, have been combined to explore
Chinese sequence labelling tasks due to their respective strengths. However, existing …

ZEN: Pre-training Chinese text encoder enhanced by n-gram representations

S Diao, J Bai, Y Song, T Zhang, Y Wang - arXiv preprint arXiv:1911.00720, 2019 - arxiv.org
The pre-training of text encoders normally processes text as a sequence of tokens
corresponding to small text units, such as word pieces in English and characters in Chinese …

Improving named entity recognition with attentive ensemble of syntactic information

Y Nie, Y Tian, Y Song, X Ao, X Wan - arXiv preprint arXiv:2010.15466, 2020 - arxiv.org
Named entity recognition (NER) is highly sensitive to sentential syntactic and semantic
properties where entities may be extracted according to how they are used and placed in the …

Improving Chinese word segmentation with wordhood memory networks

Y Tian, Y Song, F Xia, T Zhang… - Proceedings of the 58th …, 2020 - aclanthology.org
Contextual features always play an important role in Chinese word segmentation (CWS).
Wordhood information, being one of the contextual features, is proved to be useful in many …

Taming pre-trained language models with n-gram representations for low-resource domain adaptation

S Diao, R Xu, H Su, Y Jiang, Y Song… - Proceedings of the 59th …, 2021 - aclanthology.org
Large pre-trained models such as BERT are known to improve different downstream NLP
tasks, even when such a model is trained on a generic domain. Moreover, recent studies …

Summarizing medical conversations via identifying important utterances

Y Song, Y Tian, N Wang, F Xia - Proceedings of the 28th …, 2020 - aclanthology.org
Summarization is an important natural language processing (NLP) task in identifying key
information from text. For conversations, the summarization systems need to extract salient …

Joint Chinese word segmentation and part-of-speech tagging via multi-channel attention of character n-grams

Y Tian, Y Song, F Xia - … of the 28th International Conference on …, 2020 - aclanthology.org
Chinese word segmentation (CWS) and part-of-speech (POS) tagging are two fundamental
tasks for Chinese language processing. Previous studies have demonstrated that jointly …

Smart contract generation assisted by AI-based word segmentation

Y Tong, W Tan, J Guo, B Shen, P Qin, S Zhuo - Applied Sciences, 2022 - mdpi.com
In the last decade, blockchain smart contracts emerged as an automated, decentralized,
traceable, and immutable medium of value exchange. Nevertheless, existing blockchain …

Unsupervised boundary-aware language model pretraining for Chinese sequence labeling

P Jiang, D Long, Y Zhang, P Xie, M Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org
Boundary information is critical for various Chinese language processing tasks, such as
word segmentation, part-of-speech tagging, and named entity recognition. Previous studies …

Prompt-based word-level information injection BERT for Chinese named entity recognition

Q He, G Chen, W Song, P Zhang - Applied Sciences, 2023 - mdpi.com
Named entity recognition (NER) is a subfield of natural language processing (NLP) that
identifies and classifies entities from plain text, such as people, organizations, locations, and …