Authors
Minh-Tien Nguyen, Dung Tien Le, Nguyen Hong Son, Bui Cong Minh, Akira Shojiguchi
Publication date
2021/7/18
Conference
2021 International Joint Conference on Neural Networks (IJCNN)
Pages
1-8
Publisher
IEEE
Description
Information extraction is a cornerstone of office-data digitization, which requires converting unstructured data into structured data. In real business applications, however, adapting common extraction systems to domain-specific documents is a major obstacle because training data is difficult to prepare. To overcome this issue, we introduce a model that employs pre-trained language models with a customized CNN layer for domain adaptation. The model is validated on three Japanese domain-specific data sets and two benchmark machine reading comprehension data sets (SQuADs). Experimental results confirm that our model achieves promising results that are applicable to actual business scenarios.