Authors
Minh-Tien Nguyen, Dung Tien Le, Le Thai Linh, Nguyen Hong Son, Do Hoang Thai Duong, Bui Cong Minh, Nguyen Hai Phong, Nguyen Huu Hiep
Publication date
2020/10/19
Book
Proceedings of the 29th ACM International Conference on Information & Knowledge Management
Pages
3437-3440
Description
Information extraction is a well-known topic that plays a critical role in many NLP applications, as its outputs can serve as an entry step for digital transformation. However, gaps remain when applying research results to actual business cases. This paper introduces AURORA, an information extraction system for domain-specific business documents. The intuition behind AURORA is to use transfer learning for extraction. To do so, it uses transformers to cope with the limited training data available in business cases and stacks additional layers for domain adaptation. We demonstrate AURORA in actual scenarios where users are invited to try two functions: fine-grained and whole-paragraph extraction from Japanese business documents. A video of the system is available at http://y2u.be/xHQpYE41Tqw.
Scholar articles
MT Nguyen, DT Le, LT Linh, N Hong Son, DHT Duong… - Proceedings of the 29th ACM international conference …, 2020