S Garg, T Vu, A Moschitti - Proceedings of the AAAI conference on artificial …, 2020 - aaai.org
We propose TandA, an effective technique for fine-tuning pre-trained Transformer models for natural language tasks. Specifically, we first transfer a pre-trained model into a model for a …
S Kim, I Kang, N Kwak - Proceedings of the AAAI conference on artificial …, 2019 - ojs.aaai.org
Sentence matching is widely used in various natural language tasks such as natural language inference, paraphrase identification, and question answering. For these tasks …
R Yang, J Zhang, X Gao, F Ji, H Chen - arXiv preprint arXiv:1908.00300, 2019 - arxiv.org
In this paper, we present a fast and strong neural approach for general purpose text matching applications. We explore what is sufficient to build a fast and well-performed text …
W Lan, W Xu - arXiv preprint arXiv:1806.04330, 2018 - arxiv.org
In this paper, we analyze several neural network designs (and their variations) for sentence pair modeling and compare their performance extensively across eight datasets, including …
Q Guo, S Cao, Z Yi - International Journal of Intelligent Systems, 2022 - Wiley Online Library
Question answering systems have become prominent in all areas, while in the medical domain it has been challenging because of the abundant domain knowledge. Retrieval …
In this paper, we propose a novel method for a sentence-level answer-selection task that is a fundamental problem in natural language processing. First, we explore the effect of …
Automatic crime classification is a fundamental task in the legal field. Given the fact descriptions, judges first determine the relevant violated laws, and then the articles. As laws …
Y Tay, LA Tuan, SC Hui - Proceedings of the 24th ACM SIGKDD …, 2018 - dl.acm.org
Attention is typically used to select informative sub-phrases that are used for prediction. This paper investigates the novel use of attention as a form of feature augmentation, i.e., casted …
Large transformer-based language models have been shown to be very effective in many classification tasks. However, their computational complexity prevents their use in …