An abstractive summarization technique with variable length keywords as per document diversity

MY Saeed, M Awais, M Younas, MA Shah, A Khan, MI Uddin, M Mahmoud
Comput Mater Contin, 2021 - cdn.techscience.cn
Abstract
Text summarization is an essential area of text mining that provides procedures for text extraction. In natural language processing, text summarization maps documents to a representative set of descriptive words. The objective of text extraction is therefore to obtain reduced yet expressive content from text documents. Text summarization has two main areas: abstractive and extractive summarization. Extractive text summarization in turn has two approaches: the first applies a sentence score algorithm, and the second follows word embedding principles. All such text extractions are limited in their ability to convey the basic theme of the underlying documents. In this paper, we perform text summarization using TF-IDF with PageRank keywords, a sentence score algorithm, and Word2Vec word embedding. The study compares these forms of text summarization with the actual text by calculating cosine similarities. Furthermore, TF-IDF based PageRank keywords are extracted from the other two extractive summarizations. An intersection over these three sets of TF-IDF keywords is then performed to generate a more representative set of keywords for each text document. This technique generates variable-length keywords according to document diversity instead of selecting fixed-length keywords for each document. This form of abstractive summarization improves metadata similarity to the original text compared with all other forms of summarized text. It also solves the issue of deciding the number of representative …
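The two operations the abstract leans on, cosine similarity between a summary and the original text and an intersection over several keyword sets, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, the cosine similarity uses raw term-frequency vectors (the abstract does not say which vectorization the paper uses), and the three keyword lists are made-up stand-ins for the output of the TF-IDF with PageRank extraction, which is not implemented here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts over raw term-frequency vectors.

    Assumption: plain term counts; the paper may weight terms differently.
    """
    va, vb = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def intersect_keywords(*keyword_lists):
    """Keep only the keywords that occur in every list.

    The number of surviving keywords depends on how much the lists
    agree, so the result has a variable length per document.
    """
    if not keyword_lists:
        return []
    common = set(keyword_lists[0]).intersection(*keyword_lists[1:])
    # Preserve the order of the first list for readability.
    return [k for k in keyword_lists[0] if k in common]


# Hypothetical keyword lists, one per summarization form.
kw_full_text = ["summarization", "keywords", "text", "pagerank", "embedding"]
kw_sentence_score = ["text", "summarization", "sentence", "keywords"]
kw_word2vec = ["keywords", "embedding", "summarization", "text", "vector"]

representative = intersect_keywords(kw_full_text, kw_sentence_score, kw_word2vec)
```

With these illustrative lists, only the keywords shared by all three survive, so a document whose summaries agree more closely ends up with a longer keyword set than one whose summaries diverge.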