Enhancing web service clustering using Length Feature Weight Method for service description document vector space representation

N Agarwal, G Sikka, LK Awasthi - Expert Systems with Applications, 2020 - Elsevier
Abstract
Due to the rapid growth of web services in repositories, discovering the required web service is becoming an increasingly cumbersome task, which has raised the demand for efficient web service clustering algorithms. When related web services are stored in clusters within a service repository, discovery improves because the search space and search time are reduced. Many researchers have worked in this field and used the Term Frequency-Inverse Document Frequency (TF-IDF) method to represent web services in vector space. In general, the TF-IDF approach has several limitations: (1) it is not efficient for large documents; (2) it ignores the position of a term and its co-occurrences; and (3) it cannot analyze how terms are dispersed across different documents. In the web service scenario, services are described in short text form. TF-IDF does not work well for web service representation because it cannot effectively determine the importance of a term relative to its occurrences in other documents. If we compare two service documents, ‘s1’ with a large number of terms and ‘s2’ with a small number of terms, TF-IDF does not rate the importance of terms in ‘s1’ as lower than that of terms in ‘s2’. Therefore, it is not possible to assign effective weights to the terms. Without an effective vector space representation, the performance of the clustering algorithm also degrades. In this paper, we propose a new approach, LFW+K, which uses Length Feature Weight (LFW) for the vectorized representation of services, followed by K-Means clustering. The proposed approach helps to find the informative terms in a web service and assigns term weights accordingly by considering parameters such as the dimension of the web service document, the maximum frequency of a term in the document, and the occurrences of a term in other documents.
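The ‘s1’ versus ‘s2’ comparison can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: TF is the raw term count, the "dimension" of a service document is read as its number of terms, and the function `length_aware` is a hypothetical weight invented for illustration only. It is not the LFW formula, which the abstract does not state; it merely combines the three parameters the abstract names. The toy documents are also made up.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Number of documents in which each term appears at least once."""
    return Counter(term for doc in docs for term in set(doc))


def tf_idf(docs):
    """Raw-count TF-IDF (assumed variant): count(t, d) * log(N / df(t))."""
    n = len(docs)
    df = document_frequency(docs)
    return [
        {t: c * math.log(n / df[t]) for t, c in Counter(doc).items()}
        for doc in docs
    ]


def length_aware(docs):
    """Hypothetical length-aware weight, NOT the paper's LFW formula.

    Uses the three parameters named in the abstract: document length,
    maximum term frequency in the document, and document frequency.
    Assumes every document is non-empty.
    """
    n = len(docs)
    df = document_frequency(docs)
    weighted = []
    for doc in docs:
        counts = Counter(doc)
        max_tf = max(counts.values())
        damp = math.log(1 + len(doc))  # longer documents are damped more
        weighted.append(
            {t: (c / max_tf) / damp * math.log(n / df[t]) for t, c in counts.items()}
        )
    return weighted


# Toy service descriptions (made up): s1 is long, s2 is short.
s1 = "weather forecast city temperature humidity wind report daily service lookup".split()
s2 = "weather lookup".split()
s3 = "currency exchange rate".split()
docs = [s1, s2, s3]

baseline = tf_idf(docs)
adjusted = length_aware(docs)

print("TF-IDF weight of 'weather' in s1, s2:", baseline[0]["weather"], baseline[1]["weather"])
print("Length-aware weight in s1, s2:       ", adjusted[0]["weather"], adjusted[1]["weather"])
```

Under these assumptions, raw-count TF-IDF gives "weather" the same weight in both documents, whereas the hypothetical length-aware weight rates it lower in the longer document ‘s1’ than in ‘s2’.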
LFW+K is applied to datasets of real-world web services, and its performance is measured using standard criteria (precision, recall, F1-score, and accuracy). The results are compared with K-Means clustering on the TF-IDF representation, i.e. TF-IDF+K. The results show that the proposed method outperforms clustering that uses TF-IDF for the vector space representation of web services.
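The abstract does not say how precision, recall, F1-score, and accuracy are derived from cluster assignments. One common convention, assumed here and not confirmed by the source, is to label each cluster with the majority ground-truth category of its members and then score the result as a classification. The sketch below follows that convention; the function names and the toy labels are hypothetical.

```python
from collections import Counter, defaultdict


def majority_mapping(true_labels, cluster_ids):
    """Map each cluster id to the most frequent true label among its members."""
    members = defaultdict(list)
    for label, cluster in zip(true_labels, cluster_ids):
        members[cluster].append(label)
    return {c: Counter(labels).most_common(1)[0][0] for c, labels in members.items()}


def evaluate(true_labels, cluster_ids):
    """Return ({class: (precision, recall, f1)}, accuracy) under majority labelling."""
    mapping = majority_mapping(true_labels, cluster_ids)
    predicted = [mapping[c] for c in cluster_ids]
    pairs = list(zip(true_labels, predicted))

    per_class = {}
    for cls in sorted(set(true_labels)):
        tp = sum(1 for y, p in pairs if y == cls and p == cls)
        fp = sum(1 for y, p in pairs if y != cls and p == cls)
        fn = sum(1 for y, p in pairs if y == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class[cls] = (precision, recall, f1)

    accuracy = sum(1 for y, p in pairs if y == p) / len(pairs)
    return per_class, accuracy


# Toy ground-truth categories and cluster ids (made up for illustration).
true_labels = ["weather", "weather", "weather", "finance", "finance", "travel"]
cluster_ids = [0, 0, 1, 1, 1, 2]

per_class, accuracy = evaluate(true_labels, cluster_ids)
for cls, (p, r, f1) in per_class.items():
    print(f"{cls}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.2f}")
```

The same function could score the cluster assignments produced from either representation (TF-IDF+K or LFW+K), so the two are compared on identical terms.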