Authors
Dweepna Garg, Khushboo Trivedi
Publication date
2014/5/8
Conference
2014 IEEE International Conference on Advanced Communications, Control and Computing Technologies
Pages
1607-1610
Publisher
IEEE
Description
Clustering is regarded as one of the significant tasks in data mining; it deals primarily with grouping similar data. Clustering large data sets is a point of concern. Hadoop is a software framework for distributed processing of huge amounts of data across clusters of commodity computers using the MapReduce programming model. MapReduce parallelizes problems involving large data sets across computing clusters, which also makes it an attractive means of clustering large datasets. Mahout, a scalable machine learning library, offers an approach to Fuzzy K-mean clustering that runs on Hadoop. This paper studies the performance of Fuzzy K-mean clustering in MapReduce on Hadoop across different datasets. Experimental results show the execution time of the approach on a multi-node Hadoop cluster built using Amazon Elastic Cloud Computing …
Total citations
Citations-per-year chart, 2015 to 2023; bar labels as extracted: 154312213
Scholar articles
D Garg, K Trivedi - 2014 IEEE International Conference on Advanced …, 2014