Authors
Md Rezaul Karim, Md Azam Hossain, Md Mamunur Rashid, Byeong-Soo Jeong, Ho-Jin Choi
Publication date
2012/3/1
Journal
IETE Technical Review
Volume
29
Issue
2
Pages
162-168
Publisher
Taylor & Francis
Description
Current DNA sequence datasets have become extremely large, making it a great challenge for single-processor, main-memory-based computing systems to mine interesting patterns. Such limited hardware resources make most Apriori-like algorithms inefficient. However, recent implementations of the MapReduce framework have overcome these limitations. Furthermore, mining maximal contiguous frequent patterns to express the function and structure of DNA sequences is a useful technique, capable of capturing the common data characteristics among related sequences. In this paper, we propose an efficient approach for mining maximal contiguous frequent patterns in large DNA sequence data using the MapReduce framework, which can handle massive DNA sequence datasets with a large number of nodes on a Hadoop platform. Our extensive experimental results show that the …
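To make the term "maximal contiguous frequent pattern" concrete, the following is a minimal single-machine sketch written in a map/reduce style. It is not the paper's algorithm and does not use Hadoop; the function names (`map_phase`, `reduce_phase`, `maximal_contiguous_frequent_patterns`), the level-wise loop over pattern length, and the definition of support as the number of sequences containing a pattern are all assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(seq_id, sequence, k):
    """Hypothetical mapper: emit (pattern, seq_id) for each contiguous substring of length k."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k], seq_id


def reduce_phase(pairs, min_support):
    """Hypothetical reducer: count distinct supporting sequences per pattern."""
    supporters = defaultdict(set)
    for pattern, seq_id in pairs:
        supporters[pattern].add(seq_id)
    # Keep only patterns that occur in at least min_support sequences.
    return {p: len(ids) for p, ids in supporters.items() if len(ids) >= min_support}


def maximal_contiguous_frequent_patterns(sequences, min_support):
    """Return the frequent contiguous patterns that have no frequent extension."""
    frequent = {}
    k = 1
    while True:
        # One simulated map/reduce round per pattern length k.
        pairs = [
            pair
            for seq_id, seq in enumerate(sequences)
            for pair in map_phase(seq_id, seq, k)
        ]
        level = reduce_phase(pairs, min_support)
        if not level:
            break  # no frequent pattern of length k, so none longer exists either
        frequent.update(level)
        k += 1
    # Every contiguous substring of a frequent pattern is itself frequent, so a
    # pattern is maximal exactly when no frequent pattern one character longer
    # contains it.
    return {
        p
        for p in frequent
        if not any(len(q) == len(p) + 1 and p in q for q in frequent)
    }
```

For example, with the sequences `ACGTAC`, `ACGTTT`, and `GGACGT` and a minimum support of 3, the only maximal contiguous frequent pattern under these assumptions is `ACGT`; shorter frequent patterns such as `ACG` are absorbed by it.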
Total citations
[Chart: citations per year, 2012 to 2023]