De-Bruijn graph with MapReduce framework towards metagenomic data classification

MS Kamal, S Parvin, AS Ashour, F Shi… - International Journal of …, 2017 - Springer
International Journal of Information Technology, 2017. Springer
Abstract
Metagenomic gene classification is significant in bioinformatics and computational biology research. Huge interrelated datasets deal with human characteristics, diseases and molecular functionalities. Analysis of metagenomic reorganization is challenging because of the diversity of the data and the lack of efficient classification tools. A graph-based MapReduce approach can handle this genomic diversity. MapReduce has two parts: mapping and reducing. In the mapping phase, a recursive naive algorithm is used to generate k-mers. The De Bruijn graph is a compact representation of the k-mers and is used to find an optimal path (solution) for genome assembly. Similarity metrics are used to measure similarity among the deoxyribonucleic acid (DNA) sequences. In the reducing phase, Jaccard similarity and purity of clustering serve as the dataset classifier, grouping the sequences by their similarity, so that DNA sequences from a large database can be classified. Extensive experimental analysis demonstrated that graph-based MapReduce analysis generates optimal solutions, and remarkable improvements in time and space were recorded. The results established that the proposed framework performed faster than SSMA-SFSD as the number of classified elements increased, and that it provided better accuracy for metagenomic data clustering.
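The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (kmers, de_bruijn_edges, jaccard, purity), the choice to compare sequences by their k-mer sets, and the single-machine setting (no actual MapReduce runtime) are all assumptions made for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Naive recursive k-mer generation: emit the first k-mer, recurse on the rest.

    Assumption: sequences are short; long inputs would exceed Python's
    recursion limit and need an iterative loop instead.
    """
    if len(seq) < k:
        return []
    return [seq[:k]] + kmers(seq[1:], k)


def de_bruijn_edges(seq, k):
    """Build a De Bruijn graph: each k-mer is an edge from its (k-1)-mer
    prefix to its (k-1)-mer suffix."""
    graph = defaultdict(list)
    for kmer in kmers(seq, k):
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def jaccard(a, b, k):
    """Jaccard similarity of two sequences, measured on their k-mer sets
    (an assumed choice of feature set)."""
    set_a, set_b = set(kmers(a, k)), set(kmers(b, k))
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def purity(clusters, labels):
    """Clustering purity: fraction of items that carry the majority label
    of their cluster. `clusters` is a list of lists of item ids and
    `labels` maps each item id to its true class."""
    total = sum(len(cluster) for cluster in clusters)
    majority = 0
    for cluster in clusters:
        counts = defaultdict(int)
        for item in cluster:
            counts[labels[item]] += 1
        if counts:
            majority += max(counts.values())
    return majority / total if total else 0.0
```

In a MapReduce setting, kmers would plausibly run in the mappers, with the similarity and purity computations applied in the reducers; that split is inferred from the abstract rather than confirmed by it.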