MindTheGap: integrated detection and assembly of short and long insertions

G Rizk, A Gouin, R Chikhi, C Lemaitre - Bioinformatics, 2014 - academic.oup.com
Abstract
Motivation: Insertions play an important role in genome evolution. However, such variants are difficult to detect from short-read sequencing data, especially when they exceed the paired-end insert size. Many approaches have been proposed to call short insertion variants based on paired-end mapping. However, there remains a lack of practical methods to detect and assemble long variants.
Results: We propose here an original method, called MindTheGap, for the integrated detection and assembly of insertion variants from re-sequencing data. Importantly, it is designed to call insertions of any size, whether they are novel or duplicated, homozygous or heterozygous in the donor genome. MindTheGap uses an efficient k-mer-based method to detect insertion sites in a reference genome, and subsequently assembles the insertions from the donor reads. MindTheGap showed high recall and precision on simulated datasets of various genome complexities. When applied to real Caenorhabditis elegans and human NA12878 datasets, MindTheGap detected and correctly assembled insertions >1 kb, using at most 14 GB of memory.
Availability and implementation: http://mindthegap.genouest.org
Contact: guillaume.rizk@inria.fr or claire.lemaitre@inria.fr
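The abstract describes k-mer-based detection of insertion sites only at a high level. The following is a minimal toy sketch of that general idea, not MindTheGap's actual implementation: it assumes error-free reads, a single strand, a homozygous insertion, and no repeated sequence at the breakpoint. Under those assumptions, the k-1 reference k-mers that span an insertion point are absent from the donor reads. All function names here (`kmers`, `absent_runs`, `candidate_insertion_sites`) are hypothetical.

```python
def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def absent_runs(reference, donor_reads, k):
    """Return (start, length) for each maximal run of reference k-mers
    that never occur in the donor reads."""
    donor = set()
    for read in donor_reads:
        donor.update(kmers(read, k))

    runs = []
    start = None
    n = len(reference) - k + 1  # number of reference k-mers
    for i in range(n):
        present = reference[i:i + k] in donor
        if not present and start is None:
            start = i                       # a run of absent k-mers begins
        elif present and start is not None:
            runs.append((start, i - start))  # the run ends just before i
            start = None
    if start is not None:
        runs.append((start, n - start))
    return runs


def candidate_insertion_sites(reference, donor_reads, k):
    """Toy assumption: a clean homozygous insertion leaves a run of exactly
    k-1 absent reference k-mers. The breakpoint is reported as the 0-based
    reference index of the first base to the right of the insertion."""
    return [start + k - 1
            for start, length in absent_runs(reference, donor_reads, k)
            if length == k - 1]


if __name__ == "__main__":
    reference = "AAACCCGGGTTT"
    # Donor genome: the reference with "TATATA" inserted before index 6.
    donor = reference[:6] + "TATATA" + reference[6:]
    print(candidate_insertion_sites(reference, [donor], k=4))
```

A practical tool would also have to handle sequencing errors, reverse-complement k-mers, heterozygous sites (where the spanning k-mers remain present), and repeats around the breakpoint, none of which this sketch attempts.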
Oxford University Press