sam2lca: Lowest Common Ancestor for SAM/BAM/CRAM alignment files

M Borry, A Hübner, C Warinner - Journal of Open Source Software, 2022 - joss.theoj.org
In a typical shotgun metagenomics approach, after the DNA of an ecological community has been sequenced, it is compared to a genetic reference database of organisms with known taxonomy. Even though the number of DNA sequences and genomes in reference databases is constantly growing, there are still instances where a query sequence has no direct match in a reference database and instead aligns weakly to one or more distantly related reference organisms. Furthermore, when analyzing short DNA sequences, a query sequence will often match equally well to more than one reference organism, which poses a challenge for its taxonomic assignment.
One solution to this problem is to apply a lowest common ancestor (LCA) algorithm (Figure 1) during taxonomic profiling to place such ambiguous assignments higher in the taxonomic tree, where they can be assigned more confidently. This idea was first implemented for metagenomics in the MEGAN program (Huson et al., 2007).
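The LCA idea can be illustrated with a minimal Python sketch. This is not the sam2lca implementation; the toy taxonomy, the `PARENT` mapping, and the functions `lineage` and `lowest_common_ancestor` are hypothetical names introduced only for illustration. It assumes each taxon has a single parent and that a read's ambiguous hits are given as a list of taxon names.

```python
from typing import Dict, List, Sequence

# Hypothetical toy taxonomy (child -> parent); illustrative only,
# not data taken from sam2lca or from any reference database.
PARENT: Dict[str, str] = {
    "Escherichia coli": "Escherichia",
    "Escherichia fergusonii": "Escherichia",
    "Escherichia": "Enterobacteriaceae",
    "Salmonella enterica": "Salmonella",
    "Salmonella": "Enterobacteriaceae",
    "Enterobacteriaceae": "root",
}


def lineage(taxon: str, parent: Dict[str, str]) -> List[str]:
    """Return the path from a taxon up to the root, starting with the taxon."""
    path = [taxon]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def lowest_common_ancestor(taxa: Sequence[str], parent: Dict[str, str]) -> str:
    """Return the deepest node shared by the lineages of all given taxa."""
    if not taxa:
        raise ValueError("at least one taxon is required")
    lineages = [lineage(t, parent) for t in taxa]
    # Nodes present on every lineage.
    shared = set(lineages[0])
    for lin in lineages[1:]:
        shared &= set(lin)
    # Walking the first lineage from leaf to root, the first shared
    # node encountered is the lowest (most specific) common ancestor.
    for node in lineages[0]:
        if node in shared:
            return node
    raise ValueError("taxa do not share an ancestor in this taxonomy")


if __name__ == "__main__":
    # A read matching two species of one genus is placed at the genus.
    print(lowest_common_ancestor(
        ["Escherichia coli", "Escherichia fergusonii"], PARENT))
    # A read matching species from two genera is placed at the family.
    print(lowest_common_ancestor(
        ["Escherichia coli", "Salmonella enterica"], PARENT))
```

In this sketch, a read that hits only one taxon keeps that assignment, while a read with equally good hits to several taxa moves up the tree only as far as needed to cover all of them.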