Optimal exact string matching based on suffix arrays

MI Abouelhoda, E Ohlebusch, S Kurtz - String Processing and Information …, 2002 - Springer
MI Abouelhoda, E Ohlebusch, S Kurtz
String Processing and Information Retrieval: 9th International Symposium …, 2002Springer
Using the suffix tree of a string S, decision queries of the type “Is P a substring of S?” can be
answered in O (| P|) time and enumeration queries of the type “Where are all z occurrences
of P in S?” can be answered in O (| P|+ z) time, totally independent of the size of S. However,
in large scale applications as genome analysis, the space requirements of the suffix tree are
a severe drawback. The suffix array is a more space economical index structure. Using it
and an additional table, Manber and Myers (1993) showed that decision queries and …
Abstract
Using the suffix tree of a string S, decision queries of the type “Is P a substring of S?” can be answered in O(|P|) time and enumeration queries of the type “Where are all z occurrences of P in S?” can be answered in O(|P|+z) time, totally independent of the size of S. However, in large scale applications as genome analysis, the space requirements of the suffix tree are a severe drawback. The suffix array is a more space economical index structure. Using it and an additional table, Manber and Myers (1993) showed that decision queries and enumeration queries can be answered in O(|P|+log|S|) and O(|P|+log|S|+z) time, respectively, but no optimal time algorithms are known. In this paper, we show how to achieve the optimal O(|P|) and O(|P| + z) time bounds for the suffix array. Our approach is not confined to exact pattern matching. In fact, it can be used to efficiently solve all problems that are usually solved by a top-down traversal of the suffix tree. Experiments show that our method is not only of theoretical interest but also of practical relevance.
Springer
以上显示的是最相近的搜索结果。 查看全部搜索结果