A degenerate or indeterminate string on an alphabet Σ is a sequence of non-empty subsets of Σ. Given a degenerate string t of length n and its Burrows–Wheeler transform we present a new method for searching for a degenerate pattern of length m in t running in O(mn) time on a constant size alphabet Σ. Furthermore, it is a hybrid pattern matching technique that works on both regular and degenerate strings. A degenerate string is said to be conservative if its number of non-solid letters is upper-bounded by a fixed positive constant q; in this case we show that the search time complexity is O(qm2) for counting the number of occurrences andO(qm2+occ) for reporting the found occurrences where occ is the number of occurrences of the pattern in t. Experimental results show that our method performs well in practice
|Journal||Information Processing Letters|
|Early online date||15 Mar 2019|
|Publication status||E-pub ahead of print - 15 Mar 2019|
- Burrows-Wheeler transform
- pattern matching
FingerprintDive into the research topics of 'Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform'. Together they form a unique fingerprint.
- Faculty of Business and Physcial Sciences, Department of Computer Science - Honorary Research Fellow