Abstract
A degenerate or indeterminate string on an alphabet Σ is a sequence of non-empty subsets of Σ. Given a degenerate string t of length n and its Burrows–Wheeler transform, we present a new method for searching for a degenerate pattern of length m in t, running in O(mn) time on a constant-size alphabet Σ. Furthermore, it is a hybrid pattern matching technique that works on both regular and degenerate strings. A degenerate string is said to be conservative if its number of non-solid letters is upper-bounded by a fixed positive constant q; in this case we show that the search time complexity is O(qm²) for counting the number of occurrences and O(qm² + occ) for reporting the found occurrences, where occ is the number of occurrences of the pattern in t. Experimental results show that our method performs well in practice.
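To make the definitions concrete, the following is a minimal sketch of a naive position-by-position matcher for degenerate strings. It is not the Burrows–Wheeler-based method of the paper. It assumes the usual convention that two degenerate letters match when their intersection is non-empty, and it treats a letter as non-solid when it contains more than one symbol. The bracket notation and the names `parse`, `find_occurrences`, and `count_non_solid` are hypothetical and introduced only for illustration.

```python
from typing import List, Set

# A degenerate string is modelled as a list of non-empty sets of symbols.
DegenerateString = List[Set[str]]


def parse(spec: str) -> DegenerateString:
    """Parse an illustrative notation such as 'a[bc]a' into
    [{'a'}, {'b', 'c'}, {'a'}]. The bracket syntax is an assumption
    made for this sketch, not notation taken from the paper."""
    letters: DegenerateString = []
    i = 0
    while i < len(spec):
        if spec[i] == "[":
            j = spec.index("]", i)      # raises ValueError if unclosed
            symbols = set(spec[i + 1:j])
            if not symbols:
                raise ValueError("degenerate letters must be non-empty")
            letters.append(symbols)
            i = j + 1
        else:
            letters.append({spec[i]})   # a solid letter is a singleton set
            i += 1
    return letters


def find_occurrences(text: DegenerateString,
                     pattern: DegenerateString) -> List[int]:
    """Return every start position where the pattern matches the text,
    assuming two letters match when their sets intersect.
    Naive scan: O(mn) set intersections."""
    n, m = len(text), len(pattern)
    return [
        i
        for i in range(n - m + 1)
        if all(text[i + k] & pattern[k] for k in range(m))
    ]


def count_non_solid(s: DegenerateString) -> int:
    """Number of non-solid letters (sets with more than one symbol).
    A string is conservative if this count is bounded by a constant q."""
    return sum(1 for letter in s if len(letter) > 1)


if __name__ == "__main__":
    t = parse("a[bc]ac[ab]")
    p = parse("[ab]c")
    print(find_occurrences(t, p))
    print(count_non_solid(t))
```

Because solid letters are singleton sets, the same routine handles regular strings as a special case, which mirrors the hybrid behaviour described in the abstract.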
Original language | English |
---|---|
Journal | Information Processing Letters |
Early online date | 15 Mar 2019 |
DOIs | |
Publication status | E-pub ahead of print - 15 Mar 2019 |
Keywords
- algorithm
- Burrows-Wheeler transform
- conservative
- degenerate
- pattern matching
- string
Fingerprint
Dive into the research topics of 'Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform'. Together they form a unique fingerprint.
Profiles
- Jacqueline Daykin
- Faculty of Business and Physical Sciences, Department of Computer Science - Honorary Research Fellow
Person: Other