Title : 
Searching BWT compressed text with the Boyer-Moore algorithm and binary search
         
        
            Author : 
Bell, Tim ; Powell, Matt ; Mukherjee, Amar ; Adjeroh, Don
         
        
            Author_Institution : 
Dept. of Comput. Sci., Univ. of Canterbury, New Zealand
         
        
            Abstract : 
This paper explores two techniques for on-line exact pattern matching in files that have been compressed using the Burrows-Wheeler transform (BWT). The first applies the Boyer-Moore algorithm (1977) to a transformed string. The second is based on the observation that the transform effectively contains a sorted list of all substrings of the original text, which can be exploited for very rapid searching using a variant of binary search. Both methods are faster than a decompress-and-search approach for small numbers of queries, and binary search is much faster even for large numbers of queries.
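
The idea behind the second approach can be illustrated with a minimal Python sketch. It is not the authors' implementation: for simplicity it sorts the suffixes of the plain text directly, whereas the paper obtains the sorted order from the BWT output. The function names `build_suffix_array` and `find_occurrences` are hypothetical and chosen only for this illustration. Once the substrings are available in sorted order, all matches of a pattern form one contiguous run, and two binary searches locate its two ends.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text`, sorted
    lexicographically by suffix. Simple but slow; illustration only."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: list[int], pattern: str) -> list[int]:
    """Return all start positions of `pattern` in `text`, in increasing order.

    Comparing only the first len(pattern) characters of each sorted suffix
    keeps the sequence non-decreasing, so binary search applies.
    """
    m = len(pattern)

    def prefix(rank: int) -> str:
        start = suffix_array[rank]
        return text[start:start + m]

    # Lower bound: first rank whose prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first rank whose prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Ranks first..lo-1 all start with the pattern.
    return sorted(suffix_array[first:lo])


if __name__ == "__main__":
    text = "mississippi"
    sa = build_suffix_array(text)
    print(find_occurrences(text, sa, "ssi"))
```

Each query costs a number of string comparisons that is logarithmic in the text length, which is consistent with the abstract's claim that binary search stays fast even for large numbers of queries once the sorted structure is available.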
         
        
            Keywords : 
data compression; search problems; string matching; text analysis; transforms; BWT; Boyer-Moore algorithm; Burrows-Wheeler transform; binary search; compressed text; on-line exact pattern matching; queries; searching; sorted list; substrings; transformed string; Computer science; Data compression; Encoding; Image coding; Pattern matching; USA Councils
         
        
            Conference_Title : 
Data Compression Conference, 2002. Proceedings. DCC 2002
         
        
        
            Print_ISBN : 
0-7695-1477-4
         
        
        
            DOI : 
10.1109/DCC.2002.999949