Title :
Two Efficient Algorithms for Linear Time Suffix Array Construction
Author :
Nong, Ge ; Zhang, Sen ; Chan, Wai Hong
Author_Institution :
Dept. of Comput. Sci., Sun Yat-sen Univ., Guangzhou, China
Abstract :
We present two efficient algorithms for linear time suffix array construction. Both achieve linear time complexity through divide-and-conquer and recursion. What distinguishes the proposed algorithms from other linear time suffix array construction algorithms (SACAs) is the choice of substrings sampled for problem reduction, namely the variable-length leftmost S-type (LMS) substrings and the fixed-length d-critical substrings, together with the simple algorithms used to sort these sampled substrings: induced sorting for the variable-length LMS substrings and radix sorting for the fixed-length d-critical substrings. These very simple sorting mechanisms give our algorithms an elegant design framework and, in turn, surprisingly succinct implementations. The fully functional sample implementation of each proposed algorithm requires only around 100 lines of C code, which is only 1/10 the size of the implementation of the KA algorithm and comparable to that of the KS algorithm. The experimental results demonstrate that the two proposed algorithms yield the best time and space efficiencies among all existing linear time SACAs.
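As a rough illustration of the sampling step mentioned in the abstract, the following C sketch classifies each suffix as S-type or L-type and collects the leftmost S-type (LMS) positions. It is a minimal sketch under the usual textbook definitions, not the authors' implementation: the function names classify_types and find_lms are hypothetical, and the input is assumed to end with a unique sentinel character that is smaller than every other character. Induced sorting and the recursive reduction are not shown.

```c
#include <stdbool.h>
#include <stddef.h>

/* Classify every suffix of s[0..n-1] as S-type (true) or L-type (false).
   Assumption: s[n-1] is a unique sentinel smaller than all other characters,
   so its suffix is S-type by definition. */
void classify_types(const unsigned char *s, size_t n, bool *is_s)
{
    if (n == 0)
        return;
    is_s[n - 1] = true;
    /* Scan right to left: a suffix is S-type if it is lexicographically
       smaller than the suffix that starts one position to its right. */
    for (size_t i = n - 1; i-- > 0; ) {
        if (s[i] < s[i + 1])
            is_s[i] = true;
        else if (s[i] > s[i + 1])
            is_s[i] = false;
        else
            is_s[i] = is_s[i + 1];  /* equal characters inherit the type */
    }
}

/* Store the LMS positions in lms[] and return how many were found.
   Position i is LMS if suffix i is S-type and suffix i-1 is L-type. */
size_t find_lms(const bool *is_s, size_t n, size_t *lms)
{
    size_t count = 0;
    for (size_t i = 1; i < n; i++) {
        if (is_s[i] && !is_s[i - 1])
            lms[count++] = i;
    }
    return count;
}
```

For the string "mmiissiissiippii$", with '$' as the assumed sentinel, the types come out as LLSSLLSSLLSSLLLLS and the LMS positions are 2, 6, 10 and 16. Each LMS substring runs from one LMS position to the next, which is why these substrings have variable length.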
Keywords :
computational complexity; data structures; sorting; design framework; divide-and-conquer technique; fixed-length d-critical substrings; induced sorting algorithm; linear time complexity; linear time suffix array construction; radix sorting algorithm; recursion technique; sampled substrings sorting; variable-length leftmost S-type substrings; Algorithm design and analysis; Arrays; Construction industry; Indexes; Least squares approximation; Merging; Sorting; Suffix array; divide-and-conquer; linear time
Journal_Title :
Computers, IEEE Transactions on
DOI :
10.1109/TC.2010.188