Title :
Annotated statistical indices for sequence analysis
Author :
Apostolico, Alberto ; Bock, Mary Ellen ; Xu, Xuyan
Author_Institution :
Dept. of Comput. Sci., Purdue Univ., West Lafayette, IN, USA
Abstract :
A statistical index for string x is a digital-search tree or trie that returns, for any query string ω and in a number of comparisons bounded by the length of ω, the number of occurrences of ω in x. Clever algorithms are available that support the construction and weighting of such indices in time and space linear in the length of x. This paper addresses the problem of annotating a statistical index with such parameters as the expected value and variance of the number of occurrences of each substring.
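Illustration (not part of the original record): the query behaviour described in the abstract can be sketched with a naive suffix trie whose nodes carry occurrence counts. This is only a minimal sketch under stated assumptions. It builds the trie in quadratic time and space, so it is not the linear-time construction the abstract refers to, and the names TrieNode, build_index, occurrences, and expected_occurrences are hypothetical. The expected-count helper assumes, purely for illustration, that symbols are drawn independently with fixed probabilities; the paper's own probabilistic model and its variance computation are not reproduced here.

```python
from math import prod


class TrieNode:
    """One trie node; count = occurrences in x of the substring spelled by the path."""
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}
        self.count = 0


def build_index(x):
    """Naive O(len(x)^2) construction: insert every suffix of x, counting visits."""
    root = TrieNode()
    for i in range(len(x)):
        node = root
        for ch in x[i:]:
            child = node.children.get(ch)
            if child is None:
                child = TrieNode()
                node.children[ch] = child
            child.count += 1  # one more suffix passes through this prefix
            node = child
    return root


def occurrences(root, w):
    """Count occurrences of a non-empty query w using at most len(w) lookups."""
    node = root
    for ch in w:
        node = node.children.get(ch)
        if node is None:
            return 0
    return node.count


def expected_occurrences(w, n, probs):
    """Expected count of w in a random length-n string.

    Assumption (not from the source): symbols are independent with
    probabilities given by probs, so each of the n - len(w) + 1 start
    positions matches w with probability prod(probs[c] for c in w).
    """
    positions = max(n - len(w) + 1, 0)
    return positions * prod(probs[c] for c in w)
```

For example, after root = build_index("abracadabra"), calling occurrences(root, "abra") walks four edges and reads the count stored at the node it reaches. An annotated index in the sense of the abstract would store values such as the expected count alongside that observed count at each node.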
Keywords :
pattern recognition; sequences; statistical analysis; tree searching; annotated statistical indices; construction; digital-search tree; occurrence; query string; sequence analysis; statistical index; substring; trie; variance; weighting; Algorithm design and analysis; Bioinformatics; Frequency measurement; Genomics; Pattern analysis; Pattern matching; Sequences; Statistical analysis; Statistics; USA Councils
Conference_Title :
Compression and Complexity of Sequences 1997. Proceedings
Conference_Location :
Salerno
Print_ISBN :
0-8186-8132-2
DOI :
10.1109/SEQUEN.1997.666917