Title :
Conservative, Non-conservative and Average Pairwise Statistical Significance of Local Sequence Alignment
Author :
Agrawal, Ankit ; Huang, Xiaoqiu
Author_Institution :
Dept. of Comput. Sci., Iowa State Univ., Ames, IA
Abstract :
Estimating the statistical significance of a pairwise alignment is an important problem in sequence comparison. Recently, it was shown that pairwise statistical significance performs better in practice than database statistical significance in terms of retrieval accuracy of homologs. In this paper, we introduce the concepts of conservative, non-conservative, and average pairwise statistical significance, which can be easily derived from original pairwise statistical significance estimates and which, by using multiple shuffle spaces, use more information specific to the sequence pair under consideration. Experimental results for homology detection reveal that the proposed measures give retrieval accuracy that is at least comparable to, or significantly better than, that of original pairwise statistical significance and of database statistical significance as reported by BLAST, PSI-BLAST, and SSEARCH. The proposed measures are further shown to be extremely useful when sequence-specific substitution matrices are used.
Keywords :
DNA; biology computing; molecular biophysics; proteins; statistical analysis; BLAST; PSI-BLAST; SSEARCH; database statistical significance; homology detection; local sequence alignment; multiple shuffle spaces; pairwise alignment; pairwise statistical significance; retrieval accuracy; sequence-specific substitution matrix; Bioinformatics; Biomedical measurements; Computer science; Databases; Information retrieval; Length measurement; Maximum likelihood estimation; Sequences; State estimation; USA Councils; Homologs; Sequence Alignment
Conference_Title :
Bioinformatics and Biomedicine, 2008. BIBM '08. IEEE International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
978-0-7695-3452-7
DOI :
10.1109/BIBM.2008.19