Title :
Discovery of Prokaryotic Relationships through Latent Structure of Correlated Nucleotide Sequences
Author :
Muzinich, Natalya
Author_Institution :
Indiana University
Abstract :
This paper applies statistical techniques that have proved fruitful in many fields, including artificial intelligence and information retrieval, to the problem of establishing relationships among organisms. Combined, these techniques constitute a new method of comparing organisms based on their whole genomic sequences. The method represents genomes as sets of short overlapping nucleotide subsequences and employs latent structure modeling to capture correlations in the observed patterns of their distribution. Factor scores computed to measure the correlations serve as the input to Ward’s hierarchical cluster analysis, which produces a tree of the organisms’ relationships. The runtime results indicate that the method allows fast and efficient comparison that scales well as the number of organisms increases.
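The pipeline outlined in the abstract (overlapping subsequences, latent structure, factor scores, Ward's clustering) can be sketched roughly as follows. This is a minimal illustration and not the author's implementation: the subsequence length `k=3`, the use of relative frequencies, mean-centering, the choice of two factors, the NumPy/SciPy calls, and the randomly generated toy sequences are all assumptions made for the example.

```python
from itertools import product

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def kmer_profile(seq, k=3):
    """Relative frequencies of all overlapping k-mers over the alphabet ACGT."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip windows containing ambiguous bases such as N
            counts[j] += 1
    total = counts.sum()
    return counts / total if total else counts


def factor_scores(profiles, n_factors=2):
    """Project mean-centered profiles onto the leading singular vectors (PCA via SVD)."""
    centered = profiles - profiles.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_factors] * s[:n_factors]


# Toy sequences, invented for illustration only (not real genomes).
rng = np.random.default_rng(0)


def random_seq(weights, n=2000):
    """Random nucleotide string with the given A, C, G, T probabilities."""
    return "".join(rng.choice(list("ACGT"), size=n, p=weights))


genomes = {
    "at_rich_1": random_seq([0.4, 0.1, 0.1, 0.4]),
    "at_rich_2": random_seq([0.4, 0.1, 0.1, 0.4]),
    "gc_rich_1": random_seq([0.1, 0.4, 0.4, 0.1]),
    "gc_rich_2": random_seq([0.1, 0.4, 0.4, 0.1]),
}

profiles = np.array([kmer_profile(s) for s in genomes.values()])
scores = factor_scores(profiles, n_factors=2)

# Ward's agglomerative clustering on the factor scores; `tree` is a SciPy linkage matrix.
tree = linkage(scores, method="ward")

# Cut the tree into two groups to inspect the top-level split.
labels = fcluster(tree, t=2, criterion="maxclust")
```

The linkage matrix `tree` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the resulting hierarchy. Working on a small number of factor scores rather than the full subsequence profile keeps the clustering input compact as more organisms are added.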
Keywords :
Ward’s hierarchical cluster analysis; principal component analysis; singular value decomposition; whole genome sequence; Artificial intelligence; Bioinformatics; DNA; Genetics; Genomics; Information retrieval; Organisms; Phylogeny; Proteins; Sequences
Conference_Title :
Computer Vision and Pattern Recognition - Workshops, 2005. CVPR Workshops. IEEE Computer Society Conference on
Conference_Location :
San Diego, CA, USA
Print_ISBN :
0-7695-2372-2
DOI :
10.1109/CVPR.2005.443