Title :
On the implementation of a phylogenetic tree database
Author :
Yoshikawa, Takanobu ; Tabe, Tomohiro ; Kishinami, Risa ; Matsuda, Hideo ; Hashimoto, Akihiro
Author_Institution :
Dept. of Inf. & Math. Sci., Osaka Univ., Japan
Abstract :
A molecular phylogenetic tree is a tree-structured graph that represents the evolutionary process of genes, and is constructed from sequence data (such as DNA sequences) obtained from several organisms. Although molecular phylogenetic trees are fundamental data structures in evolutionary analysis, no database system is available that can match trees in the database against a user-supplied tree by comparing tree structures. In this paper, we propose a phylogenetic tree database system with a retrieval function that matches trees having similar structure. The tree data stored in the database are transformed from document images published in biological journals using a pattern-recognition program developed by us. To retrieve phylogenetic trees from the database according to their structures, we propose a method of determining the structural similarity between trees that is based on the split distance method. Our structural similarity measure shows high correlation with the log-likelihood difference that is widely used for comparing phylogenetic trees, and the computation time of our measure is much shorter than that of the log-likelihood difference, which relies on sequence comparison.
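The abstract names the split distance method as the basis of the structural similarity measure but does not spell it out. As a rough illustration only, the sketch below computes the plain split distance between two trees over the same set of organisms: each internal edge divides the leaves into two groups (a split), and the distance is the number of splits found in one tree but not the other. This is not the authors' measure, which is only described as being based on this idea. The nested-tuple tree encoding and the names `leaves`, `splits`, and `split_distance` are assumptions made for the example.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding: a leaf is a name, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf names under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the non-trivial splits of a tree, treated as unrooted.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the same split always has the same representation.
    """
    all_leaves = leaves(tree)
    ref = min(all_leaves)  # arbitrary but deterministic reference leaf
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - below if ref in below else below
        # Skip trivial splits (a single leaf versus everything else).
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def split_distance(a: Tree, b: Tree) -> int:
    """Count the splits that appear in exactly one of the two trees."""
    if leaves(a) != leaves(b):
        raise ValueError("trees must share the same leaf set")
    return len(splits(a) ^ splits(b))


# Usage: two four-leaf trees that group the organisms differently.
t1 = (("A", "B"), ("C", "D"))
t2 = (("A", "C"), ("B", "D"))
print(split_distance(t1, t1), split_distance(t1, t2))
```

Because the computation only touches tree topology, it needs no sequence data, which is consistent with the abstract's remark that a structure-based measure is much cheaper than the log-likelihood difference.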
Keywords :
DNA; biology computing; genetics; image retrieval; molecular biophysics; scientific information systems; sequences; tree data structures; visual databases; biological journals; data structures; document images; evolutionary process; genes; log-likelihood difference; molecular phylogenetic tree; organisms; pattern recognition program; phylogenetic tree database; retrieval function; sequence data; split distance method; structural similarity; tree matching; tree-structured graph; Database systems; Image databases; Information retrieval; Phylogeny; Time measurement; Tree graphs
Conference_Title :
Communications, Computers and Signal Processing, 1999 IEEE Pacific Rim Conference on
Conference_Location :
Victoria, BC
Print_ISBN :
0-7803-5582-2
DOI :
10.1109/PACRIM.1999.799473