Title :
Estimation of protein function with an evolutionary dictionary
Author :
Chiba, Shinji ; Sugawara, Ken
Author_Institution :
Dept. of Inf. Eng., Sendai Nat. Coll. of Technol., Kitahara, Japan
Abstract :
Proteins have complicated spatial structures, and their chemical and physical functions originate from those structures. At present, no method is available to predict function accurately from a DNA or amino acid sequence. Instead, several approaches estimate function approximately through similarity retrieval of sequences. In this paper, we propose two methods for amino acid sequence retrieval using an evolutionary dictionary. The first is based on homology retrieval. Compression with an evolutionary dictionary lets us describe text data as an n-dimensional vector, using n dictionaries generated by compressing n typical texts, and then classify the texts by their sequential similarity. The second is based on motif retrieval. Because functionally similar amino acid sequences share some common arrangements, we can build a "dictionary" specific to each functional group, and we refine this dictionary with a genetic algorithm. The effectiveness of the proposed methods is examined using real genome data.
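As a rough illustration of the homology-retrieval idea in the abstract (a text described as an n-dimensional vector via n dictionaries built from n typical texts), the following minimal sketch substitutes zlib's preset-dictionary feature for the evolutionary dictionary. This is an assumption made purely for illustration and is not the authors' method. The function names and the toy sequences are hypothetical.

```python
import zlib


def compressed_size(text: str, dictionary: str) -> int:
    """Size in bytes of `text` deflated with `dictionary` as a zlib preset dictionary.

    Assumption: zlib's preset dictionary stands in for the paper's
    evolutionary dictionary, which the abstract does not specify.
    """
    comp = zlib.compressobj(level=9, zdict=dictionary.encode("ascii"))
    return len(comp.compress(text.encode("ascii")) + comp.flush())


def feature_vector(query: str, references: list[str]) -> list[float]:
    """Describe `query` as an n-dimensional vector, one component per reference text.

    Each component is compressed bytes per input character; a smaller value
    means the query shares more substrings with that reference.
    """
    if not query:
        raise ValueError("query must be non-empty")
    return [compressed_size(query, ref) / len(query) for ref in references]


def nearest_reference(query: str, references: list[str]) -> int:
    """Index of the reference whose dictionary compresses `query` best."""
    vec = feature_vector(query, references)
    return min(range(len(vec)), key=vec.__getitem__)


# Toy data (hypothetical sequences, not real genome data).
ref_a = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVKALPDAQFEVV"
ref_b = "GSHMLEDPVWNCRYTTPGHHWQMCFNPYDWEIHGCCTRMNWYPFDHAECMWKTNYHPGCWDRFEMNHYTC"
references = [ref_a, ref_b]
query = ref_a[10:50]  # a fragment taken from the first reference

vec = feature_vector(query, references)
```

Here `vec` has one component per reference, and `nearest_reference(query, references)` picks the reference whose dictionary yields the smallest component. A real system would presumably use many more reference texts and a proper classifier on the resulting vectors.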
Keywords :
DNA; data compression; genetic algorithms; proteins; amino acid sequence retrieval; evolutionary dictionary; homology retrieval; motif retrieval; n-dimensional vector; physical functions; protein function estimation; real genome data; similarity retrieval; spatial structure; text data; Amino acids; Bioinformatics; Chemicals; Dictionaries; Genomics; Proposals; Sequences
Conference_Title :
Evolutionary Computation, 2002. CEC '02. Proceedings of the 2002 Congress on
Conference_Location :
Honolulu, HI
Print_ISBN :
0-7803-7282-4
DOI :
10.1109/CEC.2002.1006253