Title :
A self-organizing neural network structure for motif identification in DNA sequences
Author :
Liu, Derong ; Xiong, Xiaoxu ; DasGupta, Bhaskar
Author_Institution :
Dept. of Electr. & Comput. Eng., Illinois Univ., Chicago, IL, USA
Abstract :
In this paper, we study the problem of discovering subtle signals in unaligned DNA and protein sequences. Motifs, also known as approximate common substrings, are good examples of such signals. The problem of motif identification in DNA and protein sequences has been studied in the literature for many years. The major hurdles at this point are the computational complexity and the reliability of the search algorithms. We develop a self-organizing neural network for solving the problem of motif identification in DNA and protein sequences. Our network contains several layers, with each layer performing classification at a different level. The top layer divides the input space into a small number of regions, and the bottom layer classifies all input patterns into motif and non-motif patterns. Depending on the number of input patterns to be classified, several layers between the top layer and the bottom layer are needed to perform intermediate classification. We maintain low computational complexity through the layered structure, so that each pattern's classification is performed with respect to a small subspace of the whole input space. We also maintain high reliability, since the network grows as needed to ensure that all input patterns are considered and given the same amount of attention. Finally, simulation results show that our algorithm significantly outperforms existing algorithms, especially in reliability: it identifies motifs with higher accuracy than existing algorithms.
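The layered idea in the abstract can be illustrated with a rough sketch. This is not the authors' neural network: it swaps in a simple leader-style clustering over fixed-width substrings with Hamming distance, and every name and parameter (`windows`, `grow_clusters`, `find_motifs`, `coarse_radius`, `fine_radius`) is hypothetical. It only shows how a coarse top layer can split the input space into regions so that the fine bottom layer compares each pattern against a small subspace, growing new nodes as needed.

```python
from typing import List, Tuple

Pattern = Tuple[int, str]  # (index of source sequence, substring)


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def windows(seqs: List[str], width: int) -> List[Pattern]:
    """All (sequence index, substring) pairs of the given width."""
    return [(i, s[j:j + width])
            for i, s in enumerate(seqs)
            for j in range(len(s) - width + 1)]


def grow_clusters(items: List[Pattern], radius: int):
    """Assumed stand-in for one layer: a pattern joins the first prototype
    within `radius`; otherwise the layer grows a new node for it."""
    clusters: List[Tuple[str, List[Pattern]]] = []
    for idx, pat in items:
        for proto, members in clusters:
            if hamming(proto, pat) <= radius:
                members.append((idx, pat))
                break
        else:
            clusters.append((pat, [(idx, pat)]))
    return clusters


def find_motifs(seqs: List[str], width: int,
                coarse_radius: int, fine_radius: int):
    """Two layers: coarse regions first, then fine clusters inside each
    region. A fine cluster is reported as a motif candidate only if it
    contains a pattern from every input sequence."""
    all_seqs = set(range(len(seqs)))
    candidates = []
    for _, region in grow_clusters(windows(seqs, width), coarse_radius):
        for proto, members in grow_clusters(region, fine_radius):
            if {i for i, _ in members} == all_seqs:
                candidates.append((proto, members))
    return candidates
```

For example, `find_motifs(["ACGTA", "ACGTC"], 4, 2, 0)` reports a single candidate with prototype `"ACGT"`. Because the fine layer only scans patterns that fell into the same coarse region, each pattern is compared against far fewer prototypes than in a flat search, which is the complexity argument the abstract makes for the layered structure.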
Keywords :
DNA; biology computing; computational complexity; molecular biophysics; proteins; search problems; self-organising feature maps; DNA sequences; motif identification; protein sequences; searching algorithm; self-organizing neural network structure; Intelligent networks; Maintenance; Neural networks; Pattern classification; Protein engineering; RNA; Sequences; Signal processing
Conference_Title :
Networking, Sensing and Control, 2005. Proceedings. 2005 IEEE
Print_ISBN :
0-7803-8812-7
DOI :
10.1109/ICNSC.2005.1461174