DocumentCode :
515343
Title :
Clustering DNA sequences by self-organizing map and similarity functions
Author :
Elhadi, Gamal F. ; Abbas, Mohamed A.
Author_Institution :
Comput. Sci. Dept., Menofia Univ., Menouf, Egypt
fYear :
2010
fDate :
28-30 March 2010
Firstpage :
1
Lastpage :
8
Abstract :
Deoxyribonucleic acid (DNA) is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms and some viruses. Our proposed approach clusters DNA sequences using the self-organizing map (SOM) technique. The main objective of this paper is to analyze biological data and to group DNA sequences into clusters more easily and efficiently. Clustering is a process that groups a set of objects into clusters so that the similarity among objects in the same cluster is high, while the similarity among objects in different clusters is low. Since the work entails processing huge amounts of incomplete or ambiguous data, the learning ability of neural networks, the uncertainty-handling capacity of fuzzy sets, and the search potential of genetic algorithms are utilized. We use the proposed approach to analyze both large and small numbers of input DNA sequences. The results show that the similarity of the sequences does not depend on the number of input sequences. Our approach evaluates the degree of similarity between DNA sequences using a hierarchical dendrogram representation. Representing large amounts of data as a hierarchical tree makes it possible to compare large sequences efficiently.
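The abstract does not state how sequences are encoded or how the map is configured, so the following is only a minimal sketch of SOM-based sequence clustering under stated assumptions: each sequence is encoded as a normalized k-mer frequency vector (an assumed encoding, not necessarily the authors'), the map is one-dimensional, and the function names `kmer_vector`, `best_matching_unit`, and `train_som` and all parameter values are hypothetical. It omits the fuzzy-set, genetic-algorithm, and dendrogram steps the abstract mentions.

```python
import itertools
import math
import random


def kmer_vector(seq, k=2):
    """Normalized k-mer frequency vector over ACGT (assumed encoding)."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[m] / total for m in kmers]


def best_matching_unit(weights, vec):
    """Index of the map node closest to vec (squared Euclidean distance)."""
    dists = [sum((w - v) ** 2 for w, v in zip(node, vec)) for node in weights]
    return dists.index(min(dists))


def train_som(vectors, n_nodes=4, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM and return a list of n_nodes weight vectors."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    dim = len(vectors[0])
    weights = [[rng.random() / dim for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.1)   # shrinking neighborhood width
        for vec in rng.sample(vectors, len(vectors)):  # shuffled pass
            bmu = best_matching_unit(weights, vec)
            for j, node in enumerate(weights):
                # Gaussian neighborhood: nodes near the BMU move more
                h = math.exp(-((j - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(dim):
                    node[d] += lr * h * (vec[d] - node[d])
    return weights


# Toy input (made-up sequences, not data from the paper)
seqs = ["ATATATATAT", "ATATTTATAT", "GCGCGCGCGC", "GCGGGCGCGC"]
vectors = [kmer_vector(s) for s in seqs]
som = train_som(vectors)
clusters = [best_matching_unit(som, v) for v in vectors]
```

Each entry of `clusters` is the index of the map node a sequence is assigned to; sequences sharing an index fall in the same cluster. A hierarchical comparison like the one the abstract describes could then be built over the node weight vectors, though the paper's exact similarity functions are not given here.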
Keywords :
DNA; biology computing; fuzzy set theory; genetic algorithms; self-organising feature maps; uncertainty handling; DNA sequence clustering process; SOM technique; hierarchal representation Dendrogram; neural networks; self-organizing map technique; similarity functions; Clustering algorithms; Data analysis; Fuzzy sets; Genetics; Organisms; Sequences; Uncertainty; Viruses (medical); DNA sequences; clustering; nucleic acids; self-organizing map
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Informatics and Systems (INFOS), 2010 The 7th International Conference on
Conference_Location :
Cairo
Print_ISBN :
978-1-4244-5828-8
Type :
conf
Filename :
5461737