DocumentCode
515343
Title
Clustering DNA sequences by self-organizing map and similarity functions
Author
Elhadi, Gamal F. ; Abbas, Mohamed A.
Author_Institution
Comput. Sci. Dept., Menofia Univ., Menouf, Egypt
fYear
2010
fDate
28-30 March 2010
Firstpage
1
Lastpage
8
Abstract
Deoxyribonucleic acid (DNA) is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms and some viruses. Our proposed approach clusters DNA sequences using the self-organizing map (SOM) technique. The main objective of this paper is to analyze biological data and to group DNA sequences into clusters more easily and efficiently. Clustering is a process that groups a set of objects into clusters so that the similarity among objects in the same cluster is high, while the similarity among objects in different clusters is low. Since the work entails processing huge amounts of incomplete or ambiguous data, the learning ability of neural networks, the uncertainty-handling capacity of fuzzy sets, and the searching potential of genetic algorithms are utilized in this direction. We use the proposed approach to analyze both large and small sets of input DNA sequences. The results show that the similarity of the sequences does not depend on the number of input sequences. Our approach evaluates the degree of similarity between DNA sequences using a hierarchical representation, the dendrogram. Representing large amounts of data as a hierarchical tree makes it possible to compare large sequences efficiently.
Keywords
DNA; biology computing; fuzzy set theory; genetic algorithms; self-organising feature maps; uncertainty handling; DNA sequence clustering process; SOM technique; fuzzy set theory; genetic algorithms; hierarchal representation Dendrogram; neural networks; self-organizing map technique; similarity functions; uncertainty handling; Clustering algorithms; DNA; Data analysis; Fuzzy sets; Genetics; Neural networks; Organisms; Sequences; Uncertainty; Viruses (medical); DNA sequences; clustering; nucleic acids; self-organizing map;
fLanguage
English
Publisher
IEEE
Conference_Titel
Informatics and Systems (INFOS), 2010 The 7th International Conference on
Conference_Location
Cairo
Print_ISBN
978-1-4244-5828-8
Type
conf
Filename
5461737