DocumentCode :
3502208
Title :
A comparative study on existing methodologies to predict dominating patterns amongst biological sequences
Author :
Priya, G. Lakshmi ; Hariharan, Shanmugasundaram
Author_Institution :
Dept. of Comput. Sci. & Eng., J.J. Coll. of Eng. & Technol., Tiruchirapalli, India
fYear :
2011
fDate :
14-16 Dec. 2011
Firstpage :
210
Lastpage :
215
Abstract :
Data mining is the process of extracting patterns from very large datasets, here biological ones. Data mining algorithms can reveal biologically relevant associations between different genes and gene expression. Several data mining techniques are available for predicting frequent patterns. One of them is association rule mining, which can be applied to crucial problems in biological science. In the literature, various algorithms have been employed to generate frequent patterns for distinct applications. These algorithms have limitations in predicting frequent patterns, such as space complexity, time complexity, and accuracy. To overcome these drawbacks, this study examines existing algorithms for generating frequent patterns from biological sequences. The literature survey shows that a significant number of methods have been developed for predicting associative patterns. The proposed system is to be developed for solving problems in biological science. A biological sequence may be a DNA sequence, a gene expression sequence, or a protein sequence for a specific viral disease. Amino acids are the building blocks of proteins. Proteins are organic compounds made up of amino acids arranged in a linear chain and folded into a globular form. The future proposal will not only predict frequent patterns; it will also address time complexity, space complexity, and the accuracy of the solution to the required problem. By taking these three factors into consideration, an efficient algorithm can be identified for predicting the dominating amino acids for any specific biological implication.
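As a rough illustration of what generating frequent patterns from protein sequences can mean, the sketch below counts contiguous amino-acid patterns that occur in at least a minimum number of sequences, using an Apriori-style pruning step. It is not one of the surveyed algorithms and not the authors' proposed system; the function name `frequent_patterns`, its parameters, and the toy sequences are all assumptions made for this example, and it mines contiguous patterns only rather than full association rules.

```python
from collections import Counter

def frequent_patterns(sequences, min_support, max_len=3):
    """Return {pattern: support} for contiguous patterns that occur in at
    least min_support sequences (support = number of sequences containing it)."""
    result = {}
    frequent_prev = None  # frequent patterns of length k-1
    for k in range(1, max_len + 1):
        counts = Counter()
        for seq in sequences:
            # A set ensures each sequence contributes at most once per pattern.
            seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
            if frequent_prev is not None:
                # Apriori-style pruning: a pattern can only be frequent if both
                # of its (k-1)-length substrings were frequent.
                seen = {p for p in seen
                        if p[:-1] in frequent_prev and p[1:] in frequent_prev}
            counts.update(seen)
        frequent_k = {p: c for p, c in counts.items() if c >= min_support}
        if not frequent_k:
            break  # no frequent patterns of length k, so none of length k+1
        result.update(frequent_k)
        frequent_prev = frequent_k
    return result

# Made-up toy sequences in one-letter amino-acid codes (not real data).
toy_sequences = ["MKVLA", "MKVLG", "AMKVT"]
patterns = frequent_patterns(toy_sequences, min_support=3, max_len=3)
```

Here `min_support` trades accuracy against time and space: a lower threshold keeps more candidate patterns alive at each length, which is the kind of cost the abstract lists as a limitation of existing algorithms.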
Keywords :
DNA; biology computing; data mining; diseases; feature extraction; genetics; molecular biophysics; proteins; DNA sequence; amino acids; association rule mining algorithm; biological datasets; biological science; biological sequences; data mining; frequent pattern generation; gene expression sequence; organic compound; pattern extraction; protein sequence; viral disease; Algorithm design and analysis; Amino acids; Association rules; Itemsets; Proteins; Association Rules and Bioinformatics; Clustering techniques; Data Mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Advanced Computing (ICoAC), 2011 Third International Conference on
Conference_Location :
Chennai
Print_ISBN :
978-1-4673-0670-6
Type :
conf
DOI :
10.1109/ICoAC.2011.6165177
Filename :
6165177