Title :
Mining Positional Association Super-Rules on Fixed-Size Protein Sequence Motifs
Author :
Chen, Bernard ; Kockara, Sinan
Author_Institution :
Comput. Sci. Dept., Univ. of Central Arkansas, Conway, AR, USA
Abstract :
Protein sequence motif information is crucial to the analysis of biologically significant regions, because conserved regions have the potential to determine the role of a protein. Many motif discovery algorithms require a predefined, fixed window size. Because of the fixed size, these approaches often deliver a number of similar motifs that are simply shifted by a few positions or that differ by mismatches. To address the shifted-motif problem, we incorporate the Super-Rule-Tree (SRT) concept, which was designed to solve the mismatched-motif problem, and propose a new Positional Association Rules algorithm. The algorithm introduces a new parameter, distance assurance, which is used to search for frequent distances appearing in association rules. By analyzing the motifs that our approach generates on our dataset, we identify the optimal minimum support, confidence, and distance assurance. We believe the Positional Association Super-Rules algorithm can play an important role in similar research that requires a predefined fixed window size.
Keywords :
bioinformatics; data mining; molecular biophysics; proteins; distance assurance; fixed-size protein sequence motifs; positional association rules algorithm; super-rule-tree concept; Association rules; Bioinformatics; Biomedical engineering; Clustering algorithms; Computer science; DNA; Data mining; Itemsets; Protein sequence; Sequences; Positional Association Rules; Super-rules; protein sequence motif;
Conference_Title :
Bioinformatics and BioEngineering, 2009. BIBE '09. Ninth IEEE International Conference on
Conference_Location :
Taichung
Print_ISBN :
978-0-7695-3656-9
DOI :
10.1109/BIBE.2009.11