DocumentCode
2738096
Title
InfoBarcoding: Selection of non-contiguous sites in molecular biomarker
Author
Chiu, David K Y ; Xu, Peter S C
Author_Institution
Dept. Comput. Sci., Univ. of Guelph, Guelph, ON, Canada
fYear
2011
fDate
3-5 Feb. 2011
Firstpage
69
Lastpage
74
Abstract
DNA barcoding has recently emerged as a method for fast taxonomic classification of species using molecular biomarkers. Unlike traditional classification schemes, DNA barcoding often involves a small number of samples in each class, which is likely to lead to overfitting. To evaluate the efficacy of a biomarker based on a given meaningful multiple sequence alignment, we use a metric-based information measure that identifies converging interdependence on statistically significant sites. Experiments show that, for the identified sites, when the convergent information between sites in the biomarker is small, its classification information is also small, whereas when the convergent information is high, the class information is also high. The correlation between these two types of pattern indicates the importance of selecting informative sites for the biomarker to be effective as an identification barcode.
Keywords
DNA; bar codes; biology computing; classification; molecular biophysics; molecular configurations; DNA barcoding; InfoBarcoding; fast taxonomic classification; metric-based information measure; molecular biomarker; molecular biomarkers; multiple sequence alignment; noncontiguous site selection; Correlation; DNA; Molecular biomarkers; Statistical analysis; DNA barcode; biomarker refinement; convergent information; multiple sequence analysis;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Advances in Bio and Medical Sciences (ICCABS), 2011 IEEE 1st International Conference on
Conference_Location
Orlando, FL
Print_ISBN
978-1-61284-851-8
Type
conf
DOI
10.1109/ICCABS.2011.5729944
Filename
5729944