DocumentCode :
2371316
Title :
Applying noise handling techniques to genomic data: a case study
Author :
Teng, Choh Man
Author_Institution :
Inst. for Human & Machine Cognition, Pensacola, FL, USA
fYear :
2003
fDate :
19-22 Nov. 2003
Firstpage :
743
Lastpage :
746
Abstract :
Osteogenesis Imperfecta (OI) is a genetic collagenous disease associated with mutations in one or both of the genes COL1A1 and COL1A2. There are at least four known phenotypes of OI, of which type II is the most severe and often lethal. We identified three approaches to noise handling, namely, robust algorithms, filtering, and polishing, and evaluated their effectiveness when applied to the problem of classifying the disease OI based on a data set of amino acid sequences and associated information of point mutations of COL1A1. Preliminary results suggest that each noise handling mechanism is useful under different circumstances. Filtering is stable across all cases. Pruning with robust C4.5 increased the classification accuracy in some cases, and polishing gave rise to some additional improvement in classifying the lethal OI phenotype.
Keywords :
data mining; diseases; genetics; information filters; medical computing; noise; proteins; Osteogenesis Imperfecta phenotype; amino acid sequence; filtering; genetic collagenous disease; genomic data; noise handling technique; robust algorithm; Amino acids; Bioinformatics; Bone diseases; Computer aided software engineering; Genetic mutations; Genomics; Humans; Information filtering; Information filters; Noise robustness
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
Print_ISBN :
0-7695-1978-4
Type :
conf
DOI :
10.1109/ICDM.2003.1251022
Filename :
1251022