DocumentCode :
2528970
Title :
Maintaining imbalance highly dependent medical data using dirichlet process data generation
Author :
Antaresti, Tieta ; Fanany, Mohamad Ivan ; Arymurthy, Aniati Murni
Author_Institution :
Fac. of Comput. Sci., Pattern Recognition, Image Process. & Content-Based Image Retrieval Lab., Univ. Indonesia, Depok, Indonesia
fYear :
2011
fDate :
26-28 Sept. 2011
Firstpage :
18
Lastpage :
22
Abstract :
Imbalance between the number of samples in one class and another is an important issue in classification problems. One well-known data balancing technique is artificial oversampling, which increases the size of the dataset. In this research, multinomial classification was applied to features recorded from a single ECG (electrocardiograph) sensor. A Dirichlet process, a Dirichlet distribution over the cumulative distribution function of each data partition, was therefore needed to model the distribution of the newly generated data while also taking into account the statistical properties of the existing data. After data balancing, the classification accuracy (CA) was 77.21% and the area under the ROC curve (AUC) was 90.9%.
Keywords :
electrocardiography; feature extraction; medical administrative data processing; medical computing; pattern classification; statistical distributions; Dirichlet process data generation; artificial oversampling; classification problem; data balancing technique; highly dependent medical data maintenance; multinomial classification; single ECG sensor; Accuracy; Bayesian methods; Data models; Diseases; Electrocardiography; Machine learning; Sleep apnea;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Information Management (ICDIM), 2011 Sixth International Conference on
Conference_Location :
Melbourne, VIC
ISSN :
Pending
Print_ISBN :
978-1-4577-1538-9
Type :
conf
DOI :
10.1109/ICDIM.2011.6093359
Filename :
6093359