DocumentCode
2331142
Title
A novel framework to elucidate core classes in a dataset
Author
Soria, Daniele ; Garibaldi, Jonathan M.
Author_Institution
Sch. of Comput. Sci., Univ. of Nottingham, Nottingham, UK
fYear
2010
fDate
18-23 July 2010
Firstpage
1
Lastpage
8
Abstract
In this paper we present an original framework to extract representative groups from a dataset, and we validate it over a novel case study. The framework specifies the application of different clustering algorithms; several statistical and visualisation techniques are then used to characterise the results, and core classes are defined by consensus clustering. Classes may be verified using supervised classification algorithms to obtain a set of rules which may be useful for new data points in the future. This framework is validated over a novel set of histone markers for breast cancer patients. From a technical perspective, the resultant classes are well separated and characterised by low, medium and high levels of biological markers. Clinically, the groups appear to distinguish patients with poor overall survival from those with low grading score and better survival. Overall, this framework offers a promising methodology for elucidating core consensus groups from data.
Keywords
cancer; data handling; data visualisation; medical diagnostic computing; pattern classification; pattern clustering; statistical analysis; biological marker; breast cancer patient; clustering algorithm; consensus clustering; core class elucidation; grading score; histone marker; representative group; statistical technique; supervised classification algorithm; visualisation technique; Algorithm design and analysis; Breast cancer; Clustering algorithms; Educational institutions; Indexes; Partitioning algorithms; Robustness
fLanguage
English
Publisher
IEEE
Conference_Titel
Evolutionary Computation (CEC), 2010 IEEE Congress on
Conference_Location
Barcelona
Print_ISBN
978-1-4244-6909-3
Type
conf
DOI
10.1109/CEC.2010.5586331
Filename
5586331