DocumentCode :
659277
Title :
A framework for medical text mining using a feature weighted clustering algorithm
Author :
Chakrabarty, Ankush ; Roy, Sandip
Author_Institution :
Dept. of MCA, Future Inst. of Eng. & Manage., Kolkata, India
fYear :
2013
fDate :
13-14 Sept. 2013
Firstpage :
135
Lastpage :
139
Abstract :
Text categorization is the task of deciding whether a document belongs to any of a set of prespecified classes of documents. Categorizing documents is challenging because the number of discriminating words can be huge, and many existing text classification algorithms do not work with so many words. Traditional text classification algorithms use all training samples for classification, which increases storage requirements and computational complexity as the number of features increases. Mining medical records for relationships between living factors and the symptoms of a disease is an important task; however, there has been relatively little research in this area. The proposed work develops a text classification algorithm in which all cluster centers are taken as training samples, thereby reducing the sample size, and introduces a weight factor to indicate the relative importance of each training sample. A similarity measure function is then used to classify a new patient document. Experiments on real-life data show that the proposed algorithm outperforms state-of-the-art classification algorithms such as k-nearest neighbor.
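The idea summarized in the abstract (classify against weighted cluster centers rather than against every training sample) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define the clustering step or the weight factor, so the sketch assumes one cluster per class label, takes each cluster's share of the training documents as its weight, and scores a new document by weight times cosine similarity. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def term_vector(text):
    """Bag-of-words term counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors (dict-like)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_centers(labeled_docs):
    """Collapse the training set into one (center, weight) pair per label.

    Assumption: one cluster per label, and weight = fraction of training
    documents that fall in that cluster. The paper may define both differently.
    """
    groups = defaultdict(list)
    for text, label in labeled_docs:
        groups[label].append(term_vector(text))
    total = sum(len(vectors) for vectors in groups.values())
    centers = {}
    for label, vectors in groups.items():
        summed = Counter()
        for vec in vectors:
            summed.update(vec)
        # The center is the mean term count over the cluster's documents.
        mean = {term: count / len(vectors) for term, count in summed.items()}
        centers[label] = (mean, len(vectors) / total)
    return centers


def classify(text, centers):
    """Return the label whose weighted cosine similarity to the text is highest."""
    query = term_vector(text)
    return max(
        centers,
        key=lambda label: centers[label][1] * cosine(query, centers[label][0]),
    )


# Hypothetical toy records, for illustration only.
training = [
    ("jaundice fatigue alcohol", "liver"),
    ("alcohol jaundice abdominal swelling", "liver"),
    ("cough fever wheezing", "respiratory"),
    ("fever cough chest congestion", "respiratory"),
]
centers = build_centers(training)
label = classify("patient reports jaundice and alcohol use", centers)
```

Only one center per cluster is stored and compared at classification time, which is where the reduction in storage and computation over k-nearest neighbor would come from. A size-based weight favors larger clusters, so a different weighting may suit imbalanced data better.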
Keywords :
data mining; electronic health records; text analysis; calculation complexity; documents categorization; feature weighted clustering algorithm; medical text mining; storage requirements; text categorization; text classification algorithms; Biomedical imaging; Classification algorithms; Clustering algorithms; Diseases; Liver; Text categorization; Training; Text categorization; clustering; cosine based similarity; liver disease; minimum spanning tree; weight factor;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Emerging Trends and Applications in Computer Science (ICETACS), 2013 1st International Conference on
Conference_Location :
Shillong
Print_ISBN :
978-1-4673-5249-9
Type :
conf
DOI :
10.1109/ICETACS.2013.6691410
Filename :
6691410