DocumentCode
659277
Title
A framework for medical text mining using a feature weighted clustering algorithm
Author
Chakrabarty, Ankush ; Roy, Sandip
Author_Institution
Dept. of MCA, Future Inst. of Eng. & Manage., Kolkata, India
fYear
2013
fDate
13-14 Sept. 2013
Firstpage
135
Lastpage
139
Abstract
Text categorization is the task of deciding whether a document belongs to one of a set of prespecified document classes. Categorizing documents is challenging because the number of discriminating words can be huge, and many existing text classification algorithms do not work with so many words. Traditional text classification algorithms use all training samples for classification, which increases storage requirements and computational complexity as the number of features increases. Mining medical records for relationships between living factors and the symptoms of a disease is an important task; however, there has been relatively little research in this area. The proposed work develops a text classification algorithm in which all cluster centers are taken as training samples, thereby reducing the sample size, and introduces a weight factor to indicate the different importance of each training sample. A similarity measure function is then used to classify a new patient document. Experiments on real-life data show that the proposed algorithm outperforms state-of-the-art classification algorithms such as k-nearest neighbor.
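The idea summarized in the abstract (cluster centers as training samples, a per-sample weight, and a similarity measure for classification) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the weight formula or the clustering procedure, so the sketch assumes clusters are already available, assumes each weight is the cluster's share of all training documents, and uses cosine similarity (named in the keywords). The function names `cosine`, `centroid`, `build_samples`, and `classify`, and the toy documents, are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency dicts."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(docs):
    """Mean term-frequency vector of a list of documents."""
    total = Counter()
    for doc in docs:
        total.update(doc)  # adds term counts
    return {t: v / len(docs) for t, v in total.items()}


def build_samples(clusters):
    """Reduce each (label, docs) cluster to (label, center, weight).

    ASSUMPTION: weight = cluster size / total documents. The abstract
    only says a weight factor exists; it does not define it.
    """
    n_total = sum(len(docs) for _, docs in clusters)
    return [(label, centroid(docs), len(docs) / n_total)
            for label, docs in clusters]


def classify(doc, samples):
    """Return the label with the highest weighted similarity score."""
    scores = Counter()
    for label, center, weight in samples:
        scores[label] += weight * cosine(doc, center)
    return max(scores, key=scores.get)


# Toy data (hypothetical): documents as term-frequency dicts.
clusters = [
    ("liver", [{"jaundice": 2, "bilirubin": 1},
               {"jaundice": 1, "alcohol": 1}]),
    ("healthy", [{"exercise": 1, "diet": 1}]),
]
samples = build_samples(clusters)
predicted = classify({"jaundice": 1, "bilirubin": 1}, samples)
```

Only one center per cluster is stored and compared at classification time, which is the source of the storage and computation savings the abstract describes relative to comparing against every training document.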
Keywords
data mining; electronic health records; text analysis; calculation complexity; documents categorization; feature weighted clustering algorithm; medical text mining; storage requirements; text categorization; text classification algorithms; Biomedical imaging; Classification algorithms; Clustering algorithms; Diseases; Liver; Text categorization; Training; Text categorization; clustering; cosine based similarity; liver disease; minimum spanning tree; weight factor;
fLanguage
English
Publisher
IEEE
Conference_Titel
Emerging Trends and Applications in Computer Science (ICETACS), 2013 1st International Conference on
Conference_Location
Shillong
Print_ISBN
978-1-4673-5249-9
Type
conf
DOI
10.1109/ICETACS.2013.6691410
Filename
6691410