• DocumentCode
    659277
  • Title

    A framework for medical text mining using a feature weighted clustering algorithm

  • Author

    Chakrabarty, Ankush ; Roy, Sandip

  • Author_Institution
    Dept. of MCA, Future Inst. of Eng. & Manage., Kolkata, India
  • fYear
    2013
  • fDate
    13-14 Sept. 2013
  • Firstpage
    135
  • Lastpage
    139
  • Abstract
    Text categorization is the task of deciding whether a document belongs to any of a set of pre-specified classes of documents. Categorization of documents is challenging, as the number of discriminating words can be huge. Many existing text classification algorithms simply do not work with so many words. Traditional text classification algorithms use all training samples for classification, which increases the storage requirements and calculation complexity as the number of features increases. Mining medical records for relationships between living factors and the symptoms of a disease is an important task; however, there has been relatively little research in this area. The proposed work develops a text classification algorithm in which all cluster centers are taken as training samples, thereby reducing the sample size, and introduces a weight factor to indicate the relative importance of each training sample. A similarity measure function is then used to classify a new patient document. Experiments on real-life data show that the proposed algorithm outperforms state-of-the-art classification algorithms such as k-nearest neighbor.
  • Keywords
    data mining; electronic health records; text analysis; calculation complexity; documents categorization; feature weighted clustering algorithm; medical text mining; storage requirements; text categorization; text classification algorithms; Biomedical imaging; Classification algorithms; Clustering algorithms; Diseases; Liver; Text categorization; Training; Text categorization; clustering; cosine based similarity; liver disease; minimum spanning tree; weight factor;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Emerging Trends and Applications in Computer Science (ICETACS), 2013 1st International Conference on
  • Conference_Location
    Shillong
  • Print_ISBN
    978-1-4673-5249-9
  • Type

    conf

  • DOI
    10.1109/ICETACS.2013.6691410
  • Filename
    6691410