• DocumentCode
    3154786
  • Title
    Different similarity measures in semi-supervised text classification
  • Author
    Wajeed, Mohammed Abdul ; Adilakshmi, T.

  • Author_Institution
    Sreenidhi Inst. of Sci. & Technol., Hyderabad, India
  • fYear
    2011
  • fDate
    16-18 Dec. 2011
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Information is valuable, and to use existing information we need to store it so that it can be retrieved easily when needed. Classifying the available information therefore becomes essential. In addition to the established supervised and unsupervised classification paradigms, the paper explores the semi-supervised learning paradigm. Semi-supervised learning lies halfway between supervised and unsupervised learning: in addition to unlabeled data, the algorithm is given some supervision information, but not necessarily for every example. The paper explores semi-supervised text classification applied to different types of vectors generated from the text documents. The KNN algorithm is employed for semi-supervised text classification, and the results obtained are encouraging.
  • Keywords
    learning (artificial intelligence); text analysis; KNN algorithm; semisupervised learning; semisupervised text classification; similarity measure; unsupervised learning; Classification algorithms; Support vector machine classification; Text categorization; Time frequency analysis; Training; Training data; Vectors; confusion matrix; semi-supervised learning; similarity measures; text classification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    India Conference (INDICON), 2011 Annual IEEE
  • Conference_Location
    Hyderabad
  • Print_ISBN
    978-1-4577-1110-7
  • Type
    conf
  • DOI
    10.1109/INDCON.2011.6139401
  • Filename
    6139401