DocumentCode
3154786
Title
Different similarity measures in semi-supervised text classification
Author
Wajeed, Mohammed Abdul ; Adilakshmi, T.
Author_Institution
Sreenidhi Inst. of Sci. & Technol., Hyderabad, India
fYear
2011
fDate
16-18 Dec. 2011
Firstpage
1
Lastpage
5
Abstract
Information is valuable only if it is stored in a way that allows it to be retrieved easily when needed, so classifying the available information becomes necessary. In addition to the established supervised and unsupervised classification paradigms, this paper explores the semi-supervised learning paradigm. Semi-supervised learning lies halfway between supervised and unsupervised learning: in addition to unlabeled data, the algorithm is given some supervision information, but not necessarily for every example. The paper applies semi-supervised text classification to different types of vectors generated from the text documents. The KNN algorithm is used for the classification, and the results obtained are encouraging.
Keywords
learning (artificial intelligence); text analysis; KNN algorithm; semisupervised learning; semisupervised text classification; similarity measure; unsupervised learning; Classification algorithms; Support vector machine classification; Text categorization; Time frequency analysis; Training; Training data; Vectors; confusion matrix; semi-supervised learning; similarity measures; text classification;
fLanguage
English
Publisher
IEEE
Conference_Titel
India Conference (INDICON), 2011 Annual IEEE
Conference_Location
Hyderabad
Print_ISBN
978-1-4577-1110-7
Type
conf
DOI
10.1109/INDCON.2011.6139401
Filename
6139401