DocumentCode :
3725738
Title :
Classification of children stories in Hindi using keywords and POS density
Author :
D M Harikrishna;K. Sreenivasa Rao
Author_Institution :
Indian Institute of Technology Kharagpur, India
fYear :
2015
Firstpage :
1
Lastpage :
5
Abstract :
The main objective of this work is to classify Hindi stories into three genres: fable, folk-tale and legend. We propose a framework for story classification using keyword-based and part-of-speech (POS) based features. The keyword-based features used are Term Frequency (TF) and Term Frequency Inverse Document Frequency (TFIDF). The effect of POS tags such as noun, pronoun and adjective is analyzed for the different story genres. Classification performance is analyzed using different combinations of features with three classifiers: Naive Bayes (NB), k-Nearest Neighbour (KNN) and Support Vector Machine (SVM). The experimental studies show that combining linguistic and keyword-based features does not significantly improve classifier performance. Among the classifiers, SVM models outperformed the other models.
Keywords :
"Support vector machines","Niobium","Pragmatics","Conferences","Computers","Text categorization","Tagging"
Publisher :
IEEE
Conference_Titel :
Computer, Communication and Control (IC4), 2015 International Conference on
Type :
conf
DOI :
10.1109/IC4.2015.7375666
Filename :
7375666