Title :
Children story classification based on structure of the story
Author :
Harikrishna D M;K. Sreenivasa Rao
Author_Institution :
School of Information Technology, Indian Institute of Technology, Kharagpur, India
Abstract :
This work classifies Hindi and Telugu stories into three genres (Fable, Folk-tale and Legend) based on their structure. Each story is divided into three parts: (i) introduction, (ii) main and (iii) climax, in order to explore how genre information is embedded in different parts of the story. We propose a framework for story classification using keyword-based and Part-of-Speech (POS) based features. Keyword-based features such as Term Frequency (TF) and Term Frequency-Inverse Document Frequency (TF-IDF) are used. Classification performance is analyzed for the different story parts using various combinations of features with three classifiers: (i) Naive Bayes (NB), (ii) k-Nearest Neighbour (KNN) and (iii) Support Vector Machine (SVM). The experimental studies show that combining linguistic (POS) and keyword-based features does not significantly improve classification performance. Among the classifiers, SVM performs best. The main part of the story yields the highest classification accuracy, compared with the introduction and climax parts.
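The keyword-based part of the framework can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a story is split into equal thirds by token count (the abstract does not say how the part boundaries are chosen), uses the common idf = log(N / df) weighting, and substitutes a 1-nearest-neighbour rule with cosine similarity for the KNN classifier. All function names and the toy English tokens are hypothetical.

```python
import math
from collections import Counter


def split_story(tokens):
    """Split a token list into introduction, main and climax parts.

    Assumption: equal thirds by token count; the abstract does not
    state how the boundaries are actually chosen.
    """
    n = len(tokens)
    a, b = n // 3, 2 * n // 3
    return {"introduction": tokens[:a], "main": tokens[a:b], "climax": tokens[b:]}


def term_frequency(tokens):
    """Relative frequency (TF) of each term in one document."""
    total = len(tokens)
    if total == 0:
        return {}
    return {term: count / total for term, count in Counter(tokens).items()}


def tfidf_vectors(docs):
    """TF-IDF vectors for a list of token lists, with idf = log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: tf * math.log(n_docs / df[term])
         for term, tf in term_frequency(doc).items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_label(query_vec, labelled_vecs):
    """1-nearest-neighbour: label of the most similar training vector."""
    return max(labelled_vecs, key=lambda pair: cosine(query_vec, pair[0]))[1]


# Toy usage on hypothetical data: classify one story part by its TF vector.
train = [
    (term_frequency(["fox", "crow", "cheese"]), "Fable"),
    (term_frequency(["king", "sword", "battle"]), "Legend"),
]
predicted = nearest_label(term_frequency(["fox", "cheese"]), train)
```

In this sketch each story part would be vectorised and classified separately, which mirrors the per-part comparison described in the abstract. POS-based features and the NB and SVM classifiers are left out.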
Keywords :
"Support vector machines","Niobium","Accuracy","Handheld computers","Text categorization","Informatics","Machine learning algorithms"
Conference_Title :
Advances in Computing, Communications and Informatics (ICACCI), 2015 International Conference on
Print_ISBN :
978-1-4799-8790-0
DOI :
10.1109/ICACCI.2015.7275822