Title :
A computational framework for Tamil document classification using Random Kitchen Sink
Author :
Sanjanasri J.P; Anand Kumar M
Author_Institution :
Center for Excellence in Computational Engineering and Networking, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, India
Abstract :
With the rapid growth of the World Wide Web, the availability and accessibility of regional-language content such as e-books, web pages, e-mails, and digital repositories have grown exponentially. As a result, automatic document classification has become a focal point for retrieving information from among millions of web documents. Text classification forms the basis for many NLP applications such as information extraction, query response, and information summarization. The main objective of this paper is to develop a computational framework for the supervised Tamil document classification task. The paper highlights the performance of Random Kitchen Sink, a randomization algorithm, in Grand Unified Regularized Least Squares (GURLS), a machine learning library; it is shown to achieve comparably better accuracy than the conventional kernel-based classifier. Hence, we claim that Random Kitchen Sink can be an effective alternative to kernels for a classifier.
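To illustrate the idea of replacing an explicit kernel with randomized features, the following is a minimal sketch, not the authors' implementation. It assumes that Random Kitchen Sink refers to random Fourier features approximating an RBF kernel, followed by a closed-form regularized least squares fit. It uses plain NumPy rather than GURLS, and synthetic Gaussian data rather than Tamil document vectors. All names and values (rks_features, fit_rls, gamma, lam, the feature count D) are hypothetical choices made for illustration.

```python
import numpy as np

def rks_features(X, W, b):
    """Map rows of X to random Fourier features: sqrt(2/D) * cos(XW + b)."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def fit_rls(Z, Y, lam):
    """Closed-form regularized least squares: solve (Z^T Z + lam*I) A = Z^T Y."""
    D = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(D), Z.T @ Y)

rng = np.random.default_rng(0)

# Assumption: toy stand-in for document feature vectors (two Gaussian blobs).
n, d, D = 200, 5, 300          # samples per class, input dim, random features
gamma, lam = 0.1, 1e-2         # hypothetical RBF width and regularization
X = np.vstack([rng.normal(-1.0, 1.0, (n, d)), rng.normal(1.0, 1.0, (n, d))])
labels = np.repeat([0, 1], n)
Y = np.eye(2)[labels] * 2.0 - 1.0   # one-vs-all targets in {-1, +1}

# For the kernel exp(-gamma * ||x - y||^2), frequencies are drawn from
# N(0, 2*gamma*I) and phases uniformly from [0, 2*pi).
W = rng.normal(0.0, np.sqrt(2.0 * gamma), (d, D))
b = rng.uniform(0.0, 2.0 * np.pi, D)

Z = rks_features(X, W, b)            # explicit randomized feature map
A = fit_rls(Z, Y, lam)               # linear model on the random features
pred = np.argmax(Z @ A, axis=1)
train_accuracy = float(np.mean(pred == labels))

# Compare the exact RBF kernel with its random-feature approximation
# on the first five samples.
diff = X[:5, None, :] - X[None, :5, :]
K_exact = np.exp(-gamma * np.sum(diff ** 2, axis=-1))
K_approx = Z[:5] @ Z[:5].T
max_kernel_error = float(np.max(np.abs(K_exact - K_approx)))

print("training accuracy:", train_accuracy)
print("max kernel approximation error:", max_kernel_error)
```

The design point the sketch tries to show is that the classifier only solves a D-by-D linear system on explicit features, instead of building and inverting an n-by-n kernel matrix, so the cost grows with the number of random features rather than with the number of documents.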
Keywords :
"Kernel","Support vector machines","Machine learning algorithms","Frequency measurement","Natural language processing","Training","Vocabulary"
Conference_Title :
Advances in Computing, Communications and Informatics (ICACCI), 2015 International Conference on
Print_ISBN :
978-1-4799-8790-0
DOI :
10.1109/ICACCI.2015.7275837