DocumentCode :
3659681
Title :
A computational framework for Tamil document classification using Random Kitchen Sink
Author :
Sanjanasri J.P; Anand Kumar M
Author_Institution :
Center for Excellence in Computational Engineering and Networking, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, India
fYear :
2015
Firstpage :
1571
Lastpage :
1577
Abstract :
With the rapid growth of the World Wide Web, the availability and accessibility of regional-language content such as e-books, web pages, e-mails, and digital repositories have grown exponentially. As a result, automatic document classification has become a hotspot for retrieving information from among millions of web documents. Text classification forms the baseline for many NLP applications such as information extraction, query response, and information summarization. The main objective of this paper is to develop a computational framework for the supervised Tamil document classification task. The paper highlights the performance of Random Kitchen Sink, a randomization algorithm, in Grand Unified Regularized Least Squares (GURLS), a machine learning library, which is shown to be comparatively better than the conventional kernel-based classifier in terms of accuracy. Hence, we claim that Random Kitchen Sink can be an effective alternative to kernels for a classifier.
Keywords :
"Kernel","Support vector machines","Machine learning algorithms","Frequency measurement","Natural language processing","Training","Vocabulary"
Publisher :
IEEE
Conference_Titel :
Advances in Computing, Communications and Informatics (ICACCI), 2015 International Conference on
Print_ISBN :
978-1-4799-8790-0
Type :
conf
DOI :
10.1109/ICACCI.2015.7275837
Filename :
7275837