Regional Information Center for Science and Technology - Evaluation of classification models for language processing

DocumentCode :

3659884

Title :

Evaluation of classification models for language processing

Author :

Zeynep Hilal Kilimci;Murat Can Ganiz

Author_Institution :

Computer Engineering Department, Dogus University, Istanbul, Turkey

Year :

2015

Firstpage :

Lastpage :

Abstract :

Naïve Bayes is a commonly used algorithm in text categorization because of its easy implementation and low complexity. Naïve Bayes has two main event models for text categorization: the multivariate Bernoulli model and the multinomial model. A very large number of studies choose the multinomial model with Laplace smoothing solely on the assumption that it performs better than the multivariate Bernoulli model under almost any conditions. This study aims to shed some light on this widely adopted assumption by analyzing Naïve Bayes event models and smoothing methods from a different perspective. To clarify the difference between the event models of Naïve Bayes, their classification performance is compared on datasets in two languages, English and Turkish. The results of our extensive experiments demonstrate that the superior performance of the multinomial model is not observed all the time. On the other hand, the multivariate Bernoulli model can perform well when combined with an appropriate smoothing method under different training data size conditions.
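
The two event models named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`train`, `score_multinomial`, `score_bernoulli`, `classify`), the toy documents, and the use of a single additive smoothing constant `alpha` (with `alpha = 1.0` corresponding to Laplace smoothing) are all assumptions made for illustration. The multinomial model scores each token occurrence in a document, while the multivariate Bernoulli model scores the presence or absence of every vocabulary term.

```python
import math
from collections import Counter

def train(docs, labels, alpha=1.0):
    """Collect the counts both event models need (illustrative helper).
    docs: list of token lists; labels: one class label per document.
    alpha: additive smoothing constant, assumed > 0 (1.0 = Laplace)."""
    classes = sorted(set(labels))
    term_counts = {c: Counter() for c in classes}  # token occurrences per class
    doc_counts = {c: Counter() for c in classes}   # documents containing a term
    for doc, c in zip(docs, labels):
        term_counts[c].update(doc)
        doc_counts[c].update(set(doc))
    return {
        "vocab": sorted({t for d in docs for t in d}),
        "classes": classes,
        "n_docs": Counter(labels),
        "total_docs": len(docs),
        "term_counts": term_counts,
        "doc_counts": doc_counts,
        "alpha": alpha,
    }

def log_prior(model, c):
    return math.log(model["n_docs"][c] / model["total_docs"])

def score_multinomial(model, doc, c):
    """Log-score under the multinomial model: one factor per token occurrence."""
    a, vocab = model["alpha"], set(model["vocab"])
    denom = sum(model["term_counts"][c].values()) + a * len(vocab)
    score = log_prior(model, c)
    for t in doc:
        if t in vocab:  # out-of-vocabulary tokens are ignored in this sketch
            score += math.log((model["term_counts"][c][t] + a) / denom)
    return score

def score_bernoulli(model, doc, c):
    """Log-score under the multivariate Bernoulli model: one factor per
    vocabulary term, for presence or for absence."""
    a, present = model["alpha"], set(doc)
    denom = model["n_docs"][c] + 2 * a
    score = log_prior(model, c)
    for t in model["vocab"]:
        p = (model["doc_counts"][c][t] + a) / denom
        score += math.log(p if t in present else 1.0 - p)
    return score

def classify(model, doc, score_fn):
    return max(model["classes"], key=lambda c: score_fn(model, doc, c))

# Toy data, invented for this example only.
docs = [["goal", "match", "goal"], ["match", "team"],
        ["vote", "election"], ["election", "vote", "party"]]
labels = ["sport", "sport", "politics", "politics"]
model = train(docs, labels, alpha=1.0)
print(classify(model, ["goal", "team"], score_multinomial))
print(classify(model, ["goal", "team"], score_bernoulli))
```

Changing `alpha`, or substituting a different smoothing formula inside the two scoring functions, is one way to explore the kind of event-model and smoothing comparison the abstract describes.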

Keywords :

"Smoothing methods","Niobium","Computational modeling","Text categorization","Vocabulary","Training","Accuracy"

Publisher :

IEEE

Conference_Title :

Innovations in Intelligent SysTems and Applications (INISTA), 2015 International Symposium on

Type :

conf

DOI :

10.1109/INISTA.2015.7276787

Filename :

7276787

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3659884