• DocumentCode
    2448409
  • Title
    An analysis of sentence level text classification for the Kannada language
  • Author
    Jayashree, R.; Srikanta, M.K.
  • Author_Institution
    Dept. of Comput. Sci., PES Inst. of Technol., Bangalore, India
  • fYear
    2011
  • fDate
    14-16 Oct. 2011
  • Firstpage
    147
  • Lastpage
    151
  • Abstract
    With the rapid growth of the Internet, a huge amount of data is available online, and drawing useful information from this digital data is challenging. Exploring and extracting information from native-language content available online is a particularly useful task. The work presented here focuses on sentence-level classification in the Kannada language. Two of the most popular approaches in text categorization, Naïve Bayesian and Bag of Words (BOW), are used in this work. The Bag of Words approach is found to perform significantly better than the Naïve Bayesian approach. The objective of the work is to find out how sentence-level classification works for the Kannada language, since it can be extended to sentiment classification, question answering, text summarization, and the analysis of customer reviews in Kannada blogs. Because most users' comments, queries, and opinions are expressed as sentences, sentence-level text classification becomes a special task within the text classification problem. Although the work presently focuses on very basic approaches, it can later be extended to other methods such as SVM and KNN.
  • Keywords
    Bayes methods; Internet; natural language processing; pattern classification; question answering (information retrieval); support vector machines; text analysis; word processing; Internet; KNN; Kannada Blogs; Kannada language; SVM; customer review; digital data; information extraction; question answering; sentence level text classification; sentiment classification; text summarization; Bayesian methods; Computer science; Natural language processing; Text categorization; Training; Training data; Bag of Words; Naïve Bayesian; Kannada text classification; sentence level classification; single label
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Soft Computing and Pattern Recognition (SoCPaR), 2011 International Conference of
  • Conference_Location
    Dalian
  • Print_ISBN
    978-1-4577-1195-4
  • Type
    conf
  • DOI
    10.1109/SoCPaR.2011.6089130
  • Filename
    6089130