• DocumentCode
    3226642
  • Title
    Automatic Tamil Content Generation
  • Author
    Kohilavani, S. ; Mala, T. ; Geetha, T.V.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Anna Univ., Chennai, India
  • fYear
    2009
  • fDate
    22-24 July 2009
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Automatic content generation aims at developing an intelligent tutoring system for the Tamil language. The system focuses on delivering personalized Tamil content matched to an individual user's needs, based on the user's learning abilities and interests. This paper deals with the automatic classification of Tamil documents and with the extraction of information from those documents to construct the knowledge base. Documents are repositories of knowledge, but because so many are available, searching them effectively is time-consuming. Document categorization makes document search simpler. A document's category can be determined using various techniques; in this paper, the naive Bayes (NB) algorithm is used to classify Tamil documents into one of a set of predefined categories. Experiments evaluating the naive Bayes categorizer show that it performs well, achieving 89.8% accuracy. Informative words or sentences are then extracted from the documents using heuristic rules to fill predefined templates. An individual user's interests are identified and recorded to create a user profile, which is specific to that user and subject to change over time. The topic categorizer categorizes the topic based on the user's query. The topic analyzer analyzes the user's profile and evaluates the user's knowledge using an intelligent evaluator system. Based on the user's knowledge, the intelligent evaluator system decides whether to suggest the location or to suggest a new topic retrieved from the knowledge base. Personalized content is then generated according to the user's knowledge level. The experimental results show that the content generator performs well, achieving 82.35% accuracy.
  • Keywords
    Bayes methods; knowledge acquisition; pattern classification; query processing; text analysis; user interfaces; word processing; Tamil documents classification; Tamil language; automatic content generation; intelligent tutoring system; naive Bayes algorithm; user profile; Computer science; Data mining; Educational institutions; Indexing; Information retrieval; Intelligent systems; Machine learning algorithms; Natural languages; Niobium; Text categorization; Document Categorization; Naïve Bayes; Stopwords; classifier; information extraction; preprocessing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Agent & Multi-Agent Systems, 2009. IAMA 2009. International Conference on
  • Conference_Location
    Chennai
  • Print_ISBN
    978-1-4244-4710-7
  • Type
    conf
  • DOI
    10.1109/IAMA.2009.5228064
  • Filename
    5228064