DocumentCode :
3226642
Title :
Automatic Tamil Content Generation
Author :
Kohilavani, S. ; Mala, T. ; Geetha, T.V.
Author_Institution :
Dept. of Comput. Sci. & Eng., Anna Univ., Chennai, India
fYear :
2009
fDate :
22-24 July 2009
Firstpage :
1
Lastpage :
6
Abstract :
Automatic content generation aims at developing an intelligent tutoring system for the Tamil language. The system focuses on delivering personalized Tamil content tailored to an individual user's needs, based on the user's learning abilities and interests. This paper deals with the automatic classification of Tamil documents and with information extraction from those documents to construct the knowledge base. Documents are repositories of knowledge, but because so many are available, searching them effectively is time-consuming. Document categorization makes document search simpler. A document's category can be determined using various techniques. In this paper, the naive Bayes (NB) algorithm is used to classify Tamil documents into one of a set of predefined categories. Experiments evaluating the naive Bayes categorizer show that it performs well, achieving 89.8% accuracy. Informative words or sentences are then extracted from the documents using heuristic rules to fill predefined templates. An individual user's interests are identified and recorded to create a user profile, which is specific to that user and subject to change over time. The topic categorizer categorizes the topic based on the user's query. The topic analyzer analyzes the user's profile and evaluates the user's knowledge using an intelligent evaluator system. Based on the user's knowledge, the intelligent evaluator system decides whether to suggest the location or to suggest a new topic retrieved from the knowledge base. Personalized content is then generated based on the user's knowledge level. The experimental results show that the content generator performs well, achieving 82.35% accuracy.
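As a rough illustration of the naive Bayes categorization step mentioned in the abstract, the following is a minimal sketch of a multinomial naive Bayes text classifier with add-one (Laplace) smoothing. It is not the authors' implementation: the function names `train_nb` and `classify_nb`, the toy English tokens, and the two categories are all assumptions made for illustration, and the abstract does not specify the smoothing, tokenization, or stopword handling actually used for Tamil text.

```python
import math
from collections import Counter, defaultdict


def train_nb(labeled_docs):
    """Collect counts from (tokens, category) pairs (hypothetical helper)."""
    doc_counts = Counter()              # number of training documents per category
    word_counts = defaultdict(Counter)  # token frequencies per category
    vocab = set()                       # all tokens seen during training
    for tokens, category in labeled_docs:
        doc_counts[category] += 1
        word_counts[category].update(tokens)
        vocab.update(tokens)
    return doc_counts, word_counts, vocab


def classify_nb(tokens, model):
    """Return the category with the highest log-posterior score."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        # Log prior: fraction of training documents in this category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for token in tokens:
            if token not in vocab:
                continue  # ignore tokens never seen in training
            # Add-one smoothing (an assumption, not stated in the abstract).
            count = word_counts[category][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Toy training data; real input would be preprocessed Tamil tokens.
training = [
    (["cricket", "match", "team"], "sports"),
    (["team", "goal", "match"], "sports"),
    (["election", "vote", "party"], "politics"),
    (["party", "minister", "vote"], "politics"),
]
model = train_nb(training)
print(classify_nb(["match", "team"], model))
print(classify_nb(["vote", "minister"], model))
```

Scores are summed in log space to avoid numeric underflow on long documents. A production system would more likely rely on an established library than on hand-rolled counting.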
Keywords :
Bayes methods; knowledge acquisition; pattern classification; query processing; text analysis; user interfaces; word processing; Tamil documents classification; Tamil language; automatic content generation; intelligent tutoring system; naive Bayes algorithm; user profile; Computer science; Data mining; Educational institutions; Indexing; Information retrieval; Intelligent systems; Machine learning algorithms; Natural languages; Niobium; Text categorization; Document Categorization; Naïve Bayes; Stopwords; classifier; information extraction; preprocessing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Agent & Multi-Agent Systems, 2009. IAMA 2009. International Conference on
Conference_Location :
Chennai
Print_ISBN :
978-1-4244-4710-7
Type :
conf
DOI :
10.1109/IAMA.2009.5228064
Filename :
5228064