DocumentCode :
1925395
Title :
Extracting knowledge using probabilistic classifier for text mining
Author :
Subbaiah, S.
Author_Institution :
Dept. of Comput. Applic., K.S.R. Coll. of Technol., Tiruchengode, India
fYear :
2013
fDate :
21-22 Feb. 2013
Firstpage :
440
Lastpage :
442
Abstract :
Text mining is the process of extracting knowledge from large text documents. This paper proposes a new probabilistic classifier for text mining. It uses the ODP taxonomy, a domain ontology, and datasets to cluster text documents and identify the category of a given document. The proposed work has three steps: preprocessing, rule generation, and probability calculation. In preprocessing, the input document is split into paragraphs and statements. In rule generation, the documents from the training set are read. In probability calculation, positive and negative weight factors are calculated. The proposed algorithm calculates a positive probability value and a negative probability value for each term set or pattern identified in the document. Based on the calculated probability values, the probabilistic classifier indexes the document to the relevant group of the cluster.
Keywords :
data mining; ontologies (artificial intelligence); pattern classification; probability; text analysis; ODP taxonomy; datasets; domain ontology; input document; knowledge extraction; positive weight factor; preprocessing; probabilistic classifier; probability calculation; rule generation; text documents; text mining; training set; Association rules; Databases; Probabilistic logic; Probability; Text mining; Training; Classification; Clustering; ODP Taxonomy; Probabilistic Classifier; Text Mining; categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition, Informatics and Mobile Engineering (PRIME), 2013 International Conference on
Conference_Location :
Salem
Print_ISBN :
978-1-4673-5843-9
Type :
conf
DOI :
10.1109/ICPRIME.2013.6496517
Filename :
6496517