DocumentCode
1925395
Title
Extracting knowledge using probabilistic classifier for text mining
Author
Subbaiah, S.
Author_Institution
Dept. of Comput. Applic., K.S.R. Coll. of Technol., Tiruchengode, India
fYear
2013
fDate
21-22 Feb. 2013
Firstpage
440
Lastpage
442
Abstract
Text mining is the process of extracting knowledge from large text documents. This paper proposes a new probabilistic classifier for text mining. It uses the ODP taxonomy, a domain ontology, and datasets to cluster a given text document and identify its category. The proposed work has three steps: preprocessing, rule generation, and probability calculation. In preprocessing, the input document is split into paragraphs and statements. In rule generation, the documents from the training set are read. In probability calculation, positive and negative weight factors are calculated. The proposed algorithm calculates a positive probability value and a negative probability value for each term set or pattern identified in the document. Based on the calculated probability values, the probabilistic classifier indexes the document to the relevant group of the cluster.
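The three steps named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the function names, the Laplace-smoothed estimates, the "positive minus negative" score, and the toy training set are hypothetical and are not taken from the paper. Plain category labels stand in for ODP taxonomy nodes, and single terms stand in for term sets or patterns.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy training set: (document text, category label).
TOY_TRAINING_SET = [
    ("The team won the match. The striker scored a goal.", "sports"),
    ("The league match ended. The coach praised the team.", "sports"),
    ("The compiler reported an error. The program crashed.", "computers"),
    ("The program uses a database. The server stores data.", "computers"),
]


def preprocess(text):
    """Step 1: split a document into paragraphs, then statements, then terms."""
    statements = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        for statement in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            terms = re.findall(r"[a-z]+", statement.lower())
            if terms:
                statements.append(terms)
    return statements


def generate_rules(training_set):
    """Step 2: read the training documents and count, per category,
    how many documents contain each term."""
    doc_counts = Counter()
    term_counts = defaultdict(Counter)
    for text, category in training_set:
        doc_counts[category] += 1
        terms = {t for statement in preprocess(text) for t in statement}
        for term in terms:
            term_counts[category][term] += 1
    return doc_counts, term_counts


def term_probabilities(term, category, doc_counts, term_counts):
    """Step 3: return (positive, negative) probabilities for one term.
    Assumed estimator: Laplace-smoothed document frequency inside the
    category (positive) and outside it (negative)."""
    in_cat = doc_counts[category]
    out_cat = sum(doc_counts.values()) - in_cat
    pos_hits = term_counts[category][term]
    neg_hits = sum(c[term] for cat, c in term_counts.items() if cat != category)
    positive = (pos_hits + 1) / (in_cat + 2)
    negative = (neg_hits + 1) / (out_cat + 2)
    return positive, negative


def classify(text, doc_counts, term_counts):
    """Assign the document to the category with the highest summed
    (positive - negative) score over its distinct terms (assumed rule)."""
    terms = {t for statement in preprocess(text) for t in statement}
    scores = {}
    for category in doc_counts:
        score = 0.0
        for term in terms:
            pos, neg = term_probabilities(term, category, doc_counts, term_counts)
            score += pos - neg
        scores[category] = score
    return max(scores, key=scores.get)
```

To try it, build the counts once with `generate_rules(TOY_TRAINING_SET)` and pass them to `classify` together with a new document. Terms that occur equally often inside and outside a category contribute nothing to its score, which is one simple way to let the negative probability cancel uninformative words.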
Keywords
data mining; ontologies (artificial intelligence); pattern classification; probability; text analysis; ODP taxonomy; datasets; domain ontology; input document; knowledge extraction; positive weight factor; preprocessing; probabilistic classifier; probability calculation; rule generation; text documents; text mining; training set; Association rules; Databases; Probabilistic logic; Probability; Text mining; Training; Classification; Clustering; ODP Taxonomy; Probabilistic Classifier; Text Mining; categorization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition, Informatics and Mobile Engineering (PRIME), 2013 International Conference on
Conference_Location
Salem
Print_ISBN
978-1-4673-5843-9
Type
conf
DOI
10.1109/ICPRIME.2013.6496517
Filename
6496517