Title :
Text Classification Algorithm Study Based on Rough Set Theory
Author :
Xun, Lin ; Zhishu, Li ; Yong, Zhou ; Yuan, Xue
Author_Institution :
Sch. of Comput., Si Chuan Univ. (SCU), Chengdu, China
Abstract :
Text classification is an important research area in Chinese information processing. Its goal is to analyze the content of a text and assign the text to one or more appropriate classes, improving the efficiency of applications such as text retrieval, storage, and processing. In this paper, the text dataset is transformed into an information system without a decision attribute, and the core idea of attribute reduction is applied to text classification. Experiments show that the method improves precision and recall; furthermore, it does not require any a priori information. The method has three steps: first, determination of the text vector; second, generation of the text set information system; third, discretization of attribute values.
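The pipeline named in the abstract (text vectors, an information system without a decision attribute, discretization, attribute reduction) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, whitespace tokenization, the fixed vocabulary, the binary threshold used for discretization, and the greedy removal order used to find a reduct. A reduct is taken in the standard rough-set sense for a table with no decision attribute: a subset of attributes that yields the same indiscernibility partition of the documents as the full attribute set.

```python
from collections import Counter


def text_vectors(docs, vocabulary):
    """Term-frequency vector of each document over a fixed vocabulary."""
    vectors = []
    for doc in docs:
        counts = Counter(doc.lower().split())  # assumed: whitespace tokens
        vectors.append([counts[term] for term in vocabulary])
    return vectors


def discretize(vectors, threshold=1):
    """Assumed scheme: 1 if the frequency reaches the threshold, else 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in vectors]


def partition(table, attrs):
    """Indiscernibility classes of the rows, restricted to attrs."""
    classes = {}
    for i, row in enumerate(table):
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, []).append(i)
    return sorted(classes.values())


def reduct(table):
    """Greedily drop attributes while the partition stays unchanged."""
    attrs = list(range(len(table[0])))
    full = partition(table, attrs)
    for a in list(attrs):
        trial = [b for b in attrs if b != a]
        if partition(table, trial) == full:
            attrs = trial
    return attrs


# Toy data, invented for this sketch.
docs = ["rough set theory", "rough set reduction", "text retrieval theory"]
vocabulary = ["rough", "set", "theory", "reduction", "text"]

# Rows are documents, columns are condition attributes (terms);
# there is no decision column.
table = discretize(text_vectors(docs, vocabulary))
kept = reduct(table)
print([vocabulary[a] for a in kept])
```

On this toy data the three documents remain pairwise distinguishable using only the terms the reduct keeps, so the remaining columns could be dropped before any later classification step. The greedy order finds one reduct, not necessarily the smallest one.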
Keywords :
classification; information retrieval; natural language processing; rough set theory; text analysis; Chinese information processing; attribute value discretization; text classification algorithm; text retrieval; text set information systems; text vector determination; Classification algorithms; Decision making; Information systems; Set theory; Support vector machine classification; Text categorization; Training; Rough Set; Text Classification; priori information; reduction
Conference_Title :
Information Technology and Applications (IFITA), 2010 International Forum on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-7621-3
Electronic_ISBN :
978-1-4244-7622-0
DOI :
10.1109/IFITA.2010.203