Title :
A Novel Knowledge Discovery Method for Chinese Architectural Document
Author :
Zhang, Xiang ; Li, Changhua ; Zhou, Mingquan ; Ye, Na ; Dong, Lehong
Author_Institution :
Coll. of Inf. Sci. & Technol., Northwest Univ., Xi'an, China
Abstract :
Traditional feature selection by threshold filtering discards much useful architectural information. To address this problem and improve the precision of Chinese architectural document classification, a new algorithm based on rough sets and C4.5Bagging is proposed for Chinese architectural document categorization. First, the core attributes are found with a discernibility matrix, and one of the core attributes is taken as the starting point. Then attribute significance and dependency are used as heuristic information for feature selection. Finally, a C4.5Bagging classifier is designed for architectural documents. The experimental results show that the method is not only easy to implement but also effectively reduces the dimensionality of the feature space and improves classification accuracy.
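The feature selection steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`partition`, `dependency`, `core`, `select_features`) and the representation of a decision table as tuples of discrete attribute values plus a list of class labels are assumptions made for illustration. Attribute significance is approximated here as the gain in dependency degree when an attribute is added, and the C4.5Bagging classification stage is not covered.

```python
from itertools import combinations


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = {}
    for i, row in enumerate(rows):
        blocks.setdefault(tuple(row[a] for a in attrs), []).append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Dependency degree: fraction of rows in the positive region of attrs."""
    if not rows:
        return 0.0
    pos = sum(len(block) for block in partition(rows, attrs)
              if len({labels[i] for i in block}) == 1)
    return pos / len(rows)


def core(rows, labels, n_attrs):
    """Core attributes: singleton entries of the discernibility matrix."""
    found = set()
    for i, j in combinations(range(len(rows)), 2):
        if labels[i] == labels[j]:
            continue  # only pairs with different classes need discerning
        diff = [a for a in range(n_attrs) if rows[i][a] != rows[j][a]]
        if len(diff) == 1:
            found.add(diff[0])
    return sorted(found)


def select_features(rows, labels):
    """Greedy selection: start from one core attribute (assumed choice: the
    first one), then repeatedly add the attribute that raises the dependency
    degree the most until it matches that of the full attribute set."""
    n_attrs = len(rows[0])
    full = dependency(rows, labels, list(range(n_attrs)))
    selected = core(rows, labels, n_attrs)[:1]
    while dependency(rows, labels, selected) < full:
        rest = [a for a in range(n_attrs) if a not in selected]
        best = max(rest, key=lambda a: dependency(rows, labels, selected + [a]))
        selected.append(best)
    return selected
```

Calling `select_features(rows, labels)` on a small discretized term-document table returns the indices of the retained attributes. The pairwise discernibility scan is quadratic in the number of documents, so this sketch is suitable only for small examples.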
Keywords :
data mining; document handling; natural language processing; rough set theory; C4.5Bagging; chinese architectural document; feature selection; novel knowledge discovery method; rough set; Classification algorithms; Control engineering; Educational institutions; Filtering algorithms; Frequency; Information filtering; Information filters; Information science; Rough sets; Text categorization;
Conference_Title :
Intelligent Systems and Applications (ISA), 2010 2nd International Workshop on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-5872-1
Electronic_ISBN :
978-1-4244-5874-5
DOI :
10.1109/IWISA.2010.5473263