DocumentCode :
1620365
Title :
Pruning fuzzy ARTMAP using the minimum description length principle in learning from clinical databases
Author :
Lin, Ten-Ho ; Soo, Von-Wun
Author_Institution :
Dept. of Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
fYear :
1997
Firstpage :
396
Lastpage :
403
Abstract :
The fuzzy ARTMAP is one of a family of neural network architectures based on ART (adaptive resonance theory) in which supervised learning can be carried out. However, it tends to create more categories than are actually needed. This often causes the so-called overfitting problem, where the performance of the fuzzy ARTMAP network on the test set does not increase monotonically with additional training epochs and category creation. To avoid the overfitting problem, Carpenter and Tan (1993) proposed a confidence-based pruning method that eliminates those categories that are either less useful or less accurate. This paper proposes an alternative pruning method based on the minimum description length (MDL) principle. The MDL principle can be viewed as a tradeoff between theory complexity and data prediction accuracy, given the theory. We adopted Cameron-Jones's (1992) error encoding scheme and Quinlan's (1994, 1995) modification for theory encoding to estimate the fuzzy ARTMAP theory description length. A greedy MDL search algorithm is proposed to prune the fuzzy ARTMAP categories one by one. Experiments showed that a fuzzy ARTMAP pruned with the MDL principle gave better performance, with far fewer categories, than the original fuzzy ARTMAP and other machine-learning systems on a number of benchmark clinical databases, such as the heart disease, breast cancer and diabetes databases.
Keywords :
ART neural nets; deductive databases; fuzzy neural nets; knowledge acquisition; learning (artificial intelligence); medical information systems; neural net architecture; performance evaluation; tree searching; adaptive resonance theory; breast cancer; category creation; category pruning; clinical databases; confidence-based pruning method; data mining; data prediction accuracy; diabetes; error encoding scheme; fuzzy ARTMAP; greedy search algorithm; heart disease; knowledge acquisition; minimum description length; neural network architecture; overfitting problem; performance; supervised learning; theory complexity; theory encoding; training epochs; Databases; Encoding; Fuzzy neural networks; Fuzzy sets; Fuzzy systems; Neural networks; Resonance; Subspace constraints; Supervised learning; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Tools with Artificial Intelligence, 1997. Proceedings., Ninth IEEE International Conference on
Conference_Location :
Newport Beach, CA
ISSN :
1082-3409
Print_ISBN :
0-8186-8203-5
Type :
conf
DOI :
10.1109/TAI.1997.632281
Filename :
632281