DocumentCode :
659586
Title :
Learning from multiple data sets with different missing attributes and privacy policies: Parallel distributed fuzzy genetics-based machine learning approach
Author :
Ishibuchi, Hisao ; Yamane, Michi ; Nojima, Yusuke
Author_Institution :
Dept. of Comput. Sci. & Intell. Syst., Osaka Prefecture Univ., Sakai, Japan
fYear :
2013
fDate :
6-9 Oct. 2013
Firstpage :
63
Lastpage :
70
Abstract :
This paper discusses parallel distributed genetics-based machine learning (GBML) of fuzzy rule-based classifiers from multiple data sets. We assume that each data set has a similar but different set of attributes. In other words, each data set has different missing attributes. Our task is to design a fuzzy rule-based classifier from those data sets. We first show that fuzzy rules can handle missing attributes easily. Next, we explain how parallel distributed fuzzy GBML can handle multiple data sets with different missing attributes. We then examine the accuracy of the fuzzy rule-based classifiers obtained under various settings of the available training data, such as a single data set with no missing attributes and multiple data sets with many missing attributes. Experimental results show that using multiple data sets often increases the accuracy of the obtained fuzzy rule-based classifiers, even when the data sets have missing attributes. We also discuss learning from a data set under a severe privacy-preserving policy in which only the error rate of each candidate classifier is available. No information about any individual pattern is assumed to be available, which means that neither the class label nor the attribute values of any pattern can be used. We explain how such a black-box data set can be utilized for classifier design.
Keywords :
data privacy; genetic algorithms; learning (artificial intelligence); parallel processing; black-box data set; classifier design; fuzzy rule-based classifiers; fuzzy rules; missing attributes; multiple data set learning; multiple data sets; parallel distributed fuzzy GBML; parallel distributed fuzzy genetics-based machine learning approach; parallel distributed genetics-based machine learning; privacy policies; privacy preserving policy; training data; Classification algorithms; Data privacy; Distributed databases; Fuzzy sets; Sociology; Statistics; Training data; Evolutionary algorithms; fuzzy rule-based classifiers; genetics-based machine learning; horizontally partitioned data sets; parallel distributed implementation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Big Data, 2013 IEEE International Conference on
Conference_Location :
Silicon Valley, CA
Type :
conf
DOI :
10.1109/BigData.2013.6691735
Filename :
6691735