Title :
Regression in the Presence Missing Data Using Ensemble Methods
Author :
Hassan, Mostafa M. ; Atiya, Amir F. ; El-Gayar, Neamat ; El-Fouly, Raafat
Author_Institution :
Dept. of Comput. Eng., Cairo Univ., Cairo
Abstract :
We consider the problem of missing data and develop ensemble-network models for handling it. The proposed method uses the inherent uncertainty of the missing records to generate diverse training sets for the ensemble's networks. Missing values are generated from their probability density, and this procedure is repeated many times, creating several complete data sets. A network is trained on each of these data sets, yielding an ensemble of networks. Several variants are proposed, including a univariate approach and a multivariate approach, which differ in the way the missing values are generated. Simulation results confirm the general superiority of the proposed methods over conventional approaches.
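The procedure described in the abstract can be illustrated with a minimal sketch of the univariate variant. It is not the authors' implementation: the abstract does not specify the density model or the network architecture, so the sketch assumes a per-feature Gaussian fitted to the observed values, and it substitutes a linear least-squares model for each neural network to stay short. The function names (`sample_univariate`, `fit_linear`, `train_ensemble`, `predict_ensemble`) are hypothetical.

```python
import numpy as np

def sample_univariate(X, rng):
    """Return a copy of X with each NaN replaced by a random draw.

    Assumption: each feature is modelled independently by a Gaussian
    fitted to that feature's observed values (the abstract does not
    state which density is used).
    """
    X_filled = X.copy()
    for j in range(X.shape[1]):
        missing = np.isnan(X[:, j])
        observed = X[~missing, j]
        mu, sigma = observed.mean(), observed.std()
        X_filled[missing, j] = rng.normal(mu, sigma, size=missing.sum())
    return X_filled

def fit_linear(X, y):
    """Least-squares fit with a bias term.

    Stand-in for training one network; the paper trains neural networks.
    """
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def train_ensemble(X, y, n_members=10, seed=0):
    """Train one model per independently completed copy of the data."""
    rng = np.random.default_rng(seed)
    return [fit_linear(sample_univariate(X, rng), y) for _ in range(n_members)]

def predict_ensemble(members, X_complete):
    """Average the member predictions on inputs without missing values."""
    A = np.column_stack([X_complete, np.ones(len(X_complete))])
    return np.mean([A @ w for w in members], axis=0)
```

Because every member sees a different random completion of the same records, the members differ from one another, which is the source of ensemble diversity the abstract refers to. A multivariate variant would instead draw each missing value conditionally on the observed features of the same record.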
Keywords :
data handling; neural nets; probability; regression analysis; ensemble-network model; missing data handling; missing records; probability density; regression; Information technology; Learning systems; Linear regression; Machine learning algorithms; Maximum likelihood estimation; Neural networks; Parameter estimation; Statistics; Training data; Uncertainty;
Conference_Title :
Neural Networks, 2007. IJCNN 2007. International Joint Conference on
Conference_Location :
Orlando, FL
Print_ISBN :
978-1-4244-1379-9
Electronic_ISBN :
1098-7576
DOI :
10.1109/IJCNN.2007.4371139