• DocumentCode
    2008507
  • Title
    Estimating Missing Data and Determining the Confidence of the Estimate Data
  • Author
    Mistry, Jaisheel ; Nelwamondo, Fulufhelo ; Marwala, Tshilidzi
  • Author_Institution
    Univ. of Witwatersrand, Johannesburg, South Africa
  • fYear
    2008
  • fDate
    11-13 Dec. 2008
  • Firstpage
    752
  • Lastpage
    755
  • Abstract
    A Computational Intelligence approach to estimate missing data makes use of Autoassociative Neural Networks (ANN) and a stochastic optimization technique. The ANN captures interrelationships within data and the optimization technique estimates probable values that are used as inputs to the ANN. The optimum estimate is one that has a minimum influence on the output of the ANN. A method to determine the confidence of this estimate is presented in this paper. An ensemble of ANNs with a Multi Layer Perceptron architecture is collected using Bayesian training methods. The percentage of the most dominant estimate values is used as a confidence measure. The South African antenatal seroprevalence survey data is used and the HIV status of the patients is estimated. It was found that the missing data could be estimated with an overall accuracy of 68% and the confidence ranges between 50% and 97%. Estimates that have a confidence exceeding 70% have 88% estimation accuracy.
  • Keywords
    data handling; multilayer perceptrons; optimisation; stochastic processes; Bayesian training methods; autoassociative neural networks; computational intelligence; missing data estimation; multilayer perceptron architecture; stochastic optimization technique; Artificial neural networks; Bayesian methods; Biological neural networks; Competitive intelligence; Computational intelligence; Computer networks; Databases; Intelligent networks; Neural networks; Testing; Bayesian Neural Networks; Confidence; Missing Data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications, 2008. ICMLA '08. Seventh International Conference on
  • Conference_Location
    San Diego, CA
  • Print_ISBN
    978-0-7695-3495-4
  • Type
    conf
  • DOI
    10.1109/ICMLA.2008.71
  • Filename
    4725060
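
The estimation and confidence steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several assumptions: the function names (`reconstruction_error`, `estimate_missing`, `ensemble_estimate`) are hypothetical, the four toy models are hand-written stand-ins for Bayesian-trained autoassociative MLPs, and an exhaustive search over a small set of candidate values replaces the stochastic optimization technique. What it does show is the core idea: each ensemble member picks the candidate that minimises its own reconstruction error, and the share of members agreeing with the most common estimate is reported as the confidence.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

# Assumption: a "model" is any callable that maps a record to its reconstruction,
# standing in for a trained autoassociative network.
Model = Callable[[Sequence[float]], Sequence[float]]


def reconstruction_error(model: Model, record: Sequence[float]) -> float:
    """Squared error between a record and the model's reconstruction of it."""
    output = model(record)
    return sum((o - x) ** 2 for o, x in zip(output, record))


def estimate_missing(model: Model, record: Sequence[Optional[float]],
                     missing_index: int, candidates: Sequence[float]) -> float:
    """Return the candidate value that gives the lowest reconstruction error.

    Exhaustive search is used here in place of stochastic optimization.
    """
    def error_for(value: float) -> float:
        trial = list(record)
        trial[missing_index] = value
        return reconstruction_error(model, trial)

    return min(candidates, key=error_for)


def ensemble_estimate(models: Sequence[Model], record: Sequence[Optional[float]],
                      missing_index: int,
                      candidates: Sequence[float]) -> Tuple[float, float]:
    """Return the most common estimate and the fraction of models that chose it."""
    votes = Counter(
        estimate_missing(m, record, missing_index, candidates) for m in models
    )
    value, count = votes.most_common(1)[0]
    return value, count / len(models)


# Hypothetical toy models: each reconstructs the last field from the other two.
def model_a(x: Sequence[float]) -> List[float]:
    return [x[0], x[1], 1.0 if x[0] > 0.5 else 0.0]


def model_b(x: Sequence[float]) -> List[float]:
    return [x[0], x[1], 1.0 if x[1] > 0.5 else 0.0]


def model_c(x: Sequence[float]) -> List[float]:
    return [x[0], x[1], 1.0 if x[0] > 0.7 else 0.0]


def model_d(x: Sequence[float]) -> List[float]:
    return [x[0], x[1], 1.0 if x[0] + x[1] > 1.0 else 0.0]


# The third field (a binary status) is missing and is estimated from {0, 1}.
record = [0.8, 0.3, None]
value, confidence = ensemble_estimate(
    [model_a, model_b, model_c, model_d], record, missing_index=2,
    candidates=[0.0, 1.0],
)
print(value, confidence)
```

In this toy run three of the four stand-in models agree on the same value, so the confidence is the agreement fraction 3/4. With real networks and continuous variables the candidate set would not be enumerable, which is where a stochastic optimizer would take the place of the exhaustive search.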