• DocumentCode
    1241701
  • Title

    Information Loss of the Mahalanobis Distance in High Dimensions: Application to Feature Selection

  • Author

    Ververidis, Dimitrios ; Kotropoulos, Constantine

  • Author_Institution
    Dept. of Inf., Aristotle Univ. of Thessaloniki, Thessaloniki, Greece
  • Volume
    31
  • Issue
    12
  • fYear
    2009
  • Firstpage
    2275
  • Lastpage
    2281
  • Abstract
    When an infinite training set is used, the Mahalanobis distance between a pattern measurement vector of dimensionality D and the center of the class it belongs to follows a chi-square (χ²) distribution with D degrees of freedom. With finite training sets, however, the distribution of the Mahalanobis distance becomes either Fisher or Beta, depending on whether cross validation or resubstitution is used for parameter estimation. The total variation between χ² and Fisher, as well as between χ² and Beta, allows us to measure the information loss in high dimensions. The information loss is then exploited to set a lower limit for the correct classification rate achieved by the Bayes classifier that is used in subset feature selection.
  • Keywords
    Bayes methods; parameter estimation; pattern classification; Bayes classifier; Beta distance; Fisher distance; Mahalanobis distance; cross validation; feature selection; infinite training set; information loss; pattern measurement vector; resubstitution; Gaussian distribution
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type

    jour

  • DOI
    10.1109/TPAMI.2009.84
  • Filename
    4815271
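
The total variation between χ² and Fisher mentioned in the abstract can be illustrated numerically. The sketch below is not the paper's own derivation. It assumes the textbook Hotelling result for a held-out Gaussian point scored against the sample mean and unbiased sample covariance of N training vectors, under which the squared distance equals c times an F(D, N - D) variable with c = D(N - 1)(N + 1) / (N(N - D)). The function names, the scale factor, and the integration scheme are all assumptions made for illustration.

```python
import math


def log_chi2_pdf(x, dof):
    """Log density of a chi-square variable with `dof` degrees of freedom."""
    half = dof / 2.0
    return ((half - 1.0) * math.log(x) - x / 2.0
            - half * math.log(2.0) - math.lgamma(half))


def log_f_pdf(x, d1, d2):
    """Log density of a Fisher (F) variable with (d1, d2) degrees of freedom."""
    a, b = d1 / 2.0, d2 / 2.0
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (a * math.log(d1 / d2) + (a - 1.0) * math.log(x)
            - (a + b) * math.log1p(d1 * x / d2) - log_beta)


def tv_chi2_vs_fisher(dim, n_train, steps=20000):
    """Approximate total variation between chi2(dim) and the scaled Fisher law.

    Assumption (textbook Hotelling result, not taken from the record above):
    the held-out squared distance is c * F(dim, n_train - dim).
    """
    if n_train <= dim:
        raise ValueError("n_train must exceed dim")
    c = dim * (n_train - 1) * (n_train + 1) / (n_train * (n_train - dim))
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        # Midpoint rule in t, with x = dim * t / (1 - t) mapping (0, 1) to (0, inf).
        t = (k + 0.5) * h
        x = dim * t / (1.0 - t)
        jacobian = dim / (1.0 - t) ** 2
        p = math.exp(log_chi2_pdf(x, dim))
        q = math.exp(log_f_pdf(x / c, dim, n_train - dim)) / c
        total += abs(p - q) * jacobian * h
    return 0.5 * total


if __name__ == "__main__":
    for n in (12, 50, 1000):
        print(n, tv_chi2_vs_fisher(10, n))
```

Under these assumptions the value shrinks toward zero as N grows for fixed D, which matches the abstract's statement that the χ² law is recovered with an infinite training set.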