• DocumentCode
    1367586
  • Title
    A statistical-heuristic feature selection criterion for decision tree induction
  • Author
    Zhou, Xiao Jia ; Dillon, Tharam S.

  • Author_Institution
    Dept. of Comput. Sci., La Trobe Univ., Bundoora, Vic., Australia
  • Volume
    13
  • Issue
    8
  • fYear
    1991
  • fDate
    8/1/1991
  • Firstpage
    834
  • Lastpage
    841
  • Abstract
    The authors present a statistical-heuristic feature selection criterion for constructing multibranching decision trees in noisy real-world domains. Real-world problems often have multivalued features. For these problems, multibranching decision trees provide a more efficient and more comprehensible solution than binary decision trees. The authors propose a statistical-heuristic criterion, the symmetrical τ, and then discuss its consistency with a Bayesian classifier and its built-in statistical test. The combination of a proportional-reduction-in-error measure and a cost-of-complexity heuristic makes the symmetrical τ a powerful criterion with many merits, including robustness to noise, fairness to multivalued features, the ability to handle a Boolean combination of logical features, and middle-cut preference. The τ criterion also provides a natural basis for prepruning and dynamic error estimation. Illustrative examples are also presented.
  • Keywords
    Bayes methods; decision theory; pattern recognition; statistics; trees (mathematics); τ criterion; Bayesian classifier; built-in statistical test; cost-of-complexity heuristic; decision tree induction; dynamic error estimation; middle-cut preference; multibranching decision trees; prepruning; proportional-reduction-in-error; robustness; statistical-heuristic feature selection criterion; Bayesian methods; Classification tree analysis; Decision trees; Knowledge acquisition; Learning systems; Noise measurement; Noise robustness; Testing; Working environment noise
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/34.85676
  • Filename
    85676