  • DocumentCode
    1923552
  • Title
    Feature selection forcing overtraining may help to improve performance
  • Author
    Romero, Enrique ; Sopena, Josep M. ; Navarrete, Gorka ; Alquézar, René
  • Author_Institution
    Llenguatges i Sistemes Inf., Univ. Politecnica de Catalunya, Barcelona, Spain
  • Volume
    3
  • fYear
    2003
  • fDate
    20-24 July 2003
  • Firstpage
    2181
  • Abstract
    One of the main drawbacks of machine learning systems is the negative effect caused by overtraining. If the points in the dataset are perfectly fitted, the generalization performance is usually poor. We propose to take advantage of overtraining, together with feature selection, to improve the performance of a learning system. The main idea lies in the hypothesis that when the dataset is fitted as tightly as possible, the system is forced to make maximal use of all the available variables. Noisy and useless variables can then be detected because generalization improves when the system is not allowed to use them; by forcing overtraining, such variables should stand out more clearly. To test this hypothesis, we performed several feature selection experiments using feedforward neural networks, with Sequential Backward Selection as the feature selection procedure. Experimental results on several real-world problems suggest that the hypothesis is well founded. Ironically, forcing overtraining may help to achieve good performance.
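    A minimal sketch of the procedure outlined in the abstract, assuming a
    scikit-learn MLPClassifier in place of the paper's feedforward network and
    validation accuracy as the generalization estimate; the network size,
    iteration budget, and stopping rule are illustrative assumptions, not the
    authors' exact setup:

        from sklearn.model_selection import train_test_split
        from sklearn.neural_network import MLPClassifier

        def overtrained_score(X_tr, y_tr, X_val, y_val, feats):
            # Fit a small feedforward net far past convergence (no early
            # stopping, zero tolerance, large iteration budget) so the data
            # are fitted as tightly as possible, then report validation
            # accuracy on the given feature subset.
            net = MLPClassifier(hidden_layer_sizes=(20,), max_iter=5000,
                                early_stopping=False, tol=0.0, random_state=0)
            net.fit(X_tr[:, feats], y_tr)
            return net.score(X_val[:, feats], y_val)

        def sequential_backward_selection(X, y, min_features=1):
            # Sequential Backward Selection: starting from all variables,
            # repeatedly drop the one whose removal most improves validation
            # accuracy; stop when no removal helps.  X is a 2-D numpy array.
            X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3,
                                                        random_state=0)
            selected = list(range(X.shape[1]))
            best = overtrained_score(X_tr, y_tr, X_val, y_val, selected)
            while len(selected) > min_features:
                candidates = [(overtrained_score(X_tr, y_tr, X_val, y_val,
                                                 [f for f in selected if f != d]), d)
                              for d in selected]
                score, drop = max(candidates)
                if score < best:
                    break          # every remaining variable is pulling its weight
                best = score
                selected.remove(drop)   # the dropped variable looks noisy or useless
            return selected, best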
  • Keywords
    feature extraction; feedforward neural nets; learning (artificial intelligence); learning systems; dataset; feature selection; feedforward neural networks; machine learning systems; noisy variables; sequential backward selection; useless variables; Feedforward neural networks; Feedforward systems; Learning systems; Machine learning; Neural networks; Performance evaluation; Psychology; Security; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the International Joint Conference on Neural Networks, 2003
  • ISSN
    1098-7576
  • Print_ISBN
    0-7803-7898-9
  • Type
    conf
  • DOI
    10.1109/IJCNN.2003.1223746
  • Filename
    1223746