• DocumentCode
    1765852
  • Title
    Automated Feature Design for Numeric Sequence Classification by Genetic Programming
  • Author
    Harvey, Dustin Y.; Todd, Michael D.
  • Author_Institution
    Dept. of Struct. Eng., Univ. of California, San Diego, La Jolla, CA, USA
  • Volume
    19
  • Issue
    4
  • fYear
    2015
  • fDate
    Aug. 2015
  • Firstpage
    474
  • Lastpage
    489
  • Abstract
    Pattern recognition methods rely on maximum-information, minimum-dimension feature sets to reliably perform classification and regression tasks. Many methods exist to reduce feature set dimensionality and construct improved features from an initial set; however, there are few general approaches for the design of features from numeric sequences. Any information lost in preprocessing or feature measurement cannot be recreated during pattern recognition. General approaches are needed to extend pattern recognition to include feature design and selection for numeric sequences, such as time series, within the learning process itself. This paper proposes a novel genetic programming (GP) approach to automated feature design called Autofead. In this method, a GP variant evolves a population of candidate features built from a library of sequence-handling functions. Numerical optimization methods, included through a hybrid approach, ensure that the fitness of candidate algorithms is measured using optimal parameter values. Autofead represents the first automated feature design system for numeric sequences to leverage the power and efficiency of both numerical optimization and standard pattern recognition algorithms. Potential applications include the monitoring of electrocardiogram signals for indications of heart failure, network traffic analysis for intrusion detection systems, vibration measurement for bearing condition determination in rotating machinery, and credit card activity for fraud detection.
  • Keywords
    data reduction; feature selection; genetic algorithms; learning (artificial intelligence); pattern classification; regression analysis; time series; Autofead; GP approach; automated feature design system; bearing condition determination; candidate algorithms; credit card activity; electrocardiogram signal monitoring; feature measurement; feature selection; feature set dimensionality reduction; fraud detection; genetic programming; heart failure; information lost; intrusion detection systems; learning process; maximum-information feature sets; minimum-dimension feature sets; network traffic analysis; numeric sequence classification; numerical optimization; numerical optimization methods; optimal parameter values; pattern recognition methods; regression tasks; rotating machinery; sequence-handling functions; time series; Algorithm design and analysis; Classification algorithms; Genetic programming; Pattern recognition; Standards; Time series analysis; Vegetation; Feature design; genetic programming; machine learning; pattern recognition; sequence classification; time series classification; time series data mining;
  • fLanguage
    English
  • Journal_Title
    Evolutionary Computation, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1089-778X
  • Type
    jour
  • DOI
    10.1109/TEVC.2014.2341451
  • Filename
    6861439