• DocumentCode
    1967489
  • Title

    Empirical case studies in attribute noise detection

  • Author

    Khoshgoftaar, Taghi M. ; Van Hulse, Jason

  • Author_Institution
    Dept. of Comput. Eng., Florida Atlantic Univ., Boca Raton, FL, USA
  • fYear
    2005
  • fDate
    15-17 Aug. 2005
  • Firstpage
    211
  • Lastpage
    216
  • Abstract
    The problem of determining the noisiest attribute(s) from a set of domain-specific attributes is of practical importance to domain experts and the data mining community. Data noise is generally of two types: attribute noise and mislabeling errors (class noise). For a given domain-specific dataset, attributes that contain a significant amount of noise can have a detrimental impact on the success of a data mining initiative, e.g., reducing the predictive ability of a classifier in a supervised learning task. Techniques that provide information about the noise quality of an attribute are useful tools for a data mining practitioner when performing analysis on a dataset or scrutinizing the data collection processes. Our technique for detecting noisy attributes uses an algorithm that we recently proposed for the detection of instances with attribute noise. This paper presents case studies that confirm our recent work done on detecting noisy attributes and further validates that our technique is indeed able to detect attributes that contain noise.
  • Keywords
    data mining; database management systems; attribute noise; attribute noise detection; class noise; data collection process; domain experts; domain-specific attributes; domain-specific dataset; mislabeling errors; supervised learning task; Computer aided software engineering; Computer errors; Computer science; Data analysis; Data engineering; Data mining; Information analysis; Noise reduction; Performance analysis; Supervised learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Reuse and Integration, Conf, 2005. IRI-2005 IEEE International Conference on.
  • Print_ISBN
    0-7803-9093-8
  • Type

    conf

  • DOI
    10.1109/IRI-05.2005.1506475
  • Filename
    1506475