• DocumentCode
    515365
  • Title
    A proposed outliers identification algorithm for categorical data sets
  • Author
    Taha, Ayman; Hegazy, Osman M.
  • Author_Institution
    Fac. of Comput. & Inf., Cairo Univ., Cairo, Egypt
  • fYear
    2010
  • fDate
    28-30 March 2010
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Outliers are a minority of observations that are inconsistent with the pattern suggested by the majority of observations. Outlier identification algorithms for categorical data sets face many limitations because distance is not commonly measured on categorical data. In this paper, we propose a new unsupervised outlier identification method for categorical data sets. In contrast to other outlier identification methods, the proposed method considers the number of categories within each categorical variable. Experimental results show that the proposed method performs comparably to other outlier identification methods.
  • Keywords
    data mining; categorical data sets; categorical variables; outliers identification algorithm; Application software; Computer errors; Data mining; Detection algorithms; Frequency; Phase detection; Spatial databases; Supervised learning; Testing; Unsupervised learning; Categorical Data; Data Mining; Outliers Detection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Informatics and Systems (INFOS), 2010 The 7th International Conference on
  • Conference_Location
    Cairo
  • Print_ISBN
    978-1-4244-5828-8
  • Type
    conf
  • Filename
    5461759