• DocumentCode
    3351335
  • Title

    Mining atypical groups for a target quantitative attribute

  • Author

    Guillaume, Sylvie ; Guillochon, Florian

  • Author_Institution
    LIMOS Res. Lab., Blaise Pascal Univ., Aubiere
  • fYear
    2008
  • fDate
    21-24 Sept. 2008
  • Firstpage
    1067
  • Lastpage
    1074
  • Abstract
    An important task in data analysis is the understanding of unexpected or atypical behaviors in a group of individuals. Which categories of individuals earn the higher salaries or, on the contrary, which ones earn the lower salaries? We present the problem of how data concerning atypical groups can be mined compared with a target quantitative attribute, like for instance the attribute "salary", and in particular for the high and low values of a user-defined interval. Our search therefore focuses on conjunctions of attributes whose distribution differs significantly from the learning set for the interval's high and low values of the target attribute. Such atypical groups can be found by adapting an existing measure, the intensity of inclination. This measure frees us from the transformation step of quantitative attributes, that is to say the step of discretization followed by a complete disjunctive coding. Thus, we propose an algorithm for mining such groups using pruning rules in order to reduce the complexity of the problem. This algorithm has been developed and integrated into the WEKA software for knowledge extraction. Finally we give an example of data extraction from the American census database IPUMS.
  • Keywords
    behavioural sciences computing; data analysis; data mining; American census database; IPUMS; WEKA software; atypical behaviors; atypical group mining; data analysis; data extraction; knowledge extraction; pruning rules; target quantitative attribute; Association rules; Clustering methods; Data analysis; Data mining; Fuzzy sets; Itemsets; Laboratories; Merging; Remuneration; Software algorithms; Quantitative associations; interestingness measures; negative and positive associations;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cybernetics and Intelligent Systems, 2008 IEEE Conference on
  • Conference_Location
    Chengdu
  • Print_ISBN
    978-1-4244-1673-8
  • Electronic_ISBN
    978-1-4244-1674-5
  • Type

    conf

  • DOI
    10.1109/ICCIS.2008.4670867
  • Filename
    4670867