DocumentCode
3351335
Title
Mining atypical groups for a target quantitative attribute
Author
Guillaume, Sylvie ; Guillochon, Florian
Author_Institution
LIMOS Res. Lab., Blaise Pascal Univ., Aubiere
fYear
2008
fDate
21-24 Sept. 2008
Firstpage
1067
Lastpage
1074
Abstract
An important task in data analysis is the understanding of unexpected or atypical behaviors in a group of individuals. Which categories of individuals earn the higher salaries or, on the contrary, which ones earn the lower salaries? We present the problem of how data concerning atypical groups can be mined compared with a target quantitative attribute, like for instance the attribute "salary", and in particular for the high and low values of a user-defined interval. Our search therefore focuses on conjunctions of attributes whose distribution differs significantly from the learning set for the interval's high and low values of the target attribute. Such atypical groups can be found by adapting an existing measure, the intensity of inclination. This measure frees us from the transformation step of quantitative attributes, that is to say the step of discretization followed by a complete disjunctive coding. Thus, we propose an algorithm for mining such groups using pruning rules in order to reduce the complexity of the problem. This algorithm has been developed and integrated into the WEKA software for knowledge extraction. Finally we give an example of data extraction from the American census database IPUMS.
Keywords
behavioural sciences computing; data analysis; data mining; American census database; IPUMS; WEKA software; atypical behaviors; atypical group mining; data extraction; knowledge extraction; pruning rules; target quantitative attribute; Association rules; Clustering methods; Fuzzy sets; Itemsets; Laboratories; Merging; Remuneration; Software algorithms; Quantitative associations; interestingness measures; negative and positive associations
fLanguage
English
Publisher
IEEE
Conference_Titel
Cybernetics and Intelligent Systems, 2008 IEEE Conference on
Conference_Location
Chengdu
Print_ISBN
978-1-4244-1673-8
Electronic_ISBN
978-1-4244-1674-5
Type
conf
DOI
10.1109/ICCIS.2008.4670867
Filename
4670867