DocumentCode :
2864247
Title :
Top 10 data mining mistakes
Author :
Elder, John F., IV
Author_Institution :
Elder Res., Inc., Charlottesville, VA, USA
fYear :
2005
fDate :
27-30 Nov. 2005
Abstract :
Summary form only given. Data mining is still as much an art as a science, and fancy new tools make it easy to do wrong things with one's data even faster. We'll examine the major "cracks in the crystal ball" through case studies, both simple and complex, of (often personal) errors drawn from real-world consulting engagements. Best practices for data mining will be (accidentally) illuminated by their (rarely described) opposites. These common errors range from allowing anachronistic variables into the pool of candidate inputs, to subtly inflating results through early up-sampling. You'll hear cautionary tales of endangered projects and embarrassed teams, but also the keys to avoiding such a fate yourself.
Keywords :
data mining; anachronistic variables; complex errors; real-world consulting; simple errors
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, Fifth IEEE International Conference on
ISSN :
1550-4786
Print_ISBN :
0-7695-2278-5
Type :
conf
DOI :
10.1109/ICDM.2005.83
Filename :
1565651