Title :
Top 10 data mining mistakes
Author :
Elder, John F., IV
Author_Institution :
Elder Res., Inc., Charlottesville, VA, USA
Abstract :
Summary form only given. Data mining is still as much an art as a science, and fancy new tools make it easy to do wrong things with one's data even faster. We'll examine the major "cracks in the crystal ball" through case studies, both simple and complex, of (often personal) errors drawn from real-world consulting engagements. Best practices for data mining will be (accidentally) illuminated by their (rarely described) opposites. These common errors range from allowing anachronistic variables into the pool of candidate inputs to subtly inflating results through early up-sampling. You'll hear cautionary tales of endangered projects and embarrassed teams, but also the keys to avoiding such a fate yourself.
Keywords :
data mining; anachronistic variables; complex errors; real-world consulting; simple errors;
Conference_Title :
Data Mining, Fifth IEEE International Conference on
Print_ISBN :
0-7695-2278-5
DOI :
10.1109/ICDM.2005.83