Title :
A scalable bottom-up data mining algorithm for relational databases
Author :
Giuffrida, Giovanni ; Cooper, Lee G. ; Chu, Wesley W.
Author_Institution :
Dept. of Comput. Sci., California Univ., Los Angeles, CA, USA
Abstract :
Machine learning induction algorithms are difficult to scale to very large databases because they are memory-bound, and falling back on virtual memory causes significant performance degradation. To overcome these shortcomings, we developed a classification rule induction algorithm for relational databases. Our algorithm uses a bottom-up rule generation strategy that is more effective for mining databases whose nominal variables have large cardinality. We have successfully used our algorithm to mine a retail grocery database containing more than 1.6 million records in about 5 hours on a dual-processor Pentium PC.
Keywords :
deductive databases; knowledge acquisition; learning by example; query processing; relational databases; retail data processing; software performance evaluation; very large databases; Pentium processor; bottom-up data mining algorithm; bottom-up rule generation; classification rule induction algorithm; induction algorithms; machine learning; memory-bound; performance; retail grocery database; scalable algorithm; virtual memory; Classification algorithms; Data mining; Indexing; Induction generators; Law; Machine learning; Operating systems; Relational databases; Spatial databases; Testing
Conference_Title :
Proceedings of the Tenth International Conference on Scientific and Statistical Database Management, 1998
Conference_Location :
Capri
Print_ISBN :
0-8186-8575-1
DOI :
10.1109/SSDM.1998.688125