Title :
A Practical Heterogeneous Classifier for Relational Databases
Author :
Manjunath, Geetha ; Murty, Narasimha M. ; Sitaram, Dinkar
Abstract :
Most enterprise data is distributed across multiple relational databases with expert-designed schemas. Applying traditional single-table machine learning techniques to such data not only incurs a computational penalty for converting it to a "flat" form (mega-join), but also loses the human-specified semantic information present in the relations. In this paper, we present a two-phase hierarchical meta-classification algorithm for relational databases that takes a semantic divide-and-conquer approach. We propose a recursive prediction aggregation technique over heterogeneous classifiers applied to individual database tables. A preliminary evaluation on TPCH and UCI benchmarks shows reduced training time without any loss of prediction accuracy.
Keywords :
divide and conquer methods; learning (artificial intelligence); relational databases; TPCH; UCI benchmarks; enterprise data; expert-designed schema; heterogeneous classifier; human-specified semantic information; multiple relational databases; single-table machine learning techniques; two-phase hierarchical meta-classification algorithm; Accuracy; Distributed databases; Prediction algorithms; Resource description framework; Semantics; Training; classification; hierarchical
Conference_Title :
Pattern Recognition (ICPR), 2010 20th International Conference on
Conference_Location :
Istanbul
Print_ISBN :
978-1-4244-7542-1
DOI :
10.1109/ICPR.2010.811