Title :
Graph Propositionalization for Random Forests
Author :
Karunaratne, Thashmee ; Bostrom, Henrik
Author_Institution :
Dept. of Comput. & Syst. Sci., Stockholm Univ., Stockholm, Sweden
Abstract :
Graph propositionalization methods transform structured and relational data into a fixed-length feature vector format that can be used by standard machine learning methods. However, the choice of propositionalization method may have a significant impact on the performance of the resulting classifier. Six different propositionalization methods are evaluated when used in conjunction with random forests. The empirical evaluation shows that the choice of propositionalization method has a significant impact on the resulting accuracy for structured data sets. The results furthermore show that the maximum frequent itemset approach and a combination of this approach and maximal common substructures turn out to be the most successful propositionalization methods for structured data, each significantly outperforming the four other considered methods.
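As a rough illustration of the transformation the abstract describes, the sketch below maps a few small labeled graphs to fixed-length binary feature vectors. It is not one of the six methods evaluated in the paper: the feature definition (single labeled edges that occur in at least `min_support` graphs), the function names `edge_features` and `propositionalize`, and the toy graphs are all assumptions made for this example.

```python
from collections import Counter


def edge_features(graph):
    """Return the set of canonical labeled edges found in one graph.

    A graph is assumed to be a list of (node_label, edge_label, node_label)
    triples; the two node labels are sorted so edge direction is ignored.
    """
    return {tuple(sorted((u, v))) + (label,) for u, label, v in graph}


def propositionalize(graphs, min_support=2):
    """Map each graph to a fixed-length 0/1 vector over frequent edges."""
    per_graph = [edge_features(g) for g in graphs]
    # Count in how many graphs each candidate feature occurs.
    support = Counter(f for feats in per_graph for f in feats)
    # Keep only features that reach the (hypothetical) support threshold.
    vocabulary = sorted(f for f, n in support.items() if n >= min_support)
    # One row per graph, one column per retained feature.
    vectors = [[int(f in feats) for f in vocabulary] for feats in per_graph]
    return vocabulary, vectors


# Toy molecule-like graphs, invented for illustration only.
graphs = [
    [("C", "single", "O"), ("C", "single", "H")],
    [("O", "single", "C"), ("C", "double", "O")],
    [("C", "single", "H"), ("N", "single", "H")],
]
vocabulary, vectors = propositionalize(graphs)
```

Each row of `vectors` has the same length as `vocabulary`, so the rows could in principle be passed, together with class labels, to any standard learner that expects tabular input, such as a random forest.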
Keywords :
data structures; graph theory; learning (artificial intelligence); feature vector format; graph propositionalization; maximum frequent itemset approach; random forests; relational data; standard machine learning method; structured data set; Application software; Data preprocessing; Feature extraction; Fingerprint recognition; Frequency; Itemsets; Learning systems; Machine learning; Machine learning algorithms; Standards development; Graph Propositionalization; Learning algorithms; structured data
Conference_Title :
Machine Learning and Applications, 2009. ICMLA '09. International Conference on
Conference_Location :
Miami Beach, FL
Print_ISBN :
978-0-7695-3926-3
DOI :
10.1109/ICMLA.2009.113