Title :
Large-scale factorization of type-constrained multi-relational data
Author :
Krompass, Denis ; Nickel, Maximilian ; Tresp, Volker
Author_Institution :
Ludwig Maximilian Univ., Munich, Germany
Abstract :
The statistical modeling of large multi-relational datasets has gained increasing attention in recent years. Typical applications involve large knowledge bases such as DBpedia, Freebase, YAGO and the recently introduced Google Knowledge Graph, which contain millions of entities, hundreds and thousands of relations, and billions of relational tuples. Collective factorization methods have been shown to scale up to these large multi-relational datasets, in particular in the form of tensor approaches that can exploit highly scalable alternating least squares (ALS) algorithms for calculating the factors. In this paper we extend the recently proposed state-of-the-art RESCAL tensor factorization to consider relational type-constraints. Relational type-constraints explicitly define the logic of relations by excluding entities from the subject or object role. In addition, we show that in the absence of prior knowledge about type-constraints, local closed-world assumptions can be approximated for each relation by ignoring unobserved subject or object entities in a relation. In our experiments on representative large datasets (Cora, DBpedia) that contain up to millions of entities and hundreds of type-constrained relations, we show that the proposed approach is scalable. It further significantly outperforms RESCAL without type-constraints in both runtime and prediction quality.
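As an illustration of the idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the usual RESCAL form in which the adjacency matrix X_k of relation k is approximated by A R_k A^T, and it approximates a local closed-world assumption by keeping only entities that were observed as subject (non-empty row) or object (non-empty column) of that relation. The function names, the toy data, and the random factors are hypothetical and chosen only for this example.

```python
import numpy as np


def approximate_type_constraints(X_k):
    """Approximate type-constraints for one relation (illustrative helper).

    Returns the indices of entities observed at least once as subject
    (non-empty rows) and as object (non-empty columns) in X_k.
    """
    subjects = np.flatnonzero(X_k.any(axis=1))
    objects = np.flatnonzero(X_k.any(axis=0))
    return subjects, objects


def constrained_scores(A, R_k, subjects, objects):
    """Bilinear scores restricted to the allowed subject/object entities.

    Assumption: scores take the RESCAL form A R_k A^T, evaluated only on
    the rows of A selected by `subjects` and `objects`.
    """
    return A[subjects] @ R_k @ A[objects].T


# Toy relation over 4 entities: entities 0 and 2 occur as subjects,
# entities 1 and 3 occur as objects.
X_k = np.array([
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
])

subjects, objects = approximate_type_constraints(X_k)

# Random stand-ins for learned factors (rank 2); a real model would fit
# these, for example with alternating least squares.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 2))
R_k = rng.normal(size=(2, 2))

scores = constrained_scores(A, R_k, subjects, objects)
print(subjects, objects, scores.shape)
```

In this sketch the constrained score matrix has one row per allowed subject and one column per allowed object, so its size depends on the number of entities that participate in the relation rather than on the total number of entities.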
Keywords :
least squares approximations; matrix decomposition; relational databases; tensors; ALS algorithms; Cora dataset; DBpedia dataset; RESCAL tensor factorization; alternating least squares algorithms; collective factorization method; large multirelational datasets; large-scale factorization; object role; prediction quality; relation logic; relational type-constraints; runtime analysis; statistical modeling; tensor approaches; type-constrained multirelational data; Lead;
Conference_Title :
Data Science and Advanced Analytics (DSAA), 2014 International Conference on
DOI :
10.1109/DSAA.2014.7058046