DocumentCode
3006293
Title
Consistent Process Mining over Big Data Triple Stores
Author
Azzini, Antonia ; Ceravolo, Paolo
Author_Institution
Dipt. di Inf., Univ. degli Studi di Milano, Milan, Italy
fYear
2013
fDate
June 27 2013-July 2 2013
Firstpage
54
Lastpage
61
Abstract
"Big Data" techniques are often adopted in cross-organization scenarios to integrate multiple data sources and extract statistics or other latent information. Even though these techniques do not require a schema to process data, a common conceptual model is typically defined to address name resolution. This implies that each local source is tasked with applying a semantic lifting procedure to express its local data in terms of the common model. Semantic heterogeneity is then potentially introduced into the data. In this paper we illustrate a methodology designed for implementing consistent process mining algorithms in a "Big Data" context. In particular, we exploit two different procedures. The first computes the mismatch among the data sources to be integrated. The second uses the mismatch values to extend the data to be processed with a traditional Map Reduce algorithm.
Keywords
data mining; statistics; Map Reduce algorithm; big data techniques; big data triple stores; conceptual model; consistent process mining algorithm; cross-organization scenarios; latent information extraction; multiple data source integration; name resolution; semantic heterogeneity; semantic lifting procedure; statistics extraction; Data handling; Data mining; Data models; Data storage systems; Information management; Resource description framework; Semantics; Big Data; Data Integration; Process Mining;
fLanguage
English
Publisher
IEEE
Conference_Titel
Big Data (BigData Congress), 2013 IEEE International Congress on
Conference_Location
Santa Clara, CA
Print_ISBN
978-0-7695-5006-0
Type
conf
DOI
10.1109/BigData.Congress.2013.17
Filename
6597119