Author : 
Abello, Alberto ; Romero, Oscar ; Bach Pedersen, Torben ; Berlanga, Rafael ; Nebot, Victoria ; Aramburu, Maria Jose ; Simitsis, Alkis
         
        
            Abstract : 
This paper describes the convergence of some of the most influential technologies of the last few years, namely data warehousing (DW), on-line analytical processing (OLAP), and the Semantic Web (SW). Enterprises use OLAP to derive business-critical knowledge from data inside the company. However, the most interesting OLAP queries can no longer be answered on internal data alone; external data must also be discovered (most often on the web), acquired, integrated, and (analytically) queried, resulting in a new type of OLAP: exploratory OLAP. When using external data, an important issue is knowing the precise semantics of the data. Here, SW technologies come to the rescue, as they allow semantics (ranging from very simple to very complex) to be specified for web-available resources. SW technologies not only support capturing “passive” semantics but also support active inference and reasoning on the data. The paper first presents a characterization of DW/OLAP environments, followed by an introduction to the relevant SW foundation concepts. Then, it describes the relationship between multidimensional (MD) models and SW technologies, including the relationship between MD models and SW formalisms. Next, the paper surveys the use of SW technologies for data modeling and data provisioning, including semantic data annotation and semantic-aware extract, transform, and load (ETL) processes. Finally, all the findings are discussed and a number of directions for future research are outlined, including SW support for intelligent MD querying, using SW technologies to provide context to data warehouses, and scalability issues.
         
        
            Keywords : 
business data processing; data mining; data warehouses; semantic Web; DW; ETL process; SW; business-critical knowledge; data acquisition; data discovery; data integration; data querying; data semantics; data warehousing; exploratory OLAP; extract-transform-and-load process; multidimensional models; online analytical processing; scalability issues; semantic Web technology; Bismuth; Cognition; Data models; Semantics; Transforms; Business Intelligence; ETL; OLAP; reasoning