Title :
A general divide and conquer approach for process mining
Author :
van der Aalst, Wil M. P.
Author_Institution :
Archit. of Inf. Syst., Eindhoven Univ. of Technol., Eindhoven, Netherlands
Abstract :
Operational processes leave trails in the information systems supporting them. Such event data are the starting point for process mining - an emerging scientific discipline relating modeled and observed behavior. The relevance of process mining is increasing as more and more event data become available. The increasing volume of such data (“Big Data”) provides both opportunities and challenges for process mining. In this paper we focus on two particular types of process mining: process discovery (learning a process model from example behavior recorded in an event log) and conformance checking (diagnosing and quantifying discrepancies between observed behavior and modeled behavior). These tasks become challenging when there are hundreds or even thousands of different activities and millions of cases. Typically, process mining algorithms are linear in the number of cases and exponential in the number of different activities. This paper proposes a very general divide-and-conquer approach that decomposes the event log based on a partitioning of activities. Unlike existing approaches, this paper does not assume a particular process representation (e.g., Petri nets or BPMN) and allows for various decomposition strategies (e.g., SESE- or passage-based decomposition). Moreover, the generic divide-and-conquer approach reveals the core requirements for decomposing process discovery and conformance checking problems.
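The decomposition step mentioned in the abstract can be illustrated with a minimal sketch. It assumes an event log is a list of traces, each trace a list of activity names, and that a sublog is obtained by projecting every trace onto one activity set. The names `project` and `decompose_log` and the toy log are hypothetical and not taken from the paper.

```python
from typing import Iterable, List, Sequence, Set

Trace = Sequence[str]


def project(trace: Trace, activities: Set[str]) -> List[str]:
    """Keep only the events whose activity is in the given set, preserving order."""
    return [activity for activity in trace if activity in activities]


def decompose_log(log: Iterable[Trace], activity_sets: Sequence[Set[str]]) -> List[List[List[str]]]:
    """Build one sublog per activity set by projecting every trace onto that set."""
    traces = list(log)
    return [[project(trace, acts) for trace in traces] for acts in activity_sets]


# Toy event log (assumed data): three cases over activities a, b, c, d.
log = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "b", "d"],
]

# Hypothetical split of the activities into two sets.
activity_sets = [{"a", "b"}, {"c", "d"}]

sublogs = decompose_log(log, activity_sets)
```

Each sublog mentions fewer distinct activities than the full log, so a discovery or conformance technique whose cost grows steeply with the number of activities can be run on each sublog separately.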
Keywords :
Petri nets; business data processing; data mining; divide and conquer methods; BPMN; SESE-decomposition; conformance checking problems; core requirements; decomposition strategy; divide and conquer approach; event log; generic divide-and-conquer approach; information systems; modeled behavior; observed behavior; operational processes; passage-based decomposition; process discovery; process mining algorithms; process representation; Computational modeling; Hidden Markov models; Measurement; Unified modeling language;
Conference_Title :
Computer Science and Information Systems (FedCSIS), 2013 Federated Conference on
Conference_Location :
Kraków