• DocumentCode
    31793
  • Title

    Active Trace Clustering for Improved Process Discovery

  • Author

    De Weerdt, J. ; vanden Broucke, S. ; Vanthienen, Jan ; Baesens, Bart

  • Author_Institution
    Dept. of Decision Sci. & Inf. Manage., Katholieke Univ. Leuven, Leuven, Belgium
  • Volume
    25
  • Issue
    12
  • fYear
    2013
  • fDate
    Dec. 2013
  • Firstpage
    2708
  • Lastpage
    2720
  • Abstract
    Process discovery is the learning task that entails the construction of process models from event logs of information systems. Typically, these event logs are large data sets that record process executions by registering which activity took place at a certain moment in time. By far the most arduous challenge for process discovery algorithms is accurate and comprehensible knowledge discovery from highly flexible environments. Event logs from such flexible systems often contain a large variety of process executions, which makes the application of process mining most interesting. However, simply applying existing process discovery techniques will often yield highly incomprehensible process models because of their inaccuracy and complexity. Trace clustering is one very interesting approach to resolving this problem, since it allows an existing event log to be split up so as to facilitate the knowledge discovery process. In this paper, we propose a novel trace clustering technique that differs significantly from previous approaches. Above all, it starts from the observation that currently available techniques suffer from a large divergence between the clustering bias and the evaluation bias. By employing an active-learning-inspired approach, this bias divergence is resolved. In an assessment using four complex, real-life event logs, it is shown that our technique significantly outperforms currently available trace clustering techniques.
  • Keywords
    data mining; information systems; learning (artificial intelligence); pattern clustering; active learning inspired approach; active trace clustering technique; knowledge discovery; large data sets; learning task; process discovery algorithm; process executions; process mining; process model construction; real-life event logs; Clustering; Information systems; Learning systems; Process mining; active learning; event logs; trace clustering
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2013.64
  • Filename
    6507222