• DocumentCode
    588199
  • Title

    Temporal representation for scientific data provenance

  • Author

    Chen, Peng ; Plale, Beth ; Aktas, Mehmet S.

  • Author_Institution
    Sch. of Inf. & Comput., Indiana Univ., Bloomington, IN, USA
  • fYear
    2012
  • fDate
    8-12 Oct. 2012
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Provenance of digital scientific data is an important piece of the metadata of a data object. It can, however, grow voluminous quickly because the granularity level of capture can be high. It can also be quite feature rich. We propose a representation of the provenance data based on logical time that reduces the feature space. Creating time- and frequency-domain representations of the provenance, we apply clustering, classification, and association rule mining to the abstract representations to determine the usefulness of the temporal representation. We evaluate the temporal representation using an existing 10 GB database of provenance captured from a range of scientific workflows.
  • Keywords
    abstract data types; data mining; frequency-domain analysis; meta data; pattern classification; pattern clustering; scientific information systems; temporal databases; time-domain analysis; workflow management software; abstract representation; association rule mining; classification; clustering; digital scientific data provenance; feature space reduction; frequency domain representation; metadata; provenance data representation; provenance database; scientific workflows; temporal representation; time domain representation; Clocks; Clustering algorithms; Data mining; Databases; Frequency domain analysis; Partitioning algorithms; Time domain analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    E-Science (e-Science), 2012 IEEE 8th International Conference on
  • Conference_Location
    Chicago, IL
  • Print_ISBN
    978-1-4673-4467-8
  • Type

    conf

  • DOI
    10.1109/eScience.2012.6404477
  • Filename
    6404477