• DocumentCode
    2934734
  • Title

    A Framework for Efficient Data Analytics through Automatic Configuration and Customization of Scientific Workflows

  • Author

    Hauder, Matheus ; Gil, Yolanda ; Liu, Yan

  • Author_Institution
    Inst. for Software & Syst. Eng., Univ. of Augsburg, Augsburg, Germany
  • fYear
    2011
  • fDate
    5-8 Dec. 2011
  • Firstpage
    379
  • Lastpage
    386
  • Abstract
    Data analytics involves choosing among many different algorithms and experimenting with possible combinations of those algorithms. Existing approaches, however, do not support scientists in the laborious task of exploring the design space of computational experiments. We have developed a framework to assist scientists with data analysis tasks, in particular machine learning and data mining. It takes advantage of the unique capabilities of the Wings workflow system to reason about semantic constraints. We show how the framework can rule out invalid workflows and help scientists explore the design space. We demonstrate our system in the domain of text analytics and outline the benefits of our approach.
  • Keywords
    data analysis; data mining; learning (artificial intelligence); natural sciences computing; text analysis; workflow management software; Wings workflow system; computational experiment; data analysis; data analytics; data mining; machine learning; scientific workflow configuration; scientific workflow customization; text analytics; Algorithm design and analysis; Clustering algorithms; Correlation; Machine learning algorithms; Prediction algorithms; Software; Software algorithms; Data Analytics; Scientific Workflows
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    E-Science (e-Science), 2011 IEEE 7th International Conference on
  • Conference_Location
    Stockholm
  • Print_ISBN
    978-1-4577-2163-2
  • Type

    conf

  • DOI
    10.1109/eScience.2011.59
  • Filename
    6123302