DocumentCode
2934734
Title
A Framework for Efficient Data Analytics through Automatic Configuration and Customization of Scientific Workflows
Author
Hauder, Matheus ; Gil, Yolanda ; Liu, Yan
Author_Institution
Inst. for Software & Syst. Eng., Univ. of Augsburg, Augsburg, Germany
fYear
2011
fDate
5-8 Dec. 2011
Firstpage
379
Lastpage
386
Abstract
Data analytics involves choosing among many different algorithms and experimenting with possible combinations of those algorithms. Existing approaches, however, do not support scientists in the laborious task of exploring the design space of computational experiments. We have developed a framework to assist scientists with data analysis tasks, in particular machine learning and data mining. It takes advantage of the unique capabilities of the Wings workflow system to reason about semantic constraints. We show how the framework can rule out invalid workflows and help scientists explore the design space. We demonstrate our system in the domain of text analytics and outline the benefits of our approach.
Keywords
data analysis; data mining; learning (artificial intelligence); natural sciences computing; text analysis; workflow management software; Wings workflow system; computational experiment; data analysis; data analytics; data mining; machine learning; scientific workflow configuration; scientific workflow customization; text analytics; Algorithm design and analysis; Clustering algorithms; Correlation; Machine learning algorithms; Prediction algorithms; Software; Software algorithms; Data Analytics; Scientific Workflows;
fLanguage
English
Publisher
ieee
Conference_Titel
E-Science (e-Science), 2011 IEEE 7th International Conference on
Conference_Location
Stockholm
Print_ISBN
978-1-4577-2163-2
Type
conf
DOI
10.1109/eScience.2011.59
Filename
6123302