DocumentCode :
3145263
Title :
Data science for software engineering
Author :
Menzies, T. ; Kocaguneli, Ekrem ; Peters, F. ; Turhan, Burak ; Minku, Leandro L.
Author_Institution :
Lane Dept. of CS&EE, West Virginia Univ., Morgantown, WV, USA
fYear :
2013
fDate :
18-26 May 2013
Firstpage :
1484
Lastpage :
1486
Abstract :
Target audience: Software practitioners and researchers wanting to understand the state of the art in using data science for software engineering (SE). Content: In the age of big data, data science (the knowledge of deriving meaningful outcomes from data) is an essential skill that software engineers should be equipped with. It can be used to predict useful information on new projects based on completed projects. This tutorial offers core insights about the state of the art in this important field. What participants will learn: Before data science: this tutorial discusses the tasks needed to deploy machine-learning algorithms to organizations (Part 1: Organization Issues). During data science: the tutorial covers techniques ranging from discretization to clustering, dichotomization, and statistical analysis. And the rest: When local data is scarce, we show how to adapt data from other organizations to local problems. When privacy concerns block access, we show how to privatize data while still being able to mine it. When working with data of dubious quality, we show how to prune spurious information. When data or models seem too complex, we show how to simplify data mining results. When data is too scarce to support intricate models, we show methods for generating predictions. When the world changes, and old models need to be updated, we show how to handle those updates. When the effect is too complex for one model, we show how to reason across ensembles of models. Prerequisites: This tutorial makes minimal use of maths or advanced algorithms and should be understandable to developers and technical managers.
Keywords :
data handling; data mining; data privacy; learning (artificial intelligence); pattern clustering; software engineering; statistical analysis; big data; data clustering; data dichotomization; data discretization; data science; machine-learning algorithms; software practitioners; software researchers; Data models; Educational institutions; Predictive models; Software; Tutorials
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2013 35th International Conference on Software Engineering (ICSE)
Conference_Location :
San Francisco, CA
Print_ISBN :
978-1-4673-3073-2
Type :
conf
DOI :
10.1109/ICSE.2013.6606752
Filename :
6606752