DocumentCode :
3112548
Title :
Parametric classification over multiple samples
Author :
Russo, Barbara
Author_Institution :
Fac. of Comput. Sci., Free Univ. of Bozen-Bolzano, Bolzano, Italy
fYear :
2013
fDate :
21 May 2013
Firstpage :
23
Lastpage :
25
Abstract :
This pattern was originally designed to classify sequences of events in log files by error-proneness. Sequences of events trace application use in real contexts. As such, identifying error-prone sequences helps understand and predict application use. The classification problem we describe is typical in supervised machine learning, but the composite pattern we propose investigates it with several techniques to control for data brittleness. Data pre-processing, feature selection, parametric classification, and cross-validation are the major instruments that enable a good degree of control over this classification problem. In particular, the pattern includes a solution for typical problems that occur when data comes from several samples of different populations and with different degrees of sparsity.
Keywords :
learning (artificial intelligence); pattern classification; classification problem; cross-validation; data pre-processing; error-prone sequences; feature selection; parametric classification; supervised machine learning; Accuracy; Correlation; Sociology; Software; Training; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Analysis Patterns in Software Engineering (DAPSE), 2013 1st International Workshop on
Conference_Location :
San Francisco, CA
Type :
conf
DOI :
10.1109/DAPSE.2013.6603805
Filename :
6603805