DocumentCode :
2765975
Title :
On the Use of Machine Learning to Predict the Time and Resources Consumed by Applications
Author :
Matsunaga, Andréa ; Fortes, José
fYear :
2010
fDate :
17-20 May 2010
Firstpage :
495
Lastpage :
504
Abstract :
Most data centers, clouds and grids consist of multiple generations of computing systems, each with different performance profiles, posing a challenge to job schedulers in achieving the best usage of the infrastructure. A useful piece of information for scheduling jobs, typically not available, is the extent to which applications will use available resources once they are executed. This paper comparatively assesses the suitability of several machine learning techniques for predicting spatiotemporal utilization of resources by applications. Modern machine learning techniques able to handle a large number of attributes are used, taking into account application- and system-specific attributes (e.g., CPU microarchitecture, size and speed of memory and storage, input data characteristics and input parameters). The work also extends an existing classification tree algorithm, called Predicting Query Runtime (PQR), to the regression problem by allowing each leaf of the tree to select the best regression method for the data collected at that leaf. The new method (PQR2) yields the best average percentage error when predicting execution time, memory and disk consumption for two bioinformatics applications, BLAST and RAxML, deployed in scenarios that differ in system and usage. In specific scenarios where usage is a non-linear function of system and application attributes, certain configurations of two other machine learning algorithms, Support Vector Machine and k-nearest neighbors, also yield competitive results. In addition, experiments show that including system performance and application-specific attributes improves the performance of the machine learning algorithms investigated.
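The idea of letting each tree leaf choose its own regression method can be illustrated with a minimal sketch. This is not the authors' PQR2 implementation: the two candidate regressors, the single input attribute, the function names, and the use of training-set percentage error as the selection criterion are all assumptions made for illustration.

```python
from statistics import mean


def mean_percentage_error(actual, predicted):
    """Average absolute percentage error, in percent (assumes non-zero actuals)."""
    return 100.0 * mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


def fit_mean(xs, ys):
    """Hypothetical candidate 1: always predict the mean of the leaf's targets."""
    m = mean(ys)
    return lambda x: m


def fit_linear(xs, ys):
    """Hypothetical candidate 2: one-variable least-squares line y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # All inputs identical: a line is undefined, fall back to the mean.
        return fit_mean(xs, ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b = my - a * mx
    return lambda x: a * x + b


# Candidates are tried in this order; ties keep the earlier (simpler) one.
CANDIDATES = {"mean": fit_mean, "linear": fit_linear}


def select_leaf_model(xs, ys):
    """Fit every candidate on one leaf's data and keep the lowest-error one.

    Assumption: error is measured on the same data used for fitting, which
    favors flexible models; held-out data would be a safer criterion.
    """
    best = None
    for name, fit in CANDIDATES.items():
        model = fit(xs, ys)
        err = mean_percentage_error(ys, [model(x) for x in xs])
        if best is None or err < best[2]:
            best = (name, model, err)
    return best


# Example: a leaf whose (made-up) runtimes grow linearly with input size.
name, model, err = select_leaf_model([1, 2, 3, 4], [10, 20, 30, 40])
print(name, model(5), err)
```

In a full tree, a selection step of this kind would run once per leaf, so different regions of the attribute space could end up with different regression methods.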
Keywords :
Classification algorithms; Classification tree analysis; Cloud computing; Grid computing; Machine learning; Machine learning algorithms; Mesh generation; Prediction algorithms; Processor scheduling; Regression tree analysis; application resource usage; classifier tree; machine learning; regression
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cluster, Cloud and Grid Computing (CCGrid), 2010 10th IEEE/ACM International Conference on
Conference_Location :
Melbourne, Australia
Print_ISBN :
978-1-4244-6987-1
Type :
conf
DOI :
10.1109/CCGRID.2010.98
Filename :
5493447