Title :
Evaluating the Effectiveness of Information Extraction in Real-World Storage Management
Author :
Singh, Aameek ; Uttamchandani, Sandeep ; Wang, Yin
Author_Institution :
Almaden Res. Center, IBM, San Jose, CA
Abstract :
As storage deployments within enterprises continue to grow, there is an increasing need to simplify and automate their management. Existing automation tools rely on extracting information, in the form of device models and workload patterns, from raw performance data collected from devices. This paper evaluates the effectiveness of applying such information extraction techniques to real-world data collected over a period of months from the data centers of two commercial enterprises. Real-world monitor data poses several challenges that typically do not exist in controlled lab environments. Our analysis creates models using popular algorithms such as M5, CART, ARIMA, and the Fast Fourier Transform (FFT). The relative error in predicting device response time from real-world data is 40-45%, whereas a similar experiment using data from a controlled lab environment has a relative error of 25%. Bootstrapping models for the two commercial datasets took 245 mins and 477 mins respectively, which illustrates the need for mechanisms that deal effectively with large enterprise scales. We describe one such technique that clusters devices with similar hardware configurations. With a cluster size of five devices, we were able to reduce the model creation time to 94 mins and 138 mins respectively. Finally, an interesting trade-off arises between model accuracy and the computation time required to refine the model.
Keywords :
information retrieval; storage management; ARIMA; CART; M5; device models; fast Fourier transform; hardware configurations; information extraction; real-world monitor data; real-world storage management; storage deployments; workload patterns; Algorithm design and analysis; Data mining; Delay; Error analysis; Error correction; Fast Fourier transforms; Hardware; Monitoring; Radio access networks; Storage automation;
Conference_Title :
Modeling, Analysis and Simulation of Computers and Telecommunication Systems, 2008. MASCOTS 2008. IEEE International Symposium on
Conference_Location :
Baltimore, MD
Print_ISBN :
978-1-4244-2817-5
Electronic_ISBN :
1526-7539
DOI :
10.1109/MASCOT.2008.4770579