Title :
GraphLens: Mining Enterprise Storage Workloads Using Graph Analytics
Author :
Zhou, Yang; Seshadri, Sangeetha; Chiu, Lin-Kai; Liu, Ling
Author_Institution :
Georgia Inst. of Technol., Atlanta, GA, USA
Date :
June 27 - July 2, 2014
Abstract :
Conventional methods for analyzing storage workloads have centered on relational database technology combined with attribute-based classification algorithms. This paper presents a novel analytic architecture, GraphLens, for mining and analyzing real-world storage traces. The design of the GraphLens system embodies three unique features. First, we model storage traces as heterogeneous trace graphs in order to capture diverse spatial correlations and storage access patterns within a unified analytic framework. Second, we develop an innovative graph clustering method to discover interesting spatial access patterns. This enables us to better characterize important hotspots of storage access and to understand hotspot movement patterns. Third, we design a unified weighted similarity measure through an iterative learning and dynamic weight refinement algorithm. With an optimal weight assignment scheme, we can efficiently combine the correlation information for each type of storage access pattern, such as random vs. sequential and read vs. write, to identify interesting spatial correlations hidden in the traces. Extensive evaluation on real storage traces shows that GraphLens can provide scalable and reliable data analytics for better storage strategy planning and efficient data placement guidance.
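The abstract does not spell out how the unified weighted similarity measure is computed. As a rough illustration of the general idea only, the sketch below combines one similarity matrix per access-pattern type into a single matrix using normalized weights. The function name `unified_similarity`, the toy matrices, and the fixed weights are all assumptions made for this example; the paper's iterative learning and dynamic weight refinement step is not reproduced here.

```python
import numpy as np

def unified_similarity(sims, weights):
    """Combine per-pattern similarity matrices into one weighted matrix.

    Hypothetical helper: `sims` is a list of equally shaped similarity
    matrices (one per access-pattern type) and `weights` holds one
    non-negative weight per matrix. Weights are normalized to sum to 1.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * np.asarray(s, dtype=float) for wi, s in zip(w, sims))

# Toy data (invented): pairwise similarity of three storage blocks
# under read accesses and under write accesses.
read_sim = np.array([[1.0, 0.8, 0.1],
                     [0.8, 1.0, 0.2],
                     [0.1, 0.2, 1.0]])
write_sim = np.array([[1.0, 0.2, 0.6],
                      [0.2, 1.0, 0.4],
                      [0.6, 0.4, 1.0]])

# Fixed weights chosen by hand for illustration; GraphLens is described
# as learning its weights iteratively instead.
combined = unified_similarity([read_sim, write_sim], weights=[3.0, 1.0])
print(combined)
```

In this sketch a larger weight makes the corresponding access-pattern type dominate the combined similarity, which a downstream clustering step could then consume.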
Keywords :
data mining; graph theory; iterative methods; learning (artificial intelligence); pattern classification; pattern clustering; relational databases; GraphLens; attributes-based classification algorithms; data analytics; dynamic weight refinement algorithm; enterprise storage workload mining; graph analytics; heterogeneous trace graphs; innovative graph clustering method; iterative learning; optimal weight assignment scheme; relational database technology; spatial access patterns; storage access patterns; storage strategy planning; storage traces; unified analytic framework; unified weighted similarity measure; Algorithm design and analysis; Analytical models; Correlation; Data mining; Production; Servers; Weight measurement;
Conference_Title :
Big Data (BigData Congress), 2014 IEEE International Congress on
Conference_Location :
Anchorage, AK
Print_ISBN :
978-1-4799-5056-0
DOI :
10.1109/BigData.Congress.2014.11