DocumentCode :
3018874
Title :
File classification in self-* storage systems
Author :
Mesnier, Michael ; Thereska, Eno ; Ganger, Gregory R. ; Ellard, Daniel ; Seltzer, Margo
Author_Institution :
Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear :
2004
fDate :
17-18 May 2004
Firstpage :
44
Lastpage :
51
Abstract :
To tune and manage themselves, file and storage systems must understand key properties (e.g., access pattern, lifetime, size) of their various files. This paper describes how systems can automatically learn to classify the properties of files (e.g., read-only access pattern, short-lived, small in size) and predict the properties of new files, as they are created, by exploiting the strong associations between a file's properties and the names and attributes assigned to it. These associations exist, strongly but differently, in each of four real NFS environments studied. Decision tree classifiers can automatically identify and model such associations, providing prediction accuracies that often exceed 90%. Such predictions can be used to select storage policies (e.g., disk allocation schemes and replication factors) for individual files. Further, changes in associations can expose information about applications, helping autonomic system components distinguish growth from fundamental change.
Keywords :
decision trees; file organisation; pattern classification; self-adjusting systems; storage allocation; decision tree classifiers; disk allocation; file classification; file properties; replication factors; self-* storage systems; storage policies; Accuracy; Application software; Automation; Classification tree analysis; Computer errors; Decision making; Decision trees; Predictive models; Programming profession; Tuning;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Autonomic Computing, 2004. Proceedings. International Conference on
Print_ISBN :
0-7695-2114-2
Type :
conf
DOI :
10.1109/ICAC.2004.1301346
Filename :
1301346