Title :
File Block Classification by Support Vector Machine
Author :
Sportiello, Luigi ; Zanero, Stefano
Author_Institution :
Dipt. di Elettron. e Inf., Politec. di Milano, Milan, Italy
Abstract :
Retrieval of files without the support of file system structures is arguably essential for digital forensics. Files are typically stored as sequences of data blocks, which have to be reconstructed in the retrieval process. This is commonly performed, among other approaches, through file carving, in general detecting the original block sequences by means of signatures of known headers and footers of files. Of course, this creates challenges with fragmented files, where blocks belonging to different files may be interleaved. Ways to classify file blocks into file types relying on their content may provide a support to achieve a successful reconstruction. We propose to classify file blocks using Support Vector Machines (SVMs), and we do so by studying in-depth the impact of an appropriate selection of the features used in the classification process. We analyze several potential features and test their performance over a large and representative collection of file blocks and file types. We find out that SVM classifiers can achieve a good accuracy and that a specific type of features (based on byte frequency distribution) performs well across almost all of the examined file types.
Keywords :
computer forensics; information retrieval; pattern classification; support vector machines; SVM classifiers; block sequence; byte frequency distribution; digital forensics; file block; file block classification; file retrieval; file type; fragmented file; support vector machine; Complexity theory; Computational modeling; Entropy; Feature extraction; Mathematical model; Support vector machines; Training; File Block Classification; File Carving; Forensic Analysis; Machine Learning;
Conference_Titel :
Availability, Reliability and Security (ARES), 2011 Sixth International Conference on
Conference_Location :
Vienna
Print_ISBN :
978-1-4577-0979-1
Electronic_ISBN :
978-0-7695-4485-4
DOI :
10.1109/ARES.2011.52