DocumentCode :
249517
Title :
CouchFS: A High-Performance File System for Large Data Sets
Author :
Yao, Fangzhou ; Campbell, Roy H.
Author_Institution :
Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
fYear :
2014
fDate :
June 27 2014-July 2 2014
Firstpage :
784
Lastpage :
785
Abstract :
Numerous file systems have been implemented to meet the needs of today's big data era; however, many of them require specific configurations or frameworks for data processing. This paper presents CouchFS, a POSIX-compliant distributed file system for large data sets. We build CouchFS on top of CouchDB, which gives us the flexibility to handle semistructured data. Since a database behaves similarly to a file system, and CouchDB provides a highly customizable MapReduce view for indexing, CouchFS is able to achieve high-performance searching for both text and supported binary objects. This work compares search of Wikipedia data using CouchDB, PostgreSQL, and Spotlight on the HFS+ file system. We show our design of CouchFS and discuss future approaches to improve this file system.
Keywords :
Big Data; Web sites; indexing; information retrieval; parallel processing; CouchDB; CouchFS; POSIX-compliant distributed file system; PostgreSQL; Spotlight; Wikipedia data; big data era; high customizable MapReduce view; high-performance file system; high-performance searching; large data sets; Big data; Electronic publishing; Encyclopedias; Indexing; Internet;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Big Data (BigData Congress), 2014 IEEE International Congress on
Conference_Location :
Anchorage, AK
Print_ISBN :
978-1-4799-5056-0
Type :
conf
DOI :
10.1109/BigData.Congress.2014.122
Filename :
6906866