• DocumentCode
    228667
  • Title

    IndexFS: Scaling File System Metadata Performance with Stateless Caching and Bulk Insertion

  • Author

    Ren, Kai ; Zheng, Qing ; Patil, Swapnil ; Gibson, Garth

  • Author_Institution
    Carnegie Mellon Univ., Pittsburgh, PA, USA
  • fYear
    2014
  • fDate
    16-21 Nov. 2014
  • Firstpage
    237
  • Lastpage
    248
  • Abstract
    The growing size of modern storage systems is expected to exceed billions of objects, making metadata scalability critical to overall performance. Many existing distributed file systems only focus on providing highly parallel fast access to file data, and lack a scalable metadata service. In this paper, we introduce a middleware design called IndexFS that adds support to existing file systems such as PVFS, Lustre, and HDFS for scalable high-performance operations on metadata and small files. IndexFS uses a table-based architecture that incrementally partitions the namespace on a per-directory basis, preserving server and disk locality for small directories. An optimized log-structured layout is used to store metadata and small files efficiently. We also propose two client-based storm-free caching techniques: bulk namespace insertion for creation-intensive workloads such as N-N checkpointing, and stateless consistent metadata caching for hot spot mitigation. By combining these techniques, we have demonstrated IndexFS scaled to 128 metadata servers. Experiments show our out-of-core metadata throughput out-performing existing solutions such as PVFS, Lustre, and HDFS by 50% to two orders of magnitude.
  • Keywords
    cache storage; checkpointing; meta data; middleware; HDFS; IndexFS; Lustre; N-N checkpointing; PVFS; bulk insertion; bulk namespace insertion; client-based storm-free caching techniques; creation-intensive workloads; disk locality; distributed file systems; file system metadata performance scaling; high-performance operations; hot spot mitigation; log-structured layout optimization; metadata scalability; middleware design; namespace partitioning; out-of-core metadata throughput; per-directory basis; preserving server; stateless caching; stateless consistent metadata caching; storage systems; table-based architecture; Compaction; Indexes; Middleware; Receivers; Scalability; Servers; Throughput; Distributed file systems; bulk insertion; file system metadata; log-structured merge tree; stateless caching
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis, SC14: International Conference for
  • Conference_Location
    New Orleans, LA
  • Print_ISBN
    978-1-4799-5499-5
  • Type

    conf

  • DOI
    10.1109/SC.2014.25
  • Filename
    7013007