• DocumentCode
    1787634
  • Title
    Using multi-level cell STT-RAM for fast and energy-efficient local checkpointing
  • Author
    Ping Chi ; Cong Xu ; Tao Zhang ; Xiangyu Dong ; Yuan Xie
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
  • fYear
    2014
  • fDate
    2-6 Nov. 2014
  • Firstpage
    301
  • Lastpage
    308
  • Abstract
    High reliability, availability, and serviceability are critical for modern large-scale computing systems. As an effective error recovery mechanism, checkpointing is widely used in such systems to survive unexpected failures. Conventional checkpointing schemes, however, are time-consuming because of the limited I/O bandwidth between the DRAM-based main memory and the backup storage. To mitigate the checkpoint overhead, we propose a fast local checkpointing scheme that leverages the unique features of Multi-Level Cell (MLC) STT-RAM. Our experimental results show that the average performance overhead is less than 1% in a multi-programmed four-core process node with a 1-second local checkpoint interval. The evaluation results also demonstrate that using MLC STT-RAM is an energy-efficient solution.
  • Keywords
    checkpointing; microcomputers; random-access storage; DRAM-based main memory; MLC STT-RAM; backup storage; checkpointing schemes; error recovery mechanism; large-scale computing systems; multilevel cell STT-RAM; spin transfer torque random access memory; Checkpointing; Magnetic tunneling; Nonvolatile memory; Phase change random access memory; Resistance; Switches
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer-Aided Design (ICCAD), 2014 IEEE/ACM International Conference on
  • Conference_Location
    San Jose, CA
  • Type
    conf
  • DOI
    10.1109/ICCAD.2014.7001367
  • Filename
    7001367