DocumentCode :
692869
Title :
Exploring DRAM organizations for energy-efficient and resilient exascale memories
Author :
Giridhar, B. ; Cieslak, Michael ; Duggal, Deepankar ; Dreslinski, Ronald ; Hsing Min Chen ; Patti, Robert ; Hold, Betina ; Chakrabarti, Chaitali ; Mudge, Trevor ; Blaauw, D.
Author_Institution :
Univ. of Michigan, Ann Arbor, MI, USA
fYear :
2013
fDate :
17-22 Nov. 2013
Firstpage :
1
Lastpage :
12
Abstract :
The power target for exascale supercomputing is 20MW, with about 30% budgeted for the memory subsystem. Commodity DRAMs will not satisfy this requirement. Additionally, the large number of memory chips (>10M) required will result in crippling failure rates. Although specialized DRAM memories have been reorganized to reduce power through 3D-stacking or row buffer resizing, their implications on fault tolerance have not been considered. We show that addressing reliability and energy is a co-optimization problem involving tradeoffs between error correction cost, access energy and refresh power: reducing the physical page size to decrease access energy increases the energy/area overhead of error resilience. Additionally, power can be reduced by optimizing bitline lengths. The proposed 3D-stacked memory uses a page size of 4kb and consumes 5.1pJ/bit based on simulations with NEK5000 benchmarks. Scaling to 100PB, the memory consumes 4.7MW at 100PB/s which, while well within the total power budget (20MW), is also error-resilient.
Keywords :
DRAM chips; buffer storage; error correction; fault tolerant computing; mainframes; parallel machines; power aware computing; system recovery; 3D-stacked memory; 3D-stacking; DRAM memories; DRAM organizations; NEK5000 benchmarks; access energy; bitline lengths; commodity DRAM; cooptimization problem; crippling failure rates; energy-efficient resilient exascale memories; error correction cost; error resilience; exascale supercomputing; fault tolerance; memory chips; memory subsystem; power 20 MW; power 4.7 MW; row buffer resizing; Abstracts; Bandwidth; Error correction codes; Pins; Random access memory; Three-dimensional displays; Through-silicon vias;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing, Networking, Storage and Analysis (SC), 2013 International Conference for
Conference_Location :
Denver, CO
Print_ISBN :
978-1-4503-2378-9
Type :
conf
DOI :
10.1145/2503210.2503215
Filename :
6877456