Title : 
Data parallel address architecture
         
        
            Author : 
Ahn, Jung Ho ; Dally, William J.
         
        
            Author_Institution : 
Comput. Syst. Lab., Stanford Univ., CA
         
        
        
        
        
        
        
            Abstract : 
Data parallel memory systems must maintain a large number of outstanding memory references to fully use increasing DRAM bandwidth in the presence of increasing latency. At the same time, the throughput of modern DRAMs is very sensitive to access patterns, due to the time required to precharge and activate banks and to switch between read and write access. To achieve memory reference parallelism, a system may simultaneously issue references from multiple reference threads. Alternatively, multiple references from a single thread can be issued in parallel. In this paper, we examine this tradeoff and show that allowing only a single thread to access DRAM at any given time significantly improves performance by increasing the locality of the reference stream and hence reducing precharge/activate operations and read/write turnaround. Simulations of scientific and multimedia applications show that generating multiple references from a single thread gives, on average, 17% better performance than generating references from two parallel threads.
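
The locality argument in the abstract can be illustrated with a toy model. The sketch below is not the authors' simulator: it assumes a single DRAM bank with one open row, counts an activation whenever a reference targets a row other than the open one, and ignores timing, multiple banks, and read/write turnaround. The names `count_activations` and `stream`, the row numbers, and the stream lengths are all hypothetical choices made for illustration.

```python
from itertools import chain


def count_activations(rows):
    """Count row activations for one bank that keeps a single row open.

    Assumption: every reference to a row other than the currently open
    one forces a precharge/activate pair, counted here as one activation.
    """
    open_row = None
    activations = 0
    for row in rows:
        if row != open_row:
            activations += 1
            open_row = row
    return activations


def stream(first_row, num_rows, refs_per_row):
    """Build a sequential reference stream: refs_per_row hits on each row."""
    return [first_row + r for r in range(num_rows) for _ in range(refs_per_row)]


# Two hypothetical threads, each streaming through its own set of rows.
thread_a = stream(first_row=0, num_rows=4, refs_per_row=8)
thread_b = stream(first_row=100, num_rows=4, refs_per_row=8)

# Two threads issuing references at the same time, modeled as strict alternation.
interleaved = list(chain.from_iterable(zip(thread_a, thread_b)))

# Only one thread may access the DRAM at any given time.
one_at_a_time = thread_a + thread_b

interleaved_acts = count_activations(interleaved)
serial_acts = count_activations(one_at_a_time)

print("interleaved threads:", interleaved_acts, "activations")
print("one thread at a time:", serial_acts, "activations")
```

In this deliberately extreme case the alternating stream changes rows on every reference, while the single-thread stream activates each row once. Real memory controllers reorder references and spread them over several banks, so the gap in practice is far smaller than this model suggests; the abstract reports an average improvement of 17%.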
         
        
            Keywords : 
DRAM chips; parallel architectures; parallel memories; DRAM bandwidth; data parallel address architecture; data parallel memory systems; read access; write access; Bandwidth; Computer architecture; Delay; Memory management; Parallel processing; Random access memory; Scheduling; Streaming media; Switches; Yarn
         
        
        
            Journal_Title : 
Computer Architecture Letters
         
        
        
        
        
            DOI : 
10.1109/L-CA.2006.4