Title : 
Efficient use of memory bandwidth to improve network processor throughput
         
        
            Author : 
Hasan, Jahangir ; Chandra, Satish ; Vijaykumar, T.N.
         
        
            Author_Institution : 
Sch. of Electr. & Comput. Eng., Purdue Univ., West Lafayette, IN, USA
         
        
        
        
        
            Abstract : 
We consider the efficiency of packet buffers used in packet switches built using network processors (NPs). Packet buffers are typically implemented using DRAM, which provides plentiful buffering at a reasonable cost. The problem we address is that a typical NP workload may be unable to utilize the peak DRAM bandwidth. Since the bandwidth of the packet buffer is often the bottleneck in the performance of a shared-memory packet switch, inefficient use of available DRAM bandwidth further reduces the packet throughput. Specialized hardware-based schemes that alleviate the DRAM bandwidth problem in high-end routers may be less applicable to NP-based systems, in which cost is an important consideration. We propose cost-effective ways to enhance average-case DRAM bandwidth. In modern DRAMs, successive accesses falling within the same DRAM row are significantly faster than those falling across rows. If accesses to DRAM can be generated differently or reordered to take advantage of fast same-row accesses, peak DRAM bandwidth can be approached. The challenge is in exploiting this "row locality" despite the unpredictable nature of memory accesses in NPs. We propose a set of simple techniques to meet this challenge. These include locality-sensitive buffer allocation on packet input, reordering DRAM accesses to increase locality, and prefetching to reduce row miss penalty. We evaluate our techniques on cycle-accurate simulations of Intel's IXP 1200 network processor and find that they boost packet throughput on average by 42.7%, utilizing nearly the peak DRAM bandwidth, for a set of common NP applications processing a real trace.
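The row-locality idea in the abstract can be illustrated with a toy cost model. The sketch below is not the authors' mechanism: it assumes a single DRAM bank with one open row, independent accesses that may be freely reordered, and made-up values for `ROW_SIZE`, `HIT_COST`, and `MISS_COST`. The helper names `row_of`, `total_cost`, and `reorder_by_row` are hypothetical. It only shows why grouping same-row accesses within a small window reduces the number of costly row changes.

```python
ROW_SIZE = 2048   # bytes per DRAM row (assumed value, for illustration only)
HIT_COST = 1      # relative cost of an access to the already-open row (assumed)
MISS_COST = 4     # relative cost of an access that must open a new row (assumed)


def row_of(addr):
    """Map a byte address to its DRAM row index in this toy single-bank model."""
    return addr // ROW_SIZE


def total_cost(addrs):
    """Sum access costs, charging MISS_COST whenever the row changes."""
    cost = 0
    open_row = None  # no row is open before the first access
    for addr in addrs:
        row = row_of(addr)
        cost += HIT_COST if row == open_row else MISS_COST
        open_row = row
    return cost


def reorder_by_row(addrs, window=8):
    """Group same-row accesses inside each fixed-size window.

    The sort is stable, so accesses to the same row keep their original
    relative order. Assumes the accesses are independent of one another.
    """
    reordered = []
    for start in range(0, len(addrs), window):
        batch = addrs[start:start + window]
        reordered.extend(sorted(batch, key=row_of))
    return reordered


# Two streams interleaved across rows 0 and 2: every access changes the row.
interleaved = [0, 4096, 64, 4160, 128, 4224, 192, 4288]

print(total_cost(interleaved))                  # cost in original order
print(total_cost(reorder_by_row(interleaved)))  # cost after grouping by row
```

In this toy model the interleaved order pays the row-miss cost on every access, while the grouped order pays it only once per row. A real memory controller would also have to respect ordering dependencies, multiple banks, and bounded reordering latency, none of which are modeled here.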
         
        
            Keywords : 
DRAM chips; bandwidth allocation; buffer storage; multi-threading; packet switching; shared memory systems; DRAM bandwidth; Intel IXP 1200 NP; NP applications; NP throughput; locality-sensitivity buffer allocation; memory access; network processor; packet buffers; row locality; shared memory packet switching; Bandwidth; Buffer storage; Costs; Delay; Multithreading; Packet switching; Random access memory; Space technology; Switches; Throughput;
         
        
        
        
            Conference_Title : 
Computer Architecture, 2003. Proceedings. 30th Annual International Symposium on
         
        
        
            Print_ISBN : 
0-7695-1945-8
         
        
        
            DOI : 
10.1109/ISCA.2003.1207009