DocumentCode :
2396140
Title :
Parallel Processing on FPGAs: The Effect of Profiling on Performance
Author :
Li, Xiaoguang ; Areibi, Shawki ; Dony, Robert
Author_Institution :
Sch. of Eng., Guelph Univ., Ont.
fYear :
2006
fDate :
Dec. 2006
Firstpage :
179
Lastpage :
184
Abstract :
The processing elements, logic resources, and on-chip block RAMs of modern FPGAs can be used not only for prototyping custom hardware modules but also for parallel processing, by implementing multiple processors for a single task. This paper compares the performance of a single-processor implementation with two types of dual-processor implementations of the widely used radix-2 n-point FFT algorithm (Cooley and Tukey, 1965) in terms of processing speed and FPGA resource utilization. In the first dual-processor implementation, the partitioning is based on the computational complexity, O(n log n), of the radix-2 FFT algorithm. In the second implementation, the partitioning is based on a detailed profiling procedure applied to each line of the code in the single-processor implementation. The results show that the first dual-processor implementation is on average 1.3 times faster than the single-processor implementation, whereas the second dual-processor implementation is about 1.9 times faster, which is very close to the expected speedup. This result shows that detailed profiling, which takes all factors into consideration, is crucial in identifying the bottlenecks of an algorithm, so that the algorithm can be mapped efficiently onto a multiprocessor system based on a correct partitioning decision.
Keywords :
fast Fourier transforms; field programmable gate arrays; multiprocessing systems; parallel processing; random-access storage; FPGA; dual-processor; logic resources; multiple processors; multiprocessor system; on-chip block RAM; parallel processing; radix-2 n-point FFT algorithm; Acceleration; Algorithm design and analysis; Field programmable gate arrays; Hardware; Logic; Multiprocessing systems; Parallel processing; Partitioning algorithms; Software algorithms; Software performance;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
System-on-Chip for Real-Time Applications, The 6th International Workshop on
Conference_Location :
Cairo
Print_ISBN :
1-4244-0898-9
Type :
conf
DOI :
10.1109/IWSOC.2006.348232
Filename :
4155285