Title :
SIMD Vectorization of Histogram Functions
Author :
Shahbahrami, Asadollah ; Juurlink, Ben ; Vassiliadis, Stamatis
Author_Institution :
Delft Univ. of Technol., Delft
Abstract :
Existing SIMD extensions cannot efficiently vectorize the histogram function because of memory collisions. We propose two techniques to avoid this problem. The first uses a hierarchical structure of three levels. To provide n-way parallelism, auxiliary arrays with n and n/2 subarrays are used in the first and second levels, respectively; the last level holds the primary histogram array. Indirect SIMD load and store instructions are designed to access different elements of different subarrays. The subarrays in the lower levels are merged, and the final results are stored in the primary histogram array. The second technique uses parallel comparators to count the number of identical subwords within a media register; these counts are then added to the values of the histogram array simultaneously. Experimental results obtained by extending the SimpleScalar toolset show that the proposed techniques improve performance over the fastest scalar version by factors of 7.37 and 5.52, respectively.
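The two ideas in the abstract can be illustrated with a minimal scalar C sketch. This is not the authors' implementation. The paper relies on new indirect SIMD load/store instructions and hardware comparators, which plain C can only emulate with loops. The names `hist_subarrays`, `hist_compare`, `BINS`, and `LANES` are hypothetical, 8-bit pixels and 4-way parallelism are assumed, and the first function collapses the three-level hierarchy into a single level of per-lane subarrays merged directly into the primary array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BINS  256  /* assumption: 8-bit input values */
#define LANES 4    /* assumption: n = 4 elements processed per step */

/* First idea (simplified): lane l only ever updates sub[l], so two lanes
   never write the same memory location within one step. The subarrays are
   merged into the primary histogram at the end. */
void hist_subarrays(const uint8_t *px, size_t len, uint32_t hist[BINS])
{
    uint32_t sub[LANES][BINS];
    memset(sub, 0, sizeof sub);

    size_t i = 0;
    for (; i + LANES <= len; i += LANES)
        for (int l = 0; l < LANES; l++)
            sub[l][px[i + l]]++;
    for (; i < len; i++)              /* leftover elements */
        sub[0][px[i]]++;

    for (int b = 0; b < BINS; b++)    /* merge step */
        for (int l = 0; l < LANES; l++)
            hist[b] += sub[l][b];
}

/* Second idea (emulated): within each group of LANES values, count how many
   are equal to each value, then update each distinct bin once with that
   count instead of incrementing it repeatedly. */
void hist_compare(const uint8_t *px, size_t len, uint32_t hist[BINS])
{
    size_t i = 0;
    for (; i + LANES <= len; i += LANES) {
        const uint8_t *g = px + i;
        for (int a = 0; a < LANES; a++) {
            uint32_t count = 0;
            int first = 1;            /* is g[a] the first occurrence? */
            for (int b = 0; b < LANES; b++) {
                if (g[b] == g[a]) {
                    count++;
                    if (b < a)
                        first = 0;
                }
            }
            if (first)
                hist[g[a]] += count;
        }
    }
    for (; i < len; i++)              /* leftover elements */
        hist[px[i]]++;
}
```

Both functions accumulate into `hist`, so the caller is expected to zero it first. On real SIMD hardware the inner lane loops would be single vector operations; here they only show why neither scheme has two lanes updating the same bin at the same time.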
Keywords :
parallel processing; storage management; SIMD vectorization; SimpleScalar toolset; histogram function; media register; memory collision; parallel comparator; Histograms; Image processing; Image retrieval; Image segmentation; Image storage; Laboratories; Parallel processing; Pattern recognition; Pixel; Video sequences; Histogram Calculation; Multimedia Extensions; Subword Parallelism;
Conference_Title :
Application-specific Systems, Architectures and Processors, 2007. ASAP. IEEE International Conf. on
Conference_Location :
Montreal, Que.
Print_ISBN :
978-1-4244-1026-2
Electronic_ISBN :
2160-0511
DOI :
10.1109/ASAP.2007.4429976