DocumentCode :
2992873
Title :
Parallelization and characterization of SIFT on multi-core systems
Author :
Feng, Hao ; Li, Eric ; Chen, Yurong ; Zhang, Yimin
Author_Institution :
Intel China Res. Center, Applic. Res. Lab., Beijing
fYear :
2008
fDate :
14-16 Sept. 2008
Firstpage :
14
Lastpage :
23
Abstract :
This paper parallelizes and characterizes an important computer vision application, Scale Invariant Feature Transform (SIFT), on both a Symmetric Multiprocessor (SMP) platform and a large-scale Chip Multiprocessor (CMP) simulator. SIFT is an approach for extracting distinctive invariant features from images and has been widely applied. Many computer vision problems require SIFT to run in real time or even faster than real time. To meet this computational demand, we optimize and parallelize SIFT to accelerate its execution on multi-core systems. Our study shows that SIFT can achieve a 9.7x to 11x speedup on a 16-core SMP system. Furthermore, Single Instruction Multiple Data (SIMD) and cache-conscious optimizations bring up to another 85% performance gain. However, it is still three times slower than the real-time requirement for High-Definition Television (HDTV) images. We then study the performance of SIFT on a 64-core CMP simulator. The results show that for HDTV images, SIFT can achieve an excellent speedup of 52x and finally run in real time. Besides the parallelization and optimization work, we also conduct a detailed performance analysis of SIFT on these two platforms. We find that on the 16-core SMP system, load imbalance significantly limits scalability and SIFT suffers from intensive bursts of memory bandwidth demand. On the 64-core CMP simulator, however, memory pressure is not high because the shared last-level cache (LLC) accommodates the extensive read-write sharing in SIFT, so it does not affect scaling performance. In short, understanding the characteristics of SIFT can help identify program bottlenecks and give further insight into designing better systems.
Keywords :
cache storage; computer vision; feature extraction; multiprocessing systems; parallel algorithms; transforms; CMP simulator; cache-conscious optimization; computer vision application; distinctive invariant feature extraction; large scale chip multiprocessor; multicore system; parallel algorithm; scale invariant feature transform; single instruction multiple data; symmetric multiprocessor; Acceleration; Application software; Computational modeling; Computer simulation; Computer vision; Concurrent computing; Feature extraction; HDTV; Large-scale systems; Performance gain;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Workload Characterization, 2008. IISWC 2008. IEEE International Symposium on
Conference_Location :
Seattle, WA
Print_ISBN :
978-1-4244-2777-2
Electronic_ISBN :
978-1-4244-2778-9
Type :
conf
DOI :
10.1109/IISWC.2008.4636087
Filename :
4636087