Abstract :
In this paper, we present and discuss high-performance implementations of a wide class of image processing applications on a low-power, massively parallel SIMD architecture, the ClearSpeed CSX700. We present parallel implementation results for four classes of image processing applications: feature detection (Harris Corner Detector), stereo vision (a class of SSD-like algorithms), model estimation (RANSAC), and object detection (based on Histogram of Oriented Gradients, HOG). Our results indicate that this SIMD architecture is a good candidate for achieving low-power supercomputing capability while offering a satisfactory degree of flexibility for implementing various applications. We also compare our results, where applicable, with similar implementations on ASICs, FPGAs, and GPGPUs. This comparison clearly demonstrates that we achieve much better absolute computational performance than ASICs and FPGAs, with better relative performance per watt. Compared with GPGPUs, we achieve similar (and in some cases better) computational performance, but with significantly better relative performance per watt. We show that, with appropriately designed efficient parallel algorithms, this highly parallel SIMD architecture is an excellent candidate for space-borne applications in which low-power, lightweight, high-performance computation is a major requirement.
Keywords :
feature extraction; image processing; object detection; parallel architectures; stereo image processing; ASIC; CSX SIMD architecture; ClearSpeed CSX700; FPGA; GPGPU; Harris corner detector; RANSAC; feature detection; histogram of oriented gradient; image processing application; model estimation; parallel SIMD architecture; space borne application; stereo vision; Clocks; Computational modeling; Data models; Robots