Title :
Hyperfast Parallel-Beam Backprojection
Author :
Kachelriess, Marc ; Knaup, Michael ; Bockenbach, Olivier
Author_Institution :
Inst. of Med. Phys., Erlangen-Nurnberg Univ., Erlangen
fDate :
Oct. 29 2006-Nov. 1 2006
Abstract :
Tomographic image reconstruction, such as the reconstruction of CT projection values, tomosynthesis data, or PET or SPECT events, is computationally very demanding. The most time-consuming step is the backprojection, which is often limited by memory bandwidth. Recently, a novel general-purpose architecture optimized for distributed computing became available: the Cell Broadband Engine (CBE). Its eight synergistic processing elements (SPEs) currently allow for a theoretical performance of 192 GFlops (3 GHz, 8 units, 4 floats per vector, 2 instructions, multiply and add, per clock). To maximize image reconstruction speed, we took our parallel-beam backprojection algorithm, which is highly optimized for standard PCs, and modified and optimized the code for the Cell processor. Data mining techniques and double buffering of source data were used extensively to make optimal use of both the memory bandwidth and the available local store of each SPE. The pixel-driven backprojection code uses floating point arithmetic and either linear interpolation (LI) or nearest neighbor (NN) interpolation between neighboring detector channels. Performance was measured using simulated data with 512 parallel-beam projections per half rotation and 1024 detector elements. The data were backprojected into an image of 512 by 512 pixels using our PC-based approach and the new Cell-based algorithm. Both the PC and the CBE were clocked at 3 GHz. The images obtained with the two approaches were found to be identical. A throughput of 11 fps (LI) and 15 fps (NN) was measured on the PC, whereas the CBE achieved 126 fps (LI) and 165 fps (NN). The Cell thereby greatly outperforms today's top-notch backprojection implementations based on graphics processing units (GPUs). Using both CBEs of our dual Cell-based blade (Mercury Computer Systems), one can backproject 252 images per second with LI and 330 images per second with NN.
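For readers unfamiliar with the term, a pixel-driven parallel-beam backprojection with LI or NN interpolation between detector channels can be sketched as follows. This is a minimal, unoptimized illustration in plain Python and is not the authors' PC or Cell code. The function name `backproject`, its parameters, and the geometry (unit pixel size, unit channel spacing, detector centered on the rotation axis, no angular weighting) are all assumptions made for the example.

```python
import math


def backproject(sinogram, angles, n, mode="LI"):
    """Pixel-driven parallel-beam backprojection onto an n x n image.

    sinogram[a][k] is the value of projection a at detector channel k,
    angles[a] is the corresponding projection angle in radians.
    Assumed geometry (illustrative only): unit pixel size, unit channel
    spacing, detector centered on the rotation axis, at least two channels.
    The result is a plain sum over projections without angular weighting.
    """
    n_det = len(sinogram[0])
    c_img = (n - 1) / 2.0      # image center in pixel units
    c_det = (n_det - 1) / 2.0  # detector center in channel units
    image = [[0.0] * n for _ in range(n)]

    for proj, theta in zip(sinogram, angles):
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        for iy in range(n):
            y = iy - c_img
            for ix in range(n):
                x = ix - c_img
                # Fractional channel index hit by the ray through this pixel.
                s = x * cos_t + y * sin_t + c_det
                if mode == "NN":
                    # Nearest neighbor: round to the closest channel.
                    k = int(math.floor(s + 0.5))
                    if 0 <= k < n_det:
                        image[iy][ix] += proj[k]
                elif 0.0 <= s <= n_det - 1:
                    # Linear interpolation between channels k and k + 1.
                    k = min(int(s), n_det - 2)
                    w = s - k
                    image[iy][ix] += (1.0 - w) * proj[k] + w * proj[k + 1]
    return image


if __name__ == "__main__":
    # One projection at angle 0 whose value equals the channel index.
    ramp = [[0.0, 1.0, 2.0, 3.0]]
    print(backproject(ramp, [0.0], 3, mode="LI")[0])
    print(backproject(ramp, [0.0], 3, mode="NN")[0])
```

The inner loops visit every pixel once per projection, so the work scales with the number of projections times the number of pixels, and each visit reads one or two detector values. That access pattern is the reason backprojection tends to be limited by memory bandwidth rather than by arithmetic.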
Keywords :
buffer storage; data mining; distributed processing; floating point arithmetic; image reconstruction; interpolation; medical computing; medical image processing; positron emission tomography; single photon emission computed tomography; CBE; CT projection values; Mercury Computer Systems; PET; SPECT; cell based algorithm; cell broadband engine; data mining techniques; distributed computing; dual cell based blade; floating point arithmetic; general purpose architecture; hyperfast parallel beam backprojection; image reconstruction speed; linear interpolation; memory bandwidth; nearest neighbor interpolation; neighboring detector channels; parallel beam backprojection algorithm; pixel driven backprojection code; source data double buffering; synergistic processing elements; tomographic image reconstruction; tomosynthesis data; Bandwidth; Clocks; Computed tomography; Computer architecture; Distributed computing; Engines; Image reconstruction; Interpolation; Neural networks; Positron emission tomography;
Conference_Titel :
Nuclear Science Symposium Conference Record, 2006. IEEE
Conference_Location :
San Diego, CA
Print_ISBN :
1-4244-0560-2
Electronic_ISBN :
1095-7863
DOI :
10.1109/NSSMIC.2006.356533