• DocumentCode
    3338951
  • Title

    Branchless vectorized median filtering

  • Author

    Kachelriess, Marc

  • Author_Institution
    Inst. of Med. Phys. (IMP), Univ. of Erlangen-Niirnberg, Erlangen, Germany
  • fYear
    2009
  • fDate
    Oct. 24 2009-Nov. 1 2009
  • Firstpage
    4099
  • Lastpage
    4105
  • Abstract
    Median filtering is an important tool in signal or image processing. Based on the vector capabilities of modern hardware, which allows for vectorized min, max and mask operations, we provide a median algorithm of complexity O(NM) that is both branchless and vectorized. In contrast to conventional fast median filters, whose run-time is data-dependent and that can operate only on scalar data, its runtime is predictable and data-independent and it can simultaneously operate on several one-dimensional signals thereby making use of data-level parallelism. Our branchless vectorized median (BVM) filter keeps track of a sorted array from which values are deleted and to which new values are inserted. As a spin-off effect we could also use our work to provide a data sorting algorithm that is branchless and vectorized. Although it is of O(M2) computational complexity while other sort algorithms are of O(M lnM) computational complexity, at least for typical data, it may outperform other implementations for small M. This is mainly due to the fact that we have no unexpected branches which would stall the instruction pipeline and that we can simultaneously operate on vectors of data. Here, however, we focus on median filtering. BVM is compared to a median filter based on doubly linked lists (DLL) and to the median implementation of the Intel Performance Primitives (IPP) library. The comparison uses constant data and linear data, which are two unrealistic settings, and random data, which is more realistic. For constant data IPP is up to five times faster than BVM whose run-time does not depend on the data themselves. However, regarding the realistic case of median filtering noisy data BVM outperforms DLL by about a factor of 1.5 and IPP by about a factor of 2 in our CPU-based implementations, which is far more than we did expect initially.
  • Keywords
    computational complexity; high energy physics instrumentation computing; median filters; sorting; CPU-based implementations; Intel performance primitives library; branchless vectorized median filtering; computational complexity; conventional fast median filters; data sorting algorithm; data-level parallelism; doubly linked lists; image processing; instruction pipeline; median algorithm; median filtering noisy data; modern hardware; one-dimensional signals; random data; scalar data; signal processing; sorted array; spin-off effect; vector capabilities; vectorized mask operation; vectorized max operation; vectorized min operation; Computational complexity; Filtering; Filters; Hardware; Image processing; Parallel processing; Pipelines; Runtime; Signal processing; Sorting;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Nuclear Science Symposium Conference Record (NSS/MIC), 2009 IEEE
  • Conference_Location
    Orlando, FL
  • ISSN
    1095-7863
  • Print_ISBN
    978-1-4244-3961-4
  • Electronic_ISBN
    1095-7863
  • Type

    conf

  • DOI
    10.1109/NSSMIC.2009.5402362
  • Filename
    5402362