• DocumentCode
    21018
  • Title
    Parallel Reproducible Summation
  • Author
    Demmel, James; Nguyen, Hong Diep
  • Author_Institution
    Math. Dept., Univ. of California at Berkeley, Berkeley, CA, USA
  • Volume
    64
  • Issue
    7
  • fYear
    2015
  • fDate
    July 1 2015
  • Firstpage
    2060
  • Lastpage
    2070
  • Abstract
    Reproducibility, i.e. getting bitwise identical floating point results from multiple runs of the same program, is a property that many users depend on either for debugging or correctness checking in many codes [10]. However, the combination of dynamic scheduling of parallel computing resources, and floating point nonassociativity, makes attaining reproducibility a challenge even for simple reduction operations like computing the sum of a vector of numbers in parallel. We propose a technique for floating point summation that is reproducible independent of the order of summation. Our technique uses Rump's algorithm for error-free vector transformation [7], and is much more efficient than using (possibly very) high precision arithmetic. Our algorithm reproducibly computes highly accurate results with an absolute error bound of $n \cdot 2^{-28} \cdot \mathrm{macheps} \cdot \max_i |v_i|$ at a cost of 7n FLOPs and a small constant amount of extra memory usage. Higher accuracies are also possible by increasing the number of error-free transformations. As long as all operations are performed in to-nearest rounding mode, results computed by the proposed algorithms are reproducible for any run on any platform. In particular, our algorithm requires the minimum number of reductions, i.e. one reduction of an array of six double precision floating point numbers per sum, and hence is well suited for massively parallel environments.
  • Keywords
    floating point arithmetic; parallel processing; program debugging; Rump algorithm; correctness checking; dynamic scheduling; error free transformations; error free vector transformation; floating point nonassociativity; floating point numbers; floating point summation; identical floating point; parallel computing resources; parallel environments; parallel reproducible summation; reproducibility; Accuracy; Algorithm design and analysis; Computational modeling; Numerical analysis; Program processors; Standards; Vectors; Reproducibility; floating-point; numerical analysis; parallel computing; rounding error; summation
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type
    jour
  • DOI
    10.1109/TC.2014.2345391
  • Filename
    6875899
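
The abstract's central idea, summing in a way that does not depend on the order of the operands, can be illustrated with a much simplified sketch. The code below is not the paper's algorithm: it performs a single level of pre-rounding in the spirit of the error-free extraction the abstract attributes to Rump, it needs the maximum magnitude up front (an extra pass that the paper's single-reduction scheme avoids), and it is far less accurate than the bound quoted above. The function name `reproducible_sum` and the constant `M` are assumptions made for illustration. Each input is rounded to a multiple of a common unit by adding and subtracting `M`; the rounded parts then add exactly in double precision, so any summation order yields the same bits.

```python
import math
import random


def reproducible_sum(values):
    """Order-independent sum via one level of pre-rounding (illustrative only).

    Assumes finite IEEE 754 doubles, round-to-nearest mode, and inputs small
    enough that 4 * n * max|v| does not overflow.
    """
    n = len(values)
    m = max((abs(v) for v in values), default=0.0)
    if m == 0.0:
        return 0.0

    # Choose e so that 2**e > 4 * n * m. With M = 1.5 * 2**e, every M + v stays
    # inside [2**e, 2**(e+1)], where doubles are spaced by u = 2**(e - 52).
    e = math.frexp(4.0 * n * m)[1]
    M = math.ldexp(1.5, e)

    total = 0.0
    for v in values:
        # q is v rounded to a multiple of u; the subtraction itself is exact.
        q = (M + v) - M
        # Every partial sum is a multiple of u below 2**53 * u, hence exact.
        total += q
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.uniform(-1.0, 1.0) * 10.0 ** rng.randint(-8, 8) for _ in range(1000)]
    shuffled = data[:]
    rng.shuffle(shuffled)
    # The two results are bitwise identical regardless of the ordering.
    print(reproducible_sum(data) == reproducible_sum(shuffled))
```

The price of this one-level version is accuracy: each input loses up to half a unit `u`, which grows with both `n` and the largest magnitude. Applying the extraction repeatedly to the discarded low-order parts, as the abstract's remark about increasing the number of error-free transformations suggests, is how tighter bounds would be obtained.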