• DocumentCode
    3351323
  • Title

    Adaptive data partition for sorting using probability distribution

  • Author

    Shen, Xipeng ; Ding, Chen

  • Author_Institution
    Dept. of Comput. Sci., Rochester Univ., NY, USA
  • fYear
    2004
  • fDate
    15-18 Aug. 2004
  • Firstpage
    250
  • Abstract
    Many computing problems benefit from dynamic partition of data into smaller chunks with better parallelism and locality. However, it is difficult to partition all types of inputs with the same high efficiency. This paper presents a new partition method in sorting scenario based on probability distribution, an idea first studied by Janus and Lamagna in early 1980´s on a mainframe computer. The new technique makes three improvements. The first is a rigorous sampling technique that ensures accurate estimate of the probability distribution. The second is an efficient implementation on modern, cache-based machines. The last is the use of probability distribution in parallel sorting. Experiments show 10-30% improvement in partition balance and 20-70% reduction in partition overhead, compared to two commonly used techniques. The new method reduces the parallel sorting time by 33-50% and outperforms the previous fastest sequential sorting technique by up to 30%.
  • Keywords
    cache storage; parallel algorithms; parallel machines; performance evaluation; probability; sampling methods; sorting; adaptive data partition; cache-based machine; mainframe computer; parallel sorting algorithm; probability distribution; sampling technique; Biology computing; Computer science; Concurrent computing; Discrete event simulation; Distributed computing; Parallel processing; Partitioning algorithms; Probability distribution; Sampling methods; Sorting;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Parallel Processing, 2004. ICPP 2004. International Conference on
  • ISSN
    0190-3918
  • Print_ISBN
    0-7695-2197-5
  • Type

    conf

  • DOI
    10.1109/ICPP.2004.1327928
  • Filename
    1327928