DocumentCode
3351323
Title
Adaptive data partition for sorting using probability distribution
Author
Shen, Xipeng ; Ding, Chen
Author_Institution
Dept. of Comput. Sci., Rochester Univ., NY, USA
fYear
2004
fDate
15-18 Aug. 2004
Firstpage
250
Abstract
Many computing problems benefit from dynamic partition of data into smaller chunks with better parallelism and locality. However, it is difficult to partition all types of inputs with the same high efficiency. This paper presents a new partition method in sorting scenario based on probability distribution, an idea first studied by Janus and Lamagna in early 1980´s on a mainframe computer. The new technique makes three improvements. The first is a rigorous sampling technique that ensures accurate estimate of the probability distribution. The second is an efficient implementation on modern, cache-based machines. The last is the use of probability distribution in parallel sorting. Experiments show 10-30% improvement in partition balance and 20-70% reduction in partition overhead, compared to two commonly used techniques. The new method reduces the parallel sorting time by 33-50% and outperforms the previous fastest sequential sorting technique by up to 30%.
Keywords
cache storage; parallel algorithms; parallel machines; performance evaluation; probability; sampling methods; sorting; adaptive data partition; cache-based machine; mainframe computer; parallel sorting algorithm; probability distribution; sampling technique; Biology computing; Computer science; Concurrent computing; Discrete event simulation; Distributed computing; Parallel processing; Partitioning algorithms; Probability distribution; Sampling methods; Sorting;
fLanguage
English
Publisher
ieee
Conference_Titel
Parallel Processing, 2004. ICPP 2004. International Conference on
ISSN
0190-3918
Print_ISBN
0-7695-2197-5
Type
conf
DOI
10.1109/ICPP.2004.1327928
Filename
1327928
Link To Document