CyclicBarrier example: a parallel sort algorithm
As an example of using the Java CyclicBarrier class,
we consider a parallel sort. There are various ways to sort data in parallel, but the
approach we consider here lends itself to using a CyclicBarrier. It also has
the advantage that we're essentially building on available library routines: we divide
the work into a sublist per thread, and then each thread uses the standard
to sort its sublist.
Parallel sort algorithm with CyclicBarrier
we: (a) split the data into one "bucket" per thread, where each bucket contains an
approximately equal number of values, and each bucket covers values within a particular
range; (b) each thread sorts its bucket; (c) the buckets are then put together in order:
since each bucket covered values within a particular range, simply appending the buckets
in thread order results in a completely sorted list.
The actual sorting of the buckets is obviously parallelisable. We can also parallelise
the process of putting the data into buckets: for this, each thread goes through an equal-sized
portion of the list and, from that sublist, decides which bucket to put each piece of data
The slightly more subtle problem— and subtly parallelisable problem—
is deciding on values for the "split points". For example, if we want to sort the data using
two threads, we need to find some value of x so that putting values less than x
into bucket 1 and values more than or equal to x into bucket 2 will result in
roughly equal-sized buckets. In this case, x is the "split point".
Wtih 3 threads, we need 2 split points; 4 threads needs 3 split points etc.
If we know
in advance that our data is evenly distributed across the possible range, then we can
decide on split points with a simple division. For example, with 2 threads, if
each piece of data can take a value between 0 and 100,000 and all values have equal
probability, then we can set x to be 50,000, and we know that the buckets are
likely to be roughly equal in size. But what if the distirbution is more subtle: say,
values below 20,000 are twice as likely, or values are more probable the closer they
are to some number?
One method that would work, but which would be utterly pointless, would be to sort the
data (in a single thread), and then take x to be the value halfway through the list:
that way, we obviously would know that half of the values were smaller than x. And scaling
to larger numbers of threads, taking split points at equal intervals in the sorted list,
we'd know for sure that an equal number of numbers in the list
had values between the given split points.
Now obviously, there's little point on deciding on how to sort a list in parralel
by first sorting it in a single thread! But what we can do is
sort a small random sample of the list and take the split points from that.
And although we sort the sample list in a single thread1,
we can pick the random sample in parallel: each thread chooses, say, 16 random
values from a portion of the list.
On the next page, we summarise the stages we've just described, and then look at
how to code them using a Java CyclicBarrier.
1. Actually, using a construct called a sorting network,
we can sort a smallish list in parallel, and this is the approach suggested by
Herlihy & Shavit (2008),
The Art of Multiprocessor Programming, pp. 290-291. Largely for simplicity of illustration (but also because I'm skeptical that it's
worth the effort in this case), I choose to sort the sample list in a single thread.