Java threading introduction  Thread-safety  Thread methods  Interruption  Thread scheduling  Context switching  Thread priorities  sleep()  yield()  Deadlock  Threading with Swing  invokeLater()  Thread pools  CoundDownLatch  ThreadPoolExecutor  CyclicBarrier

CyclicBarrier example: a parallel sort algorithm

As an example of using the Java CyclicBarrier class, we consider a parallel sort. There are various ways to sort data in parallel, but the approach we consider here lends itself to using a CyclicBarrier. It also has the advantage that we're essentially building on available library routines: we divide the work into a sublist per thread, and then each thread uses the standard Collections.sort() to sort its sublist.

Parallel sort algorithm with CyclicBarrier

Essentially, we: (a) split the data into one "bucket" per thread, where each bucket contains an approximately equal number of values, and each bucket covers values within a particular range; (b) each thread sorts its bucket; (c) the buckets are then put together in order: since each bucket covered values within a particular range, simply appending the buckets in thread order results in a completely sorted list.

The actual sorting of the buckets is obviously parallelisable. We can also parallelise the process of putting the data into buckets: for this, each thread goes through an equal-sized portion of the list and, from that sublist, decides which bucket to put each piece of data into.

Split points

The slightly more subtle problem— and subtly parallelisable problem— is deciding on values for the "split points". For example, if we want to sort the data using two threads, we need to find some value of x so that putting values less than x into bucket 1 and values more than or equal to x into bucket 2 will result in roughly equal-sized buckets. In this case, x is the "split point". Wtih 3 threads, we need 2 split points; 4 threads needs 3 split points etc.

If we know in advance that our data is evenly distributed across the possible range, then we can decide on split points with a simple division. For example, with 2 threads, if each piece of data can take a value between 0 and 100,000 and all values have equal probability, then we can set x to be 50,000, and we know that the buckets are likely to be roughly equal in size. But what if the distirbution is more subtle: say, values below 20,000 are twice as likely, or values are more probable the closer they are to some number?

One method that would work, but which would be utterly pointless, would be to sort the data (in a single thread), and then take x to be the value halfway through the list: that way, we obviously would know that half of the values were smaller than x. And scaling to larger numbers of threads, taking split points at equal intervals in the sorted list, we'd know for sure that an equal number of numbers in the list had values between the given split points.

Now obviously, there's little point on deciding on how to sort a list in parralel by first sorting it in a single thread! But what we can do is sort a small random sample of the list and take the split points from that. And although we sort the sample list in a single thread1, we can pick the random sample in parallel: each thread chooses, say, 16 random values from a portion of the list.

Next

On the next page, we summarise the stages we've just described, and then look at how to code them using a Java CyclicBarrier.


1. Actually, using a construct called a sorting network, we can sort a smallish list in parallel, and this is the approach suggested by Herlihy & Shavit (2008), The Art of Multiprocessor Programming, pp. 290-291. Largely for simplicity of illustration (but also because I'm skeptical that it's worth the effort in this case), I choose to sort the sample list in a single thread.

comments powered by Disqus

 Java threading articles  Java threading and concurrency  Java profiling  Java performance graph index

Unless otherwise stated, the Java programming articles and tutorials on this site are written by Neil Coffey. Suggestions are always welcome if you wish to suggest topics for Java tutorials or programming articles, or if you simply have a programming question that you would like to see answered on this site. Most topics will be considered. But in particular, the site aims to provide tutorials and information on topics that aren't well covered elsewhere, or on Java performance information that is poorly described or understood. Suggestions may be made via the Javamex blog (see the site's front page for details).
Copyright © Javamex UK 2014. All rights reserved.