Thread scheduling implications in Java

Our discussion of how threads work, and in particular of how thread scheduling operates on typical platforms, has implications for what we can expect from Java threads and places some restrictions on them. We outline these here. In some cases, we point to separate pages with fuller discussions.

Thread control

Firstly, the way that thread scheduling works has implications for various Java methods that control threads (which generally interact with the underlying operating system thread APIs):

  • the granularity and responsiveness of the Thread.sleep() method is largely determined by the scheduler's interrupt period and by how quickly the slept thread becomes the "chosen" thread again;
  • the precise function of the setPriority() method depends on the specific OS's interpretation of priority (and which underlying API call Java actually uses when several are available): for more information, see the more detailed section on thread priority;
  • the behaviour of the Thread.yield() method is similarly determined by what the particular underlying API calls do, and by which of them is actually chosen by the VM implementation.
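
As a rough illustration of the first point, the following sketch times a few short sleeps. The class and method names are our own, invented for this example. On most systems we would expect the measured time to differ somewhat from the time requested, by an amount that depends on the OS and its scheduler, but the exact figures will vary from machine to machine.

```java
// Hypothetical sketch: the class and method names are illustrative only.
class SleepGranularity {

    /** Returns the wall-clock time, in nanoseconds, that a sleep of the requested length took. */
    static long measureSleepNanos(long requestedMillis) {
        long start = System.nanoTime();
        try {
            Thread.sleep(requestedMillis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report the time slept so far.
            Thread.currentThread().interrupt();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        for (long requested : new long[] {1, 2, 5, 10}) {
            long actualNanos = measureSleepNanos(requested);
            System.out.printf("requested %d ms, measured %.3f ms%n",
                    requested, actualNanos / 1_000_000.0);
        }
    }
}
```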

"Granularity" of threads

Although our introduction to threading focussed on how to create a thread, it turns out that it isn't appropriate to create a brand new thread just for a very small task. Threads are actually quite a "coarse-grained" unit of execution, for reasons that are hopefully becoming clear from the previous sections.

Overhead and limits of creating and destroying threads

We mentioned that certain structures need to be allocated and deallocated when a thread is created or killed, including a stack and some kind of thread status structure or "control block". In particular, the latter is linked into global, shared data about the currently running threads, and access to it requires proper synchronization by the OS. The upshot is that:

  • creating and tearing down threads isn't free: there'll be some CPU overhead each time we do so;
  • there may be some moderate limit on the number of threads that can be created, determined by the resources that each thread needs to have allocated (if a process has 2GB of address space, and each thread has 512K of stack, that means a maximum of a few thousand threads per process).

Although it's rare to do so, as of Java 1.4, it is possible to specify a stack size to the Thread constructor.
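
A minimal sketch of both points follows: the arithmetic behind the "few thousand threads" estimate, and the four-argument Thread constructor that accepts a stack size. The class, method and thread names are hypothetical. Note that the Thread documentation describes the stack size as a suggestion which the VM is free to round or ignore, so this should not be read as a guarantee about how much memory is actually reserved.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: the class, method and thread names are illustrative only.
class StackSizeSketch {

    /** Rough upper bound on thread count if stacks were the only consumer of address space. */
    static long maxThreads(long addressSpaceBytes, long stackBytesPerThread) {
        return addressSpaceBytes / stackBytesPerThread;
    }

    /** Runs a trivial task on a thread created with a requested stack size; returns true if it ran. */
    static boolean runWithStackSize(long stackBytes) {
        AtomicBoolean ran = new AtomicBoolean(false);
        // Thread(ThreadGroup, Runnable, String, long): a null group means "use the default group".
        // The final argument is only a hint to the VM.
        Thread worker = new Thread(null, () -> ran.set(true), "small-stack-worker", stackBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ran.get();
    }

    public static void main(String[] args) {
        long addressSpace = 2L * 1024 * 1024 * 1024; // 2GB
        long stackSize = 512L * 1024;                // 512K
        System.out.println("Approximate thread limit: " + maxThreads(addressSpace, stackSize));
        System.out.println("Task ran: " + runWithStackSize(stackSize));
    }
}
```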

Avoiding thread overhead in Java

In applications such as servers that need to continually execute short, multithreaded tasks, the usual way to avoid the overhead of repeated thread creation is to create a thread pool. That is, a number of threads are initially created and then sit permanently waiting for jobs to be sent to them.

From Java 5, the Java API includes the ThreadPoolExecutor and various related classes in the java.util.concurrent package for implementing job queues and thread pools.
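
As a minimal sketch of the idea, the example below submits a batch of short jobs to a fixed-size pool created with Executors.newFixedThreadPool() and collects the results. The class name, the method name and the "sum of squares" job are invented for illustration; a real server would typically keep the pool alive for the lifetime of the application rather than creating and shutting it down per batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: the class name and the job itself are illustrative only.
class PoolSketch {

    /** Submits one short job per number in 1..jobCount to a fixed pool and sums the results. */
    static int sumOfSquares(int jobCount, int poolSize) {
        // The pool's threads are reused across jobs instead of being created per job.
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 1; i <= jobCount; i++) {
                final int n = i;
                Callable<Integer> job = () -> n * n;
                results.add(pool.submit(job));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get(); // blocks until that job has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown(); // let the worker threads exit once queued jobs are done
        }
    }

    public static void main(String[] args) {
        System.out.println("Sum of squares 1..10: " + sumOfSquares(10, 4));
    }
}
```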

Context and process switching

Switching between threads will have some overhead:

  • the thread scheduler must actually manage the various thread structures and make decisions about which thread to schedule next and on which CPU; and every time the thread running on a CPU actually changes (often referred to as a context switch), there'll be some negative impact due to e.g. the interruption of the instruction pipeline or the fact that the contents of the processor cache may no longer be relevant;
  • switching between threads of different processes (that is, switching to a thread that belongs to a different process from the one last running on that CPU) will carry a higher cost, since the address-to-memory mappings must be changed, and the contents of the cache almost certainly will be irrelevant to the next process.

Context switches appear typically to cost somewhere between 1 and 10 microseconds (i.e. between a thousandth and a hundredth of a millisecond), with the fastest case being a switch between same-process threads with little memory contention and the slowest a switch between different processes. So the following are acceptable:

  • a modest number of fast-case switches: e.g. a thousand per second per CPU will generally account for much less than 1% of CPU usage for the context switch per se;
  • a few slower-case switches in a second, but where each switched-in thread can do, say, a millisecond or so's worth of real work (and ideally several milliseconds) once switched in; the more memory addresses the thread accesses (or the more cache lines it hits), the more milliseconds we want it to run uninterrupted for.
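
The arithmetic behind the first of these figures can be sketched as follows. The class and method names are hypothetical, and the calculation counts only the direct cost of the switch itself, not knock-on effects such as a cold cache.

```java
// Hypothetical sketch: the class and method names are illustrative only.
class SwitchCost {

    /** Fraction of one CPU's time (between 0 and 1) spent purely on the context switches themselves. */
    static double overheadFraction(long switchesPerSecond, double microsPerSwitch) {
        double microsPerSecond = 1_000_000.0;
        return switchesPerSecond * microsPerSwitch / microsPerSecond;
    }

    public static void main(String[] args) {
        // A thousand fast-case switches per second at roughly 1 microsecond each.
        System.out.println("Fast case: " + overheadFraction(1000, 1.0) * 100 + "% of one CPU");
        // The same number of switches at the slow end of the range, roughly 10 microseconds each.
        System.out.println("Slow case: " + overheadFraction(1000, 10.0) * 100 + "% of one CPU");
    }
}
```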

So the worst case is generally where we have several "juggling" threads which each time they are switched in only do a tiny amount of work (but do some work, thus hitting memory and contending with one another for resources) before context switching.

What causes too many slow context switches in Java?

Every time we deliberately change a thread's status or attributes (e.g. by sleeping, waiting on an object, changing the thread's priority etc), we will cause a context switch. But usually we don't do those things so many times in a second as to matter. Typically, excessive context switching is caused by contention on shared resources, particularly synchronized locks:

  • rarely, a single object very frequently synchronized on could become a bottleneck;
  • more frequently, a complex application has several different objects that are each synchronized on with moderate frequency, but overall, threads find it difficult to make progress because they keep hitting different contended locks at regular intervals.

The second case is generally worse, because the juggling threads, each time they make a tiny bit of progress, fight for shared CPU cache, thus making each other less efficient each time they're switched in.

Avoiding contention and context switches in Java

Firstly, before hacking at your code, a first course of action is to upgrade your JVM, particularly if you are not yet using Java 6. Most new JVM releases have come with improved synchronization optimisation.

Then, a high-level solution to avoiding synchronized lock contention is generally to use the various classes from the Java 5 concurrency framework (see the java.util.concurrent package). For example, compared with a HashMap with appropriate synchronization, a ConcurrentHashMap can easily double the throughput with 4 threads and treble it with 8 threads (see the aforementioned link for some ConcurrentHashMap performance measurements). An alternative to synchronized, often with better concurrency, is offered by the various explicit lock classes (such as ReentrantLock).
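
The sketch below shows what using these classes might look like: several threads update per-key counts in a ConcurrentHashMap without any external lock, and a ReentrantLock guards a separate running total. The class, field and key names are invented for this example, and the merge() call assumes Java 8 or later. It demonstrates correctness only and makes no claim about throughput.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: the class, field and key names are illustrative only.
class SharedCounts {
    private final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
    private final ReentrantLock totalLock = new ReentrantLock();
    private long total; // guarded by totalLock

    void record(String key) {
        // Atomic per-key update; threads touching different keys rarely block each other.
        counts.merge(key, 1, Integer::sum);
        // Explicit lock: always release in a finally block.
        totalLock.lock();
        try {
            total++;
        } finally {
            totalLock.unlock();
        }
    }

    int count(String key) {
        return counts.getOrDefault(key, 0);
    }

    long total() {
        totalLock.lock();
        try {
            return total;
        } finally {
            totalLock.unlock();
        }
    }

    /** Starts the given number of threads, each recording perThread events spread over 4 keys. */
    static SharedCounts runDemo(int threads, int perThread) {
        SharedCounts shared = new SharedCounts();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.record("k" + (i % 4));
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        SharedCounts result = runDemo(4, 1000);
        System.out.println("k0 = " + result.count("k0") + ", total = " + result.total());
    }
}
```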

At a lower level, solutions include holding on to locks for less time and (as part of this), reducing the "housekeeping" involved in managing a lock. The Java 5 atomic classes such as AtomicInteger effectively provide a way to access a shared variable with "less housekeeping", thus improving throughput.
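
For instance, a shared counter can be written with AtomicInteger instead of a synchronized block. The following is a minimal sketch with hypothetical class and method names; it shows that concurrent increments are not lost, but does not itself measure any throughput difference.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the class and method names are illustrative only.
class HitCounter {
    private final AtomicInteger hits = new AtomicInteger();

    /** Atomically increments the counter without taking a lock; returns the new value. */
    int hit() {
        return hits.incrementAndGet();
    }

    int current() {
        return hits.get();
    }

    /** Starts the given number of threads, each calling hit() perThread times; returns the final count. */
    static int runDemo(int threads, int perThread) {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.hit();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.current();
    }

    public static void main(String[] args) {
        System.out.println("Final count: " + runDemo(8, 10_000));
    }
}
```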



Unless otherwise stated, the Java programming articles and tutorials on this site are written by Neil Coffey. Suggestions are always welcome if you wish to suggest topics for Java tutorials or programming articles, or if you simply have a programming question that you would like to see answered on this site. Most topics will be considered. But in particular, the site aims to provide tutorials and information on topics that aren't well covered elsewhere, or on Java performance information that is poorly described or understood. Suggestions may be made via the Javamex blog (see the site's front page for details).
Copyright © Javamex UK 2010. All rights reserved.