ConcurrentHashMap

The Java ConcurrentHashMap is a version of the standard Java HashMap that is optimised for efficient access by multiple threads at the same time. Maps (also referred to as dictionaries) are an important data structure used in many applications where we need to associate one piece of data with another. We usually talk about associating keys with values. This type of association crops up in a range of cases such as:

Caches and lookups: for example, after reading the contents of a given file or database table, we could associate the file name with its contents and hold it in a HashMap representing an in-memory cache (similarly, we could associate a database key with a representation of the row data, associate a web server session ID with user data...);
Dictionaries: for example, we could associate locale abbreviations with a language name;
Sparse arrays: by mapping integers to values, we in effect create an array which does not waste space on blank elements.

Many of these cases are precisely the types of data that we need to work with in, for example, a multi-threaded server application. Ordinarily, using a HashMap is not thread-safe: if multiple threads access the same map simultaneously and at least one of them modifies it, there is a risk that the map will become corrupted or that threads will not read the correct data. The Java synchronized keyword provides a means to add thread-safety, but under high contention, using synchronized is potentially inefficient. And as discussed, maps are an extremely useful data structure that we frequently want to use in applications where performance is a concern.
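To make the synchronized approach concrete, here is a minimal sketch of a cache that guards a plain HashMap with synchronized methods. The class name SynchronizedCacheSketch, the fillFromThreads() helper and the "file-..." keys are hypothetical names chosen for illustration. The result is correct, but every get() and put() competes for the same single lock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class: a HashMap made thread-safe with one lock.
class SynchronizedCacheSketch {
    private final Map<String, String> map = new HashMap<>();

    // Each method locks the whole cache object, so only one thread
    // at a time can read or write, whichever key it is interested in.
    public synchronized String get(String key) {
        return map.get(key);
    }

    public synchronized void put(String key, String value) {
        map.put(key, value);
    }

    public synchronized int size() {
        return map.size();
    }

    // Starts several threads that each add distinct keys, then
    // returns the final number of entries.
    static int fillFromThreads(int threads, int perThread) {
        SynchronizedCacheSketch cache = new SynchronizedCacheSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    cache.put("file-" + id + "-" + i, "contents");
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromThreads(4, 1000)); // prints 4000
    }
}
```

No entries are lost because every access is serialised. The cost is that threads working on unrelated keys still have to wait for one another.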

How `ConcurrentHashMap` works to overcome synchronization overhead

The ConcurrentHashMap improves concurrent performance by taking advantage of how HashMaps store their data: the data is distributed into different "buckets" in memory. When we call put() or get() on a ConcurrentHashMap, the map therefore only needs to be temporarily locked on the specific bucket of data being accessed, rather than on the whole map. (Whereas if we synchronized on the get() and put() methods, we would lock on the entire map and hence reduce throughput.) In fact, the ConcurrentHashMap implementation goes further and is designed so that:

writing to a ConcurrentHashMap locks only a portion of the map;
reads can generally occur without locking.
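The idea of locking only part of the map can be illustrated with a deliberately simplified model. The sketch below is not how the JDK implements ConcurrentHashMap: the class StripedMapSketch and its fixed number of "stripes" are assumptions made for illustration, and unlike the real class it also takes a lock on reads. It only shows how the key's hash code can select which small portion of the data to lock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of per-portion locking.
// NOT the JDK implementation of ConcurrentHashMap.
class StripedMapSketch<K, V> {
    // Each stripe is an independent HashMap that doubles as its own lock.
    private final Map<K, V>[] stripes;

    @SuppressWarnings("unchecked")
    StripedMapSketch(int stripeCount) {
        stripes = (Map<K, V>[]) new Map[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new HashMap<>();
        }
    }

    // The key's hash code decides which stripe holds the key.
    private Map<K, V> stripeFor(Object key) {
        return stripes[Math.floorMod(key.hashCode(), stripes.length)];
    }

    public V put(K key, V value) {
        Map<K, V> stripe = stripeFor(key);
        synchronized (stripe) { // locks one stripe, not the whole map
            return stripe.put(key, value);
        }
    }

    public V get(K key) {
        Map<K, V> stripe = stripeFor(key);
        synchronized (stripe) { // the real class generally avoids this lock
            return stripe.get(key);
        }
    }

    public int size() {
        int total = 0;
        for (Map<K, V> stripe : stripes) {
            synchronized (stripe) {
                total += stripe.size();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        StripedMapSketch<String, Integer> map = new StripedMapSketch<>(16);
        map.put("en", 1);
        map.put("fr", 2);
        System.out.println(map.get("en") + " " + map.size()); // prints 1 2
    }
}
```

Two threads calling put() with keys that fall into different stripes never block each other here, which is the property that gives the throughput gain over a single map-wide lock.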

When to use `ConcurrentHashMap`

ConcurrentHashMap is recommended instead of a standard HashMap whenever the map will be modified by multiple threads:

it provides improved throughput compared to a synchronized HashMap;
it allows concurrent reads and modifications to be performed safely;
it provides atomic operations for query-then-update ("get-set") operations;
it provides memory visibility guarantees.
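As a short illustration of the atomic query-then-update operations, the sketch below uses two methods that ConcurrentHashMap provides: merge() and putIfAbsent(). The class name AtomicUpdateSketch, the helper methods and the keys and values are hypothetical. Each call performs its check and its update as a single step, so no additional locking is needed around it.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical example class showing atomic "get-set" style updates.
class AtomicUpdateSketch {

    // Several threads increment the same counter. merge() reads the
    // current value and stores the new one atomically.
    static int countConcurrently(int threads, int perThread) {
        ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    hits.merge("home", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return hits.get("home");
    }

    // putIfAbsent() only stores a value when the key has no mapping yet,
    // so the second call leaves the first value in place.
    static String firstWriterWins() {
        ConcurrentMap<String, String> sessions = new ConcurrentHashMap<>();
        sessions.putIfAbsent("session-1", "alice");
        sessions.putIfAbsent("session-1", "bob");
        return sessions.get("session-1");
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 1000)); // prints 4000
        System.out.println(firstWriterWins());          // prints alice
    }
}
```

Writing the counter update as a separate get() followed by put() would not be safe, even on a ConcurrentHashMap, because another thread could change the value between the two calls.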

There is little downside to using ConcurrentHashMap, other than that it will generally consume more memory than the equivalent standard HashMap.

Next: throughput and scalability of ConcurrentHashMap vs synchronized HashMap

The benefits of ConcurrentHashMap over a regular synchronized HashMap become clearly apparent when we run a small experiment to simulate what might happen in the case of a map used as a frequently-accessed cache on a moderately busy server. On the next page, we discuss the scalability of ConcurrentHashMap in light of such a simulation.
