ConcurrentHashMap<K, V>

Language/Java

ConcurrentHashMap<K, V>

사과만쥬 2024. 10. 3. 12:00

ConcurrentHashMap<K, V>

해시맵(Map)의 동시성 버전으로, 다수의 스레드가 동시에 데이터를 삽입, 수정, 삭제할 수 있도록 안전한 구조를 제공합니다.
Hashtable과 비슷하지만 HashMap과 달리 이 클래스는 null을 키나 값으로 사용하는 것을 허용하지 않습니다.

Like Hashtable but unlike HashMap, this class does not allow null to be used as a key or value.

Java 1.5에서 도입되었으며, Java 8에서는 ConcurrentHashMap에 대한 동시성 API가 개선되었으며, HashMap과 달리 ConcurrentHashMap은 멀티스레드 환경에서 안전하게 사용할 수 있습니다.

public class ConcurrentHashMap<K,V> extends AbstractMap<K,V>
    implements ConcurrentMap<K,V>, Serializable {
    private static final long serialVersionUID = 7249069246763182397L;

Serializable을 implement하면서 클래스의 직렬화가 가능해진다.

Serializability of a class is enabled by the class implementing the java.io.Serializable interface.

이 해시 테이블의 목적은 크게 두 가지로 나뉘는데,

1) 업데이트 경합을 최소화하면서 동시 가독성을 유지하는 것이다. (특히, get() 뿐만 아니라 iterator(반복자) 및 관련 메서드도 포함함)

2) java.util.HashMap과 비슷한 수준이나 더 나은 수준으로 공간 소비를 유지하고, 많은 스레드를 이용하면서 빈 테이블에 높은 초기 삽입 속도를 지원하는 것이다.

The primary design goal of this hash table is to maintain concurrent readability (typically method get(), but also iterators and related methods) while minimizing update contention.
Secondary goals are to keep space consumption about the same or better than java.util.HashMap, and to support high initial insertion rates on an empty table by many threads.

각 key-value 매핑은 노드에 보관된다.

Each key-value mapping is held in a Node.

버킷 기반 잠금 메커니즘: ConcurrentHashMap은 각 버킷에 개별적인 잠금을 사용하여 성능을 향상시킵니다. 이는 전체 맵을 잠그는 대신, 맵의 일부만 잠그는 방식으로, synchronizedMap과 비교해 더 나은 성능을 제공합니다.
동시 읽기 작업: Java 8의 ConcurrentHashMap은 읽기 작업을 동시 수행할 수 있으며, put이나 remove 같은 쓰기 작업 중에도 읽기 작업이 블로킹되지 않아 성능이 향상됩니다.

버킷 기반 잠금 메커니즘과 동시 읽기 작업에 대해 더 설명해보자면

// concurrenthashmap code

public class ConcurrentHashMap<K,V> extends AbstractMap<K,V>
    implements ConcurrentMap<K,V>, Serializable {

    public V get(Object key) {}

    public boolean containsKey(Object key) { }

    public V put(K key, V value) {
        return putVal(key, value, false);
    }

    final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int binCount = 0;
        for (Node<K,V>[] tab = table;;) {
            Node<K,V> f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                if (casTabAt(tab, i, null,
                             new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab, f);
            else {
                V oldVal = null;
                synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node<K,V> pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node<K,V>(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        else if (f instanceof TreeBin) {
                            Node<K,V> p;
                            binCount = 2;
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
    }
}

위의 코드가 ConcurrentHashMap 코드이고, 아래의 코드는 HashTable 코드이다.

// hashtable 코드

public class Hashtable<K,V>
    extends Dictionary<K,V>
    implements Map<K,V>, Cloneable, java.io.Serializable {

    public synchronized int size() { }

    @SuppressWarnings("unchecked")
    public synchronized V get(Object key) { }

    public synchronized V put(K key, V value) { }
}

Hashtable 코드를 먼저 살펴 보면,

    public synchronized int size() { }

    @SuppressWarnings("unchecked")
    public synchronized V get(Object key) { }

    public synchronized V put(K key, V value) { }

get, put 메소드 앞에 synchonized가 하나하나 붙어있는 것을 알 수 있다.

그러나 concurrenthashmap의 경우에는 아래 코드를 참고해 보시면,

get 메소드에는 synchonized가 아예 붙어있지 않고, put() 메소드에는 중간에 키워드가 존재한다.

// concurrenthashmap get method

public V get(Object key) {}

// concurrent put method

final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int binCount = 0;
        for (Node<K,V>[] tab = table;;) {
            Node<K,V> f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                if (casTabAt(tab, i, null,
                             new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab, f);
            else {
                V oldVal = null;
                synchronized (f) { // 여기에만 synchronized가 붙어 있음
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node<K,V> pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node<K,V>(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        else if (f instanceof TreeBin) {
                            Node<K,V> p;
                            binCount = 2;
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
    }
}

이로 인해서 concurrenthashmap의 경우에는 읽기 작업에는 여러 쓰레드가 동시에 읽을 수 있지만, 쓰기 작업에는 lock을 사용하고 있습니다. 그 결과 여러 번 작업해도 영향을 주지 않을 읽기 작업으로 인해 쓰기 작업이 방해되는 경우를 막을 수 있는 것 같습니다.

안전한 반복자: ConcurrentHashMap의 반복자는 fail-safe로, 반복 도중에 데이터 구조가 변경되더라도 ConcurrentModificationException을 발생시키지 않습니다.
병렬 작업: Java 8에서는 forEach, search, reduce와 같은 병렬 작업이 도입되었으며, 맵 크기에 따라 스레드 수를 조정해 ForkJoinPool을 사용하여 효율적으로 실행됩니다.

병렬 작업에 대해서는 공식 문서를 찾아서 조금 더 설명합니다.

forEach: 각 요소에 대해 지정된 작업을 수행합니다. 변형 형식은 작업을 수행하기 전에 각 요소에 지정된 변환을 적용합니다.
search: 각 요소에 주어진 함수를 적용한 결과 중 null이 아닌 첫 번째 사용 가능한 결과를 반환합니다. 결과가 발견되면 추가 검색을 건너뜁니다.
reduce: 각 요소를 축적합니다. 제공된 reduce 함수는 순서에 의존하지 않습니다. 크게 5가지 종류가 있습니다.
- Plain reduction: 해당하는 반환 유형이 없기 때문에 (키, 값) 함수 인수에 대한 이 메소드의 형식은 없습니다.
- Mapped reductions: 각 요소에 적용된 특정 함수의 결과를 누적합니다.
- 주어진 값을 사용하여 double, long 및 int로 reduce 작업을 하는 것.

forEach: Performs a given action on each element. A variant form applies a given transformation on each element before performing the action.

search: Returns the first available non-null result of applying a given function on each element; skipping further search when a result is found.

reduce: Accumulates each element. The supplied reduction function cannot rely on ordering (more formally, it should be both associative and commutative). There are five variants:

1) Plain reductions. (There is not a form of this method for (key, value) function arguments since there is no corresponding return type.)
2) Mapped reductions that accumulate the results of a given function applied to each element.
3) Reductions to scalar doubles, longs, and ints, using a given basis value.