一致性哈希的Java实现

一致性哈希

借用维基百科的定义:

consistent hashing is a special kind of hashing such that when a hash table is resized.
In contrast, in most traditional hash tables, a change in the number of array slots causes nearly all keys to be remapped because the mapping between the keys and the slots is defined by a modular operation. Consistent hashing is a particular case of rendezvous hashing, which has a conceptually simpler algorithm, and was first described in 1996.

实现

Java里面的实现,主要依赖了TreeMap来实现。
hash算法,可以用字符的原生hashCode方法.

public class ConsistentHash<T> {
    private final int numberOfReplicas;

    private final SortedMap<Integer, T> circle = new TreeMap<>();// 存储虚拟节点的hash值到真实节点的映射

    public ConsistentHash(int numberOfReplicas, Collection<T> nodes) {
        this.numberOfReplicas = numberOfReplicas;
        for (T node : nodes) {
            add(node);
        }
    }

    /**
     * 不同的虚拟节点(i不同)有不同的hash值,但都对应同一个实际机器node
     * 虚拟node一般是均衡分布在环上的,数据存储在顺时针方向的虚拟node上
     * <p>
     * 对于一个实际机器节点 node, 对应 numberOfReplicas 个虚拟节点
     */
    public void add(T node) {
        for (int i = 0; i < numberOfReplicas; i++) {
            String nodeName = node.toString() + i;
            int hashcode = nodeName.hashCode();
            circle.put(hashcode, node);

        }
    }

    public void remove(T node) {
        for (int i = 0; i < numberOfReplicas; i++) {
            circle.remove((node.toString() + i).hashCode());
        }
    }

    /**
     * 获得一个最近的顺时针节点,根据给定的key 取Hash
     * 然后再取得顺时针方向上最近的一个虚拟节点对应的实际节点
     * 再从实际节点中取得数据
     */
    public T get(Object key) {
        if (circle.isEmpty()) {
            return null;
        }

        int hash = key.hashCode();
        if (!circle.containsKey(hash)) {//数据映射在两台虚拟机器所在环之间,就需要按顺时针方向寻找节点
            SortedMap<Integer, T> tailMap = circle.tailMap(hash);
            hash = tailMap.isEmpty() ? circle.firstKey() : tailMap.firstKey();
        }
        return circle.get(hash);
    }

    public long size() {
        return circle.size();
    }
}

关键是理解treeMap的tailMap(hash)方法。

使用了一致性哈希的知名系统

  • Couchbase automated data partitioning
  • OpenStack’s Object Storage Service Swift
  • Partitioning component of Amazon’s storage system Dynamo
  • Data partitioning in Apache Cassandra
  • Data partitioning in Voldemort
  • Akka’s consistent hashing router
  • Riak, a distributed key-value database
  • Gluster, a network-attached storage file system
  • Akamai content delivery network
  • Discord chat application
  • Maglev network load balancer
  • Data partitioning in Azure Cosmos DB

(完)

发表评论

邮箱地址不会被公开。 必填项已用*标注