一致性哈希
借用维基百科的定义:
consistent hashing is a special kind of hashing such that when a hash table is resized.
In contrast, in most traditional hash tables, a change in the number of array slots causes nearly all keys to be remapped because the mapping between the keys and the slots is defined by a modular operation. Consistent hashing is a particular case of rendezvous hashing, which has a conceptually simpler algorithm, and was first described in 1996.
实现
Java里面的实现,主要依赖了TreeMap来实现。
hash算法,可以用字符的原生hashCode方法.
public class ConsistentHash<T> {
private final int numberOfReplicas;
private final SortedMap<Integer, T> circle = new TreeMap<>();// 存储虚拟节点的hash值到真实节点的映射
public ConsistentHash(int numberOfReplicas, Collection<T> nodes) {
this.numberOfReplicas = numberOfReplicas;
for (T node : nodes) {
add(node);
}
}
/**
* 不同的虚拟节点(i不同)有不同的hash值,但都对应同一个实际机器node
* 虚拟node一般是均衡分布在环上的,数据存储在顺时针方向的虚拟node上
* <p>
* 对于一个实际机器节点 node, 对应 numberOfReplicas 个虚拟节点
*/
public void add(T node) {
for (int i = 0; i < numberOfReplicas; i++) {
String nodeName = node.toString() + i;
int hashcode = nodeName.hashCode();
circle.put(hashcode, node);
}
}
public void remove(T node) {
for (int i = 0; i < numberOfReplicas; i++) {
circle.remove((node.toString() + i).hashCode());
}
}
/**
* 获得一个最近的顺时针节点,根据给定的key 取Hash
* 然后再取得顺时针方向上最近的一个虚拟节点对应的实际节点
* 再从实际节点中取得数据
*/
public T get(Object key) {
if (circle.isEmpty()) {
return null;
}
int hash = key.hashCode();
if (!circle.containsKey(hash)) {//数据映射在两台虚拟机器所在环之间,就需要按顺时针方向寻找节点
SortedMap<Integer, T> tailMap = circle.tailMap(hash);
hash = tailMap.isEmpty() ? circle.firstKey() : tailMap.firstKey();
}
return circle.get(hash);
}
public long size() {
return circle.size();
}
}
关键是理解treeMap的tailMap(hash)方法。
使用了一致性哈希的知名系统
- Couchbase automated data partitioning
- OpenStack’s Object Storage Service Swift
- Partitioning component of Amazon’s storage system Dynamo
- Data partitioning in Apache Cassandra
- Data partitioning in Voldemort
- Akka’s consistent hashing router
- Riak, a distributed key-value database
- Gluster, a network-attached storage file system
- Akamai content delivery network
- Discord chat application
- Maglev network load balancer
- Data partitioning in Azure Cosmos DB
(完)