What is libconhash

libconhash is a consistent hashing library which can be compiled both on Windows and Linux platforms, with the following features:

  1. High performance and easy to use: libconhash manages all nodes in a red-black tree.
  2. By default, it uses the MD5 algorithm, but it also supports user-defined hash functions.
  3. Easy to scale: each node's share of the load can be matched to its processing capacity through the number of replicas (virtual nodes) assigned to it.

Consistent hashing

Why you need consistent hashing

Consider the common way to balance load across cache machines. The number of the machine chosen to cache object o is:

hash(o) mod n

Here, n is the total number of cache machines. This works well until you add or remove cache machines:

  1. When you add a cache machine, object o will be cached on machine:

    hash(o) mod (n+1)

  2. When you remove a cache machine, object o will be cached on machine:

    hash(o) mod (n-1)

So you can see that almost all objects will be hashed to a new location. This is a disaster, since the originating content servers are swamped with requests from the cache machines. This is why you need consistent hashing.
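
The scale of this reshuffle is easy to check numerically. The sketch below is illustrative only: it treats the integers 0..num_keys-1 as stand-ins for hash(o) values, and count_moved is a hypothetical helper name, not part of libconhash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count how many of the keys 0..num_keys-1 land on a different machine
 * when the machine count changes from old_n to new_n under "key mod n".
 * The keys stand in for hash(o) values (an assumption of this sketch). */
size_t count_moved(uint32_t num_keys, uint32_t old_n, uint32_t new_n)
{
    assert(old_n > 0 && new_n > 0);
    size_t moved = 0;
    for (uint32_t key = 0; key < num_keys; key++) {
        if (key % old_n != key % new_n)
            moved++;
    }
    return moved;
}
```

For example, count_moved(1000, 10, 11) returns 900: growing from 10 to 11 machines relocates 90% of these keys.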

Consistent hashing guarantees that when a cache machine is removed, only the objects cached on it are rehashed; when a new cache machine is added, only a small fraction of the objects are rehashed.

Now we will go into consistent hashing step by step.

Hash space

Commonly, a hash function maps a value to a 32-bit key in the range 0 to 2^32-1. Now imagine bending this range into a circle so that the keys wrap around: 2^32-1 is followed by 0, as illustrated in figure 1.

Figure 1
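
The wrap-around of the circle matches how unsigned 32-bit arithmetic behaves in C. As a small sketch (ring_distance is a hypothetical helper, not a libconhash function), the clockwise distance between two positions can be computed with a plain subtraction:

```c
#include <stdint.h>

/* Clockwise distance from `from` to `to` on a ring of 2^32 positions.
 * Unsigned subtraction wraps modulo 2^32, which is the same wrap-around
 * as walking past 2^32-1 back to 0 on the circle. */
uint32_t ring_distance(uint32_t from, uint32_t to)
{
    return (uint32_t)(to - from);
}
```

With this definition, the distance from 2^32-1 to 0 is 1, and the distance from 0 to 2^32-1 is 2^32-1.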

Map object into hash space

Now consider four objects, object1 through object4. We use a hash function to get their key values and map them onto the circle, as illustrated in figure 2.

Figure 2
hash(object1) = key1;
.....
hash(object4) = key4;
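
As a concrete stand-in for hash(), the sketch below uses 32-bit FNV-1a. This is an assumption made to keep the example short: libconhash itself defaults to MD5, and ring_key is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a string to a position on the 32-bit ring using FNV-1a.
 * FNV-1a is only a stand-in here; any hash that spreads keys evenly
 * over 0..2^32-1 would serve the same purpose. */
uint32_t ring_key(const char *s)
{
    assert(s != NULL);
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;                /* FNV prime, wraps modulo 2^32 */
    }
    return h;
}
```

Calling ring_key("object1") always yields the same key, so an object keeps its position on the circle no matter how many caches exist.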

Map the cache into hash space

The basic idea of consistent hashing is to map the cache and objects into the same hash space using the same hash function.

Now suppose we have three caches, A, B and C. The mapping result looks like figure 3.

hash(cache A) = key A;
....
hash(cache C) = key C;

Figure 3

Map objects into cache

Now all the caches and objects are hashed into the same space, so we can determine how to map objects to caches. Take an object obj, for example: start from where obj is and head clockwise on the ring until you find a cache. If that cache is down, go on to the next one, and so forth. See figure 3 above.

According to the method, object1 will be cached into cache A; object2 and object3 will be cached into cache C, and object4 will be cached into cache B.
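
The clockwise walk can be sketched as a binary search over cache points sorted by key. This is not how libconhash stores its nodes (it uses a red-black tree); the struct, the function name, and any keys you feed it are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One cache point on the ring. The field names are illustrative. */
struct ring_point {
    uint32_t key;      /* position on the ring */
    const char *name;  /* cache this point belongs to */
};

/* Return the first point at or clockwise after obj_key.
 * points must be sorted by ascending key, and n must be > 0. */
const struct ring_point *ring_lookup(const struct ring_point *points,
                                     size_t n, uint32_t obj_key)
{
    assert(points != NULL && n > 0);
    size_t lo = 0, hi = n;             /* search window [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (points[mid].key < obj_key)
            lo = mid + 1;
        else
            hi = mid;
    }
    /* Past the largest key: wrap around to the first point. */
    return &points[lo == n ? 0 : lo];
}
```

With hypothetical keys A=20, B=40, C=70 and objects object1=10, object2=50, object3=60, object4=30, the lookup reproduces the mapping above: object1 goes to A, object2 and object3 go to C, and object4 goes to B.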

Add or remove cache

Now consider two scenarios: a cache goes down and is removed, and a new cache is added.

If cache B is removed, then only the objects that were cached in B are rehashed and moved to C; in the example, this is object4, as illustrated in figure 4.

Figure 4

If a new cache D is added, and D is hashed between object2 and object3 on the ring, then only the objects between B and D (going counterclockwise from D to the previous cache) are rehashed, moving from C to D; in the example, this is object2, as illustrated in figure 5.

Figure 5
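
Both scenarios can be checked with a small sketch that counts how many objects change owner between two sets of caches. The helper names are hypothetical, and a linear scan is used instead of a tree to keep the code short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Key of the cache that owns obj_key: the smallest cache key >= obj_key,
 * or the smallest cache key overall if the walk wraps past the top. */
uint32_t owner_key(const uint32_t *caches, size_t n, uint32_t obj_key)
{
    assert(caches != NULL && n > 0);
    uint32_t best_after = 0, smallest = caches[0];
    int found = 0;
    for (size_t i = 0; i < n; i++) {
        if (caches[i] < smallest)
            smallest = caches[i];
        if (caches[i] >= obj_key && (!found || caches[i] < best_after)) {
            best_after = caches[i];
            found = 1;
        }
    }
    return found ? best_after : smallest;
}

/* Count the objects whose owning cache differs between two cache sets. */
size_t count_remapped(const uint32_t *before, size_t nb,
                      const uint32_t *after, size_t na,
                      const uint32_t *objs, size_t no)
{
    size_t moved = 0;
    for (size_t i = 0; i < no; i++) {
        if (owner_key(before, nb, objs[i]) != owner_key(after, na, objs[i]))
            moved++;
    }
    return moved;
}
```

With made-up keys A=20, B=40, C=70 and objects at 10, 50, 60 and 30, removing B remaps exactly one object (the one at 30), and adding a cache D at 55 also remaps exactly one (the one at 50).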

Virtual nodes

It is possible to have a very non-uniform distribution of objects between caches if you don't deploy enough caches. The solution is to introduce the idea of "virtual nodes".

Virtual nodes are replicas of cache points on the circle; each real cache corresponds to several virtual nodes. Whenever we add a cache, we actually create a number of virtual nodes on the circle for it, and when a cache is removed, we remove all of its virtual nodes from the circle.

Consider the above example with only two caches, A and C, in the system. If we introduce virtual nodes with a replica count of 2, there will be 4 virtual nodes. Cache A1 and cache A2 represent cache A; cache C1 and cache C2 represent cache C, as illustrated in figure 6.

Figure 6

Then, the mapping from objects to virtual nodes is:

object1->cache A2; object2->cache A1; object3->cache C1; object4->cache C2

Once you have the virtual node, you have the real cache it represents, as in the figure above.

So object1 and object2 are cached in cache A, and object3 and object4 are cached in cache C. The result is more balanced now.
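
A minimal sketch of virtual nodes follows. It makes several assumptions that are not taken from libconhash: FNV-1a stands in for the hash function, replicas are labelled "A-0", "A-1" and so on, and the ring is a sorted array rather than a red-black tree.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct vnode {
    uint32_t key;   /* position on the ring */
    int cache;      /* index of the real cache this replica stands for */
};

/* FNV-1a, standing in for the hash function (an assumption). */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

static int cmp_vnode(const void *a, const void *b)
{
    uint32_t ka = ((const struct vnode *)a)->key;
    uint32_t kb = ((const struct vnode *)b)->key;
    return (ka > kb) - (ka < kb);
}

/* Fill ring[] with `replicas` virtual nodes per cache, sorted by key.
 * ring must have room for ncaches * replicas entries. Returns that count. */
size_t build_ring(struct vnode *ring, const char *const *names,
                  int ncaches, int replicas)
{
    char label[64];
    size_t n = 0;
    assert(ring != NULL && names != NULL && ncaches > 0 && replicas > 0);
    for (int c = 0; c < ncaches; c++) {
        for (int r = 0; r < replicas; r++) {
            /* "A-0", "A-1", ...: a made-up naming scheme for replicas */
            snprintf(label, sizeof label, "%s-%d", names[c], r);
            ring[n].key = fnv1a(label);
            ring[n].cache = c;
            n++;
        }
    }
    qsort(ring, n, sizeof ring[0], cmp_vnode);
    return n;
}

/* Resolve an object to a real cache: find the first virtual node at or
 * clockwise after the object's key, then return the cache it stands for. */
int lookup_cache(const struct vnode *ring, size_t n, const char *object)
{
    assert(ring != NULL && n > 0 && object != NULL);
    uint32_t key = fnv1a(object);
    for (size_t i = 0; i < n; i++) {
        if (ring[i].key >= key)
            return ring[i].cache;
    }
    return ring[0].cache;   /* wrapped past the largest key */
}
```

Raising the replica count scatters each cache's points more finely around the circle, which is what tends to even out the share of objects each real cache receives.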

So now you know what consistent hashing is.

Using the code

Interfaces of libconhash

/* initialize conhash library
 * @pfhash : hash function, NULL to use default MD5 method
 * return a conhash_s instance
 */
CONHASH_API struct conhash_s* conhash_init(conhash_cb_hashfunc pfhash);

/* finalize lib */
CONHASH_API void conhash_fini(struct conhash_s *conhash);

/* set node */
CONHASH_API void conhash_set_node(struct node_s *node,
    const char *iden, u_int replica);

/*
 * add a new node
 * @node: the node to add
 */
CONHASH_API int conhash_add_node(struct conhash_s *conhash,
    struct node_s *node);

/* remove a node */
CONHASH_API int conhash_del_node(struct conhash_s *conhash,
    struct node_s *node);
...

/*
 * lookup a server which object belongs to
 * @object: the input string which indicates an object
 * return the node_s structure, do not modify the value,
 * or it will cause a disaster
 */
CONHASH_API const struct node_s*
conhash_lookup(const struct conhash_s *conhash,
    const char *object);

Libconhash is very easy to use. There is a sample in the project that shows how to use the library.

First, create a conhash instance. Then you can add nodes to the instance, remove nodes from it, and look up objects.

Updating a node's replica count is not implemented yet.

/* init conhash instance */
struct conhash_s *conhash = conhash_init(NULL);
if (conhash)
{
    /* g_nodes is assumed to be an array of struct node_s declared elsewhere */
    const struct node_s *node;
    const char *str = "James.km";

    /* set nodes */
    conhash_set_node(&g_nodes[0], "titanic", 32);
    /* ... */

    /* add nodes */
    conhash_add_node(conhash, &g_nodes[0]);
    /* ... */

    printf("virtual nodes number %d\n", conhash_get_vnodes_num(conhash));
    printf("the hashing results--------------------------------------:\n");

    /* lookup object */
    node = conhash_lookup(conhash, str);
    if (node) printf("[%16s] is in node: [%16s]\n", str, node->iden);

    /* release the instance when done */
    conhash_fini(conhash);
}

License

This article, along with any associated source code and files, is licensed under The BSD License
