Dictionary lookups in glusterfs

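glusterfs is a distributed file system, but unlike many distributed file systems it has no metadata server; I have heard that Swift takes the same approach. In glusterfs, the configuration of every xlator is managed with a dict. A dict is, put simply, a hash table: an in-memory key/value store. I spent some time today studying this part of the glusterfs design and found it quite interesting.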

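My previous post covered the design of the glusterfs memory pool, and the dict is exactly where those pools are put to use. When glusterfsd initializes, it creates three pools: dict_pool, dict_pair_pool and dict_data_pool. Let's see how the dict code uses them.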

1. Before a dict can be used, a dict object has to be created, which is an object-oriented way of thinking.

 dict_t *
 get_new_dict (void)
 {
         return get_new_dict_full (1);
 }

glusterfs calls get_new_dict to create a dict object. So what does get_new_dict_full, which it delegates to, actually do?

 dict_t *
 get_new_dict_full (int size_hint)
 {
         dict_t *dict = mem_get0 (THIS->ctx->dict_pool);

         if (!dict) {
                 return NULL;
         }

         dict->hash_size = size_hint;
         if (size_hint == 1) {
                 /*
                  * This is the only case we ever see currently.  If we ever
                  * need to support resizing the hash table, the resize function
                  * will have to take into account the possibility that
                  * "members" is not separately allocated (i.e. don't just call
                  * realloc() blindly.
                  */
                 dict->members = &dict->members_internal;
         }
         else {
                 /*
                  * We actually need to allocate space for size_hint *pointers*
                  * but we actually allocate space for one *structure*.  Since
                  * a data_pair_t consists of five pointers, we're wasting four
                  * pointers' worth for N=1, and will overrun what we allocated
                  * for N>5.  If anybody ever starts using size_hint, we'll need
                  * to fix this.
                  */
                 GF_ASSERT (size_hint <=
                            (sizeof(data_pair_t) / sizeof(data_pair_t *)));
                 dict->members = mem_get0 (THIS->ctx->dict_pair_pool);
                 if (!dict->members) {
                         mem_put (dict);
                         return NULL;
                 }
         }

         LOCK_INIT (&dict->lock);

         return dict;
 }

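size_hint is the size of the dict to allocate, that is, the number of hash buckets. When size_hint is 1 there is only a single bucket, so all the data in the dict ends up in one linked list (hash collisions are resolved by chaining).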
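To make the single-bucket case concrete, here is a minimal sketch of how a key is mapped to a bucket with `hash % hash_size`. The names `toy_hash` and `bucket_index` are hypothetical, and `toy_hash` is a djb2-style stand-in, not the SuperFastHash function that glusterfs actually calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in string hash (djb2 style). This is an assumption for the
 * sketch; glusterfs itself uses SuperFastHash. */
static uint32_t
toy_hash (const char *key)
{
        uint32_t h = 5381;

        assert (key != NULL);
        while (*key)
                h = h * 33u + (unsigned char) *key++;
        return h;
}

/* Map a key to a bucket index: hash value modulo the bucket count. */
size_t
bucket_index (const char *key, uint32_t hash_size)
{
        assert (hash_size > 0);
        return toy_hash (key) % hash_size;
}
```

With `hash_size` equal to 1 the modulo is always 0, so every pair is chained onto bucket 0 and the table behaves like a plain linked list.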

Next, how does the program add an entry to the dict? First, take a look at the dict_t data structure:

 struct _dict {
         unsigned char   is_static:1;
         int32_t         hash_size;
         int32_t         count;
         int32_t         refcount;
         data_pair_t   **members;
         data_pair_t    *members_list;
         char           *extra_free;
         char           *extra_stdfree;
         gf_lock_t       lock;
         data_pair_t    *members_internal;
         data_pair_t     free_pair;
         gf_boolean_t    free_pair_in_use;
 };

dict_t has a lock member; every operation on a dict_t object takes this lock first:

int32_t
dict_add (dict_t *this, char *key, data_t *value)
{
        int32_t ret;

        if (!this || !value) {
                gf_log_callingfn ("dict", GF_LOG_WARNING,
                                  "!this || !value for key=%s", key);
                return -1;
        }

        LOCK (&this->lock);

        ret = _dict_set (this, key, value, 0);

        UNLOCK (&this->lock);

        return ret;
}

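I have to say the glusterfs coding style is quite elegant: it separates the housekeeping from the core logic cleanly, and the code is a pleasure to read. In dict_add above, the logging and locking are kept together in the wrapper, while the core work is done in _dict_set.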
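The wrapper-plus-core split can be sketched in isolation as follows. This is only an illustration of the pattern: `struct counter`, `counter_add` and `counter_add_locked` are hypothetical names, and a pthread mutex stands in for gf_lock_t and the LOCK/UNLOCK macros.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct counter {
        pthread_mutex_t lock;   /* stand-in for gf_lock_t (assumption) */
        int             total;
};

/* Core logic: assumes the caller already holds c->lock. */
static void
counter_add_locked (struct counter *c, int delta)
{
        assert (c != NULL);
        c->total += delta;
}

/* Public wrapper: validate arguments, lock, delegate, unlock. */
int
counter_add (struct counter *c, int delta)
{
        if (!c)
                return -1;

        pthread_mutex_lock (&c->lock);
        counter_add_locked (c, delta);
        pthread_mutex_unlock (&c->lock);

        return 0;
}
```

Keeping the lock handling in the wrapper means the core function never has to unlock on its own early-return paths.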

 static int32_t
 _dict_set (dict_t *this, char *key, data_t *value, gf_boolean_t replace)
 {
         int hashval;
         data_pair_t *pair;
         char key_free = 0;
         int tmp = 0;
         int ret = 0;

         if (!key) {
                 ret = gf_asprintf (&key, "ref:%p", value);
                 if (-1 == ret) {
                         gf_log ("dict", GF_LOG_WARNING, "asprintf failed %s", key);
                         return -1;
                 }
                 key_free = 1;
         }

         tmp = SuperFastHash (key, strlen (key));
         hashval = (tmp % this->hash_size);

         /* Search for a existing key if 'replace' is asked for */
         if (replace) {
                 pair = _dict_lookup (this, key);

                 if (pair) {
                         data_t *unref_data = pair->value;
                         pair->value = data_ref (value);
                         data_unref (unref_data);
                         if (key_free)
                                 GF_FREE (key);
                         /* Indicates duplicate key */
                         return 0;
                 }
         }

         if (this->free_pair_in_use) {
                 pair = mem_get0 (THIS->ctx->dict_pair_pool);
                 if (!pair) {
                         if (key_free)
                                 GF_FREE (key);
                         return -1;
                 }
         }
         else {
                 pair = &this->free_pair;
                 this->free_pair_in_use = _gf_true;
         }

         if (key_free) {
                 /* It's ours.  Use it. */
                 pair->key = key;
                 key_free = 0;
         }
         else {
                 pair->key = (char *) GF_CALLOC (1, strlen (key) + 1,
                                                 gf_common_mt_char);
                 if (!pair->key) {
                         if (pair == &this->free_pair) {
                                 this->free_pair_in_use = _gf_false;
                         }
                         else {
                                 mem_put (pair);
                         }
                         return -1;
                 }
                 strcpy (pair->key, key);
         }
         pair->value = data_ref (value);

         pair->hash_next = this->members[hashval];
         this->members[hashval] = pair;

         pair->next = this->members_list;
         pair->prev = NULL;
         if (this->members_list)
                 this->members_list->prev = pair;
         this->members_list = pair;
         this->count++;

         if (key_free)
                 GF_FREE (key);
         return 0;
 }

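In _dict_set, the SuperFastHash call computes the hash of the key, and the modulo by hash_size narrows it down to a bucket index. The if (replace) block handles replacing the value of an existing key. The if (this->free_pair_in_use) block is the part that bothers me most; I am not sure whether it is a design problem. Once the key has been set up, the rest of the function inserts the new pair into its hash bucket and into members_list.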
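One plausible reading of the free_pair block is that the dict embeds storage for its first pair, so a dict holding a single entry never needs a separate allocation. The sketch below shows that idea under this assumption; `struct slot_table`, `pair_acquire` and `pair_release` are hypothetical names, and calloc/free stand in for mem_get0/mem_put.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct slot_pair {
        const char       *key;
        struct slot_pair *next;
};

struct slot_table {
        struct slot_pair inline_pair;         /* embedded storage for one pair */
        bool             inline_pair_in_use;
};

/* Hand out the embedded slot first; fall back to the heap afterwards. */
struct slot_pair *
pair_acquire (struct slot_table *t)
{
        assert (t != NULL);
        if (!t->inline_pair_in_use) {
                t->inline_pair_in_use = true;
                memset (&t->inline_pair, 0, sizeof (t->inline_pair));
                return &t->inline_pair;
        }
        return calloc (1, sizeof (struct slot_pair));
}

/* Return a pair: the embedded slot is only marked free, never freed. */
void
pair_release (struct slot_table *t, struct slot_pair *p)
{
        assert (t != NULL);
        if (p == &t->inline_pair)
                t->inline_pair_in_use = false;
        else
                free (p);
}
```

The cost of this trick is that every release path has to check whether the pair is the embedded one, which is the same check _dict_set performs on its error path before calling mem_put.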

Now let's look at the lookup operation.

 data_t *
 dict_get (dict_t *this, char *key)
 {
         data_pair_t *pair;

         if (!this || !key) {
                 gf_log_callingfn ("dict", GF_LOG_INFO,
                                   "!this || key=%s", (key) ? key : "()");
                 return NULL;
         }

         LOCK (&this->lock);

         pair = _dict_lookup (this, key);

         UNLOCK (&this->lock);

         if (pair)
                 return pair->value;

         return NULL;
 }

Again the wrapper deals with housekeeping such as locking first; _dict_lookup is what does the real work.

 static data_pair_t *
 _dict_lookup (dict_t *this, char *key)
 {
         if (!this || !key) {
                 gf_log_callingfn ("dict", GF_LOG_WARNING,
                                   "!this || !key (%s)", key);
                 return NULL;
         }

         int hashval = SuperFastHash (key, strlen (key)) % this->hash_size;
         data_pair_t *pair;

         for (pair = this->members[hashval]; pair != NULL; pair = pair->hash_next) {
                 if (pair->key && !strcmp (pair->key, key))
                         return pair;
         }

         return NULL;
 }

The lookup code is quite simple: compute a hash value, then walk one linked list.
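The same hash-then-walk lookup can be written as a small standalone table. This is a sketch, not the glusterfs code: `toy_table`, `toy_insert` and `toy_lookup` are hypothetical names, the bucket count is fixed at 4, and a djb2-style hash stands in for SuperFastHash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUCKETS 4

struct toy_entry {
        const char       *key;
        int               value;
        struct toy_entry *hash_next;   /* next entry in the same bucket */
};

struct toy_table {
        struct toy_entry *buckets[TOY_BUCKETS];
};

/* Stand-in string hash (djb2 style); an assumption for this sketch. */
static uint32_t
toy_str_hash (const char *key)
{
        uint32_t h = 5381;

        while (*key)
                h = h * 33u + (unsigned char) *key++;
        return h;
}

/* Push the entry onto the front of its bucket's chain. */
void
toy_insert (struct toy_table *t, struct toy_entry *e)
{
        size_t i;

        assert (t != NULL && e != NULL && e->key != NULL);
        i = toy_str_hash (e->key) % TOY_BUCKETS;
        e->hash_next = t->buckets[i];
        t->buckets[i] = e;
}

/* Hash the key, then walk only the chain of the matching bucket. */
struct toy_entry *
toy_lookup (const struct toy_table *t, const char *key)
{
        struct toy_entry *e;

        if (!t || !key)
                return NULL;

        for (e = t->buckets[toy_str_hash (key) % TOY_BUCKETS]; e; e = e->hash_next) {
                if (strcmp (e->key, key) == 0)
                        return e;
        }
        return NULL;
}
```

With more than one bucket, a lookup only compares against the keys that share its bucket; with a single bucket it degenerates into a scan of every entry.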

Having gone through all of the glusterfs code, nearly every call to get_new_dict_full passes 1 as the argument; only the dict_copy interface is a bit different:

 dict_t *
 dict_copy (dict_t *dict,
            dict_t *new)
 {
         if (!dict) {
                 gf_log_callingfn ("dict", GF_LOG_WARNING, "dict is NULL");
                 return NULL;
         }

         if (!new)
                 new = get_new_dict_full (dict->hash_size);

         dict_foreach (dict, _copy, new);

         return new;
 }

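Judging from the code, this is the only place where the hash table actually comes into play; everywhere else dict_t is used merely as a linked list. Even here it is not really hash-table thinking: it just converts a linked list into a hash table. This is the least sensible thing I have seen in glusterfs.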
