11 Hash tables
11 Hash tables
Many applications require a dynamic set that supports only the dictionary operations INSERT, SEARCH, and DELETE.
For example, a compiler that translates a programming language maintains a symbol table, in which the keys of elements are arbitrary character strings corresponding to identifiers in the language.
A hash table generalizes the simpler notion of an ordinary array. Directly ad- dressing into an ordinary array makes effective use of our ability to examine an arbitrary position in an array in O(1) time.
一个hash表是由一个简单地普通列扩展而来的。
直接寻址
使得一个普通的列
能够
检查任意位置在o(1)时间。
11.1 Direct- address tables
Direct addressing is a simple technique that works well when the universe U of keys is reasonably small.
直接寻址是一个简单地技术。
当
关键字
的区间
恰当小时,工作正常。
To represent the dynamic set, we use an array, or direct-address table, denoted by T [0.. m-1],
in which each position, or slot, corresponds to a key in the uni- verse U .
11.2 Hash tables
With direct addressing, an element with key k is stored in slot k. With hashing, this element is stored in slot h.k/; that is, we use a hash function h to compute the slot from the key k.
Here ,h maps the universe U of keys into the slots of a hash table T[0.. m-1]:
h:U->{0,1,…,m-1}
We say that an element with key k hashes to slot h.k/; we also say that h.k/ is the hash value of key k.

There is one hitch: two keys may hash to the same slot. We call this situation a collision.
Collision resolution by chaining

In chaining, we place all the elements that hash to the same slot into the same linked list
The dictionary operations on a hash table T are easy to implement when collisions are resolved by chaining 
11.3 Hash functions
In this section ,we discuss some issues regrading the design of good hash functions and then present three schemes for their creation . Two of the schemes ,hashing by division and hashing by multiplication ,are heuristic in nature ,whereas the third scheme ,universal hashing ,uses randommization to provide provably good performance .
11.3.1 The division method
h(k)=k mod m
11.3.2 The multiplication method
The multiplication method for creating hash functions operates in two steps. First, we multiply the key k by a constant A in the range 0 < A < 1 and extract the
fractional part of kA. Then, we multiply this value by m and take the floor of the result. In short, the hash function is

这两个函数是最简单的。
hash表最重要的就是找到一个哈希函数,让其尽量的避免或少点collision。
11.4 Open addressing
In open addressing ,all elements occupy the hash table itself . That is ,each table entry contains either an element of the dynamic set or nil.When searching for an element, we systematically examine table slots until either we find the desired element or we have ascertained that the element is not in the table. No lists and
no elements are stored outside the table, unlike in chaining.
The advantage of open addressing is that it avoids pointers altogether .
开放地址的好处是完全避免了指针。
Instead of following pointers, we compute the sequence of slots to be examined.
代替指针的是我们计算slots 的序列来检查。
The hash function becomes :

With open addressing ,we require that for every key k, the probe sequence 
be a permutation of <0,1,,..m-1>
The algorithm for searching for key k probes the same sequence of slots that the insertion algorithm examined when key k was inserted.

In our analysis ,we assume uniform hashing : the probe sequence of each key is equally likely to be any of the m!permutations of <0,1,2,m-1>.
We will examine three commonly used techniques to compute the probe sequences required for open addressing : linear probing ,quadratic probing ,and double hashing .
线性探针,平方探针,双重探针。
Linear probing
Given an ordinary hash function h':u->{0,1,m-1} ,which we refer to as an auxiliary hash function ,the method of linear probing uses the hash function 

Double hashing
Double hashing offers one of the best methods available for open addressing be- cause the permutations produced have many of the characteristics of randomly chosen permutations. Double hashing uses a hash function of the form

11 Hash tables的更多相关文章
- Hash Tables
哈希表 红黑树实现的符号表可以保证对数级别的性能,但我们可以做得更好.哈希表实现的符号表提供了新的数据访问方式,插入和搜索操作可以在常数时间内完成(不支持和顺序有关的操作).所以,在很多情况下的简单符 ...
- 数据库(11)-- Hash索引和BTree索引 的区别
索引是帮助mysql获取数据的数据结构.最常见的索引是Btree索引和Hash索引. 不同的引擎对于索引有不同的支持:Innodb和MyISAM默认的索引是Btree索引:而Mermory默认的索引是 ...
- Hash Tables and Hash Functions
Reference: Compuer science Introduction: This computer science video describes the fundamental princ ...
- Javascript: hash tables in javascript
/** * Copyright 2010 Tim Down. * * Licensed under the Apache License, Version 2.0 (the "License ...
- Hash Table Performance in R: Part I(转)
What Is It? A hash table, or associative array, is a well known key-value data structure. In R there ...
- Effective Java 第三版——11. 重写equals方法时同时也要重写hashcode方法
Tips <Effective Java, Third Edition>一书英文版已经出版,这本书的第二版想必很多人都读过,号称Java四大名著之一,不过第二版2009年出版,到现在已经将 ...
- 06: 字典、顺序表、列表、hash树 实现原理
算法其他篇 目录: 1.1 python中字典对象实现原理 1.2 顺序表 1.3 python 列表(list) 1.1 python中字典对象实现原理返回顶部 注:字典类型是Python中最常 ...
- NoSQL生态系统——hash分片和范围分片两种分片
13.4 横向扩展带来性能提升 很多NoSQL系统都是基于键值模型的,因此其查询条件也基本上是基于键值的查询,基本不会有对整个数据进行查询的时候.由于基本上所有的查询操作都是基本键值形式的,因此分片通 ...
- hash算法总结收集
hash算法的意义在于提供了一种快速存取数据的方法,它用一种算法建立键值与真实值之间的对应关系,(每一个真实值只能有一个键值,但是一个键值可以对应多个真实值),这样可以快速在数组等条件中里面存取数据. ...
随机推荐
- HDFS集中式缓存管理(Centralized Cache Management)
Hadoop从2.3.0版本号開始支持HDFS缓存机制,HDFS同意用户将一部分文件夹或文件缓存在HDFS其中.NameNode会通知拥有相应块的DataNodes将其缓存在DataNode的内存其中 ...
- 电子设计省赛--PID
//2014年4月17日 //2014年6月20日入"未完毕"(未完毕) //2014年6月21日 一開始还以为是多难的算法.事实上就是个渣渣. 当然PID实践中应该会非常难. 另 ...
- jquery源码学习笔记三:jQuery工厂剖析
jquery源码学习笔记二:jQuery工厂 jquery源码学习笔记一:总体结构 上两篇说过,query的核心是一个jQuery工厂.其代码如下 function( window, noGlobal ...
- JAVA泛型类
泛型是JDK 5.0后出现新概念,泛型的本质是参数化类型,也就是说所操作的数据类型被指定为一个参数.这种参数类型可以用在类.接口和方法的创建中,分别称为泛型类.泛型接口.泛型方法. 泛型类引入的好处不 ...
- MVC优缺点
1.通过把项目分成model view和controller,使得复杂项目更加容易维护. 2.没有使用view state和服务器表单控件,可以更方便的控制应用程序的行为 3.应用程序通过contro ...
- java中字节数组byte[]和字符(字符串)之间的转换
转自:http://blog.csdn.net/linlzk/article/details/6566124 Java与其他语言编写的程序进行tcp/ip socket通讯时,通讯内容一般都转换成by ...
- 并不对劲的manacher算法
有些时候,后缀自动机并不能解决某些问题,或者解决很麻烦.这时就有各种神奇的字符串算法了. manacher算法用来O(|S|)地求出字符串S的最长的回文子串的长度.这是怎么做到的呢? 并不对劲的暴力选 ...
- bzoj 5281 Talent Show —— 01分数规划+背包
题目:https://www.lydsy.com/JudgeOnline/problem.php?id=5281 二分一个答案比值,因为最后要*1000,不如先把 v[] *1000,就可以二分整数: ...
- VS2013文件同步插件开发
一.插件功能描述 插件监控一个xml文件,当该文档有添加新元素在保存的时候将新增的元素同步到指定的目录下. 二.模板的选择 由于该功能是跟代码编辑有关的,要监控文档的保存事件,所以要在文档打开的时候就 ...
- js追加子元素
在页面加载完毕后,向div元素追加span子元素 <html><head><title>js</title><script type=" ...