Hash Map (Hash Table)
Reference: Wiki PrincetonAlgorithm
What is Hash Table
Hash table (hash map) is a data structure used to implement an associative array, a structure that can map keys to values.
A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.
hash function: transform the search key into an array index
What is Hash Collisions
Different keys that are assigned by the hash function to the same bucket.
Perfect Hashing
A perfect hash function for a set S is a hash function that maps distinct elements in S to a set of integers, with no collisions.
Perfect hashing allows for constant time lookups in the worst case. This is in contrast to most chaining and open addressing methods, where the time for lookup is low on average, but may be very large, O(n), for some sets of keys.
Load Factor
A critical statistic for a hash table.
Load factor = n / k
where:
n = number of entries
k = number of buckets
- high load factor: hash table becomes slower, and it may even fail to work (depending on the method used).
- low load factor: considering the proportion of unused areas in the hash table. This results in wasted memory.
Also, one should examine the variance of number of entries per bucket. For example, two tables both have 1000 entries and 1000 buckets; one has exactly one entry in each bucket, the other has all entries in the same bucket. Clearly the hashing is not working in the second one.
Hash Functions
3 primary requirements in implementing a good hash function for a given data type:
- It should be deterministic - equal keys must produce the same hash value.
- It should be efficient to compute.
- It should uniformly distribute the keys.
Suppose we have an array that can hold M key-value pairs, then we need a function that can transform any given key into an index into that array.
Positive Integers
modular hashing
Choose the array size M to be prime, and, for any positive integer key k, compute the remainder when dividing k by M. (k % M)
Floating-Point Numbers
Simple but defective way:
If the keys are real numbers between 0 and 1, we might just multiply by M and round off to the nearest integer to get an index between 0 and M-1.
This approach is defective because it gives more weight to the most significant bits of the keys; the least significant bits play no role.
Better way (adopted by Java):
Use modular hashing on the binary representation of the key.
String
Simply treat them as huge integers.
R is prime number, Java uses 31. Calculate each bit of the array.
int hash = 0;
for (int i = 0; i < s.length(); i++)
hash = (R * hash + s.charAt(i)) % M;
Also, we can use MD5 to randomize input keys
int hashFunc(String key) {
return md5(key) % HASH_TABLE_SIZE;
}
APR hash function uses magic number 33. Similar with the first method.
int hashFunc(String key) {
int sum = 0;
for (int i = 0; i < key.length(); i++) {
sum = sum * 33 + (int) key.charAt(i);
sum = sum % HASH_TABLE_SIZE;
}
return sum;
}
Compound Keys
We can follow the ways of processing string.
For example, we use a tuple as key (element1, element2, element3)
int hash = (((element1 * R + element2) % M) * R + element3) % M;
By Using hashCode() in Java
private int hash(Key key) {
return (key.hashCode() & 0x7fffffff) % M;
}
Collision Resolutions
Generally, there are four ways to resolve collision:
- Separate Chaining
- Open Addressing
- Robin Hood Hashing
- 2-Choic Hashing
details check here
Hash Map (Hash Table)的更多相关文章
- [LeetCode] Repeated DNA Sequences hash map
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- POJ 1840 Eqs 二分+map/hash
Description Consider equations having the following form: a1x13+ a2x23+ a3x33+ a4x43+ a5x53=0 The co ...
- Implement Hash Map Using Primitive Types
A small coding test that I encountered today. Question Using only primitive types, implement a fixed ...
- hdu1381 Crazy Search(hash map)
题目意思: 给出一个字符串和字串的长度,求出该字符串的全部给定长度的字串的个数(不同样). 题目分析: 此题为简单的字符串哈hash map问题,能够直接调用STL里的map类. map<str ...
- (通俗易懂小白入门)字符串Hash+map判重——暴力且优雅
字符串Hash 今天我们要讲解的是用于处理字符串匹配查重的一个算法,当我们处理一些问题如给出10000个字符串输出其中不同的个数,或者给一个长度100000的字符串,找出其中相同的字符串有多少个(这样 ...
- Hash Map 在java中的解释及示例
目录 HashMap在java中的应用及示例 HashMap的内部结构 HashMap的性能 同步HashMap HashMap的构造函数 HashMap的时间复杂度 HashMap的方法 1. vo ...
- TTTTTTTTTTTT 百度之星D map+hash
Problem D Accepts: 2806 Submissions: 8458 Time Limit: 2000/1000 MS (Java/Others) Memory Limit: 6 ...
- Redis的增删改查 c# key value类型和hash map 类型
using Newtonsoft.Json; using StackExchange.Redis; using System; using System.Collections.Generic; us ...
- hdu 4622 (hash+“map”)
题目链接:https://vjudge.net/problem/HDU-4622 题意:给定t组字符串每组m条询问--求问每条询问区间内有多少不同的子串. 题解:把每个询问区间的字符串hash一下存图 ...
随机推荐
- ELT工具Kettle之CDC(Change Data Capture)实现实例
ETL过程的第一步就是从不同的数据源抽取数据并把数据存储在数据的缓存区.这个过程的主要挑战就是初始加载数据量大和比较慢的网络延迟.在初始加载完成之后,不能再把所有数据重新加载一遍,我们需要的只是变化的 ...
- jQuery扩展与noConflict的用法-小示例
有时我们要用到自己定义的jquery,这时可以通过jQuery扩展来实现该功能 index.html <!DOCTYPE html> <html> <head> & ...
- SEO学习之路
一.入门建站篇 1.Wamp集成环境安装 2.Wamp集成环境配置多站点 3.DedeCMS安装及目录结构 4.DedeCMS源码安装 5.DedeCMS官方手册 未完待续...
- [原创作品]观察者模式在Web App的应用
(转载请注明:http://zhutty.cnblogs.com, 交流请加群:164858883) 在软件工程中,有一条重要的原则就是:高内聚低耦合.这是评定软件的设计好坏的一个标准.所谓高内聚,指 ...
- struts2讲义----二
Struts的namespace 示例工程Struts2_0200_Namespace Struts.xml <struts> <constant name="struts ...
- nginx 配置访问正则匹配
server{ listen 80; server_name api.zyy.com; root /var/www/api_zyy; index index.php; location ~ /asse ...
- 3.1,pandas【基本功能】
一:改变索引 reindex方法对于Series直接索引,对于DataFrame既可以改变行索引,也可以改变列索引,还可以两个一起改变. 1)对于Series In [2]: seri = pd.Se ...
- HTML、CSS、JS、PHP 的学习顺序~(零基础初学者)
如果你有耐心坚持一年以上的话, 我会推荐HTML->CSS->JS->PHP的顺序来学习. 1. HTML学习:首先学习HTML,HTML作为标记语言是非常容易学的,把w3schoo ...
- Catel(翻译)-为什么选择Catel
1. 介绍 这篇文章主要是为了说明,我们为什么要使用Catel框架作为开发WPF,Silverlight,和Windows phone7应用程序的开发框架. 2. 通用功能 2. ...
- xcode下载方式
1.去AppStore下载 对于Xcode老是在AppStore升级失败,而且下载慢,可取找到了这个--> 官方 Xcode .dmg 文件下载链接:超级传送门 2.开发者中心官网下载 可参考这 ...