xxHash - Extremely fast hash algorithm
xxHash is an extremely fast hash algorithm, running at RAM speed limits. It successfully completes the SMHasher test suite, which evaluates the collision, dispersion and randomness qualities of hash functions. The code is highly portable, and hashes are identical on all platforms (little / big endian).
Benchmarks
The benchmark uses the SMHasher speed test, compiled with Visual Studio 2010 on a Windows 7 32-bit box. The reference system uses a Core 2 Duo @3GHz.
| Name | Speed | Q.Score | Author |
|---|---|---|---|
| xxHash | 5.4 GB/s | 10 | Y.C. |
| MurmurHash 3a | 2.7 GB/s | 10 | Austin Appleby |
| SBox | 1.4 GB/s | 9 | Bret Mulvey |
| Lookup3 | 1.2 GB/s | 9 | Bob Jenkins |
| CityHash64 | 1.05 GB/s | 10 | Pike & Alakuijala |
| FNV | 0.55 GB/s | 5 | Fowler, Noll, Vo |
| CRC32 | 0.43 GB/s | 9 | |
| MD5-32 | 0.33 GB/s | 10 | Ronald L.Rivest |
| SHA1-32 | 0.28 GB/s | 10 | |
Q.Score is a measure of the quality of the hash function. It depends on successfully passing the SMHasher test set. 10 is a perfect score. Algorithms with a score < 5 are not listed in this table.
A more recent version, XXH64, has been created thanks to Mathias Westerdahl. It offers superior speed and dispersion on 64-bit systems. Note, however, that 32-bit applications will still run faster using the 32-bit version.
The figures below come from the SMHasher speed test, compiled using GCC 4.8.2 on Linux Mint 64-bit. The reference system uses a Core i5-3340M @2.7GHz.
| Version | Speed on 64-bit | Speed on 32-bit |
|---|---|---|
| XXH64 | 13.8 GB/s | 1.9 GB/s |
| XXH32 | 6.8 GB/s | 6.0 GB/s |
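The table suggests picking the variant according to the target: XXH64 on 64-bit builds, XXH32 on 32-bit builds. A minimal sketch of one way to make that choice at compile time is shown here. It assumes that pointer width (via `UINTPTR_MAX`) is a good enough proxy for "64-bit target", which will not hold on every ABI. `PREFERRED_XXH_BITS` and `preferred_variant()` are hypothetical names for illustration, not part of xxHash.

```c
#include <stdint.h>

/* Assumption: a pointer wider than 32 bits means a 64-bit target.
 * PREFERRED_XXH_BITS is a hypothetical macro, not defined by xxhash.h. */
#if UINTPTR_MAX > 0xFFFFFFFFu
#  define PREFERRED_XXH_BITS 64
#else
#  define PREFERRED_XXH_BITS 32
#endif

/* Returns the name of the variant suggested by the benchmark table above. */
static const char *preferred_variant(void)
{
    return (PREFERRED_XXH_BITS == 64) ? "XXH64" : "XXH32";
}
```

An application could use the same `#if` to route its own wrapper to either `XXH64()` or `XXH32()`. Keep in mind that the two variants produce different hash values, so the choice has to be stable wherever hashes are stored or exchanged.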
This project also includes a command line utility, named xxhsum, offering features similar to md5sum, thanks to Takayuki Matsuoka's contributions.
License
The library files xxhash.c and xxhash.h are BSD licensed. The utility xxhsum is GPL licensed.
Build modifiers
The following macros can be set at compilation time to modify xxhash behavior. They are all disabled by default.
- `XXH_INLINE_ALL`: make all functions `inline`, with bodies directly included within `xxhash.h`. There is no need for an `xxhash.o` module in this case. Inlining functions is generally beneficial for speed on small keys. It's especially effective when key length is a compile-time constant, with observed performance improvement in the +200% range. See this article for details.
- `XXH_ACCEPT_NULL_INPUT_POINTER`: if set to `1`, when input is a null pointer, the xxhash result is the same as for a zero-length key (instead of a dereference segfault).
- `XXH_FORCE_MEMORY_ACCESS`: default method `0` uses a portable `memcpy()` notation. Method `1` uses a gcc-specific `packed` attribute, which can provide better performance for some targets. Method `2` forces unaligned reads, which is not standard compliant, but might sometimes be the only way to extract better performance.
- `XXH_CPU_LITTLE_ENDIAN`: by default, endianness is determined at compile time. It's possible to skip auto-detection and force the format to little-endian by setting this macro to `1`. Setting it to `0` forces big-endian.
- `XXH_FORCE_NATIVE_FORMAT`: on big-endian systems, use native number representation. Breaks consistency with little-endian results.
- `XXH_PRIVATE_API`: same impact as `XXH_INLINE_ALL`. The name underlines that symbols will not be published on the library's public interface.
- `XXH_NAMESPACE`: prefix all symbols with the value of `XXH_NAMESPACE`. Useful to avoid symbol naming collisions in case of multiple inclusions of xxHash source code. Client applications can still use the regular function names; symbols are automatically translated through `xxhash.h`.
- `XXH_STATIC_LINKING_ONLY`: gives access to state declaration for static allocation. Incompatible with dynamic linking, due to risks of ABI changes.
- `XXH_NO_LONG_LONG`: removes support for XXH64, for targets without 64-bit support.
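To make the memory-access and endianness options more concrete, here is a minimal sketch of the kind of read that `XXH_FORCE_MEMORY_ACCESS` method 0 describes: a `memcpy()`-based load that is valid at any alignment, followed by a byte swap on big-endian hosts so that the value read is the same everywhere. This is an illustration only, not the library's code. The function names are hypothetical, and the sketch probes endianness at run time for simplicity, whereas the text above describes compile-time detection.

```c
#include <stdint.h>
#include <string.h>

/* Method-0-style access: copy the bytes into a local variable.
 * memcpy() is valid for any alignment of ptr. */
static uint32_t read32_native(const void *ptr)
{
    uint32_t value;
    memcpy(&value, ptr, sizeof value);
    return value;
}

/* Run-time endianness probe (a simplification made for this sketch). */
static int is_little_endian(void)
{
    const uint32_t one = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &one, 1);
    return first_byte == 1;
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t x)
{
    return ((x << 24) & 0xff000000u) | ((x << 8) & 0x00ff0000u) |
           ((x >> 8) & 0x0000ff00u) | ((x >> 24) & 0x000000ffu);
}

/* Read a 32-bit value stored in little-endian byte order,
 * whatever the byte order of the host. */
static uint32_t read32_le(const void *ptr)
{
    uint32_t const value = read32_native(ptr);
    return is_little_endian() ? value : swap32(value);
}
```

Normalizing every read to one byte order is what lets a hash produce identical results on little-endian and big-endian platforms. Skipping the swap, as `XXH_FORCE_NATIVE_FORMAT` does, trades that consistency for speed on big-endian systems.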
Example
Calling the xxHash 64-bit variant from a C program:

    #include "xxhash.h"

    unsigned long long calcul_hash(const void* buffer, size_t length)
    {
        unsigned long long const seed = 0;   /* or any other value */
        unsigned long long const hash = XXH64(buffer, length, seed);
        return hash;
    }

Using the streaming variant is more involved, but makes it possible to provide data in multiple rounds:

    #include <stdlib.h>   /* abort(), malloc(), free() */
    #include "xxhash.h"

    /* someCustomType, SOME_VALUE and get_more_data() are placeholders
       to be supplied by the application. */
    unsigned long long calcul_hash_streaming(someCustomType handler)
    {
        XXH64_state_t* const state = XXH64_createState();
        if (state==NULL) abort();

        size_t const bufferSize = SOME_VALUE;
        void* const buffer = malloc(bufferSize);
        if (buffer==NULL) abort();

        unsigned long long const seed = 0;   /* or any other value */
        XXH_errorcode const resetResult = XXH64_reset(state, seed);
        if (resetResult == XXH_ERROR) abort();

        (...)
        while ( /* any condition */ ) {
            size_t const length = get_more_data(buffer, bufferSize, handler);   /* undescribed */
            XXH_errorcode const addResult = XXH64_update(state, buffer, length);
            if (addResult == XXH_ERROR) abort();
            (...)
        }

        (...)
        unsigned long long const hash = XXH64_digest(state);

        free(buffer);
        XXH64_freeState(state);

        return hash;
    }

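The streaming calls above follow a reset / update / digest pattern, and feeding the data in several rounds is expected to give the same result as hashing it in a single call. The self-contained sketch below illustrates that contract with a toy FNV-1a accumulator as a stand-in. It is not xxHash, and `toy_state_t` and the `toy_*` functions are hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy state: FNV-1a 32-bit, standing in for XXH64_state_t. */
typedef struct { uint32_t acc; } toy_state_t;

/* Start a new hash (plays the role of XXH64_reset). */
static void toy_reset(toy_state_t *state)
{
    state->acc = 2166136261u;        /* FNV-1a 32-bit offset basis */
}

/* Consume one chunk of input (plays the role of XXH64_update). */
static void toy_update(toy_state_t *state, const void *data, size_t length)
{
    const unsigned char *bytes = (const unsigned char *)data;
    size_t i;
    for (i = 0; i < length; i++) {
        state->acc ^= bytes[i];
        state->acc *= 16777619u;     /* FNV-1a 32-bit prime */
    }
}

/* Produce the hash of everything consumed so far (plays the role of XXH64_digest). */
static uint32_t toy_digest(const toy_state_t *state)
{
    return state->acc;
}

/* One-shot helper built on the streaming calls. */
static uint32_t toy_hash(const void *data, size_t length)
{
    toy_state_t state;
    toy_reset(&state);
    toy_update(&state, data, length);
    return toy_digest(&state);
}
```

With this toy, calling `toy_update()` with `"foo"` and then `"bar"` yields the same digest as `toy_hash("foobar", 6)`. A real streaming hash needs more state than this, because it processes input in fixed-size blocks and has to buffer partial blocks between calls, which is one reason the state object is allocated and freed explicitly in the example above.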
Other programming languages
Beyond the C reference version, xxHash is also available in many programming languages, thanks to great contributors. They are listed here.
Branch Policy
- The "master" branch is considered stable, at all times.
- The "dev" branch is the one where all contributions must be merged before being promoted to master.
- If you plan to propose a patch, please commit into the "dev" branch, or its own feature branch. Direct commits to "master" are not permitted.