xxHash - Extremely fast hash algorithm
xxHash is an extremely fast hash algorithm, running at RAM speed limits. It successfully completes the SMHasher test suite, which evaluates the collision, dispersion and randomness qualities of hash functions. The code is highly portable, and hashes are identical on all platforms (little / big endian).
Benchmarks
The benchmark uses the SMHasher speed test, compiled with Visual Studio 2010 on a 32-bit Windows 7 machine. The reference system uses a Core 2 Duo @ 3 GHz.
| Name | Speed | Quality | Author |
|---|---|---|---|
| xxHash | 5.4 GB/s | 10 | Y.C. |
| MurmurHash 3a | 2.7 GB/s | 10 | Austin Appleby |
| SBox | 1.4 GB/s | 9 | Bret Mulvey |
| Lookup3 | 1.2 GB/s | 9 | Bob Jenkins |
| CityHash64 | 1.05 GB/s | 10 | Pike & Alakuijala |
| FNV | 0.55 GB/s | 5 | Fowler, Noll, Vo |
| CRC32 | 0.43 GB/s | 9 | |
| MD5-32 | 0.33 GB/s | 10 | Ronald L.Rivest |
| SHA1-32 | 0.28 GB/s | 10 | |
Q.Score (the Quality column) is a measure of the quality of the hash function. It depends on successfully passing the SMHasher test set. 10 is a perfect score. Algorithms with a score < 5 are not listed in this table.
A more recent version, XXH64, created thanks to Mathias Westerdahl, offers superior speed and dispersion on 64-bit systems. Note, however, that 32-bit applications will still run faster using the 32-bit version.
The figures below come from the SMHasher speed test, compiled using GCC 4.8.2 on Linux Mint 64-bit. The reference system uses a Core i5-3340M @ 2.7 GHz.
| Version | Speed on 64-bit | Speed on 32-bit |
|---|---|---|
| XXH64 | 13.8 GB/s | 1.9 GB/s |
| XXH32 | 6.8 GB/s | 6.0 GB/s |
This project also includes a command-line utility named xxhsum, offering features similar to md5sum, thanks to Takayuki Matsuoka's contributions.
License
The library files xxhash.c and xxhash.h are BSD licensed. The utility xxhsum is GPL licensed.
Build modifiers
The following macros can be set at compilation time; they modify xxhash behavior. They are all disabled by default.
- `XXH_INLINE_ALL`: Make all functions `inline`, with bodies directly included within `xxhash.h`. There is no need for an `xxhash.o` module in this case. Inlining functions is generally beneficial for speed on small keys. It's especially effective when key length is a compile-time constant, with observed performance improvement in the +200% range. See this article for details.
- `XXH_ACCEPT_NULL_INPUT_POINTER`: if set to `1`, when input is a null pointer, the xxhash result is the same as for a zero-length key (instead of a dereference segfault).
- `XXH_FORCE_MEMORY_ACCESS`: default method `0` uses a portable `memcpy()` notation. Method `1` uses a gcc-specific `packed` attribute, which can provide better performance for some targets. Method `2` forces unaligned reads, which is not standard compliant, but might sometimes be the only way to extract better performance.
- `XXH_CPU_LITTLE_ENDIAN`: by default, endianness is determined at compile time. It's possible to skip auto-detection and force the format to little-endian by setting this macro to `1`. Setting it to `0` forces big-endian.
- `XXH_FORCE_NATIVE_FORMAT`: on big-endian systems, use the native number representation. Breaks consistency with little-endian results.
- `XXH_PRIVATE_API`: same impact as `XXH_INLINE_ALL`. The name underlines that symbols will not be published on the library's public interface.
- `XXH_NAMESPACE`: prefix all symbols with the value of `XXH_NAMESPACE`. Useful to evade symbol naming collisions in case of multiple inclusions of the xxHash source code. Client applications can still use the regular function names; symbols are automatically translated through `xxhash.h`.
- `XXH_STATIC_LINKING_ONLY`: gives access to the state declaration for static allocation. Incompatible with dynamic linking, due to risks of ABI changes.
- `XXH_NO_LONG_LONG`: removes support for XXH64, for targets without 64-bit support.
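To make the memory-access and endianness options more concrete, here is a minimal sketch of what a portable `memcpy()`-based read (the idea behind method `0`) and a host-independent little-endian read might look like. This is not xxHash's actual code: the helper names `read32_native`, `is_little_endian`, `swap32` and `read32_le` are hypothetical, and the endianness probe runs at run time here, whereas the text above says xxHash decides at compile time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 32 bits from any address, aligned or not.
 * memcpy() is the portable notation; no pointer cast, no alignment assumption. */
static uint32_t read32_native(const void* p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical run-time probe: is the lowest-addressed byte the least significant? */
static int is_little_endian(void)
{
    const uint32_t one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);
    return first == 1;
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t x)
{
    return ((x << 24) & 0xff000000u) | ((x <<  8) & 0x00ff0000u) |
           ((x >>  8) & 0x0000ff00u) | ((x >> 24) & 0x000000ffu);
}

/* Interpret 4 input bytes as a little-endian number on any host, so that
 * code built on top of it sees the same values on every platform. */
static uint32_t read32_le(const void* p)
{
    const uint32_t v = read32_native(p);
    return is_little_endian() ? v : swap32(v);
}
```

Reading at an odd offset, for example `read32_le(buf + 1)`, is well defined with this approach, which is the point of the `memcpy()` notation; skipping the byte swap on big-endian hosts would be faster there but would give platform-dependent values, which is the trade-off `XXH_FORCE_NATIVE_FORMAT` describes.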
Example
Calling the xxhash 64-bit variant from a C program:

    #include "xxhash.h"

    unsigned long long calcul_hash(const void* buffer, size_t length)
    {
        unsigned long long const seed = 0;   /* or any other value */
        unsigned long long const hash = XXH64(buffer, length, seed);
        return hash;
    }
Using the streaming variant is more involved, but makes it possible to provide data in multiple rounds:

    #include <stdlib.h>   /* abort(), malloc(), free() */
    #include "xxhash.h"

    unsigned long long calcul_hash_streaming(someCustomType handler)
    {
        /* allocate the hash state */
        XXH64_state_t* const state = XXH64_createState();
        if (state==NULL) abort();

        size_t const bufferSize = SOME_VALUE;
        void* const buffer = malloc(bufferSize);
        if (buffer==NULL) abort();

        /* initialize the state with the chosen seed */
        unsigned long long const seed = 0;   /* or any other value */
        XXH_errorcode const resetResult = XXH64_reset(state, seed);
        if (resetResult == XXH_ERROR) abort();

        /* ... */
        while ( /* any condition */ ) {
            /* feed the state one buffer at a time */
            size_t const length = get_more_data(buffer, bufferSize, handler);   /* undescribed */
            XXH_errorcode const addResult = XXH64_update(state, buffer, length);
            if (addResult == XXH_ERROR) abort();
            /* ... */
        }

        /* ... */
        unsigned long long const hash = XXH64_digest(state);

        free(buffer);
        XXH64_freeState(state);

        return hash;
    }
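The reset / update / digest pattern can also be tried without linking against xxHash. The sketch below is an illustration only: it swaps in FNV-1a (the FNV family appears in the benchmark table above) as a stand-in hash, and every `demo_` name is hypothetical. It shows the property a streaming interface is expected to provide: feeding the input in several rounds gives the same result as hashing it in one call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in streaming state. This is 32-bit FNV-1a, NOT xxHash. */
typedef struct { uint32_t acc; } demo_state_t;

/* Start a new hash: load the FNV-1a offset basis. */
static void demo_reset(demo_state_t* s)
{
    assert(s != NULL);
    s->acc = 2166136261u;
}

/* Consume one chunk of input; may be called any number of times. */
static void demo_update(demo_state_t* s, const void* data, size_t len)
{
    const unsigned char* const p = (const unsigned char*)data;
    size_t i;
    assert(s != NULL);
    for (i = 0; i < len; i++) {
        s->acc ^= p[i];          /* mix in one byte */
        s->acc *= 16777619u;     /* multiply by the FNV prime (wraps mod 2^32) */
    }
}

/* Return the hash of everything consumed so far; the state stays usable. */
static uint32_t demo_digest(const demo_state_t* s)
{
    assert(s != NULL);
    return s->acc;
}

/* One-shot convenience wrapper built on the streaming calls. */
static uint32_t demo_oneshot(const void* data, size_t len)
{
    demo_state_t s;
    demo_reset(&s);
    demo_update(&s, data, len);
    return demo_digest(&s);
}
```

With this stand-in, calling `demo_update` with `"foo"` and then `"bar"` yields the same digest as `demo_oneshot("foobar", 6)`. The state here is a plain struct on the stack, which is simpler than the heap-allocated `XXH64_state_t` used above.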
Other programming languages
Beyond the C reference version, xxHash is also available in many programming languages, thanks to great contributors. They are listed here.
Branch Policy
- The "master" branch is considered stable, at all times.
- The "dev" branch is the one where all contributions must be merged before being promoted to master.
- If you plan to propose a patch, please commit into the "dev" branch, or its own feature branch. Direct commits to "master" are not permitted.