folly/GroupVarint.h

folly/GroupVarint.h is an implementation of variable-length encoding for 32- and 64-bit integers using the Group Varint encoding scheme as described in Jeff Dean's WSDM 2009 talk and in Information Retrieval: Implementing and Evaluating Search Engines.

Briefly, a group of four 32-bit integers is encoded as a sequence of variable length, between 5 and 17 bytes; the first byte encodes the length (in bytes) of each integer in the group. A group of five 64-bit integers is encoded as a sequence of variable length, between 7 and 42 bytes; the first two bytes encode the length (in bytes) of each integer in the group.
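To make the 32-bit layout concrete, here is a minimal, self-contained sketch of encoding and decoding one group. It is not folly's implementation: the names `byteLength`, `encodeGroup` and `decodeGroup` are hypothetical, and the sketch assumes that the key byte stores `length - 1` in two bits per value (first value in the lowest bits) and that values are written little-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bytes needed to store v (1 to 4).
inline std::size_t byteLength(std::uint32_t v) {
  if (v < (1u << 8)) return 1;
  if (v < (1u << 16)) return 2;
  if (v < (1u << 24)) return 3;
  return 4;
}

// Appends one encoded group of four values to out.
// Assumed layout: key byte first, 2 bits per value holding (length - 1),
// first value in the lowest bits; value bytes are little-endian.
inline void encodeGroup(const std::uint32_t (&vals)[4],
                        std::vector<std::uint8_t>& out) {
  const std::size_t keyPos = out.size();
  out.push_back(0);  // placeholder for the key byte
  std::uint8_t key = 0;
  for (int i = 0; i < 4; ++i) {
    const std::size_t len = byteLength(vals[i]);
    key |= static_cast<std::uint8_t>((len - 1) << (2 * i));
    for (std::size_t b = 0; b < len; ++b) {
      out.push_back(static_cast<std::uint8_t>(vals[i] >> (8 * b)));
    }
  }
  out[keyPos] = key;
  // A group is 1 key byte plus 4 to 16 value bytes, so 5 to 17 bytes total.
  assert(out.size() - keyPos >= 5 && out.size() - keyPos <= 17);
}

// Decodes one group starting at p; returns the number of bytes consumed.
// The caller must guarantee that a complete group is available at p.
inline std::size_t decodeGroup(const std::uint8_t* p,
                               std::uint32_t (&vals)[4]) {
  const std::uint8_t key = p[0];
  std::size_t pos = 1;
  for (int i = 0; i < 4; ++i) {
    const std::size_t len = ((key >> (2 * i)) & 3u) + 1;
    std::uint32_t v = 0;
    for (std::size_t b = 0; b < len; ++b) {
      v |= static_cast<std::uint32_t>(p[pos + b]) << (8 * b);
    }
    vals[i] = v;
    pos += len;
  }
  return pos;
}
```

Under these assumptions, the group {1, 300, 70000, 0xFFFFFFFF} takes 1 + 1 + 2 + 3 + 4 = 11 bytes. The 64-bit variant follows the same idea with five values and a two-byte key, which gives the 7 to 42 byte range quoted above.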

GroupVarint.h defines a few classes:

  • GroupVarint<T>, where T is uint32_t or uint64_t:

    Basic encoding / decoding interface, mainly aimed at encoding / decoding one group at a time.

  • GroupVarintEncoder<T, Output>, where T is uint32_t or uint64_t, and Output is a functor that accepts StringPiece objects as arguments:

    Streaming encoder: add values one at a time, and they will be flushed to the output one group at a time. Handles the case where the last group is incomplete (the number of integers to encode isn't a multiple of the group size).

  • GroupVarintDecoder<T>, where T is uint32_t or uint64_t:

    Streaming decoder: extract values one at a time. Handles the case where the last group is incomplete.
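The streaming pattern, including the incomplete last group, can be sketched without folly as follows. Everything here is hypothetical and for illustration only: `SketchEncoder` and `decodeAll` are not folly classes, the output callable takes a raw pointer and size as a stand-in for a StringPiece, and the partial-group handling (emit the key byte plus only the values actually present, and let the decoder stop when the bytes run out) is one possible approach rather than a description of what the header does. The byte layout assumed is two key bits per value holding `length - 1`, with little-endian value bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical streaming encoder for 32-bit values (not folly's class).
// Output is any callable taking (const std::uint8_t* data, std::size_t size).
template <class Output>
class SketchEncoder {
 public:
  explicit SketchEncoder(Output out) : out_(std::move(out)) {}

  // Buffers one value; emits a full group once four are pending.
  void add(std::uint32_t v) {
    assert(count_ < 4);
    pending_[count_++] = v;
    if (count_ == 4) flush();
  }

  // Emits the trailing partial group, if any.
  void finish() {
    if (count_ > 0) flush();
  }

 private:
  void flush() {
    std::uint8_t buf[17] = {0};  // 1 key byte + at most 4 * 4 value bytes
    std::size_t pos = 1;
    for (std::size_t i = 0; i < count_; ++i) {
      const std::uint32_t v = pending_[i];
      const std::size_t len =
          v < (1u << 8) ? 1 : v < (1u << 16) ? 2 : v < (1u << 24) ? 3 : 4;
      buf[0] |= static_cast<std::uint8_t>((len - 1) << (2 * i));
      for (std::size_t b = 0; b < len; ++b) {
        buf[pos++] = static_cast<std::uint8_t>(v >> (8 * b));
      }
    }
    // Missing slots keep key bits 00 and contribute no value bytes.
    out_(buf, pos);
    count_ = 0;
  }

  Output out_;
  std::uint32_t pending_[4] = {0, 0, 0, 0};
  std::size_t count_ = 0;
};

// Decodes a whole stream produced by SketchEncoder. A group that runs out
// of bytes early is treated as the incomplete last group.
inline std::vector<std::uint32_t> decodeAll(const std::uint8_t* p,
                                            std::size_t n) {
  std::vector<std::uint32_t> vals;
  std::size_t pos = 0;
  while (pos < n) {
    const std::uint8_t key = p[pos++];
    for (int i = 0; i < 4 && pos < n; ++i) {
      const std::size_t len = ((key >> (2 * i)) & 3u) + 1;
      if (pos + len > n) return vals;  // truncated input; stop decoding
      std::uint32_t v = 0;
      for (std::size_t b = 0; b < len; ++b) {
        v |= static_cast<std::uint32_t>(p[pos + b]) << (8 * b);
      }
      vals.push_back(v);
      pos += len;
    }
  }
  return vals;
}
```

A caller would construct the encoder with a callable that appends to a buffer, call `add` for each value, and call `finish` once at the end so that a trailing group of one to three values is not lost.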

The 32-bit implementation is significantly faster than the 64-bit implementation; on platforms supporting the SSSE3 instruction set, we use the PSHUFB instruction to speed up lookup, as described in SIMD-Based Decoding of Posting Lists (CIKM 2011).

For more details, see the header file folly/GroupVarint.h and the associated test file folly/test/GroupVarintTest.cpp.
