A HyperLogLog is a probabilistic data structure used in order to count unique things (technically this is referred to estimating the cardinality of a set). Usually counting unique items requires using an amount of memory proportional to the number of items you want to count, because you need to remember the elements you have already seen in the past in order to avoid counting them multiple times. However there is a set of algorithms that trade memory for precision: you end with an estimated measure with a standard error, in the case of the Redis implementation, which is less than 1%. The magic of this algorithm is that you no longer need to use an amount of memory proportional to the number of items counted, and instead can use a constant amount of memory! 12k bytes in the worst case, or a lot less if your HyperLogLog (We'll just call them HLL from now) has seen very few elements.

HLLs in Redis, while technically a different data structure, is encoded as a Redis string, so you can call GET to serialize a HLL, and SET to deserialize it back to the server.

Conceptually the HLL API is like using Sets to do the same task. You would SADD every observed element into a set, and would use SCARD to check the number of elements inside the set, which are unique since SADD will not re-add an existing element.

While you don't really add items into an HLL, because the data structure only contains a state that does not include actual elements, the API is the same:

  • Every time you see a new element, you add it to the count with PFADD.
  • Every time you want to retrieve the current approximation of the unique elements added with PFADD so far, you use the PFCOUNT.

127.0.0.1:6379> PFADD hll a b c d
(integer) 1
127.0.0.1:6379> PFCOUNT hll
(integer) 4

An example of use case for this data structure is counting unique queries performed by users in a search form every day.

Redis - HyperLogLogs的更多相关文章

  1. Java Spring mvc 操作 Redis 及 Redis 集群

    本文原创,转载请注明:http://www.cnblogs.com/fengzheng/p/5941953.html 关于 Redis 集群搭建可以参考我的另一篇文章 Redis集群搭建与简单使用 R ...

  2. redis.conf配置详细解析

    # redis 配置文件示例 # 当你需要为某个配置项指定内存大小的时候,必须要带上单位, # 通常的格式就是 1k 5gb 4m 等酱紫: # # 1k => 1000 bytes # 1kb ...

  3. 玩转Redis之Window安装使用(干货)

    距离上次定Gc.Db框架,好久没有更新博客了,今日没什么事,就打算就Redis写点东西. Redis是一个开源(BSD许可),内存存储的数据结构服务器,可用作数据库,高速缓存和消息队列代理.它支持字符 ...

  4. ASP.NET Core 使用 Redis 和 Protobuf 进行 Session 缓存

    前言 上篇博文介绍了怎么样在 asp.net core 中使用中间件,以及如何自定义中间件.项目中刚好也用到了Redis,所以本篇就介绍下怎么样在 asp.net core 中使用 Redis 进行资 ...

  5. python之redis和memcache操作

    Redis 教程 Redis是一个开源(BSD许可),内存存储的数据结构服务器,可用作数据库,高速缓存和消息队列代理.Redis 是完全开源免费的,遵守BSD协议,是一个高性能的key-value数据 ...

  6. Linux(Centos)之安装Redis及注意事项

    1.redis简单说明 a.在前面我简单的说过redis封装成共用类的实现,地址如下:http://www.cnblogs.com/hanyinglong/p/Redis.html. b.redis是 ...

  7. redis配置详解

    ##redis配置详解 # Redis configuration file example. # # Note that in order to read the configuration fil ...

  8. 初探Redis+Net在Windows环境下的使用

    Redis官网地址:https://redis.io/:Redis官方暂时不支持Windows环境,但是MicroSoft Open Tech group开发了一个Windows平台下运行的版本. R ...

  9. redis.conf配置详细翻译解析

    # redis 配置文件示例 # 当你需要为某个配置项指定内存大小的时候,必须要带上单位, # 通常的格式就是 1k 5gb 4m 等酱紫: # # 1k => 1000 bytes # 1kb ...

随机推荐

  1. UVA 11354 Bond(MST + LCA)

    n<=50000, m<=100000的无向图,对于Q<=50000个询问,每次求q->p的瓶颈路. 其实求瓶颈路数组maxcost[u][v]有用邻接矩阵prim的方法.但是 ...

  2. ssi整合报错org.apache.struts2.convention.ConventionsServiceImpl.determineResultPath(ConventionsServiceImpl.java:100)

    java.lang.RuntimeException: Invalid action class configuration that references an unknown class name ...

  3. 关于三目运算符与if语句的效率与洛谷P2704题解

    题目描述 司令部的将军们打算在N*M的网格地图上部署他们的炮兵部队.一个N*M的地图由N行M列组成,地图的每一格可能是山地(用“H” 表示),也可能是平原(用“P”表示),如下图.在每一格平原地形上最 ...

  4. apache win openssl

    Rubayat Hasan Software Development, Music, Web Design, life, thoughts…   Home Portfolio Projects Con ...

  5. Android中图表AChartEngine学习使用与例子

    很多时候项目中我们需要对一些统计数据进行绘制表格,更多直观查看报表分析结果.基本有以下几种方法: 1:可以进行android  api进行draw这样的话,效率比较低 2:使用开源绘表引擎,这样效率比 ...

  6. Enhancing the Scalability of Memcached

    原文地址: https://software.intel.com/en-us/articles/enhancing-the-scalability-of-memcached-0 1 Introduct ...

  7. NEUOJ 1117: Ready to declare(单调队列)

    1117: Ready to declare 时间限制: 1 Sec  内存限制: 128 MB 提交: 358  解决: 41 [提交][状态][pid=1117" style=" ...

  8. Gym 100637F F. The Pool for Lucky Ones 暴力

    F. The Pool for Lucky Ones Time Limit: 20 Sec Memory Limit: 256 MB 题目连接 http://codeforces.com/gym/10 ...

  9. Codeforces Round #315 (Div. 1) A. Primes or Palindromes? 暴力

    A. Primes or Palindromes?Time Limit: 20 Sec Memory Limit: 256 MB 题目连接 http://poj.org/problem?id=3261 ...

  10. hdu1428漫步校园( 最短路+BFS(优先队列)+记忆化搜索(DFS))

    Problem Description LL最近沉迷于AC不能自拔,每天寝室.机房两点一线.由于长时间坐在电脑边,缺乏运动.他决定充分利用每次从寝室到机房的时间,在校园里散散步.整个HDU校园呈方形布 ...