https://github.com/google/snappy

Introduction

【速度第一，压缩比适宜】

【favors speed over compression ratio】

Snappy is a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger. (For more information, see "Performance", below.)

Snappy has the following properties:

Fast: Compression speeds at 250 MB/sec and beyond, with no assembler code. See "Performance" below.
Stable: Over the last few years, Snappy has compressed and decompressed petabytes of data in Google's production environment. The Snappy bitstream format is stable and will not change between versions.
Robust: The Snappy decompressor is designed not to crash in the face of corrupted or malicious input.
Free and open source software: Snappy is licensed under a BSD-type license. For more information, see the included COPYING file.

Snappy has previously been called "Zippy" in some Google presentations and the like.

Performance

【64-bit】

【i7 c,d = 250,500 MB/sec】

【compress ratio plain text,HTML,JPEGs PNGS = 1.5-1.7x,2-4x,1.0x】

Snappy is intended to be fast. On a single core of a Core i7 processor in 64-bit mode, it compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec or more. (These numbers are for the slowest inputs in our benchmark suite; others are much faster.) In our tests, Snappy usually is faster than algorithms in the same class (e.g. LZO, LZF, QuickLZ, etc.) while achieving comparable compression ratios.

Typical compression ratios (based on the benchmark suite) are about 1.5-1.7x for plain text, about 2-4x for HTML, and of course 1.0x for JPEGs, PNGs and other already-compressed data. Similar numbers for zlib in its fastest mode are 2.6-2.8x, 3-7x and 1.0x, respectively. More sophisticated algorithms are capable of achieving yet higher compression rates, although usually at the expense of speed. Of course, compression ratio will vary significantly with the input.

Although Snappy should be fairly portable, it is primarily optimized for 64-bit x86-compatible processors, and may run slower in other environments. In particular:

Snappy uses 64-bit operations in several places to process more data at once than would otherwise be possible.
Snappy assumes unaligned 32- and 64-bit loads and stores are cheap. On some platforms, these must be emulated with single-byte loads and stores, which is much slower.
Snappy assumes little-endian throughout, and needs to byte-swap data in several places if running on a big-endian platform.

https://google.github.io/snappy/

bmdiff snappy lzw gzip的更多相关文章

spark-sql性能测试
一,测试环境 1) 硬件环境完全相同: 包括:cpu/内存/网络/磁盘Io/机器数量等 2)软件环境: 相同数据 ...
HBase应用开发回顾与总结系列之一：概述HBase设计规范
概述笔者本人接触研究HBase也有半年之久了,虽说不上深入和系统,但至少算是比较沉迷.作为部门里大数据技术的探路者,笔者还要承担起技术传播的职责,所以在摸索研究的过程中总是不断地进行总结和测试, ...
kafka概念
一.结构与概念解释 1.基础概念 topics: kafka通过topics维护各类信息. producer:发布消息到Kafka topic的进程. consumer:订阅kafka topic进程 ...
大数据查询——HBase读写设计与实践
导语:本文介绍的项目主要解决 check 和 opinion2 张历史数据表(历史数据是指当业务发生过程中的完整中间流程和结果数据)的在线查询.原实现基于 Oracle 提供存储查询服务,随着数据量的 ...
Kafka生产者-向Kafka中写入数据
(1)生产者概览 (1)不同的应用场景对消息有不同的需求,即是否允许消息丢失.重复.延迟以及吞吐量的要求.不同场景对Kafka生产者的API使用和配置会有直接的影响. 例子1:信用卡事务处理系统,不允 ...
Hbase中HMaster作用
HMaster在功能上主要负责Table表和HRegion的管理工作,具体包括: 1.管理用户对Table表的增.删.改.查操作: 2.管理HRegion服务器的负载均衡,调整HRegion分布: 3 ...
Kafka 详解（三）------Producer生产者
在第一篇博客我们了解到一个kafka系统,通常是生产者Producer 将消息发送到 Broker,然后消费者 Consumer 去 Broker 获取,那么本篇博客我们来介绍什么是生产者Produc ...
Kafka权威指南读书笔记之（五）深入Kafka
集中讨论以下3 个有意思的话题 :• Kafka 如何进行复制:• Kafka 如何处理来自生产者和消费者的请求 :• Kafka 的存储细节,比如文件格式和索引. 集群成员关系 Kafka 使用 Z ...
Kafka权威指南读书笔记之（三）Kafka 生产者一一向 Kafka 写入数据
不管是把 Kafka 作为消息队列.消息总线还是数据存储平台来使用 ,总是需要有一个可以往 Kafka 写入数据的生产者和一个从 Kafka 读取数据的消费者,或者一个兼具两种角色的应用程序. 开发者 ...

随机推荐

Add and Search Word - Data structure design - LeetCode
Design a data structure that supports the following two operations: void addWord(word) bool search(w ...
SQL SERVER 技术博客外文
https://www.sqlskills.com/blogs/paul/capturing-io-latencies-period-time/ http://www.sqlskills.com/bl ...
iOS网络交互数据格式解析之json
作为一种轻量级的数据交换格式,json正在逐步取代xml,成为网络数据的通用格式.从ios5开始,apple提供了对json的原生支持,但为了兼容以前的ios版本,我们仍然需要使用第三方库来解析常用 ...
WEB API 返回类型设置为JSON 【转】
http://blog.sina.com.cn/s/blog_60ba16ed0102uzc7.html web api写api接口时默认返回的是把你的对象序列化后以XML形式返回,那么怎样才能让其返 ...
HDFS冗余数据块的自动删除
HDFS冗余数据块的自动删除在日常维护hadoop集群的过程中发现这样一种情况: 某个节点由于网络故障或者DataNode进程死亡,被NameNode判定为死亡,HDFS马上自动开始数据块的容错拷贝 ...
Android4.4电池管理
一.概述 Android4.4的电池管理功能用于管理电池的充.放电功能. 整个电池管理的部分包含Linux电池驱动.Android电池服务.电池属性和參数.电池曲线优化四个部分. Linux电池驱动用 ...
RPi Cam v2 之一：基础及牛刀小试
前言原创文章,转载引用务必注明链接,水平有限,如有疏漏,欢迎指正. 本文使用markdown写成,为获得更好的阅读体验,可以访问我的博客. 1.unboxing & comparison 包 ...
路飞学城Python爬虫课第一章笔记
前言原创文章,转载引用务必注明链接.水平有限,如有疏漏,欢迎指正. 之前看阮一峰的博客文章,介绍到路飞学城爬虫课程限免,看了眼内容还不错,就兴冲冲报了名,99块钱满足以下条件会返还并送书送视频. 缴 ...
总结自己使用shell命令行经常使用到的8个小技巧
原创blog,转载请注明出处 Shell是命令解释器 [root@localhost ~]# cat /etc/shells 查看本系统共支持哪些shell 1 tab 命令补全这个差点儿每次都能用 ...
C#比較对象的相等性
对于相等的机制全部不同,这取决于比較的是引用类型还是值类型.以下分别介绍引用类型和值类型的相等性. 1.比較引用类型的相等性 System.Object定义了三种不同的方法,来比較对象的相等性:Ref ...

bmdiff snappy lzw gzip

Introduction

Performance

bmdiff snappy lzw gzip的更多相关文章

随机推荐

热门专题