Serialization Wars: A Benchmark of Mainstream Serialization Frameworks

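There is a serialization benchmark on GitHub that many articles cite. For the sake of completeness, though, that project's code is fairly complex. For my own learning I wrote a simple benchmark class, which also serves as a summary of how today's mainstream serialization frameworks are used.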

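1. The Serialization Wars

By the format of the serialized data, mainstream serialization frameworks fall into four broad categories: JSON, binary, XML, and RPC. At a higher level, JSON and XML both count as text formats. The RPC category is listed separately because those frameworks do more than serialization: they usually also provide the underlying RPC layer and infrastructure such as cross-language code generation. Specifically, this test covers the following: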

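  • JSON

  • Binary
    • The veteran Hessian (which I used to like a lot)
    • The comprehensive and powerful FST
    • The rising star Kryo
  • XML
    • StAX (Streaming API for XML)
    • ThoughtWorks' XStream
  • RPC
    • Protobuf: I cut a corner here. Because both Protobuf and Thrift have to be installed and compiled, I used Protostuff instead, which can obtain an object's schema at run time and so avoids the extra installation and the hand-written protocol definition files (Protostuff is really great!).
    • Thrift, Apache Avro: as above, both need precompilation.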

Why does Jackson-JSON call BSON the “smile format” of JSON?

BSON and Smile are two distinct binary formats. They are related in that they are both based on the logical format of JSON (i.e., key-value objects) but they are distinct in that they write incompatible binary formats (you can neither directly read Smile as BSON nor vice-versa). They also have different incompatible features (e.g., BSON defines a date type, while Smile does not as far as I can tell.) BSON is the binary serialization used by MongoDB for network transfer and disk serialization. Smile is the binary JSON format used by the Jackson project.

        <!-- JSON BEGIN -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.5.4</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-afterburner</artifactId>
<version>2.5.4</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-scala_2.10</artifactId>
<version>2.5.3</version>
</dependency>
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.3.1</version>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.6</version>
</dependency>
<dependency>
<groupId>io.fastjson</groupId>
<artifactId>boon</artifactId>
<version>0.33</version>
</dependency>
<!-- JSON END -->
<!-- JSON-like BEGIN -->
<dependency>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-smile</artifactId>
<version>2.5.4</version>
</dependency>
<dependency>
<groupId>org.msgpack</groupId>
<artifactId>msgpack</artifactId>
<version>0.6.12</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>bson</artifactId>
<version>3.0.2</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-yaml</artifactId>
<version>2.5.4</version>
</dependency>
<!-- JSON-like END -->
<!-- Binary BEGIN -->
<dependency>
<groupId>com.caucho</groupId>
<artifactId>hessian</artifactId>
<version>4.0.38</version>
</dependency>
<dependency>
<groupId>de.ruedigermoeller</groupId>
<artifactId>fst</artifactId>
<version>2.31</version>
</dependency>
<dependency>
<groupId>com.esotericsoftware</groupId>
<artifactId>kryo</artifactId>
<version>3.0.2</version>
</dependency>
<!-- Binary END -->
<!-- XML BEGIN -->
<dependency>
<groupId>com.thoughtworks.xstream</groupId>
<artifactId>xstream</artifactId>
<version>1.4.8</version>
</dependency>
<dependency>
<groupId>com.fasterxml</groupId>
<artifactId>aalto-xml</artifactId>
<version>0.9.11</version>
</dependency>
<!-- XML END -->
<!-- RPC BEGIN -->
<dependency>
<groupId>io.protostuff</groupId>
<artifactId>protostuff-core</artifactId>
<version>1.3.5</version>
</dependency>
<dependency>
<groupId>io.protostuff</groupId>
<artifactId>protostuff-runtime</artifactId>
<version>1.3.5</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>1.7.7</version>
</dependency>
<!-- RPC END -->

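2. Benchmark Code

2.1 Test Subjects

Each serialization framework is wrapped in an implementation of the Serializer interface, and together these form the set of test subjects. The test focuses on three metrics: serialized size, serialization time, and deserialization time.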

public class SerializerBenchmark {

    private static final int WARMUP_COUNT = 100;
    private static final int TEST_COUNT = 1000 * 1000;

    /** Column index */
    private static final int COL_SER_SIZE = 0;
    private static final int COL_SER_COST = 1;
    private static final int COL_DER_COST = 2;

    /** Dictionary for random generation */
    private static final char[] ALPHA =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray();

    public static void main(String[] args) throws Exception {
        Serializer[] serializers =
        {
            // ============= JSON ==============
            new Serializer<Person>() {
                private ObjectMapper mapper = new ObjectMapper();

                @Override
                public String name() {
                    return "Jackson";
                }

                @Override
                public byte[] serialize(Person obj) throws Exception {
                    return mapper.writeValueAsBytes(obj);
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) throws Exception {
                    return mapper.readValue(data, type);
                }
            },
            new Serializer<Person>() {
                private Gson gson = new GsonBuilder().create();

                @Override
                public String name() {
                    return "Gson";
                }

                @Override
                public byte[] serialize(Person obj) {
                    // Explicit charset (java.nio.charset.StandardCharsets) instead of the platform default
                    return gson.toJson(obj).getBytes(StandardCharsets.UTF_8);
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) {
                    return gson.fromJson(new String(data, StandardCharsets.UTF_8), type);
                }
            },
            new Serializer<Person>() {
                @Override
                public String name() {
                    return "FastJSON";
                }

                @Override
                public byte[] serialize(Person obj) {
                    return JSON.toJSONBytes(obj);
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) {
                    return JSON.parseObject(data, type);
                }
            },
            // ============= JSON-like ==============
            new Serializer<Person>() {
                private ObjectMapper mapper = new ObjectMapper(new SmileFactory());

                @Override
                public String name() {
                    return "Jackson-smile";
                }

                @Override
                public byte[] serialize(Person obj) throws Exception {
                    return mapper.writeValueAsBytes(obj);
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) throws Exception {
                    return mapper.readValue(data, type);
                }
            },
            new Serializer<Person>() {
                private ObjectMapper mapper = new ObjectMapper(new SmileFactory());
                {
                    mapper.registerModule(new AfterburnerModule());
                }

                @Override
                public String name() {
                    return "Jackson-smile-afterburner";
                }

                @Override
                public byte[] serialize(Person obj) throws Exception {
                    return mapper.writeValueAsBytes(obj);
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) throws Exception {
                    return mapper.readValue(data, type);
                }
            },
            new Serializer<Person>() {
                private ObjectMapper mapper = new ObjectMapper(new SmileFactory());
                {
                    mapper.registerModule(new DefaultScalaModule());
                }

                @Override
                public String name() {
                    return "Jackson-smile-scala";
                }

                @Override
                public byte[] serialize(Person obj) throws Exception {
                    return mapper.writeValueAsBytes(obj);
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) throws Exception {
                    return mapper.readValue(data, type);
                }
            },
            new Serializer<Person>() {
                private ObjectMapper mapper = new ObjectMapper(new YAMLFactory());

                @Override
                public String name() {
                    return "Jackson-yaml";
                }

                @Override
                public byte[] serialize(Person obj) throws Exception {
                    return mapper.writeValueAsBytes(obj);
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) throws Exception {
                    return mapper.readValue(data, type);
                }
            },
            new Serializer<Person>() {
                private MessagePack msgpack = new MessagePack();
                {
                    msgpack.register(Person.class);
                }

                @Override
                public String name() {
                    return "MessagePack";
                }

                @Override
                public byte[] serialize(Person obj) throws Exception {
                    return msgpack.write(obj);
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) throws Exception {
                    // Typed parameter instead of the raw Class used originally
                    return msgpack.read(data, type);
                }
            },
            // ============= Binary ==============
            new Serializer<Person>() {
                private Schema<Person> schema = RuntimeSchema.getSchema(Person.class);
                private LinkedBuffer buffer = LinkedBuffer.allocate();

                @Override
                public String name() {
                    return "Protostuff";
                }

                @Override
                public byte[] serialize(Person obj) {
                    byte[] data = ProtobufIOUtil.toByteArray(obj, schema, buffer);
                    buffer.clear();
                    return data;
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) {
                    Person obj = new Person();
                    ProtobufIOUtil.mergeFrom(data, obj, schema);
                    return obj;
                }
            },
            new Serializer<Person>() {
                @Override
                public String name() {
                    return "Hessian";
                }

                @Override
                public byte[] serialize(Person obj) throws Exception {
                    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                    Hessian2Output output = new Hessian2Output(bytes);
                    output.writeObject(obj);
                    output.close(); // flush to avoid EOF error
                    return bytes.toByteArray();
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) throws Exception {
                    Hessian2Input input = new Hessian2Input(new ByteArrayInputStream(data));
                    return (Person) input.readObject();
                }
            },
            new Serializer<Person>() {
                private FSTObjectInput input = new FSTObjectInput();
                private FSTObjectOutput output = new FSTObjectOutput();

                @Override
                public String name() {
                    return "FST";
                }

                @Override
                public byte[] serialize(Person obj) throws Exception {
                    output.resetForReUse();
                    output.writeObject(obj);
                    return output.getCopyOfWrittenBuffer();
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) throws Exception {
                    input.resetForReuseUseArray(data);
                    return (Person) input.readObject();
                }
            },
            new Serializer<Person>() {
                private Kryo kryo = new Kryo();
                {
                    kryo.setReferences(false);
                    kryo.setRegistrationRequired(true);
                    kryo.register(Person.class);
                }
                private byte[] buffer = new byte[512];
                private Output output = new Output(buffer, -1);
                private Input input = new Input(buffer);

                @Override
                public String name() {
                    return "Kryo";
                }

                @Override
                public byte[] serialize(Person obj) {
                    output.setBuffer(buffer, -1); // reset
                    kryo.writeObject(output, obj);
                    return output.toBytes();
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) {
                    input.setBuffer(data);
                    return kryo.readObject(input, type);
                }
            },
            new Serializer<Person>() {
                @Override
                public String name() {
                    return "JDK Built-in";
                }

                @Override
                public byte[] serialize(Person obj) throws Exception {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    new ObjectOutputStream(out).writeObject(obj);
                    return out.toByteArray();
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) throws Exception {
                    return (Person) new ObjectInputStream(new ByteArrayInputStream(data)).readObject();
                }
            },
            // ============= XML ==============
            new Serializer<Person>() {
                private XStream xstream = new XStream();

                @Override
                public String name() {
                    return "XStream";
                }

                @Override
                public byte[] serialize(Person obj) throws Exception {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    xstream.toXML(obj, out);
                    return out.toByteArray();
                }

                @Override
                public Person deserialize(byte[] data, Class<Person> type) throws Exception {
                    return (Person) xstream.fromXML(new ByteArrayInputStream(data));
                }
            },
        };

        // Sheet
        int[] testCase = { 10, 100, 1000 };
        String[] sheetNames = new String[testCase.length];
        for (int i = 0; i < sheetNames.length; i++) {
            sheetNames[i] = "Size=" + testCase[i];
        }

        // Row
        String[] rowNames = new String[serializers.length];
        for (int i = 0; i < rowNames.length; i++) {
            rowNames[i] = serializers[i].name();
        }

        // Column
        String[] colNames = new String[3];
        colNames[0] = "Size";
        colNames[1] = "Ser";
        colNames[2] = "Der";

        Reporter reporter = new Reporter(sheetNames, rowNames, colNames);
        for (int i = 0; i < testCase.length; i++) {
            int length = testCase[i];
            System.out.printf("===== Round [%d]: %d =====\n", i, length);

            for (int j = 0; j < serializers.length; j++) {
                testSerializer(reporter, length, i, j, serializers[j]);
            }
        }
        System.out.println(reporter.generateFinalReport());
    }
    ...
}
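
The Serializer interface itself is not shown in this post. Judging from the anonymous classes above, it presumably declares name(), serialize() and deserialize(). The following is only a sketch of that inferred contract, not the original source; the JdkSerializer class is a hypothetical name added here so the contract can be exercised with nothing but the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Assumed shape of the Serializer contract, inferred from how it is used above. */
interface Serializer<T> {
    /** Row label used in the report. */
    String name();

    byte[] serialize(T obj) throws Exception;

    T deserialize(byte[] data, Class<T> type) throws Exception;
}

/** Hypothetical JDK-only implementation, here just to show a round trip. */
final class JdkSerializer<T extends Serializable> implements Serializer<T> {
    @Override
    public String name() {
        return "JDK Built-in";
    }

    @Override
    public byte[] serialize(T obj) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the stream flushes any buffered data into the byte array
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    @Override
    public T deserialize(byte[] data, Class<T> type) throws Exception {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return type.cast(in.readObject());
        }
    }
}
```

Passing the Class<T> token to deserialize() is what lets frameworks such as Jackson or Gson, which need the target type, share one interface with frameworks such as Hessian that embed type information in the payload.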

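2.2 Test Runner

Before each round, the runner warms up and then triggers a GC, to keep JIT compilation and garbage collection from skewing the measurements. The warm-up also verifies that serialization and deserialization round-trip correctly.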

    private static void testSerializer(Reporter reporter,
                                       int length,
                                       int sheet,
                                       int row,
                                       Serializer<Person> serializer)
            throws Exception {

        System.out.println("===== " + serializer.name() + " =====");

        // 1. Warm up and validate
        System.out.println("Pre-warmup & Check correctness...");
        Person p1 = newPerson(length);
        for (int i = 0; i < WARMUP_COUNT; i++) {
            byte[] bytes = serializer.serialize(p1);
            Person p2 = serializer.deserialize(bytes, Person.class);
            if (!p1.equals(p2)) {
                throw new IllegalStateException(p1 + " not equals to " + p2);
            }
        }
        int serSize = serializer.serialize(p1).length;
        System.out.printf("%s serialization size[%d]\n", serializer.name(), serSize);
        reporter.report(sheet, row, COL_SER_SIZE, serSize);
        doGc();

        // 2. Serialization
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < TEST_COUNT; i++) {
            serializer.serialize(p1);
        }
        long serCostTime = System.currentTimeMillis() - startTime;
        System.out.printf("%s serialization benchmark[%d]\n", serializer.name(), serCostTime);
        reporter.report(sheet, row, COL_SER_COST, serCostTime);

        // Warm up again
        for (int i = 0; i < WARMUP_COUNT; i++) {
            byte[] bytes = serializer.serialize(p1);
            serializer.deserialize(bytes, Person.class);
        }
        doGc();

        // 3. Deserialization
        byte[] bytes = serializer.serialize(p1);
        startTime = System.currentTimeMillis();
        for (int i = 0; i < TEST_COUNT; i++) {
            serializer.deserialize(bytes, Person.class);
        }
        long derCostTime = System.currentTimeMillis() - startTime;
        System.out.printf("%s de-serialization benchmark[%d]\n", serializer.name(), derCostTime);
        reporter.report(sheet, row, COL_DER_COST, derCostTime);
        System.out.println();
    }
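
The runner relies on helpers such as doGc() that are elided from the post. The sketch below shows one way such helpers might look; it is an assumption, not the author's code. BenchmarkSupport, randomAlpha() and timeNanos() are hypothetical names, the 200 ms pause is an arbitrary choice, and timeNanos() swaps in System.nanoTime() (a monotonic clock) where the runner above uses System.currentTimeMillis().

```java
import java.util.Random;

/** Hypothetical helpers; the originals are not shown in the post. */
final class BenchmarkSupport {
    /** Same dictionary as the ALPHA constant in SerializerBenchmark. */
    private static final char[] ALPHA =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray();

    private BenchmarkSupport() {}

    /** Builds a random string of the given length from the ALPHA dictionary. */
    static String randomAlpha(int length, Random random) {
        char[] chars = new char[length];
        for (int i = 0; i < length; i++) {
            chars[i] = ALPHA[random.nextInt(ALPHA.length)];
        }
        return new String(chars);
    }

    /** Requests a collection and pauses briefly; System.gc() is only a hint to the JVM. */
    static void doGc() throws InterruptedException {
        System.gc();
        Thread.sleep(200L);
    }

    /** Runs the task count times and returns the elapsed time in nanoseconds. */
    static long timeNanos(int count, Runnable task) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }
}
```

A hand-rolled loop like this is only a rough measurement; a harness such as JMH handles warm-up and dead-code elimination more rigorously.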

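3. Test Report

3.1 Report Generation

I cut a small corner here and laid out the report with the padding methods (center() and rightPad()) of StringUtils from Apache Commons Lang.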

    static class Reporter {
        private final String[] sheetNames;
        private final String[] rowNames;
        private final String[] colNames;
        private final long[][][] table;

        Reporter(String[] sheetNames,
                 String[] rowNames,
                 String[] colNames) {
            this.sheetNames = sheetNames;
            this.rowNames = rowNames;
            this.colNames = colNames;
            this.table = new long[sheetNames.length]
                                 [rowNames.length]
                                 [colNames.length];
        }

        public void report(int sheet, int row, int col, long val) {
            table[sheet][row][col] = val;
        }

        // center() and rightPad() are static imports from Commons Lang's StringUtils
        public String generateFinalReport() {
            StringBuilder report = new StringBuilder();
            for (int i = 0; i < table.length; i++) {
                report.append(center(sheetNames[i], 50, '*'))
                      .append("\n");
                // 1. Header
                final int width0 = 30;
                final int width1 = 10;
                report.append(rightPad("", width0));
                for (String colName : colNames) {
                    report.append(rightPad(colName, width1));
                }
                report.append("\n");

                // 2. Row
                for (int j = 0; j < table[i].length; j++) {
                    report.append(rightPad(rowNames[j], width0));
                    for (int k = 0; k < table[i][j].length; k++) {
                        report.append(rightPad(
                                String.valueOf(table[i][j][k]), width1));
                    }
                    report.append("\n");
                }
                report.append("\n");
            }
            return report.toString();
        }
    }
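
If Commons Lang is not on the classpath, the two padding helpers can be approximated with the JDK alone. This is a rough sketch: the Pad class is a hypothetical name, and it is meant to mimic StringUtils.rightPad() and StringUtils.center() as I understand them (when centering, any odd leftover pad character goes on the right), not to reproduce the library exactly.

```java
/** Hypothetical JDK-only stand-ins for the Commons Lang padding helpers. */
final class Pad {
    private Pad() {}

    /** Pads with spaces on the right up to width; longer strings are returned unchanged. */
    static String rightPad(String s, int width) {
        return rightPad(s, width, ' ');
    }

    static String rightPad(String s, int width, char padChar) {
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < width) {
            sb.append(padChar);
        }
        return sb.toString();
    }

    /** Centers s within width: half of the padding (rounded down) on the left, the rest on the right. */
    static String center(String s, int width, char padChar) {
        int pads = width - s.length();
        if (pads <= 0) {
            return s;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < pads / 2; i++) {
            sb.append(padChar);
        }
        sb.append(s);
        return rightPad(sb.toString(), width, padChar);
    }
}
```

With these widths, each report row is a 30-character name column followed by three 10-character value columns, under a 50-character sheet banner.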

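3.2 Test Results

The results can be summarized as follows (Size is the serialized size in bytes; Ser and Der are the total milliseconds for 1,000,000 calls):

  • Kryo produces the smallest output, followed by MessagePack and Protostuff (Protobuf).
  • Protostuff performs very well at every data length.
  • Among the JSON and JSON-like frameworks, the combination of Jackson, the Smile format, and the Afterburner module performs best.
  • XStream is surprisingly slow. I remembered XStream as being fairly fast; maybe some tuning option is not configured?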

*********************Size=10**********************
                              Size      Ser       Der
Jackson                       39        602       758
Gson                          38        1204      1181
FastJSON                      38        573       608
Jackson-smile                 35        415       465
Jackson-smile-afterburner     35        305       377
Jackson-smile-scala           34        522       590
Jackson-yaml                  39        4233      5638
MessagePack                   15        891       1075
Protostuff                    17        148       130
Hessian                       84        2459      1233
FST                           73        334       481
Kryo                          13        98        117
JDK Built-in                  138       1462      4526
XStream                       169       6088      13007

*********************Size=100*********************
                              Size      Ser       Der
Jackson                       129       403       565
Gson                          128       1056      1248
FastJSON                      129       522       571
Jackson-smile                 126       426       472
Jackson-smile-afterburner     126       454       371
Jackson-smile-scala           126       452       639
Jackson-yaml                  129       5250      5330
MessagePack                   108       948       976
Protostuff                    107       172       192
Hessian                       176       2528      1513
FST                           163       288       470
Kryo                          105       440       134
JDK Built-in                  228       1332      4559
XStream                       259       5913      12797

********************Size=1000*********************
                              Size      Ser       Der
Jackson                       1029      1412      1411
Gson                          1029      4614      3855
FastJSON                      1029      2476      2011
Jackson-smile                 1026      1052      1343
Jackson-smile-afterburner     1025      1105      1232
Jackson-smile-scala           1025      1058      1452
Jackson-yaml                  1029      18983     13065
MessagePack                   1008      2101      2010
Protostuff                    1008      1172      838
Hessian                       1075      4358      6587
FST                           1063      1083      1567
Kryo                          1005      2675      921
JDK Built-in                  1128      2502      8537
XStream                       1158      10633     16981
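
Because each Ser and Der figure is the total wall-clock time in milliseconds for TEST_COUNT = 1,000,000 calls, it can be converted into a per-call cost. The small sketch below does that arithmetic; the PerOp class and its method names are hypothetical and only illustrate the conversion.

```java
/** Hypothetical helper that converts the report's totals into per-call figures. */
final class PerOp {
    private PerOp() {}

    /** Average nanoseconds per call, given total milliseconds for ops calls. */
    static double nanosPerOp(long totalMillis, long ops) {
        return totalMillis * 1_000_000.0 / ops;
    }

    /** Average calls per second, given total milliseconds for ops calls. */
    static double opsPerSecond(long totalMillis, long ops) {
        return ops * 1000.0 / totalMillis;
    }

    public static void main(String[] args) {
        long ops = 1_000_000L; // TEST_COUNT in the benchmark
        // Kryo serialization at Size=10 took 98 ms in total
        System.out.printf("Kryo ser:    %.0f ns/op%n", nanosPerOp(98, ops));
        // XStream deserialization at Size=10 took 13007 ms in total
        System.out.printf("XStream der: %.0f ns/op%n", nanosPerOp(13007, ops));
    }
}
```

With one million calls, the millisecond totals in the tables happen to read directly as nanoseconds per call.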
