In this article we define a relatively complex data structure and analyze its serialized binary file directly.

The proto file

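Write addressbook.proto, lightly modified from the official example: a repeated float field is added so that the storage of floating-point numbers can be analyzed.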

syntax = "proto2";

package tutorial;

message Person {

  required string name = 1;

  required int32 id = 2;

  optional string email = 3;

  enum PhoneType {

    MOBILE = 0;

    HOME = 1;

    WORK = 2;

  }

  message PhoneNumber {

    required string number = 1;

    optional PhoneType type = 2 [default = HOME];

  }

  repeated PhoneNumber phones = 4;

  repeated float weight_recent_months = 100 [packed = true];

}

message AddressBook {

  repeated Person people = 1;

}

Generate the encoding and decoding files, addressbook.pb.cc and addressbook.pb.h.

protoc.exe addressbook.proto --cpp_out=.

Serialization

Write the following code to serialize the address_book object and save it to the binary file address_book.bin.

#include <fstream>

#include "addressbook.pb.h"

using namespace std;

int main()

{

    tutorial::AddressBook address_book;

    tutorial::Person* person = address_book.add_people();

    person->set_id(1);

    person->set_name("Jack");

    person->set_email("Jack@qq.com");

    tutorial::Person::PhoneNumber* phone_number = person->add_phones();

    phone_number->set_number("123456");

    phone_number->set_type(tutorial::Person::HOME);

    phone_number = person->add_phones();

    phone_number->set_number("234567");

    phone_number->set_type(tutorial::Person::MOBILE);

    person->add_weight_recent_months(50);

    person->add_weight_recent_months(52);

    person->add_weight_recent_months(54);

    fstream fw("./address_book.bin", ios::out | ios::binary);

    address_book.SerializePartialToOstream(&fw);

    fw.close();

    return 0;

}

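The binary file address_book.bin is 62 bytes long in total. Its contents, as annotated byte by byte in the next section, are:

0a 3c 0a 04 4a 61 63 6b 10 01 1a 0b 4a 61 63 6b

40 71 71 2e 63 6f 6d 22 0a 0a 06 31 32 33 34 35

36 10 01 22 0a 0a 06 32 33 34 35 36 37 10 00 a2

06 0c 00 00 48 42 00 00 50 42 00 00 58 42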
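To inspect the raw bytes without a separate hex editor, a small helper along the following lines could be used. This is only a sketch: `read_file` and `hex_dump` are hypothetical names introduced here, not part of protobuf, and the 16-bytes-per-line layout is an arbitrary choice.

```cpp
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: read a whole file in binary mode.
// Returns an empty string if the file cannot be opened.
std::string read_file(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: format bytes as two-digit lowercase hex,
// separated by spaces, with a line break after every 16 bytes.
std::string hex_dump(const std::string& bytes) {
    std::string out;
    char hex[3];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0) out += (i % 16 == 0) ? '\n' : ' ';
        std::snprintf(hex, sizeof hex, "%02x",
                      static_cast<unsigned>(static_cast<unsigned char>(bytes[i])));
        out += hex;
    }
    return out;
}
```

Printing `hex_dump(read_file("./address_book.bin"))` after running the serialization program would then show the file contents as hex bytes.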

Parsing the binary file

As covered in the previous article, each field's key = (field_number << 3) | wire_type is encoded as a varint.
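The key rule can be sketched in a few lines of C++. The names `FieldKey`, `decode_varint` and `split_key` are hypothetical helpers written for illustration; they are not the functions protobuf itself uses. The sketch decodes one varint (7 payload bits per byte, least significant group first, high bit set on every byte except the last) and then splits the result into field number and wire type.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: the two parts packed into a key.
struct FieldKey {
    std::uint32_t field_number;
    std::uint32_t wire_type;
};

// Decode one base-128 varint starting at `pos` and advance `pos` past it.
// Returns false if the buffer ends before the varint terminates.
bool decode_varint(const std::vector<std::uint8_t>& buf, std::size_t& pos,
                   std::uint64_t& value) {
    value = 0;
    for (int shift = 0; shift < 64 && pos < buf.size(); shift += 7) {
        const std::uint8_t byte = buf[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) return true;  // high bit clear: last byte
    }
    return false;
}

// Split a decoded key: key = (field_number << 3) | wire_type.
FieldKey split_key(std::uint64_t key) {
    return FieldKey{static_cast<std::uint32_t>(key >> 3),
                    static_cast<std::uint32_t>(key & 0x7)};
}
```

Single-byte keys such as 0a, 10, 1a and 22 decode to themselves, while the two-byte key a2 06 that appears later in this file decodes to 802, that is, field number 100 with wire type 2.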

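The first field of message AddressBook is people, of type Person, which is itself a message. Let's parse the file byte by byte.

0a    // (1 << 3) + 2: 1 is the field_number of people, 2 is the wire type for an embedded message

3c    // 0x3c = 60: the next 60 bytes are the data of Person people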

// Now inside message Person

0a    // (1 << 3) + 2: name, the first field of Person, has field_number = 1; 2 is the wire type for string

04    // the name string is 4 bytes long

4a 61 63 6b    // ASCII encoding of "Jack"

10    // (2 << 3) + 0: field id has field_number = 2; 0 is the wire type for int32

01    // id is 1

1a    // (3 << 3) + 2: field email has field_number = 3; 2 is the wire type for string

0b    // 0x0b = 11: the email string is 11 bytes long

4a 61 63 6b 40 71 71 2e 63 6f 6d        // "Jack@qq.com"

	// First PhoneNumber, a nested message

	22    // (4 << 3) + 2: the phones field, field_number = 4; 2 is the wire type for an embedded message

	0a    // the next 10 bytes are the data of PhoneNumber

	0a    // (1 << 3) + 2: number, the first field of message PhoneNumber; 2 is the wire type for string

	06    // the number string is 6 bytes long

	31 32 33 34 35 36    // "123456"

	10   // (2 << 3) + 0: the PhoneType type field; 0 is the wire type for enum

	01   // HOME; an enum is treated as an integer

	// Second PhoneNumber, a nested message

	22 0a 0a 06 32 33 34 35 36 37 10 00  // same interpretation as above ("234567"); the final 00 is MOBILE

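a2 06   // 1010 0010 0000 0110: the key of weight_recent_months, encoded as a varint

        //  010 0010  000 0110 → 000 0110 010 0010: drop the MSB of each byte, then reverse the 7-bit groups (the least significant group is stored first)

        // = 802 = (100 << 3) + 2: 100 is the field number of weight_recent_months

        //  2 is the wire type for a packed repeated field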

0c    // the next 12 bytes are the packed float data, 4 bytes per value

00 00 48 42 // float 50 (0x42480000, stored little-endian)

00 00 50 42 // float 52 (0x42500000)

00 00 58 42 // float 54 (0x42580000)
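The 12-byte float payload can be decoded by hand as a check. The following is a minimal sketch, assuming `float` is a 32-bit IEEE 754 type on the host (true on mainstream platforms); `decode_packed_floats` is a hypothetical helper name, not a protobuf function.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Decode a packed float payload: consecutive 4-byte little-endian
// IEEE 754 values. Trailing bytes that do not form a full value are ignored.
std::vector<float> decode_packed_floats(const std::vector<std::uint8_t>& payload) {
    std::vector<float> out;
    for (std::size_t i = 0; i + 4 <= payload.size(); i += 4) {
        // Assemble the bits explicitly so the result does not depend
        // on the byte order of the host.
        const std::uint32_t bits =
            static_cast<std::uint32_t>(payload[i]) |
            (static_cast<std::uint32_t>(payload[i + 1]) << 8) |
            (static_cast<std::uint32_t>(payload[i + 2]) << 16) |
            (static_cast<std::uint32_t>(payload[i + 3]) << 24);
        float value;
        std::memcpy(&value, &bits, sizeof value);  // reinterpret the bit pattern
        out.push_back(value);
    }
    return out;
}
```

Fed the 12 bytes that follow 0c above, this yields the three values 50, 52 and 54 that were added in the serialization code.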

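Note how repeated fields are encoded. When the element type is a message, such as PhoneNumber above, the key appears once per element. When the element type is numeric and the field is declared with packed = true, the key appears only once; without packing, the key is again repeated for every element. In proto3, repeated numeric fields are packed by default, whereas proto2 requires an explicit declaration.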
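The difference between the two layouts can be made concrete by hand-encoding the same float values both ways. This is an illustrative sketch rather than protobuf's own encoder: `put_varint`, `put_float`, `encode_packed` and `encode_unpacked` are hypothetical helpers, `float` is assumed to be 32-bit IEEE 754, and the case of an empty field (which protobuf does not emit at all) is ignored.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append `v` as a base-128 varint, least significant group first.
void put_varint(Bytes& out, std::uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<std::uint8_t>(v | 0x80));  // low 7 bits + continuation bit
        v >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(v));
}

// Append a float as 4 little-endian bytes.
void put_float(Bytes& out, float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<std::uint8_t>(bits >> (8 * i)));
    }
}

// Packed: one key (wire type 2), one byte length, then all values back to back.
Bytes encode_packed(std::uint32_t field, const std::vector<float>& values) {
    Bytes out;
    put_varint(out, (field << 3) | 2u);
    put_varint(out, values.size() * 4);
    for (float f : values) put_float(out, f);
    return out;
}

// Non-packed: the key (wire type 5, 32-bit) is repeated before every value.
Bytes encode_unpacked(std::uint32_t field, const std::vector<float>& values) {
    Bytes out;
    for (float f : values) {
        put_varint(out, (field << 3) | 5u);
        put_float(out, f);
    }
    return out;
}
```

For field 100 and the values 50, 52 and 54, the packed form is the 15 bytes seen above (a2 06 0c plus 12 data bytes), while the non-packed form takes 18 bytes because the two-byte key a5 06 precedes each value. The saving grows with the number of elements.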

With the binary file fully analyzed, the decoding code becomes easy to follow.

Deserialization

Only the decoding code for message Person is shown here. Note that when it encounters the nested message PhoneNumber, it calls PhoneNumber's own decoding code.

bool Person::MergePartialFromCodedStream(

    ::google::protobuf::io::CodedInputStream* input) {

#define DO_(EXPRESSION) if (!PROTOBUF_PREDICT_TRUE(EXPRESSION)) goto failure

  ::google::protobuf::uint32 tag;

  // @@protoc_insertion_point(parse_start:tutorial.Person)

  for (;;) {

    ::std::pair<::google::protobuf::uint32, bool> p = input->ReadTagWithCutoffNoLastTag(16383u);

    tag = p.first;

    if (!p.second) goto handle_unusual;

    switch (::google::protobuf::internal::WireFormatLite::GetTagFieldNumber(tag)) {

      // required string name = 1;

      case 1: {

        if (static_cast< ::google::protobuf::uint8>(tag) == (10 & 0xFF)) {

          DO_(::google::protobuf::internal::WireFormatLite::ReadString(

                input, this->mutable_name()));

          ::google::protobuf::internal::WireFormat::VerifyUTF8StringNamedField(

            this->name().data(), static_cast<int>(this->name().length()),

            ::google::protobuf::internal::WireFormat::PARSE,

            "tutorial.Person.name");

        } else {

          goto handle_unusual;

        }

        break;

      }

      // required int32 id = 2;

      case 2: {

        if (static_cast< ::google::protobuf::uint8>(tag) == (16 & 0xFF)) {

          HasBitSetters::set_has_id(this);

          DO_((::google::protobuf::internal::WireFormatLite::ReadPrimitive<

                   ::google::protobuf::int32, ::google::protobuf::internal::WireFormatLite::TYPE_INT32>(

                 input, &id_)));

        } else {

          goto handle_unusual;

        }

        break;

      }

      // optional string email = 3;

      case 3: {

        if (static_cast< ::google::protobuf::uint8>(tag) == (26 & 0xFF)) {

          DO_(::google::protobuf::internal::WireFormatLite::ReadString(

                input, this->mutable_email()));

          ::google::protobuf::internal::WireFormat::VerifyUTF8StringNamedField(

            this->email().data(), static_cast<int>(this->email().length()),

            ::google::protobuf::internal::WireFormat::PARSE,

            "tutorial.Person.email");

        } else {

          goto handle_unusual;

        }

        break;

      }

      // repeated .tutorial.Person.PhoneNumber phones = 4;

      case 4: {

        if (static_cast< ::google::protobuf::uint8>(tag) == (34 & 0xFF)) {

          DO_(::google::protobuf::internal::WireFormatLite::ReadMessage(

                input, add_phones()));

        } else {

          goto handle_unusual;

        }

        break;

      }

      // repeated float weight_recent_months = 100 [packed = true];

      case 100: {

        if (static_cast< ::google::protobuf::uint8>(tag) == (802 & 0xFF)) {

          DO_((::google::protobuf::internal::WireFormatLite::ReadPackedPrimitive<

                   float, ::google::protobuf::internal::WireFormatLite::TYPE_FLOAT>(

                 input, this->mutable_weight_recent_months())));

        } else if (static_cast< ::google::protobuf::uint8>(tag) == (805 & 0xFF)) {

          DO_((::google::protobuf::internal::WireFormatLite::ReadRepeatedPrimitiveNoInline<

                   float, ::google::protobuf::internal::WireFormatLite::TYPE_FLOAT>(

                 2, 802u, input, this->mutable_weight_recent_months())));

        } else {

          goto handle_unusual;

        }

        break;

      }

      default: {

      handle_unusual:

        if (tag == 0) {

          goto success;

        }

        DO_(::google::protobuf::internal::WireFormat::SkipField(

              input, tag, _internal_metadata_.mutable_unknown_fields()));

        break;

      }

    }

  }

success:

  // @@protoc_insertion_point(parse_success:tutorial.Person)

  return true;

failure:

  // @@protoc_insertion_point(parse_failure:tutorial.Person)

  return false;

#undef DO_

}
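The control flow of the generated parser (read a tag, switch on the field number, skip anything unexpected) can be condensed into a hand-written sketch. Everything here is hypothetical and simplified for illustration: `PersonSketch`, `read_varint` and `parse_person` are not protobuf APIs, phone numbers are only counted rather than decoded recursively, and unknown fields are skipped instead of being stored.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical, simplified stand-in for the generated Person class.
struct PersonSketch {
    std::string name;
    std::int32_t id = 0;
    std::string email;
    std::size_t phone_count = 0;     // phones are counted, not decoded
    std::size_t skipped_fields = 0;  // fields this sketch does not handle
};

bool read_varint(const Bytes& b, std::size_t& pos, std::uint64_t& v) {
    v = 0;
    for (int shift = 0; shift < 64 && pos < b.size(); shift += 7) {
        const std::uint8_t byte = b[pos++];
        v |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) return true;
    }
    return false;
}

// Parse the body of a Person message (the 60 bytes after "0a 3c").
bool parse_person(const Bytes& b, PersonSketch& p) {
    std::size_t pos = 0;
    while (pos < b.size()) {
        std::uint64_t tag = 0, n = 0;
        if (!read_varint(b, pos, tag)) return false;
        const std::uint64_t field = tag >> 3;
        const std::uint64_t wire = tag & 7;
        if (wire == 0) {  // varint payload
            if (!read_varint(b, pos, n)) return false;
            if (field == 2) p.id = static_cast<std::int32_t>(n);
            else ++p.skipped_fields;
        } else if (wire == 2) {  // length-delimited payload
            if (!read_varint(b, pos, n) || n > b.size() - pos) return false;
            const std::string data(reinterpret_cast<const char*>(b.data() + pos),
                                   static_cast<std::size_t>(n));
            switch (field) {
                case 1: p.name = data; break;
                case 3: p.email = data; break;
                case 4: ++p.phone_count; break;  // nested PhoneNumber
                default: ++p.skipped_fields; break;
            }
            pos += static_cast<std::size_t>(n);
        } else if (wire == 5 || wire == 1) {  // fixed 32-bit or 64-bit payload
            const std::size_t width = (wire == 5) ? 4 : 8;
            if (width > b.size() - pos) return false;
            pos += width;
            ++p.skipped_fields;
        } else {
            return false;  // group wire types are not handled in this sketch
        }
    }
    return true;
}
```

Run over the 60-byte Person body analyzed earlier, this recovers the name "Jack", id 1, the email "Jack@qq.com" and two phones, and it skips the packed weight_recent_months field (field 100) as a single length-delimited blob.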

That's all.

References

Protocol Buffer Basics: C++
