Packing data with Python
Defining how a sequence of bytes sits in a memory buffer or on disk can be challenging. Since everything you work with ultimately comes down to bytes, it makes sense to have an intuitive way to handle that data, independent of the type restrictions that the language enforces on us.
In today’s post, I’m going to run through packing and unpacking byte strings in Python using the struct module.
Basics
From the Python documentation:
This module performs conversions between Python values and C structs represented as Python bytes objects. This can be used in handling binary data stored in files or from network connections, among other sources. It uses Format Strings as compact descriptions of the layout of the C structs and the intended conversion to/from Python values.
When working with a byte string in Python, you prefix your literals with b.
>>> b'Hello'
b'Hello'
The ord function converts a single character (or a bytes object of length 1) into its integer character code.
>>> ord(b'H')
72
>>> ord(b'e')
101
>>> ord(b'l')
108
We can use list to convert a whole bytes object into a list of integer byte values.
>>> list(b'Hello')
[72, 101, 108, 108, 111]
The complement to the ord call is chr, which converts an integer value back into a character.
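As a small illustration of that round trip, the sketch below goes from bytes to integers and back again. The variable names are only for this example; note that chr produces str characters, while bytes rebuilds a bytes object from the integer values.

```python
# Round trip between a bytes object, its integer values, and characters.
values = list(b'Hello')            # [72, 101, 108, 108, 111]
chars = [chr(v) for v in values]   # ['H', 'e', 'l', 'l', 'o']
text = ''.join(chars)              # 'Hello' (a str, not bytes)
raw = bytes(values)                # b'Hello' (a bytes object again)
print(text, raw)
```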
Packing
Using the struct module, we’re offered the pack function. Its first parameter is a format string that defines how the values supplied in the remaining parameters should be laid out. To get started:
>>> import struct
If we pack the string 'Hello' as single bytes:
>>> list(b'Hello')
[72, 101, 108, 108, 111]
>>> struct.pack(b'BBBBB', 72, 101, 108, 108, 111)
b'Hello'
The format string b'BBBBB' tells pack to pack the supplied values into a bytes object of 5 unsigned bytes. If we were to use a lower case b in our format string, pack would expect each value to be a signed byte.
>>> struct.pack(b'bbbbb', 72, 101, 108, 108, 111)
b'Hello'
This only gets interesting once we pass a value that doesn’t fit into a signed byte:
>>> struct.pack(b'bbbbb', 72, 101, 108, 129, 111)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
struct.error: byte format requires -128 <= number <= 127
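If out-of-range values are a possibility, one option is to catch struct.error around the call. The sketch below assumes a hypothetical helper named pack_signed_bytes (not part of the struct module) and also shows that the same value packs without complaint as an unsigned byte.

```python
import struct

def pack_signed_bytes(values):
    """Hypothetical helper: pack values as signed bytes, or return None
    if any value falls outside -128..127."""
    try:
        # '%db' builds a format such as '5b' (five signed bytes).
        return struct.pack('%db' % len(values), *values)
    except struct.error:
        return None

ok = pack_signed_bytes([72, 101, 108, 108, 111])   # b'Hello'
bad = pack_signed_bytes([72, 101, 108, 129, 111])  # None, 129 > 127

# 129 fits in an unsigned byte (0..255), so 'B' accepts it.
unsigned = struct.pack('5B', 72, 101, 108, 129, 111)
print(ok, bad, unsigned)
```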
The following tables have been reproduced from the Python documentation. The numbers in the Notes column refer to the footnotes in that documentation.
Byte order, size and alignment
| Character | Byte order | Size | Alignment |
|---|---|---|---|
| @ | native | native | native |
| = | native | standard | none |
| < | little-endian | standard | none |
| > | big-endian | standard | none |
| ! | network (= big-endian) | standard | none |
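To see what the prefix character changes in practice, here is a minimal sketch that packs the same 32-bit value with different byte orders and compares sizes with and without native alignment. The value and variable names are arbitrary choices for the example.

```python
import struct

value = 0x01020304

little = struct.pack('<I', value)    # least significant byte first
big = struct.pack('>I', value)       # most significant byte first
network = struct.pack('!I', value)   # network order, same as big-endian
print(little, big, network)

# '=' uses standard sizes and no alignment: 1 + 4 = 5 bytes.
packed_size = struct.calcsize('=BI')
# '@' uses native sizes and alignment, so padding may be inserted
# before the int; the result depends on the platform.
native_size = struct.calcsize('@BI')
print(packed_size, native_size)
```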
Types
| Format | C Type | Python type | Standard size | Notes |
|---|---|---|---|---|
| x | pad byte | no value | | |
| c | char | bytes of length 1 | 1 | |
| b | signed char | integer | 1 | (1), (3) |
| B | unsigned char | integer | 1 | (3) |
| ? | _Bool | bool | 1 | (1) |
| h | short | integer | 2 | (3) |
| H | unsigned short | integer | 2 | (3) |
| i | int | integer | 4 | (3) |
| I | unsigned int | integer | 4 | (3) |
| l | long | integer | 4 | (3) |
| L | unsigned long | integer | 4 | (3) |
| q | long long | integer | 8 | (2), (3) |
| Q | unsigned long long | integer | 8 | (2), (3) |
| n | ssize_t | integer | | (4) |
| N | size_t | integer | | (4) |
| f | float | float | 4 | (5) |
| d | double | float | 8 | (5) |
| s | char[] | bytes | | |
| p | char[] | bytes | | |
| P | void * | integer | | (6) |
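The format characters can be combined in a single format string. As a sketch, assume a made-up record consisting of an unsigned short, a signed int, a double and a 4-byte string, packed little-endian with standard sizes; the layout and names are purely illustrative.

```python
import struct

# Hypothetical record layout: H (2 bytes) + i (4) + d (8) + 4s (4).
RECORD_FORMAT = '<Hid4s'

record = struct.pack(RECORD_FORMAT, 7, -42, 1.5, b'abcd')
size = struct.calcsize(RECORD_FORMAT)   # 2 + 4 + 8 + 4 = 18

# The 's' format pads short values with null bytes up to the count.
padded = struct.pack('4s', b'ab')       # b'ab\x00\x00'
print(size, record, padded)
```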
Unpacking
The reverse of packing values into a bytes object is unpacking them again into usable variables inside your Python code.
>>> struct.unpack(b'BBBBB', struct.pack(b'BBBBB', 72, 101, 108, 108, 111))
(72, 101, 108, 108, 111)
>>> struct.unpack(b'5s', struct.pack(b'BBBBB', 72, 101, 108, 108, 111))
(b'Hello',)
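Since unpack always returns a plain tuple, it can be handy to give the fields names. The sketch below assumes an invented header layout (a 4-byte magic value, a version and a payload length, big-endian) and uses unpack_from together with a namedtuple; none of these names come from a real file format.

```python
import struct
from collections import namedtuple

# Hypothetical header: 4-byte magic, unsigned short version,
# unsigned int payload length, all big-endian.
Header = namedtuple('Header', 'magic version length')
HEADER_FORMAT = '>4sHI'

data = struct.pack(HEADER_FORMAT, b'DEMO', 1, 5) + b'Hello'

# unpack_from reads from the start of a buffer that may be longer
# than the format, unlike unpack, which needs an exact size match.
header = Header._make(struct.unpack_from(HEADER_FORMAT, data))
offset = struct.calcsize(HEADER_FORMAT)          # 4 + 2 + 4 = 10
payload = data[offset:offset + header.length]    # b'Hello'
print(header, payload)
```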