Packing data with Python
Describing how a sequence of bytes is laid out in a memory buffer or on disk can be challenging. Since everything you work with ultimately comes down to bytes, it makes sense to have an intuitive way to handle this information independently of the type restrictions that the language otherwise enforces.
In today’s post, I’m going to run through Python’s byte string packing and unpacking using the struct module.
Basics
From the Python documentation:
This module performs conversions between Python values and C structs represented as Python bytes objects. This can be used in handling binary data stored in files or from network connections, among other sources. It uses Format Strings as compact descriptions of the layout of the C structs and the intended conversion to/from Python values.
When working with a byte string in Python, you prefix your literals with b.
>>> b'Hello'
b'Hello'
The ord function converts a single character (here, a one-byte string) into its integer character code.
>>> ord(b'H')
72
>>> ord(b'e')
101
>>> ord(b'l')
108
We can use list to convert a whole byte string into a list of integer byte values.
>>> list(b'Hello')
[72, 101, 108, 108, 111]
The complement to the ord call is chr, which converts a byte value back into a character.
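As a quick check of that round trip, the sketch below converts each byte value back with chr and also rebuilds the byte string with the bytes constructor. It assumes Python 3, where iterating over a bytes object yields integers; the variable names are only illustrative.

```python
# Round trip between byte values and characters (Python 3 assumed).
values = list(b'Hello')                 # [72, 101, 108, 108, 111]

# chr turns one integer code into a one-character str.
text = ''.join(chr(v) for v in values)

# bytes() turns a list of integers in the range 0-255 back into a byte string.
raw = bytes(values)

print(text, raw)
```

Note that chr gives back a text (str) character, while bytes gives back a byte string, which is usually what you want when dealing with binary data.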
Packing
Using the struct module, we’re offered the pack function. Its first parameter is a format string that defines how the values supplied in the remaining parameters should be laid out. We get started:
>>> import struct
If we pack the string 'Hello' as single bytes:
>>> list(b'Hello')
[72, 101, 108, 108, 111]
>>> struct.pack(b'BBBBB', 72, 101, 108, 108, 111)
b'Hello'
The format string b'BBBBB' tells pack to pack the supplied values into a string of 5 unsigned bytes. If we were to use a lower case b in our format string, pack would expect each byte value to be signed.
>>> struct.pack(b'bbbbb', 72, 101, 108, 108, 111)
b'Hello'
This only gets interesting once we pass a value that falls outside the signed byte range:
>>> struct.pack(b'bbbbb', 72, 101, 108, 129, 111)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
struct.error: byte format requires -128 <= number <= 127
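To make the signed versus unsigned distinction concrete, here is a small sketch that packs 129 as an unsigned byte, catches the error raised by the signed format, and then reads the same byte back as signed. The format strings are written as plain str here, which struct accepts as well as bytes.

```python
import struct

# 129 fits in an unsigned byte (0..255) but not in a signed one (-128..127).
unsigned = struct.pack('B', 129)

try:
    struct.pack('b', 129)
    overflowed = False
except struct.error:
    # Raised because 129 is outside the signed byte range.
    overflowed = True

# The same single byte, read back as signed, is interpreted as two's complement.
(as_signed,) = struct.unpack('b', unsigned)

print(unsigned, overflowed, as_signed)
```

The byte on the wire is identical in both cases; only the interpretation chosen by the format character differs.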
The following tables have been reproduced from the Python documentation.
Byte order, size and alignment
Character | Byte order | Size | Alignment |
---|---|---|---|
@ | native | native | native |
= | native | standard | none |
< | little-endian | standard | none |
> | big-endian | standard | none |
! | network (= big-endian) | standard | none |
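The effect of the byte order prefix is easiest to see on a multi-byte value. The sketch below packs one arbitrary example integer with three of the prefixes from the table and compares standard and native sizes with struct.calcsize. The native result is deliberately not stated, since it depends on the platform.

```python
import struct

value = 0x01020304  # arbitrary example value

little = struct.pack('<I', value)   # least significant byte first
big = struct.pack('>I', value)      # most significant byte first
network = struct.pack('!I', value)  # same layout as big-endian

print(little.hex(), big.hex(), network.hex())

# Standard size with no alignment: 1 byte for 'b' plus 4 bytes for 'i'.
print(struct.calcsize('=bi'))

# Native mode may insert alignment padding, so this can be larger.
print(struct.calcsize('@bi'))
```

When exchanging data between machines or writing a file format, picking an explicit prefix such as < or > avoids depending on whatever the local machine happens to use.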
Types
Format | C Type | Python type | Standard size | Notes |
---|---|---|---|---|
x | pad byte | no value | | |
c | char | bytes of length 1 | 1 | |
b | signed char | integer | 1 | (1),(3) |
B | unsigned char | integer | 1 | (3) |
? | _Bool | bool | 1 | (1) |
h | short | integer | 2 | (3) |
H | unsigned short | integer | 2 | (3) |
i | int | integer | 4 | (3) |
I | unsigned int | integer | 4 | (3) |
l | long | integer | 4 | (3) |
L | unsigned long | integer | 4 | (3) |
q | long long | integer | 8 | (2), (3) |
Q | unsigned long long | integer | 8 | (2), (3) |
n | ssize_t | integer | | (4) |
N | size_t | integer | | (4) |
f | float | float | 4 | (5) |
d | double | float | 8 | (5) |
s | char[] | bytes | | |
p | char[] | bytes | | |
P | void * | integer | | (6) |
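Format characters from the table can be combined in one format string. The sketch below packs a made-up record header (the field names and the b'DATA' magic value are purely illustrative) and shows how a count in front of s and p sets a fixed field width.

```python
import struct

# Hypothetical header: 4-byte magic, unsigned short version,
# unsigned int length, float scale, all little-endian.
header_format = '<4sHIf'
packed = struct.pack(header_format, b'DATA', 1, 1024, 0.5)

# Standard sizes add up with no padding: 4 + 2 + 4 + 4.
print(struct.calcsize(header_format), len(packed))

# A count before 's' is the field width; short values are padded with null bytes.
fixed = struct.pack('6s', b'Hi')

# 'p' is a Pascal string: the first byte stores the length of the value.
pascal = struct.pack('6p', b'Hi')

print(packed, fixed, pascal)
```

A count in front of the other format characters is a repeat count instead, so '5B' means the same as 'BBBBB'.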
Unpacking
The reverse of packing values into a byte string is unpacking them again into usable variables inside your Python code.
>>> struct.unpack(b'BBBBB', struct.pack(b'BBBBB', 72, 101, 108, 108, 111))
(72, 101, 108, 108, 111)
>>> struct.unpack(b'5s', struct.pack(b'BBBBB', 72, 101, 108, 108, 111))
(b'Hello',)
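Because unpack always returns a tuple, its result can be destructured straight into variables. The sketch below uses struct.Struct to reuse one format for both directions and unpack_from to read from an offset inside a larger buffer. The record layout and the names are assumptions made up for illustration.

```python
import struct

# Hypothetical record: 2-byte id, 4-byte size, 5-byte name, little-endian.
record = struct.Struct('<HI5s')

data = record.pack(7, 1024, b'Hello')

# unpack returns a tuple with one item per format field.
ident, size, name = record.unpack(data)

# unpack_from reads starting at an offset, here skipping two leading bytes.
buffer = b'\xff\xff' + data
fields = record.unpack_from(buffer, 2)

print(ident, size, name, fields)
```

Keeping the format in a single Struct object helps ensure the packing and unpacking sides cannot drift apart.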