Python Object Serialization with pickle
This article is mainly a set of reading notes on the official pickle documentation.
The pickle module implements binary protocols for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy. Pickling (and unpickling) is alternatively known as “serialization”, “marshalling,” [1] or “flattening”; however, to avoid confusion, the terms used here are “pickling” and “unpickling”.
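pickle is one of Python's standard modules, so nothing extra needs to be installed.
pickle serializes and de-serializes a Python object structure. In effect it is a way of storing data: Python data structures are saved in a specific format. Note also that data pickled in the default binary format is not human-readable.
A word on the name: "pickle" means to preserve food in brine, so "pickling" a Python object is really just a form of data processing. The rules of that processing are not covered further here.
"Pickling" converts a hierarchical Python object into a byte stream; "unpickling" is the reverse process.
Note: "pickling", "serialization", "marshalling", and "flattening" all express the same idea and can simply be read as "serialization"; with the prefix "un", read the word as "de-serialization".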
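Warning: The pickle module is not secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.
In other words, pickle does nothing to protect you from erroneous or maliciously constructed data. Unpickling can execute arbitrary code, so only unpickle data that comes from a source you trust and can authenticate.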
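To make the warning concrete, here is a deliberately harmless sketch. The class name Sneaky is hypothetical; it uses the __reduce__ hook so that its pickle tells the unpickler to call a function (sorted here, though a hostile pickle could name any importable callable).

```python
import pickle


class Sneaky:
    """Hypothetical class whose pickle asks the unpickler to call a function."""

    def __reduce__(self):
        # On unpickling, pickle calls sorted([3, 1, 2]) and returns the result.
        # A malicious pickle could reference a destructive callable instead.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Sneaky())
result = pickle.loads(payload)  # runs sorted(); no Sneaky object is rebuilt
print(result)
```

The point is that the byte stream, not the receiving program, decides which callable runs during loads().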
Data stream format
The data format used by pickle is Python-specific. This has the advantage that there are no restrictions imposed by external standards such as JSON or XDR (which can’t represent pointer sharing); however it means that non-Python programs may not be able to reconstruct pickled Python objects.
The data format used by pickle is specific to Python. Non-Python programs may not be able to reconstruct pickled data.
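The "pointer sharing" mentioned above can be illustrated with a small sketch. The variable names are made up for the example; it compares a pickle round trip with a JSON round trip of a dictionary whose two values are the same list object.

```python
import json
import pickle

shared = [1, 2, 3]
data = {"x": shared, "y": shared}  # both keys point at the same list

via_pickle = pickle.loads(pickle.dumps(data))
via_json = json.loads(json.dumps(data))

print(via_pickle["x"] is via_pickle["y"])  # sharing is preserved by pickle
print(via_json["x"] is via_json["y"])      # JSON yields two separate lists

# A self-referential structure also survives a pickle round trip.
loop = []
loop.append(loop)
restored = pickle.loads(pickle.dumps(loop))
print(restored[0] is restored)
```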
By default, the pickle data format uses a relatively compact binary representation. If you need optimal size characteristics, you can efficiently compress pickled data.
By default, pickle's data format is a relatively compact binary representation. If size matters more, the pickled data can be compressed.
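One possible way to compress pickled data is to pass it through zlib, as in the sketch below. The choice of zlib and the sample records are assumptions for illustration; other compressors from the standard library would work the same way.

```python
import pickle
import zlib

# Hypothetical sample data with plenty of repetition.
records = [{"id": i, "tag": "sensor"} for i in range(1000)]

raw = pickle.dumps(records)
packed = zlib.compress(raw)
print(len(raw), len(packed))

# Decompress first, then unpickle.
restored = pickle.loads(zlib.decompress(packed))
```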
The module pickletools contains tools for analyzing data streams generated by pickle. pickletools source code has extensive comments about opcodes used by pickle protocols.
pickletools contains tools for analyzing pickled data streams.
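As a small illustration, pickletools.dis() prints a symbolic listing of the opcodes in a pickle. The sketch below captures that listing in a string; the sample object and the choice of protocol 2 are arbitrary.

```python
import io
import pickle
import pickletools

data = pickle.dumps({"a": [1, 2]}, protocol=2)

# dis() writes to stdout by default; here the listing goes to a buffer.
listing = io.StringIO()
pickletools.dis(data, out=listing)
print(listing.getvalue())
```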
There are currently 5 different protocols which can be used for pickling. The higher the protocol used, the more recent the version of Python needed to read the pickle produced.
- Protocol version 0 is the original “human-readable” protocol and is backwards compatible with earlier versions of Python.
- Protocol version 1 is an old binary format which is also compatible with earlier versions of Python.
- Protocol version 2 was introduced in Python 2.3. It provides much more efficient pickling of new-style classes. Refer to PEP 307 for information about improvements brought by protocol 2.
- Protocol version 3 was added in Python 3.0. It has explicit support for bytes objects and cannot be unpickled by Python 2.x. This is the default protocol, and the recommended protocol when compatibility with other Python 3 versions is required.
- Protocol version 4 was added in Python 3.4. It adds support for very large objects, pickling more kinds of objects, and some data format optimizations. Refer to PEP 3154 for information about improvements brought by protocol 4.
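To compare the protocols in practice, the following sketch pickles one sample object with every protocol the running interpreter supports and records the encoded size. The sample object is made up, and the actual sizes depend on the Python version in use.

```python
import pickle

sample = {"name": "example", "values": list(range(10)), "blob": b"\x00\x01"}

results = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    encoded = pickle.dumps(sample, protocol=protocol)
    # Store the size and whether the round trip reproduced the object.
    results[protocol] = (len(encoded), pickle.loads(encoded) == sample)

for protocol, (size, ok) in results.items():
    print(protocol, size, ok)
```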
Note:
Serialization is a more primitive notion than persistence; although pickle reads and writes file objects, it does not handle the issue of naming persistent objects, nor the (even more complicated) issue of concurrent access to persistent objects. The pickle module can transform a complex object into a byte stream and it can transform the byte stream into an object with the same internal structure. Perhaps the most obvious thing to do with these byte streams is to write them onto a file, but it is also conceivable to send them across a network or store them in a database. The shelve module provides a simple interface to pickle and unpickle objects on DBM-style database files.
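The shelve module mentioned in the note can be sketched as follows. The file name demo_shelf and the stored value are hypothetical; a temporary directory is used so the example leaves nothing behind.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "demo_shelf")  # hypothetical file name

    # Values are pickled transparently when they are stored.
    with shelve.open(path) as shelf:
        shelf["settings"] = {"retries": 3, "hosts": ["a", "b"]}

    # Reopening the shelf unpickles the stored value on access.
    with shelve.open(path) as shelf:
        loaded = shelf["settings"]

print(loaded)
```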
Module Interface
To serialize an object hierarchy, you simply call the dumps() function. Similarly, to de-serialize a data stream, you call the loads() function. However, if you want more control over serialization and de-serialization, you can create a Pickler or an Unpickler object, respectively.
Use dumps() to serialize and loads() to de-serialize.
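A minimal round trip might look like the sketch below, first with the one-shot helpers and then with explicit Pickler and Unpickler objects. The record contents are invented for the example.

```python
import io
import pickle

record = {"user": "alice", "scores": [90, 85], "active": True}

# One-shot helpers: object -> bytes -> object.
blob = pickle.dumps(record)
copy = pickle.loads(blob)

# The same round trip with explicit Pickler/Unpickler objects.
buffer = io.BytesIO()
pickle.Pickler(buffer).dump(record)
buffer.seek(0)
copy_via_objects = pickle.Unpickler(buffer).load()

print(copy == record, copy is record)
```

The restored object is equal to the original but is a new, independent object.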
The pickle module provides the following constants:
pickle.HIGHEST_PROTOCOL: the highest protocol version number.
An integer, the highest protocol version available. This value can be passed as a protocol value to functions dump() and dumps() as well as the Pickler constructor.
pickle.DEFAULT_PROTOCOL: the default protocol version number. In the Python version these notes are based on, the default is version 3.
An integer, the default protocol version used for pickling. May be less than HIGHEST_PROTOCOL. Currently the default protocol is 3, a new protocol designed for Python 3.
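Both constants can be inspected directly, since their values depend on the interpreter. The sketch below also peeks at the first two bytes of a pickle: for protocol 2 and later, the stream starts with the PROTO opcode (0x80) followed by the protocol number.

```python
import pickle

print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)

blob = pickle.dumps([1, 2, 3], protocol=pickle.HIGHEST_PROTOCOL)

# Byte 0 is the PROTO opcode, byte 1 is the protocol number.
header_opcode = blob[:1]
header_protocol = blob[1]
print(header_opcode, header_protocol)
```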
The pickle module provides the following functions to make the pickling process more convenient:
pickle.dump(obj, file, protocol=None, *, fix_imports=True): writes the object obj to a file.
Write a pickled representation of obj to the open file object file. This is equivalent to Pickler(file, protocol).dump(obj).
The optional protocol argument, an integer, tells the pickler to use the given protocol; supported protocols are 0 to HIGHEST_PROTOCOL. If not specified, the default is DEFAULT_PROTOCOL. If a negative number is specified, HIGHEST_PROTOCOL is selected.
The file argument must have a write() method that accepts a single bytes argument. It can thus be an on-disk file opened for binary writing, an io.BytesIO instance, or any other custom object that meets this interface.
If fix_imports is true and protocol is less than 3, pickle will try to map the new Python 3 names to the old module names used in Python 2, so that the pickle data stream is readable with Python 2.
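The following sketch exercises dump() with both kinds of target described above: an on-disk file opened for binary writing and an io.BytesIO instance. The file name inventory.pkl and the data are hypothetical.

```python
import io
import os
import pickle
import tempfile

inventory = {"apples": 3, "pears": 7}

# An on-disk file must be opened in binary mode ("wb" / "rb").
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "inventory.pkl")  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(inventory, f)
    with open(path, "rb") as f:
        from_disk = pickle.load(f)

# Any object with a suitable write() method works, such as io.BytesIO.
# A negative protocol selects HIGHEST_PROTOCOL.
buffer = io.BytesIO()
pickle.dump(inventory, buffer, protocol=-1)
newest = buffer.getvalue()
```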
pickle.dumps(obj, protocol=None, *, fix_imports=True)
Return the pickled representation of the object as a bytes object, instead of writing it to a file.
Arguments protocol and fix_imports have the same meaning as in dump().
pickle.load(file, *, fix_imports=True, encoding="ASCII", errors="strict")
Read a pickled object representation from the open file object file and return the reconstituted object hierarchy specified therein. This is equivalent to Unpickler(file).load().
The protocol version of the pickle is detected automatically, so no protocol argument is needed. Bytes past the pickled object’s representation are ignored.
The argument file must have two methods, a read() method that takes an integer argument, and a readline() method that requires no arguments. Both methods should return bytes. Thus file can be an on-disk file opened for binary reading, an io.BytesIO object, or any other custom object that meets this interface.
Optional keyword arguments are fix_imports, encoding and errors, which are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is true, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to ‘ASCII’ and ‘strict’, respectively. The encoding can be ‘bytes’ to read these 8-bit string instances as bytes objects.
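Because load() reads exactly one pickled object and leaves the rest of the stream alone, several objects can be written to one file and read back in order. A minimal sketch, using an in-memory stream and invented values:

```python
import io
import pickle

stream = io.BytesIO()
pickle.dump("first", stream)
pickle.dump({"second": 2}, stream)
stream.seek(0)

# Each load() consumes exactly one pickled object.
first = pickle.load(stream)
second = pickle.load(stream)

# With nothing left to read, load() raises EOFError.
try:
    pickle.load(stream)
    exhausted = False
except EOFError:
    exhausted = True

print(first, second, exhausted)
```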
pickle.loads(bytes_object, *, fix_imports=True, encoding="ASCII", errors="strict")
Read a pickled object hierarchy from a bytes object and return the reconstituted object hierarchy specified therein.
The protocol version of the pickle is detected automatically, so no protocol argument is needed. Bytes past the pickled object’s representation are ignored.
Optional keyword arguments are fix_imports, encoding and errors, which are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is true, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to ‘ASCII’ and ‘strict’, respectively. The encoding can be ‘bytes’ to read these 8-bit string instances as bytes objects.
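Two of the behaviours described for loads() can be checked with a short sketch: trailing bytes are ignored, and the encoding argument controls how Python 2 style 8-bit strings come back. The byte string py2_pickle is hand-written for illustration, in the style of a protocol 0 pickle of 'abc'; it was not produced by an actual Python 2 run.

```python
import pickle

blob = pickle.dumps(("x", 1))

# Bytes after the end of the pickle are ignored by loads().
with_trailer = pickle.loads(blob + b"trailing junk")

# Hand-written protocol 0 pickle of the 8-bit string 'abc' (an assumption
# for illustration; STRING opcode followed by STOP).
py2_pickle = b"S'abc'\n."

as_text = pickle.loads(py2_pickle)                     # decoded with ASCII
as_bytes = pickle.loads(py2_pickle, encoding="bytes")  # kept as bytes

print(with_trailer, repr(as_text), repr(as_bytes))
```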
The pickle module defines three exceptions:
- exception pickle.PickleError
  Common base class for the other pickling exceptions. It inherits Exception.
- exception pickle.PicklingError
  Error raised when an unpicklable object is encountered by Pickler. It inherits PickleError.
  Refer to What can be pickled and unpickled? to learn what kinds of objects can be pickled.
- exception pickle.UnpicklingError
  Error raised when there is a problem unpickling an object, such as a data corruption or a security violation. It inherits PickleError.
  Note that other exceptions may also be raised during unpickling, including (but not necessarily limited to) AttributeError, EOFError, ImportError, and IndexError.
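One way to handle these exceptions is sketched below. The helper names try_pickle and try_unpickle are hypothetical, and the sets of caught exceptions are a cautious assumption: which exception is actually raised for a given failure can vary between Python versions.

```python
import pickle


def try_pickle(obj):
    """Return the pickled bytes, or None if the object cannot be pickled."""
    try:
        return pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError) as exc:
        print("cannot pickle:", type(exc).__name__)
        return None


def try_unpickle(data):
    """Return the object, or None if the data cannot be unpickled."""
    try:
        return pickle.loads(data)
    except pickle.UnpicklingError as exc:
        print("bad pickle:", exc)
    except (AttributeError, EOFError, ImportError, IndexError) as exc:
        print("unpickling failed:", type(exc).__name__)
    return None


good = pickle.dumps([1, 2, 3])

print(try_pickle(lambda: 0))    # a lambda cannot be pickled
print(try_unpickle(good[:-2]))  # truncated data cannot be unpickled
print(try_unpickle(good))       # intact data round-trips
```

Returning None on failure is only a convenience for the sketch; real code would need to distinguish it from a pickled None.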
What can be pickled and unpickled?
The following types can be pickled:
- None, True, and False
- integers, floating point numbers, complex numbers
- strings, bytes, bytearrays
- tuples, lists, sets, and dictionaries containing only picklable objects
- functions defined at the top level of a module (using def, not lambda)
- built-in functions defined at the top level of a module
- classes that are defined at the top level of a module
- instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section Pickling Class Instances for details).
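The list above can be spot-checked with a short sketch that round-trips one value of each kind. The sample values are arbitrary; fractions.Fraction stands in for "a class defined at the top level of a module", and len for a built-in function.

```python
import pickle
from fractions import Fraction

samples = [
    None, True, False,
    42, 3.14, 2 + 3j,
    "text", b"raw", bytearray(b"buf"),
    (1, "a"), [1, [2, 3]], {1, 2}, {"k": (None, 1.5)},
    len,             # built-in function
    Fraction,        # class defined at the top level of a module
    Fraction(3, 4),  # instance of such a class
]

# Each item should compare equal to its unpickled copy.
round_trips = [pickle.loads(pickle.dumps(item)) == item for item in samples]
print(all(round_trips))

# Functions and classes are pickled by reference (by name), so unpickling
# gives back the very same object rather than a copy.
same_function = pickle.loads(pickle.dumps(len)) is len
```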