[Python Standard Library] The pickle Module

pickle: Python object serialization

This article covers the following points:

  1. Introduction to the pickle module

  2. Functions provided by the pickle module

  3. Caveats

  4. Worked examples

1. Introduction to the pickle module

The pickle module implements a fundamental, but powerful algorithm for serializing and de-serializing a Python object structure.
"Pickling" is the process whereby a Python object hierarchy is converted into a byte stream, and "unpickling" is the inverse operation, whereby a byte stream is converted back into an object hierarchy.

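(Figure: what the pickle module does. Pickling turns an object hierarchy into a byte stream; unpickling turns it back.)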

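So why do we need serialization and deserialization at all?

  1. Easy storage. Serialization converts objects in memory into a byte stream, which is easy to store on disk. When the data is needed again, it is read back from disk and deserialized to recover the original object. A running Python program produces strings, lists, dictionaries and other data that you may want to keep for later use, rather than leaving them in memory where they are lost when the machine is powered off. This is where the pickle module from the standard library comes in: it converts an object into a format that can be stored or transmitted.

  2. Easy transmission. When two processes communicate remotely, they can send each other all kinds of data, but whatever its type, the data travels over the network as a sequence of bytes. The sender must convert the object into a byte sequence before it can be sent, and the receiver must convert the byte sequence back into an object.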
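The sender/receiver round trip described in point 2 can be sketched as follows. This is only an illustration under stated assumptions: the helper names send_obj, recv_exact and recv_obj and the 4-byte length prefix are choices made for this example, not part of the pickle module, and a local socket.socketpair() stands in for a real network connection.

```python
import pickle
import socket
import struct


def send_obj(sock, obj):
    """Pickle obj and send it with a 4-byte big-endian length prefix (hypothetical helper)."""
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes from sock (hypothetical helper)."""
    chunks = []
    while n:
        chunk = sock.recv(n)
        if not chunk:
            raise EOFError("connection closed before the full message arrived")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_obj(sock):
    """Receive one length-prefixed pickle and rebuild the object (hypothetical helper)."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    # Only unpickle data from a peer you trust: unpickling can run arbitrary code.
    return pickle.loads(recv_exact(sock, length))


# Two connected sockets in one process stand in for the two remote processes.
left, right = socket.socketpair()
message = {"user": "alice", "scores": [1, 2.5, 3]}
send_obj(left, message)
received = recv_obj(right)
left.close()
right.close()

print(received == message)
```

The length prefix is there because a socket delivers a stream of bytes with no message boundaries, so the receiver needs some way to know where one pickle ends.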

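2. Functions provided by the pickle module (Python 2.x and Python 3.x)

  We first look at the functions provided by Python 2.x, then compare them with the Python 3.x signatures to see how the two differ.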

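pickle.dump(obj, file[, protocol])  # save the pickled form of obj to a file
  Write a pickled representation of obj to the open file object file.
  # Equivalent to Pickler(file, protocol).dump(obj).

  If the protocol parameter is omitted, protocol 0 is used.
  # The protocol defaults to 0 when omitted; protocols are not discussed further here.

  file must have a write() method that accepts a single string argument.
  # Note: what matters is the write() method, so file can be a file opened for writing or any other object that provides write(), such as a StringIO instance.

  In short, dump() pickles an object and writes the result to a file. This is useful for persisting data such as a user's saved settings between runs. (A pickle is not encrypted, so it is not a suitable place for plain-text passwords.)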

pickle.load(file)  # read from a file and unpickle
  Read a string from the open file object file and interpret it as a pickle data stream, reconstructing and returning the original object hierarchy.
  # Equivalent to Unpickler(file).load().

  file must have two methods, a read() method that takes an integer argument, and a readline() method that requires no arguments.
  # The file must be readable. The integer passed to read() is the number of bytes to read (not a number of lines); readline() is called with no arguments.

  This function automatically determines whether the data stream was written in binary mode or not.
  # The pickle format is detected automatically, so no protocol is passed to load().

  dump() and load() work on file objects. Do not confuse them with the next two functions, which work on strings in memory.

pickle.dumps(obj[, protocol])
  Return the pickled representation of the object as a string, instead of writing it to a file.
  # Pickles the object to a string; nothing is written to a file.

  If the protocol parameter is omitted, protocol 0 is used.

pickle.loads(string)
  Read a pickled object hierarchy from a string.

  Summary:

  1. To write the pickled data to a file and read it back, use dump() and load().

  2. To pickle to and from memory, use dumps() and loads().
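The relationship between the file-based and in-memory pairs can be checked with a small sketch. It assumes Python 3 (where pickles are bytes) and uses io.BytesIO as a stand-in for a file opened in binary mode; the variable names are made up for this example, and protocol 2 is fixed only so that both calls are guaranteed to use the same protocol.

```python
import io
import pickle

record = {"name": "demo", "values": [1, 2, 3]}

# In-memory pair: dumps() returns the pickle, loads() reads it back.
in_memory = pickle.dumps(record, protocol=2)
restored = pickle.loads(in_memory)

# File-based pair: dump() writes to any object with write(), load() reads from
# any object with read() and readline(). BytesIO behaves like a binary file.
buffer = io.BytesIO()
pickle.dump(record, buffer, protocol=2)
via_file = buffer.getvalue()

buffer.seek(0)  # rewind before reading back what was just written
restored_from_file = pickle.load(buffer)

# Both routes produce the same pickle and the same object.
same_bytes = in_memory == via_file
print(same_bytes, restored == restored_from_file == record)
```

In other words, dumps() and loads() are the same operations as dump() and load(), with the "file" kept in memory.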

  Next come the four functions provided by Python 3.x. The focus is on how they differ from their Python 2.x counterparts.

pickle.dump(obj, file, protocol=None, *, fix_imports=True)
  Write a pickled representation of obj to the open file object file.
  # Equivalent to Pickler(file, protocol).dump(obj); saves to a file.

  The file argument must have a write() method that accepts a single bytes argument.

pickle.dumps(obj, protocol=None, *, fix_imports=True)
  Return the pickled representation of the object as a bytes object, instead of writing it to a file.

pickle.load(file, *, fix_imports=True, encoding="ASCII", errors="strict")
  Read a pickled object representation from the open file object file and return the reconstituted object hierarchy specified therein.
  # Equivalent to Unpickler(file).load().

  The argument file must have two methods, a read() method that takes an integer argument, and a readline() method that requires no arguments.

  Both methods should return bytes.
  # read() and readline() return bytes, so the file must be opened in binary mode.

  Optional keyword arguments are fix_imports, encoding and errors.
  # Pay particular attention to encoding.

  The encoding and errors arguments tell pickle how to decode 8-bit string instances pickled by Python 2; they default to 'ASCII' and 'strict'. The encoding can be 'bytes' to read these 8-bit string instances as bytes objects.

pickle.loads(bytes_object, *, fix_imports=True, encoding="ASCII", errors="strict")
  Read a pickled object hierarchy from a bytes object and return the reconstituted object hierarchy specified therein.

  # The keyword arguments have the same meaning as in pickle.load().
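The effect of the encoding argument is easiest to see on a pickle that came from Python 2. The sketch below is for Python 3 only. The bytes literal is written by hand to mimic what Python 2's pickle.dumps('\xe9') produces with protocol 0; that literal is an assumption of this example rather than output captured from a real Python 2 session.

```python
import pickle

# Hand-written stand-in for a Python 2 pickle of the 8-bit str '\xe9'
# (protocol 0: STRING opcode, memo PUT, STOP).
py2_pickle = b"S'\\xe9'\np0\n."

# encoding="bytes" keeps the 8-bit string as a bytes object.
as_bytes = pickle.loads(py2_pickle, encoding="bytes")

# encoding="latin1" decodes each byte to the code point with the same value.
as_latin1 = pickle.loads(py2_pickle, encoding="latin1")

# The default encoding="ASCII" with errors="strict" cannot decode byte 0xe9.
try:
    pickle.loads(py2_pickle)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

print(repr(as_bytes), repr(as_latin1), default_failed)
```

Which encoding is right depends on what the Python 2 program meant the string to be: 'bytes' for binary data, or the actual text encoding if the str held text.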

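  Summary: compared with Python 2.x, the Python 3.x functions read and write bytes rather than str, so files must be opened in binary mode ('wb' / 'rb'). When protocol is left as None, the default protocol of the running Python version is used rather than protocol 0. The added fix_imports, encoding and errors keyword arguments exist for exchanging pickles with Python 2.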

3. Caveats

  What can be pickled and unpickled?

None, True, and False
integers, long integers, floating point numbers, complex numbers
normal and Unicode strings
tuples, lists, sets, and dictionaries containing only picklable objects (these containers can be nested inside one another)
functions defined at the top level of a module
built-in functions defined at the top level of a module
classes that are defined at the top level of a module
instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see the section "The pickle protocol" in the Python documentation for details).
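A few of the rules in this list can be tried out directly. The sketch below assumes Python 3 and picks its own examples: the built-in len, the standard-library class argparse.Namespace (an ordinary class whose instances keep their state in __dict__), and a lambda as something that is not reachable by name at the top level of a module.

```python
import argparse
import pickle

# Nested containers of picklable values survive a round trip.
nested = {
    "numbers": [1, 2.0, 3 + 4j],
    "flags": (None, True, False),
    "tags": {"a", "b"},
}
nested_copy = pickle.loads(pickle.dumps(nested))

# Built-in functions and top-level classes are pickled by reference (by name),
# so unpickling returns the very same object.
same_builtin = pickle.loads(pickle.dumps(len)) is len
same_class = pickle.loads(pickle.dumps(argparse.Namespace)) is argparse.Namespace

# An instance is pickled through its __dict__, giving an equal but distinct object.
settings = argparse.Namespace(host="localhost", ports=[80, 443])
settings_copy = pickle.loads(pickle.dumps(settings))

# A lambda has no importable name, so pickling it is rejected.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(nested_copy == nested, same_builtin, same_class,
      settings_copy == settings, lambda_pickled)
```

Because functions and classes are stored only as a module name plus an attribute name, the code that defines them must be importable wherever the pickle is loaded.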

4. Worked examples

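# example 1: pickle to memory and back
import pickle

info = [1, 2, 3, 4, 'asd']
data1 = pickle.dumps(info)   # pickled form (str in Python 2, bytes in Python 3)
data2 = pickle.loads(data1)  # the reconstructed list

print(data1)
print(data2)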

# example 2: pickle to a file and back
import pickle

entry = {'a': 11, 'b': 22}
with open('entry.pickle', 'wb') as f:
    pickle.dump(entry, f)  # serialize to the file

with open('entry.pickle', 'rb') as f:
    entry = pickle.load(f)  # deserialize from the file
print(entry)

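# Serialization: write two pickles to the same file
import pickle

data1 = {'a': [1, 2.0, 3, 4+6j],
         'b': ('string', u'Unicode string'),
         'c': None}

# A list that contains itself; pickle handles recursive references.
selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)

output = open('data', 'wb')

# Pickle dictionary using protocol 0 (passed explicitly, because the
# default protocol is 0 only in Python 2).
pickle.dump(data1, output, 0)

# Pickle the list using the highest protocol available.
pickle.dump(selfref_list, output, pickle.HIGHEST_PROTOCOL)

output.close()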

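# Deserialization: each load() call returns the next object, in the order written
import pprint, pickle

pkl_file = open('data', 'rb')

data1 = pickle.load(pkl_file)  # the dictionary
pprint.pprint(data1)

data2 = pickle.load(pkl_file)  # the self-referencing list
pprint.pprint(data2)

pkl_file.close()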

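# Stateful objects: TextReader is the user-defined class from the "Handling Stateful Objects"
# example in the Python documentation. It uses __getstate__() and __setstate__() so that an
# object holding an open file can be pickled.
>>> reader = TextReader("hello.txt")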
>>> reader.readline()
'1: Hello world!'
>>> reader.readline()
'2: I am line number two.'
>>> new_reader = pickle.loads(pickle.dumps(reader))
>>> new_reader.readline()
'3: Goodbye!'
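The session above relies on a TextReader class that is not defined in this article. The following sketch follows the example in the Python documentation: an open file cannot be pickled, so __getstate__() drops it and __setstate__() reopens the file and skips the lines already read. The temporary directory and the contents written to hello.txt are assumptions made so that the block runs on its own.

```python
import os
import pickle
import tempfile


class TextReader(object):
    """Read a text file line by line, prefixing each line with its number."""

    def __init__(self, filename):
        self.filename = filename
        self.file = open(filename)
        self.lineno = 0

    def readline(self):
        self.lineno += 1
        line = self.file.readline()
        if not line:
            return None
        if line.endswith('\n'):
            line = line[:-1]
        return "%i: %s" % (self.lineno, line)

    def __getstate__(self):
        # Copy the instance state and remove the unpicklable file object.
        state = self.__dict__.copy()
        del state['file']
        return state

    def __setstate__(self, state):
        # Restore the attributes, then reopen the file at the saved position.
        self.__dict__.update(state)
        restored_file = open(self.filename)
        for _ in range(self.lineno):
            restored_file.readline()
        self.file = restored_file


# Create a sample hello.txt (contents assumed for this demo).
path = os.path.join(tempfile.mkdtemp(), "hello.txt")
with open(path, "w") as f:
    f.write("Hello world!\nI am line number two.\nGoodbye!\n")

reader = TextReader(path)
first = reader.readline()
second = reader.readline()

# The copy resumes where the original left off.
new_reader = pickle.loads(pickle.dumps(reader))
third = new_reader.readline()

reader.file.close()
new_reader.file.close()
print(first, second, third, sep="\n")
```

Only filename and lineno end up in the pickle, so unpickling works only where the same file is still available at the same path.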

  That is all on the pickle module for now. The pickle classes will be covered in a later update.

Next: [Python Standard Library] The JSON Module http://www.cnblogs.com/vipchenwei/p/6951455.html