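Day 6: Modules and Commonly Used Python Modules

Module overview

Definition: a module is a collection of code that implements a particular kind of functionality.

To keep code maintainable, we group functions and put them into separate files, which makes the code reusable. In Python, a single .py file is called a module.

Note:

Modules let you organize your Python code logically.
Grouping related code into a module makes the code easier to use and easier to understand.
Put simply, a module is a file containing Python code. A module can define functions, classes, and variables, and it can also contain executable code.

There are three kinds of modules:

Custom modules
Open-source (third-party) modules
Built-in modules (the standard library)

What are the benefits of using modules?

First: the biggest benefit is much better maintainability.

Second: you don't have to start from scratch. Once a module is written, it can be used elsewhere. When writing programs we also often use other modules, including Python's built-in modules and third-party modules.

Third: modules help avoid clashes between function and variable names. Functions and variables with the same name can live in different modules, so when writing our own module we don't need to worry about names clashing with other modules.

Still, try not to shadow the names of built-in functions.

Custom modules:

Note: do not give a custom module the same name as a built-in (standard library) module!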

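2. What you need to know about modules:

One reason Python is so widely used is the large number of modules available to programmers. To use a module, it must first be imported, which is done with import:

import syntax:

import module1[, module2[, ... moduleN]]

Python ships with a library of standard modules: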

#!/usr/bin/python3
# File name: using_sys.py
import sys

print('Command-line arguments:')
for i in sys.argv:
    print(i)

print('\n\nThe Python path is:', sys.path, '\n')

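Note:

1. import sys imports the sys module from the Python standard library; this is how any module is imported. (sys is compiled into the interpreter, so there is no sys.py file.)
2. sys.argv is a list containing the command-line arguments.
3. sys.path is the list of directories Python searches for modules. You can call sys.path.append(path) at run time to add a directory, after which the .py files in that directory can be imported.
4. Keep these two lookups apart: when you run python ab.py, the interpreter looks for ab.py relative to the current directory, whereas the modules that the program imports are searched for in the directories of sys.path, in order.
5. A module is imported only once, no matter how many times import is executed. This prevents the module's code from being run over and over (it is initialized and executed only once).
6. An import statement can appear anywhere in the code.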
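The "imported only once" rule can be checked with a small experiment. This is a minimal sketch, assuming a throwaway module named demo_once (a hypothetical name) written to a temporary directory; its body appends a line to a log file every time it is executed.

```python
import importlib
import os
import sys
import tempfile

module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "demo_once.py")  # hypothetical module
log_path = os.path.join(module_dir, "runs.log")

# The module body appends one line to the log each time it is executed.
with open(module_path, "w") as f:
    f.write(f"with open({log_path!r}, 'a') as log:\n    log.write('executed\\n')\n")

sys.path.insert(0, module_dir)

import demo_once            # first import: the module body runs
import demo_once as again   # second import: served from the sys.modules cache

with open(log_path) as log:
    runs_after_imports = len(log.readlines())

importlib.reload(demo_once)  # reload() is the explicit way to run the body again
with open(log_path) as log:
    runs_after_reload = len(log.readlines())

print(runs_after_imports, runs_after_reload)  # 1 2
print(again is sys.modules["demo_once"])      # True
```

The second import only binds another name to the module object already stored in sys.modules.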

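import os

# Directory containing the current file (abspath guards against an empty result for relative paths)
current_path = os.path.dirname(os.path.abspath(__file__))

# Parent directory
pre_path = os.path.dirname(current_path)
print(pre_path)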

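More on import:

import module  # Imports the whole module. Its variables and functions are not copied into the current namespace; access them as module.name.

from module.xx.xx import xx[, yy]  # Imports specific variables or functions from a module

from module.xx.xx import xx as rename[, yy as rename]  # Imports under an alias

from module.xx.xx import *  # Imports every name in the module that does not start with an underscore (_) into the current namespace. A leading underscore marks a name as private by convention: * skips it, although it can still be imported explicitly by name.

In general, prefer the plain import statement over from ... import: it makes the program easier to read and avoids name clashes.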

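Importing a module essentially tells the Python interpreter to execute that .py file:

When a .py file is imported, the interpreter executes that file.
When a package is imported, the interpreter executes the package's __init__.py file.

So which paths does an import use as its base? The answer is sys.path. If the directory you need is not in the sys.path list, add it with sys.path.append('path').

sys.path.insert(0, '/x/y/z')  # Directories earlier in the list are searched first

sys.path.remove('/x/y/z')  # Removes a directory from the search path

To sum up: when a module is imported, Python first looks among the modules built into the interpreter (such as sys), and if it is not found there, searches the directories in sys.path, which includes the directory of the running script (or an empty string, meaning the current directory, in an interactive session).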
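The effect of sys.path on the search can be sketched as follows. The module name demo_helper is an assumption made for this example; it lives in a temporary directory that is not on sys.path at first.

```python
import os
import sys
import tempfile

extra_dir = tempfile.mkdtemp()
with open(os.path.join(extra_dir, "demo_helper.py"), "w") as f:  # hypothetical module
    f.write("def hello():\n    return 'hello from demo_helper'\n")

# The directory is not on sys.path yet, so the import fails.
try:
    import demo_helper
    found_before = True
except ImportError:
    found_before = False

# Put the directory at the front of the search path and try again.
sys.path.insert(0, extra_dir)
import demo_helper
message = demo_helper.hello()

sys.path.remove(extra_dir)  # the already imported module stays usable
print(found_before, message)
```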

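Scope

A module may define many functions and variables. Some are meant for other people to use, while others are meant to be used only inside the module. In Python this is expressed with the _ prefix.

Ordinary function and variable names are public and can be referenced directly, for example abc, x123, PI.

Names of the form __xxx__ are special variables. They can be referenced directly but have special uses.

For example, __author__ (the author) and __doc__ (the documentation string) are special variables; the documentation string of a module such as hello can be read through __doc__. Our own variables should generally not use names of this form.

Names of the form _xxx and __xxx are non-public (private) and should not be referenced directly, for example _abc, __abc.

We say that private functions and variables "should not" be referenced directly, rather than "cannot", because Python has no way to fully restrict access to them. By convention, however, private functions and variables should be left alone.

If private functions and variables should not be referenced by others, what are they for? Consider this example: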

def _private_1(name):

    return 'Hello, %s' % name

def _private_2(name):

    return 'Hi, %s' % name

def greeting(name):

    if len(name) > 3:

        return _private_1(name)

    else:

        return _private_2(name)

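The module exposes greeting() and hides the internal logic in private functions, so callers of greeting() don't need to care about the private details. This is a very useful form of encapsulation and abstraction:

Define every function that does not need to be referenced from outside as private, and only those that do as public.

Other useful points:

The __name__ attribute:

When a module is first imported by another program, its top-level code runs. To keep a block of code from running on import, test the __name__ attribute so that the block runs only when the module itself is executed as the main program.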

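if __name__ == '__main__':
    print('This program is being run directly')
else:
    print('I was imported from another module')

Every module has a __name__ attribute. When its value is '__main__', the module is being run directly; otherwise it has been imported.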
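Both cases can be observed from one script. The sketch below assumes a hypothetical module demo_main written to a temporary directory; it is imported once and then executed as a script with runpy.run_path.

```python
import importlib
import os
import runpy
import sys
import tempfile

module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "demo_main.py")  # hypothetical module
with open(module_path, "w") as f:
    f.write("role = 'script' if __name__ == '__main__' else 'imported'\n")
sys.path.insert(0, module_dir)

# Imported: __name__ is the module's own name.
imported = importlib.import_module("demo_main")
print(imported.__name__, imported.role)  # demo_main imported

# Executed as the main program: __name__ is '__main__'.
as_script = runpy.run_path(module_path, run_name="__main__")
print(as_script["__name__"], as_script["role"])  # __main__ script
```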

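The dir() function:

The built-in function dir() returns the names of all attributes and methods defined in a module (or any other object) as a sorted list of strings. They can be accessed as module.name:

import sys
print(sys.__name__)  # Note: after import sys, all of the module's attributes and methods are accessible
print(dir(sys))

Called without an argument, dir() lists the names defined in the current scope.

dir() does not list the names of built-in functions and variables. These are defined in the standard module builtins and can be listed like this:

import builtins
dir(builtins)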
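The three uses of dir() described above can be compared side by side. This is a small sketch; the helper function local_names_demo is made up for the illustration.

```python
import builtins
import sys

# dir(module) returns a list of attribute names as strings.
sys_names = dir(sys)
print(type(sys_names).__name__, "path" in sys_names, "argv" in sys_names)

def local_names_demo():
    answer = 42
    return dir()  # with no argument: names in the current local scope

print(local_names_demo())  # ['answer']

# Built-in names such as len live in the builtins module, not in the local scope.
print("len" in dir(builtins), "len" in local_names_demo())  # True False
```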

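Compiling Python files

To speed up module loading, Python caches the compiled version of each module in the __pycache__ directory under the name module.version.pyc, where version usually contains the Python version number. For example, in CPython 3.3 the compiled version of spam.py is cached as __pycache__/spam.cpython-33.pyc. This naming convention lets compiled modules from different releases and different versions of Python coexist.

Python checks the modification time of the source against the compiled version and recompiles when the latter is out of date. This is completely automatic. Compiled modules are also platform-independent, so the same library can be shared among systems with different architectures. In other words, a .pyc file holds cross-platform bytecode, similar to Java or .NET, executed by the Python virtual machine. The content of a .pyc file does depend on the Python version, however: different versions produce different .pyc files, and a file compiled by 2.5 cannot be executed by 3.5. A .pyc file can also be decompiled, so it exists only to speed up module loading.

Tips:

1. Module names are case-sensitive: foo.py and FOO.py are two different modules.

2. The -O and -OO switches of the python command reduce the size of compiled modules:

1 -O removes assert statements.
2 -OO removes assert statements and __doc__ docstrings.
3 Since some programs may rely on assert statements or docstrings, use these options only when you are sure you need them.

3. A program does not run any faster when read from a .pyc file than from a .py file; the only thing that is faster is how quickly the module loads.

4. Files are compiled to .pyc automatically only when they are imported; a script named on the command line or read from standard input produces no such file. The compileall module can therefore be used to create .pyc files for all modules in a directory.

The module can be run as a script (python -m compileall) to compile Python sources:

python -m compileall /module_directory  # compiles recursively

python -O -m compileall /module_directory -l  # compiles one level only, without recursing

To compile a single file from within a program, use py_compile.compile(). The built-in compile() function is something different: it compiles a source string into a code object and has nothing to do with compileall.

See: https://docs.python.org/3/library/compileall.html#module-compileall

Summary: Python is not a purely interpreted language. It has a compilation step: the .py source is first compiled to bytecode (.pyc), which is then executed by the Python virtual machine.

What we usually call the Python interpreter is actually CPython. When it runs a program, it first compiles the .py file to an intermediate form, bytecode, which is kept in memory; the virtual machine then interprets that bytecode instruction by instruction.
By default, the bytecode compiled for an imported file is saved, which is the .pyc file we see. We can also compile a .py file explicitly and save the result.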
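Explicit compilation can be tried on a single file. The following is a minimal sketch using the standard py_compile module and a throwaway spam.py in a temporary directory; the exact .pyc file name depends on the interpreter that runs it.

```python
import importlib.util
import os
import py_compile
import tempfile

src_dir = tempfile.mkdtemp()
src_path = os.path.join(src_dir, "spam.py")
with open(src_path, "w") as f:
    f.write("print('hello from spam')\n")

# Where the import system would cache the bytecode for this source file.
expected_pyc = importlib.util.cache_from_source(src_path)

# Compile explicitly; doraise=True turns compile errors into exceptions.
pyc_path = py_compile.compile(src_path, doraise=True)

print(os.path.basename(pyc_path))  # e.g. spam.cpython-312.pyc, depending on the interpreter
print(pyc_path == expected_pyc, os.path.exists(pyc_path))
```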

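Packages:

You may wonder: what if modules written by different people have the same name? To avoid module name clashes, Python lets you organize modules by directory; such a directory is called a package.

A function is like a tool, a module is like a toolkit, and a package is like a collection of toolkits.

For example, a file init.py is a module named init, and a file Demo.py is a module named Demo.

Now suppose the names init and Demo clash with other modules. We can organize the modules in a package to avoid the clash: choose a top-level package name, say oop, and lay the files out like this:

oop/
├── __init__.py
├── init.py
└── Demo.py

With packages, as long as the top-level package name doesn't clash with anyone else's, none of the modules will. The module init.py is now named oop.init, and likewise Demo.py becomes oop.Demo.

Note that every package directory contains an __init__.py file. For a regular package this file must exist; without it, Python does not treat the directory as a regular package (since Python 3.3 such a directory can only act as a namespace package).

__init__.py may be empty or contain Python code, because __init__.py is itself a module, and its module name is oop.

Similarly, directories can be nested to form a multi-level package hierarchy, for example:

oop/
├── __init__.py
├── Demo.py
└── conf/
    ├── __init__.py
    └── Demo.py

The two Demo.py files have the module names oop.Demo and oop.conf.Demo.

Whether in the import form or the from ... import form, whenever a dot appears in the import statement itself (as opposed to in later use), pay attention: this is syntax that applies only to packages.

At its core, a package is a directory containing an __init__.py file.

Modules with the same name in package A and package B do not clash, since A.a and B.a live in two different namespaces: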

glance/                   # Top-level package
├── __init__.py           # Initialize the glance package
├── api                   # Subpackage for api
│   ├── __init__.py
│   ├── policy.py
│   └── versions.py
├── cmd                   # Subpackage for cmd
│   ├── __init__.py
│   └── manage.py
└── db                    # Subpackage for db
    ├── __init__.py
    └── models.py

# File contents

# policy.py
def get():
    print('from policy.py')

# versions.py
def create_resource(conf):
    print('from version.py: ', conf)

# manage.py
def main():
    print('from manage.py')

# models.py
def register_models(engine):
    print('from models.py: ', engine)

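2.1 Notes
1. Import statements for packages also come in two forms, import and from ... import ..., but for both, wherever they appear, one rule holds at import time: whatever is to the left of a dot must be a package, otherwise the import is invalid. A chain of dots is allowed, as in item.subitem.subsubitem, but every step must follow this rule.

2. After the import, this restriction no longer applies when using the names: the left side of a dot can be a package, module, function, or class (each can access its own attributes with a dot).

3. Comparing import item with from item import name:
to use name directly, without a prefix, the latter form is required.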

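import

We test in a file at the same level as the glance package:

import glance.db.models
glance.db.models.register_models('mysql')

from ... import ...

Note that the name after import in a from statement must be a single name without dots, otherwise it is a syntax error; for example, from a import b.c is invalid.

We test in a file at the same level as the glance package:

from glance.db import models
models.register_models('mysql')

from glance.db.models import register_models
register_models('mysql')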
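The three import forms can be tried without creating files by hand. This is a sketch under assumptions: it builds a cut-down copy of the layout in a temporary directory under the hypothetical name demo_glance, and register_models returns its message instead of printing it so the result can be compared.

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
db_dir = os.path.join(root, "demo_glance", "db")  # hypothetical package name
os.makedirs(db_dir)

# Every package directory gets an (empty) __init__.py.
for pkg_dir in (os.path.join(root, "demo_glance"), db_dir):
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()

with open(os.path.join(db_dir, "models.py"), "w") as f:
    f.write("def register_models(engine):\n    return 'from models.py: ' + engine\n")

sys.path.insert(0, root)

import demo_glance.db.models                           # full dotted path needed at use
from demo_glance.db import models                      # module bound to a short name
from demo_glance.db.models import register_models      # function bound directly

results = [
    demo_glance.db.models.register_models("mysql"),
    models.register_models("mysql"),
    register_models("mysql"),
]
print(results)
```

All three forms refer to the same module object; they differ only in which name is bound in the importing file.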

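The __init__.py file

Whichever form is used, the first time a package or any part of it is imported, the __init__.py files of the packages along the path are executed in order (you can verify this by printing a line in each package's __init__.py). The file may be empty, but it can also hold code that initializes the package.

from glance.api import *

When discussing modules we covered importing everything (*) from a module. Here we look at importing everything from a package.

The intent is to import everything from the api package. In fact this statement only executes the code in api's __init__.py (so the variables and functions defined there become available at the import site) and does not import the other modules in the package. To import some or all of those modules, define __all__ in this file and list them in it:

# Defined in __init__.py
x = 10

def func():
    print('from api.__init__.py')

__all__ = ['x', 'func', 'policy']

Now, from glance.api import * executed in a file at the same level as glance imports exactly what __all__ lists (versions still is not imported). Likewise, any variable or function in __init__.py that is not listed in __all__ cannot be used directly at the import site. In short, __all__ limits what from glance.api import * imports to the names listed in __all__ in __init__.py. (Note: __all__ only affects from module import *.)

Note: the above applies only when importing everything from a package with from glance.api import *. Other forms work as usual:

Importing one file from a package, as in from glance.db import models, imports the models module from the glance.db package, after which models.register_models() can be called.

Importing all names that do not start with _ from a module inside a package also works:

from test.lib.aa import *  # Imports from the aa module in the lib package of the test package; the af() function from aa can then be used below

af()

This form works, but it is not recommended.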
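The effect of __all__ on a star import from a package can be checked with a small experiment. The package name demo_api and the extra variable y are assumptions added for this sketch; the rest mirrors the __init__.py shown above.

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_api")  # hypothetical package name
os.makedirs(pkg_dir)

files = {
    # y is deliberately left out of __all__.
    "__init__.py": "x = 10\ny = 20\n\ndef func():\n    return 'from demo_api.__init__'\n\n"
                   "__all__ = ['x', 'func', 'policy']\n",
    "policy.py": "def get():\n    return 'from policy.py'\n",
    "versions.py": "def create_resource(conf):\n    return conf\n",
}
for name, source in files.items():
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(source)
sys.path.insert(0, root)

# Run the star import in a fresh namespace and list what arrived.
namespace = {}
exec("from demo_api import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))

print(imported)                    # ['func', 'policy', 'x']
print(namespace["policy"].get())   # from policy.py
```

The submodule policy is imported because it is listed in __all__, while versions and y are not.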

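Absolute and relative imports

Our top-level package glance is written for others to use, but modules inside glance also need to import each other. There are two ways to do so:

Absolute import: starts from glance

Relative import: starts with . or .. (usable only inside a package, not across unrelated directories)

For example, suppose glance/api/versions.py imports glance/api/policy.py with a plain import policy. If a .py file in the same directory as the glance package then imports glance/api/versions.py, the import fails:

# glance/api/versions.py
import policy
policy.get()  # Works when versions.py is run directly, because its own directory is then on the search path

def create_resource(conf):
    print('from version.py: ', conf)

# Demo.py, in the same directory as the glance package
import glance.api.versions  # Fails: the search path is now based on Demo.py's directory, so when versions.py
                            # runs import policy, policy is looked up there and cannot be found

The following error is raised (for the reason given above):

ImportError: No module named 'policy'

Take particular care here: plain import is fine for built-in and third-party modules, but inside your own package avoid a bare import of a sibling module. Use an absolute or relative from ... import ... instead; relative imports can only be written in the from form.

So in glance/api/versions.py, glance/api/policy.py can be imported with either an absolute or a relative path:

from glance.api import policy  # Absolute import
# from . import policy         # Relative import; another example: from ..cmd import manage
policy.get()

def create_resource(conf):
    print('from version.py: ', conf)

This matters most once a project is organized as several packages, each containing subpackages or .py files. Files inside a package should always import each other with from ... import ..., which keeps them usable both from your other packages and by other people.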
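The difference between a relative import and a bare import of a sibling module can be reproduced in isolation. This is a sketch under assumptions: the package demo_pkg and the modules demo_policy, versions, and versions_bare are hypothetical names created in a temporary directory.

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
api_dir = os.path.join(root, "demo_pkg", "api")  # hypothetical package layout
os.makedirs(api_dir)

files = {
    os.path.join(root, "demo_pkg", "__init__.py"): "",
    os.path.join(api_dir, "__init__.py"): "",
    os.path.join(api_dir, "demo_policy.py"): "def get():\n    return 'from demo_policy.py'\n",
    # Relative import: resolved against the package, not against sys.path.
    os.path.join(api_dir, "versions.py"):
        "from . import demo_policy\n\n"
        "def create_resource(conf):\n    return demo_policy.get() + ' / ' + conf\n",
    # Bare import: only works if the api directory itself is on sys.path.
    os.path.join(api_dir, "versions_bare.py"): "import demo_policy\n",
}
for path, source in files.items():
    with open(path, "w") as f:
        f.write(source)
sys.path.insert(0, root)

import demo_pkg.api.versions
result = demo_pkg.api.versions.create_resource("conf")
print(result)  # from demo_policy.py / conf

try:
    import demo_pkg.api.versions_bare
    bare_import_error = None
except ImportError as exc:
    bare_import_error = str(exc)
print(bare_import_error)  # No module named 'demo_policy'
```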

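Importing a package on its own

Importing just the package name does not import the submodules it contains:

# In test.py, at the same level as glance
import glance
glance.cmd.manage.main()

'''
Result:
AttributeError: module 'glance' has no attribute 'cmd'
'''

Solution:

# glance/__init__.py
from . import cmd

# glance/cmd/__init__.py
from . import manage

Can __all__ solve this instead? No: __all__ only controls from ... import *.

To sum up: inside a package, import your own .py files with from ... import ... (plain import is fine for built-in or third-party packages). To make submodules available when the package itself is imported from outside (this is unnecessary when importing a specific file from the package), add from . import module to the package's __init__.py, as shown above.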

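Open-source modules

1. Download and install

There are two ways to install:

Using a package manager:

yum
pip
apt-get
...

From source:

Download the source
Unpack the source
Enter the directory
Build the source:    python setup.py build
Install the source:  python setup.py install

Note: installing from source requires the gcc compiler and the Python development headers, so first run:

yum install gcc
yum install python-devel

or

apt-get install python-dev

After a successful installation the module is placed in one of the directories on sys.path, for example:

/usr/lib/python2.7/site-packages/

2. Importing the module

Imported the same way as a custom module.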

3. The paramiko module

paramiko is a module for remote control: it lets you run commands on a remote server and transfer files to and from it. Notably, Fabric relies on paramiko for its remote management, and Ansible can use it as well.

1. Download and install

pip3 install paramiko

or

# paramiko 1.10.1 depends on pycrypto, so download and install pycrypto first

# Download and install pycrypto
wget http://files.cnblogs.com/files/wupeiqi/pycrypto-2.6.1.tar.gz
tar -xvf pycrypto-2.6.1.tar.gz
cd pycrypto-2.6.1
python setup.py build
python setup.py install

# Start Python and import Crypto to check that the installation succeeded

# Download and install paramiko
wget http://files.cnblogs.com/files/wupeiqi/paramiko-1.10.1.tar.gz
tar -xvf paramiko-1.10.1.tar.gz
cd paramiko-1.10.1
python setup.py build
python setup.py install

# Start Python and import paramiko to check that the installation succeeded

2. Using the module

Running a command, connecting to the server with a username and password:

#!/usr/bin/env python
# coding: utf-8
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('192.168.1.108', 22, 'alex', '')
stdin, stdout, stderr = ssh.exec_command('df')
print(stdout.read().decode())
ssh.close()

A command can also be run over a connection authenticated with a key instead of a password (connect() accepts a private key through its pkey argument).

Uploading or downloading a file with a username and password:

# Upload
import paramiko

t = paramiko.Transport(('182.92.219.86', 22))
t.connect(username='wupeiqi', password='')
sftp = paramiko.SFTPClient.from_transport(t)
sftp.put('/tmp/test.py', '/tmp/test.py')
t.close()

# Download
import paramiko

t = paramiko.Transport(('182.92.219.86', 22))
t.connect(username='wupeiqi', password='')
sftp = paramiko.SFTPClient.from_transport(t)
sftp.get('/tmp/test.py', '/tmp/test2.py')
t.close()

Uploading or downloading with a key works the same way, passing the key to connect() through pkey instead of a password.

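Built-in modules

Some of Python's built-in modules, such as sys, have no .py file anywhere in the installed interpreter, because they are compiled into it. The installation also ships a Lib directory (among others) that does contain .py files. We refer to both kinds, the bundled .py modules and those without a .py file, as built-in modules.

1. os

Provides operating-system-level operations

os.getcwd()  Returns the current working directory, i.e. the directory the current Python script is working in
os.chdir("dirname")  Changes the current working directory; like cd in the shell
os.curdir  The string for the current directory: ('.')
os.pardir  The string for the parent directory: ('..')
os.makedirs('dirname1/dirname2')  Creates nested directories recursively
os.removedirs('dirname1')  Removes the directory if it is empty, then moves up to the parent and removes it too if empty, and so on
os.mkdir('dirname')  Creates a single directory; like mkdir dirname in the shell
os.rmdir('dirname')  Removes a single empty directory; raises an error if it is not empty; like rmdir dirname in the shell
os.listdir('dirname')  Returns a list of all files and subdirectories in the directory, including hidden files
os.remove()  Deletes a file
os.rename("oldname","newname")  Renames a file or directory
os.stat('path/filename')  Returns information about a file or directory
os.sep  The OS-specific path separator: "\\" on Windows, "/" on Linux
os.linesep  The line terminator of the current platform: "\r\n" on Windows, "\n" on Linux
os.pathsep  The string that separates entries in a list of paths (such as PATH): ";" on Windows, ":" on Linux
os.name  A string naming the current platform: Windows -> 'nt'; Linux -> 'posix'
os.system("bash command")  Runs a shell command, with its output shown directly
os.environ  The system environment variables
os.path.abspath(path)  Returns the normalized absolute version of path
os.path.split(path)  Splits path into a (directory, file name) pair
os.path.dirname(path)  Returns the directory part of path, i.e. the first element of os.path.split(path)
os.path.basename(path)  Returns the final file name in path, i.e. the second element of os.path.split(path). If path ends with / or \, an empty string is returned
os.path.exists(path)  Returns True if path exists, False otherwise
os.path.isabs(path)  Returns True if path is an absolute path
os.path.isfile(path)  Returns True if path is an existing file, False otherwise
os.path.isdir(path)  Returns True if path is an existing directory, False otherwise
os.path.join(path1[, path2[, ...]])  Joins several path components; any components before the last absolute path are discarded
os.path.getatime(path)  Returns the last access time of the file or directory that path points to
os.path.getmtime(path)  Returns the last modification time of the file or directory that path points to

For more, see the official documentation of the os module.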
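A few of the calls listed above can be exercised together. This is a minimal sketch that works inside a temporary directory, so the directory and file names (dirname1, dirname2, notes.txt) are placeholders.

```python
import os
import tempfile

base = tempfile.mkdtemp()
nested = os.path.join(base, "dirname1", "dirname2")
os.makedirs(nested)  # creates dirname1 and dirname2 in one call

file_path = os.path.join(nested, "notes.txt")
with open(file_path, "w") as f:
    f.write("hello")

directory, filename = os.path.split(file_path)
print(filename == os.path.basename(file_path))                 # True
print(directory == os.path.dirname(file_path))                 # True
print(os.path.isfile(file_path), os.path.isdir(nested))        # True True
print(os.listdir(nested))                                      # ['notes.txt']

# Components before the last absolute one are dropped by join().
joined = os.path.join("a", os.sep + "b", "c")
print(joined)

os.remove(file_path)
os.removedirs(nested)  # removes dirname2, then every parent that is now empty
still_there = os.path.exists(os.path.join(base, "dirname1"))
print(still_there)  # False
```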

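2. sys

Provides operations related to the interpreter

sys.argv  The list of command-line arguments; the first element is the path of the program itself
sys.exit(n)  Exits the program; use exit(0) for a normal exit
sys.version  The version information of the Python interpreter
sys.maxsize  The largest value a platform-sized integer can hold; it replaces Python 2's sys.maxint, which no longer exists in Python 3
sys.path  The module search path; initialized in part from the PYTHONPATH environment variable
sys.platform  The name of the operating system platform

sys.stdout.write('please:')
val = sys.stdin.readline()[:-1]

For more, see the official documentation of the sys module.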

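3. hashlib

Used for hashing. It replaces the md5 and sha modules and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512, and MD5 algorithms.

The md5 module (deprecated; Python 2 only, removed in Python 3):

import md5
hash = md5.new()
hash.update('admin')
print hash.hexdigest()

The sha module (deprecated; Python 2 only, removed in Python 3):

import sha
hash = sha.new()
hash.update('admin')
print hash.hexdigest()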

import hashlib

# ######## md5 ########
hash = hashlib.md5()
hash.update(b'admin')   # update() takes bytes in Python 3
print(hash.hexdigest())

# ######## sha1 ########
hash = hashlib.sha1()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha256 ########
hash = hashlib.sha256()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha384 ########
hash = hashlib.sha384()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha512 ########
hash = hashlib.sha512()
hash.update(b'admin')
print(hash.hexdigest())

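These hash algorithms are strong, but they share a weakness: the hashes of common inputs can be looked up in precomputed tables and so reversed. It is therefore worth mixing a custom key (a salt) into the hash.

import hashlib

# ######## md5 ########
hash = hashlib.md5(b'898oaFs09f')   # custom key
hash.update(b'admin')
print(hash.hexdigest())

Still not enough? Python also has an hmac module, which internally processes the key and the message further before hashing.

A hash-based message authentication code (HMAC) is an authentication mechanism built on message authentication codes (MAC). With HMAC, the two parties in a communication verify the authenticity of a message using a shared secret key K.

It is typically used to authenticate messages in network communication. The two sides first agree on a key, like a secret handshake. The sender computes a digest of the message with the key and sends it along with the message; the receiver computes the digest of the received plaintext with the same key and compares it with the sender's. If they match, the message is authentic and the sender legitimate.

import hmac

# Non-ASCII text must be encoded to bytes; digestmod is required since Python 3.8 (md5 was the old default)
h = hmac.new('天王盖地虎'.encode('utf-8'), '宝塔镇河妖'.encode('utf-8'), digestmod='md5')
print(h.hexdigest())

For more on md5, sha1, sha256 and so on, see https://www.tbs-certificates.co.uk/FAQ/en/sha256.html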
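The sender/receiver exchange described above can be sketched in a few lines. The key, the messages, and the helper names sign and verify are assumptions made for this example; SHA-256 is used here as the digest, and hmac.compare_digest performs the comparison in constant time.

```python
import hashlib
import hmac

key = b"shared-secret-key"          # hypothetical key agreed on in advance
message = b"transfer 100 to bob"    # hypothetical message

def sign(key, message):
    """Sender side: compute the HMAC-SHA256 digest of the message."""
    return hmac.new(key, message, digestmod=hashlib.sha256).hexdigest()

def verify(key, message, signature):
    """Receiver side: recompute the digest and compare in constant time."""
    return hmac.compare_digest(sign(key, message), signature)

signature = sign(key, message)
print(verify(key, message, signature))                 # True
print(verify(key, b"transfer 900 to bob", signature))  # False: message was altered
print(verify(b"wrong-key", message, signature))        # False: different key
```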

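4. The random module

Random numbers

import random

print(random.random())          # float in [0.0, 1.0)
print(random.randint(1, 2))     # integer in [1, 2], both ends included
print(random.randrange(1, 10))  # integer in [1, 10), the end excluded

Generating a random verification code

import random

checkcode = ''
for i in range(4):
    current = random.randrange(0, 4)
    if current != i:
        temp = chr(random.randint(65, 90))   # uppercase letter A-Z
    else:
        temp = random.randint(0, 9)          # digit 0-9
    checkcode += str(temp)
print(checkcode)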

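5. json & pickle

Two modules for serialization:

json converts between strings and Python data types
pickle converts between a Python-specific byte format and Python data types

The json module provides four functions: dumps, dump, loads, load

The pickle module provides four functions: dumps, dump, loads, load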
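The four functions come in two pairs: dumps/loads work with in-memory strings or bytes, dump/load work with file objects. The sketch below uses made-up sample data (record, event) and a temporary file to show both pairs, and one case that pickle handles but json does not.

```python
import datetime
import json
import os
import pickle
import tempfile

record = {"name": "alex", "age": 22, "tags": ["a", "b"]}  # sample data

# dumps/loads: object <-> str
text = json.dumps(record)
from_text = json.loads(text)

# dump/load: object <-> file
path = os.path.join(tempfile.mkdtemp(), "record.json")
with open(path, "w") as f:
    json.dump(record, f)
with open(path) as f:
    from_file = json.load(f)

# pickle handles Python-specific types such as date and set; the result is bytes.
event = {"when": datetime.date(2016, 8, 19), "ids": {1, 2, 3}}
blob = pickle.dumps(event)
from_blob = pickle.loads(blob)

# json refuses the same object because date and set have no JSON equivalent.
try:
    json.dumps(event)
    json_accepts_event = True
except TypeError:
    json_accepts_event = False

print(type(text).__name__, type(blob).__name__)  # str bytes
print(from_text == record, from_file == record, from_blob == event, json_accepts_event)
```

Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.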

6. time & datetime

# _*_ coding: utf-8 _*_
__author__ = 'Alex Li'

import time

# print(time.clock())  # Processor time; deprecated since 3.3 and removed in 3.8. Use time.process_time() instead, which measures CPU time and excludes sleep time
print(time.altzone)  # Offset of the local DST timezone from UTC, in seconds
print(time.asctime())  # Returns a string like "Fri Aug 19 11:14:16 2016"
print(time.localtime())  # Local time as a struct_time object
print(time.gmtime(time.time() - 800000))  # UTC time as a struct_time object
print(time.asctime(time.localtime()))  # Returns a string like "Fri Aug 19 11:14:16 2016"
print(time.ctime())  # Returns a string like "Fri Aug 19 12:38:29 2016", same as above

# Date string to timestamp
string_2_struct = time.strptime("2016/05/22", "%Y/%m/%d")  # Parses a date string into a struct_time object
print(string_2_struct)

struct_2_stamp = time.mktime(string_2_struct)  # Converts a struct_time object into a timestamp
print(struct_2_stamp)

# Timestamp to string
print(time.gmtime(time.time() - 86640))  # Converts a UTC timestamp into a struct_time object
print(time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime()))  # Formats a UTC struct_time as a string

# Date and time arithmetic
import datetime

print(datetime.datetime.now())  # e.g. 2016-08-19 12:47:03.941925
print(datetime.date.fromtimestamp(time.time()))  # Timestamp straight to a date, e.g. 2016-08-19
print(datetime.datetime.now() + datetime.timedelta(3))  # Current time + 3 days
print(datetime.datetime.now() + datetime.timedelta(-3))  # Current time - 3 days
print(datetime.datetime.now() + datetime.timedelta(hours=3))  # Current time + 3 hours
print(datetime.datetime.now() + datetime.timedelta(minutes=30))  # Current time + 30 minutes

c_time = datetime.datetime.now()
print(c_time.replace(minute=3, hour=2))  # Replaces individual fields of the time

Directive	Meaning	Notes
`%a`	Locale’s abbreviated weekday name.
`%A`	Locale’s full weekday name.
`%b`	Locale’s abbreviated month name.
`%B`	Locale’s full month name.
`%c`	Locale’s appropriate date and time representation.
`%d`	Day of the month as a decimal number [01,31].
`%H`	Hour (24-hour clock) as a decimal number [00,23].
`%I`	Hour (12-hour clock) as a decimal number [01,12].
`%j`	Day of the year as a decimal number [001,366].
`%m`	Month as a decimal number [01,12].
`%M`	Minute as a decimal number [00,59].
`%p`	Locale’s equivalent of either AM or PM.	(1)
`%S`	Second as a decimal number [00,61].	(2)
`%U`	Week number of the year (Sunday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Sunday are considered to be in week 0.	(3)
`%w`	Weekday as a decimal number [0(Sunday),6].
`%W`	Week number of the year (Monday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Monday are considered to be in week 0.	(3)
`%x`	Locale’s appropriate date representation.
`%X`	Locale’s appropriate time representation.
`%y`	Year without century as a decimal number [00,99].
`%Y`	Year with century as a decimal number.
`%z`	Time zone offset indicating a positive or negative time difference from UTC/GMT of the form +HHMM or -HHMM, where H represents decimal hour digits and M represents decimal minute digits [-23:59, +23:59].
`%Z`	Time zone name (no characters if no time zone exists).
`%%`	A literal `'%'` character.
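A few of the directives in the table can be checked on fixed dates, which keeps the output predictable. This is a small sketch; only locale-independent directives are used, and the dates are the ones that appear in the examples above.

```python
import datetime
import time

# Parse a date string, then format the resulting struct_time.
parsed = time.strptime("2016/05/22", "%Y/%m/%d")
formatted = time.strftime("%Y-%m-%d %H:%M:%S", parsed)
print(formatted)                                   # 2016-05-22 00:00:00
print(parsed.tm_yday, time.strftime("%j", parsed)) # 143 143 (day of the year)

# datetime objects accept the same directives.
moment = datetime.datetime(2016, 8, 19, 12, 47, 3)
print(moment.strftime("%Y-%m-%d %I:%M"))  # 2016-08-19 12:47 (%I is the 12-hour clock)
print(moment.strftime("%w"))              # 5 (Friday; Sunday is 0)

later = moment + datetime.timedelta(days=3, hours=3)
print(later.strftime("%Y-%m-%d %H:%M"))   # 2016-08-22 15:47
```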

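7. shutil

A high-level module for handling files, directories, and archives. The function bodies quoted below come from the Python 2 standard library source and are shown for reference.

shutil.copyfileobj(fsrc, fdst[, length])
Copies the contents of one file object into another, starting at the current position of fsrc and reading length bytes at a time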

def copyfileobj(fsrc, fdst, length=16*1024):

    """copy data from file-like object fsrc to file-like object fdst"""

    while 1:

        buf = fsrc.read(length)

        if not buf:

            break

        fdst.write(buf)

shutil.copyfile(src, dst)
Copies a file

def copyfile(src, dst):

    """Copy data from src to dst"""

    if _samefile(src, dst):

        raise Error("`%s` and `%s` are the same file" % (src, dst))

    for fn in [src, dst]:

        try:

            st = os.stat(fn)

        except OSError:

            # File most likely does not exist

            pass

        else:

            # XXX What about other special files? (sockets, devices...)

            if stat.S_ISFIFO(st.st_mode):

                raise SpecialFileError("`%s` is a named pipe" % fn)

    with open(src, 'rb') as fsrc:

        with open(dst, 'wb') as fdst:

            copyfileobj(fsrc, fdst)

shutil.copymode(src, dst)
Copies only the permission bits; the contents, group, and owner are left unchanged

def copymode(src, dst):

    """Copy mode bits from src to dst"""

    if hasattr(os, 'chmod'):

        st = os.stat(src)

        mode = stat.S_IMODE(st.st_mode)

        os.chmod(dst, mode)

shutil.copystat(src, dst)
Copies the status information, including mode bits, atime, mtime, flags

def copystat(src, dst):

    """Copy all stat info (mode bits, atime, mtime, flags) from src to dst"""

    st = os.stat(src)

    mode = stat.S_IMODE(st.st_mode)

    if hasattr(os, 'utime'):

        os.utime(dst, (st.st_atime, st.st_mtime))

    if hasattr(os, 'chmod'):

        os.chmod(dst, mode)

    if hasattr(os, 'chflags') and hasattr(st, 'st_flags'):

        try:

            os.chflags(dst, st.st_flags)

        except OSError, why:

            for err in 'EOPNOTSUPP', 'ENOTSUP':

                if hasattr(errno, err) and why.errno == getattr(errno, err):

                    break

            else:

                raise

shutil.copy(src, dst)
Copies the file and its permissions

def copy(src, dst):

    """Copy data and mode bits ("cp src dst").

    The destination may be a directory.

    """

    if os.path.isdir(dst):

        dst = os.path.join(dst, os.path.basename(src))

    copyfile(src, dst)

    copymode(src, dst)

shutil.copy2(src, dst)
Copies the file and its status information

def copy2(src, dst):

    """Copy data and all stat info ("cp -p src dst").

    The destination may be a directory.

    """

    if os.path.isdir(dst):

        dst = os.path.join(dst, os.path.basename(src))

    copyfile(src, dst)

    copystat(src, dst)

shutil.ignore_patterns(*patterns)
shutil.copytree(src, dst, symlinks=False, ignore=None)
Recursively copies a directory tree

For example: copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))
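The ignore argument can be seen in action on a small tree. This is a minimal sketch that builds a throwaway source directory in a temporary location; all file and directory names are placeholders.

```python
import os
import shutil
import tempfile

work = tempfile.mkdtemp()
src = os.path.join(work, "source")
os.makedirs(os.path.join(src, "sub"))
for rel in ("keep.py", "skip.pyc", "tmp_cache.txt", os.path.join("sub", "inner.py")):
    with open(os.path.join(src, rel), "w") as f:
        f.write("data")

dst = os.path.join(work, "destination")  # must not exist yet
shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.pyc", "tmp*"))

# Collect the copied files as paths relative to the destination.
copied = sorted(
    os.path.relpath(os.path.join(dirpath, name), dst)
    for dirpath, _, names in os.walk(dst)
    for name in names
)
print(copied)  # keep.py and sub/inner.py; the .pyc and tmp* files are skipped

shutil.rmtree(work)  # clean up the whole temporary tree
```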

def ignore_patterns(*patterns):

    """Function that can be used as copytree() ignore parameter.

    Patterns is a sequence of glob-style patterns

    that are used to exclude files"""

    def _ignore_patterns(path, names):

        ignored_names = []

        for pattern in patterns:

            ignored_names.extend(fnmatch.filter(names, pattern))

        return set(ignored_names)

    return _ignore_patterns

def copytree(src, dst, symlinks=False, ignore=None):

    """Recursively copy a directory tree using copy2().

    The destination directory must not already exist.

    If exception(s) occur, an Error is raised with a list of reasons.

    If the optional symlinks flag is true, symbolic links in the

    source tree result in symbolic links in the destination tree; if

    it is false, the contents of the files pointed to by symbolic

    links are copied.

    The optional ignore argument is a callable. If given, it

    is called with the `src` parameter, which is the directory

    being visited by copytree(), and `names` which is the list of

    `src` contents, as returned by os.listdir():

        callable(src, names) -> ignored_names

    Since copytree() is called recursively, the callable will be

    called once for each directory that is copied. It returns a

    list of names relative to the `src` directory that should

    not be copied.

    XXX Consider this example code rather than the ultimate tool.

    """

    names = os.listdir(src)

    if ignore is not None:

        ignored_names = ignore(src, names)

    else:

        ignored_names = set()

    os.makedirs(dst)

    errors = []

    for name in names:

        if name in ignored_names:

            continue

        srcname = os.path.join(src, name)

        dstname = os.path.join(dst, name)

        try:

            if symlinks and os.path.islink(srcname):

                linkto = os.readlink(srcname)

                os.symlink(linkto, dstname)

            elif os.path.isdir(srcname):

                copytree(srcname, dstname, symlinks, ignore)

            else:

                # Will raise a SpecialFileError for unsupported file types

                copy2(srcname, dstname)

        # catch the Error from the recursive copytree so that we can

        # continue with other files

        except Error, err:

            errors.extend(err.args[0])

        except EnvironmentError, why:

            errors.append((srcname, dstname, str(why)))

    try:

        copystat(src, dst)

    except OSError, why:

        if WindowsError is not None and isinstance(why, WindowsError):

            # Copying file access times may fail on Windows

            pass

        else:

            errors.append((src, dst, str(why)))

    if errors:

        raise Error, errors

shutil.rmtree(path[, ignore_errors[, onerror]])
Recursively deletes a directory tree

def rmtree(path, ignore_errors=False, onerror=None):

    """Recursively delete a directory tree.

    If ignore_errors is set, errors are ignored; otherwise, if onerror

    is set, it is called to handle the error with arguments (func,

    path, exc_info) where func is os.listdir, os.remove, or os.rmdir;

    path is the argument to that function that caused it to fail; and

    exc_info is a tuple returned by sys.exc_info().  If ignore_errors

    is false and onerror is None, an exception is raised.

    """

    if ignore_errors:

        def onerror(*args):

            pass

    elif onerror is None:

        def onerror(*args):

            raise

    try:

        if os.path.islink(path):

            # symlinks to directories are forbidden, see bug #1669

            raise OSError("Cannot call rmtree on a symbolic link")

    except OSError:

        onerror(os.path.islink, path, sys.exc_info())

        # can't continue even if onerror hook returns

        return

    names = []

    try:

        names = os.listdir(path)

    except os.error, err:

        onerror(os.listdir, path, sys.exc_info())

    for name in names:

        fullname = os.path.join(path, name)

        try:

            mode = os.lstat(fullname).st_mode

        except os.error:

            mode = 0

        if stat.S_ISDIR(mode):

            rmtree(fullname, ignore_errors, onerror)

        else:

            try:

                os.remove(fullname)

            except os.error, err:

                onerror(os.remove, fullname, sys.exc_info())

    try:

        os.rmdir(path)

    except os.error:

        onerror(os.rmdir, path, sys.exc_info())

shutil.move(src, dst)
Recursively moves a file or directory

def move(src, dst):

    """Recursively move a file or directory to another location. This is

    similar to the Unix "mv" command.

    If the destination is a directory or a symlink to a directory, the source

    is moved inside the directory. The destination path must not already

    exist.

    If the destination already exists but is not a directory, it may be

    overwritten depending on os.rename() semantics.

    If the destination is on our current filesystem, then rename() is used.

    Otherwise, src is copied to the destination and then removed.

    A lot more could be done here...  A look at a mv.c shows a lot of

    the issues this implementation glosses over.

    """

    real_dst = dst

    if os.path.isdir(dst):

        if _samefile(src, dst):

            # We might be on a case insensitive filesystem,

            # perform the rename anyway.

            os.rename(src, dst)

            return

        real_dst = os.path.join(dst, _basename(src))

        if os.path.exists(real_dst):

            raise Error, "Destination path '%s' already exists" % real_dst

    try:

        os.rename(src, real_dst)

    except OSError:

        if os.path.isdir(src):

            if _destinsrc(src, dst):

                raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)

            copytree(src, real_dst, symlinks=True)

            rmtree(src)

        else:

            copy2(src, real_dst)

            os.unlink(src)

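shutil.make_archive(base_name, format,...)

Creates an archive (for example zip or tar) and returns its file path.

base_name: the file name of the archive, optionally with a path. With a bare file name the archive is saved in the current directory; otherwise it is saved at the given path, e.g.
www => saved in the current directory
/Users/wupeiqi/www => saved in /Users/wupeiqi/
format: the archive type: "zip", "tar", "bztar", "gztar"
root_dir: the directory to archive (defaults to the current directory)
owner: the owner, defaults to the current user
group: the group, defaults to the current group
logger: used for logging, usually a logging.Logger object

# Archive the files under /Users/wupeiqi/Downloads/test and place the archive in the program's current directory
import shutil
ret = shutil.make_archive("wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')

# Archive the files under /Users/wupeiqi/Downloads/test and place the archive in /Users/wupeiqi/
import shutil
ret = shutil.make_archive("/Users/wupeiqi/wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')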
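The same call can be tried end to end without touching real directories. The sketch below assumes a temporary working directory with one sample file; it creates a gztar archive and then restores it with shutil.unpack_archive, the counterpart of make_archive.

```python
import os
import shutil
import tempfile

work = tempfile.mkdtemp()
source_dir = os.path.join(work, "test")
os.makedirs(source_dir)
with open(os.path.join(source_dir, "a.log"), "w") as f:
    f.write("log line\n")

# base_name has no extension; make_archive adds .tar.gz for the gztar format.
archive_path = shutil.make_archive(os.path.join(work, "backup"), "gztar", root_dir=source_dir)
archive_name = os.path.basename(archive_path)
print(archive_name)  # backup.tar.gz

# Restore the archive into a separate directory.
unpack_dir = os.path.join(work, "unpacked")
shutil.unpack_archive(archive_path, unpack_dir)
unpacked = os.listdir(unpack_dir)
print(unpacked)  # ['a.log']

shutil.rmtree(work)
```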

def make_archive(base_name, format, root_dir=None, base_dir=None, verbose=0,

                 dry_run=0, owner=None, group=None, logger=None):

    """Create an archive file (eg. zip or tar).

    'base_name' is the name of the file to create, minus any format-specific

    extension; 'format' is the archive format: one of "zip", "tar", "bztar"

    or "gztar".

    'root_dir' is a directory that will be the root directory of the

    archive; ie. we typically chdir into 'root_dir' before creating the

    archive.  'base_dir' is the directory where we start archiving from;

    ie. 'base_dir' will be the common prefix of all files and

    directories in the archive.  'root_dir' and 'base_dir' both default

    to the current directory.  Returns the name of the archive file.

    'owner' and 'group' are used when creating a tar archive. By default,

    uses the current owner and group.

    """

    save_cwd = os.getcwd()

    if root_dir is not None:

        if logger is not None:

            logger.debug("changing into '%s'", root_dir)

        base_name = os.path.abspath(base_name)

        if not dry_run:

            os.chdir(root_dir)

    if base_dir is None:

        base_dir = os.curdir

    kwargs = {'dry_run': dry_run, 'logger': logger}

    try:

        format_info = _ARCHIVE_FORMATS[format]

    except KeyError:

        raise ValueError, "unknown archive format '%s'" % format

    func = format_info[0]

    for arg, val in format_info[1]:

        kwargs[arg] = val

    if format != 'zip':

        kwargs['owner'] = owner

        kwargs['group'] = group

    try:

        filename = func(base_name, base_dir, **kwargs)

    finally:

        if root_dir is not None:

            if logger is not None:

                logger.debug("changing back to '%s'", save_cwd)

            os.chdir(save_cwd)

    return filename

shutil handles archives by calling the zipfile and tarfile modules. In detail:

zipfile: compressing and extracting

import zipfile

# Compress
z = zipfile.ZipFile('laxi.zip', 'w')
z.write('a.log')
z.write('data.data')
z.close()

# Extract
z = zipfile.ZipFile('laxi.zip', 'r')
z.extractall()
z.close()

tarfile: compressing and extracting

import tarfile

# Compress
tar = tarfile.open('your.tar', 'w')
tar.add('/Users/wupeiqi/PycharmProjects/bbs2.zip', arcname='bbs2.zip')
tar.add('/Users/wupeiqi/PycharmProjects/cmdb.zip', arcname='cmdb.zip')
tar.close()

# Extract
tar = tarfile.open('your.tar', 'r')
tar.extractall()  # An extraction path can be given
tar.close()
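Both classes also work as context managers, which closes the archive even if an error occurs. This is a minimal sketch using a temporary directory and a sample a.log file; it lists and reads archive members instead of extracting them.

```python
import os
import tarfile
import tempfile
import zipfile

work = tempfile.mkdtemp()
log_path = os.path.join(work, "a.log")
with open(log_path, "w") as f:
    f.write("log line\n")

# zip: write with compression, then list and read a member.
zip_path = os.path.join(work, "laxi.zip")
with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as z:
    z.write(log_path, arcname="a.log")
with zipfile.ZipFile(zip_path) as z:
    zip_names = z.namelist()
    zip_text = z.read("a.log").decode()

# tar: add the same file under a short archive name, then list the members.
tar_path = os.path.join(work, "your.tar")
with tarfile.open(tar_path, "w") as tar:
    tar.add(log_path, arcname="a.log")
with tarfile.open(tar_path) as tar:
    tar_names = tar.getnames()

print(zip_names, repr(zip_text))  # ['a.log'] 'log line\n'
print(tar_names)                  # ['a.log']
```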

class ZipFile(object):

    """ Class with methods to open, read, write, close, list zip files.

    z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=False)

    file: Either the path to the file, or a file-like object.

          If it is a path, the file will be opened and closed by ZipFile.

    mode: The mode can be either read "r", write "w" or append "a".

    compression: ZIP_STORED (no compression) or ZIP_DEFLATED (requires zlib).

    allowZip64: if True ZipFile will create files with ZIP64 extensions when

                needed, otherwise it will raise an exception when this would

                be necessary.

    """

    fp = None                   # Set here since __del__ checks it

    def __init__(self, file, mode="r", compression=ZIP_STORED, allowZip64=False):

        """Open the ZIP file with mode read "r", write "w" or append "a"."""

        if mode not in ("r", "w", "a"):

            raise RuntimeError('ZipFile() requires mode "r", "w", or "a"')

        if compression == ZIP_STORED:

            pass

        elif compression == ZIP_DEFLATED:

            if not zlib:

                raise RuntimeError,\

                      "Compression requires the (missing) zlib module"

        else:

            raise RuntimeError, "That compression method is not supported"

        self._allowZip64 = allowZip64

        self._didModify = False

        self.debug = 0  # Level of printing: 0 through 3

        self.NameToInfo = {}    # Find file info given name

        self.filelist = []      # List of ZipInfo instances for archive

        self.compression = compression  # Method of compression

        self.mode = key = mode.replace('b', '')[0]

        self.pwd = None

        self._comment = ''

        # Check if we were passed a file-like object

        if isinstance(file, basestring):

            self._filePassed = 0

            self.filename = file

            modeDict = {'r' : 'rb', 'w': 'wb', 'a' : 'r+b'}

            try:

                self.fp = open(file, modeDict[mode])

            except IOError:

                if mode == 'a':

                    mode = key = 'w'

                    self.fp = open(file, modeDict[mode])

                else:

                    raise

        else:

            self._filePassed = 1

            self.fp = file

            self.filename = getattr(file, 'name', None)

        try:

            if key == 'r':

                self._RealGetContents()

            elif key == 'w':

                # set the modified flag so central directory gets written

                # even if no files are added to the archive

                self._didModify = True

            elif key == 'a':

                try:

                    # See if file is a zip file

                    self._RealGetContents()

                    # seek to start of directory and overwrite

                    self.fp.seek(self.start_dir, 0)

                except BadZipfile:

                    # file is not a zip file, just append

                    self.fp.seek(0, 2)

                    # set the modified flag so central directory gets written

                    # even if no files are added to the archive

                    self._didModify = True

            else:

                raise RuntimeError('Mode must be "r", "w" or "a"')

        except:

            fp = self.fp

            self.fp = None

            if not self._filePassed:

                fp.close()

            raise

    def __enter__(self):

        return self

    def __exit__(self, type, value, traceback):

        self.close()

    def _RealGetContents(self):

        """Read in the table of contents for the ZIP file."""

        fp = self.fp

        try:

            endrec = _EndRecData(fp)

        except IOError:

            raise BadZipfile("File is not a zip file")

        if not endrec:

            raise BadZipfile("File is not a zip file")

        if self.debug > 1:

            print(endrec)

        size_cd = endrec[_ECD_SIZE]             # bytes in central directory

        offset_cd = endrec[_ECD_OFFSET]         # offset of central directory

        self._comment = endrec[_ECD_COMMENT]    # archive comment

        # "concat" is zero, unless zip was concatenated to another file

        concat = endrec[_ECD_LOCATION] - size_cd - offset_cd

        if endrec[_ECD_SIGNATURE] == stringEndArchive64:

            # If Zip64 extension structures are present, account for them

            concat -= (sizeEndCentDir64 + sizeEndCentDir64Locator)

        if self.debug > 2:

            inferred = concat + offset_cd

            print("given, inferred, offset %s %s %s" % (offset_cd, inferred, concat))

        # self.start_dir:  Position of start of central directory

        self.start_dir = offset_cd + concat

        fp.seek(self.start_dir, 0)

        data = fp.read(size_cd)

        fp = cStringIO.StringIO(data)

        total = 0

        while total < size_cd:

            centdir = fp.read(sizeCentralDir)

            if len(centdir) != sizeCentralDir:

                raise BadZipfile("Truncated central directory")

            centdir = struct.unpack(structCentralDir, centdir)

            if centdir[_CD_SIGNATURE] != stringCentralDir:

                raise BadZipfile("Bad magic number for central directory")

            if self.debug > 2:

                print(centdir)

            filename = fp.read(centdir[_CD_FILENAME_LENGTH])

            # Create ZipInfo instance to store file information

            x = ZipInfo(filename)

            x.extra = fp.read(centdir[_CD_EXTRA_FIELD_LENGTH])

            x.comment = fp.read(centdir[_CD_COMMENT_LENGTH])

            x.header_offset = centdir[_CD_LOCAL_HEADER_OFFSET]

            (x.create_version, x.create_system, x.extract_version, x.reserved,

                x.flag_bits, x.compress_type, t, d,

                x.CRC, x.compress_size, x.file_size) = centdir[1:12]

            x.volume, x.internal_attr, x.external_attr = centdir[15:18]

            # Convert date/time code to (year, month, day, hour, min, sec)

            x._raw_time = t

            x.date_time = ( (d>>9)+1980, (d>>5)&0xF, d&0x1F,

                                     t>>11, (t>>5)&0x3F, (t&0x1F) * 2 )

            x._decodeExtra()

            x.header_offset = x.header_offset + concat

            x.filename = x._decodeFilename()

            self.filelist.append(x)

            self.NameToInfo[x.filename] = x

            # update total bytes read from central directory

            total = (total + sizeCentralDir + centdir[_CD_FILENAME_LENGTH]

                     + centdir[_CD_EXTRA_FIELD_LENGTH]

                     + centdir[_CD_COMMENT_LENGTH])

            if self.debug > 2:

                print("total %s" % total)

    def namelist(self):

        """Return a list of file names in the archive."""

        l = []

        for data in self.filelist:

            l.append(data.filename)

        return l

    def infolist(self):

        """Return a list of class ZipInfo instances for files in the

        archive."""

        return self.filelist

    def printdir(self):

        """Print a table of contents for the zip file."""

        print("%-46s %19s %12s" % ("File Name", "Modified    ", "Size"))

        for zinfo in self.filelist:

            date = "%d-%02d-%02d %02d:%02d:%02d" % zinfo.date_time[:6]

            print("%-46s %s %12d" % (zinfo.filename, date, zinfo.file_size))

    def testzip(self):

        """Read all the files and check the CRC."""

        chunk_size = 2 ** 20

        for zinfo in self.filelist:

            try:

                # Read by chunks, to avoid an OverflowError or a

                # MemoryError with very large embedded files.

                with self.open(zinfo.filename, "r") as f:

                    while f.read(chunk_size):     # Check CRC-32

                        pass

            except BadZipfile:

                return zinfo.filename

    def getinfo(self, name):

        """Return the instance of ZipInfo given 'name'."""

        info = self.NameToInfo.get(name)

        if info is None:

            raise KeyError(

                'There is no item named %r in the archive' % name)

        return info

    def setpassword(self, pwd):

        """Set default password for encrypted files."""

        self.pwd = pwd

    @property

    def comment(self):

        """The comment text associated with the ZIP file."""

        return self._comment

    @comment.setter

    def comment(self, comment):

        # check for valid comment length

        if len(comment) > ZIP_MAX_COMMENT:

            import warnings

            warnings.warn('Archive comment is too long; truncating to %d bytes'

                          % ZIP_MAX_COMMENT, stacklevel=2)

            comment = comment[:ZIP_MAX_COMMENT]

        self._comment = comment

        self._didModify = True

    def read(self, name, pwd=None):

        """Return file bytes (as a string) for name."""

        return self.open(name, "r", pwd).read()

    def open(self, name, mode="r", pwd=None):

        """Return file-like object for 'name'."""

        if mode not in ("r", "U", "rU"):

            raise RuntimeError('open() requires mode "r", "U", or "rU"')

        if not self.fp:

            raise RuntimeError("Attempt to read ZIP archive that was already closed")

        # Only open a new file for instances where we were not

        # given a file object in the constructor

        if self._filePassed:

            zef_file = self.fp

            should_close = False

        else:

            zef_file = open(self.filename, 'rb')

            should_close = True

        try:

            # Make sure we have an info object

            if isinstance(name, ZipInfo):

                # 'name' is already an info object

                zinfo = name

            else:

                # Get info object for name

                zinfo = self.getinfo(name)

            zef_file.seek(zinfo.header_offset, 0)

            # Skip the file header:

            fheader = zef_file.read(sizeFileHeader)

            if len(fheader) != sizeFileHeader:

                raise BadZipfile("Truncated file header")

            fheader = struct.unpack(structFileHeader, fheader)

            if fheader[_FH_SIGNATURE] != stringFileHeader:

                raise BadZipfile("Bad magic number for file header")

            fname = zef_file.read(fheader[_FH_FILENAME_LENGTH])

            if fheader[_FH_EXTRA_FIELD_LENGTH]:

                zef_file.read(fheader[_FH_EXTRA_FIELD_LENGTH])

            if fname != zinfo.orig_filename:

                raise BadZipfile(

                        'File name in directory "%s" and header "%s" differ.' % (

                            zinfo.orig_filename, fname))

            # check for encrypted flag & handle password

            is_encrypted = zinfo.flag_bits & 0x1

            zd = None

            if is_encrypted:

                if not pwd:

                    pwd = self.pwd

                if not pwd:

                    raise RuntimeError("File %s is encrypted, "

                        "password required for extraction" % name)

                zd = _ZipDecrypter(pwd)

                # The first 12 bytes in the cypher stream is an encryption header

                #  used to strengthen the algorithm. The first 11 bytes are

                #  completely random, while the 12th contains the MSB of the CRC,

                #  or the MSB of the file time depending on the header type

                #  and is used to check the correctness of the password.

                bytes = zef_file.read(12)

                h = map(zd, bytes[0:12])

                if zinfo.flag_bits & 0x8:

                    # compare against the file type from extended local headers

                    check_byte = (zinfo._raw_time >> 8) & 0xff

                else:

                    # compare against the CRC otherwise

                    check_byte = (zinfo.CRC >> 24) & 0xff

                if ord(h[11]) != check_byte:

                    raise RuntimeError("Bad password for file", name)

            return ZipExtFile(zef_file, mode, zinfo, zd,

                    close_fileobj=should_close)

        except:

            if should_close:

                zef_file.close()

            raise

    def extract(self, member, path=None, pwd=None):

        """Extract a member from the archive to the current working directory,

           using its full name. Its file information is extracted as accurately

           as possible. `member' may be a filename or a ZipInfo object. You can

           specify a different directory using `path'.

        """

        if not isinstance(member, ZipInfo):

            member = self.getinfo(member)

        if path is None:

            path = os.getcwd()

        return self._extract_member(member, path, pwd)

    def extractall(self, path=None, members=None, pwd=None):

        """Extract all members from the archive to the current working

           directory. `path' specifies a different directory to extract to.

           `members' is optional and must be a subset of the list returned

           by namelist().

        """

        if members is None:

            members = self.namelist()

        for zipinfo in members:

            self.extract(zipinfo, path, pwd)

    def _extract_member(self, member, targetpath, pwd):

        """Extract the ZipInfo object 'member' to a physical

           file on the path targetpath.

        """

        # build the destination pathname, replacing

        # forward slashes to platform specific separators.

        arcname = member.filename.replace('/', os.path.sep)

        if os.path.altsep:

            arcname = arcname.replace(os.path.altsep, os.path.sep)

        # interpret absolute pathname as relative, remove drive letter or

        # UNC path, redundant separators, "." and ".." components.

        arcname = os.path.splitdrive(arcname)[1]

        arcname = os.path.sep.join(x for x in arcname.split(os.path.sep)

                    if x not in ('', os.path.curdir, os.path.pardir))

        if os.path.sep == '\\':

            # filter illegal characters on Windows

            illegal = ':<>|"?*'

            if isinstance(arcname, unicode):

                table = {ord(c): ord('_') for c in illegal}

            else:

                table = string.maketrans(illegal, '_' * len(illegal))

            arcname = arcname.translate(table)

            # remove trailing dots

            arcname = (x.rstrip('.') for x in arcname.split(os.path.sep))

            arcname = os.path.sep.join(x for x in arcname if x)

        targetpath = os.path.join(targetpath, arcname)

        targetpath = os.path.normpath(targetpath)

        # Create all upper directories if necessary.

        upperdirs = os.path.dirname(targetpath)

        if upperdirs and not os.path.exists(upperdirs):

            os.makedirs(upperdirs)

        if member.filename[-1] == '/':

            if not os.path.isdir(targetpath):

                os.mkdir(targetpath)

            return targetpath

        with self.open(member, pwd=pwd) as source, open(targetpath, "wb") as target:

            shutil.copyfileobj(source, target)

        return targetpath

    def _writecheck(self, zinfo):

        """Check for errors before writing a file to the archive."""

        if zinfo.filename in self.NameToInfo:

            import warnings

            warnings.warn('Duplicate name: %r' % zinfo.filename, stacklevel=3)

        if self.mode not in ("w", "a"):

            raise RuntimeError('write() requires mode "w" or "a"')

        if not self.fp:

            raise RuntimeError("Attempt to write ZIP archive that was already closed")

        if zinfo.compress_type == ZIP_DEFLATED and not zlib:

            raise RuntimeError("Compression requires the (missing) zlib module")

        if zinfo.compress_type not in (ZIP_STORED, ZIP_DEFLATED):

            raise RuntimeError("That compression method is not supported")

        if not self._allowZip64:

            requires_zip64 = None

            if len(self.filelist) >= ZIP_FILECOUNT_LIMIT:

                requires_zip64 = "Files count"

            elif zinfo.file_size > ZIP64_LIMIT:

                requires_zip64 = "Filesize"

            elif zinfo.header_offset > ZIP64_LIMIT:

                requires_zip64 = "Zipfile size"

            if requires_zip64:

                raise LargeZipFile(requires_zip64 +

                                   " would require ZIP64 extensions")

    def write(self, filename, arcname=None, compress_type=None):

        """Put the bytes from filename into the archive under the name

        arcname."""

        if not self.fp:

            raise RuntimeError(

                  "Attempt to write to ZIP archive that was already closed")

        st = os.stat(filename)

        isdir = stat.S_ISDIR(st.st_mode)

        mtime = time.localtime(st.st_mtime)

        date_time = mtime[0:6]

        # Create ZipInfo instance to store file information

        if arcname is None:

            arcname = filename

        arcname = os.path.normpath(os.path.splitdrive(arcname)[1])

        while arcname[0] in (os.sep, os.altsep):

            arcname = arcname[1:]

        if isdir:

            arcname += '/'

        zinfo = ZipInfo(arcname, date_time)

        zinfo.external_attr = (st[0] & 0xFFFF) << 16      # Unix attributes

        if compress_type is None:

            zinfo.compress_type = self.compression

        else:

            zinfo.compress_type = compress_type

        zinfo.file_size = st.st_size

        zinfo.flag_bits = 0x00

        zinfo.header_offset = self.fp.tell()    # Start of header bytes

        self._writecheck(zinfo)

        self._didModify = True

        if isdir:

            zinfo.file_size = 0

            zinfo.compress_size = 0

            zinfo.CRC = 0

            zinfo.external_attr |= 0x10  # MS-DOS directory flag

            self.filelist.append(zinfo)

            self.NameToInfo[zinfo.filename] = zinfo

            self.fp.write(zinfo.FileHeader(False))

            return

        with open(filename, "rb") as fp:

            # Must overwrite CRC and sizes with correct data later

            zinfo.CRC = CRC = 0

            zinfo.compress_size = compress_size = 0

            # Compressed size can be larger than uncompressed size

            zip64 = self._allowZip64 and zinfo.file_size * 1.05 > ZIP64_LIMIT

            self.fp.write(zinfo.FileHeader(zip64))

            if zinfo.compress_type == ZIP_DEFLATED:

                cmpr = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,

                     zlib.DEFLATED, -15)

            else:

                cmpr = None

            file_size = 0

            while 1:

                buf = fp.read(1024 * 8)

                if not buf:

                    break

                file_size = file_size + len(buf)

                CRC = crc32(buf, CRC) & 0xffffffff

                if cmpr:

                    buf = cmpr.compress(buf)

                    compress_size = compress_size + len(buf)

                self.fp.write(buf)

        if cmpr:

            buf = cmpr.flush()

            compress_size = compress_size + len(buf)

            self.fp.write(buf)

            zinfo.compress_size = compress_size

        else:

            zinfo.compress_size = file_size

        zinfo.CRC = CRC

        zinfo.file_size = file_size

        if not zip64 and self._allowZip64:

            if file_size > ZIP64_LIMIT:

                raise RuntimeError('File size has increased during compressing')

            if compress_size > ZIP64_LIMIT:

                raise RuntimeError('Compressed size larger than uncompressed size')

        # Seek backwards and write file header (which will now include

        # correct CRC and file sizes)

        position = self.fp.tell()       # Preserve current position in file

        self.fp.seek(zinfo.header_offset, 0)

        self.fp.write(zinfo.FileHeader(zip64))

        self.fp.seek(position, 0)

        self.filelist.append(zinfo)

        self.NameToInfo[zinfo.filename] = zinfo

    def writestr(self, zinfo_or_arcname, bytes, compress_type=None):

        """Write a file into the archive.  The contents is the string

        'bytes'.  'zinfo_or_arcname' is either a ZipInfo instance or

        the name of the file in the archive."""

        if not isinstance(zinfo_or_arcname, ZipInfo):

            zinfo = ZipInfo(filename=zinfo_or_arcname,

                            date_time=time.localtime(time.time())[:6])

            zinfo.compress_type = self.compression

            if zinfo.filename[-1] == '/':

                zinfo.external_attr = 0o40775 << 16   # drwxrwxr-x

                zinfo.external_attr |= 0x10           # MS-DOS directory flag

            else:

                zinfo.external_attr = 0o600 << 16     # ?rw-------

        else:

            zinfo = zinfo_or_arcname

        if not self.fp:

            raise RuntimeError(

                  "Attempt to write to ZIP archive that was already closed")

        if compress_type is not None:

            zinfo.compress_type = compress_type

        zinfo.file_size = len(bytes)            # Uncompressed size

        zinfo.header_offset = self.fp.tell()    # Start of header bytes

        self._writecheck(zinfo)

        self._didModify = True

        zinfo.CRC = crc32(bytes) & 0xffffffff       # CRC-32 checksum

        if zinfo.compress_type == ZIP_DEFLATED:

            co = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,

                 zlib.DEFLATED, -15)

            bytes = co.compress(bytes) + co.flush()

            zinfo.compress_size = len(bytes)    # Compressed size

        else:

            zinfo.compress_size = zinfo.file_size

        zip64 = zinfo.file_size > ZIP64_LIMIT or zinfo.compress_size > ZIP64_LIMIT

        if zip64 and not self._allowZip64:

            raise LargeZipFile("Filesize would require ZIP64 extensions")

        self.fp.write(zinfo.FileHeader(zip64))

        self.fp.write(bytes)

        if zinfo.flag_bits & 0x08:

            # Write CRC and file sizes after the file data

            fmt = '<LQQ' if zip64 else '<LLL'

            self.fp.write(struct.pack(fmt, zinfo.CRC, zinfo.compress_size,

                  zinfo.file_size))

        self.fp.flush()

        self.filelist.append(zinfo)

        self.NameToInfo[zinfo.filename] = zinfo

    def __del__(self):

        """Call the "close()" method in case the user forgot."""

        self.close()

    def close(self):

        """Close the file, and for mode "w" and "a" write the ending

        records."""

        if self.fp is None:

            return

        try:

            if self.mode in ("w", "a") and self._didModify: # write ending records

                pos1 = self.fp.tell()

                for zinfo in self.filelist:         # write central directory

                    dt = zinfo.date_time

                    dosdate = (dt[0] - 1980) << 9 | dt[1] << 5 | dt[2]

                    dostime = dt[3] << 11 | dt[4] << 5 | (dt[5] // 2)

                    extra = []

                    if (zinfo.file_size > ZIP64_LIMIT

                            or zinfo.compress_size > ZIP64_LIMIT):

                        extra.append(zinfo.file_size)

                        extra.append(zinfo.compress_size)

                        file_size = 0xffffffff

                        compress_size = 0xffffffff

                    else:

                        file_size = zinfo.file_size

                        compress_size = zinfo.compress_size

                    if zinfo.header_offset > ZIP64_LIMIT:

                        extra.append(zinfo.header_offset)

                        header_offset = 0xffffffff

                    else:

                        header_offset = zinfo.header_offset

                    extra_data = zinfo.extra

                    if extra:

                        # Append a ZIP64 field to the extra's

                        extra_data = struct.pack(

                                '<HH' + 'Q'*len(extra),

                                1, 8*len(extra), *extra) + extra_data

                        extract_version = max(45, zinfo.extract_version)

                        create_version = max(45, zinfo.create_version)

                    else:

                        extract_version = zinfo.extract_version

                        create_version = zinfo.create_version

                    try:

                        filename, flag_bits = zinfo._encodeFilenameFlags()

                        centdir = struct.pack(structCentralDir,

                        stringCentralDir, create_version,

                        zinfo.create_system, extract_version, zinfo.reserved,

                        flag_bits, zinfo.compress_type, dostime, dosdate,

                        zinfo.CRC, compress_size, file_size,

                        len(filename), len(extra_data), len(zinfo.comment),

                        0, zinfo.internal_attr, zinfo.external_attr,

                        header_offset)

                    except DeprecationWarning:

                        sys.stderr.write("%r\n" % ((structCentralDir,

                        stringCentralDir, create_version,

                        zinfo.create_system, extract_version, zinfo.reserved,

                        zinfo.flag_bits, zinfo.compress_type, dostime, dosdate,

                        zinfo.CRC, compress_size, file_size,

                        len(zinfo.filename), len(extra_data), len(zinfo.comment),

                        0, zinfo.internal_attr, zinfo.external_attr,

                        header_offset),))

                        raise

                    self.fp.write(centdir)

                    self.fp.write(filename)

                    self.fp.write(extra_data)

                    self.fp.write(zinfo.comment)

                pos2 = self.fp.tell()

                # Write end-of-zip-archive record

                centDirCount = len(self.filelist)

                centDirSize = pos2 - pos1

                centDirOffset = pos1

                requires_zip64 = None

                if centDirCount > ZIP_FILECOUNT_LIMIT:

                    requires_zip64 = "Files count"

                elif centDirOffset > ZIP64_LIMIT:

                    requires_zip64 = "Central directory offset"

                elif centDirSize > ZIP64_LIMIT:

                    requires_zip64 = "Central directory size"

                if requires_zip64:

                    # Need to write the ZIP64 end-of-archive records

                    if not self._allowZip64:

                        raise LargeZipFile(requires_zip64 +

                                           " would require ZIP64 extensions")

                    zip64endrec = struct.pack(

                            structEndArchive64, stringEndArchive64,

                            44, 45, 45, 0, 0, centDirCount, centDirCount,

                            centDirSize, centDirOffset)

                    self.fp.write(zip64endrec)

                    zip64locrec = struct.pack(

                            structEndArchive64Locator,

                            stringEndArchive64Locator, 0, pos2, 1)

                    self.fp.write(zip64locrec)

                    centDirCount = min(centDirCount, 0xFFFF)

                    centDirSize = min(centDirSize, 0xFFFFFFFF)

                    centDirOffset = min(centDirOffset, 0xFFFFFFFF)

                endrec = struct.pack(structEndArchive, stringEndArchive,

                                    0, 0, centDirCount, centDirCount,

                                    centDirSize, centDirOffset, len(self._comment))

                self.fp.write(endrec)

                self.fp.write(self._comment)

                self.fp.flush()

        finally:

            fp = self.fp

            self.fp = None

            if not self._filePassed:

                fp.close()

ZipFile
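
The ZipFile listing above is the Python 2 implementation. As a rough illustration of how its public methods (writestr, namelist, read, testzip, extractall) fit together, here is a minimal Python 3 sketch. The archive name "demo.zip", the member names, and the helper zip_roundtrip are hypothetical and chosen only for this example; it also assumes the zlib module is available, since ZIP_DEFLATED requires it.

```python
import os
import tempfile
import zipfile


def zip_roundtrip(workdir):
    """Create a small archive inside workdir, then read it back."""
    archive = os.path.join(workdir, "demo.zip")  # hypothetical file name

    # Mode "w" creates the archive; the central directory is written on close.
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("notes/a.txt", "hello")
        zf.writestr("b.txt", "world")

    # Mode "r" loads the central directory when the archive is opened.
    with zipfile.ZipFile(archive, "r") as zf:
        names = zf.namelist()      # member names, in the order they were added
        content = zf.read("b.txt")  # bytes of a single member
        bad = zf.testzip()          # first member with a bad CRC, or None
        zf.extractall(os.path.join(workdir, "out"))

    return names, content, bad


with tempfile.TemporaryDirectory() as tmp:
    names, content, bad = zip_roundtrip(tmp)
    extracted = os.path.exists(os.path.join(tmp, "out", "notes", "a.txt"))
    print(names, content, bad, extracted)
```

Using the with statement relies on the __enter__ and __exit__ methods shown in the listing, so close() runs (and the ending records are written) even if an exception is raised inside the block.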

class TarFile(object):

    """The TarFile Class provides an interface to tar archives.

    """

    debug = 0                   # May be set from 0 (no msgs) to 3 (all msgs)

    dereference = False         # If true, add content of linked file to the

                                # tar file, else the link.

    ignore_zeros = False        # If true, skips empty or invalid blocks and

                                # continues processing.

    errorlevel = 1              # If 0, fatal errors only appear in debug

                                # messages (if debug >= 0). If > 0, errors

                                # are passed to the caller as exceptions.

    format = DEFAULT_FORMAT     # The format to use when creating an archive.

    encoding = ENCODING         # Encoding for 8-bit character strings.

    errors = None               # Error handler for unicode conversion.

    tarinfo = TarInfo           # The default TarInfo class to use.

    fileobject = ExFileObject   # The default ExFileObject class to use.

    def __init__(self, name=None, mode="r", fileobj=None, format=None,

            tarinfo=None, dereference=None, ignore_zeros=None, encoding=None,

            errors=None, pax_headers=None, debug=None, errorlevel=None):

        """Open an (uncompressed) tar archive `name'. `mode' is either 'r' to

           read from an existing archive, 'a' to append data to an existing

           file or 'w' to create a new file overwriting an existing one. `mode'

           defaults to 'r'.

           If `fileobj' is given, it is used for reading or writing data. If it

           can be determined, `mode' is overridden by `fileobj's mode.

           `fileobj' is not closed, when TarFile is closed.

        """

        modes = {"r": "rb", "a": "r+b", "w": "wb"}

        if mode not in modes:

            raise ValueError("mode must be 'r', 'a' or 'w'")

        self.mode = mode

        self._mode = modes[mode]

        if not fileobj:

            if self.mode == "a" and not os.path.exists(name):

                # Create nonexistent files in append mode.

                self.mode = "w"

                self._mode = "wb"

            fileobj = bltn_open(name, self._mode)

            self._extfileobj = False

        else:

            if name is None and hasattr(fileobj, "name"):

                name = fileobj.name

            if hasattr(fileobj, "mode"):

                self._mode = fileobj.mode

            self._extfileobj = True

        self.name = os.path.abspath(name) if name else None

        self.fileobj = fileobj

        # Init attributes.

        if format is not None:

            self.format = format

        if tarinfo is not None:

            self.tarinfo = tarinfo

        if dereference is not None:

            self.dereference = dereference

        if ignore_zeros is not None:

            self.ignore_zeros = ignore_zeros

        if encoding is not None:

            self.encoding = encoding

        if errors is not None:

            self.errors = errors

        elif mode == "r":

            self.errors = "utf-8"

        else:

            self.errors = "strict"

        if pax_headers is not None and self.format == PAX_FORMAT:

            self.pax_headers = pax_headers

        else:

            self.pax_headers = {}

        if debug is not None:

            self.debug = debug

        if errorlevel is not None:

            self.errorlevel = errorlevel

        # Init datastructures.

        self.closed = False

        self.members = []       # list of members as TarInfo objects

        self._loaded = False    # flag if all members have been read

        self.offset = self.fileobj.tell()

                                # current position in the archive file

        self.inodes = {}        # dictionary caching the inodes of

                                # archive members already added

        try:

            if self.mode == "r":

                self.firstmember = None

                self.firstmember = self.next()

            if self.mode == "a":

                # Move to the end of the archive,

                # before the first empty block.

                while True:

                    self.fileobj.seek(self.offset)

                    try:

                        tarinfo = self.tarinfo.fromtarfile(self)

                        self.members.append(tarinfo)

                    except EOFHeaderError:

                        self.fileobj.seek(self.offset)

                        break

                    except HeaderError as e:

                        raise ReadError(str(e))

            if self.mode in "aw":

                self._loaded = True

                if self.pax_headers:

                    buf = self.tarinfo.create_pax_global_header(self.pax_headers.copy())

                    self.fileobj.write(buf)

                    self.offset += len(buf)

        except:

            if not self._extfileobj:

                self.fileobj.close()

            self.closed = True

            raise

    def _getposix(self):

        return self.format == USTAR_FORMAT

    def _setposix(self, value):

        import warnings

        warnings.warn("use the format attribute instead", DeprecationWarning,

                      2)

        if value:

            self.format = USTAR_FORMAT

        else:

            self.format = GNU_FORMAT

    posix = property(_getposix, _setposix)

    #--------------------------------------------------------------------------

    # Below are the classmethods which act as alternate constructors to the

    # TarFile class. The open() method is the only one that is needed for

    # public use; it is the "super"-constructor and is able to select an

    # adequate "sub"-constructor for a particular compression using the mapping

    # from OPEN_METH.

    #

    # This concept allows one to subclass TarFile without losing the comfort of

    # the super-constructor. A sub-constructor is registered and made available

    # by adding it to the mapping in OPEN_METH.

    @classmethod

    def open(cls, name=None, mode="r", fileobj=None, bufsize=RECORDSIZE, **kwargs):

        """Open a tar archive for reading, writing or appending. Return

           an appropriate TarFile class.

           mode:

           'r' or 'r:*' open for reading with transparent compression

           'r:'         open for reading exclusively uncompressed

           'r:gz'       open for reading with gzip compression

           'r:bz2'      open for reading with bzip2 compression

           'a' or 'a:'  open for appending, creating the file if necessary

           'w' or 'w:'  open for writing without compression

           'w:gz'       open for writing with gzip compression

           'w:bz2'      open for writing with bzip2 compression

           'r|*'        open a stream of tar blocks with transparent compression

           'r|'         open an uncompressed stream of tar blocks for reading

           'r|gz'       open a gzip compressed stream of tar blocks

           'r|bz2'      open a bzip2 compressed stream of tar blocks

           'w|'         open an uncompressed stream for writing

           'w|gz'       open a gzip compressed stream for writing

           'w|bz2'      open a bzip2 compressed stream for writing

        """

        if not name and not fileobj:

            raise ValueError("nothing to open")

        if mode in ("r", "r:*"):

            # Find out which *open() is appropriate for opening the file.

            for comptype in cls.OPEN_METH:

                func = getattr(cls, cls.OPEN_METH[comptype])

                if fileobj is not None:

                    saved_pos = fileobj.tell()

                try:

                    return func(name, "r", fileobj, **kwargs)

                except (ReadError, CompressionError), e:

                    if fileobj is not None:

                        fileobj.seek(saved_pos)

                    continue

            raise ReadError("file could not be opened successfully")

        elif ":" in mode:

            filemode, comptype = mode.split(":", 1)

            filemode = filemode or "r"

            comptype = comptype or "tar"

            # Select the *open() function according to

            # given compression.

            if comptype in cls.OPEN_METH:

                func = getattr(cls, cls.OPEN_METH[comptype])

            else:

                raise CompressionError("unknown compression type %r" % comptype)

            return func(name, filemode, fileobj, **kwargs)

        elif "|" in mode:

            filemode, comptype = mode.split("|", 1)

            filemode = filemode or "r"

            comptype = comptype or "tar"

            if filemode not in ("r", "w"):

                raise ValueError("mode must be 'r' or 'w'")

            stream = _Stream(name, filemode, comptype, fileobj, bufsize)

            try:

                t = cls(name, filemode, stream, **kwargs)

            except:

                stream.close()

                raise

            t._extfileobj = False

            return t

        elif mode in ("a", "w"):

            return cls.taropen(name, mode, fileobj, **kwargs)

        raise ValueError("undiscernible mode")

    @classmethod

    def taropen(cls, name, mode="r", fileobj=None, **kwargs):

        """Open uncompressed tar archive name for reading or writing.

        """

        if mode not in ("r", "a", "w"):

            raise ValueError("mode must be 'r', 'a' or 'w'")

        return cls(name, mode, fileobj, **kwargs)

    @classmethod

    def gzopen(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):

        """Open gzip compressed tar archive name for reading or writing.

           Appending is not allowed.

        """

        if mode not in ("r", "w"):

            raise ValueError("mode must be 'r' or 'w'")

        try:

            import gzip

            gzip.GzipFile

        except (ImportError, AttributeError):

            raise CompressionError("gzip module is not available")

        try:

            fileobj = gzip.GzipFile(name, mode, compresslevel, fileobj)

        except OSError:

            if fileobj is not None and mode == 'r':

                raise ReadError("not a gzip file")

            raise

        try:

            t = cls.taropen(name, mode, fileobj, **kwargs)

        except IOError:

            fileobj.close()

            if mode == 'r':

                raise ReadError("not a gzip file")

            raise

        except:

            fileobj.close()

            raise

        t._extfileobj = False

        return t

    @classmethod

    def bz2open(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):

        """Open bzip2 compressed tar archive name for reading or writing.

           Appending is not allowed.

        """

        if mode not in ("r", "w"):

            raise ValueError("mode must be 'r' or 'w'.")

        try:

            import bz2

        except ImportError:

            raise CompressionError("bz2 module is not available")

        if fileobj is not None:

            fileobj = _BZ2Proxy(fileobj, mode)

        else:

            fileobj = bz2.BZ2File(name, mode, compresslevel=compresslevel)

        try:

            t = cls.taropen(name, mode, fileobj, **kwargs)

        except (IOError, EOFError):

            fileobj.close()

            if mode == 'r':

                raise ReadError("not a bzip2 file")

            raise

        except:

            fileobj.close()

            raise

        t._extfileobj = False

        return t

    # All *open() methods are registered here.

    OPEN_METH = {

        "tar": "taropen",   # uncompressed tar

        "gz":  "gzopen",    # gzip compressed tar

        "bz2": "bz2open"    # bzip2 compressed tar

    }

    #--------------------------------------------------------------------------

    # The public methods which TarFile provides:

    def close(self):

        """Close the TarFile. In write-mode, two finishing zero blocks are

           appended to the archive.

        """

        if self.closed:

            return

        if self.mode in "aw":

            self.fileobj.write(NUL * (BLOCKSIZE * 2))

            self.offset += (BLOCKSIZE * 2)

            # fill up the end with zero-blocks

            # (like option -b20 for tar does)

            blocks, remainder = divmod(self.offset, RECORDSIZE)

            if remainder > 0:

                self.fileobj.write(NUL * (RECORDSIZE - remainder))

        if not self._extfileobj:

            self.fileobj.close()

        self.closed = True

    def getmember(self, name):

        """Return a TarInfo object for member `name'. If `name' can not be

           found in the archive, KeyError is raised. If a member occurs more

           than once in the archive, its last occurrence is assumed to be the

           most up-to-date version.

        """

        tarinfo = self._getmember(name)

        if tarinfo is None:

            raise KeyError("filename %r not found" % name)

        return tarinfo

    def getmembers(self):

        """Return the members of the archive as a list of TarInfo objects. The

           list has the same order as the members in the archive.

        """

        self._check()

        if not self._loaded:    # if we want to obtain a list of

            self._load()        # all members, we first have to

                                # scan the whole archive.

        return self.members

    def getnames(self):

        """Return the members of the archive as a list of their names. It has

           the same order as the list returned by getmembers().

        """

        return [tarinfo.name for tarinfo in self.getmembers()]

    def gettarinfo(self, name=None, arcname=None, fileobj=None):

        """Create a TarInfo object for either the file `name' or the file

           object `fileobj' (using os.fstat on its file descriptor). You can

           modify some of the TarInfo's attributes before you add it using

           addfile(). If given, `arcname' specifies an alternative name for the

           file in the archive.

        """

        self._check("aw")

        # When fileobj is given, replace name by

        # fileobj's real name.

        if fileobj is not None:

            name = fileobj.name

        # Building the name of the member in the archive.

        # Backward slashes are converted to forward slashes,

        # Absolute paths are turned to relative paths.

        if arcname is None:

            arcname = name

        drv, arcname = os.path.splitdrive(arcname)

        arcname = arcname.replace(os.sep, "/")

        arcname = arcname.lstrip("/")

        # Now, fill the TarInfo object with

        # information specific for the file.

        tarinfo = self.tarinfo()

        tarinfo.tarfile = self

        # Use os.stat or os.lstat, depending on platform

        # and if symlinks shall be resolved.

        if fileobj is None:

            if hasattr(os, "lstat") and not self.dereference:

                statres = os.lstat(name)

            else:

                statres = os.stat(name)

        else:

            statres = os.fstat(fileobj.fileno())

        linkname = ""

        stmd = statres.st_mode

        if stat.S_ISREG(stmd):

            inode = (statres.st_ino, statres.st_dev)

            if not self.dereference and statres.st_nlink > 1 and \

                    inode in self.inodes and arcname != self.inodes[inode]:

                # Is it a hardlink to an already

                # archived file?

                type = LNKTYPE

                linkname = self.inodes[inode]

            else:

                # The inode is added only if its valid.

                # For win32 it is always 0.

                type = REGTYPE

                if inode[0]:

                    self.inodes[inode] = arcname

        elif stat.S_ISDIR(stmd):

            type = DIRTYPE

        elif stat.S_ISFIFO(stmd):

            type = FIFOTYPE

        elif stat.S_ISLNK(stmd):

            type = SYMTYPE

            linkname = os.readlink(name)

        elif stat.S_ISCHR(stmd):

            type = CHRTYPE

        elif stat.S_ISBLK(stmd):

            type = BLKTYPE

        else:

            return None

        # Fill the TarInfo object with all

        # information we can get.

        tarinfo.name = arcname

        tarinfo.mode = stmd

        tarinfo.uid = statres.st_uid

        tarinfo.gid = statres.st_gid

        if type == REGTYPE:

            tarinfo.size = statres.st_size

        else:

            tarinfo.size = 0L

        tarinfo.mtime = statres.st_mtime

        tarinfo.type = type

        tarinfo.linkname = linkname

        if pwd:

            try:

                tarinfo.uname = pwd.getpwuid(tarinfo.uid)[0]

            except KeyError:

                pass

        if grp:

            try:

                tarinfo.gname = grp.getgrgid(tarinfo.gid)[0]

            except KeyError:

                pass

        if type in (CHRTYPE, BLKTYPE):

            if hasattr(os, "major") and hasattr(os, "minor"):

                tarinfo.devmajor = os.major(statres.st_rdev)

                tarinfo.devminor = os.minor(statres.st_rdev)

        return tarinfo

    def list(self, verbose=True):

        """Print a table of contents to sys.stdout. If `verbose' is False, only

           the names of the members are printed. If it is True, an `ls -l'-like

           output is produced.

        """

        self._check()

        for tarinfo in self:

            if verbose:

                print filemode(tarinfo.mode),

                print "%s/%s" % (tarinfo.uname or tarinfo.uid,

                                 tarinfo.gname or tarinfo.gid),

                if tarinfo.ischr() or tarinfo.isblk():

                    print "%10s" % ("%d,%d" \

                                    % (tarinfo.devmajor, tarinfo.devminor)),

                else:

                    print "%10d" % tarinfo.size,

                print "%d-%02d-%02d %02d:%02d:%02d" \

                      % time.localtime(tarinfo.mtime)[:6],

            print tarinfo.name + ("/" if tarinfo.isdir() else ""),

            if verbose:

                if tarinfo.issym():

                    print "->", tarinfo.linkname,

                if tarinfo.islnk():

                    print "link to", tarinfo.linkname,

            print

    def add(self, name, arcname=None, recursive=True, exclude=None, filter=None):

        """Add the file `name' to the archive. `name' may be any type of file

           (directory, fifo, symbolic link, etc.). If given, `arcname'

           specifies an alternative name for the file in the archive.

           Directories are added recursively by default. This can be avoided by

           setting `recursive' to False. `exclude' is a function that should

           return True for each filename to be excluded. `filter' is a function

           that expects a TarInfo object argument and returns the changed

           TarInfo object, if it returns None the TarInfo object will be

           excluded from the archive.

        """

        self._check("aw")

        if arcname is None:

            arcname = name

        # Exclude pathnames.

        if exclude is not None:

            import warnings

            warnings.warn("use the filter argument instead",

                    DeprecationWarning, 2)

            if exclude(name):

                self._dbg(2, "tarfile: Excluded %r" % name)

                return

        # Skip if somebody tries to archive the archive...

        if self.name is not None and os.path.abspath(name) == self.name:

            self._dbg(2, "tarfile: Skipped %r" % name)

            return

        self._dbg(1, name)

        # Create a TarInfo object from the file.

        tarinfo = self.gettarinfo(name, arcname)

        if tarinfo is None:

            self._dbg(1, "tarfile: Unsupported type %r" % name)

            return

        # Change or exclude the TarInfo object.

        if filter is not None:

            tarinfo = filter(tarinfo)

            if tarinfo is None:

                self._dbg(2, "tarfile: Excluded %r" % name)

                return

        # Append the tar header and data to the archive.

        if tarinfo.isreg():

            with bltn_open(name, "rb") as f:

                self.addfile(tarinfo, f)

        elif tarinfo.isdir():

            self.addfile(tarinfo)

            if recursive:

                for f in os.listdir(name):

                    self.add(os.path.join(name, f), os.path.join(arcname, f),

                            recursive, exclude, filter)

        else:

            self.addfile(tarinfo)

    def addfile(self, tarinfo, fileobj=None):

        """Add the TarInfo object `tarinfo' to the archive. If `fileobj' is

           given, tarinfo.size bytes are read from it and added to the archive.

           You can create TarInfo objects using gettarinfo().

           On Windows platforms, `fileobj' should always be opened with mode

           'rb' to avoid irritation about the file size.

        """

        self._check("aw")

        tarinfo = copy.copy(tarinfo)

        buf = tarinfo.tobuf(self.format, self.encoding, self.errors)

        self.fileobj.write(buf)

        self.offset += len(buf)

        # If there's data to follow, append it.

        if fileobj is not None:

            copyfileobj(fileobj, self.fileobj, tarinfo.size)

            blocks, remainder = divmod(tarinfo.size, BLOCKSIZE)

            if remainder > 0:

                self.fileobj.write(NUL * (BLOCKSIZE - remainder))

                blocks += 1

            self.offset += blocks * BLOCKSIZE

        self.members.append(tarinfo)

    def extractall(self, path=".", members=None):

        """Extract all members from the archive to the current working

           directory and set owner, modification time and permissions on

           directories afterwards. `path' specifies a different directory

           to extract to. `members' is optional and must be a subset of the

           list returned by getmembers().

        """

        directories = []

        if members is None:

            members = self

        for tarinfo in members:

            if tarinfo.isdir():

                # Extract directories with a safe mode.

                directories.append(tarinfo)

                tarinfo = copy.copy(tarinfo)

                tarinfo.mode = 0700

            self.extract(tarinfo, path)

        # Reverse sort directories.

        directories.sort(key=operator.attrgetter('name'))

        directories.reverse()

        # Set correct owner, mtime and filemode on directories.

        for tarinfo in directories:

            dirpath = os.path.join(path, tarinfo.name)

            try:

                self.chown(tarinfo, dirpath)

                self.utime(tarinfo, dirpath)

                self.chmod(tarinfo, dirpath)

            except ExtractError, e:

                if self.errorlevel > 1:

                    raise

                else:

                    self._dbg(1, "tarfile: %s" % e)

    def extract(self, member, path=""):

        """Extract a member from the archive to the current working directory,

           using its full name. Its file information is extracted as accurately

           as possible. `member' may be a filename or a TarInfo object. You can

           specify a different directory using `path'.

        """

        self._check("r")

        if isinstance(member, basestring):

            tarinfo = self.getmember(member)

        else:

            tarinfo = member

        # Prepare the link target for makelink().

        if tarinfo.islnk():

            tarinfo._link_target = os.path.join(path, tarinfo.linkname)

        try:

            self._extract_member(tarinfo, os.path.join(path, tarinfo.name))

        except EnvironmentError, e:

            if self.errorlevel > 0:

                raise

            else:

                if e.filename is None:

                    self._dbg(1, "tarfile: %s" % e.strerror)

                else:

                    self._dbg(1, "tarfile: %s %r" % (e.strerror, e.filename))

        except ExtractError, e:

            if self.errorlevel > 1:

                raise

            else:

                self._dbg(1, "tarfile: %s" % e)

    def extractfile(self, member):

        """Extract a member from the archive as a file object. `member' may be

           a filename or a TarInfo object. If `member' is a regular file, a

           file-like object is returned. If `member' is a link, a file-like

           object is constructed from the link's target. If `member' is none of

           the above, None is returned.

           The file-like object is read-only and provides the following

           methods: read(), readline(), readlines(), seek() and tell()

        """

        self._check("r")

        if isinstance(member, basestring):

            tarinfo = self.getmember(member)

        else:

            tarinfo = member

        if tarinfo.isreg():

            return self.fileobject(self, tarinfo)

        elif tarinfo.type not in SUPPORTED_TYPES:

            # If a member's type is unknown, it is treated as a

            # regular file.

            return self.fileobject(self, tarinfo)

        elif tarinfo.islnk() or tarinfo.issym():

            if isinstance(self.fileobj, _Stream):

                # A small but ugly workaround for the case that someone tries

                # to extract a (sym)link as a file-object from a non-seekable

                # stream of tar blocks.

                raise StreamError("cannot extract (sym)link as file object")

            else:

                # A (sym)link's file object is its target's file object.

                return self.extractfile(self._find_link_target(tarinfo))

        else:

            # If there's no data associated with the member (directory, chrdev,

            # blkdev, etc.), return None instead of a file object.

            return None

    def _extract_member(self, tarinfo, targetpath):

        """Extract the TarInfo object tarinfo to a physical

           file called targetpath.

        """

        # Fetch the TarInfo object for the given name

        # and build the destination pathname, replacing

        # forward slashes to platform specific separators.

        targetpath = targetpath.rstrip("/")

        targetpath = targetpath.replace("/", os.sep)

        # Create all upper directories.

        upperdirs = os.path.dirname(targetpath)

        if upperdirs and not os.path.exists(upperdirs):

            # Create directories that are not part of the archive with

            # default permissions.

            os.makedirs(upperdirs)

        if tarinfo.islnk() or tarinfo.issym():

            self._dbg(1, "%s -> %s" % (tarinfo.name, tarinfo.linkname))

        else:

            self._dbg(1, tarinfo.name)

        if tarinfo.isreg():

            self.makefile(tarinfo, targetpath)

        elif tarinfo.isdir():

            self.makedir(tarinfo, targetpath)

        elif tarinfo.isfifo():

            self.makefifo(tarinfo, targetpath)

        elif tarinfo.ischr() or tarinfo.isblk():

            self.makedev(tarinfo, targetpath)

        elif tarinfo.islnk() or tarinfo.issym():

            self.makelink(tarinfo, targetpath)

        elif tarinfo.type not in SUPPORTED_TYPES:

            self.makeunknown(tarinfo, targetpath)

        else:

            self.makefile(tarinfo, targetpath)

        self.chown(tarinfo, targetpath)

        if not tarinfo.issym():

            self.chmod(tarinfo, targetpath)

            self.utime(tarinfo, targetpath)

    #--------------------------------------------------------------------------

    # Below are the different file methods. They are called via

    # _extract_member() when extract() is called. They can be replaced in a

    # subclass to implement other functionality.

    def makedir(self, tarinfo, targetpath):

        """Make a directory called targetpath.

        """

        try:

            # Use a safe mode for the directory, the real mode is set

            # later in _extract_member().

            os.mkdir(targetpath, 0700)

        except EnvironmentError, e:

            if e.errno != errno.EEXIST:

                raise

    def makefile(self, tarinfo, targetpath):

        """Make a file called targetpath.

        """

        source = self.extractfile(tarinfo)

        try:

            with bltn_open(targetpath, "wb") as target:

                copyfileobj(source, target)

        finally:

            source.close()

    def makeunknown(self, tarinfo, targetpath):

        """Make a file from a TarInfo object with an unknown type

           at targetpath.

        """

        self.makefile(tarinfo, targetpath)

        self._dbg(1, "tarfile: Unknown file type %r, " \

                     "extracted as regular file." % tarinfo.type)

    def makefifo(self, tarinfo, targetpath):

        """Make a fifo called targetpath.

        """

        if hasattr(os, "mkfifo"):

            os.mkfifo(targetpath)

        else:

            raise ExtractError("fifo not supported by system")

    def makedev(self, tarinfo, targetpath):

        """Make a character or block device called targetpath.

        """

        if not hasattr(os, "mknod") or not hasattr(os, "makedev"):

            raise ExtractError("special devices not supported by system")

        mode = tarinfo.mode

        if tarinfo.isblk():

            mode |= stat.S_IFBLK

        else:

            mode |= stat.S_IFCHR

        os.mknod(targetpath, mode,

                 os.makedev(tarinfo.devmajor, tarinfo.devminor))

    def makelink(self, tarinfo, targetpath):

        """Make a (symbolic) link called targetpath. If it cannot be created

          (platform limitation), we try to make a copy of the referenced file

          instead of a link.

        """

        if hasattr(os, "symlink") and hasattr(os, "link"):

            # For systems that support symbolic and hard links.

            if tarinfo.issym():

                if os.path.lexists(targetpath):

                    os.unlink(targetpath)

                os.symlink(tarinfo.linkname, targetpath)

            else:

                # See extract().

                if os.path.exists(tarinfo._link_target):

                    if os.path.lexists(targetpath):

                        os.unlink(targetpath)

                    os.link(tarinfo._link_target, targetpath)

                else:

                    self._extract_member(self._find_link_target(tarinfo), targetpath)

        else:

            try:

                self._extract_member(self._find_link_target(tarinfo), targetpath)

            except KeyError:

                raise ExtractError("unable to resolve link inside archive")

    def chown(self, tarinfo, targetpath):

        """Set owner of targetpath according to tarinfo.

        """

        if pwd and hasattr(os, "geteuid") and os.geteuid() == 0:

            # We have to be root to do so.

            try:

                g = grp.getgrnam(tarinfo.gname)[2]

            except KeyError:

                g = tarinfo.gid

            try:

                u = pwd.getpwnam(tarinfo.uname)[2]

            except KeyError:

                u = tarinfo.uid

            try:

                if tarinfo.issym() and hasattr(os, "lchown"):

                    os.lchown(targetpath, u, g)

                else:

                    if sys.platform != "os2emx":

                        os.chown(targetpath, u, g)

            except EnvironmentError, e:

                raise ExtractError("could not change owner")

    def chmod(self, tarinfo, targetpath):

        """Set file permissions of targetpath according to tarinfo.

        """

        if hasattr(os, 'chmod'):

            try:

                os.chmod(targetpath, tarinfo.mode)

            except EnvironmentError, e:

                raise ExtractError("could not change mode")

    def utime(self, tarinfo, targetpath):

        """Set modification time of targetpath according to tarinfo.

        """

        if not hasattr(os, 'utime'):

            return

        try:

            os.utime(targetpath, (tarinfo.mtime, tarinfo.mtime))

        except EnvironmentError, e:

            raise ExtractError("could not change modification time")

    #--------------------------------------------------------------------------

    def next(self):

        """Return the next member of the archive as a TarInfo object, when

           TarFile is opened for reading. Return None if there is no more

           available.

        """

        self._check("ra")

        if self.firstmember is not None:

            m = self.firstmember

            self.firstmember = None

            return m

        # Read the next block.

        self.fileobj.seek(self.offset)

        tarinfo = None

        while True:

            try:

                tarinfo = self.tarinfo.fromtarfile(self)

            except EOFHeaderError, e:

                if self.ignore_zeros:

                    self._dbg(2, "0x%X: %s" % (self.offset, e))

                    self.offset += BLOCKSIZE

                    continue

            except InvalidHeaderError, e:

                if self.ignore_zeros:

                    self._dbg(2, "0x%X: %s" % (self.offset, e))

                    self.offset += BLOCKSIZE

                    continue

                elif self.offset == 0:

                    raise ReadError(str(e))

            except EmptyHeaderError:

                if self.offset == 0:

                    raise ReadError("empty file")

            except TruncatedHeaderError, e:

                if self.offset == 0:

                    raise ReadError(str(e))

            except SubsequentHeaderError, e:

                raise ReadError(str(e))

            break

        if tarinfo is not None:

            self.members.append(tarinfo)

        else:

            self._loaded = True

        return tarinfo

    #--------------------------------------------------------------------------

    # Little helper methods:

    def _getmember(self, name, tarinfo=None, normalize=False):

        """Find an archive member by name from bottom to top.

           If tarinfo is given, it is used as the starting point.

        """

        # Ensure that all members have been loaded.

        members = self.getmembers()

        # Limit the member search list up to tarinfo.

        if tarinfo is not None:

            members = members[:members.index(tarinfo)]

        if normalize:

            name = os.path.normpath(name)

        for member in reversed(members):

            if normalize:

                member_name = os.path.normpath(member.name)

            else:

                member_name = member.name

            if name == member_name:

                return member

    def _load(self):

        """Read through the entire archive file and look for readable

           members.

        """

        while True:

            tarinfo = self.next()

            if tarinfo is None:

                break

        self._loaded = True

    def _check(self, mode=None):

        """Check if TarFile is still open, and if the operation's mode

           corresponds to TarFile's mode.

        """

        if self.closed:

            raise IOError("%s is closed" % self.__class__.__name__)

        if mode is not None and self.mode not in mode:

            raise IOError("bad operation for mode %r" % self.mode)

    def _find_link_target(self, tarinfo):

        """Find the target member of a symlink or hardlink member in the

           archive.

        """

        if tarinfo.issym():

            # Always search the entire archive.

            linkname = "/".join(filter(None, (os.path.dirname(tarinfo.name), tarinfo.linkname)))

            limit = None

        else:

            # Search the archive before the link, because a hard link is

            # just a reference to an already archived file.

            linkname = tarinfo.linkname

            limit = tarinfo

        member = self._getmember(linkname, tarinfo=limit, normalize=True)

        if member is None:

            raise KeyError("linkname %r not found" % linkname)

        return member

    def __iter__(self):

        """Provide an iterator object.

        """

        if self._loaded:

            return iter(self.members)

        else:

            return TarIter(self)

    def _dbg(self, level, msg):

        """Write debugging output to sys.stderr.

        """

        if level <= self.debug:

            print >> sys.stderr, msg

    def __enter__(self):

        self._check()

        return self

    def __exit__(self, type, value, traceback):

        if type is None:

            self.close()

        else:

            # An exception occurred. We must not call close() because

            # it would try to write end-of-archive blocks and padding.

            if not self._extfileobj:

                self.fileobj.close()

            self.closed = True

# class TarFile
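
The mode strings documented in open() above can be tried out with a short round trip. This is only a minimal sketch for Python 3, assuming a throwaway temporary directory and made-up names (hello.txt, demo.tar.gz). It writes a gzip-compressed archive with 'w:gz' and reads it back with plain 'r', which detects the compression automatically.

```python
import os
import tarfile
import tempfile

# Work in a throwaway directory; the file and archive names are illustrative.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "hello.txt")
with open(src, "w") as f:
    f.write("hello tar\n")

archive = os.path.join(workdir, "demo.tar.gz")

# 'w:gz' selects gzopen() through the OPEN_METH mapping.
with tarfile.open(archive, "w:gz") as tar:
    tar.add(src, arcname="hello.txt")  # arcname is the name stored in the archive

# Plain 'r' tries each registered opener until one succeeds.
with tarfile.open(archive, "r") as tar:
    names = tar.getnames()
    member = tar.extractfile("hello.txt")  # file-like object, read-only
    content = member.read().decode("utf-8")

print(names)
print(content)
```

Using the archive as a context manager relies on the __enter__ and __exit__ methods shown at the end of the class, so close() runs even when the body exits early.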


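The xml processing module

XML is a format for exchanging data between different languages or programs. It plays much the same role as JSON, although JSON is simpler to use. Before JSON existed, XML was the only practical choice, and many system interfaces in traditional industries such as finance still rely mainly on XML today.

An XML document looks like the following; the data structure is expressed through nested <> element nodes: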

<?xml version="1.0"?>

<data>

    <country name="Liechtenstein">

        <rank updated="yes">2</rank>

        <year>2008</year>

        <gdppc>141100</gdppc>

        <neighbor name="Austria" direction="E"/>

        <neighbor name="Switzerland" direction="W"/>

    </country>

    <country name="Singapore">

        <rank updated="yes">5</rank>

        <year>2011</year>

        <gdppc>59900</gdppc>

        <neighbor name="Malaysia" direction="N"/>

    </country>

    <country name="Panama">

        <rank updated="yes">69</rank>

        <year>2011</year>

        <gdppc>13600</gdppc>

        <neighbor name="Costa Rica" direction="W"/>

        <neighbor name="Colombia" direction="E"/>

    </country>

</data>

XML is supported in practically every language. In Python, the xml.etree.ElementTree module can be used to work with it:

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")

root = tree.getroot()

print(root.tag)

# Walk the whole XML document

for child in root:

    print(child.tag, child.attrib)

    for i in child:

        print(i.tag,i.text)

# Iterate over the year elements only

for node in root.iter('year'):

    print(node.tag,node.text)
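To try the traversal without an xmltest.xml file on disk, the XML can also be parsed from a string. The following is a minimal sketch; the shortened sample document and the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the sample document (assumed for illustration).
XML_TEXT = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <neighbor name="Costa Rica" direction="W"/>
    </country>
</data>"""

root = ET.fromstring(XML_TEXT)

# Attributes are read with .get(); element text is available as .text.
names = [country.get("name") for country in root.findall("country")]
ranks = {c.get("name"): int(c.find("rank").text) for c in root.iter("country")}
years = [node.text for node in root.iter("year")]

print(root.tag)   # data
print(names)      # ['Liechtenstein', 'Panama']
print(ranks)      # {'Liechtenstein': 2, 'Panama': 69}
print(years)      # ['2008', '2011']
```

findall() only looks at direct children of the element it is called on, while iter() walks the whole subtree, which is why iter('year') finds the nested year elements.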

Modifying and deleting XML content

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")

root = tree.getroot()

# Modify

for node in root.iter('year'):

    new_year = int(node.text) + 1

    node.text = str(new_year)

    node.set("updated","yes")

tree.write("xmltest.xml")

# Delete elements

for country in root.findall('country'):

    rank = int(country.find('rank').text)

    if rank > 50:

        root.remove(country)

tree.write('output.xml')

Creating an XML document yourself

import xml.etree.ElementTree as ET

new_xml = ET.Element("namelist")

name = ET.SubElement(new_xml,"name",attrib={"enrolled":"yes"})

age = ET.SubElement(name,"age",attrib={"checked":"no"})

sex = ET.SubElement(name,"sex")

sex.text = '33'

name2 = ET.SubElement(new_xml,"name",attrib={"enrolled":"no"})

age = ET.SubElement(name2,"age")

age.text = '19'

et = ET.ElementTree(new_xml)  # build the document object

et.write("test.xml", encoding="utf-8",xml_declaration=True)

ET.dump(new_xml)  # print the generated XML

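The configparser module

Used to generate and modify common configuration files. The module was called ConfigParser in Python 2 and was renamed to configparser in Python 3.

Here is a configuration file format that many programs use: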

[DEFAULT]

ServerAliveInterval = 45

Compression = yes

CompressionLevel = 9

ForwardX11 = yes

[bitbucket.org]

User = hg

[topsecret.server.com]

Port = 50022

ForwardX11 = no

How would you generate a file like this from Python?

import configparser

config = configparser.ConfigParser()

config["DEFAULT"] = {'ServerAliveInterval': '45',

                     'Compression': 'yes',

                     'CompressionLevel': '9'}

config['bitbucket.org'] = {}

config['bitbucket.org']['User'] = 'hg'

config['topsecret.server.com'] = {}

topsecret = config['topsecret.server.com']

topsecret['Port'] = '50022'     # mutates the parser

topsecret['ForwardX11'] = 'no'  # same here

config['DEFAULT']['ForwardX11'] = 'yes'

with open('example.ini', 'w') as configfile:

   config.write(configfile)

Once written, the file can be read back:

>>> import configparser

>>> config = configparser.ConfigParser()

>>> config.sections()

[]

>>> config.read('example.ini')

['example.ini']

>>> config.sections()

['bitbucket.org', 'topsecret.server.com']

>>> 'bitbucket.org' in config

True

>>> 'bytebong.com' in config

False

>>> config['bitbucket.org']['User']

'hg'

>>> config['DEFAULT']['Compression']

'yes'

>>> topsecret = config['topsecret.server.com']

>>> topsecret['ForwardX11']

'no'

>>> topsecret['Port']

'50022'

>>> for key in config['bitbucket.org']: print(key)

...

user

compressionlevel

serveraliveinterval

compression

forwardx11

>>> config['bitbucket.org']['ForwardX11']

'yes'
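The session above returns every value as a string. configparser also provides typed getters and a fallback argument; the sketch below shows them together with the way a section inherits from [DEFAULT]. The INI text is a trimmed copy of the sample, read from a string rather than a file, and the Timeout key is a hypothetical option that is deliberately missing.

```python
import configparser

INI_TEXT = """
[DEFAULT]
ServerAliveInterval = 45
Compression = yes
ForwardX11 = yes

[topsecret.server.com]
Port = 50022
ForwardX11 = no
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

section = config["topsecret.server.com"]
port = section.getint("Port")                      # converted to int
forward_x11 = section.getboolean("ForwardX11")     # the section overrides DEFAULT
compression = section.getboolean("Compression")    # inherited from DEFAULT
timeout = section.getint("Timeout", fallback=30)   # key is absent, fallback is used

print(port, forward_x11, compression, timeout)     # 50022 False True 30
```

Option names are case-insensitive by default, which is why the keys printed in the session above appear in lower case.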

configparser syntax for reading, adding, modifying, and deleting

[section1]

k1 = v1

k2:v2

[section2]

k1 = v1

import configparser

config = configparser.ConfigParser()

config.read('i.cfg')

# ########## Read ##########

#secs = config.sections()

#print(secs)

#options = config.options('section2')

#print(options)

#item_list = config.items('section2')

#print(item_list)

#val = config.get('section1', 'k1')

#val = config.getint('section1', 'k1')  # only works when the value is an integer

# ########## Modify ##########

#sec = config.remove_section('section1')

#config.write(open('i.cfg', "w"))

#sec = config.has_section('wupeiqi')

#sec = config.add_section('wupeiqi')

#config.write(open('i.cfg', "w"))

#config.set('section2', 'k1', '11111')  # values must be strings in Python 3

#config.write(open('i.cfg', "w"))

#config.remove_option('section2', 'k1')

#config.write(open('i.cfg', "w"))

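Executing system commands

The following modules and functions can run shell commands:

os.system
os.spawn*
os.popen* -- deprecated in Python 2 (os.popen2, os.popen3 and os.popen4 were removed in 3.x; os.popen itself remains)
popen2.* -- deprecated, removed in 3.x
commands.* -- deprecated, removed in 3.x

import commands  # Python 2 only

result = commands.getoutput('cmd')

result = commands.getstatus('file')  # takes a file name, not a command

result = commands.getstatusoutput('cmd')

The subprocess module implements everything these modules and functions do and offers richer functionality. In Python 3, commands.getoutput and commands.getstatusoutput live on as subprocess.getoutput and subprocess.getstatusoutput.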

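call

Runs the command and returns its exit status.

import subprocess

ret = subprocess.call(["ls", "-l"], shell=False)

ret = subprocess.call("ls -l", shell=True)

With shell=True the command may be given as a single string, which is passed to the shell for interpretation.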

check_call

Runs the command. Returns 0 if the exit status is 0; otherwise raises CalledProcessError.

subprocess.check_call(["ls", "-l"])

subprocess.check_call("exit 1", shell=True)

check_output

Runs the command. Returns the command's output if the exit status is 0; otherwise raises CalledProcessError.

subprocess.check_output(["echo", "Hello World!"])

subprocess.check_output("exit 1", shell=True)
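In Python 3.5 and later, subprocess.run covers the same ground as call, check_call, and check_output in a single function. The sketch below is one way to use it; it launches a child Python interpreter via sys.executable (an assumption made so the example does not depend on ls or echo being available).

```python
import subprocess
import sys

# Capture the output of a child process and decode it to str.
done = subprocess.run(
    [sys.executable, "-c", "print('Hello World!')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
print(done.returncode)        # 0
print(done.stdout.strip())    # Hello World!

# check=True raises CalledProcessError on a non-zero exit status,
# which is the behavior of check_call and check_output.
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(1)"], check=True)
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode
    print("command failed with status", failed_code)
```

Passing the command as a list with the default shell=False avoids shell quoting problems and is usually the safer choice when any part of the command comes from user input.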

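subprocess.Popen(...)

Used for running more complex system commands.

Parameters:

args: the command, either a string or a sequence (e.g. a list or tuple)
bufsize: buffering policy. 0 means unbuffered, 1 means line buffered, any other positive value is the buffer size, and a negative value means the system default
stdin, stdout, stderr: the program's standard input, standard output, and standard error handles
preexec_fn: Unix only. A callable object that is invoked in the child process just before the command is executed
close_fds: on Windows, if close_fds is True the child process does not inherit the parent's input, output, and error pipes.
Before Python 3.7 this meant that close_fds could not be True while also redirecting the child's standard handles (stdin, stdout, stderr).
shell: as described above
cwd: the working directory for the child process
env: the child's environment variables. If env is None, the child inherits the parent's environment.
universal_newlines: line endings differ between systems; True normalizes them to \n (and, in Python 3, opens the pipes in text mode)
startupinfo and creationflags: Windows only.
They are passed to the underlying CreateProcess() function to set properties of the child such as the appearance of the main window and the process priority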

import subprocess

ret1 = subprocess.Popen(["mkdir","t1"])

ret2 = subprocess.Popen("mkdir t2", shell=True)

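Running ordinary commands

Commands typed at a terminal fall into two kinds:

Commands that produce their output right away, e.g. ifconfig
Commands that enter an environment and then wait for further input, e.g. python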

import subprocess

obj = subprocess.Popen("mkdir t3", shell=True, cwd='/home/dev',)

import subprocess

# universal_newlines=True opens the pipes in text mode, so str can be written and read

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)

obj.stdin.write('print(1)\n')

obj.stdin.write('print(2)\n')

obj.stdin.write('print(3)\n')

obj.stdin.write('print(4)\n')

obj.stdin.close()

cmd_out = obj.stdout.read()

obj.stdout.close()

cmd_error = obj.stderr.read()

obj.stderr.close()

print(cmd_out)

print(cmd_error)

import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)

obj.stdin.write('print(1)\n')

obj.stdin.write('print(2)\n')

obj.stdin.write('print(3)\n')

obj.stdin.write('print(4)\n')

# communicate() closes stdin, waits for the child to exit, and returns (stdout, stderr)

out_error_list = obj.communicate()

print(out_error_list)

import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)

# the argument to communicate() is sent to the child's stdin

out_error_list = obj.communicate('print("hello")')

print(out_error_list)


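The logging module

Many programs need to record logs, and those logs contain both ordinary access records and errors, warnings, and similar output. Python's logging module provides a standard logging interface through which logs can be stored in various formats. Log messages come in five levels: debug(), info(), warning(), error() and critical(). Let's see how to use them.

Simplest usage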

import logging

logging.warning("user [alex] attempted wrong password more than 3 times")

logging.critical("server is down")

# Output

WARNING:root:user [alex] attempted wrong password more than 3 times

CRITICAL:root:server is down

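What each of these log levels means:

DEBUG: detailed information, typically of interest only when diagnosing problems
INFO: confirmation that things are working as expected
WARNING: an indication that something unexpected happened or that a problem is likely soon; the software is still working as expected
ERROR: a more serious problem; the software has not been able to perform some function
CRITICAL: a serious error; the program itself may be unable to continue running

Writing the log to a file is just as simple: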

import logging

logging.basicConfig(filename='example.log',level=logging.INFO)

logging.debug('This message should go to the log file')

logging.info('So should this')

logging.warning('And this, too')

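In the line below, level=logging.INFO sets the logging level to INFO, meaning that only messages at INFO level or higher are written to the file. In this example the first message is not recorded; to record debug messages as well, change the level to DEBUG.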

logging.basicConfig(filename='example.log',level=logging.INFO)

The log format above lacks a timestamp, and a log without times is of little use. Let's add one:

import logging

logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p')

logging.warning('is when this event was logged.')

# Output

12/12/2010 11:46:36 AM is when this event was logged.

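To send log output to both the screen and a log file at the same time, a little more background is needed.

Logging in Python involves four main classes; the summary in the official documentation describes them best:

logger exposes the interface that application code uses directly;

handler sends the log records (created by loggers) to the appropriate destination;

filter provides a finer-grained facility for deciding which log records to output;

formatter specifies the final layout of log records in the output.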

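logger
Every program obtains a Logger before emitting messages. A Logger usually corresponds to a module of the program; for example, the GUI module of a chat tool could obtain its Logger like this:
LOG=logging.getLogger("chat.gui")
and the core module like this:
LOG=logging.getLogger("chat.kernel")

Logger.setLevel(lel): sets the lowest level to handle; messages below lel are ignored. debug is the lowest built-in level and critical the highest
Logger.addFilter(filt), Logger.removeFilter(filt): add or remove the given filter
Logger.addHandler(hdlr), Logger.removeHandler(hdlr): add or remove the given handler
Logger.debug(), Logger.info(), Logger.warning(), Logger.error(), Logger.critical(): log a message at the corresponding level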

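handler

A handler object is responsible for sending the relevant messages to a specified destination. Python's logging system offers many Handlers: some write to the console, some write to files, and some send messages over the network. If these are not enough, you can write your own Handler. Multiple handlers can be attached with the addHandler() method
Handler.setLevel(lel): sets the level of messages to handle; messages below lel are ignored
Handler.setFormatter(): chooses a format for this handler
Handler.addFilter(filt), Handler.removeFilter(filt): add or remove a filter object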

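Each Logger can have several Handlers attached. Here are some commonly used ones:
1) logging.StreamHandler
This Handler writes to any file-like object such as sys.stdout or sys.stderr. Its constructor is:
StreamHandler([strm])
where strm is a file object; the default is sys.stderr

2) logging.FileHandler
Similar to StreamHandler, but writes log messages to a file, which FileHandler opens for you. Its constructor is:
FileHandler(filename[,mode])
filename is the file name and is required.
mode is the mode used to open the file; see the built-in open() function. The default is 'a', which appends to the end of the file.

3) logging.handlers.RotatingFileHandler
This Handler is like FileHandler above, but it manages file size. When the file reaches a certain size, it renames the current log file and creates a new file with the original name to continue logging. For example, if the log file is chat.log, then once chat.log reaches the specified size, RotatingFileHandler renames it to chat.log.1. If chat.log.1 already exists, it is first renamed to chat.log.2, and so on. Finally chat.log is created anew and logging continues. Its constructor is:
RotatingFileHandler( filename[, mode[, maxBytes[, backupCount]]])
filename and mode are the same as for FileHandler.
maxBytes specifies the maximum size of the log file. If maxBytes is 0, the log file can grow without limit and the renaming described above never happens.
backupCount specifies how many backup files to keep. For example, with a value of 2, when the renaming described above happens, the existing chat.log.2 is not renamed but deleted.

4) logging.handlers.TimedRotatingFileHandler
This Handler is similar to RotatingFileHandler, but instead of deciding when to start a new log file by file size, it creates a new log file at fixed time intervals. Renaming works much as in RotatingFileHandler, except that the suffix appended to the old file is a timestamp rather than a number. Its constructor is:
TimedRotatingFileHandler( filename [,when [,interval [,backupCount]]])
filename and backupCount have the same meaning as in RotatingFileHandler.
interval is the time interval.
when is a string giving the unit of the interval; it is case-insensitive and takes the following values:
S seconds
M minutes
H hours
D days
W0-W6 weekly on a given weekday (W0 is Monday)
midnight roll over at midnight each day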

import logging

#create logger

logger = logging.getLogger('TEST-LOG')

logger.setLevel(logging.DEBUG)

# create console handler and set level to debug

ch = logging.StreamHandler()

ch.setLevel(logging.DEBUG)

# create file handler and set level to warning

fh = logging.FileHandler("access.log")

fh.setLevel(logging.WARNING)

# create formatter

formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

# add formatter to ch and fh

ch.setFormatter(formatter)

fh.setFormatter(formatter)

# add ch and fh to logger

logger.addHandler(ch)

logger.addHandler(fh)

# 'application' code

logger.debug('debug message')

logger.info('info message')

logger.warning('warn message')

logger.error('error message')

logger.critical('critical message')
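The effect of giving each handler its own level is easier to see when the output is captured in memory. The sketch below replaces the console and access.log with two io.StringIO buffers; the logger name demo.levels and the buffer names are assumptions made for illustration.

```python
import io
import logging

logger = logging.getLogger("demo.levels")   # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False                     # do not pass records on to the root logger

all_messages = io.StringIO()      # stands in for the console
warnings_only = io.StringIO()     # stands in for access.log

verbose = logging.StreamHandler(all_messages)
verbose.setLevel(logging.DEBUG)
strict = logging.StreamHandler(warnings_only)
strict.setLevel(logging.WARNING)

formatter = logging.Formatter("%(name)s - %(levelname)s - %(message)s")
for handler in (verbose, strict):
    handler.setFormatter(formatter)
    logger.addHandler(handler)

logger.debug("debug message")
logger.warning("warn message")
logger.error("error message")

print(all_messages.getvalue())    # three lines: DEBUG, WARNING, ERROR
print(warnings_only.getvalue())   # two lines: WARNING, ERROR
```

A record must first pass the logger's own level and is then checked again against the level of each handler, so the logger level acts as the overall floor.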

Automatic log rotation example

import logging

from logging import handlers

logger = logging.getLogger(__name__)

log_file = "timelog.log"

#fh = handlers.RotatingFileHandler(filename=log_file,maxBytes=10,backupCount=3)

fh = handlers.TimedRotatingFileHandler(filename=log_file,when="S",interval=5,backupCount=3)

formatter = logging.Formatter('%(asctime)s %(module)s:%(lineno)d %(message)s')

fh.setFormatter(formatter)

logger.addHandler(fh)

logger.warning("test1")

logger.warning("test12")

logger.warning("test13")

logger.warning("test14")
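Time-based rotation is hard to observe in a short run, so here is a sketch of the size-based variant that is commented out above. It writes into a temporary directory; the logger name, file name, and the maxBytes value of 200 are arbitrary choices for the demonstration.

```python
import logging
import os
import tempfile
from logging import handlers

log_dir = tempfile.mkdtemp()
log_file = os.path.join(log_dir, "size.log")

logger = logging.getLogger("demo.rotation")   # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False

# Roll over before the file would exceed 200 bytes and keep two backups.
fh = handlers.RotatingFileHandler(log_file, maxBytes=200, backupCount=2)
fh.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(fh)

for i in range(50):
    logger.info("message number %03d", i)

fh.close()
logger.removeHandler(fh)

files = sorted(os.listdir(log_dir))
print(files)   # ['size.log', 'size.log.1', 'size.log.2']
```

Only the current file and backupCount backups survive, so older messages are discarded once the oldest backup is rotated out.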

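The re module

Common regular expression symbols

'.'     By default matches any single character except \n; with the DOTALL flag it matches any character, including newline

'^'     Matches the start of the string; with flags=re.MULTILINE it also matches at the start of each line, so re.search(r"^a","\nabc\neee",flags=re.MULTILINE) finds a match

'$'     Matches the end of the string; with re.MULTILINE it also matches at the end of each line, e.g. re.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group()

'*'     Matches the preceding character 0 or more times; re.findall("ab*","cabb3abcbbac") gives ['abb', 'ab', 'a']

'+'     Matches the preceding character 1 or more times; re.findall("ab+","ab+cd+abb+bba") gives ['ab', 'abb']

'?'     Matches the preceding character 0 or 1 times

'{m}'   Matches the preceding character exactly m times

'{n,m}' Matches the preceding character n to m times; re.findall("ab{1,3}","abb abc abbcbbb") gives ['abb', 'ab', 'abb']

'|'     Matches the pattern on its left or on its right; re.search("abc|ABC","ABCBabcCD").group() gives 'ABC'

'(...)' Grouping; re.search("(abc){2}a(123|456)c", "abcabca456c").group() gives 'abcabca456c'

'\A'    Matches only at the start of the string; re.search(r"\Aabc","alexabc") finds no match

'\Z'    Matches only at the end of the string; similar to $, except that $ also matches just before a trailing newline

'\d'    Matches a digit 0-9

'\D'    Matches a non-digit

'\w'    Matches [A-Za-z0-9_]

'\W'    Matches anything not in [A-Za-z0-9_]

'\s'    Matches whitespace such as space, \t, \n, \r; re.search(r"\s+","ab\tc1\n3").group() gives '\t'

'(?P<name>...)' Named group; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict() gives {'province': '3714', 'city': '81', 'birthday': '1993'}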

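The most commonly used matching functions

re.match    matches from the start of the string

re.search   finds the first match anywhere in the string

re.findall  returns all matches as a list

re.split    splits the string, using the matches as separators

re.sub      replaces the matches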
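The five functions listed above can be compared side by side on one input. This is a small sketch; the sample text and variable names are made up for illustration.

```python
import re

text = "alex 22, jack 31, rain 19"

at_start = re.match(r"\w+", text)        # anchored at the start of the string
no_match = re.match(r"\d+", text)        # None: the string does not start with a digit
first_num = re.search(r"\d+", text)      # first match anywhere in the string
all_nums = re.findall(r"\d+", text)      # every match, as a list of strings
parts = re.split(r",\s*", text)          # split on a comma plus optional spaces
masked = re.sub(r"\d+", "N", text)       # replace every match

print(at_start.group())    # alex
print(no_match)            # None
print(first_num.group())   # 22
print(all_nums)            # ['22', '31', '19']
print(parts)               # ['alex 22', 'jack 31', 'rain 19']
print(masked)              # alex N, jack N, rain N
```

match and search return a match object or None, so the result should be checked before calling group() on it.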

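The backslash plague
Like most programming languages, regular expressions use "\" as the escape character, which can lead to a plague of backslashes. To match a literal "\" in the text, a regular expression written in an ordinary string literal needs four backslashes, "\\\\": the string literal turns each pair into one backslash, and the resulting two backslashes are then interpreted by the regex engine as one escaped backslash. Python's raw strings solve this neatly: the same regular expression can be written as r"\\". Likewise, "\\d" for matching a digit can be written as r"\d". With raw strings you no longer have to worry about a missing backslash, and the expressions are easier to read.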
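The equivalence between the escaped and the raw spelling can be checked directly. In this sketch the sample path is a made-up value.

```python
import re

path = r"C:\temp\new"      # raw string: the two backslashes are kept as written

# "\\\\" and r"\\" are the same two-character pattern, which matches one backslash.
same_pattern = ("\\\\" == r"\\")
count_plain = len(re.findall("\\\\", path))
count_raw = len(re.findall(r"\\", path))

# "\\d" and r"\d" are the same pattern as well.
same_digit_pattern = ("\\d" == r"\d")
digits = re.findall(r"\d", "a1b22")

print(same_pattern, count_plain, count_raw)   # True 2 2
print(same_digit_pattern, digits)             # True ['1', '2', '2']
```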

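A few matching flags worth knowing

re.I (re.IGNORECASE): ignore case (the full name is in parentheses, likewise below)

re.M (re.MULTILINE): multi-line mode; changes the behavior of '^' and '$' (see the symbol table above)

re.S (re.DOTALL): dot-matches-all mode; changes the behavior of '.'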
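A short sketch of what each of the three flags changes, using a made-up three-line string:

```python
import re

text = "Foo\nfoo bar\nBAZ"

ignore_case = re.findall(r"foo", text, flags=re.I)           # case is ignored
line_starts = re.findall(r"^\w+", text, flags=re.M)          # ^ matches at every line start
no_flag = re.findall(r"^\w+", text)                          # ^ matches at the string start only
dot_default = re.search(r"Foo.foo", text)                    # '.' does not match \n
dot_all = re.search(r"Foo.foo", text, flags=re.S).group()    # '.' matches \n too

print(ignore_case)     # ['Foo', 'foo']
print(line_starts)     # ['Foo', 'foo', 'BAZ']
print(no_flag)         # ['Foo']
print(dot_default)     # None
print(repr(dot_all))   # 'Foo\nfoo'
```

Flags can be combined with the | operator, for example flags=re.I | re.M.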

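Homework for this section

Develop a simple calculator in Python

Implement addition, subtraction, multiplication, and division, with parentheses and operator precedence
When the user enters a formula such as 1 - 2 * ( (60-30 +(-40/5) * (9-2*5/3 + 7 /3*99/4*2998 +10 * 568/14 )) - (-4*3)/ (16-3*2) ), the program must parse the (), +, -, *, / symbols and the formula itself (no shortcuts such as calling eval), and the computed result must agree with what a real calculator gives

hint:

re.search(r'\([^()]+\)',s).group()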

'(-40/5)'
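One possible way to build on the hint is to keep replacing the innermost parenthesized group with its value until no parentheses remain. The sketch below is only one approach among many, not a reference solution; the helper names eval_flat and calculate are hypothetical, and malformed input is not validated.

```python
import re

# A number (optionally with a fraction or exponent) or a single operator.
TOKEN = re.compile(r"\d+(?:\.\d+)?(?:e[+-]?\d+)?|[-+*/]")
INNERMOST = re.compile(r"\([^()]+\)")


def eval_flat(expr):
    """Evaluate an expression without parentheses, e.g. '9-2*5/3'."""
    values, ops = [], []
    sign = 1
    expect_number = True
    for tok in TOKEN.findall(expr):
        if expect_number:
            if tok == "-":          # unary minus: fold it into the next number
                sign = -sign
            elif tok == "+":        # unary plus: nothing to do
                continue
            else:
                values.append(sign * float(tok))
                sign = 1
                expect_number = False
        else:
            ops.append(tok)
            expect_number = True
    # * and / bind tighter: apply them to the last term; + and - start a new term.
    terms = [values[0]]
    for op, val in zip(ops, values[1:]):
        if op == "*":
            terms[-1] *= val
        elif op == "/":
            terms[-1] /= val
        else:
            terms.append(val if op == "+" else -val)
    return sum(terms)


def calculate(expr):
    """Reduce innermost parentheses first, then evaluate what is left."""
    expr = expr.replace(" ", "")
    while True:
        m = INNERMOST.search(expr)
        if m is None:
            return eval_flat(expr)
        inner_value = eval_flat(m.group()[1:-1])
        expr = expr[:m.start()] + repr(inner_value) + expr[m.end():]


print(calculate("1 - 2 * ((60-30) + (-40/5) * 3)"))   # -11.0
```

Because a substituted value may be negative, the flat evaluator has to accept a sign directly after an operator (as in 30.0+-8.0*3), which is what the expect_number flag handles.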

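Software directory structure conventions

Why design a good directory structure?

"Designing a project's directory structure", like "code style", is a matter of personal style. There have always been two attitudes toward this kind of convention:

Some people think this kind of personal style "doesn't matter": as long as the program works, style is not a problem at all.
Others think that conventions give better control over a program's structure and make it more readable.

I lean toward the latter, because I have been a direct victim of the first attitude. I once maintained a project that was very hard to read; its logic was not complicated, yet it took me a very long time to understand what it was trying to express. Since then I have set a high bar for readability and maintainability. "Project directory structure" belongs to "readability and maintainability" too; we design a clearly layered directory structure to achieve two things:

High readability: someone unfamiliar with the project's code can understand the directory structure at a glance and see which file is the startup script, where the tests are, where the configuration files are, and so on, and thus get to know the project very quickly.
High maintainability: once the organizing rules are defined, a maintainer knows exactly which directory a new file or piece of code belongs in. The benefit is that as code and configuration grow over time, the project structure does not become a mess and stays well organized.

So I believe that keeping a clearly layered directory structure is worthwhile, all the more because organizing a good project directory is actually very easy.

How to organize the directories

There are already some widely agreed-upon ways to lay out a Python project. A Stackoverflow question on this topic collects a discussion of Python directory structures.

What is said there is already very good, and I don't intend to reinvent the wheel by listing the various approaches; here I'll just give my own understanding and experience.

Suppose your project is named foo. The simplest directory structure I would recommend is the following, and it is sufficient: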

Foo/

|-- bin/

|   |-- foo

|

|-- foo/

|   |-- tests/

|   |   |-- __init__.py

|   |   |-- test_main.py

|   |

|   |-- __init__.py

|   |-- main.py

|

|-- docs/

|   |-- conf.py

|   |-- abc.rst

|

|-- setup.py

|-- requirements.txt

|-- README

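A brief explanation:

bin/: holds the project's executables; you may of course name it script/ or similar.
foo/: holds all of the project's source code. (1) All modules and packages in the source code should go in this directory, not in the top-level directory. (2) Its subdirectory tests/ holds the unit tests. (3) The program's entry point is best named main.py.
docs/: holds documentation.
setup.py: the script for installation, deployment, and packaging.
requirements.txt: lists the external Python packages the software depends on.
README: the project description file.

Beyond these, some schemes include more, such as LICENSE.txt and ChangeLog.txt. I haven't listed them here because they mostly matter when a project is open-sourced. If you want to write open-source software, there are articles describing how to organize its directories that you can consult.

Below I briefly describe my understanding of these directories and my personal requirements for them.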

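About the README

I think every project should have this file. Its purpose is to describe the project briefly so that readers can get to know it quickly.

It should cover the following:

What the software is for and its basic functionality.
How to run the code: environment setup, startup commands, and so on.
Brief usage instructions.
A description of the code directory structure and, in more detail, the basic principles behind the software.
Frequently asked questions.

A README that covers these points is a good one. Early in development, when the above may be unclear or still changing, not everything has to be filled in from the start. But by the time the project is finished, such a document needs to be written.

For reference, see the README in the Redis source code, which describes Redis's functionality and source layout concisely but clearly.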

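About requirements.txt and setup.py

setup.py

In general, setup.py manages packaging, installation, and deployment. The industry-standard approach is to use setuptools, Python's popular packaging tool, for this, and it is common in open-source projects. The core idea here, though, is not about using a standardized tool; it is that a project must have some installation and deployment tool that can quickly and conveniently set up the environment, deploy the code, and get the program running on a new machine.

I have been bitten by this.

When I first started writing projects in Python, I set up the environment, deployed the code, and ran the program entirely by hand, and ran into the following problems:

When setting up the environment I often forgot a recently added Python package, so the program failed as soon as it ran in production.
Package version dependencies: sometimes the program used one version of a Python package while the official release had moved on to a newer one, and a manual install could pick the wrong version.
With many dependencies, installing them one by one is time-consuming.
When new team members started on the project, getting the program to run was a hassle, because it was easy to forget how to install the various dependencies.

setup.py can automate all of this, improving efficiency and reducing the chance of mistakes. "Automate what is complicated; whatever can be automated should be automated" is a very good habit.

The setuptools documentation is large, and a newcomer may struggle to find a starting point. A good way to learn a technology is to see how others use it; for reference, look at how flask, a Python web framework, writes its setup.py.

Of course, writing a simple install script of your own (deploy.sh) in place of setup.py is also acceptable.

requirements.txt

This file exists to:

Make it easy for developers to maintain the package dependencies. Add packages introduced during development to this list so that none is missed when setup.py installs dependencies.
Make it clear to readers which Python packages the project uses.

Each line of the file describes one package dependency, usually in a form such as flask>=0.10. The format must be one that pip understands, so that all Python dependencies can be installed simply with pip install -r requirements.txt. For the exact format, see the pip documentation on requirements files.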

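How to use configuration files

Note that in the directory structure above, conf.py is not placed in the source directory but under docs/.

Many projects handle configuration like this:

Configuration is written in one or more Python files, such as conf.py here.
Whichever module needs the configuration uses it directly in code via import conf.

I don't quite agree with this approach:

It makes unit testing harder (because the module depends internally on external configuration).
The configuration file is the interface through which the user controls the program, so the user should be free to specify its path.
Component reusability suffers, because this hard-coding, which runs through every module, makes most modules depend on conf.py.

So I think a better way to use configuration is:

A module's settings can be configured flexibly and are not affected by an external configuration file.
The program's configuration can also be controlled flexibly.

Supporting this idea: anyone who has used nginx or mysql knows that both let the user freely specify the configuration file.

Therefore, code should not use the configuration file by importing conf directly. The conf.py in the directory structure above is a sample configuration, not a file hard-wired into the program. The program can read its configuration by being given the configuration path as a startup argument to main.py. You can of course give conf.py a similar name, such as settings.py, or write the configuration in another format, such as settings.yaml.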
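Passing the configuration path as a startup argument could look roughly like the sketch below, which combines argparse with the configparser module covered earlier. Everything specific here is an assumption for illustration: the --config option name, the default path conf/foo.ini, the [server] section, and the load_config helper.

```python
import argparse
import configparser
import os
import tempfile


def load_config(argv=None):
    """Parse the command line and load the INI file named by --config."""
    parser = argparse.ArgumentParser(description="foo (hypothetical program)")
    parser.add_argument("-c", "--config", default="conf/foo.ini",
                        help="path to the configuration file")
    args = parser.parse_args(argv)

    config = configparser.ConfigParser()
    if not config.read(args.config):     # read() returns the list of files it parsed
        parser.error("cannot read config file: %s" % args.config)
    return config


# Demonstration: write a throwaway config file and point the program at it.
demo_path = os.path.join(tempfile.mkdtemp(), "settings.ini")
with open(demo_path, "w") as f:
    f.write("[server]\nport = 8080\n")

config = load_config(["--config", demo_path])
port = config.getint("server", "port")
print(port)   # 8080
```

Modules then receive the values they need (for example the port) as ordinary arguments instead of importing a shared conf module, which keeps them testable in isolation.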