[python] Study notes 3 for "Automate the Boring Stuff with Python" (《Python编程快速上手:让繁琐工作自动化》)

1. Notes on organizing files (Chapter 9)

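1.1 Files and file paths
The shutil module, loaded with import shutil, lets a Python program copy, move, rename, and delete files and folders. This section also covers a few file-handling functions from the os module. The commonly used functions are: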

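Function	Purpose	Notes
shutil.copy(source, destination)	Copies a file	Returns the path of the copy; if destination is a folder, the file is copied into it under the same name
shutil.copytree(source, destination)	Copies a folder and everything in it	Creates the destination folder, which by default must not already exist
shutil.move(source, destination)	Moves a file or folder	Returns the path of the new location as a string; an existing file at the destination may be overwritten
os.unlink(path)	Deletes the file at path
os.rmdir(path)	Deletes the folder at path	The folder must be empty, with no files or subfolders in it
shutil.rmtree(path)	Deletes the folder at path	All files and folders it contains are deleted as well
os.walk(path)	Walks every folder and file under path	Yields three values per folder: the current folder's name, a list of strings naming its subfolders, and a list of strings naming its files
os.rename(source, destination)	Renames the file or folder at source to destination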
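
As a minimal sketch of how the functions in the table behave, the following exercises them inside a temporary directory so that nothing outside it is touched. The file and folder names (notes.txt, backup, and so on) are made up for illustration and are not from the book.

```python
import os
import shutil
import tempfile

def demo_file_ops():
    """Exercise the shutil/os calls from the table inside a throwaway directory."""
    results = {}
    with tempfile.TemporaryDirectory() as root:
        src_dir = os.path.join(root, 'src')
        os.makedirs(src_dir)
        src_file = os.path.join(src_dir, 'notes.txt')   # hypothetical file name
        with open(src_file, 'w') as f:
            f.write('hello')

        # shutil.copy returns the path of the new copy.
        copied = shutil.copy(src_file, os.path.join(root, 'notes_copy.txt'))
        results['copied_exists'] = os.path.exists(copied)

        # shutil.copytree creates the destination folder (it must not exist yet).
        backup_dir = shutil.copytree(src_dir, os.path.join(root, 'backup'))
        results['backup_files'] = sorted(os.listdir(backup_dir))

        # shutil.move returns the new location; here it also renames the file.
        moved = shutil.move(copied, os.path.join(root, 'renamed.txt'))
        results['moved_name'] = os.path.basename(moved)

        # os.rename takes both the old and the new path.
        os.rename(moved, os.path.join(root, 'final.txt'))

        # os.walk yields (folder, subfolders, filenames) for every folder in the tree.
        results['walked'] = sorted(
            name for _, _, names in os.walk(root) for name in names)

        os.unlink(os.path.join(root, 'final.txt'))      # delete a single file
        shutil.rmtree(backup_dir)                       # delete a folder and its contents
        results['after_cleanup'] = sorted(os.listdir(root))
    return results

print(demo_file_ops())
```

Working in a temporary directory is a deliberate choice here: os.unlink and shutil.rmtree delete permanently, so it is safer to try them on throwaway files first.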

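1.2 Compressing files with the zipfile module
With import zipfile, a Python program can use the functions in the zipfile module to create and open (or extract) ZIP files. The commonly used functions are: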

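Function	Purpose	Notes
exampleZip = zipfile.ZipFile('example.zip')	Creates a ZipFile object	example.zip is the name of the .zip file; the default mode is read
exampleZip.namelist()	Returns a list of strings naming every file and folder in the ZIP file
spamInfo = exampleZip.getinfo('example.txt')	Returns a ZipInfo object for one particular file	example.txt is a file inside the archive
spamInfo.file_size	Size of the original file	In bytes
spamInfo.compress_size	Size of the compressed file	In bytes
exampleZip.extractall(path)	Extracts the whole archive into the path folder	If path is omitted, the current working directory is used
exampleZip.extract('spam.txt', path)	Extracts a single file from the archive into the path folder	If path is omitted, the current working directory is used
newZip = zipfile.ZipFile('new.zip', 'w')	Opens a ZipFile object in write mode	Write mode overwrites an existing file of the same name
newZip.write('spam.txt', compress_type=zipfile.ZIP_DEFLATED)	Adds a file to the archive	The first argument is the file to add; compress_type selects the compression method
newZip.close()	Closes the ZipFile object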
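
The calls in the table can be strung together into one round trip. This is a small sketch under assumed names (spam.txt, new.zip, and the extracted folder are placeholders); it runs inside a temporary directory and uses with blocks so the archives are closed automatically.

```python
import os
import tempfile
import zipfile

def demo_zip_roundtrip():
    """Create a ZIP file, inspect it, and extract it again in a temp directory."""
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, 'spam.txt')             # hypothetical file name
        with open(src, 'w') as f:
            f.write('spam ' * 1000)                      # 5000 bytes of repetitive text

        zip_path = os.path.join(root, 'new.zip')
        # 'w' creates or overwrites the archive; 'a' would append instead.
        with zipfile.ZipFile(zip_path, 'w') as new_zip:
            # arcname sets the name stored inside the archive.
            new_zip.write(src, arcname='spam.txt',
                          compress_type=zipfile.ZIP_DEFLATED)

        with zipfile.ZipFile(zip_path) as example_zip:   # read mode by default
            names = example_zip.namelist()
            info = example_zip.getinfo('spam.txt')
            sizes = (info.file_size, info.compress_size)
            out_dir = os.path.join(root, 'extracted')
            example_zip.extractall(out_dir)              # creates out_dir if needed

        extracted = os.listdir(out_dir)
    return names, sizes, extracted

names, (original_size, compressed_size), extracted = demo_zip_roundtrip()
print(names, original_size, compressed_size, extracted)
```

Comparing file_size with compress_size is a quick way to see how much a given file actually benefits from compression.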

2. Practice projects

2.1 Renaming files with American-style dates to European-style dates

# Renames filenames with American MM-DD-YYYY date format to European DD-MM-YYYY.
import os
import re
import shutil

# Create a regex that matches filenames containing an American-style date.
datePattern = re.compile(r"""^(.*?) # all text before the date
    ((0|1)?\d)-                     # one or two digits for the month
    ((0|1|2|3)?\d)-                 # one or two digits for the day
    ((19|20)\d\d)                   # four digits for the year
    (.*?)$                          # all text after the date
    """, re.VERBOSE)

# Folder to search
searchPath = 'd:/'
# Absolute path of that folder, computed once outside the loop
absWorkingDir = os.path.abspath(searchPath)

for amerFilename in os.listdir(searchPath):
    mo = datePattern.search(amerFilename)

    # Skip files without a date.
    if mo is None:
        continue

    # Get the different parts of the filename.
    beforePart = mo.group(1)
    monthPart = mo.group(2)
    dayPart = mo.group(4)
    yearPart = mo.group(6)
    afterPart = mo.group(8)

    # Form the European-style filename.
    euroFilename = beforePart + dayPart + '-' + monthPart + '-' + yearPart + afterPart

    # Get the full, absolute file paths.
    amerFilename = os.path.join(absWorkingDir, amerFilename)   # original name
    euroFilename = os.path.join(absWorkingDir, euroFilename)   # new name

    # Rename the files. Comment out shutil.move for a dry run that only prints.
    print('Renaming "%s" to "%s"...' % (amerFilename, euroFilename))
    shutil.move(amerFilename, euroFilename)

Renaming "d:\今天是06-28-2019.txt" to "d:\今天是28-06-2019.txt"...
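
Before letting the script rename anything on disk, it can help to check the regex on plain strings. The helper below is a sketch, not part of the book's solution: to_european is a hypothetical name, and it reuses the same pattern and group numbers as the script above.

```python
import re

DATE_PATTERN = re.compile(r"""^(.*?)   # all text before the date
    ((0|1)?\d)-                        # one or two digits for the month
    ((0|1|2|3)?\d)-                    # one or two digits for the day
    ((19|20)\d\d)                      # four digits for the year
    (.*?)$                             # all text after the date
    """, re.VERBOSE)

def to_european(filename):
    """Return the DD-MM-YYYY version of filename, or None if it has no date."""
    mo = DATE_PATTERN.search(filename)
    if mo is None:
        return None
    # Groups 3, 5 and 7 are the inner helper groups, so they are skipped.
    before, month, day, year, after = mo.group(1, 2, 4, 6, 8)
    return f'{before}{day}-{month}-{year}{after}'

print(to_european('今天是06-28-2019.txt'))   # 今天是28-06-2019.txt
print(to_european('notes.txt'))              # None
```

Note that the pattern only checks the shape of the date, so a value such as 19 for the month is still accepted.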

2.2 Backing up a folder into a ZIP file



import os
import zipfile

def backupToZip(folder):
    # Back up the entire contents of "folder" into a ZIP file.
    folder = os.path.abspath(folder)  # make sure folder is absolute

    # Figure out the filename this code should use based on
    # what files already exist.
    number = 1
    while True:
        zipFilename = os.path.basename(folder) + '_' + str(number) + '.zip'
        # Stop at the first name that is not taken yet.
        if not os.path.exists(zipFilename):
            break
        number = number + 1

    # Create the ZIP file.
    print('Creating %s...' % (zipFilename))
    backupZip = zipfile.ZipFile(zipFilename, 'w')

    # Walk the entire folder tree and compress the files in each folder.
    for foldername, subfolders, filenames in os.walk(folder):
        print('Adding files in %s...' % (foldername))
        # Add the current folder to the ZIP file.
        backupZip.write(foldername)
        # Add all the files in this folder to the ZIP file.
        newBase = os.path.basename(folder) + '_'
        for filename in filenames:
            if filename.startswith(newBase) and filename.endswith('.zip'):
                continue  # don't back up the backup ZIP files
            backupZip.write(os.path.join(foldername, filename))
    backupZip.close()
    print('Done.')

backupToZip('image')

Creating image_1.zip...
Done.
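
The script above passes absolute paths to backupZip.write, so the archive stores the full folder hierarchy leading to the backed-up folder. One possible variation, sketched here with a hypothetical helper named zip_folder, passes arcname so that only paths relative to the folder's parent are stored. It is an alternative under that assumption, not the book's version, and it does not record empty folders.

```python
import os
import zipfile

def zip_folder(folder, zip_path):
    """Write every file under folder into zip_path with paths relative to folder's parent."""
    folder = os.path.abspath(folder)
    parent = os.path.dirname(folder)
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for foldername, subfolders, filenames in os.walk(folder):
            for filename in filenames:
                full_path = os.path.join(foldername, filename)
                if os.path.abspath(full_path) == zip_abs:
                    continue  # never add the archive to itself
                # arcname controls the name stored inside the archive.
                archive.write(full_path,
                              arcname=os.path.relpath(full_path, parent))
    return zip_path
```

For example, zip_folder('image', 'image_1.zip') would store entries such as image/photo.jpg rather than the full path from the drive root.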

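2.3 Selective copy

Write a program that walks a directory tree and looks for files with a given extension (such as .pdf or .jpg). Wherever those files are located, copy them into a new folder.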

import os
import shutil

def searchFile(path, savepath):
    # Create the destination folder if it does not exist yet.
    if not os.path.exists(savepath):
        os.makedirs(savepath)
    # Walk the folder tree.
    for foldername, subfolders, filenames in os.walk(path):
        for filename in filenames:
            # Keep only .txt and .pdf files; the leading dot avoids
            # matching names that merely end in "txt" or "pdf".
            if filename.endswith(('.txt', '.pdf')):
                inputFile = os.path.join(foldername, filename)
                # Path of the copy inside savepath
                outputFile = os.path.join(savepath, filename)
                # Copy the file; a same-named file already in savepath is overwritten.
                shutil.copy(inputFile, outputFile)

searchFile("mytest", "save")

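2.4 Deleting unneeded files

Write a program that walks a directory tree and looks for exceptionally large files or folders, then prints the absolute paths of those files to the screen. The version below goes one step further and also deletes each file it reports.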



import os

def deletefile(path):
    for foldername, subfolders, filenames in os.walk(path):
        for filename in filenames:
            # Path of the file (relative if path itself is relative)
            filepath = os.path.join(foldername, filename)
            # If the file is larger than 100 MB
            if os.path.getsize(filepath) / 1024 / 1024 > 100:
                # Convert to an absolute path before printing.
                filepath = os.path.abspath(filepath)
                print(filepath)
                # Delete the file. This cannot be undone.
                os.unlink(filepath)

deletefile("mytest")
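
Since the exercise itself only asks for the paths to be printed, a report-only variant is a safer first step than deleting straight away. The sketch below assumes a hypothetical function name, find_large_files, and a min_bytes parameter that defaults to the same 100 MB threshold used above.

```python
import os

def find_large_files(path, min_bytes=100 * 1024 * 1024):
    """Return (absolute path, size in bytes) pairs for files under path larger than min_bytes."""
    large = []
    for foldername, subfolders, filenames in os.walk(path):
        for filename in filenames:
            filepath = os.path.join(foldername, filename)
            try:
                size = os.path.getsize(filepath)
            except OSError:
                continue  # e.g. a broken symlink or a file removed mid-walk
            if size > min_bytes:
                large.append((os.path.abspath(filepath), size))
    return large

for filepath, size in find_large_files('.'):
    print('%s (%.1f MB)' % (filepath, size / 1024 / 1024))
```

The returned list can be reviewed by hand, and only then passed to os.unlink if the files really are unneeded.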

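2.5 Filling in the gaps

Write a program that finds all files with a given prefix in a single folder, such as spam001.txt, spam002.txt, and so on, and locates any gaps in the numbering (for example, spam001.txt and spam003.txt exist but spam002.txt does not). Have the program rename all the later files to close the gap.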



import os
import re

# Folder to process
path = '.'
fileList = []
numList = []

# Find the matching files. The raw string keeps \d intact,
# and \. matches a literal dot instead of any character.
pattern = re.compile(r'^spam(\d{3})\.txt$')
for file in os.listdir(path):
    mo = pattern.search(file)
    if mo is not None:
        fileList.append(file)
        numList.append(mo.group(1))

# Sort both lists; the zero-padded numbers sort in the same order as the filenames.
fileList.sort()
numList.sort()

# Numbering starts at 1.
index = 1

# Rename every file whose number differs from its position in the sequence.
for i in range(len(numList)):
    if int(numList[i]) != i + index:
        inputFile = os.path.join(path, fileList[i])
        print("the missing number file is {}:".format(inputFile))
        outputFile = os.path.join(path, 'spam' + '%03d' % (i + index) + '.txt')
        os.rename(inputFile, outputFile)

the missing number file is .\spam005.txt:
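
The renumbering logic can also be separated from the actual renaming, which makes it easy to check on a plain list of names. This is a sketch with a hypothetical function, plan_gap_renames; it assumes three-digit numbers, a .txt extension, and that no existing number is smaller than start.

```python
import re

def plan_gap_renames(filenames, prefix='spam', start=1):
    """Return (old name, new name) pairs that close gaps in the numbering."""
    pattern = re.compile(r'^' + re.escape(prefix) + r'(\d{3})\.txt$')
    numbered = []
    for name in filenames:
        mo = pattern.match(name)
        if mo:
            numbered.append((int(mo.group(1)), name))
    numbered.sort()

    plan = []
    for offset, (number, name) in enumerate(numbered):
        expected = start + offset
        if number != expected:
            plan.append((name, '%s%03d.txt' % (prefix, expected)))
    return plan

print(plan_gap_renames(['spam001.txt', 'spam003.txt', 'spam005.txt', 'eggs.txt']))
```

Applying the pairs in the returned order with os.rename is safe under the stated assumption, because each target number is either a gap or was vacated by an earlier rename.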
