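Parallel Programming Speed in Python

Here I mainly want to record a small point I ran into today: how fast is parallel programming in Python?

I wanted to parallelize AutoTool, the main goal being to speed up the execution of multiple tasks. My first instinct was to run the different module tasks in multiple threads. While collecting material on multithreaded programming in Python, however, I came across something quite surprising: Python's multithreading is not true parallel multithreading. Because of the GIL (Global Interpreter Lock; see the article 《Python最难的问题》, "Python's Hardest Problem"), the default interpreter (CPython) lets only one thread execute Python bytecode at any given moment.

I wrote an example to show this: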

 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 # @File    : batch_swig_runner.py
 # @Time    : 2019/7/8 18:09
 # @Author  : KuLiuheng
 # @Email   : liuheng.klh@alibaba-inc.com

 import time
 import logging
 from threading import Thread
 from multiprocessing import Pool


 class TestRunner(Thread):
     def __init__(self, name, path):
         super(TestRunner, self).__init__()
         self.name = name
         self.path = path

     def run(self):
         logging.warning("Message from the thread-%s START" % self.name)
         for i in range(10000000):   # simulate a time-consuming (CPU-bound) operation
             j = int(i) * 10.1
         # time.sleep(1)
         logging.warning("Message from the thread-%s END" % self.name)
         return self.path   # note: Thread discards the return value of run()


 def multi_process(mname, mpath):
     # Same workload as TestRunner.run(), executed in a worker process
     logging.warning("Message from the thread-%s START" % mname)
     for i in range(10000000):   # simulate a time-consuming (CPU-bound) operation
         j = int(i) * 10.1
     # time.sleep(1)
     logging.warning("Message from the thread-%s END" % mname)


 class BatchSwigRunner(object):
     def __init__(self, modules=None):
         """
         Initialize with a dict of module info (project name: project path)
         :param modules: {project name: project path}
         """
         if modules is not None:
             self._modules = modules
         else:
             self._modules = dict()

     def add_module_info(self, name, path):
         self._modules[name] = path

     def start(self):
         """
         Start the batch of tasks, one thread per module, and wait for all of them
         :return: list(project index, project name) of the projects that failed
         """
         runners = list()
         for (project_name, project_path) in self._modules.items():
             # logging.warning('BatchSwigRunner.start() [%s][%s]' % (project_name, project_path))
             sub_runner = TestRunner(project_name, project_path)
             sub_runner.daemon = True
             sub_runner.start()
             runners.append(sub_runner)
         for runner in runners:
             runner.join()


 if __name__ == '__main__':
     # Case 1: four threads
     batch_runner = BatchSwigRunner()
     batch_runner.add_module_info('name1', 'path1')
     batch_runner.add_module_info('name2', 'path2')
     batch_runner.add_module_info('name3', 'path3')
     batch_runner.add_module_info('name4', 'path4')
     start_time = time.time()
     batch_runner.start()
     print('Total time comsumed = %.2fs' % (time.time() - start_time))
     print('========================================')

     # Case 2: plain sequential execution ("normal")
     start_time = time.time()
     for index in range(4):
         logging.warning("Message from the times-%d START" % index)
         for i in range(10000000):       # simulate a time-consuming (CPU-bound) operation
             j = int(i) * 10.1
         # time.sleep(1)
         logging.warning("Message from the times-%d END" % index)
     print('>>Total time comsumed = %.2fs' % (time.time() - start_time))
     print('----------------------------------------------')

     # Case 3: a pool of four worker processes
     start_time = time.time()
     pool = Pool(processes=4)
     for i in range(4):
         pool.apply_async(multi_process, ('name++%d' % i, 'path++%d' % i))
     pool.close()
     pool.join()
     print('>>>> Total time comsumed = %.2fs' % (time.time() - start_time))
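One detail in the listing above is that the value returned from run() is thrown away by threading.Thread, so start() has nothing to report back. The following is a minimal sketch, assuming Python 3 and its concurrent.futures module, of how per-module results and failures could be collected instead. The names run_module and run_all are hypothetical and not part of AutoTool; the "empty path" failure is only there to simulate an error.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_module(name, path):
    """Hypothetical per-module task; an empty path simulates a failure."""
    if not path:
        raise ValueError("no path configured for %s" % name)
    return path


def run_all(modules):
    """Run every module in a thread pool and return (results, failed)."""
    results, failed = {}, []
    with ThreadPoolExecutor(max_workers=4) as executor:
        # Map each future back to the module name it belongs to.
        futures = {executor.submit(run_module, name, path): name
                   for name, path in modules.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                # result() re-raises any exception raised inside the task.
                results[name] = future.result()
            except Exception as exc:
                failed.append((name, str(exc)))
    return results, failed


results, failed = run_all({"name1": "path1", "name2": ""})
print(results)
print(failed)
```

This only changes how results are gathered. The threads are still subject to the GIL, so it does not make CPU-bound work any faster.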

The results lead to a rather surprising conclusion:

C:\Python27\python.exe E:/VirtualShare/gitLab/GBL-310/GBL/AutoJNI/autoTool/common/batch_swig_runner.py
WARNING:root:Message from the thread-name4 START
WARNING:root:Message from the thread-name2 START
WARNING:root:Message from the thread-name3 START
WARNING:root:Message from the thread-name1 START
WARNING:root:Message from the thread-name2 END
WARNING:root:Message from the thread-name4 END
WARNING:root:Message from the thread-name3 END
Total time comsumed = 15.92s
========================================
WARNING:root:Message from the thread-name1 END
WARNING:root:Message from the times-0 START
WARNING:root:Message from the times-0 END
WARNING:root:Message from the times-1 START
WARNING:root:Message from the times-1 END
WARNING:root:Message from the times-2 START
WARNING:root:Message from the times-2 END
WARNING:root:Message from the times-3 START
WARNING:root:Message from the times-3 END
>>Total time comsumed = 11.59s
----------------------------------------------
WARNING:root:Message from the thread-name++0 START
WARNING:root:Message from the thread-name++1 START
WARNING:root:Message from the thread-name++2 START
WARNING:root:Message from the thread-name++3 START
WARNING:root:Message from the thread-name++1 END
WARNING:root:Message from the thread-name++0 END
WARNING:root:Message from the thread-name++2 END
WARNING:root:Message from the thread-name++3 END
>>>> Total time comsumed = 5.69s

Process finished with exit code 0

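For this CPU-bound workload, the ranking from fastest to slowest is: multiprocessing (5.69s) > normal sequential execution (11.59s) > threading.Thread (15.92s).

Note that the time-consuming operation here is simulated with continuous computation:

for i in range(10000000):   # simulate a time-consuming operation
    j = int(i) * 10.1

If the time-consuming operation is instead simulated with an idle wait (time.sleep(1), which behaves like waiting on I/O), the ranking for this I/O-wait case becomes: threading.Thread > multiprocessing > normal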

C:\Python27\python.exe E:/VirtualShare/gitLab/GBL-310/GBL/AutoJNI/autoTool/common/batch_swig_runner.py
WARNING:root:Message from the thread-name4 START
WARNING:root:Message from the thread-name2 START
WARNING:root:Message from the thread-name3 START
WARNING:root:Message from the thread-name1 START
WARNING:root:Message from the thread-name3 END
WARNING:root:Message from the thread-name4 END
WARNING:root:Message from the thread-name2 END
WARNING:root:Message from the thread-name1 END
WARNING:root:Message from the times-0 START
Total time comsumed = 1.01s
========================================
WARNING:root:Message from the times-0 END
WARNING:root:Message from the times-1 START
WARNING:root:Message from the times-1 END
WARNING:root:Message from the times-2 START
WARNING:root:Message from the times-2 END
WARNING:root:Message from the times-3 START
WARNING:root:Message from the times-3 END
>>Total time comsumed = 4.00s
----------------------------------------------
WARNING:root:Message from the thread-name++0 START
WARNING:root:Message from the thread-name++1 START
WARNING:root:Message from the thread-name++2 START
WARNING:root:Message from the thread-name++3 START
WARNING:root:Message from the thread-name++0 END
WARNING:root:Message from the thread-name++1 END
WARNING:root:Message from the thread-name++2 END
WARNING:root:Message from the thread-name++3 END
>>>> Total time comsumed = 1.73s

Process finished with exit code 0
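The I/O-wait case can be reproduced on a smaller scale. The following is a minimal sketch, assuming Python 3 and concurrent.futures. The helper names io_task, timed, run_sequential and run_threaded are made up for illustration, and the 0.2 second delay is an arbitrary stand-in for real I/O.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def io_task(seconds):
    """Stand-in for an I/O wait; time.sleep releases the GIL while blocked."""
    time.sleep(seconds)
    return seconds


def timed(func):
    """Return (result, elapsed seconds) for a zero-argument callable."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


def run_sequential(delays):
    # One wait after another: total time is roughly the sum of the delays.
    return [io_task(d) for d in delays]


def run_threaded(delays):
    # One thread per wait: total time is roughly the longest single delay.
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        return list(executor.map(io_task, delays))


delays = [0.2] * 4
seq_result, seq_time = timed(lambda: run_sequential(delays))
thr_result, thr_time = timed(lambda: run_threaded(delays))
print("sequential: %.2fs, threaded: %.2fs" % (seq_time, thr_time))
```

The exact numbers depend on the machine, but the threaded run should take roughly one delay while the sequential run takes roughly four, which matches the 1.01s versus 4.00s pattern in the output above.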

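Why do we get these results?

(1) In the threading case, the GIL acts as one global lock that serializes the execution of Python bytecode, so the threads can only keep a single CPU core busy at any moment. For a pure computation loop the threads also compete for that lock, which adds overhead without adding any parallelism; this is why the threaded run (15.92s) was even slower than the plain sequential loop (11.59s). When an operation such as sleep gives up the CPU, the GIL is released and another thread can take over right away. Switching between threads is cheap enough (cheaper than the multiprocessing approach, where worker processes have to be started), and of course far faster than normal, where each blocking wait runs one after another.

(2) The multiprocessing Pool used here can genuinely use multiple CPUs, because each worker process has its own interpreter and its own GIL. For CPU-bound operations (the for-loop computation above), multiple cores are faster than a single core, which explains the result of the first test scenario.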
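The CPU-bound case can be sketched in the same spirit. This is a minimal illustration, assuming Python 3 (where Pool can be used as a context manager). cpu_task, run_sequential and run_in_processes are hypothetical names, and the sum-of-squares loop is just another stand-in for a computation-heavy task.

```python
import time
from multiprocessing import Pool


def cpu_task(n):
    """CPU-bound stand-in: sum of squares computed in pure Python."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_sequential(sizes):
    return [cpu_task(n) for n in sizes]


def run_in_processes(sizes, workers=4):
    # Each worker is a separate interpreter process with its own GIL.
    with Pool(processes=workers) as pool:
        return pool.map(cpu_task, sizes)


if __name__ == "__main__":
    sizes = [2000000] * 4

    start = time.perf_counter()
    expected = run_sequential(sizes)
    print("sequential: %.2fs" % (time.perf_counter() - start))

    start = time.perf_counter()
    actual = run_in_processes(sizes)
    print("processes:  %.2fs" % (time.perf_counter() - start))

    print("same results:", actual == expected)
```

On a multi-core machine the process pool is usually faster here, but starting the workers has a cost, so for very small workloads the sequential version can still win. The `if __name__ == "__main__":` guard matters on Windows, where worker processes re-import the main module.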
