multiprocessing module in python(转)
序.multiprocessing
python中的多线程其实并不是真正的多线程,如果想要充分地使用多核CPU的资源,在python中大部分情况需要使用多进程。Python提供了非常好用的多进程包multiprocessing,只需要定义一个函数,Python会完成其他所有事情。借助这个包,可以轻松完成从单进程到并发执行的转换。multiprocessing支持子进程、通信和共享数据、执行不同形式的同步,提供了Process、Queue、Pipe、Lock等组件。
1.Process
创建进程的类:Process([group [, target [, name [, args [, kwargs]]]]]),target表示调用对象,args表示调用对象的位置参数元组。kwargs表示调用对象的字典。name为别名。group实质上不使用。
方法:is_alive()、join([timeout])、run()、start()、terminate()。其中,Process以start()启动某个进程。
属性:authkey、daemon(要通过start()设置)、exitcode(进程在运行时为None、如果为–N,表示被信号N结束)、name、pid。其中daemon是父进程终止后自动终止,且自己不能产生新进程,必须在start()之前设置。
例1.1:创建函数并将其作为单个进程
import multiprocessing
import time def worker(interval):
n = 5
while n > 0:
print("The time is {0}".format(time.ctime()))
time.sleep(interval)
n -= 1 if __name__ == "__main__":
p = multiprocessing.Process(target = worker, args = (3,))
p.start()
print "p.pid:", p.pid
print "p.name:", p.name
print "p.is_alive:", p.is_alive()
p.pid: 8736
p.name: Process-1
p.is_alive: True
The time is Tue Apr 21 20:55:12 2015
The time is Tue Apr 21 20:55:15 2015
The time is Tue Apr 21 20:55:18 2015
The time is Tue Apr 21 20:55:21 2015
The time is Tue Apr 21 20:55:24 2015
例1.2:创建函数并将其作为多个进程
import multiprocessing
import time def worker_1(interval):
print "worker_1"
time.sleep(interval)
print "end worker_1" def worker_2(interval):
print "worker_2"
time.sleep(interval)
print "end worker_2" def worker_3(interval):
print "worker_3"
time.sleep(interval)
print "end worker_3" if __name__ == "__main__":
p1 = multiprocessing.Process(target = worker_1, args = (2,))
p2 = multiprocessing.Process(target = worker_2, args = (3,))
p3 = multiprocessing.Process(target = worker_3, args = (4,)) p1.start()
p2.start()
p3.start() print("The number of CPU is:" + str(multiprocessing.cpu_count()))
for p in multiprocessing.active_children():
print("child p.name:" + p.name + "\tp.id" + str(p.pid))
print "END!!!!!!!!!!!!!!!!!"
结果
例1.3:将进程定义为类
import multiprocessing
import time class ClockProcess(multiprocessing.Process):
def __init__(self, interval):
multiprocessing.Process.__init__(self)
self.interval = interval def run(self):
n = 5
while n > 0:
print("the time is {0}".format(time.ctime()))
time.sleep(self.interval)
n -= 1 if __name__ == '__main__':
p = ClockProcess(3)
p.start()
注:进程p调用start()时,自动调用run()
结果
the time is Tue Apr 21 20:31:30 2015
the time is Tue Apr 21 20:31:33 2015
the time is Tue Apr 21 20:31:36 2015
the time is Tue Apr 21 20:31:39 2015
the time is Tue Apr 21 20:31:42 2015
例1.4:daemon程序对比结果
#1.4-1 不加daemon属性
import multiprocessing
import time def worker(interval):
print("work start:{0}".format(time.ctime()));
time.sleep(interval)
print("work end:{0}".format(time.ctime())); if __name__ == "__main__":
p = multiprocessing.Process(target = worker, args = (3,))
p.start()
print "end!"
结果
end!
work start:Tue Apr 21 21:29:10 2015
work end:Tue Apr 21 21:29:13 2015
#1.4-2 加上daemon属性
import multiprocessing
import time def worker(interval):
print("work start:{0}".format(time.ctime()));
time.sleep(interval)
print("work end:{0}".format(time.ctime())); if __name__ == "__main__":
p = multiprocessing.Process(target = worker, args = (3,))
p.daemon = True
p.start()
print "end!"
结果
end!
注:因子进程设置了daemon属性,主进程结束,它们就随着结束了。
#1.4-3 设置daemon执行完结束的方法
import multiprocessing
import time def worker(interval):
print("work start:{0}".format(time.ctime()));
time.sleep(interval)
print("work end:{0}".format(time.ctime())); if __name__ == "__main__":
p = multiprocessing.Process(target = worker, args = (3,))
p.daemon = True
p.start()
p.join()
print "end!"
结果
work start:Tue Apr 21 22:16:32 2015
work end:Tue Apr 21 22:16:35 2015
end!
2.Lock
当多个进程需要访问共享资源的时候,Lock可以用来避免访问的冲突。
import multiprocessing
import sys def worker_with(lock, f):
with lock:
fs = open(f, 'a+')
n = 10
while n > 1:
fs.write("Lockd acquired via with\n")
n -= 1
fs.close() def worker_no_with(lock, f):
lock.acquire()
try:
fs = open(f, 'a+')
n = 10
while n > 1:
fs.write("Lock acquired directly\n")
n -= 1
fs.close()
finally:
lock.release() if __name__ == "__main__":
lock = multiprocessing.Lock()
f = "file.txt"
w = multiprocessing.Process(target = worker_with, args=(lock, f))
nw = multiprocessing.Process(target = worker_no_with, args=(lock, f))
w.start()
nw.start()
print "end"
结果(输出文件)
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
3.Semaphore
Semaphore用来控制对共享资源的访问数量,例如池的最大连接数。
import multiprocessing
import time def worker(s, i):
s.acquire()
print(multiprocessing.current_process().name + "acquire");
time.sleep(i)
print(multiprocessing.current_process().name + "release\n");
s.release() if __name__ == "__main__":
s = multiprocessing.Semaphore(2)
for i in range(5):
p = multiprocessing.Process(target = worker, args=(s, i*2))
p.start()
结果:
Process-1acquire
Process-1release Process-2acquire
Process-3acquire
Process-2release Process-5acquire
Process-3release Process-4acquire
Process-5release Process-4release
4.Event
Event用来实现进程间同步通信。
import multiprocessing
import time def wait_for_event(e):
print("wait_for_event: starting")
e.wait()
print("wairt_for_event: e.is_set()->" + str(e.is_set())) def wait_for_event_timeout(e, t):
print("wait_for_event_timeout:starting")
e.wait(t)
print("wait_for_event_timeout:e.is_set->" + str(e.is_set())) if __name__ == "__main__":
e = multiprocessing.Event()
w1 = multiprocessing.Process(name = "block",
target = wait_for_event,
args = (e,)) w2 = multiprocessing.Process(name = "non-block",
target = wait_for_event_timeout,
args = (e, 2))
w1.start()
w2.start() time.sleep(3) e.set()
print("main: event is set")
结果
wait_for_event: starting
wait_for_event_timeout:starting
wait_for_event_timeout:e.is_set->False
main: event is set
wairt_for_event: e.is_set()->True
5.Queue
Queue是多进程安全的队列,可以使用Queue实现多进程之间的数据传递。put方法用以插入数据到队列中,put方法还有两个可选参数:blocked和timeout。如果blocked为True(默认值),并且timeout为正值,该方法会阻塞timeout指定的时间,直到该队列有剩余的空间。如果超时,会抛出Queue.Full异常。如果blocked为False,但该Queue已满,会立即抛出Queue.Full异常。
import multiprocessing def writer_proc(q):
try:
q.put(1, block = False)
except:
pass def reader_proc(q):
try:
print q.get(block = False)
except:
pass if __name__ == "__main__":
q = multiprocessing.Queue()
writer = multiprocessing.Process(target=writer_proc, args=(q,))
writer.start() reader = multiprocessing.Process(target=reader_proc, args=(q,))
reader.start() reader.join()
writer.join()
结果
1
6.Pipe
Pipe方法返回(conn1, conn2)代表一个管道的两个端。Pipe方法有duplex参数,如果duplex参数为True(默认值),那么这个管道是全双工模式,也就是说conn1和conn2均可收发。duplex为False,conn1只负责接受消息,conn2只负责发送消息。
import multiprocessing
import time def proc1(pipe):
while True:
for i in xrange(10000):
print "send: %s" %(i)
pipe.send(i)
time.sleep(1) def proc2(pipe):
while True:
print "proc2 rev:", pipe.recv()
time.sleep(1) def proc3(pipe):
while True:
print "PROC3 rev:", pipe.recv()
time.sleep(1) if __name__ == "__main__":
pipe = multiprocessing.Pipe()
p1 = multiprocessing.Process(target=proc1, args=(pipe[0],))
p2 = multiprocessing.Process(target=proc2, args=(pipe[1],))
#p3 = multiprocessing.Process(target=proc3, args=(pipe[1],)) p1.start()
p2.start()
#p3.start() p1.join()
p2.join()
#p3.join()
结果
7.Pool
在利用Python进行系统管理的时候,特别是同时操作多个文件目录,或者远程控制多台主机,并行操作可以节约大量的时间。当被操作对象数目不大时,可以直接利用multiprocessing中的Process动态成生多个进程,十几个还好,但如果是上百个,上千个目标,手动的去限制进程数量却又太过繁琐,此时可以发挥进程池的功效。
Pool可以提供指定数量的进程,供用户调用,当有新的请求提交到pool中时,如果池还没有满,那么就会创建一个新的进程用来执行该请求;但如果池中的进程数已经达到规定最大值,那么该请求就会等待,直到池中有进程结束,才会创建新的进程来它。
例7.1:使用进程池
#coding: utf-8
import multiprocessing
import time def func(msg):
print "msg:", msg
time.sleep(3)
print "end" if __name__ == "__main__":
pool = multiprocessing.Pool(processes = 3)
for i in xrange(4):
msg = "hello %d" %(i)
pool.apply_async(func, (msg, )) #维持执行的进程总数为processes,当一个进程执行完毕后会添加新的进程进去 print "Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~"
pool.close()
pool.join() #调用join之前,先调用close函数,否则会出错。执行完close后不会有新的进程加入到pool,join函数等待所有子进程结束
print "Sub-process(es) done."
一次执行结果
mMsg: hark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~ello 0 msg: hello 1
msg: hello 2
end
msg: hello 3
end
end
end
Sub-process(es) done.
函数解释:
- apply_async(func[, args[, kwds[, callback]]]) 它是非阻塞,apply(func[, args[, kwds]])是阻塞的(理解区别,看例1例2结果区别)
- close() 关闭pool,使其不在接受新的任务。
- terminate() 结束工作进程,不在处理未完成的任务。
- join() 主进程阻塞,等待子进程的退出, join方法要在close或terminate之后使用。
执行说明:创建一个进程池pool,并设定进程的数量为3,xrange(4)会相继产生四个对象[0, 1, 2, 4],四个对象被提交到pool中,因pool指定进程数为3,所以0、1、2会直接送到进程中执行,当其中一个执行完事后才空出一个进程处理对象3,所以会出现输出“msg: hello 3”出现在”end”后。因为为非阻塞,主函数会自己执行自个的,不搭理进程的执行,所以运行完for循环后直接输出“mMsg: hark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~”,主程序在pool.join()处等待各个进程的结束。
例7.2:使用进程池(阻塞)
coding: utf-8
import multiprocessing
import time def func(msg):
print "msg:", msg
time.sleep(3)
print "end" if __name__ == "__main__":
pool = multiprocessing.Pool(processes = 3)
for i in xrange(4):
msg = "hello %d" %(i)
pool.apply(func, (msg, )) #维持执行的进程总数为processes,当一个进程执行完毕后会添加新的进程进去 print "Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~"
pool.close()
pool.join() #调用join之前,先调用close函数,否则会出错。执行完close后不会有新的进程加入到pool,join函数等待所有子进程结束
print "Sub-process(es) done."
一次执行的结果
msg: hello 0
end
msg: hello 1
end
msg: hello 2
end
msg: hello 3
end
Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
Sub-process(es) done.
例7.3:使用进程池,并关注结果
import multiprocessing
import time def func(msg):
print "msg:", msg
time.sleep(3)
print "end"
return "done" + msg if __name__ == "__main__":
pool = multiprocessing.Pool(processes=4)
result = []
for i in xrange(3):
msg = "hello %d" %(i)
result.append(pool.apply_async(func, (msg, )))
pool.close()
pool.join()
for res in result:
print ":::", res.get()
print "Sub-process(es) done."
一次执行结果
msg: hello 0
msg: hello 1
msg: hello 2
end
end
end
::: donehello 0
::: donehello 1
::: donehello 2
Sub-process(es) done
例7.4:使用多个进程池
#coding: utf-8
import multiprocessing
import os, time, random def Lee():
print "\nRun task Lee-%s" %(os.getpid()) #os.getpid()获取当前的进程的ID
start = time.time()
time.sleep(random.random() * 10) #random.random()随机生成0-1之间的小数
end = time.time()
print 'Task Lee, runs %0.2f seconds.' %(end - start) def Marlon():
print "\nRun task Marlon-%s" %(os.getpid())
start = time.time()
time.sleep(random.random() * 40)
end=time.time()
print 'Task Marlon runs %0.2f seconds.' %(end - start) def Allen():
print "\nRun task Allen-%s" %(os.getpid())
start = time.time()
time.sleep(random.random() * 30)
end = time.time()
print 'Task Allen runs %0.2f seconds.' %(end - start) def Frank():
print "\nRun task Frank-%s" %(os.getpid())
start = time.time()
time.sleep(random.random() * 20)
end = time.time()
print 'Task Frank runs %0.2f seconds.' %(end - start) if __name__=='__main__':
function_list= [Lee, Marlon, Allen, Frank]
print "parent process %s" %(os.getpid()) pool=multiprocessing.Pool(4)
for func in function_list:
pool.apply_async(func) #Pool执行函数,apply执行函数,当有一个进程执行完毕后,会添加一个新的进程到pool中 print 'Waiting for all subprocesses done...'
pool.close()
pool.join() #调用join之前,一定要先调用close() 函数,否则会出错, close()执行后不会有新的进程加入到pool,join函数等待素有子进程结束
print 'All subprocesses done.'
结果:
parent process 7704 Waiting for all subprocesses done...
Run task Lee-6948 Run task Marlon-2896 Run task Allen-7304 Run task Frank-3052
Task Lee, runs 1.59 seconds.
Task Marlon runs 8.48 seconds.
Task Frank runs 15.68 seconds.
Task Allen runs 18.08 seconds.
All subprocesses done.
multiprocessing module in python(转)的更多相关文章
- Pyperclip – A cross-platform clipboard module for Python
Usage is simple: import pyperclip pyperclip.copy('The text to be copied to the clipboard.') spam = p ...
- AttributeError: module ‘tensorflow.python.ops.nn’ has no attribute ‘leaky_relu’
#AttributeError: module 'tensorflow.python.ops.nn' has no attribute 'leaky_relu' 的原因主要是版本的问题 解决方法是更新 ...
- pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
# 背景 安装pip后发现执行pip install pytest,提示下面错误 pip is configured with locations that require TLS/SSL, howe ...
- 启动Tensorboard时发生错误:class BeholderHook(tf.estimator.SessionRunHook): AttributeError: module 'tensorflow.python.estimator.estimator_lib' has no attribute 'SessionRunHook'
报错:class BeholderHook(tf.estimator.SessionRunHook):AttributeError: module 'tensorflow.python.estimat ...
- the ssl module in Python is not available错误解决
在使用pip安装pymongo的过程中报错,提示如下: $ pip3 install pymongo pip is configured with locations that require TLS ...
- 多进程(multiprocessing module)
一.多进程 1.1 多进程的概念 由于GIL的存在,python中的多线程其实并不是真正的多线程,如果想要充分地使用多核CPU的资源,在python中大部分情况需要使用多进程.Python提供了非常好 ...
- pip install 时报错 pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
pip install 时报错: pip is configured with locations that require TLS/SSL, however the ssl module in Py ...
- Httpd服务入门知识-Httpd服务常见配置案例之MPM( Multi-Processing Module)多路处理模块
Httpd服务入门知识-Httpd服务常见配置案例之MPM( Multi-Processing Module)多路处理模块 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 一.M ...
- 【Linux】CentOS6上安装Python3.7(config、make、make install)及“No module named '_ctypes'”/pip install时“ssl module in Python is not available.”的解决
1.下载安装包 https://www.python.org/ftp/python/ 该目录下选择所需要的版本进行下载.解压. wget https://www.python.org/ftp/pyth ...
随机推荐
- JavaScript语言知识收藏
接触Web开发也已经有一段时间了,对javascript的认识也比以前有了更加深入的认识了,所以觉得应该整理一下. 一.JavaScript不支持函数(方法)的重载,用一个例子证明如下: functi ...
- HTML5实现屏幕手势解锁(转载)
来源:https://github.com/lvming6816077/H5lockhttp://threejs.org/examples/http://www.inf.usi.ch/phd/wett ...
- Access-Control-Allow-Origin: Dealing with CORS Errors in Angular
https://daveceddia.com/access-control-allow-origin-cors-errors-in-angular/ Getting this error in you ...
- css中inline、block、inline-block的区别
http://www.cnblogs.com/fxair/archive/2012/07/05/2577280.html display:inline就是将元素显示为块级元素. block元素的特点是 ...
- PHP学习笔记:keditor的使用
keditor时一个免费的开源编辑器,很多公司在使用(百度编辑器也不错).最近为了做一个客户信息管理系统,在发送邮件模块用到这个编辑器,也算学习一下新的东西. 第一步:下载编辑器 到它的官网下载:ht ...
- Ahjesus Nodejs02 使用集成开发环境
下载最新版webstorm, 选择此集成开发环境是因为支持性较好,在vs下也有插件支持,不过感觉有些牵强 附vs插件 NTVS 详细介绍 安装好以后就需要配置npm NPM 国内高速镜像 source ...
- [.NET] 使用C#开发SQL Function来提供服务 - 简讯发送
[.NET] 使用C#开发SQL Function来提供服务 - 简讯发送 范例下载 范例程序代码:点此下载 问题情景 在「使用C#开发SQL Function来提供数据 - 天气预报」这篇文章中,介 ...
- C#获取本地系统日期格式
我们可以通过使用DataTime这个类来获取当前的时间.通过调用类中的各种方法我们可以获取不同的时间:如:日期(2008-09-04).时间(12:12:12).日期+时间(2008-09-04 12 ...
- android Android-PullToRefresh 下拉刷新
1.github下载地址 原作者: https://github.com/chrisbanes/Android-PullToRefresh 我自己的: https://github.com/zyj ...
- [Android]编译错误:Could not get unknown property 'release' for SigningConfig container
使用Gradle进行安卓编译时,出现如下错误: Could not get unknown property 'release' for SigningConfig container. 原因: 在主 ...