python3 多线程和多进程

一.线程和进程

1.操作系统中，线程是CPU调度和分派的基本单位，线程依存于程序中

2.操作系统中，进程是系统进行资源分配和调度的一个基本单位，一个程序至少有一个进程

3.一个进程由至少一个线程组成，线程组成进程

4.多进程、多进程实际是进程、线程、进程和线程的并发而不是并行，用来加快程序运行速度

5.Python既支持多线程，也支持多进程。

二.多线程threading

1.python3线程操作中常用模块：_thread和threading，其中一般都用threading模块

2.线程分为：内核线程：由操作系统内核创建和撤销；用户线程：不需要内核支持而在用户程序中实现的线程

3.Python中使用线程有两种方式：函数或者用类来包装线程对象

2. 创建线程

 import threading

 #def main():#定义一个存放多线程的函数

 #   print(threading.active_count())#获取已激活的线程数

 #   print(threading.enumerate()) # see the thread list查询线程信息

 #   print(threading.current_thread())#查询当前运行的线程

 def thread_job():#定义一个线程的工作的函数

     print('This is a thread of %s' % threading.current_thread())

 def main():

     thread = threading.Thread(target=thread_job,)#添加线程，参数为线程的目标（任务）

     thread.start()#执行线程thread

 if __name__ == '__main__':#在该程序中运行

     main()

 ---------------------------------------

 This is a thread of <Thread(Thread-1, started 5664)>

创建线程1

 #调用 _thread 模块中的start_new_thread()函数来产生新线程

 #_thread.start_new_thread ( function, args[, kwargs] )（线程函数，函数的参数tuple，可选参数）

 import _thread

 import time

 # 为线程定义一个函数

 def print_time( threadName, delay):

    count = 0

    while count < 5:

       time.sleep(delay)

       count += 1

       print ("%s: %s" % ( threadName, time.ctime(time.time()) ))

 # 创建两个线程

 try:

    _thread.start_new_thread( print_time, ("Thread-1", 2, ) )

    _thread.start_new_thread( print_time, ("Thread-2", 4, ) )

 except:

    print ("Error: 无法启动线程")

 while 1:

    pass

 ---------------------------------------------------------

 Thread-1: Thu May 17 17:13:01 2018

 Thread-2: Thu May 17 17:13:03 2018

 Thread-1: Thu May 17 17:13:03 2018

 Thread-1: Thu May 17 17:13:05 2018

 Thread-2: Thu May 17 17:13:07 2018

 。。。

创建线程 2

 #从 threading.Thread 继承创建一个新的子类，并实例化后调用 start() 方法启动新线程，即它调用了线程的 run() 方法

 import threading

 import time

 exitFlag = 0

 class myThread (threading.Thread):

     def __init__(self, threadID, name, counter):

         threading.Thread.__init__(self)

         self.threadID = threadID

         self.name = name

         self.counter = counter

     def run(self):

         print ("开始线程：" + self.name)

         print_time(self.name, self.counter, 5)

         print ("退出线程：" + self.name)

 def print_time(threadName, delay, counter):

     while counter:

         if exitFlag:

             threadName.exit()

         time.sleep(delay)

         print ("%s: %s" % (threadName, time.ctime(time.time())))

         counter -= 1

 # 创建新线程

 thread1 = myThread(1, "Thread-1", 1)

 thread2 = myThread(2, "Thread-2", 2)

 # 开启新线程

 thread1.start()

 thread2.start()

 thread1.join()

 thread2.join()

 print ("退出主线程")

 ------------------------------------------------

 开始线程：Thread-1

 开始线程：Thread-2

 Thread-1: Thu May 17 17:18:32 2018

 Thread-2: Thu May 17 17:18:33 2018

 Thread-1: Thu May 17 17:18:33 2018

 。。。

创建线程 3

3.join功能，控制执行顺序

 import threading

 import time

 def thread_job():#定义一个线程

     print('T1 start\n')#开始t1线程

     for i in range(10):#设置10步

         time.sleep(0.1)#任务间隔0.1秒，增加时耗

     print('T1 finish\n')#结束t1线程

 def T2_job():#又定义一个线程

     print('T2 start\n')#开始

     print('T2 finish\n')#结束

 def main():#定义主程序

     thread1 = threading.Thread(target=thread_job, name='T1')#添加线程t1

     thread2 = threading.Thread(target=T2_job, name='T2')#添加线程t2

     thread1.start()#开始运行t1

     thread2.start()#开始运行t2

     #使用join控制多个线程的执行顺序

     thread2.join()#等到t2运行完再进行下一步

     thread1.join()#到t1运行完再进行下一步

     print('all done\n')#表明程序执行完

 if __name__ == '__main__':

     main()

 --------------------------------------------------------

 T1 start

 T2 start

 T2 finish

 T1 finish

 all done

join

4.存储进程结果Queue,多线程调用的函数不能有返回值, 所以使用Queue存储多个线程运算的结果

 #，将数据列表中的数据传入，使用四个线程处理，将结果保存在Queue中，

 # 线程执行完后，从Queue中获取存储的结果

 import threading

 import time

 from queue import Queue#导入队列模块

 def job(l,q):#函数的参数是一个列表l和一个队列q

     #函数的功能是，对列表的每个元素进行平方计算，将结果保存在队列中

     for i in range(len(l)):

         l[i] = l[i]**2

     q.put(l)#把列表l存放进队列q中

 def multithreading():#定义一个多线程函数

     q = Queue()#创建一个空队列，用来保存返回值

     threads = []#定义一个多线程列表

     data = [[1,2,3],[3,4,5],[4,4,4],[5,5,5]]#初始化一个多维数据列表

     for i in range(4):#定义四个线程

         t = threading.Thread(target=job, args=(data[i], q))

         #创建一个线程，任务是job函数，被调用的job函数没有括号，只是一个索引，参数在后面

         t.start()#开始t

         threads.append(t)# #把每个线程append到线程列表中

     for thread in threads:

         thread.join()#分别join四个线程到主线程

     results = []#定义一个空的列表results，将四个线运行后保存在队列中的结果返回给空列表results

     for _ in range(4):

           results.append(q.get()) #q.get()按顺序从q中拿出一个值

     print(results)

 if __name__ == '__main__':

     multithreading()

 -------------------------------------------------

 [[1, 4, 9], [9, 16, 25], [16, 16, 16], [25, 25, 25]]

queue

 import queue

 import threading

 import time

 exitFlag = 0

 class myThread (threading.Thread):

     def __init__(self, threadID, name, q):

         threading.Thread.__init__(self)

         self.threadID = threadID

         self.name = name

         self.q = q

     def run(self):

         print ("开启线程：" + self.name)

         process_data(self.name, self.q)

         print ("退出线程：" + self.name)

 def process_data(threadName, q):

     while not exitFlag:

         queueLock.acquire()

         if not workQueue.empty():

             data = q.get()

             queueLock.release()

             print ("%s processing %s" % (threadName, data))

         else:

             queueLock.release()

         time.sleep(1)

 threadList = ["Thread-1", "Thread-2", "Thread-3"]

 nameList = ["One", "Two", "Three", "Four", "Five"]

 queueLock = threading.Lock()

 workQueue = queue.Queue(10)

 threads = []

 threadID = 1

 # 创建新线程

 for tName in threadList:

     thread = myThread(threadID, tName, workQueue)

     thread.start()

     threads.append(thread)

     threadID += 1

 # 填充队列

 queueLock.acquire()

 for word in nameList:

     workQueue.put(word)

 queueLock.release()

 # 等待队列清空

 while not workQueue.empty():

     pass

 # 通知线程是时候退出

 exitFlag = 1

 # 等待所有线程完成

 for t in threads:

     t.join()

 print ("退出主线程")

 ------------------------------------------------------

 开启线程：Thread-1

 开启线程：Thread-2

 开启线程：Thread-3

 Thread-3 processing One

 Thread-2 processing Two

 Thread-1 processing Three

 Thread-3 processing Four

 Thread-2 processing Five

 退出线程：Thread-3

 退出线程：Thread-2

 退出线程：Thread-1

 退出主线程

queue

5.GIL（Global Interpreter Lock）全局解释锁：它确保任何时候都只有一个Python线程执行，导致了多线程无法利用多核。

实质上python中的多线程就是节省了i/o时间

 # 测试 GIL

 import threading

 from queue import Queue

 import copy

 import time

 def job(l, q):#传入一个列表和一个队列，

     #操作：计算列表的总和，并把它传入到队列中

     res = sum(l)

     q.put(res)

 def multithreading(l):#多线程函数

     q = Queue()

     threads = []

     for i in range(4):#同时执行4个线程

         t = threading.Thread(target=job, args=(copy.copy(l), q), name='T%i' % i)

         t.start()#t开始执行

         threads.append(t)#线程列表中添加t

     [t.join() for t in threads]

     total = 0

     for _ in range(4):

         total += q.get()

     print(total)

 def normal(l):#计算列表的总值

     total = sum(l)

     print(total)

 if __name__ == '__main__':

     l = list(range(1000000))

     # 1.正常执行：一个列表扩展4倍，并打印出执行时间

     s_t = time.time()

     normal(l*4)

     print('normal: ',time.time()-s_t)

     #2.线程执行：建立四个线程执行

     s_t = time.time()

     multithreading(l)

     print('multithreading: ', time.time()-s_t)

 -------------------------------------------------------------

 1999998000000

 normal:  0.24673938751220703

 1999998000000

 multithreading:  0.244584321975708

GIL

6.lock线程锁：lock在不同线程使用同一共享内存时（只允许一个线程修改数据），能够确保线程之间互不影响，但是阻止了多线程并行，也可能导致死锁

如果多个线程共同对某个数据修改，则可能出现不可预料的结果，为了保证数据的正确性，需要对多个线程进行同步

多线程和多进程最大的不同在于:多进程中，同一个变量，各自有一份拷贝存在于每个进程中，互不影响，而多线程中，所有变量都由所有线程共享，所以，任何一个变量都可以被任何一个线

程修改，因此，线程之间共享数据最大的危险在于多个线程同时改一个变量，把内容给改乱了

 #使用lock的方法是， 在每个线程执行运算修改共享内存之前，执行lock.acquire()将共享内存上锁，

 # 确保当前线程执行时，内存不会被其他线程访问，

 # 执行运算完毕后，使用lock.release()将锁打开， 保证其他的线程可以使用该共享内存。

 import threading

 def job1():

     global A, lock#使用线程锁

     lock.acquire()#lock.acquire()将共享内存上锁

     for i in range(10):

         A += 1

         print('job1', A)

     lock.release()#lock.release()将锁打开， 保证其他的线程可以使用该共享内存

 def job2():

     global A, lock

     lock.acquire()#确保当前线程执行时，内存不会被其他线程访问

     for i in range(10):

         A += 10

         print('job2', A)

     lock.release()#保证其他的线程可以使用该共享内存

 if __name__ == '__main__':

     lock = threading.Lock()#创建一个线程锁

     A = 0

     t1 = threading.Thread(target=job1)

     t2 = threading.Thread(target=job2)

     t1.start()

     t2.start()

     t1.join()

     t2.join()

 ------------------------------------------------------------

 job1 1

 job1 2

 job1 3

 job1 4

 job1 5

 job1 6

 job1 7

 job1 8

 job1 9

 job1 10

 job2 20

 job2 30

 job2 40

 job2 50

 job2 60

 job2 70

 job2 80

 job2 90

 job2 100

lock

 #使用 Thread 对象的 Lock 和 Rlock 可以实现简单的线程同步

 import threading

 import time

 class myThread (threading.Thread):

     def __init__(self, threadID, name, counter):

         threading.Thread.__init__(self)

         self.threadID = threadID

         self.name = name

         self.counter = counter

     def run(self):

         print ("开启线程： " + self.name)

         # 获取锁，用于线程同步

         threadLock.acquire()

         print_time(self.name, self.counter, 3)

         # 释放锁，开启下一个线程

         threadLock.release()

 def print_time(threadName, delay, counter):

     while counter:

         time.sleep(delay)

         print ("%s: %s" % (threadName, time.ctime(time.time())))

         counter -= 1

 threadLock = threading.Lock()

 threads = []

 # 创建新线程

 thread1 = myThread(1, "Thread-1", 1)

 thread2 = myThread(2, "Thread-2", 2)

 # 开启新线程

 thread1.start()

 thread2.start()

 # 添加线程到线程列表

 threads.append(thread1)

 threads.append(thread2)

 # 等待所有线程完成

 for t in threads:

     t.join()

 print ("退出主线程")

 ------------------------------------------------------

 开启线程： Thread-1

 开启线程： Thread-2

 Thread-1: Thu May 17 17:25:58 2018

 Thread-1: Thu May 17 17:25:59 2018

 Thread-1: Thu May 17 17:26:00 2018

 Thread-2: Thu May 17 17:26:02 2018

 Thread-2: Thu May 17 17:26:04 2018

 Thread-2: Thu May 17 17:26:06 2018

 退出主线程

lock 补充

递归锁RLcok类的用法和Lock类一模一样，但它支持嵌套，在多个锁没有释放的时候一般会使用使用RLcok类

信号量（BoundedSemaphore类）用法和Lock类一模一样，但同时允许一定数量的线程更改数据

7.补充

三.多进程multiprocessing(或者说是多核)

1.多进程 Multiprocessing 和多线程 threading 类似，但是多核使用，用来弥补 threading 的一些劣势, 比如GIL

2.创建进程

 import multiprocessing as mp

 import threading as td

 def job(a,d):#定义一个被线程和进程调用的函数

     print('aaaaa')

 if __name__=='__main__':

 # 创建线程和进程

     t1 = td.Thread(target=job,args=(1,2))

     p1 = mp.Process(target=job,args=(1,2))

     t1.start()

     p1.start()

     t1.join()

     p1.join()

 -------------------------------------------------------

 aaaaa

 aaaaa

进程创建

3.Queue:将每个核或线程的运算结果放在队里中，等到每个线程或核运行完毕后再从队列中取出结果，继续加载运算。

 import multiprocessing as mp

 #调用的函数不能有返回值

 def job(q):

     res = 0

     for i in range(1000):

         res += i+i**2+i**3

     q.put(res) # queue把结果放到q中

 if __name__ == '__main__':

     q = mp.Queue()

     p1 = mp.Process(target=job, args=(q,))

     #要加逗号，否则TypeError: 'Queue' object is not iterable

     p2 = mp.Process(target=job, args=(q,))

     p1.start()

     p2.start()

     p1.join()

     p2.join()

     res1 = q.get()#获取q中的结果

     res2 = q.get()

     print(res1+res2)

 --------------------------------------------------------------

 499667166000

queue

4.效率对比

 #多线程与多进程的效率对比

 import multiprocessing as mp

 import threading as td

 import time

 def job(q):

     res = 0

     for i in range(1000000):

         res += i+i**2+i**3

     q.put(res) # queue

 def multicore():

     q = mp.Queue()

     p1 = mp.Process(target=job, args=(q,))

     p2 = mp.Process(target=job, args=(q,))

     p1.start()

     p2.start()

     p1.join()

     p2.join()

     res1 = q.get()

     res2 = q.get()

     print('multicore:' , res1+res2)

 def normal():#普通运算

     res = 0

     for _ in range(2):

         for i in range(1000000):

             res += i+i**2+i**3

     print('normal:', res)

 def multithread():

     q = mp.Queue()

     t1 = td.Thread(target=job, args=(q,))

     t2 = td.Thread(target=job, args=(q,))

     t1.start()

     t2.start()

     t1.join()

     t2.join()

     res1 = q.get()

     res2 = q.get()

     print('multithread:', res1+res2)

 if __name__ == '__main__':

     st = time.time()

     normal()

     st1= time.time()

     print('normal time:', st1 - st)

     st2 = time.time()

     multithread()

     st3 = time.time()

     print('multithread time:', st3 - st2)

     st4 = time.time()

     multicore()

     st5 = time.time()

     print('multicore time:', st5-st4)

 -----------------------------------------------------------------

 normal: 499999666667166666000000

 normal time: 1.8749690055847168

 multithread: 499999666667166666000000

 multithread time: 1.583195686340332

 multicore: 499999666667166666000000

 multicore time: 1.2573533058166504

效率对比

5.进程池pool

 #进程池就是我们将所要运行的东西，放到池子里，Python会自行解决多进程的问题

 import multiprocessing as mp

 def job(x):

     return x*x

 def multicore():

     pool = mp.Pool(processes=2)#创建进程池，定义调用2个cpu核

     # 让池子对应某一个函数，我们向池子里丢数据，池子就会返回函数返回的值

     # Pool和之前的Process的不同点是丢向Pool的函数有返回值，而Process的没有返回值

     res = pool.map(job, range(10))

     #用map()获取结果，在map()中需要放入函数和需要迭代运算的值，然后它会自动分配给CPU核，返回结果

     print(res)

     res = pool.apply_async(job, (2,))

     # apply_async()返回结果，只能传递一个值，它只会放入一个核进行运算，但是传入值时要注意是可迭代的，

     # 所以在传入值后需要加逗号, 同时需要用get()方法获取返回值

     print(res.get())

     # 将apply_async()放入迭代器中，来输出多个结果

     multi_res =[pool.apply_async(job, (i,)) for i in range(10)]

     print([res.get() for res in multi_res])

 if __name__ == '__main__':

     multicore()

 --------------------------------------------------

 [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

 4

 [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

pool

6.进程锁lock

 # 共享内存

 # import multiprocessing as mp

 #

 # # Value数据存储在一个共享的内存表中

 # value1 = mp.Value('i', 0)

 # value2 = mp.Value('d', 3.14)

 # #d和i参数用来设置数据类型的

 #

 # #Array类，可以和共享内存交互，来实现在进程之间共享数据。

 # array = mp.Array('i', [1, 2, 3, 4])

 # # 它只能是一维的，不能是多维的

 #进程锁lock

 import multiprocessing as mp

 import time

 def job(v, num, l):

     l.acquire()#设置进程锁的使用，保证运行时一个进程的对锁内内容的独占

     for _ in range(10):

         time.sleep(0.1)

         v.value += num# v.value获取共享变量值

         print(v.value)

     l.release()#进程锁保证了进程p1的完整运行，然后才进行了进程p2的运行

 def multicore():

     l = mp.Lock()# 定义一个进程锁

     v = mp.Value('i', 0) # 定义共享变量

     # 设定不同的number看如何抢夺内存

     p1 = mp.Process(target=job, args=(v, 1, l))

     p2 = mp.Process(target=job, args=(v, 3, l))

     p1.start()

     p2.start()

     p1.join()

     p2.join()

 if __name__ == '__main__':

     multicore()

 -------------------------------------------------

 1

 2

 3

 4

 5

 6

 7

 8

 9

 10

 13

 16

 19

 22

 25

 28

 31

 34

 37

 40

进程锁lock

python3 多线程和多进程的更多相关文章

Python3 多线程、多进程
python中的线程是假线程,不同线程之间的切换是需要耗费资源的,因为需要存储线程的上下文,不断的切换就会耗费资源.. python多线程适合io操作密集型的任务(如socket server 网络并 ...
多线程、多进程、协程、缓存（memcache、redis）
本节内容: 线程: a:基本的使用: 创建线程: 1:方法 import threading def f1(x): print(x) if __name__=='__main__': t=thread ...
Python多线程和多进程谁更快？
python多进程和多线程谁更快 python3.6 threading和multiprocessing 四核+三星250G-850-SSD 自从用多进程和多线程进行编程,一致没搞懂到底谁更快.网上很 ...
重写父类、多线程、多进程、logging模块
1. 重写父类 1) 子类定义父类同名函数后,父类函数被覆盖: 2) 如果需要,可以在子类中调用父类方法:”父类名.方法(self)”.”父类名().方法()”.”super(子类名,self). ...
Python 多线程、多进程（一）之源码执行流程、GIL
Python 多线程.多进程 (一)之源码执行流程.GIL Python 多线程.多进程 (二)之多线程.同步.通信 Python 多线程.多进程 (三)之线程进程对比.多线程一.python ...
Python 多线程、多进程（二）之多线程、同步、通信
Python 多线程.多进程 (一)之源码执行流程.GIL Python 多线程.多进程 (二)之多线程.同步.通信 Python 多线程.多进程 (三)之线程进程对比.多线程一.python ...
Python之多线程与多进程（二）
多进程上一章:Python多线程与多进程(一) 由于GIL的存在,Python的多线程并没有实现真正的并行.因此,一些问题使用threading模块并不能解决不过Python为并行提供了一个替代方 ...
python多线程与多进程及其区别
个人一直觉得对学习任何知识而言,概念是相当重要的.掌握了概念和原理,细节可以留给实践去推敲.掌握的关键在于理解,通过具体的实例和实际操作来感性的体会概念和原理可以起到很好的效果.本文通过一些具体的例子 ...
python的多线程和多进程（一）
在进入主题之前,我们先学习一下并发和并行的概念: --并发:在操作系统中,并发是指一个时间段中有几个程序都处于启动到运行完毕之间,且这几个程序都是在同一个处理机上运行.但任一时刻点上只有一个程序在处理 ...

随机推荐

VUE-CLI项目同一局域网手机通过IP访问电脑本地项目
0.找到config文件夹下的index.js文件,修改host内容为hots:'0.0.0.0',此时重新运行项目,则其他设备可以通过ip进行访问 1.首先确保电脑防火墙或者杀毒软件关闭,因为大多数 ...
机器学习_第一节_numpy
今天学了机器学习第一节, 希望能够坚持下去,其实不在乎课程是什么?关键要坚持下去今天主要学了对矩阵的一些操作, 用的库是numpy 开始从头到尾捋一遍, 作者说的很有道理,学计算机,动手能力要强,所 ...
Spark简介 --大数据
一.Spark是什么? 快速且通用的集群计算平台二.Spark的特点: 快速:Spark扩充流行的Mapreduce计算模型,是基于内存的计算通用:Spark的设计容纳了其它分布式系统拥有的功能, ...
Snapshot Array
Implement a SnapshotArray that supports the following interface: SnapshotArray(int length) initializ ...
PAT甲级并查集相关题_C++题解
并查集 PAT (Advanced Level) Practice 并查集相关题 <算法笔记> 重点摘要 1034 Head of a Gang (30) 1107 Social Clu ...
flask返回自定义的Response
from json import dumps from flask import Response from flask_api import status from protocol.errors_ ...
【Scratch】编程？一节课就教会你！其实我们不用一个个学习如何使用代码。
第199篇文章老丁的课程在很多教程里面,大家都喜欢把模块拿出来一个个讲述其功能. 这样做的好处是,可以把每个代码模块的功能讲的很清楚.但最最讨厌的问题也随之而来…… 举个例子,当你学习英语的时候, ...
C#记录日志到本地文件工具类
using System; using System.Diagnostics; using System.IO; using System.Threading; using System.Web; n ...
Android开发中常见问题分析及解决
最近公司有新的业务需求,需要开发一款APP,因为我开发过Android APP(我想告诉他们,那是4年前的事了,嘤嘤嘤),就把开发任务交给我了,当然也不是我一个人啦,让我组开发小组,说白了,就是让我来 ...
Go Select使用
原文:https://golangbot.com/pointers/ 作者:Nick Coghlan 译者:Noluye 什么是 select? select 语句用于在多个发送/接收信道操作中进行选 ...

python3 多线程和多进程

python3 多线程和多进程的更多相关文章

随机推荐

热门专题