In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)
Here is an example: the result practically never ends up as 0. How to fix this properly is covered later.
import threading

total = 0

def add():
    global total
    for i in range(1000000):
        total += 1

def desc():
    global total
    for i in range(1000000):
        total -= 1

thread1 = threading.Thread(target=add)
thread2 = threading.Thread(target=desc)
thread1.start()
thread2.start()
thread1.join()
thread2.join()
print(total)  # some arbitrary value, almost never 0
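One common fix, sketched minimally here (an assumption on my part: a plain threading.Lock serializing every update to the shared counter; the detailed solution comes later as noted above):

lock = threading.Lock()

def add():
    global total
    for i in range(1000000):
        with lock:        # only one thread may touch total at a time
            total += 1

def desc():
    global total
    for i in range(1000000):
        with lock:
            total -= 1

# run with the same Thread setup as above; print(total) now reliably shows 0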
# Submitting tasks in batch
import time
from concurrent.futures import ThreadPoolExecutor, as_completed, wait

def get_html(times):
    print("start sleep {}".format(times))
    time.sleep(times)
    print("get page {} success".format(times))
    return times

oneList = [2, 4, 3]
executor = ThreadPoolExecutor(max_workers=2)

# allTask = [executor.submit(get_html, each) for each in oneList]
# wait(allTask)  # block until every task has finished
# # as_completed yields each future as soon as it finishes (fastest first)
# for future in as_completed(allTask):  # only completed futures come out
#     data = future.result()
#     print("! get {}".format(str(data)))

# Alternative: executor.map (but results come back in oneList order)
for data in executor.map(get_html, oneList):
    print("! get {}".format(str(data)))
Multiprocessing
For CPU-bound operations, use multiprocessing (see the timing sketch below).
For I/O-bound operations, use multithreading.
Switching between processes costs more than switching between threads.
In practice multithreading is the usual choice, and the multiprocessing and multithreading APIs are almost identical.
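To see the thread/process difference on a CPU-bound task, here is a minimal timing sketch (the fib workload, worker count, and task sizes are illustrative assumptions):

import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def fib(n):
    # deliberately CPU-bound recursive Fibonacci
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def timed(executor_cls):
    start = time.time()
    with executor_cls(max_workers=4) as executor:
        list(executor.map(fib, [30] * 8))
    return time.time() - start

if __name__ == "__main__":
    # threads gain nothing here because the GIL serializes the bytecode
    print("threads:   {:.2f}s".format(timed(ThreadPoolExecutor)))
    # processes run on separate cores, each with its own GIL
    print("processes: {:.2f}s".format(timed(ProcessPoolExecutor)))

On a multi-core machine the process pool typically finishes several times faster, while the thread pool is no faster than running the tasks one after another.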
import time
from concurrent.futures import ProcessPoolExecutor, as_completed

def random_sleep(n):
    time.sleep(n)
    return n

if __name__ == "__main__":
    with ProcessPoolExecutor(3) as executor:
        all_task = [executor.submit(random_sleep, num) for num in [2] * 30]
        start_time = time.time()
        for future in as_completed(all_task):
            data = future.result()
            print("exe result: {}".format(data))
        print("last time is: {}".format(time.time() - start_time))
Shared global variables do not work across processes (each child process gets its own copy).
multiprocessing.Queue cannot be used with a Pool process pool.
For communication between pool workers, use the queue from Manager, i.e. Manager().Queue().
Pipe can also be used for communication, but only between exactly two processes (both patterns are sketched after the imports below).
from queue import Queue            # cannot be used across processes
from multiprocessing import Queue  # the normal choice for multiprocessing (Process objects)
from multiprocessing import Manager  # inside a pool, use Manager().Queue()
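A minimal sketch of both patterns (the producer and pipe_sender helpers are illustrative names): a Manager().Queue() shared with a Pool, followed by a Pipe between exactly two processes:

from multiprocessing import Pool, Manager, Pipe, Process

def producer(queue):
    queue.put("hello from a pool worker")

def pipe_sender(conn):
    conn.send("hello over the pipe")
    conn.close()

if __name__ == "__main__":
    # Manager().Queue() can be passed to Pool workers;
    # a plain multiprocessing.Queue here raises a RuntimeError
    manager = Manager()
    queue = manager.Queue()
    with Pool(2) as pool:
        pool.apply_async(producer, args=(queue,)).get()  # wait for the task
    print(queue.get())

    # Pipe: exactly two endpoints, one per process
    parent_conn, child_conn = Pipe()
    p = Process(target=pipe_sender, args=(child_conn,))
    p.start()
    print(parent_conn.recv())
    p.join()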