Source From Here
Preface
This article shows how to apply Python's multiprocessing module to your long-running functions.
What is multiprocessing?
Basically, multiprocessing means running two or more tasks in parallel. In Python, we can use the built-in multiprocessing module to achieve that. Imagine you have a function that takes ten seconds to run, and you need to run it ten times. Run sequentially, that takes about a hundred seconds. That is where multiprocessing comes into action: you can split those ten calls across ten sub-processes and, given enough free CPU cores, complete them all in roughly ten seconds.
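The timing argument above can be sketched in a few lines. This is only an illustration under stated assumptions: `slow_task` is a hypothetical stand-in that sleeps instead of doing real work, and the delay is scaled down from ten seconds to 0.2 seconds so the demo finishes quickly. The helper names `timed`, `run_sequentially` and `run_in_processes` are made up for this sketch.

```python
import multiprocessing as mp
import time


def slow_task(delay):
    """Hypothetical stand-in for a long-running function: it just waits."""
    time.sleep(delay)


def timed(func, *args):
    """Return how many seconds func(*args) takes to run."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


def run_sequentially(count, delay):
    # One call after another: elapsed time is roughly count * delay.
    for _ in range(count):
        slow_task(delay)


def run_in_processes(count, delay):
    # One sub-process per call: the waits overlap, so elapsed time is
    # roughly one delay plus the cost of starting the processes.
    processes = [mp.Process(target=slow_task, args=(delay,)) for _ in range(count)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()


if __name__ == "__main__":
    # 0.2 s per task instead of 10 s, so the demo finishes quickly.
    print(f"sequential: {timed(run_sequentially, 10, 0.2):.2f} s")
    print(f"parallel:   {timed(run_in_processes, 10, 0.2):.2f} s")
```

The exact figures depend on the machine, but the sequential run should take about ten times the delay, while the parallel run should stay close to a single delay plus process start-up overhead.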
Difference between multiprocessing and multithreading
So why use multiprocessing instead of multithreading? Threads are fine when tasks spend most of their time waiting, for example on I/O. But if your function needs a lot of processing power and memory, multiprocessing is the better fit: each sub-process runs its own Python interpreter with its own memory space, and the operating system can schedule it on a separate CPU core. Threads, by contrast, share one interpreter, and in standard CPython they are constrained by the Global Interpreter Lock (GIL), which lets only one thread execute Python bytecode at a time.
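One way to see the difference is to run the same CPU-bound function through a thread pool and a process pool. The sketch below uses `concurrent.futures` from the standard library rather than the multiprocessing module used in the rest of this article, simply because its two executors share one interface. The function `count_divisors`, the helper `run_with` and the workload size are assumptions chosen for illustration.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_divisors(number):
    """CPU-bound work: count how many integers in 1..number divide it."""
    return sum(1 for i in range(1, number + 1) if number % i == 0)


def run_with(executor_class, numbers):
    """Run count_divisors over numbers; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with executor_class(max_workers=4) as executor:
        results = list(executor.map(count_divisors, numbers))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    numbers = [2_000_000] * 4  # hypothetical workload, four equal tasks

    thread_results, thread_time = run_with(ThreadPoolExecutor, numbers)
    process_results, process_time = run_with(ProcessPoolExecutor, numbers)

    # Both executors compute the same answers; only the timing differs.
    print(f"threads:   {thread_time:.2f} s")
    print(f"processes: {process_time:.2f} s")
```

On a multi-core machine running standard CPython, the process pool would typically finish this kind of workload sooner, because the threads take turns holding the GIL. The actual numbers vary by machine and Python build.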
Let’s see multiprocessing in action
Imagine this is your long-running function:
    def factorize(number):
        # Yield every divisor of number, from 1 up to number itself.
        for i in range(1, number + 1):
            if number % i == 0:
                yield i

If you want to run this function ten times without using multiprocessing or multithreading, it will look something like this:

    from time import time

    numbers = [8402868, 2295738, 5938342, 7925426, 98761244,
               87129945, 14789235, 66543218, 53218950, 33218765]

    start = time()
    for number in numbers:
        # list() consumes the generator, so the whole factorization runs.
        list(factorize(number))
    end = time()
    print('Took %.3f seconds' % (end - start))

Let’s see how to apply multiprocessing to this simple example. First of all, you have to import Python’s multiprocessing module:

    import multiprocessing as mp

Then you have to create a Process object and pass it the target function and its arguments, if any. A child process cannot simply return a value to its parent, so a small wrapper puts the result on a Queue instead. For example:

    def print_factorize(num, q):
        # Runs in the child process and sends the result back through the queue.
        q.put((num, list(factorize(num))))

    q = mp.Queue()
    process = mp.Process(target=print_factorize, args=(8402868, q))

Now we can call its start method to begin executing factorize in a separate process:

    process.start()
    # Read the result before joining: a child that still has queued data
    # to flush may not exit until that data has been consumed.
    print(q.get())
    process.join()
Then our for loop will look like this:

    import multiprocessing as mp
    from time import time

    def factorize(number):
        # Yield every divisor of number, from 1 up to number itself.
        for i in range(1, number + 1):
            if number % i == 0:
                yield i

    def print_factorize(num, q):
        # Time one factorization and report (number, factors, seconds).
        start = time()
        ans = list(factorize(num))
        end = time()
        q.put((num, ans, end - start))

    # The guard keeps child processes from re-running this block when the
    # module is imported under the "spawn" start method (Windows, macOS).
    if __name__ == '__main__':
        start = time()
        numbers = [8402868, 2295738, 5938342, 7925426, 98761244,
                   87129945, 14789235, 66543218, 53218950, 33218765]
        plist = []
        q = mp.Queue()
        for n in numbers:
            process = mp.Process(target=print_factorize, args=(n, q))
            plist.append(process)
            process.start()

        # Collect exactly one result per process before joining; q.empty()
        # is not reliable while children are still writing.
        for _ in plist:
            num, flist, et = q.get()
            print(f"{num} took {et} seconds!")

        for p in plist:
            p.join()

        end = time()
        print('Total took %.3f seconds' % (end - start))
If you ran the calculations sequentially, the total would be the sum of the individual times, 0.309 + 0.505 + ... + 4.638 + 4.76 seconds, which is far more than the 4.795 seconds the parallel run took.
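Starting one process per number works for ten inputs, but for longer lists a worker pool is a common alternative. The sketch below swaps the manual Process and Queue handling for `multiprocessing.Pool`; it is not the approach used above, just one possible variant. The helper `timed_factorize` and the smaller input numbers are assumptions made so the example runs quickly.

```python
import multiprocessing as mp
import time


def factorize(number):
    """Return every divisor of number, in ascending order."""
    return [i for i in range(1, number + 1) if number % i == 0]


def timed_factorize(number):
    """Factorize number and return (number, factors, seconds_taken)."""
    start = time.perf_counter()
    factors = factorize(number)
    return number, factors, time.perf_counter() - start


if __name__ == "__main__":
    # Smaller numbers than in the article, so the sketch finishes quickly.
    numbers = [840286, 229573, 593834, 792542]

    start = time.perf_counter()
    # Pool() sizes itself from the CPU count by default; map() hands the
    # numbers to the workers and returns the results in input order.
    with mp.Pool() as pool:
        results = pool.map(timed_factorize, numbers)
    wall_time = time.perf_counter() - start

    for number, factors, seconds in results:
        print(f"{number}: {len(factors)} factors in {seconds:.3f} s")

    # The sum of per-task times approximates what a sequential run would cost.
    print(f"sum of task times: {sum(r[2] for r in results):.3f} s")
    print(f"wall-clock time:   {wall_time:.3f} s")
```

Because the pool reuses a fixed set of workers, it avoids creating more processes than the machine has cores, and the return values come back directly without a Queue.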