Question
I'm looking for a simple process-based parallel map for Python. With the built-in map function and with a list comprehension, the single-process performance is:
- In [1]: data = range(10000000)
- In [2]: time alist = list(map(lambda e:(e*5+1)/2, data))
- CPU times: user 1.48 s, sys: 47.6 ms, total: 1.53 s
- Wall time: 1.53 s
- In [3]: time olist = [(e*5+1)/2 for e in data]
- CPU times: user 862 ms, sys: 54 ms, total: 916 ms
- Wall time: 917 ms
It seems that what you need is the map method of multiprocessing.Pool:
Below is the sample code to show the usage:
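- test.py
- #!/usr/bin/env python3
- import multiprocessing
- from datetime import datetime
- def f(e):
-     # Defined at module level so the worker processes can import it
-     return (e*5+1)/2
- if __name__ == "__main__":
-     # The guard keeps worker processes from re-running this block when they import the module
-     data = range(10000000)
-     with multiprocessing.Pool() as pool:
-         st = datetime.now()
-         print("Start at {}".format(st))
-         mlist = pool.map(f, data)
-         diff = datetime.now() - st
-     # total_seconds() covers the whole duration; .microseconds holds only the sub-second part
-     print("Done with {:.0f} ms".format(diff.total_seconds()*1000))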
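To check whether the pool actually helps for a given workload, it can be useful to time the serial and parallel versions side by side. The following is a minimal sketch under stated assumptions: the helper name parallel_map, the smaller data size, and the chunksize value of 50000 are illustrative choices and not part of the original post. For a function as cheap as f, the cost of sending items to and from the workers may outweigh the gain, so the parallel version is not guaranteed to be faster.

```python
import multiprocessing
import time


def f(e):
    # Same per-element computation as in test.py.
    return (e * 5 + 1) / 2


def parallel_map(func, data, processes=None, chunksize=None):
    """Apply func to every item of data using a pool of worker processes.

    Hypothetical helper for illustration; results come back in input order.
    """
    # Leaving the with-block shuts the workers down once results are collected.
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(func, data, chunksize)


if __name__ == "__main__":
    data = range(1000000)  # smaller than in the post, to keep the demo quick

    start = time.perf_counter()
    serial = [f(e) for e in data]
    serial_ms = (time.perf_counter() - start) * 1000

    start = time.perf_counter()
    # chunksize=50000 is an assumed value; larger chunks mean fewer transfers.
    parallel = parallel_map(f, data, chunksize=50000)
    parallel_ms = (time.perf_counter() - start) * 1000

    # Both approaches must produce identical results in the same order.
    assert parallel == serial
    print("serial:   {:.0f} ms".format(serial_ms))
    print("parallel: {:.0f} ms".format(parallel_ms))
```

If the parallel timing is not better, trying a larger chunksize or a more expensive per-item function is a reasonable next experiment.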
Supplement
* Python article collection - Introduction to the multiprocessing module