Because of the GIL, using multiple threads for CPU-bound tasks has never been a real option in Python. With the popularity of multicore CPUs, Python offers the multiprocessing module to run CPU-bound tasks in parallel. But there are still some problems with using the multiprocessing-related APIs directly.
Before we start, here is a small piece of code to aid the demonstration. The method takes one argument and accumulates from 0 up to that argument; it prints its execution time and returns the result:
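A minimal sketch of such a helper might look like this (the name sum_to_num and the exact output format are assumptions):

```python
import time


def sum_to_num(final_num: int) -> int:
    """Accumulate from 0 to final_num, print the elapsed time, and return the sum."""
    start = time.monotonic()

    result = 0
    for i in range(final_num + 1):
        result += i

    print(f"sum_to_num({final_num}) took {time.monotonic() - start:.2f} second(s)")
    return result
```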
As the sketch below this list shows, we can directly create and start multiple processes, calling the start and join methods of each. However, there are some problems here:
- The join method cannot return the result of task execution.
- The join method blocks the main process, and the joins run one after another in the order we call them, even if a later task finishes before an earlier one.
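A rough sketch of this direct Process approach, assuming the sum_to_num helper above and some arbitrary task sizes:

```python
import multiprocessing

# Assumes sum_to_num from the helper sketch above is defined in this module.

def main():
    # One process per task; Process gives us no simple way to get
    # sum_to_num's return value back to the parent.
    processes = [
        multiprocessing.Process(target=sum_to_num, args=(num,))
        for num in (200_000_000, 50_000_000)
    ]
    for process in processes:
        process.start()
    # join blocks the main process and waits for each child in the order
    # we call it, even if a later child has already finished.
    for process in processes:
        process.join()


if __name__ == "__main__":
    main()
```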
Problems of using Pool
If we use multiprocessing.Pool, there are also some problems:
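A rough sketch of the Pool-based version, again assuming the sum_to_num helper and arbitrary task sizes:

```python
import multiprocessing

# Assumes sum_to_num from the helper sketch above is defined in this module.

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        # apply is synchronous: each call blocks until its task finishes,
        # so the tasks run one after another despite the pool of workers.
        results = [pool.apply(sum_to_num, args=(num,))
                   for num in (200_000_000, 50_000_000)]
        print(results)
```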
As the code shows, Pool’s apply method is synchronous, which means you have to wait for the previous apply task to finish before the next apply task can start executing.
Of course, we can use the apply_async method to create the tasks asynchronously. But again, you need to call the get method, which blocks until the result is ready. That brings us back to the problem with the join method:
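A sketch of the apply_async variant under the same assumptions:

```python
import multiprocessing

# Assumes sum_to_num from the helper sketch above is defined in this module.

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        # apply_async submits all tasks without blocking ...
        async_results = [pool.apply_async(sum_to_num, args=(num,))
                         for num in (200_000_000, 50_000_000)]
        # ... but get() blocks on each result in submission order,
        # just like join did.
        print([async_result.get() for async_result in async_results])
```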
So, what if we use concurrent.futures.ProcessPoolExecutor to execute our CPU-bound tasks?
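Here is a sketch of what that could look like; it assumes the futures are read back in submission order, which is what makes the results come out in startup order:

```python
from concurrent.futures import ProcessPoolExecutor

# Assumes sum_to_num from the helper sketch above is defined in this module.

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        # Submit all tasks at once, then read the futures in submission order.
        futures = [executor.submit(sum_to_num, num)
                   for num in (200_000_000, 50_000_000)]
        for future in futures:
            print(future.result())
```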
As the code shows, everything looks great, and the API is called much like asyncio.as_completed. But look at the results; they are still fetched in startup order. This is not at all the same as asyncio.as_completed, which gets the results in the order in which the tasks complete.
Fortunately, we can use asyncio to handle IO-bound tasks, and its run_in_executor method to invoke multiprocess tasks in the same asynchronous style. This not only unifies the concurrent and parallel APIs but also solves the various problems we encountered above:
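A sketch of the combined approach, wrapping the process-pool tasks with run_in_executor so they can be consumed with asyncio.as_completed (task sizes are assumptions):

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

# Assumes sum_to_num from the helper sketch above is defined in this module.

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as executor:
        # run_in_executor turns each process-pool task into an awaitable,
        # so asyncio.as_completed can yield results as soon as they finish.
        tasks = [loop.run_in_executor(executor, sum_to_num, num)
                 for num in (200_000_000, 50_000_000)]
        for done in asyncio.as_completed(tasks):
            print(await done)


if __name__ == "__main__":
    asyncio.run(main())
```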
Since the sample code in the previous article only simulated how we should call the concurrent methods, many readers still found it hard to apply them in real coding. So, after understanding why we should run CPU-bound parallel tasks with asyncio, today we will use a real-world example to explain how to use asyncio to handle IO-bound and CPU-bound tasks simultaneously, and to appreciate the efficiency asyncio brings to our code.
Note: Before continuing, if you are interested in the practice of using asyncio.gather and asyncio.as_completed, you can read this earlier article of mine.