Professional Documents
Culture Documents
With the use of multiprocessing, we can effectively bypass the limitation caused by GIL −
● By using multiprocessing, we are utilizing the capability of multiple processes and hence we are
utilizing multiple instances of the GIL.
● Due to this, there is no restriction of executing the bytecode of one thread within our programs at
any one time.
Starting Processes in Python
The following three methods can be used to start a process in Python within the multiprocessing module −
● Fork
● Spawn
● Forkserver
● fork() − It is a system call generally implemented in kernel. It is used to create a copy of the
process.p>
● getpid() − This system call returns the process ID(PID) of the calling process.
Example
The following Python script example will help you understand how to create a new child process and get
the PIDs of child and parent processes −
Creating a Process with Spawn
Spawn means to start something new. Hence, spawning a process means the creation of a new process by a parent process.
The parent process continues its execution asynchronously or waits until the child process ends its execution. Follow these
steps for spawning a process −
True parallelism in Python is achieved by creating multiple processes, each having a Python
interpreter with its own separate GIL.
Python has three modules for concurrency: multiprocessing, threading, and asyncio. When the tasks
are CPU intensive, we should consider the multiprocessing module. When the tasks are I/O bound
and require lots of connections, the asyncio module is recommended. For other types of tasks and
when libraries cannot cooperate with asyncio, the threading module can be considered.
Embarrassinbly parallel
The term embarrassinbly parallel is used to describe a problem or workload that can be easily run
in parallel. It is important to realize that not all workloads can be divided into subtasks and run
parallelly. For instance those, who need lots of communication among subtasks.
Another situation where parallel computations can be applied is when we run several
different computations, that is, we don't divide a problem into subtasks. For instance, we
could run calculations of π using different algorithms in parallel.
Process
The Process object represents an activity that is run in a separate process. The
multiprocessing.Process class has equivalents of all the methods of threading.Thread. The Process
constructor should always be called with keyword arguments.
The target argument of the constructor is the callable object to be invoked by the run method. The
name is the process name. The start method starts the process's activity. The join method blocks
until the process whose join method is called terminates. If the timeout option is provided, it blocks
at most timeout seconds. The is_alive method returns a boolean value indicationg whether the
process is alive. The terminate method terminates the process.
A new process is created. The target option provides the callable that is run in the new process. The
args provides the data to be passed. The multiprocessing code is placed inside the main guard. The
process is started with the start method.
$ ./joining.py
starting main
starting fun
finishing fun
finishing main
The finishing main message is printed after the child process has finished.
$ ./joining.py
starting main
finishing main
starting fun
finishing fun
When we comment out the join method, the main process finishes before the child process.
With the name property of the Process, we can give the worker a specific name. Otherwise, the module creates
its own name.
Subclassing Process
Example
Here, we are using the same example as used in the daemon threads. The only difference is the change of module from
multithreading to multiprocessing and setting the daemonic flag to true. However, there would be a change in output as shown
below −
Example Output
import multiprocessing My Process has terminated, terminating main thread
import time Terminating Child Process
def Child_process(): Child Process successfully terminated
print ('Starting function')
time.sleep(5)
print ('Finished function')
P = multiprocessing.Process(target = Child_process)
P.start()
print("My Process has terminated, terminating main thread")
print("Terminating Child Process")
P.terminate()
print("Child Process successfully terminated")
The output shows that the program terminates before the execution of child process that has been created
with the help of the Child_process() function. This implies that the child process has been terminated
successfully.
Python multiprocessing Pool
To pass multiple arguments to a worker function, we can use the starmap method. The elements of the iterable
are expected to be iterables that are unpacked as arguments.
Multiple functions
Note: It is best to avoid sharing data between processes. Message passing is preferred.
The M is the number of generated points in the square and N is the total number of points.
While this method of π calculation is interesting and perfect for school examples, it is not very
accurate. There are far better algorithms to get π.
In the example, we calculate the approximation of the π value using one hundred million generated
random points.
Instead of calculating 100_000_000 in one go, each subtask will calculate a portion of it.
The partial calculations are passed to the count variable and the sum is then used in the final formula.
When running the example in parallel with four cores, the calculations took 29.46 seconds.
References
● https://zetcode.com/python/multiprocessing/
● https://www.tutorialspoint.com/concurrency_in_python/concurrency_in_python_multiprocessing.ht
m