
Parallel programming paradigm

• Parallel processing is the processing of program instructions by dividing them among multiple processors.
• A parallel processing system has many processors, with the objective of running a program in less time by dividing the work among them.
• This approach resembles divide and conquer.
• The maximum number of processes you can run at a time is limited by the number of processors in your computer.
• The cpu_count() function shows how many processors are present in the machine:
import multiprocessing as mp
print("Number of processors: ", mp.cpu_count())
Ways to handle parallel processing
• Shared memory In shared memory, the sub-units can communicate with each other through the same memory space. The advantage is that you don't need to handle the communication explicitly; reading from and writing to the shared memory is enough. But a problem arises when multiple processes access and change the same memory location at the same time. Such conflicts can be avoided using synchronization techniques.
• Distributed memory In distributed memory, each process is totally separated and has its own memory space. In this scenario, communication between the processes is handled explicitly. Since the communication happens through a network interface, it is costlier compared to shared memory.
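A minimal sketch of the shared-memory approach and the synchronization it needs, using multiprocessing.Value (the counter, worker count, and iteration count here are illustrative only):

from multiprocessing import Process, Value, Lock

def add_100(counter, lock):
    for _ in range(100):
        with lock:                  # synchronize access to the shared value
            counter.value += 1

if __name__ == '__main__':
    lock = Lock()
    counter = Value('i', 0)         # shared integer, initially 0
    workers = [Process(target=add_100, args=(counter, lock)) for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(counter.value)            # 400 with the lock; unpredictable without it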
Serial and Parallel execution

• A synchronous/serial execution is one where the processes are completed in the same order in which they were started. This is achieved by locking the main program until the respective processes are finished.
• Asynchronous/parallel, on the other hand, doesn’t involve locking. As
a result, the order of results can get mixed up but usually gets done
quicker.
Synchronous and asynchronous execution
• Parallel processing holds two varieties of execution: synchronous and asynchronous.
• In synchronous execution, once a process starts, it puts a lock on the main program until it is finished.
• Asynchronous execution doesn't require locking; it performs tasks quickly, but the results can come back in a rearranged order.
Threads vs. Processes
• A process is a program that is in execution, in other words, code that is running (e.g. a Jupyter notebook, Google Chrome, the Python interpreter). Multiple processes are always running on a computer, and they are executed in parallel.
• A process can spawn multiple threads to handle subtasks. Threads live inside processes and share the same memory space (they can read and write the same variables). Ideally, they run in parallel, but not necessarily. The reason processes alone aren't enough is that applications need to be responsive, listening for user actions while updating the display and saving a file.
e.g. Microsoft Word. When we open up Word, we're essentially creating
a process (an instance of the program). When we start typing, the
process spawns a number of threads: one to read keystrokes, another to
display text on the screen, a thread to autosave our file, and yet another
to highlight spelling mistakes. By spawning multiple threads, Microsoft
takes advantage of "wasted CPU time" (waiting for our keystrokes or
waiting for a file to save) to provide a smoother user interface and make
us more productive.
Threads vs. Processes (2)

• Processes don't share memory; threads share memory.
• Spawning/switching processes is more expensive; spawning/switching threads requires fewer resources.
• Processes need no memory synchronisation; threads need synchronization mechanisms to ensure we're correctly handling the data.
Global Interpreter Lock
• The Global Interpreter Lock (GIL) is one of the most controversial
subjects in the Python world.
• In CPython, the most popular implementation of Python, the GIL is a mutex (mutual exclusion object) that makes things thread-safe.
• The GIL makes it easy to integrate with external libraries that are not
thread-safe, and it makes non-parallel code faster. This comes at a
cost, though.
• Due to the GIL, we can't achieve true parallelism via multithreading.
• Basically, two different native threads of the same process can't run
Python code at the same time.
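A rough sketch of the consequence: a CPU-bound function gains almost nothing from two threads under CPython, because only one thread executes Python bytecode at a time (the countdown function and iteration count are arbitrary illustrations):

import time
import threading

def countdown(n):
    while n > 0:
        n -= 1

N = 20_000_000

start = time.time()
countdown(N)
print("serial: %.2fs" % (time.time() - start))

start = time.time()
t1 = threading.Thread(target=countdown, args=(N // 2,))
t2 = threading.Thread(target=countdown, args=(N // 2,))
t1.start(); t2.start()
t1.join(); t2.join()
print("two threads: %.2fs (similar, due to the GIL)" % (time.time() - start))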
Multiprocessing in python
• The multiprocessing package supports spawning processes: it loads and executes new child processes.
• When we work with multiprocessing, we first import the Process class, then create a Process object with the function to run as its target.
• The process is started with the start() method and completed with the join() method.
• We can also pass arguments to the function using the args keyword.
• Multiprocessing supports Pipes and Queues, which are two types of
communication channels between processes.
from multiprocessing import Process

def cube(numbers):
    for x in numbers:
        print('%s cube is %s' % (x, x**3))

def evenno(numbers):
    for x in numbers:
        if x % 2 == 0:
            print('%s is an even number' % (x))

if __name__ == '__main__':
    my_numbers = [3, 4, 5, 6, 7, 8]
    # pass the list itself; the child processes receive it via args
    my_process1 = Process(target=cube, args=(my_numbers,))
    my_process2 = Process(target=evenno, args=(my_numbers,))
    my_process1.start()
    my_process2.start()
    my_process1.join()
    my_process2.join()
    print("Done")
Pipes
• In multiprocessing, Pipes are used when we want to communicate between processes.
• Pipe() returns two connection objects, representing the two ends of the pipe. Each connection object has two methods: send() and recv().
from multiprocessing import Process, Pipe

def myfunction(conn):
    conn.send(['hi!! I am Python'])
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=myfunction, args=(child_conn,))
    p.start()
    print(parent_conn.recv())
    p.join()
Queue
• When we pass data between processes, we can use a Queue object.
import multiprocessing

def evenno(numbers, q):
    for n in numbers:
        if n % 2 == 0:
            q.put(n)

if __name__ == "__main__":
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=evenno, args=(range(10), q))
    p.start()
    p.join()
    # 'while q:' would never terminate, since a Queue object is always truthy
    while not q.empty():
        print(q.get())
Locks
• When we want only one process to execute a piece of code at a time, a Lock is used. While one process holds the lock, other processes are blocked from executing the same code. The lock is released after the process completes.
from multiprocessing import Process, Lock

def display_name(l, i):
    l.acquire()
    print('Hi', i)
    l.release()

if __name__ == '__main__':
    my_lock = Lock()
    my_names = ['Abc', 'Adw', 'Say', 'San']
    for name in my_names:
        Process(target=display_name, args=(my_lock, name)).start()
Multi-Threading vs. Multi-Processing
• Depending on the application, two common approaches in parallel programming are to run code via threads or via multiple processes. If we submit "jobs" to different threads, those jobs can be pictured as "sub-tasks" of a single process, and those threads will usually have access to the same memory areas (i.e., shared memory). This approach can easily lead to conflicts in case of improper synchronization, for example, if threads write to the same memory location at the same time.
• A safer approach (although it comes with additional overhead due to communication between separate processes) is to submit multiple processes with completely separate memory locations (i.e., distributed memory): every process runs completely independently of the others.
Pool and Process class

• Two main objects in multiprocessing implement parallel execution of a function:
➢ Pool class
1. Synchronous execution
• Pool.map() and Pool.starmap()
• Pool.apply()
2. Asynchronous execution
• Pool.map_async() and Pool.starmap_async()
• Pool.apply_async()
➢ Process class
Pool class
• Pool class can be used for parallel execution of a function for different
input data. The multiprocessing.Pool() class spawns a set of processes
called workers and can submit tasks using the methods
apply/apply_async and map/map_async.
• For parallel mapping, you should first initialize a
multiprocessing.Pool() object. The first argument is the number of
workers; if not given, that number will be equal to the number of cores
in the system.
Pool.map()
import multiprocessing

def square(x):
    return x * x

if __name__ == '__main__':
    # with no argument, Pool() uses as many workers as there are cores
    pool = multiprocessing.Pool(processes=4)
    inputs = [0, 1, 2, 3, 4]
    outputs = pool.map(square, inputs)
    pool.close()
    pool.join()
    print("Input: {}".format(inputs))
    print("Output: {}".format(outputs))
Functions provided with pool
• apply Calls func with arguments args and blocks until the result is ready.
• apply_async Better suited for performing work in parallel; it returns immediately.
• map A parallel equivalent of the built-in map() function. It supports only one iterable argument (use starmap() for multiple iterables), and it blocks until all results are ready.
• map_async Performs the same work as map, but without blocking.
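A minimal sketch of these four calls on a trivial square() function (the function and inputs are illustrative only):

from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    with Pool(processes=2) as pool:
        print(pool.apply(square, (3,)))              # blocks until ready: 9
        res = pool.apply_async(square, (4,))         # returns an AsyncResult at once
        print(res.get())                             # blocks only here: 16
        print(pool.map(square, [1, 2, 3]))           # blocking map: [1, 4, 9]
        print(pool.map_async(square, [4, 5]).get())  # non-blocking map: [16, 25]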
Process class
• It is used when function-based parallelism is required: we can define different functions, with the parameters they receive, and run them in parallel even when they perform totally different kinds of computation.
import time
from multiprocessing import Process

def test(fname):
    f = open(fname, "w")
    for _ in range(4):
        f.write("hi")
    f.close()

if __name__ == "__main__":
    starttime = time.time()
    p1 = Process(target=test, args=("sample1.txt",))
    p2 = Process(target=test, args=("sample2.txt",))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    endtime = time.time()
    print(f"Time taken {endtime-starttime} seconds")
Functions provided with process class

• .start() starts a process, and does so asynchronously.
• .join() blocks until the process has finished; because we called .start() on both p1 and p2 before joining, both processes run concurrently. The interpreter will, however, wait until p1 finishes before attempting to wait for p2 to finish.
Difference between pool and process class
• Management: Pool class is easier to use than the Process class because you
do not have to manage the processes by yourself. It creates the processes,
splits the input data, and returns the result in a list. It also waits for the
workers to finish their tasks, i.e., you do not have to call the join() method
explicitly.
• Memory:
➢While Process keeps all the processes in memory, Pool keeps only those that are under execution. Therefore, with a large number of tasks that carry a lot of data, the Process class might waste a lot of memory.
➢Creating a Pool has higher overhead. Therefore, when there are a small number of non-repetitive tasks, it is advisable to use Process instead.
Difference between pool and process class(1)
• I/O operations: Both the Process and the Pool class use a FIFO (First In, First Out) scheduler. However, if the current process is waiting for, or executing, an I/O operation, the Process class halts it and schedules another one from the task queue. The Pool class, on the other hand, waits for the process to complete its I/O operation: it does not schedule another one until the current one has finished its execution. Because of this, the execution time might increase. Process is preferred over Pool when the task is I/O bound (a program is I/O bound if it spends most of its time waiting for I/O operations to complete).
Asyncio
• The basic concept of asyncio is that a single Python object known as
the event loop controls how and when each task gets run.
• The event loop is aware of each task and its state. The ready state indicates that the task is ready to run, and the waiting state indicates that the task is waiting for some external task to complete.
• In asyncio, tasks are never interrupted in the middle of execution; they yield control only at well-defined points, so object sharing is safer than with threading.
import asyncio

async def count_number_of_words(sentence):
    print("Counting number of words in sentence: {}".format(sentence))
    number_of_words = len(sentence.split())
    print("Done with counting number of words in sentence: {}".format(number_of_words))
    return number_of_words

async def main():
    task1 = loop1.create_task(count_number_of_words("Asyncio is a way of achieving concurrency in Python."))
    task2 = loop1.create_task(count_number_of_words("It is not concurrency in the true sense."))
    task3 = loop1.create_task(count_number_of_words("It uses only one processor at a time."))
    await asyncio.wait([task1, task2, task3])

if __name__ == '__main__':
    loop1 = asyncio.get_event_loop()
    loop1.run_until_complete(main())
    loop1.close()
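For reference, on Python 3.7+ the same program is usually written with asyncio.run() and asyncio.gather(), which create and close the event loop for us; a hedged equivalent sketch:

import asyncio

async def count_words(sentence):
    print("Counting words in:", sentence)
    return len(sentence.split())

async def main():
    # gather schedules all three coroutines and collects their results
    results = await asyncio.gather(
        count_words("Asyncio is a way of achieving concurrency in Python."),
        count_words("It is not concurrency in the true sense."),
        count_words("It uses only one processor at a time."),
    )
    print(results)

if __name__ == '__main__':
    asyncio.run(main())   # creates and closes the event loop for us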
Concurrency vs parallelism
• Concurrency is the ability of parts of a program to work correctly when executed out of order. For instance, imagine tasks A and B. One way to execute them is sequentially, meaning doing all steps for A, then all for B.
• Concurrent execution, on the other hand, alternates doing a little of each task until both are complete.
Concurrency vs parallelism(1)
• Concurrency allows a program to make progress even when certain
parts are blocked. For instance, when one task is waiting for user
input, the system can switch to another task and do calculations.
• When tasks don't just interleave, but run at the same time, that's called parallelism: multiple CPU cores can run instructions simultaneously.
Difference between Concurrency and Parallelism
S.No. | Concurrency | Parallelism
1. | Concurrency is the task of running and managing multiple computations at the same time. | Parallelism is the task of running multiple computations simultaneously.
2. | It is achieved through interleaving the operation of processes on the central processing unit (CPU), in other words by context switching. | It is achieved through multiple central processing units (CPUs).
3. | Concurrency can be done using a single processing unit. | Parallelism can't be done using a single processing unit; it needs multiple processing units.
4. | Concurrency increases the amount of work finished at a time. | Parallelism improves the throughput and computational speed of the system.
5. | Concurrency deals with a lot of things simultaneously. | Parallelism does a lot of things simultaneously.
6. | Concurrency is a non-deterministic control flow approach. | Parallelism is a deterministic control flow approach.
7. | In concurrency, debugging is very hard. | In parallelism, debugging is also hard, but simpler than with concurrency.
Multithreaded Programming
• Running several threads is similar to running several different programs
concurrently, but with the following benefits −
• Multiple threads within a process share the same data space with the main
thread and can therefore share information or communicate with each other
more easily than if they were separate processes.
• Threads are sometimes called light-weight processes; they do not require much memory overhead and are cheaper than processes.
• A thread has a beginning, an execution sequence, and a conclusion. It has an
instruction pointer that keeps track of where within its context it is currently
running.
• It can be pre-empted (interrupted)
• It can temporarily be put on hold (also known as sleeping) while other
threads are running - this is called yielding.
Starting a new thread
• To spawn another thread, you call the following method available in the thread module (named _thread in Python 3):
thread.start_new_thread ( function, args[, kwargs] )
• This method call enables a fast and efficient way to create new threads
in both Linux and Windows.
• The method call returns immediately and the child thread starts and
calls function with the passed list of args. When function returns, the
thread terminates.
• Here, args is a tuple of arguments; use an empty tuple to call function
without passing any arguments. kwargs is an optional dictionary of
keyword arguments.
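A minimal sketch of this call under Python 3, where the module is named _thread (the thread names and delays are illustrative):

import _thread
import time

def print_time(thread_name, delay):
    for _ in range(3):
        time.sleep(delay)
        print("%s: %s" % (thread_name, time.ctime()))

try:
    _thread.start_new_thread(print_time, ("Thread-1", 1))
    _thread.start_new_thread(print_time, ("Thread-2", 2))
except Exception:
    print("Error: unable to start thread")

time.sleep(8)   # keep the main thread alive while the children run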
The Threading Module

The threading module exposes all the methods of the thread module and provides some additional methods −
• threading.activeCount() − Returns the number of thread objects
that are active.
• threading.currentThread() − Returns the number of thread objects
in the caller's thread control.
• threading.enumerate() − Returns a list of all thread objects that
are currently active.
Thread class methods
The threading module has the Thread class that implements threading.
The methods provided by the Thread class are as follows −
• run() − The run() method is the entry point for a thread.
• start() − The start() method starts a thread by calling the run method.
• join([time]) − The join() waits for threads to terminate.
• isAlive() − The isAlive() method checks whether a thread is still
executing.
• getName() − The getName() method returns the name of a thread.
• setName() − The setName() method sets the name of a thread.
Creating Thread Using Threading Module
• To implement a new thread using the threading module, you have to
do the following −
• Define a new subclass of the Thread class.
• Override the __init__(self [,args]) method to add additional arguments.
• Then, override the run(self [,args]) method to implement what the
thread should do when started.
• Once you have created the new Thread subclass, you can create an instance of it and then start a new thread by invoking start(), which in turn calls the run() method, as sketched below.
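A minimal sketch of the subclassing approach just described (the class name is a hypothetical illustration; the longer example that follows uses the other common style of passing a target function instead):

import threading

class CubeThread(threading.Thread):
    def __init__(self, num):
        super().__init__()        # an overridden __init__ must call the parent
        self.num = num

    def run(self):                # entry point invoked by start()
        print("Cube: {}".format(self.num ** 3))

if __name__ == '__main__':
    t = CubeThread(10)
    t.start()
    t.join()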
import threading

def print_cube(num):
    """
    function to print cube of given num
    """
    print("Cube: {}".format(num * num * num))

def print_square(num):
    """
    function to print square of given num
    """
    print("Square: {}".format(num * num))

if __name__ == "__main__":
    # creating threads
    t1 = threading.Thread(target=print_square, args=(10,))
    t2 = threading.Thread(target=print_cube, args=(10,))
    # starting thread 1
    t1.start()
    # starting thread 2
    t2.start()
    # wait until thread 1 is completely executed
    t1.join()
    # wait until thread 2 is completely executed
    t2.join()
    # both threads completely executed
    print("Done!")
Pros of Threading
• Lightweight - low memory footprint
• Shared memory - makes access to state from another context easier
• Allows you to easily make responsive UIs
• CPython C extension modules that properly release the GIL will run in parallel
• Great option for I/O-bound applications
Cons of Threading
• CPython - subject to the GIL
• Not interruptible/killable
• If not following a command queue/message pump model (using the
Queue module), then manual use of synchronization primitives
become a necessity (decisions are needed for the granularity of
locking)
• Code is usually harder to understand and to get right - the potential for
race conditions increases dramatically
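A small sketch of the synchronization burden just mentioned: without the lock, the two threads race on a shared counter and the final count may come out wrong (the counter and iteration count are illustrative):

import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(100_000):
        with lock:       # remove this and the final count may be wrong
            counter += 1

threads = [threading.Thread(target=increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 200000 with the lock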
Multiprocessing

• Multiprocessing adds CPUs to increase computing power.


• Multiple processes are executed concurrently.
• Creation of a process is time-consuming and resource intensive.
• Multiprocessing can be symmetric or asymmetric.
• The multiprocessing library in Python uses separate memory spaces and multiple CPU cores, bypasses the GIL limitations of CPython, makes child processes killable, and is much easier to use.
• Some caveats of the module are a larger memory footprint, and IPC that is a little more complicated, with more overhead.
from multiprocessing import Queue

fruits = ['Apple', 'Orange', 'Guava', 'Papaya', 'Banana']
count = 1

# creating a queue object
queue = Queue()
print('pushing items to the queue:')
for fr in fruits:
    print('item no: ', count, ' ', fr)
    queue.put(fr)
    count += 1

print('\npopping items from the queue:')
count = 1
while not queue.empty():
    print('item no: ', count, ' ', queue.get())
    count += 1
Pros of multiprocessing
• Separate memory space
• Code is usually straightforward
• Takes advantage of multiple CPUs & cores
• Avoids GIL limitations for CPython
• Eliminates most needs for synchronization primitives unless you use shared memory (instead, it's more of a communication model for IPC)
• Child processes are interruptible/killable
• Python multiprocessing module includes useful abstractions with an
interface much like threading.Thread
• A must with CPython for CPU-bound processing
Cons of multiprocessing
• IPC a little more complicated with more overhead (communication
model vs. shared memory/objects)
• Larger memory footprint
Threading vs multiprocessing
• The threading module uses threads, the multiprocessing module uses
processes.
• The difference is that threads run in the same memory space, while processes have separate memory. This makes it a bit harder to share objects between processes with multiprocessing. Since threads use the same memory, precautions have to be taken, or two threads will write to the same memory at the same time. This is the kind of hazard the global interpreter lock guards against at the interpreter level.
• Spawning processes is a bit slower than spawning threads.
Concurrent futures
• The concurrent.futures module provides interfaces for running tasks using pools of thread or process workers. The APIs are the same, so applications can switch between threads and processes with minimal changes.
• The module provides two types of classes for interacting with the
pools. Executors are used for managing pools of workers, and futures
are used for managing results computed by the workers.
• To use a pool of workers, an application creates an instance of the
appropriate executor class and then submits tasks for it to run.
Concurrent futures(1)
• When each task is started, a Future instance is returned. When the
result of the task is needed, an application can use the Future to block
until the result is available. Various APIs are provided to make it
convenient to wait for tasks to complete, so that the Future objects do
not need to be managed directly.
from concurrent import futures
import threading
import time

def task(n):
    print('{}: sleeping {}'.format(threading.current_thread().name, n))
    time.sleep(n / 10)
    print('{}: done with {}'.format(threading.current_thread().name, n))
    return n / 10

ex = futures.ThreadPoolExecutor(max_workers=2)
print('main: starting')
results = ex.map(task, range(5, 0, -1))
print('main: unprocessed results {}'.format(results))
print('main: waiting for real results')
real_results = list(results)
print('main: results: {}'.format(real_results))
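Because the executor APIs match, switching from the thread pool above to a process pool is, in a sketch, essentially a one-line change (the task here is a trivial illustration):

from concurrent import futures

def task(n):
    return n * n

if __name__ == '__main__':
    # same map() call as the thread pool; only the executor class changes
    with futures.ProcessPoolExecutor(max_workers=2) as ex:
        print(list(ex.map(task, range(5))))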
Gevent

• Gevent is a concurrency library based around libev. It provides a clean API for a variety of concurrency and network related tasks.
• The real power of gevent comes when we use it for network and IO bound functions, which can be cooperatively scheduled.
Synchronous & Asynchronous Execution

• The core idea of concurrency is that a larger task can be broken down
into a collection of subtasks which are scheduled to run
simultaneously or asynchronously, instead of one at a time or
synchronously. A switch between the two subtasks is known as a
context switch.
• A context switch in gevent is done through yielding.
import gevent
from gevent import socket
urls = ['www.google.com', 'www.example.com', 'www.python.org']
jobs = [gevent.spawn(socket.gethostbyname, url) for url in urls]
_ = gevent.joinall(jobs, timeout=2)
[job.value for job in jobs]
Output:
['74.125.79.106', '208.77.188.166', '82.94.164.162']
Greenlets
• Greenlets are lightweight coroutines for in-process sequential
concurrent programming.
• Greenlets all run inside of the OS process for the main program but are
scheduled cooperatively.
• Greenlets can be used on their own, but they are frequently used with
frameworks such as gevent to provide higher-level abstractions and
asynchronous I/O.
• Greenlets, sometimes referred to as "green threads," are a lightweight
structure that allows you to do some cooperative multithreading in
Python without the system overhead of real threads
import gevent
from gevent import Greenlet

def foo(message, n):
    """
    Each thread will be passed the message, and n arguments
    in its initialization.
    """
    gevent.sleep(n)
    print(message)

# Initialize a new Greenlet instance running the named function foo
thread1 = Greenlet.spawn(foo, "Hello", 1)

# Wrapper for creating and running a new Greenlet from the named
# function foo, with the passed arguments
thread2 = gevent.spawn(foo, "I live!", 2)

# Lambda expressions
thread3 = gevent.spawn(lambda x: (x + 1), 2)

threads = [thread1, thread2, thread3]

# Block until all threads complete.
gevent.joinall(threads)
Greenlet vs thread
• Threads (in theory) are preemptive and parallel , meaning that multiple
threads can be processing work at the same time, and it’s impossible to say
in what order different threads will proceed or see the effects of other
threads. This necessitates careful programming using locks, queues, or other
approaches to avoid race conditions, deadlocks, or other bugs.
• In contrast, greenlets are cooperative and sequential. This means that when
one greenlet is running, no other greenlet can be running; the programmer is
fully in control of when execution switches between greenlets. This can
eliminate race conditions and greatly simplify the programming task.
• Also, threads require resources from the operating system (the thread stack,
and bookkeeping in the kernel). Because greenlets are implemented entirely
without involving the operating system, they can require fewer resources; it
is often practical to have many more greenlets than it is threads.
Task queue
• Task queues are used as a mechanism to distribute work across threads or
machines.
• A task queue’s input is a unit of work called a task. Dedicated worker
processes constantly monitor task queues for new work to perform.
• Celery communicates via messages, usually using a broker to mediate
between clients and workers. To initiate a task the client adds a message to
the queue, the broker then delivers that message to a worker.
• A Celery system can consist of multiple workers and brokers, giving way to
high availability and horizontal scaling.
• Celery is written in Python, but the protocol can be implemented in any
language
Celery

• Celery is a simple, flexible, and reliable distributed system to process vast amounts of messages, while providing operations with the tools required to maintain such a system.
• It's a task queue with a focus on real-time processing, while also supporting task scheduling.
• Celery requires a message transport to send and receive messages.
• Celery can run on a single machine, on multiple machines, or even
across data centers.
Celery(1)
• Celery communicates via messages, usually using brokers to mediate
between clients and workers
• To initiate a task, the client adds a message to the queue; the broker then delivers that message to a worker.
from celery import Celery

app = Celery('hello', broker='amqp://guest@localhost//')  # RabbitMQ as the broker

@app.task
def hello():
    return 'hello world'

The above program can be run on a Unix/Linux OS.
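Assuming the snippet above is saved as tasks.py (a hypothetical file name) and RabbitMQ is running locally, a worker and a client might look like this sketch:

# start a worker from the shell:
#   celery -A tasks worker --loglevel=info
# then, from a client:
from tasks import hello

result = hello.delay()   # sends a message to the broker; a worker executes it
# reading the return value with result.get() additionally requires a result
# backend, e.g. Celery('hello', broker=..., backend='rpc://')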


Functional programming
• Functional programming is a programming paradigm in which we try to bind everything in the style of pure mathematical functions.
• It is a declarative type of programming style. Its main focus is on
“what to solve” in contrast to an imperative style where the main focus
is “how to solve”.
• It uses expressions instead of statements. An expression is evaluated to
produce a value whereas a statement is executed to assign variables.
Lambda Calculus
• Lambda calculus is a framework developed by Alonzo Church to study computations with functions.
• It can be called the smallest programming language in the world. It gives a definition of what is computable: anything that can be computed by lambda calculus is computable.
• It is equivalent to Turing machine in its ability to compute.
• It provides a theoretical framework for describing functions and their
evaluation. It forms the basis of almost all current functional
programming languages.
Python functools
• Python functools is a module designed for higher-order functions: functions that operate on, and can also return, other functions. It permits developers to write code in a reusable fashion.
• Functions can be used or extended for new requirements without being fully rewritten. The functools module provides various tools to attain this functionality.
• Some of them are as follows :
➢Partial Function
➢Total Ordering
Partial function
• Using partial functions, we can derive a new function from an existing one with some arguments already passed. We can also give the new function version its own documentation.
• We create the new function by passing partial arguments: we freeze some portion of the function's arguments, which results in a new object. Put differently, partial lets us create a function with some defaults. Partial supports fixing both keyword and positional arguments.
from functools import partial

def multiply(x, y):
    return x * y

doubleNum = partial(multiply, 2)   # fixes x = 2
tripleNum = partial(multiply, 3)   # fixes x = 3
print(doubleNum(10))

Output: 20
Total ordering function
• Functools module in python helps in implementing higher-order
functions.
• Higher-order functions are dependent functions that call other
functions.
• total_ordering provides rich class-comparison methods that help in comparing classes without explicitly defining every comparison function, which reduces code redundancy.
from functools import partial

def orderFunc(a, b, c, d):
    return a*4 + b*3 + c*2 + d

result = partial(orderFunc, 5, 6, 7)   # fixes a=5, b=6, c=7
print(result(8))

Output: 60
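The snippet above demonstrates partial once more; for the total_ordering decorator the prose describes, a minimal sketch (the Student class and marks attribute are illustrative) would be:

from functools import total_ordering

@total_ordering
class Student:
    def __init__(self, marks):
        self.marks = marks
    def __eq__(self, other):
        return self.marks == other.marks
    def __lt__(self, other):
        return self.marks < other.marks

# define only __eq__ and __lt__; total_ordering derives <=, >, >=
print(Student(80) > Student(70))    # True
print(Student(80) >= Student(80))   # True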
