Process Threads
import multiprocessing

def square(x):
    return x * x

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)
    inputs = [0, 1, 2, 3, 4]
    outputs = pool.map(square, inputs)
    print("Input: {}".format(inputs))
    print("Output: {}".format(outputs))
Functions provided with Pool
• apply: Calls func with the given arguments. It blocks until the result is ready.
• apply_async: A non-blocking variant of apply, better suited for performing work in parallel.
• map: A parallel equivalent of the built-in map() function (though it supports only a single iterable argument). It blocks until all results are ready.
• map_async: A non-blocking variant of map, better suited for performing map-style work in parallel.
Process class
• It is used when function-based parallelism is required: you can define different functions, each taking its own parameters, and run them in parallel even when they perform entirely different kinds of computation.
import time
from multiprocessing import Process

def test(fname):
    f = open(fname, "w")
    f.write("hi")
    f.write("hi")
    f.write("hi")
    f.write("hi")
    f.close()

if __name__ == "__main__":
    starttime = time.time()
    p1 = Process(target=test, args=("sample1.txt",))
    p2 = Process(target=test, args=("sample2.txt",))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    endtime = time.time()
    print(f"Time taken {endtime-starttime} seconds")
Functions provided with process class
Concurrency vs parallelism
1. Concurrency is the task of running and managing multiple computations at the same time, while parallelism is the task of running multiple computations simultaneously.
6. Concurrency follows a non-deterministic control flow, while parallelism follows a deterministic control flow.
import threading

# definitions inferred from the function names; the original notes omit them
def print_square(num):
    print("Square: {}".format(num * num))

def print_cube(num):
    print("Cube: {}".format(num * num * num))

if __name__ == "__main__":
    # creating threads
    t1 = threading.Thread(target=print_square, args=(10,))
    t2 = threading.Thread(target=print_cube, args=(10,))
    # starting thread 1
    t1.start()
    # starting thread 2
    t2.start()
    # wait until thread 1 is completely executed
    t1.join()
    # wait until thread 2 is completely executed
    t2.join()
    # both threads completely executed
    print("Done!")
Pros of Threading
• Lightweight - low memory footprint
• Shared memory - makes access to state from another context easier
• Allows you to easily make responsive UIs
• CPython C extension modules that properly release the GIL will run in
parallel
• Great option for I/O-bound applications
Cons of Threading
• CPython - subject to the GIL
• Not interruptible/killable
• If not following a command queue/message pump model (using the
Queue module), then manual use of synchronization primitives
becomes a necessity (decisions are needed about the granularity of
locking)
• Code is usually harder to understand and to get right - the potential for
race conditions increases dramatically
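As a minimal illustration of the synchronization point above, here is a Lock protecting a shared counter; the counter, iteration count, and thread count are arbitrary choices for the sketch:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # without the lock, the read-modify-write below is a race condition
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(50000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # deterministically 200000 with the lock held
```

Choosing where to acquire the lock (per increment here, versus once per batch) is exactly the locking-granularity decision mentioned above.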
Gevent
• The core idea of concurrency is that a larger task can be broken down
into a collection of subtasks which are scheduled to run
simultaneously or asynchronously, instead of one at a time or
synchronously. A switch between two subtasks is known as a
context switch.
• In gevent, a context switch is done through yielding.
import gevent
from gevent import socket
urls = ['www.google.com', 'www.example.com', 'www.python.org']
jobs = [gevent.spawn(socket.gethostbyname, url) for url in urls]
_ = gevent.joinall(jobs, timeout=2)
[job.value for job in jobs]
Output:
['74.125.79.106', '208.77.188.166', '82.94.164.162']
Greenlets
• Greenlets are lightweight coroutines for in-process sequential
concurrent programming.
• Greenlets all run inside of the OS process for the main program but are
scheduled cooperatively.
• Greenlets can be used on their own, but they are frequently used with
frameworks such as gevent to provide higher-level abstractions and
asynchronous I/O.
• Greenlets, sometimes referred to as "green threads," are a lightweight
structure that allows you to do some cooperative multithreading in
Python without the system overhead of real threads.
import gevent
from gevent import Greenlet
def foo(message, n):
    """
    Each thread will be passed the message, and n arguments
    in its initialization.
    """
    gevent.sleep(n)
    print(message)
# Initialize a new Greenlet instance running the named function
# foo
thread1 = Greenlet.spawn(foo, "Hello", 1)
# Wrapper for creating and running a new Greenlet from the named
# function foo, with the passed arguments
thread2 = gevent.spawn(foo, "I live!", 2)
# Lambda expressions
thread3 = gevent.spawn(lambda x: (x+1), 2)
threads = [thread1, thread2, thread3]
# Block until all threads complete.
gevent.joinall(threads)
Greenlet vs thread
• Threads (in theory) are preemptive and parallel, meaning that multiple
threads can be processing work at the same time, and it’s impossible to say
in what order different threads will proceed or see the effects of other
threads. This necessitates careful programming using locks, queues, or other
approaches to avoid race conditions, deadlocks, or other bugs.
• In contrast, greenlets are cooperative and sequential. This means that when
one greenlet is running, no other greenlet can be running; the programmer is
fully in control of when execution switches between greenlets. This can
eliminate race conditions and greatly simplify the programming task.
• Also, threads require resources from the operating system (the thread stack,
and bookkeeping in the kernel). Because greenlets are implemented entirely
without involving the operating system, they can require fewer resources; it
is often practical to have many more greenlets than threads.
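The cooperative, sequential switching described above can be sketched without gevent at all, using plain generators, where every yield is an explicit switch point; the round-robin scheduler below is an illustration of the idea, not gevent's actual event loop:

```python
def task(name, steps):
    # each yield hands control back to the scheduler, like a greenlet switch
    for i in range(steps):
        yield "{}:{}".format(name, i)

def run_round_robin(tasks):
    # resume each task in turn until all are exhausted; the switch order is
    # fully deterministic, unlike preemptive threads
    order = []
    pending = list(tasks)
    while pending:
        for t in pending[:]:
            try:
                order.append(next(t))
            except StopIteration:
                pending.remove(t)
    return order

order = run_round_robin([task("a", 2), task("b", 2)])
print(order)  # ['a:0', 'b:0', 'a:1', 'b:1']
```

Because only one task runs at a time and switches happen only at yield, there is no interleaving for the programmer to guard against.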
Task queue
• Task queues are used as a mechanism to distribute work across threads or
machines.
• A task queue’s input is a unit of work called a task. Dedicated worker
processes constantly monitor task queues for new work to perform.
• Celery communicates via messages, usually using a broker to mediate
between clients and workers. To initiate a task the client adds a message to
the queue, the broker then delivers that message to a worker.
• A Celery system can consist of multiple workers and brokers, giving way to
high availability and horizontal scaling.
• Celery is written in Python, but the protocol can be implemented in any
language.
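A minimal sketch of the client/worker flow described above, assuming Celery is installed and a broker is running; the app name, the Redis broker URL, and the add task are all illustrative, and the snippet is not runnable without that broker:

```python
# tasks.py - a hedged sketch, not a runnable deployment
from celery import Celery

# 'tasks' and the Redis URL below are assumptions for this example
app = Celery('tasks', broker='redis://localhost:6379/0')

@app.task
def add(x, y):
    return x + y

# A client enqueues a message with add.delay(4, 4); a worker started with
#   celery -A tasks worker
# receives that message from the broker and executes add(4, 4) asynchronously.
```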
Partial functions
from functools import partial

def multiply(x, y):
    return x * y

doubleNum = partial(multiply, 2)
tripleNum = partial(multiply, 3)
print(doubleNum(10))
Output: 20
Total ordering function
• The functools module in Python helps in implementing higher-order
functions.
• Higher-order functions are functions that take other functions as
arguments or return them.
• total_ordering provides rich class comparison methods: given a class that
defines __eq__ and one ordering method (such as __lt__), it supplies the
rest, which helps reduce code redundancy.
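A brief sketch of total_ordering: define __eq__ and one comparison, and the decorator derives the others; the Version class here is invented purely for illustration:

```python
from functools import total_ordering

@total_ordering
class Version:
    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def __eq__(self, other):
        return (self.major, self.minor) == (other.major, other.minor)

    def __lt__(self, other):
        return (self.major, self.minor) < (other.major, other.minor)

# >, >=, and <= are filled in automatically from __eq__ and __lt__
print(Version(1, 2) > Version(1, 1))   # True
print(Version(1, 0) >= Version(1, 0))  # True
```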
from functools import partial

def orderFunc(a, b, c, d):
    return a*4 + b*3 + c*2 + d

result = partial(orderFunc, 5, 6, 7)
print(result(8))
Output: 60