Python For Data Science - ANR PL - Final

UNIT - I
S.NO TOPIC
1.1 Introduction to Data Science
1.2 Why Python?
1.3 Essential Python
1.4 Python Libraries, Python Introduction
1.5 Features
1.6 Identifiers
1.7 Reserved Words
1.8 Indentation
1.9 Comments
1.10 Built-in Data Types and Their Methods: Strings, Lists, Tuples, Dictionary, Set
1.11 Type Conversion
1.12 Operators
1.13 Decision Making
1.14 Looping, Loop Control Statements
1.15 Math and Random Number Functions
1.16 User-Defined Functions
1.17 Function Arguments and Their Types
UNIT - II
S.NO TOPIC
2.1 User-Defined Modules and Packages in Python
2.2 Files: File Manipulations
2.3 File and Directory Related Methods
2.4 Python Exception Handling
2.5 OOPs Concepts
2.6 Class and Objects
2.7 Constructors
2.8 Data Hiding
2.9 Data Abstraction
2.10 Inheritance
UNIT - III
S.NO TOPIC
3.1 NumPy Basics: Arrays, The NumPy ndarray
3.2 Creating ndarrays
3.3 Data Types for ndarrays
3.4 Arithmetic with NumPy
3.5 Arrays: Basic Indexing and Slicing, Boolean Indexing, Transposing Arrays and Swapping Axes, Universal Functions
3.6 Mathematical and Statistical Methods, Sorting
UNIT - IV
S.NO TOPIC
4.1 Introduction to pandas Data Structures
4.2 Series
4.3 DataFrame
4.4 Panels
4.5 Indexing, Selection
4.6 Filtering, Function Application
4.7 Mapping
4.8 Sorting
4.9 Ranking
4.10 Reading and Writing Data in Text Format
UNIT - V
S.NO TOPIC
5.1 Data Cleaning and Preparation: Handling Missing Data, Data Transformation: Removing Duplicates
5.2 Transforming Data Using a Function or Mapping, Replacing Values, Detecting and Filtering Outliers
5.3 String Manipulation
5.4 Vectorized String Functions in pandas
5.5 Plotting with pandas: Line Plots, Bar Plots, Histograms and Density Plots, Scatter or Point Plots
Data Science using Python – Unit I
UNIT - I
1.1 Introduction to Data Science
What is Data Science?
Data science is a blend of various tools, algorithms, and machine
learning principles. Most simply, it involves obtaining meaningful
information or insights from structured or unstructured data by
combining analysis, programming, and business skills. It is a field
that draws on many disciplines, such as mathematics, statistics, and
computer science. Those who are good at these fields and have enough
knowledge of the domain in which they want to work can call themselves
data scientists. It is not easy, but it is not impossible either. The
work starts from data and continues through its visualization,
programming, formulation, development, and deployment of your model.
Demand for data scientists is expected to keep growing, so keep that
in mind and prepare yourself to fit into this world.
How Does Data Science Work?
Data science is not a one-step process that you can learn in a
short time before calling yourself a data scientist. It passes through
many stages, and every element is important. One should always follow
the proper steps to climb the ladder. Every step has its value, and
each one counts in your model. The steps are described below.
Problem Statement: No work starts without motivation, and data science
is no exception. It is really important to formulate your problem
statement clearly and precisely. Your whole model and its working
depend on that statement. Many scientists consider this the most
important step of data science. So make sure you know what your
problem statement is and how much value it can add to a business or
any other organization.
Pavitra Degree College – B.Sc.(Computers) III Year VI sem
Data Collection: After defining the problem statement, the next
obvious step is to search for the data that your model might require.
You must do good research and find all that you need. Data can be
structured or unstructured, and it may come in various forms such as
videos, spreadsheets, and coded forms. You must collect data from all
of these kinds of sources.
Data Cleaning: Once you have formulated your motive and collected
your data, the next step is cleaning. Yes, it is! Data cleaning is a
data scientist's favorite thing to do. Data cleaning is all about
handling missing values and removing redundant, unnecessary, and
duplicate data from your collection. There are various tools for this,
usually driven by programming in either R or Python. It is entirely up
to you to choose one of them, and different scientists have different
opinions on which to choose. For the statistical part, R is often
preferred over Python, as it has more than 12,000 packages. Python is
used because it is fast and easily accessible, and with the help of
various packages we can perform the same tasks as in R.
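As a rough illustration of what cleaning means in practice, the sketch below drops records with missing values and removes duplicates from a small list. The records and the function name clean_records are made up for this example; real projects usually rely on a library such as pandas for this work.

```python
# Hypothetical raw records of the form (name, age); None marks a missing value.
raw_records = [
    ("Asha", 21),
    ("Ravi", None),   # missing age
    ("Asha", 21),     # duplicate of the first record
    ("Kiran", 23),
]


def clean_records(records):
    """Drop records with missing fields and remove duplicates, keeping order."""
    seen = set()
    cleaned = []
    for record in records:
        if any(field is None for field in record):
            continue  # skip records with missing data
        if record in seen:
            continue  # skip records we have already kept
        seen.add(record)
        cleaned.append(record)
    return cleaned


cleaned = clean_records(raw_records)
print(cleaned)  # [('Asha', 21), ('Kiran', 23)]
```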
Data Analysis and Exploration: This is one of the prime activities in
data science, and it is time to bring out your inner Holmes. It is
about analyzing the structure of the data, finding hidden patterns in
it, studying behaviors, visualizing the effect of one variable on
others, and then drawing conclusions. We can explore the data with
graphs drawn using libraries in any programming language. In R,
ggplot2 is one of the best-known plotting libraries, while Matplotlib
plays that role in Python.
Data Modeling: Once you have finished studying the data through
visualization, you must start building a hypothesis model that can
yield good predictions in the future. Here, you must choose an
algorithm that best fits your problem. There are different kinds of
algorithms, from regression to classification, SVM (Support Vector
Machines), clustering, etc. Your model can be a machine learning
algorithm. You train your model with the train data and then test it
with the test data. There are various methods to do so. One of them is
the K-fold method, where you split the whole data set into K parts
(folds). In each round, one fold serves as the test data and the
remaining folds as the train data, so every fold is used for testing
exactly once.
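The K-fold idea can be sketched in plain Python. The function name k_fold_splits and the sample data are hypothetical and used only for illustration; in practice a library such as scikit-learn provides this functionality. The sketch shows only how the data is divided, not how a model is trained.

```python
def k_fold_splits(data, k):
    """Yield (train, test) pairs so that each fold is the test set once."""
    fold_size, remainder = divmod(len(data), k)
    start = 0
    for i in range(k):
        # The first `remainder` folds receive one extra item.
        end = start + fold_size + (1 if i < remainder else 0)
        test = data[start:end]
        train = data[:start] + data[end:]
        yield train, test
        start = end


samples = list(range(10))  # made-up data: the numbers 0 to 9
splits = list(k_fold_splits(samples, k=5))
for train, test in splits:
    print("train:", train, "test:", test)
```

With 10 samples and K = 5, each round tests on 2 samples and trains on the other 8.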
Optimization and Deployment: You have followed every step and built
a model that you feel is the best fit. But how can you decide how well
your model is performing? This is where optimization comes in. You
test the model on your data and find out how well it performs by
checking its accuracy. In short, you check the efficiency of the model
and try to optimize it for more accurate predictions. Deployment deals
with launching your model so that people outside can benefit from it.
You can also obtain feedback from organizations and people to learn
their needs and then improve your model further.
1.2 What Is Python?
Python is a general-purpose, dynamic, high-level, interpreted
programming language. It supports the object-oriented approach to
developing applications. It is simple and easy to learn and provides
many high-level data structures.
Python is easy to learn yet powerful and versatile, which makes it
attractive for application development.
Python's syntax and dynamic typing, together with its interpreted
nature, make it an ideal language for scripting and rapid application
development.
Python supports multiple programming paradigms, including
object-oriented, imperative, and functional or procedural styles.
Python is not intended for one particular area, such as web
programming. It is known as a multipurpose programming language
because it can be used for web, enterprise, 3D CAD, and other
applications.
We do not need to declare the data type of a variable because Python
is dynamically typed, so we can simply write a=10 to bind the name a
to an integer value.
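A minimal sketch of dynamic typing: the type belongs to the value, not to the name, so the same name can later refer to a value of a different type. The variable names type_before and type_after are only illustrative.

```python
a = 10                  # a refers to an int; no type declaration is needed
type_before = type(a)

a = "ten"               # the same name can later refer to a str
type_after = type(a)

print(type_before)      # <class 'int'>
print(type_after)       # <class 'str'>
```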
Python makes development and debugging fast because there is no
separate compilation step, so the edit-test-debug cycle is very fast.
Why Python
Python is an open-source, interpreted, high-level language that
provides good support for object-oriented programming. It is one of
the best languages used by data scientists for data science projects
and applications. Python provides great functionality for working with
mathematics, statistics, and scientific functions, and it offers
excellent libraries for data science applications.
One of the main reasons why Python is widely used in the scientific
and research communities is its ease of use and simple syntax, which
make it easy to adopt for people who do not have an engineering
background. It is also well suited to quick prototyping.
According to engineers from academia and industry, the deep learning
frameworks available with Python APIs, together with the scientific
packages, have made Python incredibly productive and versatile. Deep
learning frameworks for Python have evolved a great deal and continue
to improve rapidly.
In terms of application areas, ML scientists prefer Python as well.
For areas like fraud detection algorithms and network security,
developers have leaned towards Java, while for applications like
natural language processing (NLP) and sentiment analysis, developers
have opted for Python, because it provides a large collection of
libraries that help to solve complex business problems easily and to
build strong systems and data applications.
1.3 Essential Python
Python is a general-purpose programming language that was designed
to be compact, easy to use, and easy to extend, and it has a large
standard library and a very active development community. As well as
being a general-purpose programming language, Python is widely used as
a scripting language, as a glue language, for data science and machine
learning, and for software testing.
Whether you work in artificial intelligence or finance, or are
pursuing a career in web development or data science, Python is one of
the most important skills you can learn. Python's simple syntax is
especially suited to desktop, web, and business applications. Python's
design philosophy emphasizes readability and usability. Python was
developed on the premise that there should be only one way (and
preferably, one obvious way) to do things, a philosophy that resulted
in a strict level of code standardization. The core programming
language is quite small, while the standard library is large. In fact,
Python's large library is one of its greatest benefits, providing
programmers with tools suited to a variety of tasks.
Essential Python is intended for professionals working in electronic
systems hardware and embedded software development flows.
1.4 Python Libraries
Most commonly used libraries for data science:
NumPy: NumPy is a Python library that provides mathematical
functions for handling large multidimensional arrays. It provides
various methods and functions for arrays, matrices, and linear
algebra.
NumPy stands for Numerical Python. It provides many useful features
for operations on n-dimensional arrays and matrices in Python. The
library vectorizes mathematical operations on the NumPy array type,
which improves performance and speeds up execution. It is very easy to
work with large multidimensional arrays and matrices using NumPy.
Pandas: Pandas is one of the most popular Python libraries for data
manipulation and analysis. Pandas provides useful functions for
manipulating large amounts of structured data and makes analysis easy
to perform. It provides data structures and operations for
manipulating numerical tables and time series data. Pandas is a
perfect tool for data wrangling. It is designed for quick and easy
data manipulation, aggregation, and visualization. There are two main
data structures in Pandas:
Series – It handles and stores one-dimensional data.
DataFrame – It handles and stores two-dimensional data.
Matplotlib: Matplotlib is another useful Python library, used for
data visualization. Descriptive analysis and data visualization are
very important for any organization. Matplotlib provides various
methods to visualize data effectively. It lets you quickly make line
graphs, pie charts, histograms, and other professional-grade figures,
and every aspect of a figure can be customized. Matplotlib also has
interactive features such as zooming and panning, and it can save a
graph in common graphics formats.
SciPy: SciPy is another popular Python library for data science and
scientific computing. SciPy provides great functionality for
scientific mathematics and computing. SciPy contains sub-modules for
optimization, linear algebra, integration, interpolation, special
functions, FFT, signal and image processing, ODE solvers, statistics,
and other tasks common in science and engineering.
Scikit-learn: Scikit-learn (sklearn) is a Python library for machine
learning. It provides various algorithms and functions that are used
in machine learning. Scikit-learn is built on NumPy, SciPy, and
Matplotlib. It provides simple tools for data mining and data
analysis, and it offers a set of common machine learning algorithms
through a consistent interface. Scikit-learn helps to quickly
implement popular algorithms on datasets and solve real-world
problems.
Introduction
Python is a widely used general-purpose, high-level programming
language. It was created by Guido van Rossum in 1991 and further
developed by the Python Software Foundation. It was designed with an
emphasis on code readability, and its syntax allows programmers to
express their concepts in fewer lines of code. Python is a programming
language that lets you work quickly and integrate systems more
efficiently.
There are two major Python versions: Python 2 and Python 3. The two
are quite different.
1.5 Features
The following are some useful features of the Python language:
• It uses an elegant syntax, so programs are easier to read.
• It is a simple, accessible language, which makes it easy to get a
program working.
• It has a large standard library and strong community support.
• The interactive mode of Python makes it simple to test code.
• It is simple to extend Python by adding new modules implemented in
a compiled language such as C or C++.
• Python is an expressive language that can be embedded into
applications to offer a programmable interface.
• It allows developers to run the code anywhere, including Windows,
Mac OS X, UNIX, and Linux.
• It is free software in two senses: it does not cost anything to
download or use Python, or to include it in an application.
1.6 Identifiers
An identifier is a name used to identify a variable, function, class,
module, etc. An identifier is a combination of letters, digits, and
underscores. It must start with a letter or an underscore; digits may
appear only after the first character. The allowed characters are A-Z,
a-z, the underscore ( _ ), and the digits 0-9. Special characters
( #, @, $, %, ! ) cannot be used in identifiers.
Examples of valid identifiers:
var1
_var1
_1_var
var_1
Examples of invalid identifiers:
!var1
1var
1_var
var#1
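The rules above can be checked with the built-in str.isidentifier() method. The sketch below also rejects reserved words using the standard keyword module; the helper name is_valid_name and the extra candidate "for" are additions made for this illustration.

```python
import keyword

candidates = ["var1", "_var1", "_1_var", "var_1",
              "!var1", "1var", "1_var", "var#1", "for"]


def is_valid_name(name):
    """True if name follows the identifier rules and is not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)


results = {name: is_valid_name(name) for name in candidates}
for name, ok in results.items():
    print(f"{name!r}: {'valid' if ok else 'invalid'}")
```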
1.7 Reserved words
Python Keywords

Keyword   Description
and       Logical operator; the result is true only if both operands are true.
or        Logical operator; the result is true if at least one operand is true.
not       Logical operator; returns True if the operand is false, otherwise False.
if        Used to make a conditional statement.
elif      Used with an if statement; its block is executed if the previous conditions were not true and its own condition is true.
else      Used with if and elif; its block is executed if none of the preceding conditions is true.
for       Used to create a for loop.
while     Used to create a while loop.
break     Used to terminate the loop.
as        Used to create an alias.
def       Used to define a function.
lambda    Used to define an anonymous function.
pass      A null statement; it does nothing.
return    Returns a value and exits the function.
True      A Boolean value.
False     Also a Boolean value.
try       Starts a try-except statement.
with      Used to simplify exception handling and resource cleanup around a block of code.
assert    A statement used for debugging; it checks that a condition holds.
class     Used to define a class.
continue  Skips to the next iteration of a loop.
del       Deletes a reference to an object.
except    Used with try; specifies what to do when an exception occurs.
finally   Used with try; its block is executed whether or not an exception occurs.
from      Used to import specific parts of a module.
global    Declares a global variable.
import    Used to import a module.
in        Checks whether a value is present in a list, tuple, or other container.
is        Checks whether two variables refer to the same object (identity, not equality).
None      A special constant that denotes a null value or void. It is important to remember that 0 and empty containers (e.g. an empty list) are not equal to None.
nonlocal  Declares a non-local variable.
raise     Raises an exception.
yield     Pauses a generator function and hands a value back to the caller.
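The reserved words of the running interpreter can be listed with the standard keyword module. The sketch below is only an illustration; the exact number of keywords depends on the Python version, so no count is shown.

```python
import keyword

# keyword.kwlist holds every reserved word of the running interpreter.
print(keyword.kwlist)
print(len(keyword.kwlist))          # the count depends on the Python version

# Keywords are case-sensitive: True is a keyword, true is not.
print(keyword.iskeyword("True"))    # True
print(keyword.iskeyword("true"))    # False
print(keyword.iskeyword("lambda"))  # True
```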
1.8 Indentation
Indentation is a very important concept in Python: if the code is
not indented properly, you will see an IndentationError and the code
will not run.
Python Indentation
Python indentation refers to the white space added before a
statement to place it in a particular block of code. In other words,
all statements indented by the same amount belong to the same code
block.
Example 1
The lines print('Welcome...') and print('retype the Good Bye.') are
two separate code blocks. Both blocks in this if-statement are
indented four spaces. The final print('All set !') is not indented, so
it does not belong to the else block.
site = 'Hi'
if site == 'Hi':
    print('Welcome...')
else:
    print('retype the Good Bye.')
print('All set !')
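To see what happens when the indentation is missing, the sketch below passes a badly indented snippet to the built-in compile() function and catches the resulting IndentationError. The snippet and variable names are made up for illustration, and the exact wording of the error message varies between Python versions.

```python
# Source text whose if-block body is NOT indented (deliberately wrong).
bad_source = "if True:\nprint('hello')\n"

caught = False
message = ""
try:
    compile(bad_source, "<example>", "exec")
except IndentationError as err:
    caught = True
    message = err.msg  # wording differs slightly between versions

print(caught)   # True
print(message)
```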
1.9 Comments:
Comments in Python are lines in the code that are ignored by the
interpreter during the execution of the program. Comments enhance the
readability of the code and help programmers understand it.
Types of Comments in Python:
There are three main kinds of comments in Python. They are:
Single-Line Comments: A Python single-line comment starts with the
hash symbol (#) and lasts until the end of the line. If the comment
exceeds one line, put a hash symbol on the next line and continue the
comment. Single-line comments are useful for supplying short
explanations of variables, function declarations, and expressions.
# Python program to demonstrate comments
Multi-Line Comments: Python has no dedicated syntax for multiline
comments. However, there are different ways to write them.
Using Multiple Hashtags (#)
We can use multiple hashtags (#) to write multiline comments in
Python. Each line is treated as a separate single-line comment.
Example: Multiline comments using multiple hashtags (#)
# Python program to demonstrate
# multiline comments
Using String Literals: Python ignores string literals that are not
assigned to a variable, so we can use such string literals as comments.
Example: """ Python program to demonstrate
multiline comments"""
Python Docstring:
A Python docstring is a string literal, usually written in triple
quotes, that appears as the first statement of a function, module,
class, or method. It is used to attach documentation to the object it
belongs to and describes what that object does. In Python, the
docstring is made available via the __doc__ attribute.
Example:
def multiply(a, b):
    """Multiplies the value of a and b"""
    return a * b

# Print the docstring of the multiply function
print(multiply.__doc__)
Output:
Multiplies the value of a and b
1.10 Python Data Types
Variables can hold values, and every value has a data type. Python
is a dynamically typed language; hence we do not need to define the
type of a variable while declaring it. The interpreter implicitly
binds the value with its type.
a = 5
The variable a holds the integer value five, and we did not define
its type. The Python interpreter automatically treats a as an integer.
Python lets us check the type of a variable used in the program. The
type() function returns the type of the object passed to it.
Consider the following example, which defines values of different
data types and checks their types.
a = 10
b = "Hi Python"
c = 10.5
print(type(a))
print(type(b))
print(type(c))
Output:
<class 'int'>
<class 'str'>
<class 'float'>
Standard data types
A variable can hold different types of values. For example, a
person's name must be stored as a string, whereas their ID must be
stored as an integer.
Python provides various standard data types, each of which defines
how its values are stored. The data types defined in Python are given
below.
1. Numbers
2. Sequence Type
3. Boolean
4. Set
5. Dictionary
This section gives a brief introduction to the above data types. Each
of them is discussed in more detail later.
Numbers
The Numbers data type stores numeric values. Integer, float, and
complex values belong to it. Python provides the type() function to
find the data type of a variable. Similarly, the isinstance() function
checks whether an object belongs to a particular class.
Python creates Number objects when a number is assigned to a
variable. For example:
a = 5
print("The type of a", type(a))
b = 40.5
print("The type of b", type(b))
c = 1+3j
print("The type of c", type(c))
print("c is a complex number:", isinstance(c, complex))
Output:
The type of a <class 'int'>
The type of b <class 'float'>
The type of c <class 'complex'>
c is a complex number: True
Python supports three types of numeric data.
1. int - An integer value can be of any length, such as 10, 2, 29,
-20, -150, etc. Python has no restriction on the length of an integer.
Its value belongs to the class int.
2. float - A float stores floating-point numbers like 1.9, 9.902,
15.2, etc. It is accurate to about 15 significant decimal digits.
3. complex - A complex number contains an ordered pair, written
x + yj in Python, where x and y denote the real and imaginary parts,
respectively. Examples of complex numbers are 2.14j, 2.0 + 2.3j, etc.
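The three numeric types can be explored with a short sketch. The variable names are illustrative; math.isclose() from the standard library is used here as one common way to compare floats, because floating-point arithmetic is only approximately exact.

```python
import math

big = 2 ** 100                     # ints grow as large as needed
print(big)                         # 1267650600228229401496703205376

total = 0.1 + 0.2                  # floats have limited precision
print(total)                       # 0.30000000000000004
print(total == 0.3)                # False
print(math.isclose(total, 0.3))    # True

z = 2.0 + 2.3j                     # complex number: real part + imaginary part
print(z.real, z.imag)              # 2.0 2.3
```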
Sequence Type
String
A string is a sequence of characters enclosed in quotation marks. In
Python, we can use single, double, or triple quotes to define a
string.
String handling in Python is straightforward, since Python provides
built-in functions and operators for working with strings.
The + operator concatenates two strings: the operation
"hello"+" python" returns "hello python".
The * operator is known as the repetition operator: the operation
"Python" *2 returns 'PythonPython'.
The following example illustrates strings in Python.
Example - 1
str1 = "string using double quotes"
print(str1)
s = '''A multiline
string'''
print(s)
Output:
string using double quotes
A multiline
string
Consider the following example of string handling.
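A minimal sketch of string handling with the + and * operators described above, together with slicing and a few built-in string methods. The variable names are illustrative.

```python
greeting = "hello" + " python"           # concatenation
print(greeting)                          # hello python

echo = "Python" * 2                      # repetition (no space is inserted)
print(echo)                              # PythonPython

print(greeting[0:5])                     # hello
print(greeting.upper())                  # HELLO PYTHON
print(greeting.replace("hello", "hi"))   # hi python
print("py" in greeting)                  # True
print(len(greeting))                     # 12
```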
List
Python lists are similar to arrays in C. However, a list can contain
data of different types. The items stored in a list are separated by
commas (,) and enclosed within square brackets [].
We can use the slice operator [:] to access the data of a list. The
concatenation operator (+) and the repetition operator (*) work with
lists in the same way as they work with strings.
Consider the following example.
list1 = [1, "hi", "Python", 2]
# Checking type of given list
print(type(list1))

# Printing the list1
print(list1)

# List slicing
print(list1[3:])

# List slicing
print(list1[0:2])

# List concatenation using + operator
print(list1 + list1)

# List repetition using * operator
print(list1 * 3)
Output:
<class 'list'>
[1, 'hi', 'Python', 2]
[2]
[1, 'hi']
[1, 'hi', 'Python', 2, 1, 'hi', 'Python', 2]
[1, 'hi', 'Python', 2, 1, 'hi', 'Python', 2, 1, 'hi', 'Python', 2]
Tuple
A tuple is similar to a list in many ways. Like lists, tuples can
contain a collection of items of different data types. The items of a
tuple are separated by commas (,) and enclosed in parentheses ().
A tuple is a read-only data structure: we cannot modify the size or
the values of the items of a tuple.
Let's see a simple example of a tuple.
tup = ("hi", "Python", 2)
# Checking type of tup
print(type(tup))

# Printing the tuple
print(tup)

# Tuple slicing
print(tup[1:])
print(tup[0:1])

# Tuple concatenation using + operator
print(tup + tup)

# Tuple repetition using * operator
print(tup * 3)

# Assigning a value to an item of tup. It will raise an error.
tup[2] = "hi"
Output:
<class 'tuple'>
('hi', 'Python', 2)
('Python', 2)
('hi',)
('hi', 'Python', 2, 'hi', 'Python', 2)
('hi', 'Python', 2, 'hi', 'Python', 2, 'hi', 'Python', 2)
Traceback (most recent call last):
File "main.py", line 19, in <module>
tup[2] = "hi"
TypeError: 'tuple' object does not support item assignment
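The contrast between a mutable list and a read-only tuple can be sketched as follows. The variable names are illustrative, and the error is caught with try/except so that the program keeps running.

```python
items_list = ["hi", "Python", 2]
items_tuple = ("hi", "Python", 2)

items_list[2] = "changed"        # lists are mutable: items can be replaced
items_list.append(3.5)           # and the list can grow

error_message = ""
try:
    items_tuple[2] = "changed"   # tuples are immutable: this raises TypeError
except TypeError as err:
    error_message = str(err)

print(items_list)      # ['hi', 'Python', 'changed', 3.5]
print(items_tuple)     # ('hi', 'Python', 2)
print(error_message)   # 'tuple' object does not support item assignment

# Tuples still offer read-only methods such as count() and index().
print(items_tuple.count("hi"), items_tuple.index("Python"))   # 1 1
```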
Dictionary
A dictionary is a collection of key-value pairs. It is like an
associative array or a hash table, where each key stores a specific
value. A key can be any immutable (hashable) object, such as a number,
string, or tuple, whereas a value can be an arbitrary Python object.
Since Python 3.7, a dictionary preserves the order in which its items
were inserted.
The items in a dictionary are separated by commas (,) and enclosed
in curly braces {}.
d = {1: 'Jimmy', 2: 'Alex', 3: 'john', 4: 'mike'}

# Printing dictionary
print(d)

# Accessing values using keys
print("1st name is " + d[1])
print("4th name is " + d[4])

print(d.keys())
print(d.values())
Output:
{1: 'Jimmy', 2: 'Alex', 3: 'john', 4: 'mike'}
1st name is Jimmy
4th name is mike
dict_keys([1, 2, 3, 4])
dict_values(['Jimmy', 'Alex', 'john', 'mike'])
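A short sketch of common dictionary methods for adding, updating, looking up, and removing items. It reuses the names from the example above; the extra name 'sara' and the variables missing and removed are made up for illustration.

```python
d = {1: "Jimmy", 2: "Alex", 3: "john", 4: "mike"}

d[5] = "sara"                   # add a new key-value pair
d[3] = "John"                   # overwrite the value of an existing key
missing = d.get(9, "unknown")   # get() returns a default instead of raising KeyError
removed = d.pop(2)              # remove key 2 and return its value

for key, value in d.items():    # iterate over pairs in insertion order
    print(key, value)

print(missing, removed)         # unknown Alex
print(3 in d, 2 in d)           # True False
```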
Boolean
The Boolean type provides two built-in values, True and False. These
values are used to determine whether a given statement is true or
false. The type is denoted by the class bool. In a Boolean context,
any non-zero number or non-empty object is treated as true, whereas 0,
None, and empty containers are treated as false. The names are
case-sensitive: lowercase true and false are not defined. Consider the
following example.
# Python program to check the boolean type
print(type(True))
print(type(False))
print(false)
Output:
<class 'bool'>
<class 'bool'>
NameError: name 'false' is not defined
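The built-in bool() function shows how Python treats a value in a Boolean context. In this sketch, note that the string "F" is true because it is non-empty; the list of sample values is made up for illustration.

```python
values = [0, 1, -3, "", "F", [], [0], None]

# bool() converts any object to True or False.
truthiness = [bool(v) for v in values]
for v, t in zip(values, truthiness):
    print(repr(v), "->", t)

# bool is a subclass of int, so True behaves like 1 in arithmetic.
print(True + True)   # 2
```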
Set
A Python set is an unordered collection of items. It is iterable,
mutable (it can be modified after creation), and has unique elements.
Because the order of the elements is undefined, a set may return its
elements in a different sequence from the one in which they were
added. A set is created by using the built-in function set(), or by
passing a sequence of elements in curly braces, separated by commas.
It can contain values of various types.
# Creating an empty set
set1 = set()
set2 = {'James', 2, 3, 'Python'}
# Printing set value
print(set2)
# Adding element to the set
set2.add(10)
print(set2)
# Removing element from the set
set2.remove(2)
print(set2)
Output (the order of the elements may vary):
{3, 'Python', 'James', 2}
{'Python', 'James', 3, 2, 10}
{'Python', 'James', 3, 10}
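Beyond add() and remove(), sets support the usual mathematical set operations. The sketch below uses made-up sets a and b and prints sorted results, since the display order of a set is not guaranteed.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b             # elements in either set
intersection = a & b      # elements in both sets
difference = a - b        # elements in a but not in b

unique = set([1, 1, 2, 2, 3])   # duplicates are dropped

a.discard(99)             # discard() does nothing if the element is absent,
                          # whereas remove() would raise KeyError

print(sorted(union))          # [1, 2, 3, 4, 5]
print(sorted(intersection))   # [3, 4]
print(sorted(difference))     # [1, 2]
print(sorted(unique))         # [1, 2, 3]
```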
1.11 Type Conversion
Python defines type conversion functions to directly convert one
data type to another, which is useful in day-to-day and competitive
programming. This section describes some of these conversion
functions.
There are two types of type conversion in Python:
1. Implicit Type Conversion
2. Explicit Type Conversion
Let's discuss them in detail.
Implicit Type Conversion
In implicit type conversion, the Python interpreter automatically
converts one data type to another without any user involvement.
Example:
x = 10
print("x is of type:",type(x))
y = 10.6
print("y is of type:",type(y))
z=x+y
print(z)
print("z is of type:",type(z))
Output:
x is of type: <class 'int'>
y is of type: <class 'float'>
20.6
z is of type: <class 'float'>
As we can see, the data type of 'z' was automatically changed to
float, because x is an integer and y is a float. The integer is
converted to a float, rather than the float to an integer, because of
type promotion: the operation is performed by converting the data to
the wider data type so that no information is lost. This is a simple
case of implicit type conversion in Python.
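A few more cases of implicit conversion, sketched under the assumption of a Python 3 interpreter. Promotion happens only between numeric types; a string and a number are never combined implicitly, which the last part demonstrates by catching the resulting TypeError. The variable names are illustrative.

```python
total = True + 2             # bool is promoted to int
ratio = 7 / 2                # dividing two ints with / gives a float
mixed = 3 + 4.0 + (1 + 0j)   # int -> float -> complex

print(total, type(total))    # 3 <class 'int'>
print(ratio, type(ratio))    # 3.5 <class 'float'>
print(mixed, type(mixed))    # (8+0j) <class 'complex'>

message = ""
try:
    result = "10" + 5        # no implicit conversion between str and int
except TypeError as err:
    message = str(err)
print(message)
```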
Explicit Type Conversion
In explicit type conversion, the data type is changed manually by
the user as required. With explicit type conversion, there is a risk
of data loss, since we are forcing an expression into a specific data
type. Various forms of explicit type conversion are explained below:
1. int(a, base): This function converts a number or a string to an
integer. 'base' specifies the base of the number when a is a string.
2. float(): This function converts a number or a numeric string to a
floating-point number.
# Python code to demonstrate Type conversion
# using int(), float()
# initializing string
s = "10010"
# printing string converting to int base 2
c = int(s,2)
print ("After converting to integer base 2 : ", end="")
print (c)
# printing string converting to float
e = float(s)
print ("After converting to float : ", end="")
print (e)
Output:
After converting to integer base 2 : 18
After converting to float : 10010.0
3. ord() : This function converts a character to its integer code point.
4. hex() : This function converts an integer to a hexadecimal string.
5. oct() : This function converts an integer to an octal string.
# using ord(), hex(), oct()
# initializing character
s = '4'
# printing character converting to integer
c = ord(s)
print ("After converting character to integer : ",end="")
print (c)
# printing integer converting to hexadecimal string
c = hex(56)
print ("After converting 56 to hexadecimal string : ",end="")
print (c)
# printing integer converting to octal string
c = oct(56)
print ("After converting 56 to octal string : ",end="")
print (c)
Output:
After converting character to integer : 52
After converting 56 to hexadecimal string : 0x38
After converting 56 to octal string : 0o70
6. tuple() : This function converts an iterable to a tuple.
7. set() : This function converts an iterable to a set.
8. list() : This function converts an iterable to a list.
# using tuple(), set(), list()
# initializing string
s = 'geeks'
# printing string converting to tuple
c = tuple(s)
print ("After converting string to tuple : ",end="")
print (c)
# printing string converting to set
c = set(s)
print ("After converting string to set : ",end="")
print (c)
# printing string converting to list
c = list(s)
print ("After converting string to list : ",end="")
print (c)
Output:
After converting string to tuple : ('g', 'e', 'e', 'k', 's')
After converting string to set : {'k', 'e', 's', 'g'}
After converting string to list : ['g', 'e', 'e', 'k', 's']
9. dict() : This function converts a sequence of (key, value) pairs, such as a tuple of tuples, into a dictionary.
10. str() : This function converts an object, such as an integer, into a string.
11. complex(real,imag) : This function builds a complex number from a real part and an imaginary part.
# using dict(), complex(), str()
# initializing integers
a=1
b=2
# initializing tuple
tup = (('a', 1) ,('f', 2), ('g', 3))
# printing integers converting to complex number
c = complex(a, b)
print ("After converting integer to complex number : ",end="")
print (c)
# printing integer converting to string
c = str(a)
print ("After converting integer to string : ",end="")
print (c)
# printing tuple converting to dictionary
c = dict(tup)
print ("After converting tuple to dictionary : ",end="")
print (c)
Output:
After converting integer to complex number : (1+2j)
After converting integer to string : 1
After converting tuple to dictionary : {'a': 1, 'f': 2, 'g': 3}
12. chr(number): This function converts an integer code point to its corresponding character (for values 0 to 127 this is the ASCII character).
Python3
# Convert ASCII value to characters
a = chr(76)
b = chr(77)
print(a)
print(b)
Output:
L
M
1.12 Operators:
Python operators are used to perform operations on values and
variables. They are standard symbols used for logical and arithmetic
operations. This section looks at the different types of Python
operators.
OPERATOR: A special symbol that performs an operation, e.g. +, *, /.
OPERAND: The value on which the operator is applied.
Arithmetic Operators:
Arithmetic operators are used to perform mathematical
operations like addition, subtraction, multiplication, and division.
In Python 3.x the / operator always returns a floating-point result,
whereas in Python 2.x dividing two integers produced an integer. To
obtain an integer (floored) quotient in Python 3.x, use the floor
division operator //.
Operator  Description                                                               Syntax
+         Addition: adds two operands                                               x + y
-         Subtraction: subtracts two operands                                       x - y
*         Multiplication: multiplies two operands                                   x * y
/         Division (float): divides the first operand by the second                 x / y
//        Division (floor): divides the first operand by the second                 x // y
%         Modulus: returns the remainder when the first operand is divided
          by the second                                                             x % y
**        Power: returns the first operand raised to the power of the second        x ** y
PRECEDENCE:
P – Parentheses
E – Exponentiation
M – Multiplication (multiplication and division have the same precedence)
D – Division
A – Addition (addition and subtraction have the same precedence)
S – Subtraction
The modulus operator helps us extract the last digit(s) of a
non-negative integer.
For example:
x % 10 -> yields the last digit
x % 100 -> yields the last two digits
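The idea above can be tried out with a short sketch. The value 98765 and the variable names are illustrative assumptions, and the example assumes a non-negative integer:

```python
# Illustrative sketch: peeling digits off a non-negative integer
x = 98765

last_digit = x % 10    # remainder after dividing by 10
last_two = x % 100     # remainder after dividing by 100
remaining = x // 10    # floor division drops the last digit

print(last_digit)  # 5
print(last_two)    # 65
print(remaining)   # 9876
```

Combining % 10 and // 10 in a loop is a common way to walk through the digits of a number from right to left.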
Example: Arithmetic operators in Python

Python3
# Examples of Arithmetic Operator

a=9
b=4
# Addition of numbers
add = a + b
# Subtraction of numbers
sub = a - b
# Multiplication of number
mul = a * b
# Division(float) of number
div1 = a / b
# Division(floor) of number
div2 = a // b
# Modulo of both number

mod = a % b
# Power
p = a ** b
# print results
print(add)
print(sub)
print(mul)
print(div1)
print(div2)
print(mod)
print(p)
Output:
13
5
36
2.25
2
1
6561
Comparison or relational operators compare values. They return
either True or False according to the condition.
Operator  Description                                                               Syntax
>         Greater than: True if the left operand is greater than the right          x > y
<         Less than: True if the left operand is less than the right                x < y
==        Equal to: True if both operands are equal                                 x == y
!=        Not equal to: True if the operands are not equal                          x != y
>=        Greater than or equal to: True if the left operand is greater than
          or equal to the right                                                     x >= y
<=        Less than or equal to: True if the left operand is less than or
          equal to the right                                                        x <= y
is        True if x is the same object as y                                         x is y
is not    True if x is not the same object as y                                     x is not y
Note: = is the assignment operator, while == is the comparison operator.
Example: Comparison Operators in Python

Python3
# Examples of Relational Operators
a = 13
b = 33
# a > b is False
print(a > b)
# a < b is True
print(a < b)
# a == b is False
print(a == b)
# a != b is True
print(a != b)
# a >= b is False
print(a >= b)
# a <= b is True
print(a <= b)
Output:
False
True
False
True
False
True
Logical Operators:
Logical operators perform logical AND, logical OR, and logical
NOT operations. They are used to combine conditional statements.
Operator  Description                                            Syntax
and       Logical AND: True if both the operands are true        x and y
or        Logical OR: True if either of the operands is true     x or y
not       Logical NOT: True if the operand is false              not x
Example: Logical Operators in Python
Python3
# Examples of Logical Operator
a = True
b = False
# Print a and b is False
print(a and b)
# Print a or b is True
print(a or b)
# Print not a is False
print(not a)
Output:
False
True
False
Bitwise Operators:
Bitwise operators act on bits and perform bit-by-bit operations.
They operate on the binary representation of integers.
Operator  Description            Syntax
&         Bitwise AND            x & y
|         Bitwise OR             x | y
~         Bitwise NOT            ~x
^         Bitwise XOR            x ^ y
>>        Bitwise right shift    x >> y
<<        Bitwise left shift     x << y
Example: Bitwise Operators in Python
Python3
# Examples of Bitwise operators
a = 10
b=4
# Print bitwise AND operation
print(a & b)
# Print bitwise OR operation
print(a | b)
# Print bitwise NOT operation
print(~a)
# print bitwise XOR operation
print(a ^ b)
# print bitwise right shift operation
print(a >> 2)
# print bitwise left shift operation
print(a << 2)
Output:
0
14
-11
14
2
40
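The results above are easier to follow when the operands are written in binary. The following sketch reuses a = 10 and b = 4 from the example and prints 4-bit binary strings with the built-in format() function. The 4-bit width is an assumption chosen only because both values fit in it:

```python
# Illustrative sketch: the same operands viewed as 4-bit binary strings
a = 10  # 1010 in binary
b = 4   # 0100 in binary

and_bits = format(a & b, "04b")  # a bit is 1 only if it is 1 in both operands
or_bits = format(a | b, "04b")   # a bit is 1 if it is 1 in either operand
xor_bits = format(a ^ b, "04b")  # a bit is 1 if it is 1 in exactly one operand

print(and_bits)  # 0000
print(or_bits)   # 1110
print(xor_bits)  # 1110

# Shifts correspond to multiplying or floor-dividing by powers of two
print(a >> 2 == a // 4)  # True
print(a << 2 == a * 4)   # True

# For Python integers, ~x is always equal to -x - 1
print(~a == -a - 1)      # True
```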
Assignment Operators:
Assignment operators are used to assign values to variables.
Operator  Description                                                          Syntax     Same as
=         Assign the value of the right-side expression to the left-side
          operand                                                              x = y + z
+=        Add AND: add the right operand to the left operand and then
          assign to the left operand                                           a += b     a = a + b
-=        Subtract AND: subtract the right operand from the left operand
          and then assign to the left operand                                  a -= b     a = a - b
*=        Multiply AND: multiply the right operand with the left operand
          and then assign to the left operand                                  a *= b     a = a * b
/=        Divide AND: divide the left operand by the right operand and
          then assign to the left operand                                      a /= b     a = a / b
%=        Modulus AND: take the modulus using the left and right operands
          and assign the result to the left operand                            a %= b     a = a % b
//=       Divide (floor) AND: divide the left operand by the right operand
          and then assign the floored value to the left operand                a //= b    a = a // b
**=       Exponent AND: raise the left operand to the power of the right
          operand and assign the value to the left operand                     a **= b    a = a ** b
&=        Perform bitwise AND on the operands and assign the value to the
          left operand                                                         a &= b     a = a & b
|=        Perform bitwise OR on the operands and assign the value to the
          left operand                                                         a |= b     a = a | b
^=        Perform bitwise XOR on the operands and assign the value to the
          left operand                                                         a ^= b     a = a ^ b
>>=       Perform bitwise right shift on the operands and assign the value
          to the left operand                                                  a >>= b    a = a >> b
<<=       Perform bitwise left shift on the operands and assign the value
          to the left operand                                                  a <<= b    a = a << b
Example: Assignment Operators in Python
Python3
# Examples of Assignment Operators
a = 10
# Assign value
b=a
print(b)
# Add and assign value
b += a
print(b)
# Subtract and assign value
b -= a
print(b)
# multiply and assign
b *= a
print(b)
# bitwise left shift and assign
b <<= a
print(b)
Output:
10
20
10
100
102400
Identity Operators:
is and is not are the identity operators. Both are used to check
whether two names refer to the same object in memory. Two variables that
are equal are not necessarily identical.
is       True if the operands are identical
is not   True if the operands are not identical
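The difference between equality (==) and identity (is) shows up most clearly with mutable objects such as lists. The following is a minimal sketch; the variable names are illustrative only:

```python
# Illustrative sketch: equal values versus the same object
first = [1, 2, 3]
second = [1, 2, 3]   # a separate list with the same contents
alias = first        # a second name for the first list

print(first == second)  # True: the contents are equal
print(first is second)  # False: they are two different objects
print(first is alias)   # True: both names refer to one object

# Because alias and first are the same object, a change made
# through one name is visible through the other
alias.append(4)
print(first)            # [1, 2, 3, 4]
print(second)           # [1, 2, 3]
```

For small integers and some strings, CPython may reuse the same object for equal values, so `is` can return True there. That is an implementation detail, which is why == should be used when comparing values.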
Example: Identity Operator

Python3
a = 10
b = 20
c=a
print(a is not b)
print(a is c)
Output:
True
True
Membership Operators:
in and not in are the membership operators. They are used to test
whether a value or variable is present in a sequence.
in       True if the value is found in the sequence
not in   True if the value is not found in the sequence
Example: Membership Operator
Python3
# Python program to illustrate
# the 'in' and 'not in' operators
x = 24
y = 20
# named 'numbers' so that the built-in list type is not shadowed
numbers = [10, 20, 30, 40, 50]
if x not in numbers:
    print("x is NOT present in given list")
else:
    print("x is present in given list")
if y in numbers:
    print("y is present in given list")
else:
    print("y is NOT present in given list")
Output:
x is NOT present in given list
y is present in given list
Precedence and Associativity of Operators:
Operator precedence and associativity determine the order in which
the operators in an expression are evaluated.
Operator Precedence
Precedence is used in an expression with more than one operator of
different precedence to determine which operation to perform first.
Example: Operator Precedence
Python3
# Examples of Operator Precedence
# Precedence of '+' & '*'
expr = 10 + 20 * 30
print(expr)
# Precedence of 'or' & 'and': 'and' binds more tightly than 'or'
name = "Alex"
age = 0
if name == "Alex" or name == "John" and age >= 2:
    print("Hello! Welcome.")
else:
    print("Good Bye!!")
Output:
610
Hello! Welcome.
Operator Associativity
If an expression contains two or more operators with the same
precedence, operator associativity determines the order of evaluation.
It can be either left to right or right to left.
Example: Operator Associativity
Python3
# Examples of Operator Associativity
# Left-right associativity
# 100 / 10 * 10 is calculated as
# (100 / 10) * 10 and not
# as 100 / (10 * 10)
print(100 / 10 * 10)
# Left-right associativity
# 5 - 2 + 3 is calculated as
# (5 - 2) + 3 and not
# as 5 - (2 + 3)
print(5 - 2 + 3)
# parentheses override the default left-right associativity
print(5 - (2 + 3))
# right-left associativity
# 2 ** 3 ** 2 is calculated as
# 2 ** (3 ** 2) and not
# as (2 ** 3) ** 2
print(2 ** 3 ** 2)
Output:
100.0
6
0
512
Ternary operators:
Ternary operators, also known as conditional expressions, evaluate
to one of two values depending on whether a condition is true or
false. They were added to Python in version 2.5. A conditional
expression tests a condition in a single line, replacing a multi-line
if-else and making the code more compact.
Syntax :
[on_true] if [expression] else [on_false]
Simple Method to use ternary operator:
Python
# Program to demonstrate conditional operator
a, b = 10, 20
# Store a in minimum if a < b, otherwise store b
# (the name 'minimum' avoids shadowing the built-in min function)
minimum = a if a < b else b
print(minimum)
Output:
10
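Division Operators:
(i) Float division:
The / operator always returns a float, even when both operands are
integers. For example:
>>> 5/5
1.0
>>> 10/2
5.0
>>> -10/2
-5.0
>>> 20.0/2
10.0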
(ii) Integer division (floor division):
The type of the quotient returned by this operator depends on the
operands: if either operand is a float, the result is a float. It is
called floor division because the result is rounded down toward negative
infinity, which matters when the quotient is negative. For example:
>>> 5//5
1
>>> 3//2
1
>>> 10//3
3
Consider the below statements in Python.
Python3
# A Python program to demonstrate the use of
# "//" for integers
print (5//2)
print (-5//2)
Output:
2
-3
The first output is as expected, but the second one may be surprising
to programmers coming from Java or C++, where integer division truncates
toward zero. In Python, the "//" operator performs floor division for
both integer and float arguments, whereas the division operator "/"
always returns a float value.
Note: The "//" operator returns the largest integer that is less than
or equal to the exact quotient. In the code above, 5/2 is 2.5, and the
largest integer not greater than 2.5 is 2, so 5//2 returns 2. Likewise,
-5/2 is -2.5, and the largest integer not greater than -2.5 is -3, so
-5//2 returns -3, whereas truncating toward zero would give -2.
Example:
Python3
# A Python program to demonstrate use of
# "/" for floating point numbers
print (5.0/2)
print (-5.0/2)
Output:
2.5
-2.5
The real floor division operator is “//”. It returns the floor value for
both integer and floating-point arguments.
Python3
# A Python program to demonstrate use of
# "//" for both integers and floating points
print (5//2)
print (-5//2)
print (5.0//2)
print (-5.0//2)
Output:
2
-3
2.0
-3.0
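Floor division is tied to the modulus operator: for any a and non-zero b, a == (a // b) * b + (a % b). The following sketch checks this relationship for a few sample pairs (the pairs are illustrative assumptions) and contrasts flooring with truncation toward zero:

```python
# Illustrative sketch: how //, % and divmod() fit together
pairs = [(5, 2), (-5, 2), (5, -2), (5.0, 2)]
for a, b in pairs:
    quotient = a // b    # floor division
    remainder = a % b    # the remainder takes the sign of the divisor
    # The two results always rebuild the original value
    assert quotient * b + remainder == a
    # divmod() returns both results in one call
    assert divmod(a, b) == (quotient, remainder)
    print(a, b, quotient, remainder)

# Truncation toward zero (as in C or Java) differs for negative quotients
truncated = int(-5 / 2)  # -2
floored = -5 // 2        # -3
print(truncated, floored)
```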
1.13 Decision Making

Decision making means anticipating the conditions that may occur
while a program executes and specifying the actions to take for each
condition.
Decision structures evaluate one or more expressions that produce TRUE
or FALSE as the outcome. You need to determine which statements to
execute if the outcome is TRUE and which to execute otherwise.
Python treats any non-zero and non-null value as TRUE. Zero, None,
and empty containers (such as an empty string or an empty list) are
treated as FALSE.
Python provides the following types of decision-making statements.
Sr.No. Statement & Description
1 if statements
An if statement consists of a boolean expression followed by
one or more statements.
2 if...else statements
An if statement can be followed by an optional else
statement, which executes when the boolean expression is
FALSE.
3 nested if statements
You can use one if or elif statement inside the body of
another if, elif, or else clause. (Python spells "else if" as elif.)
Let us go through each decision making briefly −
Single Statement Suites

If the suite of an if clause consists only of a single line, it may go on
the same line as the header statement.
Here is an example of a one-line if clause −
#!/usr/bin/python3
var = 100
if var == 100: print("Value of expression is 100")
print("Good bye!")
When the above code is executed, it produces the following result −
Value of expression is 100
Good bye!
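The three kinds of statement listed above can be sketched together. The function name describe and the sample values are illustrative assumptions, not part of the original notes:

```python
# Illustrative sketch: if, if...elif...else, and a nested if
def describe(value):
    """Return a short label for a number (hypothetical helper)."""
    if value < 0:
        return "negative"
    elif value == 0:
        return "zero"
    else:
        # nested if inside the else clause
        if value % 2 == 0:
            return "positive even"
        return "positive odd"

for sample in (-3, 0, 4, 7):
    print(sample, "is", describe(sample))

# Truth values: zero, None, and empty containers count as False
for candidate in (0, 1, "", "text", [], [0], None):
    print(repr(candidate), "is treated as", bool(candidate))
```

Note that [0] is treated as True: the list is not empty, even though its only element is zero.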
1.14 Loop Statements

In general, statements are executed sequentially: the first statement
in a function is executed first, followed by the second, and so on.
There may be situations when you need to execute a block of code
several times.
Programming languages provide various control structures that allow
for more complicated execution paths.
A loop statement allows us to execute a statement or group of
statements multiple times.
Python provides the following types of loops to handle looping
requirements.
Sr.No. Loop Type & Description
1 while loop
Repeats a statement or group of statements while a given condition is
TRUE. It tests the condition before executing the loop body.
2 for loop
Executes a sequence of statements once for each item of a sequence and
abbreviates the code that manages the loop variable.
3 nested loops
You can use one or more loops inside another while or for loop.
(Python has no do..while loop.)
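The loop types in the table can be sketched as follows. The variable names and ranges are illustrative assumptions:

```python
# Illustrative sketch: while loop, for loop, and nested loops

# while loop: the condition is tested before each pass
count = 0
while count < 3:
    print("while iteration", count)
    count += 1

# for loop: range() manages the loop variable for us
total = 0
for number in range(1, 6):
    total += number
print("Sum of 1..5 is", total)  # 15

# nested loops: build a small multiplication table
table = []
for row in range(1, 4):
    line = []
    for col in range(1, 4):
        line.append(row * col)
    table.append(line)
print(table)  # [[1, 2, 3], [2, 4, 6], [3, 6, 9]]
```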
Loop Control Statements

Loop control statements change execution from its normal
sequence. Unlike in C++, a Python loop does not introduce a new scope,
so names assigned inside the loop body remain available after the loop
ends.
Python supports the following control statements.
Let us go through the loop control statements briefly
Sr.No. Control Statement & Description
1 break statement
Terminates the loop statement and transfers execution to the statement
immediately following the loop.
2 continue statement
Causes the loop to skip the remainder of its body and immediately retest its
condition prior to reiterating.
3 pass statement
The pass statement in Python is used when a statement is required syntactically

but you do not want any command or code to execute.
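A short sketch of the three control statements in the table. The numbers and the function name are illustrative assumptions:

```python
# Illustrative sketch: continue, break, and pass
visited = []
for n in range(10):
    if n % 2 == 0:
        continue       # skip even numbers and go to the next iteration
    if n > 7:
        break          # leave the loop entirely
    visited.append(n)
print(visited)  # [1, 3, 5, 7]

# pass is a placeholder where a statement is required syntactically
def not_implemented_yet():
    pass

for _ in range(3):
    pass  # an intentionally empty loop body
```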
1.15 Number Functions in Python

Number data types store numeric values. They are immutable data
types, which means that changing the value of a number data type
results in a newly allocated object.
Number objects are created when you assign a value to them. For
example −
var1 = 1
var2 = 10
You can also delete the reference to a number object by using
the del statement. The syntax of the del statement is −
del var1[,var2[,var3[....,varN]]]]
You can delete a single object or multiple objects by using
the del statement. For example −
del var
del var_a, var_b
Python 3 supports three different numerical types (Python 2 had a
fourth type, long, which Python 3 merged into int) −
• int (signed integers) − Often called just integers or ints,
these are positive or negative whole numbers with no decimal
point. In Python 3 they can be of unlimited size.
• float (floating point real values) − Also called floats, they
represent real numbers and are written with a decimal point
dividing the integer and fractional parts. Floats may also be in
scientific notation, with E or e indicating the power of 10
(2.5e2 = 2.5 x 10^2 = 250).
• complex (complex numbers) − These are of the form a + bJ, where
a and b are floats and J (or j) represents the square root of -1
(which is an imaginary number). The real part of the number
is a, and the imaginary part is b. Complex numbers are not
used much in Python programming.
Examples
Here are some examples of numbers, written in Python 3 literal syntax
int             float         complex
10              0.0           3.14j
100             15.20         45.j
-786            -21.9         9.322e-36j
0o122           32.3e18       .876j
535633629843    -90.          -.6545+0J
-0x260          -32.54e100    3e+26J
0x69            70.2e-12      4.53e-7j
• In Python 2, long integers were written with a trailing L (an
uppercase L was recommended to avoid confusion with the number 1).
In Python 3 the L suffix is a syntax error, and int values can be
arbitrarily large.
• A complex number consists of an ordered pair of real floating
point numbers denoted by a + bj, where a is the real part and
b is the imaginary part of the complex number.
Number Type Conversion

Python converts numbers internally in an expression containing
mixed types to a common type for evaluation. But sometimes, you
need to coerce a number explicitly from one type to another to
satisfy the requirements of an operator or function parameter.
• Type int(x) to convert x to an integer. (Python 2 also had
long(x); in Python 3 use int(x) instead.)
• Type float(x) to convert x to a floating-point number.
• Type complex(x) to convert x to a complex number with real
part x and imaginary part zero.
• Type complex(x, y) to convert x and y to a complex number
with real part x and imaginary part y. x and y are numeric
expressions.
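The conversions above can be tried out with a short sketch. The sample values and variable names are illustrative assumptions:

```python
# Illustrative sketch: explicit and implicit number conversion
truncated_pos = int(3.99)     # int() drops the fractional part
truncated_neg = int(-3.99)    # truncation is toward zero, not downward
as_float = float(7)
as_complex = complex(2)       # the imaginary part defaults to zero
pair_complex = complex(2, -3)

print(truncated_pos, truncated_neg)  # 3 -3
print(as_float)                      # 7.0
print(as_complex)                    # (2+0j)
print(pair_complex)                  # (2-3j)

# Implicit conversion in mixed expressions: int -> float -> complex
mixed_float = 1 + 2.5
mixed_complex = 2 + (1 + 1j)
print(mixed_float, type(mixed_float).__name__)      # 3.5 float
print(mixed_complex, type(mixed_complex).__name__)  # (3+1j) complex
```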
Mathematical Functions
Python includes the following functions that perform mathematical
calculations. abs(), max(), min(), pow(), and round() are built-in
functions. The others belong to the math module and are called as
math.ceil(x), math.sqrt(x), and so on after import math.
Sr.No. Function & Returns ( description )
1 abs(x)
The absolute value of x: the (positive) distance between x
and zero.
2 ceil(x)
The ceiling of x: the smallest integer not less than x
3 cmp(x, y)
-1 if x < y, 0 if x == y, or 1 if x > y. (Python 2 only. cmp() was
removed in Python 3, where (x > y) - (x < y) gives the same result.)
4 exp(x)
The exponential of x: e**x
5 fabs(x)
The absolute value of x, returned as a float.
6 floor(x)
The floor of x: the largest integer not greater than x
7 log(x)
The natural logarithm of x, for x > 0
8 log10(x)
The base-10 logarithm of x for x > 0.
9 max(x1, x2,...)
The largest of its arguments: the value closest to positive
infinity
10 min(x1, x2,...)
The smallest of its arguments: the value closest to
negative infinity
11 modf(x)
The fractional and integer parts of x in a two-item tuple.
Both parts have the same sign as x. The integer part is
returned as a float.
12 pow(x, y)
The value of x**y.
13 round(x [,n])
x rounded to n digits from the decimal point. In Python 3,
ties are rounded to the nearest even value: round(0.5) is 0,
round(1.5) is 2, and round(2.5) is 2.
14 sqrt(x)
The square root of x for x >= 0
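A short sketch of several functions from the table, using Python 3 and the standard math module. The argument values are illustrative assumptions:

```python
import math

# Built-in functions (no import needed)
print(abs(-7.5))                           # 7.5
print(max(3, 9, 4), min(3, 9, 4))          # 9 3
print(pow(2, 10))                          # 1024
print(round(0.5), round(1.5), round(2.5))  # 0 2 2 (ties go to the even value)

# Functions from the math module
print(math.ceil(4.2))    # 5
print(math.floor(-4.2))  # -5
print(math.fabs(-3))     # 3.0
print(math.sqrt(16))     # 4.0
print(math.modf(3.25))   # (0.25, 3.0)
print(math.exp(0))       # 1.0
print(math.log10(100))   # 2.0

# A Python 3 stand-in for the removed cmp(x, y)
x, y = 3, 8
comparison = (x > y) - (x < y)
print(comparison)        # -1
```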
Random Number Functions

Random numbers are used for games, simulations, testing,
security, and privacy applications. Python's random module includes
the following commonly used functions, called as random.choice(seq)
and so on after import random. (For security-sensitive uses the
standard secrets module is the recommended choice.)
Sr.No. Function & Description
1 choice(seq)
A random item from a list, tuple, or string.
2 randrange ([start,] stop [,step])
A randomly selected element from range(start, stop, step)
3 random()
A random float r, such that 0 is less than or equal to r and r
is less than 1
4 seed([x])
Sets the starting value used in generating random
numbers. Call this function before calling any other random
module function when repeatable results are needed. Returns None.
5 shuffle(lst)
Randomizes the items of a list in place. Returns None.
6 uniform(x, y)
A random float r, such that x is less than or equal to r and r
is less than or equal to y (the end point y may or may not be
reached because of floating-point rounding)
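The functions in the table can be sketched as follows. The seed value 42 and the sample list are illustrative assumptions. No expected values are shown because they depend on the generator:

```python
import random

random.seed(42)  # a fixed seed makes the run repeatable

deck = ["A", "B", "C", "D", "E"]
pick = random.choice(deck)          # one item from the sequence
step = random.randrange(0, 100, 5)  # a multiple of 5 in [0, 100)
fraction = random.random()          # a float in [0, 1)
reading = random.uniform(1.5, 2.5)  # a float between 1.5 and 2.5
random.shuffle(deck)                # reorders the list in place

print(pick, step, fraction, reading, deck)

# Re-seeding with the same value reproduces the same sequence
random.seed(42)
first = random.random()
random.seed(42)
second = random.random()
print(first == second)  # True
```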
Trigonometric Functions
Python's math module includes the following functions that perform
trigonometric calculations.
Sr.No. Function & Description
1 acos(x)
Return the arc cosine of x, in radians.
2 asin(x)
Return the arc sine of x, in radians.
3 atan(x)
Return the arc tangent of x, in radians.
4 atan2(y, x)
Return atan(y / x), in radians, using the signs of both
arguments to determine the quadrant of the result.
5 cos(x)
Return the cosine of x radians.
6 hypot(x, y)
Return the Euclidean norm, sqrt(x*x + y*y).
7 sin(x)
Return the sine of x radians.
8 tan(x)
Return the tangent of x radians.
9 degrees(x)
Converts angle x from radians to degrees.
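A short sketch of the trigonometric functions. It also uses math.radians(), which is not listed in the table but belongs to the same module and converts degrees to radians. The angle values are illustrative assumptions:

```python
import math

angle = math.radians(30)  # the functions below expect radians
sine = math.sin(angle)
cosine = math.cos(angle)
print(sine, cosine)

print(math.degrees(math.pi))  # 180.0
print(math.hypot(3, 4))       # 5.0

# atan2(y, x) keeps track of the quadrant, atan(y / x) cannot
same_quadrant = math.degrees(math.atan2(1, 1))
opposite_quadrant = math.degrees(math.atan2(-1, -1))
ratio_only = math.degrees(math.atan(-1 / -1))
print(same_quadrant, opposite_quadrant, ratio_only)
```

The last line illustrates why atan2 exists: (1, 1) and (-1, -1) have the same ratio y / x, so atan gives the same angle for both, while atan2 places the second point in the third quadrant.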
1.16 Functions in Python

A function is a block of organized, reusable code that is used to
perform a single, related action. Functions provide better modularity for
your application and a high degree of code reuse.
As you already know, Python gives you many built-in functions like
print(), etc. but you can also create your own functions. These functions
are called user-defined functions.
Defining a Function
You can define functions to provide the required functionality. Here are
simple rules to define a function in Python.
• Function blocks begin with the keyword def followed by the
function name and parentheses ( ).
• Any input parameters or arguments should be placed within these
parentheses.
• The first statement of a function can be an optional statement: the
documentation string of the function, or docstring.
• The function header ends with a colon (:), and the code block that
follows it is indented.
• The statement return [expression] exits a function, optionally
passing back an expression to the caller. A return statement with
no arguments is the same as return None.
Syntax
def functionname( parameters ):
    "function_docstring"
    function_suite
    return [expression]
By default, parameters are positional, and you need to pass the
arguments in the same order in which the parameters were defined.
Example
The following function takes a string as an input parameter and prints
it on the standard output.
def printme( str ):
    "This prints a passed string into this function"
    print(str)
    return
Calling a Function
Defining a function only gives it a name, specifies the parameters
that are to be included in the function, and structures the blocks of
code.
Once the basic structure of a function is finalized, you can execute
it by calling it from another function or directly from the Python
prompt. Following is an example that calls the printme() function −
#!/usr/bin/python3
# Function definition is here
def printme( str ):
    print(str)
    return

# Now you can call printme function
printme("I'm first call to user defined function!")
printme("Again second call to the same function")
Output:
I'm first call to user defined function!
Again second call to the same function
Pass by reference vs value

All parameters (arguments) in the Python language are passed by
object reference. This means that if you modify a mutable object that
a parameter refers to within a function, the change is also visible in
the calling function. For example −
#!/usr/bin/python3
def changeme( mylist ):
    "This changes a passed list into this function"
    mylist.append([1,2,3,4])
    print("Values inside the function:", mylist)
    return

# Now you can call changeme function
mylist = [10,20,30]
changeme( mylist )
print("Values outside the function:", mylist)
Here, we are maintaining a reference to the passed object and
appending values to the same object. So, this produces the
following result −
Values inside the function: [10, 20, 30, [1, 2, 3, 4]]
Values outside the function: [10, 20, 30, [1, 2, 3, 4]]
There is one more example where the argument is passed by
reference and the reference is overwritten inside the called
function.
#!/usr/bin/python3
def changeme( mylist ):
    "This changes a passed list into this function"
    mylist = [1,2,3,4]  # This assigns a new reference to the local name mylist
    print("Values inside the function:", mylist)
    return

# Now you can call changeme function
mylist = [10,20,30]
changeme( mylist )
print("Values outside the function:", mylist)
The parameter mylist is local to the function changeme. Rebinding
mylist within the function does not affect the caller's mylist. The
function accomplishes nothing, and this produces the following
result −
Values inside the function: [1, 2, 3, 4]
Values outside the function: [10, 20, 30]
1.17 Function Arguments

You can call a function by using the following types of formal
arguments −
• Required arguments
• Keyword arguments
• Default arguments
• Variable-length arguments
Required arguments
Required arguments are the arguments passed to a function in
correct positional order. Here, the number of arguments in the
function call should match exactly with the function definition.
To call the function printme(), you definitely need to pass one
argument; otherwise it raises a TypeError as follows −
#!/usr/bin/python3
def printme( str ):
    print(str)
    return

printme()
Output:
Traceback (most recent call last):
  File "test.py", line 11, in <module>
    printme()
TypeError: printme() missing 1 required positional argument: 'str'
Keyword arguments
Keyword arguments are related to the function calls. When you use
keyword arguments in a function call, the caller identifies the
arguments by the parameter name.
This allows you to skip arguments or place them out of order
because the Python interpreter is able to use the keywords
provided to match the values with parameters. You can also make
keyword calls to the printme() function in the following way −
#!/usr/bin/python3
def printme( str ):
    print(str)
    return

printme( str = "My string")
Output:
My string
The following example gives a clearer picture. Note that the order
of the arguments does not matter.
#!/usr/bin/python3
def printinfo( name, age ):
    "This prints a passed info into this function"
    print("Name:", name)
    print("Age", age)
    return

# Now you can call printinfo function
printinfo( age=50, name="miki" )
Output:
Name: miki
Age 50
Default arguments
A default argument is an argument that assumes a default value if a
value is not provided in the function call for that argument. The
following example illustrates default arguments: it prints the
default age if no age is passed −
#!/usr/bin/python3
def printinfo( name, age = 35 ):
    "This prints a passed info into this function"
    print("Name:", name)
    print("Age", age)
    return

printinfo( age=50, name="miki" )
printinfo( name="miki" )
Output:
Name: miki
Age 50
Name: miki
Age 35
Variable-length arguments
You may need to call a function with more arguments than you
specified while defining the function. These arguments are
called variable-length arguments and are not named in the function
definition, unlike required and default arguments.
Syntax for a function with non-keyword variable arguments is this −
def functionname([formal_args,] *var_args_tuple ):
    "function_docstring"
    function_suite
    return [expression]
An asterisk (*) is placed before the variable name that holds the
values of all non-keyword variable arguments. This tuple remains
empty if no additional arguments are specified during the function
call. Following is a simple example −
#!/usr/bin/python3
def printinfo( arg1, *vartuple ):
    "This prints a variable passed arguments"
    print("Output is:")
    print(arg1)
    for var in vartuple:
        print(var)
    return

printinfo( 10 )
printinfo( 70, 60, 50 )
Output:
Output is:
10
Output is:
70
60
50
Data Science using Python – Unit II
UNIT - II
Introduction of Python
2.1 Python Modules

This section explains how to construct and import custom Python
modules. Additionally, we may import Python's built-in modules
via various methods.
What is Modular Programming?

Modular programming is the practice of segmenting a single,
complicated coding task into multiple, simpler, easier-to-manage sub-tasks.
We call these subtasks modules. Therefore, we can build a bigger program
by assembling different modules that act like building blocks.
Modularizing our code in a big application has a lot of benefits.
Simplification: A module often concentrates on one comparatively small

area of the overall problem instead of the full task. We will have a more
manageable design problem to think about if we are only concentrating on
one module. Program development is now simpler and much less
vulnerable to mistakes.
Flexibility: Modules are frequently used to establish conceptual
separations between various problem areas. If modules are constructed
in a fashion that reduces interconnectedness, changes to one module are
less likely to influence other portions of the program. (We might even
be able to edit a module without being familiar with the program
beyond it.) This increases the likelihood that a group of developers
will be able to collaborate on a big project.
Reusability: Functions created in a particular module may be readily
accessed by different sections of the application (through a suitably
established API). As a result, duplicate code is no longer necessary.
Scope: Modules often declare a distinct namespace to prevent identifier

clashes in various parts of a program.
In Python, modularization of the code is encouraged through the use

of functions, modules, and packages.
What are Modules in Python?

A file containing definitions of functions and various statements
written in Python is called a Python module.
In Python, a module can be defined in one of 3 ways:
o A module can be written in Python itself.
o Similar to the re (regular expression) module, a module can be
written in the C programming language and then loaded dynamically
at run time.
o A built-in module, such as the itertools module, is inherently included
in the interpreter.
A module is a file containing Python code: definitions of functions,
statements, or classes. The file example_module.py that we will
create is a module whose name is example_module.
We employ modules to divide complicated programs into smaller,

more understandable pieces. Modules also allow for the reuse of code.
Rather than duplicating their definitions into several applications, we

may define our most frequently used functions in a separate module and
then import the complete module.
Let's construct a module. Save the file as example_module.py after

entering the following.
Code:
# Python program to show how to create a module.
# defining a function in the module to reuse it
def square( number ):
    """This function will square the number passed to it"""
    result = number ** 2
    return result
Here, a module called example_module contains the definition of the

function square(). The function returns the square of a given number.
How to Import Modules in Python?

In Python, we may import functions from one module into our
program or into another module.
For this, we use the import keyword followed by the name of the
module we need to import. We will import the module we defined
earlier, example_module.
Code:
import example_module
The functions that we defined in example_module are not imported
directly into the current namespace. Only the name of the module,
i.e., example_module, is imported here.
We may use the dot operator to access the functions through the
module name. For instance:
Code:
result = example_module.square( 4 )
print( "By using the module square of number is: ", result )
Output:
By using the module square of number is: 16
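The steps above (save the file, import it, call the function through the module name) can be sketched as a single self-contained script. Writing the module into a temporary folder and adding that folder to sys.path are assumptions made only so the sketch runs anywhere. In practice, example_module.py would simply be saved next to the program that imports it:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source text of the module from the example above
module_source = '''
def square(number):
    """This function will square the number passed to it"""
    return number ** 2
'''

with tempfile.TemporaryDirectory() as folder:
    # Save the module as example_module.py inside the temporary folder
    Path(folder, "example_module.py").write_text(module_source)
    sys.path.insert(0, folder)     # make the folder searchable for imports
    importlib.invalidate_caches()  # ensure the new file is noticed
    try:
        example_module = importlib.import_module("example_module")
        result = example_module.square(4)
    finally:
        sys.path.remove(folder)    # restore the original search path

print("By using the module square of number is:", result)
```

importlib.import_module("example_module") does the same job as the statement import example_module. It is used here only because the module file is created while the script is running.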
Python ships with many standard modules. The complete list of
available modules can be displayed with the help('modules') command in
the interactive interpreter.
Similar to how we imported our module, a user-defined module, we

can use an import statement to import other standard modules.
Importing a module can be done in a variety of ways. Below is a list of

them.
Python import Statement

Using the import Python keyword and the dot operator, we may
import a standard module and can access the defined functions within it.
Here's an illustration.
Code:
# Python program to show how to import a standard module
# We will import the math module which is a standard module
import math
print( "The value of euler's number is", math.e )
Output:
The value of euler's number is 2.718281828459045
Importing and also Renaming

While importing a module, we can change its name too. Here is an
example to show.
Code:
# Python program to show how to import a module and rename it
# We will import the math module and give a different name to it
import math as mt
print( "The value of euler's number is", mt.e )
Output:
The value of euler's number is 2.718281828459045
The math module is now named mt in this program. In some
circumstances, renaming helps us type faster when a module has a long
name.
Please note that the name math is not defined in the scope of our
program. Thus, mt.pi is the correct way to access the constant,
whereas math.pi is invalid.
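The effect of renaming can be checked with a small sketch. It runs the import inside a fresh dictionary used as a namespace (an assumption made so that the result does not depend on what the surrounding program has already imported) and then inspects which names were created:

```python
# Illustrative sketch: which names does "import math as mt" create?
namespace = {}
exec("import math as mt", namespace)

has_alias = "mt" in namespace       # the alias is defined
has_original = "math" in namespace  # the original name is not

print(has_alias)     # True
print(has_original)  # False
print(namespace["mt"].pi)
```

In an ordinary script the same situation appears as a NameError when math.pi is used after import math as mt.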
Python from...import Statement

We can import specific names from a module without importing the
module as a whole. Here is an example.
Code:
# Python program to show how to import specific objects from a module
# We will import euler's number from the math module using the from keyword
from math import e
print("The value of euler's number is", e)
Output:
The value of euler's number is 2.718281828459045
Only the e constant from the math module was imported in this case.
Names imported this way are used directly, without the dot (.) operator.
We can import several names at the same time as follows:
Code:
# Python program to show how to import multiple objects from a module
from math import e, tau
print("The value of tau constant is: ", tau)
print("The value of the euler's number is: ", e)
Output:
The value of tau constant is: 6.283185307179586
The value of the euler's number is: 2.718281828459045
Import all Names - From import * Statement

To import all the public names from a module into the current
namespace, use the * symbol with the from and import keywords.
Syntax:
from name_of_module import *
Using the * symbol has benefits and drawbacks. It saves typing, but it
can silently overwrite names that already exist in our program, and it
hides where each name comes from. It is not advised to use * unless we
are certain of what we need from the module.
Here is an example.
Code:
# importing all names from the math module using *
from math import *
# accessing functions of the math module without using the dot operator
print("Calculating square root: ", sqrt(25))
# here pi is also imported from the math module
print("Calculating tangent of an angle: ", tan(pi/6))
Output:
Calculating square root: 5.0
Calculating tangent of an angle: 0.5773502691896257
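The overwriting drawback mentioned above can be sketched as follows. This is a minimal illustration rather than part of the original notes: it runs the star import with exec() inside a throwaway dictionary (the name namespace and its initial content are assumptions made for the demonstration) so that the running program's own names are left untouched.

```python
# A throwaway namespace that already holds a variable called e
# (the dictionary and its content are illustrative assumptions).
namespace = {"e": "my own value"}

# Run the star import inside that namespace only.
exec("from math import *", namespace)

# Collect the public names that now live in the namespace.
public_names = sorted(n for n in namespace if not n.startswith("_"))

print("sqrt" in public_names)   # sqrt was brought in by the star import
print(type(namespace["e"]))     # e is now math's float constant, not our string
```

The earlier value of e is replaced without any warning, which is why importing only the names that are needed is usually the safer choice.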
Locating Path of Modules

When a module is imported, the interpreter first checks whether it is
a built-in module. If it is not, the interpreter searches the
directories listed in sys.path, in the order described below:
The directory containing the script being run (or the current working
directory in an interactive session) is searched first. If the module
is not found there, Python explores every directory listed in the
environment variable PYTHONPATH, which holds a list of folders. If that
also fails, Python examines the installation-dependent set of folders
configured when Python is installed.
Here is an example to print the path.
Code:
# We will import the sys module
import sys
# we will print sys.path
print(sys.path)
Output:
['/home/pyodide', '/home/pyodide/lib/Python310.zip', '/lib/Python3.10', '/lib/Python3.10/lib-dynload', '', '/lib/Python3.10/site-packages']
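As a small sketch of how the search path is used, the snippet below asks the import system where it would load a module from, without importing it. The choice of the standard json package is only an example, and the printed locations depend on the installation.

```python
import importlib.util
import sys

# Each entry of sys.path is a location that the import system searches in order.
for entry in sys.path[:3]:
    print(repr(entry))

# find_spec() locates a module without importing it.
spec = importlib.util.find_spec("json")
print(spec.name)     # the module name that was looked up
print(spec.origin)   # the file the module would be loaded from
```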
The dir() Built-in Function

We can use the built-in dir() function to list the names defined
within a module, a class, or any other object.
For instance, the built-in str type defines the names shown below. To
print them, we use dir() in the following way:
Code:
# Python program to print the names defined by an object
print("List of functions:\n ", dir(str), end=", ")
Output:
List of functions:
['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__',

'__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__',
'__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__',
'__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__',
'__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count',
'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha',
'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle',
'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'removeprefix', 'removesuffix',
'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip',
'swapcase', 'title', 'translate', 'upper', 'zfill']
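The listing above mixes special names (those surrounded by double underscores) with ordinary ones. A minimal sketch of applying dir() to a module and keeping only its public names is given here; the use of the math module is just an example.

```python
import math

# Keep only the names that do not start with an underscore.
public_names = [name for name in dir(math) if not name.startswith("_")]

print(len(public_names))   # how many public names the module defines
print(public_names[:5])    # the first few of them
```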
Namespaces and Scoping

Variables are names (identifiers) that refer to objects. A namespace
is a dictionary that maps variable names (keys) to the objects they
refer to (values).
A Python statement can access variables in both the local and the
global namespace. When a local and a global variable share the same
name, the local variable shadows the global one. Every function has
its own local namespace, and class methods follow the same scoping
rule as regular functions. Python decides whether a variable is local
or global from the way it is used: any variable that is assigned a
value inside a function is treated as local to that function.
Therefore, to assign a value to a global variable inside a function,
we must first use the global statement. The line global Var_Name
tells Python that Var_Name is a global variable, so Python stops
looking for it in the local namespace.
For instance, we define the variable Number in the global namespace.
Because the function AddNumber assigns a value to Number, Python
would treat Number as a local variable. Without the global statement,
reading Number inside the function before assigning to it raises an
UnboundLocalError.
Code:
Number = 204
def AddNumber():
    # accessing the global namespace
    global Number
    Number = Number + 200
print(Number)
AddNumber()
print(Number)
Output:
204
404
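The UnboundLocalError described above can be reproduced with a short sketch. The names counter, broken_increment, and working_increment are hypothetical and chosen only for this illustration.

```python
counter = 10  # a module-level (global) variable

def broken_increment():
    # The assignment makes counter local to this function, so reading it
    # on the right-hand side fails because the local has no value yet.
    counter = counter + 1
    return counter

def working_increment():
    global counter  # refer to the module-level counter instead
    counter = counter + 1
    return counter

try:
    broken_increment()
    error_name = None
except UnboundLocalError as err:
    error_name = type(err).__name__

print(error_name)            # UnboundLocalError
print(working_increment())   # 11
```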
Python Packages:
We usually organize our files in different folders and subfolders based
on some criteria, so that they can be managed easily and efficiently. For
example, we keep all our games in a Games folder, and we can further
subcategorize them by genre. Python packages follow the same analogy.
A Python module may contain several classes, functions, variables,
etc., whereas a Python package can contain several modules. In simpler
terms, a package is a folder that contains various modules as files.
Creating Package
Let’s create a package named mypckg that will contain two modules,
mod1 and mod2. To create this package, follow the steps below:
Create a folder named mypckg.
Inside this folder, create an empty Python file named __init__.py.
Then create two modules, mod1 and mod2, in this folder.
mod1.py
def gfg():
    print("Welcome to GFG")
The hierarchy of our package looks like this:
mypckg
---__init__.py
---mod1.py
---mod2.py
Understanding __init__.py
__init__.py helps the Python interpreter recognise the folder as a
package. It can also specify which resources are made available
directly from the package. If __init__.py is empty, importing the
package does not import any of its modules automatically; each module
must be imported explicitly, for example with from mypckg import mod1.
We can also use __init__.py to expose selected functions from each
module.
For example, we can create the __init__.py file for the above package
as:
__init__.py
from .mod1 import gfg
from .mod2 import sum
With this __init__.py, the gfg and sum functions become available
directly from the package, for example as mypckg.gfg() and
mypckg.sum(). The modules mod1 and mod2 can still be imported in the
usual way.
Import Modules from a Package

We can import these modules using the import statement or the
from...import statement, together with the dot (.) operator.
Syntax:
import package_name.module_name
Example: Import Module from package
We will import the modules from the above created package and will
use the functions inside those modules.
from mypckg import mod1
from mypckg import mod2
mod1.gfg()
res = mod2.sum(1, 2)
print(res)
Output:
Welcome to GFG
We can also import specific functions using the same syntax.
Example: Import Specific function from the module

from mypckg.mod1 import gfg
from mypckg.mod2 import sum
gfg()
res = sum(1, 2)
print(res)
Output:
Welcome to GFG
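The package steps above can be tried end to end with the following sketch, which builds mypckg in a temporary folder and imports from it. Two things are assumptions made for illustration: the body of mod2.sum (taken here to add its two arguments, since mod2.py is not listed above), and gfg returning its message instead of printing it so that the result can be inspected.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "mypckg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")  # an empty __init__.py marks the package
    (pkg / "mod1.py").write_text('def gfg():\n    return "Welcome to GFG"\n')
    # Assumed content of mod2.py: a sum function that adds two numbers.
    (pkg / "mod2.py").write_text("def sum(a, b):\n    return a + b\n")

    sys.path.insert(0, tmp)          # make the temporary folder importable
    importlib.invalidate_caches()
    try:
        mod1 = importlib.import_module("mypckg.mod1")
        mod2 = importlib.import_module("mypckg.mod2")
        greeting = mod1.gfg()
        total = mod2.sum(1, 2)
    finally:
        sys.path.remove(tmp)         # restore the search path

print(greeting)
print(total)
```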
2.2 Python File Handling

Till now, we have been taking input from the console and writing
output back to the console to interact with the user.
Sometimes, it is not enough to only display the data on the console.
The data to be displayed may be very large, and only a limited amount of
it can be shown on the console. Moreover, data held in memory is
volatile, so programmatically generated data is lost when the program
ends and cannot be recovered later.
File handling plays an important role when data needs to be stored
permanently. A file is a named location on disk used to store related
information. The stored information is non-volatile, so we can access
it after the program terminates.
File handling is somewhat lengthy or complicated in some other
programming languages, but it is easier and shorter in Python.
In Python, a file is treated as either text or binary. In a text file,
each line ends with a special end-of-line character, such as the
newline character \n.
Hence, a file operation is carried out in the following order.
o Open a file
o Read or write - Performing operation
o Close the file
Opening a file
Python provides an open() function that accepts the file name and,
optionally, the access mode in which the file is opened and a buffering
policy. The function returns a file object which can be used to perform
various operations like reading, writing, etc.
Syntax:
file_object = open(<file-name>, <access-mode>, <buffering>)
Files can be opened in various modes such as read, write, or append.
The following table describes each access mode.
SN  Access mode  Description
1   r            It opens the file in read-only mode. The file pointer is at the beginning of the file. This is the default mode if no access mode is passed.
2   rb           It opens the file in read-only mode in binary format. The file pointer is at the beginning of the file.
3   r+           It opens the file for both reading and writing. The file pointer is at the beginning of the file.
4   rb+          It opens the file for both reading and writing in binary format. The file pointer is at the beginning of the file.
5   w            It opens the file for writing only. It overwrites the file if it already exists or creates a new one if no file exists with the same name. The file pointer is at the beginning of the file.
6   wb           It opens the file for writing only in binary format. It overwrites the file if it already exists or creates a new one if no file exists. The file pointer is at the beginning of the file.
7   w+           It opens the file for both writing and reading. It differs from r+ in that it overwrites the existing file if one exists, whereas r+ does not. It creates a new file if no file exists. The file pointer is at the beginning of the file.
8   wb+          It opens the file for both writing and reading in binary format. The file pointer is at the beginning of the file.
9   a            It opens the file in append mode. The file pointer is at the end of the existing file, if any. It creates a new file if no file exists with the same name.
10  ab           It opens the file in append mode in binary format. The file pointer is at the end of the existing file. It creates a new file in binary format if no file exists with the same name.
11  a+           It opens the file for both appending and reading. The file pointer is at the end of the file if the file exists. It creates a new file if no file exists with the same name.
12  ab+          It opens the file for both appending and reading in binary format. The file pointer is at the end of the file.
Let's look at a simple example that opens a file named "file.txt" (stored
in the same directory) in read mode and confirms that it was opened.
Example:
# opens the file file.txt in read mode
fileptr = open("file.txt", "r")

if fileptr:
    print("file is opened successfully")
Output:
file is opened successfully
In the above code, we passed the filename as the first argument and
opened the file in read mode by passing r as the second argument.
fileptr holds the file object. If the file is opened successfully, the
print statement is executed; if the file does not exist, open() raises
a FileNotFoundError instead.
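The difference between some of the access modes can be seen in a short sketch. It works inside a temporary folder so that no real file is affected; the file name demo.txt and the variable names are assumptions made for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    with open(path, "w") as f:      # w creates the file
        f.write("first\n")
    with open(path, "a") as f:      # a keeps the content and adds to the end
        f.write("second\n")
    with open(path, "r") as f:
        after_append = f.read()

    with open(path, "w") as f:      # w again discards the earlier content
        f.write("third\n")
    with open(path, "r") as f:
        after_overwrite = f.read()

    try:
        open(path, "x")             # x refuses to open an existing file
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

print(repr(after_append))
print(repr(after_overwrite))
print(exclusive_failed)
```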
The close() method

Once all the operations are done on the file, we must close it through
our Python script using the close() method. Any unwritten information
still held in the buffer is flushed to the file when the close() method
is called on a file object.
While a file is open in Python, it can still be modified externally
through the file system; hence it is good practice to close the file as
soon as all the operations are done.
The syntax to use the close() method is given below.
Syntax:
fileobject.close()
Consider the following example.
# opens the file file.txt in read mode
fileptr = open("file.txt", "r")

if fileptr:
    print("file is opened successfully")

# closes the opened file
fileptr.close()
After closing the file, we cannot perform any further operation on it.
The file needs to be closed properly. If an exception occurs while
performing some operation on the file, the program terminates without
closing the file.
We should use the following pattern to overcome this problem. The file
is opened before the try block so that fileptr is always defined when
the finally block runs.
fileptr = open("file.txt")
try:
    # perform file operations
    pass
finally:
    fileptr.close()
The with statement

The with statement was introduced in Python 2.5. It is useful when
manipulating files. It is used in scenarios where a pair of statements
(such as open and close) must be executed with a block of code in
between.
The syntax to open a file using the with statement is given below.
with open(<file name>, <access mode>) as <file-pointer>:
    # statement suite
The advantage of using the with statement is that it guarantees that
the file is closed regardless of how the nested block exits.
It is always advisable to use the with statement with files because,
if a break, return, or exception occurs in the nested block of code,
the file is closed automatically and we do not need to call close()
ourselves. This helps prevent the file from being left in a corrupted
state.
Example
with open("file.txt", 'r') as f:
    content = f.read()
    print(content)
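The closing guarantee can be checked with a minimal sketch that raises an exception inside the with block and then inspects the closed attribute of the file object. The temporary folder and the file name demo.txt are assumptions made for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name
    try:
        with open(path, "w") as f:
            f.write("partial data")
            raise ValueError("simulated failure inside the block")
    except ValueError:
        pass

    # The file was closed (and its buffer flushed) despite the exception.
    closed_after_error = f.closed
    with open(path) as check:
        saved = check.read()

print(closed_after_error)
print(saved)
```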
Writing the file

To write some text to a file, we need to open the file using the open
method with one of the following access modes.
w: It will overwrite the file if the file exists. The file pointer is at the
beginning of the file. It creates a new file if no file exists.
a: It will append to the existing file. The file pointer is at the end of the
file. It creates a new file if no file exists.
Example:
# open file2.txt in write mode. Creates a new file if no such file exists.
fileptr = open("file2.txt", "w")
# writing the content to the file
fileptr.write('''Python is the modern day language. It makes things so simple.
It is the fastest-growing programing language''')
# closing the opened file
fileptr.close()
Output:
file2.txt
Python is the modern day language. It makes things so simple.
It is the fastest-growing programing language
We have opened the file in w mode. Since file2.txt did not exist, a
new file was created, and we wrote the content to it using the write()
function.
Example 2:
# open file2.txt in append mode.
fileptr = open("file2.txt", "a")
# appending the content to the file
fileptr.write(" Python has an easy syntax and user-friendly interaction.")
# closing the opened file
fileptr.close()
Output:
Python is the modern day language. It makes things so simple.
It is the fastest-growing programing language Python has an easy
syntax and user-friendly interaction.
We can see that the content of the file is modified. We opened the
file in a mode, and the new content was appended to the existing
file2.txt.
To read a file from a Python script, Python provides the read()
method. The read() method reads data from the file. In text mode it
returns a string, and in binary mode it returns bytes.
The syntax of the read() method is given below.
Syntax:
fileobj.read(<count>)
Here, count is the number of characters (in text mode) or bytes (in
binary mode) to be read, starting from the current position of the file
pointer, which is the beginning of the file for a freshly opened file.
If count is not specified, read() returns the content up to the end of
the file.
Example
# open file2.txt in read mode. Causes an error if no such file exists.
fileptr = open("file2.txt", "r")
# stores the first 10 characters of the file in the variable content
content = fileptr.read(10)
# prints the type of the data read from the file
print(type(content))
# prints the content read from the file
print(content)
# closes the opened file
fileptr.close()
Output:
<class 'str'>
Python is
In the above code, we have read the content of file2.txt by using
the read() function. We passed a count of ten, which means it reads the
first ten characters from the file.
If we use the following lines instead, the whole content of the file is
printed.
content = fileptr.read()
print(content)
Output:
Python is the modern day language. It makes things so simple.
It is the fastest-growing programing language Python has an easy
syntax and user-friendly interaction.
Read file through for loop

We can read the file line by line using a for loop. Consider the
following example.
# open file2.txt in read mode. Causes an error if no such file exists.
fileptr = open("file2.txt", "r")
# running a for loop
for i in fileptr:
    print(i)  # i contains each line of the file
Output:
Python is the modern day language.
It makes things so simple.

Python has easy syntax and user-friendly interaction.
Read Lines of the file

Python lets us read a file line by line using the readline() method.
The readline() method reads one line at a time, starting from the
beginning of the file; i.e., if we call readline() two times, we get
the first two lines of the file.
The following example uses readline() to read the first two lines of
our file "file2.txt", which contains three lines.
Example 1: Reading lines using readline() function

# open file2.txt in read mode. Causes an error if no such file exists.
fileptr = open("file2.txt", "r")
# stores the first two lines of the file
content = fileptr.readline()
content1 = fileptr.readline()
# prints the lines that were read
print(content)
print(content1)
# closes the opened file
fileptr.close()
Output:
Python is the modern day language.
It makes things so simple.
We called the readline() function two times, which is why it read two
lines from the file.
Python also provides the readlines() method, which reads all the
remaining lines. It returns a list of the lines up to the end of the
file (EOF).
Example 2: Reading Lines Using readlines() function

# open file2.txt in read mode. Causes an error if no such file exists.
fileptr = open("file2.txt", "r")
# stores all the lines of the file in the list content
content = fileptr.readlines()
# prints the list of lines
print(content)
# closes the opened file
fileptr.close()
Output:
['Python is the modern day language.\n', 'It makes things so simple.\n', 'Python has easy syntax and user-friendly interaction.']
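Both methods keep the trailing newline character of each line, as the list above shows. The following sketch reads with readline() and readlines() and then strips the newlines. It uses io.StringIO as a stand-in for a text file opened with open(), and the three sample lines are illustrative.

```python
import io

# io.StringIO behaves like a text file object held in memory.
fileptr = io.StringIO("line one\nline two\nline three\n")

first = fileptr.readline()       # reads only the first line
remaining = fileptr.readlines()  # reads every line that is left

# Remove the trailing newline from each line.
cleaned = [line.rstrip("\n") for line in [first] + remaining]

print(repr(first))
print(remaining)
print(cleaned)
```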
Creating a new file

A new file can be created by using one of the following access
modes with the function open().
x: It creates a new file with the specified name. It causes an error if a
file with the same name already exists.
a: It creates a new file with the specified name if no such file exists. It
appends the content to the file if the file already exists with the specified
name.
w: It creates a new file with the specified name if no such file exists.
It overwrites the existing file.
Example 1
# open file2.txt in exclusive creation mode. Causes an error if the file already exists.
fileptr = open("file2.txt", "x")
print(fileptr)
if fileptr:
    print("File created successfully")
Output:
<_io.TextIOWrapper name='file2.txt' mode='x' encoding='cp1252'>
File created successfully
File Pointer positions

Python provides the tell() method, which returns the byte
number at which the file pointer currently sits. Consider the following
example.
# open the file file2.txt in read mode
fileptr = open("file2.txt", "r")
# initially the filepointer is at 0
print("The filepointer is at byte :", fileptr.tell())
# reading the content of the file
content = fileptr.read()
# after the read operation the file pointer has moved. tell() returns its new location.
print("After reading, the filepointer is at:", fileptr.tell())
Output:
The filepointer is at byte : 0
After reading, the filepointer is at: 117
Modifying file pointer position

In real-world applications, we sometimes need to change the file
pointer location explicitly, since we may need to read or write content
at various locations.
For this purpose, Python provides the seek() method, which
lets us move the file pointer to a chosen position.
The syntax to use the seek() method is given below.
Syntax:
<file-ptr>.seek(offset[, from])
The seek() method accepts two parameters:
offset: It refers to the new position of the file pointer within the file,
measured from the reference position.
from: It indicates the reference position from which the offset is
counted. If it is set to 0 (the default), the beginning of the file is
used as the reference position. If it is set to 1, the current position
of the file pointer is used. If it is set to 2, the end of the file is
used.
Example
# open the file file2.txt in read mode
fileptr = open("file2.txt", "r")
# initially the filepointer is at 0
print("The filepointer is at byte :", fileptr.tell())
# changing the file pointer location to 10.
fileptr.seek(10)
# tell() returns the location of the fileptr.
print("After reading, the filepointer is at:", fileptr.tell())
Output:
The filepointer is at byte : 0
After reading, the filepointer is at: 10
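The three reference positions can be compared in a short sketch. It uses io.BytesIO as a stand-in for a file opened in binary mode, because files opened in text mode only accept non-zero offsets measured from the beginning of the file. The ten-byte content is illustrative.

```python
import io

# io.BytesIO behaves like a binary file object held in memory.
fileptr = io.BytesIO(b"0123456789")

fileptr.seek(3)                  # 3 bytes from the beginning (from = 0)
chunk = fileptr.read(2)          # reads b"34"; the pointer is now at 5

fileptr.seek(2, 1)               # 2 bytes forward from the current position
position = fileptr.tell()        # 7

fileptr.seek(-3, 2)              # 3 bytes before the end of the file
tail = fileptr.read()            # reads b"789"

print(chunk, position, tail)
```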
2.3 Creating the new directory

The mkdir() method of the os module is used to create a directory in
the current working directory. The syntax to create the new directory is
given below.
Syntax:
os.mkdir(directory name)
Example 1
import os
# creating a new directory with the name new
os.mkdir("new")
The getcwd() method

This method returns the current working directory.
The syntax to use the getcwd() method is given below.
Syntax:
os.getcwd()
Example
import os
os.getcwd()
Output:
'C:\\Users\\DEVANSH SHARMA'
Changing the current working directory

The chdir() method is used to change the current working directory to
a specified directory.
The syntax to use the chdir() method is given below.
Syntax:
os.chdir("new-directory")
Example
import os
# Changing the current directory to the new directory
os.chdir("C:\\Users\\DEVANSH SHARMA\\Documents")
# It will display the current working directory
os.getcwd()
Output:
'C:\\Users\\DEVANSH SHARMA\\Documents'
Deleting directory
The rmdir() method is used to delete the specified directory. The
directory must be empty.
The syntax to use the rmdir() method is given below.
Syntax:
os.rmdir(directory name)
Example 1
import os
# removing the new directory
os.rmdir("directory_name")
It will remove the specified directory.
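The directory methods above can be tried together in a sketch that works inside a temporary folder and restores the original working directory afterwards. The directory name new follows the earlier example; the variable names are assumptions.

```python
import os
import tempfile

start_dir = os.getcwd()              # remember where we started

with tempfile.TemporaryDirectory() as tmp:
    try:
        os.chdir(tmp)                # change the current working directory
        os.mkdir("new")              # create a directory inside it
        created = os.path.isdir("new")
        os.rmdir("new")              # rmdir only removes empty directories
        removed = not os.path.exists("new")
    finally:
        os.chdir(start_dir)          # always go back to the starting directory

print(created, removed)
```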
Writing Python output to the files

Sometimes we need to write the output of a Python script to a file.
The check_call() function of the subprocess module can be used to
execute a Python script and write the output of that script to a file.
The following example contains two Python scripts. The script file1.py
executes the script file.py and writes its output to the text file output.txt.
Example
file.py
temperatures = [10, -20, -289, 100]
def c_to_f(c):
    if c < -273.15:
        return "That temperature doesn't make sense!"
    else:
        f = c * 9 / 5 + 32
        return f
for t in temperatures:
    print(c_to_f(t))
file1.py
import subprocess
with open("output.txt", "wb") as f:
    subprocess.check_call(["python", "file.py"], stdout=f)
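When the code whose output we want to capture lives in the same program, a separate process is not required. The sketch below uses a different technique from the one above, contextlib.redirect_stdout from the standard library, to send print() output to a file. The simplified c_to_f function and the temporary output path are assumptions made for the example.

```python
import contextlib
import os
import tempfile

def c_to_f(c):
    # simplified conversion without the validity check used above
    return c * 9 / 5 + 32

with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "output.txt")

    # While the redirect is active, print() writes to the file, not the console.
    with open(out_path, "w") as f, contextlib.redirect_stdout(f):
        for t in [10, -20, 100]:
            print(c_to_f(t))

    with open(out_path) as f:
        captured = f.read().splitlines()

print(captured)
```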
The file related methods

The file object provides the following methods to manipulate files
on various operating systems.
SN  Method                      Description
1   file.close()                It closes the opened file. Once closed, the file can't be read or written anymore.
2   file.flush()                It flushes the internal buffer.
3   file.fileno()               It returns the file descriptor used by the underlying implementation to request I/O from the OS.
4   file.isatty()               It returns True if the file is connected to a TTY device, otherwise it returns False.
5   next(file)                  It returns the next line from the file. (In Python 2 this was written file.next().)
6   file.read([size])           It reads up to the specified size from the file.
7   file.readline([size])       It reads one line from the file and places the file pointer at the beginning of the next line.
8   file.readlines([sizehint])  It returns a list containing all the lines of the file. It reads the file until EOF occurs.
9   file.seek(offset[, from])   It moves the file pointer to the specified offset relative to the specified reference position.
10  file.tell()                 It returns the current position of the file pointer within the file.
11  file.truncate([size])       It truncates the file to the optional specified size.
12  file.write(str)             It writes the specified string to the file.
13  file.writelines(seq)        It writes a sequence of strings to the file.
2.4 Python Exceptions

When a Python program meets an error, it stops executing the
rest of the program. An error in Python might be either an error in the
syntax of an expression or a Python exception. We will see what an
exception is and how a syntax error differs from an exception.
Following that, we will learn about try and
except blocks and how to raise exceptions and make assertions. After that,
we will see the Python exceptions list.
What is an Exception?
An exception in Python is an incident that happens while executing a
program that causes the regular course of the program's commands to be
disrupted. When a Python code comes across a condition it can't handle, it
raises an exception. An object in Python that describes an error is called an
exception.
When a Python code throws an exception, it has two options: handle

the exception immediately or stop and quit.
Exceptions versus Syntax Errors

A syntax error occurs when the interpreter encounters a statement
that is not valid Python. Consider the following scenario:
Code:
# Python code containing a syntax error
string = "Python Exceptions"

for s in string:
    if (s != o:
        print(s)
Output:
    if (s != o:
              ^
SyntaxError: invalid syntax
The arrow in the output shows where the interpreter encountered the
syntax error. There was one unclosed bracket in this case. Close it and
rerun the program:
Code:
# Python code after removing the syntax error
string = "Python Exceptions"

for s in string:
    if (s != o):
        print(s)
Output:
      2 string = "Python Exceptions"
      3
      4 for s in string:
----> 5     if (s != o):
      6         print(s)
NameError: name 'o' is not defined
We encountered an exception after executing this code. This is the
kind of error that arises when syntactically valid Python code fails
while running. The last line of the output names the exception that
was raised. Instead of displaying just "exception error", Python
displays information about the sort of exception that occurred. It was a
NameError in this situation. Python includes several built-in exceptions,
and it also lets us define custom exceptions.
Try and Except Statement - Catching Exceptions

In Python, we catch exceptions and handle them using try and except
code blocks. The try clause contains the code that can raise an exception,
while the except clause contains the code that handles the exception.
Let's try to access an index beyond the length of a list and handle the
resulting exception.
Code:
# Python code to catch an exception and handle it using try and except code blocks
a = ["Python", "Exceptions", "try and except"]
try:
    # looping through the elements of the list a, choosing a range that goes beyond the length of the list
    for i in range(4):
        print("The index and element from the array is", i, a[i])
# if an IndexError occurs in the try block, the except block is executed by the Python interpreter
except IndexError:
    print("Index out of range")
Output:
The index and element from the array is 0 Python
The index and element from the array is 1 Exceptions
The index and element from the array is 2 try and except
Index out of range
The code that can potentially produce an error is placed inside
the try clause in the preceding example. When i reaches 3, the code
attempts to access an item beyond the end of the list, which does not
exist, resulting in an exception. The except clause then catches this
exception and runs its handler, so the program does not stop abruptly.
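Catching a named exception type, instead of every possible error, keeps unrelated problems visible. A minimal sketch follows; the helper element_at is a hypothetical function written only for this illustration.

```python
def element_at(items, index):
    """Return items[index], or a message if the index is out of range."""
    try:
        return items[index]
    except IndexError as err:
        # Only IndexError is handled here; other exceptions still propagate.
        return f"handled: {err}"

a = ["Python", "Exceptions", "try and except"]
ok = element_at(a, 1)
bad = element_at(a, 3)

print(ok)    # Exceptions
print(bad)   # handled: list index out of range
```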
How to Raise an Exception

If a condition does not meet our criteria but is valid according to the
Python interpreter, we can intentionally raise an exception using the raise
keyword. We can also raise a custom exception with this statement.
If we wish to use raise to generate an exception when a given
condition occurs, we may do so as follows:
Code:
# Python code to show how to raise an exception in Python
num = [3, 4, 5, 7]
if len(num) > 3:
    raise Exception(f"Length of the given list must be less than or equal to 3 but is {len(num)}")
Output:
      1 num = [3, 4, 5, 7]
      2 if len(num) > 3:
----> 3     raise Exception(f"Length of the given list must be less than or equal to 3 but is {len(num)}")
Exception: Length of the given list must be less than or equal to 3 but is 4
The program stops and shows our exception in the output,
indicating what went wrong.
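A custom exception, as mentioned above, is simply a class derived from Exception. The sketch below is one possible way to write it; the names ListTooLongError and check_length are hypothetical.

```python
class ListTooLongError(Exception):
    """Hypothetical custom exception for lists that exceed a length limit."""

def check_length(values, limit=3):
    # Raise our own exception type when the condition is violated.
    if len(values) > limit:
        raise ListTooLongError(
            f"Length must be less than or equal to {limit} but is {len(values)}"
        )
    return values

try:
    check_length([3, 4, 5, 7])
    message = None
except ListTooLongError as err:
    message = str(err)

print(message)
```

Giving the error its own type lets calling code handle this specific situation without catching unrelated exceptions.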
2.5 Python OOPs Concepts

Like many other general-purpose programming languages, Python has been
an object-oriented language since its beginning. It allows us to develop
applications using an object-oriented approach. In Python, we can easily
create and use classes and objects.
The object-oriented paradigm designs a program using classes
and objects. An object relates to a real-world entity such as a book, house,
pencil, etc. The OOP concept focuses on writing reusable code. It is a
widespread technique for solving problems by creating objects.
The major principles of an object-oriented programming system are given
below.
o Class
o Object
o Method
o Inheritance
o Polymorphism
o Data Abstraction
o Encapsulation
Class
A class can be defined as a blueprint for a collection of objects. It is
a logical entity that has some specific attributes and methods. For
example, an employee class should contain attributes such as an email
id, name, age, and salary, along with methods that operate on them.
class ClassName:
    <statement-1>
    <statement-N>
Object
The object is an entity that has state and behavior. It may be any
real-world object like the mouse, keyboard, chair, table, pen, etc.
Everything in Python is an object, and almost everything has

attributes and methods. All functions have a built-in attribute __doc__,
which returns the docstring defined in the function source code.
When we define a class, we need to create an object of it to allocate
memory. Consider the following example.
Example:
class car:
    def __init__(self, modelname, year):
        self.modelname = modelname
        self.year = year
    def display(self):
        print(self.modelname, self.year)
c1 = car("Toyota", 2016)
c1.display()
Output:
Toyota 2016
In the above example, we have created a class named car with
two attributes, modelname and year. We have created an object c1 to
access the class attributes. Memory is allocated for these values when
c1 is created. We will learn more about classes and objects in the
next section.
Method
The method is a function that is associated with an object. In Python,
a method is not unique to class instances. Any object type can have
methods.
Inheritance
Inheritance is the most important aspect of object-oriented
programming, which simulates the real-world concept of inheritance. It
specifies that the child object acquires all the properties and behaviors of
the parent object.
By using inheritance, we can create a class which uses all the

properties and behavior of another class. The new class is known as a
derived class or child class, and the one whose properties are acquired is
known as a base class or parent class.
It provides the re-usability of the code.
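A minimal sketch of inheritance is given here. The classes Vehicle and ElectricCar and their attributes are hypothetical names chosen for illustration; the child class reuses the parent's initializer and method and adds one attribute of its own.

```python
class Vehicle:                      # base (parent) class
    def __init__(self, modelname, year):
        self.modelname = modelname
        self.year = year

    def describe(self):
        return f"{self.modelname} {self.year}"

class ElectricCar(Vehicle):         # derived (child) class
    def __init__(self, modelname, year, battery_kwh):
        super().__init__(modelname, year)   # reuse the parent's initializer
        self.battery_kwh = battery_kwh

car = ElectricCar("Leaf", 2016, 30)
description = car.describe()        # method inherited from Vehicle

print(description)
print(isinstance(car, Vehicle))
```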
Polymorphism
The word polymorphism combines "poly" (many) and "morph" (shape or
form). Polymorphism means that the same task can be performed in
different ways. For example, suppose we have a class Animal, and all
animals speak, but each speaks differently. Here the "speak" behavior is
polymorphic: it depends on the animal. The abstract "animal" concept does
not actually "speak", but specific animals (like dogs and cats) have a
concrete implementation of the action "speak".
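The animal example can be sketched in code as follows. This is only an illustration: the class names Animal, Dog and Cat and the returned strings are assumptions made for this sketch, not part of the original notes.

```python
class Animal:
    def speak(self):
        # The general "animal" concept has no concrete sound of its own
        raise NotImplementedError("each animal defines its own speak()")

class Dog(Animal):
    def speak(self):
        return "Woof"

class Cat(Animal):
    def speak(self):
        return "Meow"

# The same call, speak(), behaves differently for each object
sounds = [animal.speak() for animal in (Dog(), Cat())]
print(sounds)
```

The loop does not need to know which kind of animal it holds; each object supplies its own version of speak().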
Encapsulation
Encapsulation is also an essential aspect of object-oriented
programming. Code and data are wrapped together within a single unit (a
class), and access to methods and variables is restricted so that the data
is not modified by accident.
Data Abstraction
Data abstraction and encapsulation are often used as synonyms,
because data abstraction is achieved through encapsulation.
Abstraction hides internal details and exposes only the functionality.
Abstracting something means giving it a name that captures the core of
what a function or a whole program does.
2.6 Python Class and Objects

As discussed in the previous section, a class is a logical entity that
can be seen as a blueprint of an object. A class definition by itself holds
no object data; the data comes into existence when the class is
instantiated. Let's understand this with an example.
Suppose a class is the plan of a building. The plan contains all the
details about the floors, rooms, doors, windows, etc., and we can
construct as many buildings as we want from it. Hence, the plan can be
seen as a class, and each building is an object of this class.
An object is an instance of a class. The process of creating an object
is called instantiation.
In this section, we will discuss creating classes and objects in Python.
We will also discuss how a class attribute is accessed through an object.
Creating classes in Python
In Python, a class is created by using the keyword class, followed by
the class name. The syntax to create a class is given below.
Syntax:
class ClassName:
    # statement_suite
A class can have a documentation string, which is accessed using
<class-name>.__doc__ (it is None if no docstring is given). A class
contains a statement suite with definitions of fields, a constructor,
functions, etc.
Consider the following example, which creates a class Employee with
two fields, id and name.
The class also contains a function display(), which displays the
information of the Employee.
Example
class Employee:
    id = 10
    name = "Devansh"

    def display(self):
        print(self.id, self.name)
Here, self is a reference variable that refers to the current object of
the class. It is always the first parameter in the method definition.
However, self is not passed explicitly in the method call; Python supplies
it automatically.
The self-parameter
The self parameter refers to the current instance of the class and
gives access to its attributes. Any name can be used instead of self, but it
must be the first parameter of every instance method of the class; self is
the universal convention.
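A short sketch may make the role of the first parameter clearer. The parameter name this is chosen here only to show that the name is arbitrary; it is not a recommendation.

```python
class Employee:
    id = 10
    name = "Devansh"

    def display(this):  # works, but 'self' is the convention
        return "%d %s" % (this.id, this.name)

emp = Employee()
via_instance = emp.display()        # Python passes emp as the first argument
via_class = Employee.display(emp)   # the same call written out explicitly
print(via_instance == via_class)
```

Calling the method through the object is simply a shorter way of calling the function on the class with the object as the first argument.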
Creating an instance of the class
A class needs to be instantiated before its instance attributes and
methods can be used. A class is instantiated by calling it by its class
name.
The syntax to create the instance of the class is given below.
<object-name> = <class-name>(<arguments>)
The following example creates the instance of the class Employee

defined in the above example.
Example
class Employee:
    id = 10
    name = "John"

    def display(self):
        print("ID: %d \nName: %s" % (self.id, self.name))

# Creating an emp instance of the Employee class
emp = Employee()
emp.display()
Output:
ID: 10
Name: John
In the above code, we created the Employee class, which has two
attributes named id and name with values assigned to them. Note that
display() takes self as its parameter; it refers to the object on which the
method is called.
We created a new instance named emp. Through it, we can access the
attributes of the class.
Delete the Object
We can delete the properties of the object or the object itself by using the
del keyword. Only attributes stored on the object itself can be deleted
through it, so the example sets id and name in __init__(). Consider the
following example.
Example
class Employee:
    def __init__(self):
        # instance attributes, so they can be deleted from the object
        self.id = 10
        self.name = "John"

    def display(self):
        print("ID: %d \nName: %s" % (self.id, self.name))

# Creating an emp instance of the Employee class
emp = Employee()
# Deleting the property of the object
del emp.id
# Deleting the object itself
del emp
emp.display()
It will throw a NameError, because the name emp no longer exists after
the object is deleted.
2.7 Python Constructor
A constructor is a special type of method (function) which is used to
initialize the instance members of the class.
In C++ or Java, the constructor has the same name as its class.
Python treats constructors differently: the constructor is always named
__init__(), and it initializes an object when the object is created.
Constructors can be of two types.
1. Parameterized Constructor
2. Non-parameterized Constructor
The constructor is executed when we create an object of the class. It is
also the natural place to perform any start-up task the object needs.
Creating the constructor in python

In Python, the __init__() method plays the role of the constructor of
the class. This method is called when the class is instantiated. It accepts
self as its first argument, which allows access to the attributes and
methods of the object.
We can pass any number of arguments when creating the object,
depending upon the __init__() definition. It is mostly used to initialize the
object's attributes. A class does not have to define __init__(); if it does
not, Python uses the default one inherited from object.
Consider the following example to initialize the Employee class attributes.
Example
class Employee:
    def __init__(self, name, id):
        self.id = id
        self.name = name

    def display(self):
        print("ID: %d \nName: %s" % (self.id, self.name))

emp1 = Employee("John", 101)
emp2 = Employee("David", 102)

# accessing display() method to print employee 1 information
emp1.display()

# accessing display() method to print employee 2 information
emp2.display()
Output:
ID: 101
Name: John
ID: 102
Name: David
Counting the number of objects of a class

The constructor is called automatically each time we create an object of
the class, so a class-level counter incremented in __init__() counts the
objects created. Consider the following example.
Example
class Student:
    count = 0

    def __init__(self):
        Student.count = Student.count + 1

s1 = Student()
s2 = Student()
s3 = Student()
print("The number of students:", Student.count)
Output:
The number of students: 3
Python Non-Parameterized Constructor

A non-parameterized constructor takes only self as an argument. It is
used when we do not need to pass any values while creating the object.
Example
class Student:
    # Constructor - non parameterized
    def __init__(self):
        print("This is non parametrized constructor")

    def show(self, name):
        print("Hello", name)

student = Student()
student.show("John")
Python Parameterized Constructor

A parameterized constructor takes one or more parameters in addition
to self. Consider the following example.
Example
class Student:
    # Constructor - parameterized
    def __init__(self, name):
        print("This is parametrized constructor")
        self.name = name

    def show(self):
        print("Hello", self.name)

student = Student("John")
student.show()
Output:
This is parametrized constructor
Hello John
Python Default Constructor

When we do not define a constructor in the class, Python uses a
default constructor. It performs no task other than creating the object.
Consider the following example.
Example
class Student:
    roll_num = 101
    name = "Joseph"

    def display(self):
        print(self.roll_num, self.name)

st = Student()
st.display()
Output:
101 Joseph
More than One Constructor in Single class

Let's look at another scenario: what happens if we declare two
constructors in the same class?
Example
class Student:
    def __init__(self):
        print("The First Constructor")

    def __init__(self):
        print("The Second Constructor")

st = Student()
Output:
The Second Constructor
In the above code, the object st called the second constructor, although
both have the same signature. Python does not support constructor
overloading: the second definition of __init__() replaces the first, so the
first is no longer accessible. If a class defines __init__() several times,
only the last definition is used.
Python built-in class functions

Python provides the following built-in functions for working with the
attributes of an object.
SN  Function                      Description
1   getattr(obj, name, default)   It is used to access the attribute of the object.
2   setattr(obj, name, value)     It is used to set a particular value to the specific attribute of an object.
3   delattr(obj, name)            It is used to delete a specific attribute.
4   hasattr(obj, name)            It returns True if the object contains the specified attribute.
Example
class Student:
    def __init__(self, name, id, age):
        self.name = name
        self.id = id
        self.age = age

# creates the object of the class Student
s = Student("John", 101, 22)
# prints the attribute name of the object s
print(getattr(s, 'name'))
# reset the value of attribute age to 23
setattr(s, "age", 23)
# prints the modified value of age
print(getattr(s, 'age'))
# prints True if the student contains the attribute with name id
print(hasattr(s, 'id'))
# deletes the attribute age
delattr(s, 'age')
# this will give an error since the attribute age has been deleted
print(s.age)
Output:
John
23
True
AttributeError: 'Student' object has no attribute 'age'
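The third argument of getattr() listed in the table can be used to avoid the AttributeError shown above. The following is a minimal sketch; the fallback value "unknown" is an assumption chosen only for illustration.

```python
class Student:
    def __init__(self, name, age):
        self.name = name
        self.age = age

s = Student("John", 22)
delattr(s, "age")

# With a default, getattr() returns the fallback instead of raising an error
age_or_default = getattr(s, "age", "unknown")
# hasattr() lets us check before accessing the attribute
has_age = hasattr(s, "age")
print(age_or_default, has_age)
```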
Built-in class attributes

Along with the other attributes, a Python class also contains some
built-in class attributes which provide information about the class.
The built-in class attributes are given in the table below.
SN  Attribute    Description
1   __dict__     It provides the dictionary containing the information about the class namespace.
2   __doc__      It contains a string which has the class documentation.
3   __name__     It is used to access the class name.
4   __module__   It is used to access the module in which this class is defined.
5   __bases__    It contains a tuple including all base classes.
Example
class Student:
    def __init__(self, name, id, age):
        self.name = name
        self.id = id
        self.age = age

    def display_details(self):
        print("Name:%s, ID:%d, age:%d" % (self.name, self.id, self.age))

s = Student("John", 101, 22)
print(s.__doc__)
print(s.__dict__)
print(s.__module__)
Output:
None
{'name': 'John', 'id': 101, 'age': 22}
__main__
2.8 Data Hiding in Python
What is Data Hiding?

Data hiding is a part of object-oriented programming that conceals an
object's internal details, such as its data members and internal workings,
from the user. It maintains data integrity by restricting access to the class
members. Data hiding works by combining the data and the functions that
operate on it into a single unit, so that the data is concealed within the
class and is not accessed directly from outside it.
This process is also known as data encapsulation. In this process, we
declare class members as private so that code outside the class does not
access them; they are meant to be used only within the class.
Data Hiding in Python

Python is a widely used programming language with a straightforward
syntax and a large set of libraries. Data hiding isolates the client from a
part of the program's implementation: some members are hidden from the
user, and a program or module exposes only how it is meant to be used,
not how it works internally. This protects the data from accidental misuse
and reduces the dependency of client code on implementation details.
We can perform data hiding in Python by prefixing a member name
with a double underscore (__). Python then stores the member under a
mangled name, which makes it inaccessible by its plain name from outside
the class.
Let's understand the following example.
Example -
class CounterClass:
    __privateCount = 0

    def count(self):
        self.__privateCount += 1
        print(self.__privateCount)

counter = CounterClass()
counter.count()
counter.count()
print(counter.__privateCount)
Output:
1
2
File "<string>", line 17, in <module>
AttributeError: 'CounterClass' object has no attribute
'__privateCount'
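However, the private member can still be accessed from outside the class
through its mangled name, which has the form _ClassName__member.
Replacing the last line of the example with the following statement prints
the value instead of raising an error.
print(counter._CounterClass__privateCount)
Output:
1
2
2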
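To see how Python stores a double-underscore member, we can inspect the object's attribute dictionary. The sketch below is an illustration only; the method name get_count() and the attribute name __private_count are assumptions introduced for this example.

```python
class CounterClass:
    def __init__(self):
        # Stored on the object as _CounterClass__private_count
        self.__private_count = 0

    def count(self):
        self.__private_count += 1

    def get_count(self):
        # Read-only access through a method instead of the raw attribute
        return self.__private_count

counter = CounterClass()
counter.count()
counter.count()

stored_names = list(vars(counter))  # names as Python actually stores them
current = counter.get_count()
print(stored_names, current)
```

Offering a method such as get_count() is the usual way to let outside code read a hidden value without touching the mangled name.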
Advantages of Data Hiding

Below are the main advantages of data hiding.
o Objects are decoupled from data that is irrelevant to them.
o It protects important data from unintended access by code outside
the class.
o It isolates objects from one another, which is a basic concept of OOP.
o It keeps programmers from accidentally linking to incorrect or corrupt
data.
o It prevents damage to sensitive data by hiding it from public access.
Disadvantages of Data Hiding

Data hiding also has some disadvantages, given below.
o Programmers sometimes need to write extra lines of code.
o The link between visible and hidden data can make objects work
faster, and data hiding prevents this linkage.
o It forces programmers to write extra code to hide important data
from common users.
Conclusion
Data hiding is an important aspect of privacy and security within an
application. It plays an essential role in preventing unintended access to
data. It has some disadvantages, but these are outweighed by its
advantages.
2.9 Abstraction in Python

Abstraction is used to hide the internal workings of a function from
its users. The users interact only with the function's interface, while its
inner working is hidden. The user knows what the function does but not
how it does it.
In simple words, we all use a smartphone and are familiar with its
functions such as the camera, voice recorder, and call dialing, but we
don't know how these operations happen in the background. As another
example, when we use the TV remote to increase the volume, we don't
know how pressing a key increases the volume of the TV. We only know
that pressing the "+" button increases the volume.
Why Abstraction is Important?

In Python, abstraction is used to hide irrelevant data and classes in
order to reduce complexity. It also makes large applications easier to
develop and maintain. Next, we will learn how to achieve abstraction in a
Python program.
Abstraction classes in Python

In Python, abstraction can be achieved by using abstract classes and
interfaces.
A class that contains one or more abstract methods is called an
abstract class. Abstract methods declare a name and signature but leave
the implementation to subclasses. An abstract class can be inherited by a
subclass, and the abstract method gets its definition in the subclass.
Abstract classes are meant to be blueprints for other classes. An abstract
class is useful when designing large programs, and it also helps to provide
a standard interface for different implementations of a component. Python
provides the abc module to use abstraction in a Python program.
Let's see the following syntax.
Syntax
from abc import ABC

class ClassName(ABC):
    # class body
We import the ABC class from the abc module.
Abstract Base Classes

An abstract base class defines a common application programming
interface (API) for a set of subclasses. It can be used by third parties who
provide implementations, for example as plugins. It is also helpful in a
large code base, where it is hard to remember all the classes.
Working of the Abstract Classes

Unlike some other high-level languages, Python has no built-in
keyword for abstract classes. Instead, we import the abc module, which
provides the base for defining abstract base classes (ABC). An ABC works
by decorating methods of the base class as abstract. Concrete classes are
then registered as implementations of the abstract base, normally by
inheriting from it. We use the @abstractmethod decorator to define an
abstract method. A method without the decorator is an ordinary method,
even if its body is only pass. Let's understand the following example.
Example -
# Python program to demonstrate
# how an abstract base class works
from abc import ABC, abstractmethod

class Car(ABC):
    @abstractmethod
    def mileage(self):
        pass

class Tesla(Car):
    def mileage(self):
        print("The mileage is 30kmph")

class Suzuki(Car):
    def mileage(self):
        print("The mileage is 25kmph")

class Duster(Car):
    def mileage(self):
        # the original body is missing; 24kmph is an assumed value
        print("The mileage is 24kmph")

class Renault(Car):
    def mileage(self):
        # the original body is missing; 27kmph is an assumed value
        print("The mileage is 27kmph")

# Driver code
t = Tesla()
t.mileage()

r = Renault()
r.mileage()

s = Suzuki()
s.mileage()

d = Duster()
d.mileage()
Output:
The mileage is 30kmph
The mileage is 27kmph
The mileage is 25kmph
The mileage is 24kmph
Explanation -
In the above code, we imported the abc module to create the abstract
base class. We created the Car class, which inherits from ABC, and
defined an abstract method named mileage(). We then derived four
subclasses from the base class and implemented the abstract method
differently in each. Finally, we created objects of the subclasses and
called the method on each.
Let's understand another example.
Example -
# Python program to define
# abstract class

from abc import ABC, abstractmethod

class Polygon(ABC):

    # abstract method
    @abstractmethod
    def sides(self):
        pass

class Triangle(Polygon):

    def sides(self):
        print("Triangle has 3 sides")

class Pentagon(Polygon):

    def sides(self):
        print("Pentagon has 5 sides")

class Hexagon(Polygon):

    def sides(self):
        print("Hexagon has 6 sides")

class square(Polygon):

    def sides(self):
        print("Square has 4 sides")

# Driver code
t = Triangle()
t.sides()

s = square()
s.sides()

p = Pentagon()
p.sides()

k = Hexagon()
k.sides()
Output:
Triangle has 3 sides
Square has 4 sides
Pentagon has 5 sides
Hexagon has 6 sides
Explanation -
In the above code, we defined the abstract base class named Polygon
with an abstract method sides(). This base class is inherited by the
various subclasses, and each subclass implements the abstract method.
We created objects of the subclasses and invoked the sides() method on
each; the implementation inside the respective subclass is the one that
runs. The abstract sides() method defined in the abstract class is never
invoked.
Points to Remember
Below are the points we should remember about abstract base classes
in Python.
o An abstract class can contain both normal methods and abstract
methods.
o An abstract class cannot be instantiated; we cannot create objects of
the abstract class.
Abstraction hides the internal implementation from the users. We have
covered the basic concepts of abstraction in Python.
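Both points can be checked with a small sketch. It assumes the @abstractmethod decorator is applied to sides(); the normal method describe() is a hypothetical addition made for this illustration.

```python
from abc import ABC, abstractmethod

class Polygon(ABC):
    @abstractmethod
    def sides(self):
        """Return the number of sides."""

    def describe(self):
        # A normal method in an abstract class can rely on the abstract one
        return "This polygon has %d sides" % self.sides()

class Triangle(Polygon):
    def sides(self):
        return 3

try:
    Polygon()               # abstract class: instantiation is refused
    instantiated = True
except TypeError:
    instantiated = False

message = Triangle().describe()
print(instantiated, message)
```

Python raises TypeError for Polygon() because sides() has no concrete implementation there, while Triangle can be created and inherits describe() unchanged.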
2.10 Python Inheritance

Inheritance is an important aspect of the object-oriented paradigm.
Inheritance provides code reusability because we can use an existing
class to create a new class instead of creating it from scratch.
In inheritance, the child class acquires the properties of the parent
class and can access all the data members and functions defined in it. A
child class can also provide its own implementation of the functions of the
parent class. In this section, we will discuss inheritance in detail.
In Python, a derived class inherits a base class by mentioning the
base class in parentheses after the derived class name. Consider the
following syntax to inherit a base class into the derived class.
Syntax
class derived-class(base class):
    <class-suite>
A class can inherit multiple classes by mentioning all of them inside
the parentheses. Consider the following syntax.
Syntax
class derived-class(<base class 1>, <base class 2>, ..... <base class n>):
    <class-suite>
Example 1
class Animal:
    def speak(self):
        print("Animal Speaking")

# child class Dog inherits the base class Animal
class Dog(Animal):
    def bark(self):
        print("dog barking")

d = Dog()
d.bark()
d.speak()
Output:
dog barking
Animal Speaking
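A child class can also replace a method of its parent, as mentioned at the start of this section. The sketch below is illustrative only: it changes speak() to return a string and uses super() to reuse the parent's version, neither of which appears in the example above.

```python
class Animal:
    def speak(self):
        return "Animal speaking"

class Dog(Animal):
    def speak(self):
        # Override the parent's method, but still reuse its result
        base = super().speak()
        return base + ", then dog barking"

result = Dog().speak()
print(result)
```

Without the call to super().speak(), the Dog version would completely replace the Animal version.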
Python Multi-Level inheritance

Multi-level inheritance is possible in Python, as in other object-oriented
languages. Multi-level inheritance is achieved when a derived class inherits
from another derived class. There is no limit on the number of levels up to
which multi-level inheritance can be carried in Python.
The syntax of multi-level inheritance is given below.
Syntax
class class1:
    <class-suite>
class class2(class1):
    <class-suite>
class class3(class2):
    <class-suite>
.
.
Example
class Animal:
    def speak(self):
        print("Animal Speaking")

# The child class Dog inherits the base class Animal
class Dog(Animal):
    def bark(self):
        print("dog barking")

# The child class DogChild inherits another child class Dog
class DogChild(Dog):
    def eat(self):
        print("Eating bread...")

d = DogChild()
d.bark()
d.speak()
d.eat()
Output:
dog barking
Animal Speaking
Eating bread...
Python Multiple inheritance

Python provides us the flexibility to inherit multiple base classes in the
child class.
The syntax to perform multiple inheritance is given below.
Syntax
class Base1:
    <class-suite>

class Base2:
    <class-suite>
.
.
.
class BaseN:
    <class-suite>

class Derived(Base1, Base2, ...... BaseN):
    <class-suite>
Example
class Calculation1:
    def Summation(self, a, b):
        return a + b

class Calculation2:
    def Multiplication(self, a, b):
        return a * b

class Derived(Calculation1, Calculation2):
    def Divide(self, a, b):
        return a / b

d = Derived()
print(d.Summation(10, 20))
print(d.Multiplication(10, 20))
print(d.Divide(10, 20))
Output:
30
200
0.5
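When two base classes define a method with the same name, Python picks one by following the method resolution order (MRO), which lists the bases from left to right. The following sketch is an illustration; the method name describe() is an assumption added for this example.

```python
class Calculation1:
    def describe(self):
        return "from Calculation1"

class Calculation2:
    def describe(self):
        return "from Calculation2"

class Derived(Calculation1, Calculation2):
    pass

chosen = Derived().describe()                        # leftmost base wins
order = [cls.__name__ for cls in Derived.__mro__]    # the search order
print(chosen, order)
```

Swapping the order of the bases in the class statement would make Calculation2's version the one that is called.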
The issubclass (sub,sup) method

The issubclass(sub, sup) function is used to check the relationship
between the specified classes. It returns True if the first class is a
subclass of the second class, and False otherwise.
Example
class Calculation1:
    def Summation(self, a, b):
        return a + b

class Calculation2:
    def Multiplication(self, a, b):
        return a * b

class Derived(Calculation1, Calculation2):
    def Divide(self, a, b):
        return a / b

d = Derived()
print(issubclass(Derived, Calculation2))
print(issubclass(Calculation1, Calculation2))
Output:
True
False
The isinstance (obj, class) method

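The isinstance() function is used to check the relationship between
objects and classes. It returns True if the first parameter, obj, is an
instance of the second parameter, class, or of one of its subclasses.
Example
class Calculation1:
    def Summation(self, a, b):
        return a + b

class Calculation2:
    def Multiplication(self, a, b):
        return a * b

class Derived(Calculation1, Calculation2):
    def Divide(self, a, b):
        return a / b

d = Derived()
print(isinstance(d, Derived))
Output:
True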
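isinstance() also accepts base classes and a tuple of classes as its second argument. The sketch below uses empty stand-in classes with the same names as above purely for brevity.

```python
class Calculation1:
    pass

class Calculation2:
    pass

class Derived(Calculation1, Calculation2):
    pass

d = Derived()
checks = (
    isinstance(d, Calculation1),    # an instance of a base class as well
    isinstance(d, (int, Derived)),  # True if any class in the tuple matches
    isinstance(d, str),             # unrelated class
)
print(checks)
```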
Data Science using Python – Unit III
UNIT – III
3.1 What is NumPy?

NumPy is a Python library used for working with arrays. It also has
functions for working in the domains of linear algebra, Fourier transforms,
and matrices. NumPy was created in 2005 by Travis Oliphant. It is an
open-source project and you can use it freely.
NumPy stands for Numerical Python.
NumPy is a general-purpose array-processing package. It provides a
high-performance multidimensional array object and tools for working
with these arrays. It is the fundamental package for scientific computing
with Python. It contains various features, including these important ones:
o A powerful N-dimensional array object
o Sophisticated (broadcasting) functions
o Tools for integrating C/C++ and Fortran code
o Useful linear algebra, Fourier transform, and random number
capabilities
Besides its obvious scientific uses, NumPy can also be used as an
efficient multi-dimensional container of generic data. Arbitrary data types
can be defined using NumPy, which allows NumPy to seamlessly and
speedily integrate with a wide variety of databases.
Installation:
o Mac and Linux users can install NumPy via the pip command:
pip install numpy
o Windows users can run the same pip command from the command
prompt. Alternatively, a pre-built Windows package matching the
system configuration and Python version can be downloaded and
installed manually.
NumPy's main object is the homogeneous multidimensional array.
o It is a table of elements (usually numbers), all of the same type,
indexed by a tuple of non-negative integers.
o In NumPy, dimensions are called axes. The number of axes is the rank.
o NumPy's array class is called ndarray. It is also known by the
alias array.
Example :
# Python program to demonstrate
# basic array characteristics
import numpy as np
# Creating array object
arr = np.array( [[ 1, 2, 3],[ 4, 2, 5]] )
# Printing type of arr object
print("Array is of type: ", type(arr))
# Printing array dimensions (axes)
print("No. of dimensions: ", arr.ndim)
# Printing shape of array
print("Shape of array: ", arr.shape)
# Printing size (total number of elements) of array
print("Size of array: ", arr.size)
# Printing type of elements in array
print("Array stores elements of type: ", arr.dtype)
Output :
Array is of type:  <class 'numpy.ndarray'>
No. of dimensions: 2
Shape of array: (2, 3)
Size of array: 6
Array stores elements of type: int64
3.2 Array creation:

There are various ways to create arrays in NumPy. For example,
you can create an array from a regular Python list or tuple using
the array function. The type of the resulting array is deduced from
the type of the elements in the sequences. Often, the elements of an
array are originally unknown, but its size is known. Hence, NumPy
offers several functions to create arrays with initial placeholder
content. These minimize the necessity of growing arrays, an
expensive operation. For example: np.zeros, np.ones, np.full,
np.empty, etc.
To create sequences of numbers, NumPy provides a function
analogous to range that returns arrays instead of lists.
o arange: returns evenly spaced values within a given interval; the
step size is specified.
o linspace: returns evenly spaced values within a given interval; the
number of elements (num) is specified.
o Reshaping array: We can use the reshape method to reshape an array.
Consider an array with shape (a1, a2, a3, …, aN). We can reshape
and convert it into another array with shape (b1, b2, b3, …, bM). The
only required condition is: a1 x a2 x a3 … x aN = b1 x b2 x b3 … x bM
(i.e. the original size of the array remains unchanged).
o Flatten array: We can use the flatten method to get a copy of the array
collapsed into one dimension. It accepts an order argument. The default
value is 'C' (for row-major order). Use 'F' for column-major order.
Note: The type of an array can be explicitly defined while creating the array.
# array creation techniques
import numpy as np

# Creating array from list with type float
a = np.array([[1, 2, 4], [5, 8, 7]], dtype='float')
print("Array created using passed list:\n", a)

# Creating array from tuple
b = np.array((1, 3, 2))
print("\nArray created using passed tuple:\n", b)

# Creating a 3X4 array with all zeros
c = np.zeros((3, 4))
print("\nAn array initialized with all zeros:\n", c)

# Create a constant value array of complex type
d = np.full((3, 3), 6, dtype='complex')
print("\nAn array initialized with all 6s. Array type is complex:\n", d)

# Create an array with random values
e = np.random.random((2, 2))
print("\nA random array:\n", e)

# Create a sequence of integers from 0 to 30 with steps of 5
f = np.arange(0, 30, 5)
print("\nA sequential array with steps of 5:\n", f)

# Create a sequence of 10 values in range 0 to 5
g = np.linspace(0, 5, 10)
print("\nA sequential array with 10 values between 0 and 5:\n", g)

# Reshaping 3X4 array to 2X2X3 array
arr = np.array([[1, 2, 3, 4], [5, 2, 4, 2], [1, 2, 0, 1]])
newarr = arr.reshape(2, 2, 3)
print("\nOriginal array:\n", arr)
print("Reshaped array:\n", newarr)

# Flatten array
arr = np.array([[1, 2, 3], [4, 5, 6]])
flarr = arr.flatten()
print("\nOriginal array:\n", arr)
print("Flattened array:\n", flarr)
Output :
Array created using passed list:
[[ 1. 2. 4.]
[ 5. 8. 7.]]
Array created using passed tuple:

[1 3 2]
An array initialized with all zeros:

[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]]
An array initialized with all 6s. Array type is complex:

[[ 6.+0.j 6.+0.j 6.+0.j]
[ 6.+0.j 6.+0.j 6.+0.j]

[ 6.+0.j 6.+0.j 6.+0.j]]
A random array:
[[ 0.46829566 0.67079389]
[ 0.09079849 0.95410464]]
A sequential array with steps of 5:

[ 0 5 10 15 20 25]
A sequential array with 10 values between 0 and 5:

[ 0.         0.55555556 1.11111111 1.66666667 2.22222222 2.77777778
3.33333333 3.88888889 4.44444444 5.        ]
Original array:
[[1 2 3 4]
[5 2 4 2]
[1 2 0 1]]
Reshaped array:
[[[1 2 3]
[4 5 2]]
[[4 2 1]
[2 0 1]]]
Original array:
[[1 2 3]
[4 5 6]]
Flattened array:
[1 2 3 4 5 6]
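The size condition for reshape and the copy behaviour of flatten can be checked with a short sketch. The shapes used here are arbitrary choices for illustration.

```python
import numpy as np

arr = np.arange(12)              # 12 elements: 0 .. 11
inferred = arr.reshape(3, -1)    # -1 lets NumPy work out the 4 columns

try:
    arr.reshape(5, 3)            # 5 x 3 = 15 elements, but arr has 12
    mismatch_ok = True
except ValueError:
    mismatch_ok = False

flat = inferred.flatten()
flat[0] = 99                     # flatten returns a copy, arr is untouched
print(inferred.shape, mismatch_ok, arr[0])
```

Passing -1 for one dimension is a convenient way to satisfy the size condition without computing the missing dimension by hand.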
3.3 Data Types in NumPy Arrays

By default, Python has these data types:
strings - used to represent text data; the text is given within quote
marks, e.g. "ABCD"
integer - used to represent integer numbers, e.g. -1, -2, -3
float - used to represent real numbers, e.g. 1.2, 42.42
boolean - used to represent True or False
complex - used to represent complex numbers, e.g. 1.0 + 2.0j
Data Types in NumPy
NumPy has some extra data types, and refers to data types with one
character, like i for integers, u for unsigned integers, etc.
Below is a list of all data types in NumPy and the characters used to
represent them.
i - integer
b - boolean
u - unsigned integer
f - float
c - complex float
m - timedelta
M - datetime
O - object
S - string
U - unicode string
V - fixed chunk of memory for other type ( void )
Checking the Data Type of an Array
The NumPy array object has a property called dtype that returns the
data type of the array:
Get the data type of an array object:
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr.dtype)
Get the data type of an array containing strings:
import numpy as np
arr = np.array(['apple', 'banana', 'cherry'])
print(arr.dtype)
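The type characters listed above can also be used to convert an existing array with the astype method. This is a minimal sketch; the sample values are arbitrary.

```python
import numpy as np

floats = np.array([1.7, 2.2, 3.9])
ints = floats.astype('i')        # 'i' is the integer type character
texts = np.array(['1', '2', '3']).astype(float)  # strings to floats

print(floats.dtype.kind, ints, texts)
```

Note that converting floats to integers drops the fractional part rather than rounding, and astype always returns a new array.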
3.4 Arithmetic with NumPy:

NumPy provides a wide range of built-in arithmetic functions.
o Operations on single array: We can use overloaded arithmetic
operators to do element-wise operations on an array and create a new
array. With the +=, -=, *= operators, the existing array is modified
in place.
# basic operations on single array
import numpy as np

a = np.array([1, 2, 5, 3])

# add 1 to every element
print("Adding 1 to every element:", a + 1)

# subtract 3 from each element
print("Subtracting 3 from each element:", a - 3)

# multiply each element by 10
print("Multiplying each element by 10:", a * 10)

# square each element
print("Squaring each element:", a ** 2)

# modify existing array
a *= 2
print("Doubled each element of original array:", a)

# transpose of array
a = np.array([[1, 2, 3], [3, 4, 5], [9, 6, 0]])
print("\nOriginal array:\n", a)
print("Transpose of array:\n", a.T)
Output :
Adding 1 to every element: [2 3 6 4]
Subtracting 3 from each element: [-2 -1 2 0]
Multiplying each element by 10: [10 20 50 30]
Squaring each element: [ 1 4 25 9]
Doubled each element of original array: [ 2 4 10 6]
Original array:
[[1 2 3]
[3 4 5]
[9 6 0]]
Transpose of array:
[[1 3 9]
[2 4 6]
[3 5 0]]
o Unary operators: Many unary operations are provided as methods
of the ndarray class. These include sum, min, max, etc. These functions
can also be applied row-wise or column-wise by setting the axis
parameter.
# unary operators in numpy
import numpy as np

arr = np.array([[1, 5, 6], [4, 7, 2], [3, 1, 9]])

# maximum element of array
print("Largest element is:", arr.max())
print("Row-wise maximum elements:", arr.max(axis=1))

# minimum element of array
print("Column-wise minimum elements:", arr.min(axis=0))

# sum of array elements
print("Sum of all array elements:", arr.sum())

# cumulative sum along each row
print("Cumulative sum along each row:\n", arr.cumsum(axis=1))
Output :
Largest element is: 9
Row-wise maximum elements: [6 7 9]
Column-wise minimum elements: [1 1 2]
Sum of all array elements: 38
Cumulative sum along each row:
[[ 1 6 12]
[ 4 11 13]
[ 3 4 13]]
 Binary operators: These operations apply on array elementwise and

a new array is created. You can use all basic arithmetic operators like
+, -, /, *, etc. In case of +=, -=, *= operators, the existing array is
modified.
# binary operators in Numpy
import numpy as np
a = np.array([[1, 2], [3, 4]])
b = np.array([[4, 3],[2, 1]])
# add arrays
print ("Array sum:\n", a + b)
# multiply arrays (elementwise multiplication)
print ("Array multiplication:\n", a*b)
# matrix multiplication
print ("Matrix multiplication:\n", a.dot(b))
Output:
Array sum:
[[5 5]
[5 5]]
Array multiplication:
[[4 6]
[6 4]]
Matrix multiplication:
[[ 8 5]
[20 13]]
3.5 Array Indexing:

Knowing the basics of array indexing is important for analyzing and
manipulating the array object. NumPy offers many ways to do array
indexing.
 Slicing: Just like lists in python, NumPy arrays can be sliced. As

arrays can be multidimensional, you need to specify a slice for each
dimension of the array.
Slicing in Python means taking elements from one given index up to
(but not including) another given index.
We pass a slice instead of an index, like this: [start:end].
We can also define the step, like this: [start:end:step].
If we don't pass start, it is taken to be 0.
If we don't pass end, it is taken to be the length of the array in that dimension.
If we don't pass step, it is taken to be 1.
Example:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5])
Negative Slicing
Use the minus operator to refer to an index from the end:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[-3:-1])
Use the step value to determine the step of the slicing:
Example
Return every other element from index 1 to index 5:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5:2])
Slicing 2-D Arrays
Example
From the second element, slice elements from index 1 to index 4 (not
included):
import numpy as np
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[1, 1:4])
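The slicing examples above are listed without their results. The sketch below repeats them with the values each slice returns; the name arr2d and the two extra slices that rely on default bounds are added here only for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5])      # [2 3 4 5]  index 5 is excluded
print(arr[-3:-1])    # [5 6]      third from the end up to, not including, the last
print(arr[1:5:2])    # [2 4]      every other element from index 1 to index 4
print(arr[:3])       # [1 2 3]    start defaults to 0
print(arr[4:])       # [5 6 7]    end defaults to the array length

arr2d = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr2d[1, 1:4])  # [7 8 9]   row 1, columns 1 to 3
```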
 Integer array indexing:

In this method, lists are passed for indexing for each dimension. One
to one mapping of corresponding elements is done to construct a new
arbitrary array.
# indexing in numpy
import numpy as np
# An exemplar array
arr = np.array([[-1, 2, 0, 4], [4, -0.5, 6, 0], [2.6, 0, 7, 8],
                [3, -7, 4, 2.0]])
# Integer array indexing example
temp = arr[[0, 1, 2, 3], [3, 2, 1, 0]]
print ("\nElements at indices (0, 3), (1, 2), (2, 1)," "(3, 0):\n", temp)
 Boolean array indexing:

This method is used when we want to pick elements from array which
satisfy some condition.
# boolean array indexing example
cond = arr > 0 # cond is a boolean array
temp = arr[cond]
print ("\nElements greater than 0:\n", temp)
Output :
Array with first 2 rows and alternate columns (0 and 2):
[[-1. 0.]
[ 4. 6.]]
Elements at indices (0, 3), (1, 2), (2, 1),(3, 0):
[ 4. 6. 0. 3.]
Elements greater than 0:
[ 2. 4. 4. 6. 2.6 7. 8. 3. 4. 2. ]
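The first entry of the output above comes from a slice whose code is not listed in this section. A slice that would produce it, assuming the same arr as in the listing, might look like the sketch below; the combined mask at the end is an extra illustration and not part of the original example.

```python
import numpy as np

arr = np.array([[-1, 2, 0, 4], [4, -0.5, 6, 0], [2.6, 0, 7, 8],
                [3, -7, 4, 2.0]])

# rows 0 and 1, every second column (columns 0 and 2)
temp = arr[:2, ::2]
print("Array with first 2 rows and alternate columns (0 and 2):\n", temp)

# conditions can be combined with & (and) and | (or); each needs parentheses
mask = (arr > 0) & (arr < 5)
print("Elements between 0 and 5:\n", arr[mask])
```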
 Swapping in arrays
Numpy allows you to swap axes without costing anything in
memory, and very little in time. The obvious axis swap is a 2D array
transpose:
>>> import numpy as np
>>> arr = np.arange(10).reshape((5, 2))
>>> arr
array([[0, 1],
[2, 3],
[4, 5],
[6, 7],
[8, 9]])
>>> arr.T
array([[0, 2, 4, 6, 8],
[1, 3, 5, 7, 9]])
The transpose method and the np.transpose function do the same
thing as the .T attribute above:
>>> arr.transpose()
array([[0, 2, 4, 6, 8],
[1, 3, 5, 7, 9]])
The advantage of transpose over the .T attribute is that it allows you to
move axes into any arbitrary order.
For example, let’s say you had a 3D array:
>>> arr = np.arange(24).reshape((2, 3, 4))
>>> arr
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
>>> arr.shape
(2, 3, 4)
>>> arr[:, :, 0]
array([[ 0, 4, 8],
[12, 16, 20]])
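The 3D example stops before showing the axes being reordered. A minimal sketch of that step, assuming the same arr, is given below; the names moved and swapped are illustrative.

```python
import numpy as np

arr = np.arange(24).reshape((2, 3, 4))

# transpose with an explicit order: new axis 0 is old axis 2,
# new axis 1 is old axis 0, new axis 2 is old axis 1
moved = arr.transpose(2, 0, 1)
print(moved.shape)                       # (4, 2, 3)

# swapaxes exchanges just two axes
swapped = arr.swapaxes(0, 2)
print(swapped.shape)                     # (4, 3, 2)

# the result is a view onto the same data, so no copy is made
print(np.shares_memory(arr, moved))      # True
print(moved[3, 1, 2] == arr[1, 2, 3])    # True
```

For a 3D array, .T simply reverses the axis order, which here gives the same shape as swapaxes(0, 2).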
 Universal functions (ufunc):

NumPy provides familiar mathematical functions such as sin, cos, exp,
etc. These functions also operate elementwise on an array, producing
an array as output.
Note: All the operations we did above using overloaded operators can
also be done with functions such as np.add, np.subtract, np.multiply
and np.divide (which are ufuncs), and with np.sum.
# universal functions in numpy
import numpy as np
# create an array of sine values
a = np.array([0, np.pi/2, np.pi])
print ("Sine values of array elements:", np.sin(a))
# exponential values
a = np.array([0, 1, 2, 3])
print ("Exponent of array elements:", np.exp(a))
# square root of array values
print ("Square root of array elements:", np.sqrt(a))
Output:
Sine values of array elements: [ 0.00000000e+00 1.00000000e+00
1.22464680e-16]
Exponent of array elements: [ 1. 2.71828183 7.3890561
20.08553692]
Square root of array elements: [ 0. 1. 1.41421356
1.73205081]
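The note about operators and functions can be checked with a short sketch. The arrays a and b below are the ones used in the binary operators example earlier in this unit.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[4, 3], [2, 1]])

# each operator gives the same result as the corresponding function
print(np.array_equal(np.add(a, b), a + b))        # True
print(np.array_equal(np.subtract(a, b), a - b))   # True
print(np.array_equal(np.multiply(a, b), a * b))   # True
print(np.sum(a), a.sum())                         # 10 10
```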
3.6 Sorting array:

NumPy provides a simple np.sort function for sorting arrays. Let’s
explore it a bit. Sorting means putting elements in an ordered
sequence. An ordered sequence is any sequence that has an order
corresponding to its elements, like numeric or alphabetical, ascending or
descending.
The np.sort() function returns a sorted copy of the specified array. The
ndarray object also has a sort() method, which sorts the array in place.
Sort the array:
import numpy as np
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))
Note: np.sort() returns a copy of the array, leaving the original array
unchanged.
You can also sort arrays of strings, or any other data type:
Example
Sort the array alphabetically:
import numpy as np
arr = np.array(['banana', 'cherry', 'apple'])
print(np.sort(arr))
Example
Sort a boolean array:
import numpy as np
arr = np.array([True, False, True])
print(np.sort(arr))
Sorting a 2-D Array
If you use np.sort() on a 2-D array, each row is sorted separately (the sort runs along the last axis by default):
Sort a 2-D array:
import numpy as np
arr = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(arr))
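The direction of a 2-D sort is controlled by the axis argument, and the in-place sort() method behaves differently from np.sort(). A minimal sketch, with the names by_row and by_col chosen for illustration:

```python
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

by_row = np.sort(arr)            # default axis=-1: each row is sorted
by_col = np.sort(arr, axis=0)    # axis=0: each column is sorted
print(by_row)                    # rows become [2 3 4] and [0 1 5]
print(by_col)                    # rows become [3 0 1] and [5 2 4]
print(arr)                       # np.sort left the original unchanged

arr.sort()                       # ndarray.sort() works in place and returns None
print(arr)                       # now equal to by_row
```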
Exercise: use the correct NumPy function to return a sorted array.
arr = np.array([3, 2, 0, 1])
x = np.sort(arr)
Data Science using Python – Unit IV
UNIT – IV
4.1 Introduction to pandas Data Structures:
Pandas deals with the following three data structures −
 Series
 DataFrame
 Panel
These data structures are built on top of NumPy arrays, which means they
are fast.
Dimension & Description

The best way to think of these data structures is that the higher
dimensional data structure is a container of its lower dimensional data
structure. For example, DataFrame is a container of Series, Panel is a
container of DataFrame.
Data Structure   Dimensions   Description
Series           1            1D labeled homogeneous array, size-immutable.
DataFrame        2            General 2D labeled, size-mutable tabular structure
                              with potentially heterogeneously typed columns.
Panel            3            General 3D labeled, size-mutable array.
Building and handling arrays of two or more dimensions is a tedious task:
the burden is placed on the user to consider the orientation of the data set
when writing functions. Using Pandas data structures reduces this mental
effort.
For example, with tabular data (DataFrame) it is more semantically
helpful to think of the index (the rows) and the columns rather than axis
0 and axis 1.
Mutability
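All Pandas data structures are value mutable (their contents can be changed)
and, except Series, all are size mutable. Series is size immutable.
Note − DataFrame is widely used and one of the most important data
structures. Panel is used much less; it was deprecated in pandas 0.20 and
removed in pandas 0.25, so current versions provide only Series and DataFrame.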
Series
Series is a one-dimensional array like structure with homogeneous
data. For example, the following series is a collection of integers 10, 23,
56, …
10 23 56 17 52 61 73 90 26 72
Key Points
 Homogeneous data
 Size Immutable
 Values of Data Mutable
DataFrame
DataFrame is a two-dimensional array with heterogeneous data. For
example,
Name Age Gender Rating
Steve 32 Male 3.45
Lia 28 Female 4.6
Vin 45 Male 3.9
Katie 38 Female 2.78
The table represents the data of a sales team of an organization with

their overall performance rating. The data is represented in rows and
columns. Each column represents an attribute and each row represents
a person.
Data Type of Columns
The data types of the four columns are as follows −
Column Type
Name String
Age Integer
Gender String
Rating Float
Key Points
 Heterogeneous data
 Size Mutable
 Data Mutable
Panel
Panel is a three-dimensional data structure with heterogeneous
data. It is hard to represent the panel in graphical representation. But a
panel can be illustrated as a container of DataFrame.
Key Points
 Heterogeneous data
 Size Mutable
 Data Mutable
4.2 Series
Series is a one-dimensional labeled array capable of holding data
of any type (integer, string, float, python objects, etc.). The axis labels
are collectively called index.
pandas.Series
A pandas Series can be created using the following constructor −
pandas.Series( data, index, dtype, copy)
The parameters of the constructor are as follows −
Sr.No Parameter & Description
1 data
data takes various forms like ndarray, list, constants
2 index
Index values must be unique and hashable, same length as
data. Default np.arange(n) if no index is passed.
3 dtype
dtype is for data type. If None, data type will be inferred
4 copy
Copy data. Default False
A series can be created using various inputs like −

 Array
 Dict
 Scalar value or constant
Create an Empty Series

A basic series, which can be created is an Empty Series.
Example
#import the pandas library and aliasing as pd
import pandas as pd
# explicit dtype, since recent pandas versions default an empty Series to object
s = pd.Series(dtype='float64')
print(s)
Its output is as follows −
Series([], dtype: float64)
Create a Series from ndarray

If data is an ndarray, then the index passed must be of the same length. If
no index is passed, then by default the index will be range(n) where n is
the array length, i.e., [0, 1, 2, ..., len(array)-1].
Example 1
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data)
print(s)
0 a
1 b
2 c
3 d
dtype: object
We did not pass any index, so by default, it assigned the indexes
ranging from 0 to len(data)-1, i.e., 0 to 3.
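The list of inputs above also mentions an explicit index, a dict and a scalar, which the example does not cover. A minimal sketch of those three cases follows; the index values and the names s1, s2 and s3 are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# ndarray with an explicit index (same length as the data)
s1 = pd.Series(np.array(['a', 'b', 'c', 'd']), index=[100, 101, 102, 103])

# dict: the keys become the index labels
s2 = pd.Series({'a': 0.0, 'b': 1.0, 'c': 2.0})

# scalar: the value is repeated for every index label
s3 = pd.Series(5, index=[0, 1, 2, 3])

print(s1)
print(s2)
print(s3)
```

For a scalar, an index has to be supplied so that pandas knows how many times to repeat the value.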
4.3 DataFrame
A Data frame is a two-dimensional data structure, i.e., data is
aligned in a tabular fashion in rows and columns.
Features of DataFrame
 Potentially columns are of different types
 Size – Mutable
 Labeled axes (rows and columns)
 Can Perform Arithmetic operations on rows and columns
Structure
Let us assume that we are creating a data frame with student’s data.
You can think of it as an SQL table or a spreadsheet data

representation.
pandas.DataFrame
A pandas DataFrame can be created using the following constructor −
pandas.DataFrame( data, index, columns, dtype, copy)

Sr.No Parameter & Description
1 data
data takes various forms like ndarray, series, map, lists, dict,
constants and also another DataFrame.
2 index
Row labels to use for the resulting frame. Optional; defaults to
np.arange(n) if no index is passed.
3 columns
Column labels to use for the resulting frame. Optional; defaults to
np.arange(n) if no column labels are passed.
4 dtype
Data type of each column.
5 copy
Whether to copy the input data. Default False.
Create DataFrame
A pandas DataFrame can be created using various inputs like −
 Lists
 dict
 Series
 Numpy ndarrays
 Another DataFrame
In the subsequent sections of this chapter, we will see how to create a
DataFrame using these inputs.
Create an Empty DataFrame

A basic DataFrame, which can be created is an Empty Dataframe.
Example
import pandas as pd
df = pd.DataFrame()
print(df)
Empty DataFrame
Columns: []
Index: []
Create a DataFrame from Lists

The DataFrame can be created using a single list or a list of lists.
Example 1
import pandas as pd
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)
0
0 1
1 2
2 3
3 4
4 5
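The text mentions a list of lists, and the earlier list of inputs mentions a dict, but only the single-list case is shown. A minimal sketch of the other two follows; the names, ages and row labels are made-up sample data.

```python
import pandas as pd

# list of lists: each inner list is one row
rows = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df1 = pd.DataFrame(rows, columns=['Name', 'Age'])

# dict of lists: each key is one column; index supplies the row labels
df2 = pd.DataFrame({'Name': ['Tom', 'Jack'], 'Age': [28, 34]},
                   index=['rank1', 'rank2'])

print(df1)
print(df2)
```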
4.4 PANEL
A panel is a 3D container of data. The term Panel data is derived
from econometrics and is partially responsible for the name pandas
− pan(el)-da(ta)-s.
The names for the 3 axes are intended to give some semantic meaning
to describing operations involving panel data. They are −
 items − axis 0, each item corresponds to a DataFrame contained

inside.
 major_axis − axis 1, it is the index (rows) of each of the
DataFrames.
 minor_axis − axis 2, it is the columns of each of the DataFrames.
pandas.Panel()
A Panel can be created using the following constructor −
pandas.Panel(data, items, major_axis, minor_axis, dtype, copy)
Parameter    Description
data         Data takes various forms like ndarray, series, map, lists,
             dict, constants and also another DataFrame
items        axis=0
major_axis   axis=1
minor_axis   axis=2
dtype        Data type of each column
copy         Copy data. Default False
Create Panel
A Panel can be created using multiple ways like −
 From ndarrays
 From dict of DataFrames
From 3D ndarray
# creating a panel from a 3D ndarray
# (pd.Panel exists only in pandas versions before 0.25)
import pandas as pd
import numpy as np
data = np.random.rand(2,4,5)
p = pd.Panel(data)
print(p)
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 4 (major_axis) x 5 (minor_axis)
Items axis: 0 to 1
Major_axis axis: 0 to 3
Minor_axis axis: 0 to 4
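On pandas versions that no longer include Panel, the same 3D data can be held as a dict of DataFrames and stacked with pd.concat. This is only one possible substitute, sketched here under the assumption that each item maps to one DataFrame; the names frames and stacked and the level names 'item' and 'major' are illustrative.

```python
import numpy as np
import pandas as pd

data = np.random.rand(2, 4, 5)   # 2 items x 4 rows x 5 columns

# one DataFrame per item, held in a dict keyed by item label
frames = {item: pd.DataFrame(data[item]) for item in range(data.shape[0])}

# stack them; the dict keys become the outer level of a two-level row index
stacked = pd.concat(frames, names=['item', 'major'])
print(stacked.shape)             # (8, 5)
print(stacked.loc[1])            # the DataFrame stored under item 1
```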
4.5 Indexing Selection

The Python and NumPy indexing operators "[ ]" and attribute
operator "." provide quick and easy access to Pandas data structures
across a wide range of use cases. However, since the type of the data to
be accessed isn’t known in advance, directly using standard operators
has some optimization limits. For production code, we recommend that
you take advantage of the optimized pandas data access methods
explained in this chapter.
Pandas now supports three types of Multi-axes indexing; the three types
are mentioned in the following table −
Sr.No Indexing & Description
1 .loc()
Label based
2 .iloc()
Integer based
3 .ix()
Both Label and Integer based (removed in pandas 1.0)
.loc()
Pandas provides various methods for purely label-based indexing.
When slicing with labels, both the start and the end bound are included.
Integers are valid labels, but they refer to the label and not the position.
.loc() has multiple access methods like −
 A single scalar label
 A list of labels
 A slice object
 A Boolean array
loc takes two indexers (a single label, a list, or a range) separated by ','.
The first one selects the rows and the second one selects the columns.
Example 1
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(8, 4),
                  index = ['a','b','c','d','e','f','g','h'], columns = ['A', 'B', 'C', 'D'])
#select all rows for a specific column
print(df.loc[:,'A'])
a 0.391548
b -0.070649
c -0.317212
d -2.162406
e 2.202797
f 0.613709
g 1.050559
h 1.122680
Name: A, dtype: float64
Example 2
# import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
# Select all rows for multiple columns, say list[]
print(df.loc[:,['A','C']])
A C
a 0.391548 0.745623
b -0.070649 1.620406
c -0.317212 1.448365
d -2.162406 -0.873557
e 2.202797 0.528067
f 0.613709 0.286414
g 1.050559 0.216526
h 1.122680 -1.621420
Example 3
import pandas as pd
import numpy as np
# Select few rows for multiple columns, say list[]
print(df.loc[['a','b','f','h'],['A','C']])
A C
a 0.391548 0.745623
b -0.070649 1.620406
f 0.613709 0.286414
h 1.122680 -1.621420
Example 4
import pandas as pd
import numpy as np
# Select range of rows for all columns
print(df.loc['a':'h'])
A B C D
a 0.391548 -0.224297 0.745623 0.054301
b -0.070649 -0.880130 1.620406 1.419743
c -0.317212 -1.929698 1.448365 0.616899
d -2.162406 0.614256 -0.873557 1.093958
e 2.202797 -2.315915 0.528067 0.612482
f 0.613709 -0.157674 0.286414 -0.500517
g 1.050559 -2.272099 0.216526 0.928449
h 1.122680 0.324368 -1.621420 -0.741470
Example 5
import pandas as pd
import numpy as np
# for getting values with a boolean array
print(df.loc['a'] > 0)
A True
B False
C True
D True
Name: a, dtype: bool
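The list of access methods for .loc also includes a Boolean array used to pick rows. A minimal sketch with fixed values, so that the result is reproducible, is given below; the data and the name mask are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1.5, -0.2, 0.7, -1.1],
                   'B': [10, 20, 30, 40]},
                  index=['a', 'b', 'c', 'd'])

mask = df['A'] > 0            # boolean Series aligned with the row index
print(df.loc[mask])           # rows 'a' and 'c'
print(df.loc[mask, 'B'])      # column B for those rows
print(df.loc['b':'c'])        # label slice: both 'b' and 'c' are included
```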
.iloc()
Pandas provides various methods for purely integer-based indexing.
As in Python and NumPy, positions are 0-based and the end of a slice is excluded.
The various access methods are as follows −
 An Integer
 A list of integers
 A range of values
Example 1
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(8, 4), columns = ['A', 'B', 'C', 'D'])
# select the first four rows
print(df.iloc[:4])
A B C D
0 0.699435 0.256239 -1.270702 -0.645195
1 -0.685354 0.890791 -0.813012 0.631615
2 -0.783192 -0.531378 0.025070 0.230806
3 0.539042 -1.284314 0.826977 -0.026251
Example 2
import pandas as pd
import numpy as np
# Integer slicing
print(df.iloc[:4])
print(df.iloc[1:5, 2:4])
A B C D
0 0.699435 0.256239 -1.270702 -0.645195
1 -0.685354 0.890791 -0.813012 0.631615
2 -0.783192 -0.531378 0.025070 0.230806
3 0.539042 -1.284314 0.826977 -0.026251
C D
1 -0.813012 0.631615
2 0.025070 0.230806
3 0.826977 -0.026251
4 1.423332 1.130568
Example 3
import pandas as pd
import numpy as np
# Slicing through list of values
print(df.iloc[[1, 3, 5], [1, 3]])
print(df.iloc[1:3, :])
print(df.iloc[:,1:3])
B D
1 0.890791 0.631615
3 -1.284314 -0.026251
5 -0.512888 -0.518930
A B C D
1 -0.685354 0.890791 -0.813012 0.631615
2 -0.783192 -0.531378 0.025070 0.230806
B C
0 0.256239 -1.270702
1 0.890791 -0.813012
2 -0.531378 0.025070
3 -1.284314 0.826977
4 -0.460729 1.423332
5 -0.512888 0.581409
6 -1.204853 0.098060
7 -0.947857 0.641358
.ix()
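Besides pure label based and integer based, Pandas provided a hybrid
method for selections and subsetting the object using the .ix() operator.
It was deprecated in pandas 0.20 and removed in pandas 1.0, so the
examples below run only on older versions.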
Example 1
import pandas as pd
import numpy as np
# Integer slicing (on an integer index, .ix treats 4 as a label, so row 4 is included)
print(df.ix[:4])
A B C D
0 0.699435 0.256239 -1.270702 -0.645195
1 -0.685354 0.890791 -0.813012 0.631615
2 -0.783192 -0.531378 0.025070 0.230806
3 0.539042 -1.284314 0.826977 -0.026251
4 -1.044209 -0.460729 1.423332 1.130568
Example 2
import pandas as pd
import numpy as np
# Index slicing
print(df.ix[:,'A'])
0 0.699435
1 -0.685354
2 -0.783192
3 0.539042
4 -1.044209
5 -1.415411
6 1.062095
7 0.994204
Name: A, dtype: float64
Use of Notations
Getting values from the Pandas object with Multi-axes indexing uses the
following notation −
Object      Indexers                                      Return Type
Series      s.loc[indexer]                                Scalar value
DataFrame   df.loc[row_index, col_index]                  Series object
Panel       p.loc[item_index, major_index, minor_index]   p.loc[item_index, major_index, minor_index]
Note − .iloc() and .ix() accept the same indexing options and return the
same types of value.
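On pandas versions without .ix(), the same selections can be written with .loc and .iloc. The sketch below uses fixed data so the results are reproducible; the names by_label, by_position and mixed are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(32).reshape(8, 4), columns=['A', 'B', 'C', 'D'])

by_label = df.loc[:4]         # label slice: rows labelled 0 to 4, end included
by_position = df.iloc[:4]     # position slice: rows 0 to 3, end excluded
print(len(by_label), len(by_position))   # 5 4

# choosing rows by position and a column by label in one selection
mixed = df.loc[df.index[1:3], 'A']
print(mixed.tolist())         # [4, 8]
```

The row count differs because a label slice includes its end bound while a position slice does not.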
4.6 Filtration
Filtration filters the data on a defined criterion and returns the
subset of data. The filter() function is used to filter the data.
import pandas as pd
import numpy as np
ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
   'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
   'Rank': [1, 2, 2, 3, 3,4 ,1 ,1,2 , 4,1,2],
   'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
   'Points':[876,789,863,673,741,812,756,788,694,701,804,690]}
df = pd.DataFrame(ipl_data)
print(df.groupby('Team').filter(lambda x: len(x) >= 3))
Points Rank Team Year

0 876 1 Riders 2014
1 789 2 Riders 2015
4 741 3 Kings 2014
6 756 1 Kings 2016
7 788 1 Kings 2017
8 694 2 Riders 2016
11 690 2 Riders 2017
In the above filter condition, we are asking to return the teams which
have participated three or more times in IPL.
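The function passed to filter() receives one whole group at a time and must return True or False for that group. A minimal sketch with a different criterion, using a small made-up table, is shown below together with ordinary row-level filtering for comparison.

```python
import pandas as pd

df = pd.DataFrame({'Team': ['Riders', 'Riders', 'Devils', 'Kings', 'Kings', 'Kings'],
                   'Points': [876, 789, 863, 741, 756, 788]})

# group-level: keep every row of the teams whose mean Points exceeds 800
high = df.groupby('Team').filter(lambda g: g['Points'].mean() > 800)
print(high)

# row-level: keep individual rows that satisfy a condition
strong_rows = df[df['Points'] > 780]
print(strong_rows)
```

Kings is dropped from the first result as a whole group, even though one of its rows passes the row-level condition in the second.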
4.7 Mapping
The map() function returns a map object (which is an iterator) of the results after
applying the given function to each item of a given iterable (list, tuple etc.)
Syntax :
map(fun, iter)
Parameters :
fun : The function to which map passes each element of the given iterable.
iter : The iterable which is to be mapped.
NOTE : You can pass one or more iterables to the map() function.
Returns :
Returns a map object (an iterator) that yields the results of applying the
given function to each item of the given iterable (list, tuple etc.)
NOTE : The value returned from map() (the map object) can then be passed to
functions like list() (to create a list) or set() (to create a set).
CODE 1
# Python program to demonstrate working
# of map.
# Return double of n
def addition(n):
    return n + n
# We double all numbers using map()
numbers = (1, 2, 3, 4)
result = map(addition, numbers)
print(list(result))
Output :
[2, 4, 6, 8]
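The note about passing more than one iterable, and the fact that a map object is an iterator, can be seen in a short sketch. The lists and the lambda below are illustrative.

```python
numbers1 = [1, 2, 3]
numbers2 = [10, 20, 30]

# with two iterables, the function receives one item from each per call
sums = list(map(lambda x, y: x + y, numbers1, numbers2))
print(sums)            # [11, 22, 33]

# a map object is an iterator, so it can be consumed only once
result = map(str.upper, ['a', 'b'])
first_pass = list(result)
second_pass = list(result)
print(first_pass)      # ['A', 'B']
print(second_pass)     # []
```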
4.8 Sorting
The easiest way to sort is with the sorted(list) function, which takes a list
and returns a new list with those elements in sorted order. The original
list is not changed.
a = [5, 1, 4, 3]
print(sorted(a)) ## [1, 3, 4, 5]
print(a) ## [5, 1, 4, 3]
It's most common to pass a list into the sorted() function, but in fact it
can take as input any sort of iterable collection. The older list.sort()
method is an alternative detailed below. The sorted() function seems
easier to use compared to sort(), so I recommend using sorted().
The sorted() function can be customized through optional arguments.

The sorted() optional argument reverse=True, e.g. sorted(list,
reverse=True), makes it sort backwards.
strs = ['aa', 'BB', 'zz', 'CC']

print(sorted(strs)) ## ['BB', 'CC', 'aa', 'zz'] (case sensitive)
print(sorted(strs, reverse=True)) ## ['zz', 'aa', 'CC', 'BB']
Custom Sorting With key=
For more complex custom sorting, sorted() takes an optional "key="

specifying a "key" function that transforms each element before
comparison. The key function takes in 1 value and returns 1 value, and
the returned "proxy" value is used for the comparisons within the sort.
For example with a list of strings, specifying key=len (the built in len()
function) sorts the strings by length, from shortest to longest. The sort
calls len() for each string to get the list of proxy length values, and then
sorts with those proxy values.
strs = ['ccc', 'aaaa', 'd', 'bb']

print(sorted(strs, key=len)) ## ['d', 'bb', 'ccc', 'aaaa']
As another example, specifying "str.lower" as the key function is a way

to force the sorting to treat uppercase and lowercase the same:
## "key" argument specifying str.lower function to use for sorting
strs = ['aa', 'BB', 'zz', 'CC']
print(sorted(strs, key=str.lower)) ## ['aa', 'BB', 'CC', 'zz']
You can also pass in your own MyFn as the key function, like this:
## Say we have a list of strings we want to sort by the last letter of the string.
strs = ['xc', 'zb', 'yd' ,'wa']
## Write a little function that takes a string, and returns its last letter.
## This will be the key function (takes in 1 value, returns 1 value).
def MyFn(s):
    return s[-1]
## Now pass key=MyFn to sorted() to sort by the last letter:
print(sorted(strs, key=MyFn)) ## ['wa', 'zb', 'xc', 'yd']
For more complex sorting like sorting by last name then by first name,
you can use the itemgetter or attrgetter functions like:
from operator import itemgetter
# (first name, last name, score) tuples
grade = [('Freddy', 'Frank', 3), ('Anil', 'Frank', 100), ('Anil', 'Wang', 24)]
sorted(grade, key=itemgetter(1,0))
# [('Anil', 'Frank', 100), ('Freddy', 'Frank', 3), ('Anil', 'Wang', 24)]
# -1 is the last tuple element (the score), so ties on first name are broken by score
sorted(grade, key=itemgetter(0,-1))
#[('Anil', 'Wang', 24), ('Anil', 'Frank', 100), ('Freddy', 'Frank', 3)]
sort() method
As an alternative to sorted(), the sort() method on a list sorts that list into
ascending order, e.g. list.sort(). The sort() method changes the
underlying list and returns None, so use it like this:
alist.sort() ## correct
alist = blist.sort() ## Incorrect. sort() returns None
The above is a very common misunderstanding with sort() -- it
*does not return* the sorted list. The sort() method must be called on a
list; it does not work on any iterable collection (but the sorted()
function above works on anything). The sort() method predates the
sorted() function, so you will likely see it in older code. The sort() method
does not need to create a new list, so it can be a little faster in the case
that the elements to sort are already in a list.
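The difference between sorted() and sort() can be checked in a few lines. This is a minimal sketch; the names blist, new_list and returned are illustrative.

```python
blist = [5, 1, 4, 3]

new_list = sorted(blist)     # returns a new list and leaves blist untouched
print(new_list, blist)       # [1, 3, 4, 5] [5, 1, 4, 3]

returned = blist.sort()      # sorts blist in place
print(returned, blist)       # None [1, 3, 4, 5]

# sorted() accepts any iterable, for example a tuple or a string
print(sorted((3, 1, 2)))     # [1, 2, 3]
print(sorted('cab'))         # ['a', 'b', 'c']
```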
4.9. Data Ranking

Data Ranking produces a ranking for each element in the array of
elements. In case of ties, it assigns the mean rank.
import pandas as pd
import numpy as np
s = pd.Series(np.random.randn(5), index=list('abcde'))
s['d'] = s['b'] # so there's a tie
print(s.rank())
a 1.0
b 3.5
c 2.0
d 3.5
e 5.0
dtype: float64
Rank optionally takes a parameter ascending which by default is true;

when false, data is reverse-ranked, with larger values assigned a smaller
rank.
Rank supports different tie-breaking methods, specified with the method

parameter −
 average − average rank of tied group

 min − lowest rank in the group
 max − highest rank in the group
 first − ranks assigned in the order they appear in the array
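The tie-breaking methods and the ascending parameter described in 4.9 can be compared on a small series with fixed values. The data below is made up so that 'b' and 'd' tie.

```python
import pandas as pd

s = pd.Series([10, 30, 20, 30, 50], index=list('abcde'))  # 'b' and 'd' tie

print(s.rank().tolist())                  # [1.0, 3.5, 2.0, 3.5, 5.0]  average (default)
print(s.rank(method='min').tolist())      # [1.0, 3.0, 2.0, 3.0, 5.0]
print(s.rank(method='max').tolist())      # [1.0, 4.0, 2.0, 4.0, 5.0]
print(s.rank(method='first').tolist())    # [1.0, 3.0, 2.0, 4.0, 5.0]
print(s.rank(ascending=False).tolist())   # [5.0, 2.5, 4.0, 2.5, 1.0]
```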
4.10 Reading and Writing Data in Text Format

Like other languages, Python provides some inbuilt functions for
reading, writing, or accessing files. Python can handle mainly two types
of files. The normal text file and the binary files.
In a text file, each line is terminated with a special
character '\n' (known as EOL, or End Of Line). A binary file has
no line-ending character; it stores the data after converting the
content into a bit stream.
In this section we will discuss text files.
File Accessing Modes

Sr.No Modes & Description
1 r
Read-only mode. It opens the text file for reading. If the file is not
present, it raises FileNotFoundError (an I/O error).
2 r+
Reading and writing mode. If the file is not present, it raises
FileNotFoundError.
3 w
Write-only mode. If the file is not present, it creates the file first and then
starts writing; if the file is present, it removes the contents of that file and
starts writing from the beginning.
4 w+
Write and read mode. If the file is not present, it creates the file; if
the file is present, the existing data is overwritten.
5 a
Append mode. It writes data at the end of the file.
6 a+
Append and read mode. It can append data as well as read the data.
Now see how a file can be written using writelines() and write() method.
#Create an empty file and write some lines

line1 = 'This is first line. \n'
lines = ['This is another line to store into file.\n',
'The Third Line for the file.\n',
'Another line... !@#$%^&*()_+.\n',
'End Line']
#open the file as write mode
my_file = open('file_read_write.txt', 'w')
my_file.write(line1)
my_file.writelines(lines) #Write multiple lines
my_file.close()
print('Writing Complete')
Output
Writing Complete
After writing the lines, we are appending some lines into the file.
#program to append some lines

line1 = '\n\nThis is a new line. This line will be appended. \n'
#open the file as append mode
my_file = open('file_read_write.txt', 'a')
my_file.write(line1)
my_file.close()
print('Appending Done')
Output
Appending Done
Finally, we will see how to read the file content with the read() and
readline() methods. We can pass an integer n to read() to get the first n
characters.
#program to read from file

#open the file as read mode
my_file = open('file_read_write.txt', 'r')
print('Show the full content:')
print(my_file.read())
#Show first two lines
my_file.seek(0)
print('First two lines:')
print(my_file.readline(), end = '')
print(my_file.readline(), end = '')
#Show upto 25 characters
my_file.seek(0)
print('\n\nFirst 25 characters:')
print(my_file.read(25), end = '')
my_file.close()
Output
Show the full content:
This is first line.
This is another line to store into file.
The Third Line for the file.
Another line... !@#$%^&*()_+.
End Line
This is a new line. This line will be appended.
First two lines:

This is first line.
This is another line to store into file.
First 25 characters:
This is first line.
This
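The same write, append and read steps can also be written with the with statement, which closes the file automatically even if an error occurs. The sketch below is an alternative style, not the method used in the examples above; the file name demo.txt and the temporary directory are assumptions made so that the example does not touch real files.

```python
import os
import tempfile

# a temporary directory keeps the example from touching real files
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.txt')   # hypothetical file name

    # 'w' creates the file (or empties it); the with block closes it
    with open(path, 'w') as my_file:
        my_file.write('first line\n')
        my_file.writelines(['second line\n', 'third line\n'])

    # 'a' adds to the end without removing what is already there
    with open(path, 'a') as my_file:
        my_file.write('appended line\n')

    # iterating over the file object yields one line at a time
    with open(path, 'r') as my_file:
        lines = [line.rstrip('\n') for line in my_file]

print(lines)
```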
Data Science using Python – Unit V
UNIT - V
5.1 Data Cleaning and Preparation
Data cleaning is one of the important parts of machine learning. It plays
a significant part in building a model. It surely isn’t the fanciest part of machine
learning and at the same time, there aren’t any hidden tricks or secrets to
uncover. However, the success or failure of a project relies on proper data
cleaning. Professional data scientists usually invest a very large portion of their
time in this step because of the belief that “Better data beats fancier
algorithms”.
If we have a well-cleaned dataset, there is a good chance that we can
achieve good results even with simple algorithms, which can prove very
beneficial at times, especially in terms of computation when the dataset is
large.
Obviously, different types of data will require different types of cleaning.

However, this systematic approach can always serve as a good starting point.
Steps involved in Data Cleaning:
 Removal of unwanted observations

This includes deleting duplicate, redundant, or irrelevant values from your
dataset. Duplicate observations most frequently arise during data collection,
and irrelevant observations are those that don’t actually fit the specific
problem that you’re trying to solve.
Redundant observations reduce efficiency to a great extent, because the
repeated data may add weight towards the correct side or towards the incorrect
side, thereby producing unfaithful results.
Irrelevant observations are any type of data that is of no use to us and can be
removed directly.
 Fixing Structural errors

The errors that arise during measurement, transfer of data, or other similar
situations are called structural errors. Structural errors include typos in the
name of features, the same attribute with a different name, mislabeled
classes, i.e. separate classes that should really be the same, or inconsistent
capitalization.
For example, the model will treat America and america as different classes
or values, though they represent the same value, or red, yellow, and red-yellow
as different classes or attributes, though one class can be included in the other
two classes. These are some structural errors that make our model
inefficient and give poor quality results.
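The first two steps can be sketched with pandas. This is a minimal illustration on made-up data, assuming that a repeated row is a true duplicate and that the country names differ only in capitalization; the column names are hypothetical.

```python
import pandas as pd

# hypothetical raw data: one repeated row and inconsistent capitalization
df = pd.DataFrame({'country': ['America', 'america', 'India', 'India'],
                   'sales': [100, 250, 300, 300]})

# removal of unwanted observations: drop the repeated India row
df = df.drop_duplicates()

# fixing a structural error: bring every name to the same capitalization
df['country'] = df['country'].str.strip().str.title()

print(df)
print(df['country'].unique())
```

After the fix, America and america are counted as one class instead of two.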
 Managing Unwanted outliers

Outliers can cause problems with certain types of models. For example, linear
regression models are less robust to outliers than decision tree models.
Generally, we should not remove outliers until we have a legitimate reason to
remove them. Sometimes, removing them improves performance, sometimes
not. So, one must have a good reason to remove the outlier, such as suspicious
measurements that are unlikely to be part of real data.
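Before deciding whether an outlier should go, it first has to be found. One common rule of thumb, which the text does not prescribe and is used here only as an example, flags values lying more than 1.5 times the interquartile range (IQR) outside the quartiles. The readings below are made up.

```python
import pandas as pd

values = pd.Series([10, 12, 11, 13, 12, 95])   # 95 looks suspicious

q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

is_outlier = (values < low) | (values > high)
print(values[is_outlier])    # candidates to inspect, not to delete blindly
```

The flagged values are only candidates; whether to remove them still depends on having a legitimate reason, as described above.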
 Handling missing data

Missing data is a deceptively tricky issue in machine learning. We cannot
just ignore or remove the missing observation. They must be handled carefully
as they can be an indication of something important. The two most common
ways to deal with missing data are:
 Dropping observations with missing values.

The fact that the value was missing may be informative in itself.
Plus, in the real world, you often need to make predictions on new data even if
some of the features are missing!
 Imputing the missing values from past observations.

Again, “missingness” is almost always informative in itself, and you should tell
your algorithm if a value was missing.
Even if you build a model to impute your values, you’re not adding any real
information. You’re just reinforcing the patterns already provided by other
features.
Missing data is like missing a puzzle piece. If you drop it, that’s like pretending
the puzzle slot isn’t there. If you impute it, that’s like trying to squeeze in a
piece from somewhere else in the puzzle.
So, missing data is always informative and an indication of something
important, and we must make our algorithm aware of missing data by flagging
it. By using this technique of flagging and filling, you are essentially allowing
the algorithm to estimate the optimal constant for missingness, instead of just
filling it in with the mean.
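Flagging and filling can be sketched in pandas as follows. The column name age, the indicator name age_missing and the fill value 0 are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# hypothetical feature with two missing readings
df = pd.DataFrame({'age': [25.0, np.nan, 31.0, np.nan, 40.0]})

# flag: 1 where the value was missing, 0 otherwise
df['age_missing'] = df['age'].isna().astype(int)

# fill: replace the missing values with a constant placeholder
df['age'] = df['age'].fillna(0)

print(df)
```

The flag has to be computed before the fill, since after filling there is no longer any way to tell which values were missing.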
5.2 What is Data Transformation?

Data transformation is the process of converting, cleansing,
and structuring data into a usable format that can be analyzed
to support decision making processes, and to propel the growth
of an organization.
Data transformation is used when data needs to be
converted to match the format of the destination system. This
can occur at two places in the data pipeline. First, organizations
with on-site data storage use an extract, transform, load (ETL)
process, with the data transformation taking place during the
middle ‘transform’ step.
Second, organizations today mostly use cloud-based data
warehouses because they can scale their computing and storage
resources in seconds. Cloud-based organizations, with this huge
scalability available, can skip the ETL process. Instead, they
use a process called extract, load, and transform (ELT), in
which the raw data is uploaded first and converted afterwards.
The process of data transformation can be handled manually,
automated, or a combination of both.
Transformation is an essential step in many processes, such as
data integration, migration, warehousing and wrangling. The
process of data transformation can be:
• Constructive, where data is added, copied or replicated
• Destructive, where records and fields are deleted
• Aesthetic, where certain values are standardized, or
• Structural, which includes columns being renamed, moved
and combined
On a basic level, the data transformation process converts raw
data into a usable format by removing duplicates, converting
data types and enriching the dataset. This process involves
defining the structure, mapping the data, extracting the data
from the source system, performing the transformations, and
then storing the transformed data in the appropriate dataset.
The data then becomes accessible, secure and more usable,
allowing for use in a multitude of ways. Organizations perform
data transformation to ensure the compatibility of data with
other types while combining it with other information or
migrating it into a dataset. Through data transformations,
organizations can gain valuable insights into their operational
and informational functions.
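The three basic operations named above (removing duplicates, converting data types and enriching the dataset) can be sketched with pandas as follows. The table, its column names and the derived "total" column are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Hypothetical raw records: one exact duplicate row, numbers stored as text.
raw = pd.DataFrame({
    "order_id": ["1", "2", "2", "3"],
    "quantity": ["2", "1", "1", "5"],
    "unit_price": ["10.0", "4.5", "4.5", "2.0"],
})

# Removing duplicates: keep the first copy of each repeated row.
clean = raw.drop_duplicates().reset_index(drop=True)

# Converting data types: text columns become numeric columns.
clean = clean.astype({"order_id": int, "quantity": int, "unit_price": float})

# Enriching: add a derived column computed from the existing ones.
clean["total"] = clean["quantity"] * clean["unit_price"]

print(clean)
```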
How is Data Transformation Used?

Data transformation works on the simple objective of
extracting data from a source, converting it into a usable
format and then delivering the converted data to the
destination system. The extraction phase involves data being
pulled into a central repository from different sources or
locations, so it is usually in its raw original form, which is
not usable. To ensure the usability of the extracted data, it
must be transformed into the desired format by taking it
through a number of steps. In certain cases, the data also
needs to be cleaned before the transformation takes place. This
step resolves the issues of missing values and inconsistencies
that exist in the dataset. The data transformation process is
carried out in five stages.
1. Discovery
The first step is to identify and understand the data in its
original source format with the help of data profiling tools,
finding all the sources and data types that need to be
transformed. This step helps in understanding how the data
needs to be transformed to fit into the desired format.
2. Mapping
The transformation is planned during the data mapping
phase. This includes determining the current structure and the
transformation that is required, then mapping the data to
understand, at a basic level, the way individual fields would
be modified, joined or aggregated.
3. Code Generation
The code, which is required to run the transformation
process, is created in this step using a data transformation
platform or tool.
4. Execution
The data is finally converted into the selected format with
the help of the code. The data is extracted from the source(s),
which can vary from structured to streaming, telemetry to log
files. Next, transformations are carried out on the data, such
as aggregation, format conversion or merging, as planned in the
mapping stage. The transformed data is then sent to the
destination system, which could be a dataset or a data
warehouse. Some of the transformation types, depending on
the data involved, include:
• Filtering, which helps in selecting certain columns that
require transformation
• Enriching, which fills out the basic gaps in the data set
• Splitting, where a single column is split into multiple
columns, or vice versa
• Removal of duplicate data, and
• Joining data from different sources
5. Review
The transformed data is evaluated to ensure the conversion
has had the desired results in terms of the format of the data.
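As a rough illustration of the execution and review stages, the transformation types listed under Execution can be chained in pandas. The two source tables, their column names and the "Unknown" fill value are assumptions made up for this sketch.

```python
import pandas as pd

# Hypothetical source tables, made up for illustration.
customers = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "full_name": ["Asha Rao", "Ben Lee", "Ben Lee", "Chen Wu"],
    "city": ["Pune", None, None, "Delhi"],
    "notes": ["a", "b", "b", "c"],
})
orders = pd.DataFrame({"customer_id": [1, 3], "amount": [250, 120]})

# Filtering: keep only the columns that need transformation.
df = customers[["customer_id", "full_name", "city"]].copy()

# Removal of duplicate data.
df = df.drop_duplicates().reset_index(drop=True)

# Enriching: fill the basic gaps in the data set.
df["city"] = df["city"].fillna("Unknown")

# Splitting: one column becomes two.
df[["first_name", "last_name"]] = df["full_name"].str.split(" ", n=1, expand=True)

# Joining data from different sources (left join keeps every customer).
result = df.merge(orders, on="customer_id", how="left")

# Review: check that the output has the expected shape and columns.
print(result)
```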
5.3 String manipulation.

Like many other popular programming languages, Python treats a string as
a sequence of characters; in Python 3 these are Unicode characters. However,
Python does not have a character data type: a single character is simply
a string with a length of 1.
Square brackets can be used to access elements of the string.
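A short sketch of square-bracket access on a plain Python string follows. The sample string and variable names are illustrative only.

```python
s = "Python"

first = s[0]    # positions start at 0
last = s[-1]    # negative positions count from the end
part = s[1:4]   # slice: start included, end excluded

# A single character is itself a string of length 1.
print(first, last, part, type(first), len(first))
```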
The following packages must be imported for the examples in this section:
import pandas as pd
import altair as alt
import numpy as np
String basics
You can create strings with either single quotes or double
quotes. You can read the pandas user guide on working with text
data for more details.
string1 = "This is a string"
string2 = 'If I want to include a "quote" inside a string, I use single quotes'
To include a literal single or double quote in a string you can use \ to
“escape” it:
double_quote = "\"" # or '"'
single_quote = '\'' # or "'"
That means if you want to include a literal backslash, you’ll need to
double it up: "\\".
Beware that the representation echoed at the prompt is not the same as
the string itself, because it shows the escapes; use print() to see the
raw contents:
x = "\" \\"
x
#> '" \\'
print(x)
#> " \
There are a handful of other special characters. The most common
are "\n", newline, and "\t", tab, but you can see the complete list in
the Python reference manual. You’ll also sometimes see strings
like "\u00b5"; this is a way of writing non-ASCII characters that works
on all platforms:
x = "\u00b5"
x
#> 'µ'
Multiple strings are often stored in a Series with the object dtype, which
you can create from a list written with []:
pd.Series(["one", "two", "three"])
#> 0 one
#> 1 two
#> 2 three
#> dtype: object
String length
Python contains many functions to work with strings. We’ll use the
functions from pandas for use on Series. These are all accessed through
the .str accessor. For example, str.len() tells you the number of
characters in each string:
pd.Series(["a", "R for data science", np.nan]).str.len()
#> 0 1.0
#> 1 18.0
#> 2 NaN
#> dtype: float64
Combining strings
To combine the strings of a Series into one string, use str.cat():
pd.Series(["x", "y"]).str.cat()
#> 'xy'
pd.Series(["x", "y", "z"]).str.cat()
#> 'xyz'
Use the sep argument to control how they’re separated:
pd.Series(["x", "y"]).str.cat(sep = '_')
#> 'x_y'
When a single Series is concatenated this way, missing values are skipped
by default. If you want them to appear as "NA", use fillna() or
na_rep = 'NA':
x = pd.Series(["abc", np.nan])
x.str.cat()
#> 'abc'
x.str.cat(na_rep = "NA")
#> 'abcNA'
x.fillna('NA').str.cat()
#> 'abcNA'
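str.cat() can also join two Series element by element when a second Series is passed. In that case a missing value on either side makes the whole result for that row missing, unless na_rep is given. The two Series below are illustrative values, not taken from the notes.

```python
import numpy as np
import pandas as pd

left = pd.Series(["a", "b", np.nan])
right = pd.Series(["x", np.nan, "z"])

# Element-wise join: a row is missing if either input is missing.
joined = left.str.cat(right, sep="-")

# With na_rep, missing values are replaced before joining.
joined_na = left.str.cat(right, sep="-", na_rep="NA")

print(joined)
print(joined_na)
```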
Subsetting strings
You can extract parts of a string using str[]. It takes a start:end
slice, where the start position is included and the end position is
excluded (positions count from 0):
x = pd.Series(["Apple", "Banana", "Pear"])
x.str[0:3]
#> 0 App
#> 1 Ban
#> 2 Pea
#> dtype: object
# negative numbers count backwards from end
x.str[-3:]
#> 0 ple
#> 1 ana
#> 2 ear
#> dtype: object
Note that str[] won’t fail if the string is too short: it will just return as
much as possible:
pd.Series(["a"]).str[0:5]
#> 0 a
#> dtype: object
You can also use str.slice_replace() to modify strings by replacing a
slice with another string:
x.str.slice_replace(0,0, repl = "5")
#> 0 5Apple
#> 1 5Banana
#> 2 5Pear
#> dtype: object
5.4 Vectorized String Operations

One strength of Python is its relative ease in handling and
manipulating string data. Pandas builds on this and provides a
comprehensive set of vectorized string operations that are an important
part of the type of munging required when working with (read: cleaning
up) real-world data. In this chapter, we'll walk through some of the
Pandas string operations, and then take a look at using them to partially
clean up a very messy dataset of recipes collected from the internet.
Introducing Pandas String Operations
We saw in previous chapters how tools like NumPy and Pandas
generalize arithmetic operations so that we can easily and quickly
perform the same operation on many array elements. For example:
In [1]: import numpy as np
        x = np.array([2, 3, 5, 7, 11, 13])
        x * 2
Out[1]: array([ 4, 6, 10, 14, 22, 26])
This vectorization of operations simplifies the syntax of operating on
arrays of data: we no longer have to worry about the size or shape of the
array, but just about what operation we want done. For arrays of strings,
NumPy does not provide such simple access, and thus you're stuck
using a more verbose loop syntax:
In [2]: data = ['peter', 'Paul', 'MARY', 'gUIDO']
        [s.capitalize() for s in data]
Out[2]: ['Peter', 'Paul', 'Mary', 'Guido']
This is perhaps sufficient to work with some data, but it will break if there
are any missing values, so this approach requires putting in extra
checks:
In [3]: data = ['peter', 'Paul', None, 'MARY' ...
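As a hedged sketch of how the pandas string methods deal with this problem, the example below uses its own illustrative list (an assumption, not a reconstruction of the cell above). It compares a loop with an explicit check against the vectorized .str accessor, which passes missing values through.

```python
import pandas as pd

# Illustrative data (an assumption); None marks a missing value.
data = ["alice", "BOB", None, "cHARLIE"]

# Plain Python needs an explicit check; calling capitalize() on None
# would raise AttributeError.
checked = [s.capitalize() if s is not None else None for s in data]

# The pandas .str accessor applies the method element-wise and leaves
# missing values missing.
names = pd.Series(data)
capitalized = names.str.capitalize()

print(checked)
print(capitalized)
```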
5.5 Plot with Pandas

Python’s popular data analysis library, pandas, provides several
different options for visualizing your data with .plot(). Even if you’re at the
beginning of your pandas journey, you’ll soon be creating basic plots that
will yield valuable insights into your data.
Basic Plotting: plot

This functionality on Series and DataFrame is just a simple wrapper
around the matplotlib library's plot() method.
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10,4),index=pd.date_range('1/1/2000',
periods=10), columns=list('ABCD'))
df.plot()
The result is a line plot with one line per column. If the index
consists of dates, it calls gcf().autofmt_xdate() to format the x-axis.
We can plot one column versus another using the x and y keywords.
Plotting methods allow a handful of plot styles other than the default
line plot. These methods can be provided as the kind keyword argument
to plot(). These include:
• 'bar' or 'barh' for bar plots
• 'hist' for histogram
• 'box' for boxplot
• 'area' for area plots
• 'scatter' for scatter plots
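The x and y keywords and the kind argument described above can be tried with a small sketch. The random data and column names are illustrative, and the Agg backend is an assumption made so that the script runs without a display; in a notebook that line can be dropped.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import numpy as np
import pandas as pd

# Illustrative data: 10 rows of random values in three columns.
df = pd.DataFrame(np.random.rand(10, 3), columns=["a", "b", "c"])

# One column versus another with the x and y keywords (default line plot).
ax_xy = df.plot(x="a", y="b")

# A different plot style selected with the kind keyword.
ax_bar = df.plot(kind="bar")

# Save the bar chart to a file (the file name is hypothetical).
ax_bar.figure.savefig("bar_plot.png")
```

Calling df.plot(kind="bar") is equivalent to the df.plot.bar() form used in the sections that follow.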
Bar Plot
Let us now see what a Bar Plot is by creating one. A bar plot can be
created in the following way −
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(10,4),columns=['a','b','c','d'])
df.plot.bar()
To produce a stacked bar plot, pass stacked=True −
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(10,4),columns=['a','b','c','d'])
df.plot.bar(stacked=True)
To get horizontal bar plots, use the barh method −
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(10,4),columns=['a','b','c','d'])
df.plot.barh(stacked=True)
Histograms
Histograms can be plotted using the plot.hist() method. We can specify
the number of bins with the bins argument.
import pandas as pd
import numpy as np
df = pd.DataFrame({'a': np.random.randn(1000) + 1,
                   'b': np.random.randn(1000),
                   'c': np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])
df.plot.hist(bins=20)
To plot different histograms for each column, use the following code −
import pandas as pd
import numpy as np
df = pd.DataFrame({'a': np.random.randn(1000) + 1,
                   'b': np.random.randn(1000),
                   'c': np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])
# diff() is a method call; hist() then draws one subplot per column
df.diff().hist(bins=20)
Box Plots
A boxplot can be drawn by calling Series.plot.box() and
DataFrame.plot.box(), or DataFrame.boxplot(), to visualize the
distribution of values within each column. For instance, here is a boxplot
representing five trials of 10 observations of a uniform random variable
on [0,1).
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
df.plot.box()
Area Plot
An area plot can be created using the Series.plot.area() or
the DataFrame.plot.area() method.
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
df.plot.area()
Scatter Plot
A scatter plot can be created using
the DataFrame.plot.scatter() method.
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(50, 4), columns=['a', 'b', 'c', 'd'])
df.plot.scatter(x='a', y='b')
Pie Chart
A pie chart can be created using the DataFrame.plot.pie() method.
import pandas as pd
import numpy as np
df = pd.DataFrame(3 * np.random.rand(4), index=['a', 'b', 'c', 'd'], columns=['x'])
df.plot.pie(subplots=True)