INFORMATICS PRACTICES
(065)
Unit-1
Data Visualization &
Data Handling using Pandas
Project File
(Session: 2023-24)
For
Class 12th
DATA VISUALIZING
Python Library
Matplotlib
It is a comprehensive library for creating static, animated and interactive
visualizations in Python.
We can customize and control line styles, font properties and
axes properties, as well as export to a number of file formats and
interactive environments.
DATA VISUALIZATION
The main goal of data visualization is to distill large data sets into visual
graphics that allow an easy understanding of complex relationships within the
data.
DRAWING
Plots can be drawn from the data passed to specific functions.
CUSTOMIZATION
Plots can be customized as per the requirement by specifying options
such as:
Color, Style, Width, Label, Title and Legend
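The options listed above can be sketched in a few lines. This is only a minimal illustration: the data values are made up, and the non-interactive "Agg" backend is an assumption so that the code also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt

# Made-up sample data, used only for illustration
days = [1, 2, 3, 4, 5]
temperature = [30, 32, 31, 35, 34]

plt.figure()
# color, linestyle (Style), linewidth (Width) and label customize the line
lines = plt.plot(days, temperature, color='blue', linestyle='--',
                 linewidth=2, label='Temperature')
plt.xlabel('Day')                # Label of the x axis
plt.ylabel('Temperature')        # Label of the y axis
plt.title('Daily Temperature')   # Title of the plot
plt.legend()                     # Legend built from the label given to plot()
# In an interactive session, plt.show() would display the figure here.
```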
SAVING
After drawing and customization, plots can be saved for future use.
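One way to save a plot is the savefig() function of pyplot. The sketch below is a hedged example: the file name sample_plot.png, the temporary folder and the data are assumptions chosen for illustration.

```python
import os
import tempfile
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt

plt.figure()
plt.plot([1, 2, 3], [2, 4, 6])   # made-up data
plt.title('Sample Plot')

# Hypothetical output location: a file in the system's temporary folder
out_path = os.path.join(tempfile.gettempdir(), 'sample_plot.png')
plt.savefig(out_path)            # the file format is taken from the .png extension
plt.close()

print(os.path.exists(out_path))  # True once the file has been written
```

Calling savefig() before plt.show() is the usual order, because the figure may be cleared once its window is closed.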
LINE PLOT
A line plot (line chart) is a graph that shows how data values change along a number
line, for example over time.
The line plot is represented by a series of data points connected with straight lines.
We can customize a line graph by defining the grid, the x and y axis scales and labels, titles and
display options etc.
import matplotlib.pyplot as plt
#pass percentage of two schools over five years
year=(2015,2016,2017,2018,2019)
bpspercentage=(90, 92, 94, 95, 97)
srirampercentage=(89, 90, 88, 85, 93)
#the keyword argument is spelled 'color' (not 'colour')
plt.plot(year, bpspercentage, color='green')
plt.plot(year, srirampercentage, color='red')
plt.xlabel('year')
plt.ylabel('passpercentage')
plt.title('Students Board Exams Pass Percentage')
plt.show()
Output:
BAR GRAPH
A bar graph is drawn using rectangular bars to show how large each value is. The bars can be
horizontal or vertical.
A bar graph makes it easy to compare data between different groups at a glance. It represents
categories on one axis and discrete values on the other. The goal of a bar graph is to show
the relationship between the two axes. A bar graph can also show big changes in data over
time.
import numpy as np
import matplotlib.pyplot as plt
labels=('sandeep', 'chitra', 'mansi', 'mohit', 'ayush', 'shubh')
bpsper=(94,92,96,83,90,76)
index=np.arange(len(labels)) #bar positions 0 to 5
plt.bar(index, bpsper)
plt.xticks(index, labels) #show the student names under the bars
plt.xlabel('student name', fontsize=10)
plt.ylabel('student percentage', fontsize=11)
plt.show()
Output:
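Since the bars can also be horizontal, the same data can be drawn with the barh() function. This is a minimal sketch that reuses the student data from the example above; the "Agg" backend is an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt

labels = ('sandeep', 'chitra', 'mansi', 'mohit', 'ayush', 'shubh')
bpsper = (94, 92, 96, 83, 90, 76)

plt.figure()
# barh() puts the categories on the y axis and the values on the x axis
bars = plt.barh(labels, bpsper)
plt.xlabel('student percentage')
plt.ylabel('student name')
# In an interactive session, plt.show() would display the figure here.
```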
HISTOGRAM
A histogram is a graphical representation that organizes a group of data points into
user-specified ranges.
A histogram produces a visual representation of numerical data, showing the number of data
points that fall within each specified range of values.
A histogram is similar to a vertical bar graph but without gaps between the bars.
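A histogram can be sketched with the hist() function of pyplot. The marks and the ranges (bins) below are made-up values for illustration, and the "Agg" backend is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt

# Made-up marks of 15 students
marks = [45, 52, 58, 61, 64, 67, 70, 72, 75, 78, 81, 85, 88, 92, 95]
# User-specified ranges: 40-50, 50-60, 60-70, 70-80, 80-90, 90-100
bins = [40, 50, 60, 70, 80, 90, 100]

plt.figure()
# hist() counts how many data points fall within each range and draws the bars
counts, edges, patches = plt.hist(marks, bins=bins, edgecolor='black')
plt.xlabel('marks')
plt.ylabel('number of students')
plt.title('Distribution of Marks')
print(counts)  # number of data points in each range
# In an interactive session, plt.show() would display the figure here.
```

For this data the six ranges hold one, two, three, four, three and two students respectively.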
DATA STRUCTURE
There are two important data structures in Python Pandas, as follows:
1. Series
2. Data Frame
SERIES(S)
A Series is a one-dimensional array-like structure with homogeneous data. The axis
labels are collectively known as the index. A Series can store any type of data such
as integers, floats, strings, Python objects, and so on. It can be created using an array,
a dictionary or a constant value.
For Example:
The following series is a collection of integers.
SERIES
7 25 90 -1 136 76 5
1D
FEATURES OF SERIES
• Homogenous Data
• Series Data/Value is Mutable
• Series Size is Immutable
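The features above can be illustrated with a short sketch. It is a hedged example: it reads "size is immutable" in the sense that an operation such as drop() returns a new Series and leaves the original unchanged.

```python
import pandas as pd

# The integer series from the example above
s = pd.Series([7, 25, 90, -1, 136, 76, 5])

print(s.dtype)        # all elements share one data type (homogeneous data)

s[0] = 100            # a value can be changed in place (value is mutable)

shorter = s.drop(0)   # drop() returns a new Series; s keeps all its elements
print(len(s), len(shorter))
```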
For Example:
[Figure: a 2D grid of rows and columns, labelled X or DF]
PANDAS SERIES
Pandas Series is just like a one-dimensional array, capable of holding any type of data
like Integer, Float, String and Python Objects.
Example 2
import pandas as pd1
import numpy as np1
data=np1.array(['a', 'b', 'c', 'd'])
s=pd1.Series(data, index=[100, 110, 120, 130])
print(s)
Output:
100    a
110    b
120    c
130    d
dtype: object
Example 3
Create a Pandas Series from a dict without passing an Index (the dict keys become the index)
import pandas as pd1
import numpy as np1
data={'a':0.0, 'b':1.0, 'c':2.0, 'd':3.0}
s=pd1.Series(data)
print(s)
Output:
a    0.0
b    1.0
c    2.0
d    3.0
dtype: float64
Example 4
Create a Series from Scalar
import pandas as pd1
import numpy as np1
s=pd1.Series(5, index=[0, 1, 2, 3]) #the scalar 5 is repeated for every index label
print(s)
Output:
0    5
1    5
2    5
3    5
dtype: int64
Example 5
Maths Operations with Pandas Series
1.
import pandas as pd1
s=pd1.Series([1, 2, 3])
m=pd1.Series([1, 2, 3])
n=s+m #perform addition operation
print(n)
Output:
0    2
1    4
2    6
dtype: int64
2.
import pandas as pd1
s=pd1.Series([1, 2, 3])
m=pd1.Series([1, 2, 4])
n=s*m #perform multiplication operation
print(n)
Output:
0     1
1     4
2    12
dtype: int64
Head Function ( ):
head() returns the first n rows. The default number of elements to display is 5,
but we may pass a custom number.
For Example
import pandas as pd1
s=pd1.Series([1, 2, 3, 4, 5],index=['a', 'b', 'c', 'd', 'e'])
print(s.head(3))
Output:
a    1
b    2
c    3
dtype: int64
Tail Function ( ):
tail() returns the last n rows. The default number of elements to display
is 5, but we may pass a custom number.
For Example
import pandas as pd1
s=pd1.Series([1, 2, 3, 4, 5, 6], index=['a', 'b', 'c', 'd', 'e', 'f'])
print(s.tail(3))
Output:
d    4
e    5
f    6
dtype: int64
For Example
import pandas as pd1
s=pd1.Series([1, 2, 3, 4, 5, 6], index=['a', 'b', 'c', 'd', 'e', 'f'])
print(s.iloc[3]) #single element at position 3 (the fourth element)
Output:
4
For Example
import pandas as pd1
s=pd1.Series([1, 2, 3, 4, 5, 6], index=['a', 'b', 'c', 'd', 'e', 'f'])
print(s[:3]) #slice holding the first three elements
Output:
a    1
b    2
c    3
dtype: int64
Example 1
Create an Empty Data Frame
import pandas as pd1
df1=pd1.DataFrame()
print(df1)
Output:
Empty DataFrame
Columns: []
Index: []
Example 2
Create a Data frame from List
import pandas as pd1
data1=[1, 2, 3, 4, 5]
df1=pd1.DataFrame(data1)
print(df1)
Output:
   0
0  1
1  2
2  3
3  4
4  5
Example 3
Create a Data Frame from a dict of ndarrays or Lists
import pandas as pd1
data1={'Name':['Mansi', 'Krishna', 'Sandeep', 'Aashi'], 'Age':[17, 17, 17, 18]}
df1=pd1.DataFrame(data1)
print(df1)
Output:
      Name  Age
0    Mansi   17
1  Krishna   17
2  Sandeep   17
3    Aashi   18
Example 4
Create a Data Frame from a List of dicts
import pandas as pd1
data1=[{'x':1, 'y':2},{'x':10, 'y':12, 'z':15}]
df1=pd1.DataFrame(data1)
print(df1)
Output:
    x   y     z
0   1   2   NaN
1  10  12  15.0
Example 5
Create a Data Frame from a dict of Series
import pandas as pd1
d1={'One':pd1.Series([1, 2, 3], index=('a','b', 'c')),
'Two':pd1.Series([1, 2, 3, 4],index=('a','b', 'c', 'd'))}
df1=pd1.DataFrame(d1) #build the Data Frame from the dict of Series
print(df1)
Output:
   One  Two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4
Suppose we have a CSV file named [BPS Product CSV] that contains the following data.
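Reading such a file into a Data Frame can be sketched with read_csv(). The contents below are a hypothetical stand-in, not the original file: the column names Product, Price and Quantity and all values are assumptions, and io.StringIO is used in place of a file on disk so that the example is self-contained.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file's contents (not the original data)
csv_text = """Product,Price,Quantity
Pen,10,100
Notebook,40,50
Eraser,5,200
"""

# read_csv() accepts a file path or a file-like object such as StringIO
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

With a real file, the path to the CSV file would be passed to read_csv() instead of the StringIO object.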
• Column Addition
import pandas as pd
df=pd.DataFrame({"A":[1, 2, 3], "B":[4, 5, 6]})
c=[7, 8, 9]
df["C"]=c #adding a new column from a list
print(df)
Output:
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
• Column Deletion
#deleting the first column using the del statement
del df1['One']
#deleting another column using the pop function
df1.pop('Two')
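The two ways of deleting a column can be tried on a small Data Frame. This is a minimal sketch: the column names follow the snippet above, while the third column and all values are made up for illustration.

```python
import pandas as pd

# Made-up Data Frame with three columns
df1 = pd.DataFrame({'One': [1, 2, 3], 'Two': [4, 5, 6], 'Three': [7, 8, 9]})

del df1['One']             # removes the column from df1 in place
removed = df1.pop('Two')   # removes the column and returns it as a Series

print(df1)       # only the column 'Three' is left
print(removed)   # the values of the popped column
```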
• Column Rename
import pandas as pd
df=pd.DataFrame({"A":[1, 2, 3], "B":[4, 5, 6]})
#rename() returns a new Data Frame, so the result is assigned back to df
df=df.rename(columns={'A': 'a', 'B': 'b'})
print(df)
Output:
   a  b
0  1  4
1  2  5
2  3  6