
Question - 1

Question 1 (a)

Write the code to create a DataFrame 'df' and perform the following operations.
       Maths  Science    SST
Amit     100    100.0   60.0
Mohan     95     50.0  57.48
Sudha     85     90.0  53.58

i) Add one column Total=Maths+Science+SST.


ii) Add the marks of Kishor with values 75.6, 88.5, 90.3
iii) Display the marks of Maths and Science
iv) Update marks of Science of Sudha to 85.0
v) Delete the row – Mohan

Solution:

import pandas as pd

s1={'Maths':100,'Science':100.0,'SST':60.0}

s2={'Maths':95,'Science':50.0,'SST':57.48}

s3={'Maths':85,'Science':90.0,'SST':53.58}

lst=[s1,s2,s3]

df=pd.DataFrame(lst,index=['Amit','Mohan','Sudha'])

print("The created dataframe is:")

print(df)

df['Total']=df['Maths']+df['Science']+df['SST']

print("The dataframe after adding a new column Total:")

print(df)

df.loc['Kishor',:]=[75.6,88.5,90.3,0]

print("The dataframe after adding a new row:")

print(df)

print("Marks of Maths & Science")

print(df.iloc[::,0:2])

df.iat[2,1]=85.0

print("Dataframe after changing the marks of Science of Sudha")

print(df)

df=df.drop(['Mohan'])

print("Dataframe after deleting the record of Mohan")

print(df)

Sample output

The created dataframe is:

Maths Science SST

Amit 100 100.0 60.00

Mohan 95 50.0 57.48

Sudha 85 90.0 53.58

The dataframe after adding a new column Total:

Maths Science SST Total

Amit 100 100.0 60.00 260.00

Mohan 95 50.0 57.48 202.48

Sudha 85 90.0 53.58 228.58

The dataframe after adding a new row:

Maths Science SST Total

Amit 100.0 100.0 60.00 260.00

Mohan 95.0 50.0 57.48 202.48

Sudha 85.0 90.0 53.58 228.58

Kishor 75.6 88.5 90.30 0.00

Marks of Maths & Science

Maths Science

Amit 100.0 100.0

Mohan 95.0 50.0

Sudha 85.0 90.0

Kishor 75.6 88.5

Dataframe after changing the marks of Science of Sudha

Maths Science SST Total

Amit 100.0 100.0 60.00 260.00

Mohan 95.0 50.0 57.48 202.48

Sudha 85.0 85.0 53.58 228.58

Kishor 75.6 88.5 90.30 0.00

Dataframe after deleting the record of Mohan

Maths Science SST Total

Amit 100.0 100.0 60.00 260.00

Sudha 85.0 85.0 53.58 228.58

Kishor 75.6 88.5 90.30 0.00
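In the listing above, Kishor's Total is entered as 0 and Sudha's Total stays at 228.58 after her Science mark is changed. A minimal sketch of one possible variant, assuming Total should always equal the sum of the three subject marks (the `subjects` list and the choice to compute Total last are illustrative additions, not part of the original solution):

```python
import pandas as pd

# Same starting data as in the solution above.
df = pd.DataFrame(
    {"Maths": [100, 95, 85],
     "Science": [100.0, 50.0, 90.0],
     "SST": [60.0, 57.48, 53.58]},
    index=["Amit", "Mohan", "Sudha"],
)
subjects = ["Maths", "Science", "SST"]  # hypothetical helper list

# ii) Add Kishor's marks before the Total column exists.
df.loc["Kishor"] = [75.6, 88.5, 90.3]

# iii) Select the two columns by label rather than by position.
print(df[["Maths", "Science"]])

# iv) Update one cell by row and column label.
df.at["Sudha", "Science"] = 85.0

# i) Compute Total last so that it reflects every change made so far.
df["Total"] = df[subjects].sum(axis=1)

# v) Delete Mohan's row.
df = df.drop(index="Mohan")
print(df)
```

Using labels (`df.at`, `df[[...]]`) instead of positions (`df.iat`, `df.iloc`) keeps the code correct even if rows or columns are reordered later.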

--------------------------------------------------------------------------------------------------------------------------------------

Question – 2
Question 1(a)
Consider an array with values 10, 20, 30, 40, 50. Create a series from this array with default indexes
and write Python statements for the following.
i) Set the values of all elements to 100
ii) Add 10 to all elements of the series and display it.
iii) Display the 1st and the 4th elements of the series.
iv) Set the value of 3rd element to 500.
Solution
import numpy as np
import pandas as pd
# The question lists five values: 10, 20, 30, 40, 50
lst=[10,20,30,40,50]
ar=np.array(lst)
sr=pd.Series(ar)
print("Created series is:")
print(sr)
print("Series after adding 10 to all elements:")
print(sr+10)
print("The first and the fourth elements of the series are:")
print(sr[0],sr[3])
print("Series after setting the third value as 500")
sr[2]=500
print(sr)

Sample output
Created series is:
0 10
1 20
2 30
3 40
4 50
dtype: int32
Series after adding 10 to all elements:
0 20
1 30
2 40
3 50
4 60
dtype: int32
The first and the fourth elements of the series are:
10 40
Series after setting the third value as 500
0 10
1 20
2 500
3 40
4 50
dtype: int32
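
Step i) of the question (set every element to 100) does not appear in the listing above. A minimal sketch of one way to do it, assuming the change is made on a copy so that the later steps still see the original values (the name `all_hundred` is illustrative):

```python
import numpy as np
import pandas as pd

sr = pd.Series(np.array([10, 20, 30, 40, 50]))

# Work on a copy so that sr keeps its original values.
all_hundred = sr.copy()
all_hundred.loc[:] = 100   # assigning to the full slice overwrites every element
print(all_hundred)
```

Assigning a scalar to `.loc[:]` keeps the existing index and replaces only the values.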

Question – 3
Question 1(a)

a) Create a series that stores the name (as index) and area in sq. km (as value) of six states (using
dictionaries).
i) Write the code to find out the biggest and the smallest three areas from the given series.
ii) Display the series in the alphabetical order of state names.
iii) Display the details of the states having an area greater than 25000 sq. km.
iv) Change the indices to 'State1', 'State2', 'State3', 'State4', 'State5' and 'State6'.
Solution
import pandas as pd
s1=pd.Series({"Kerala":50000,"TamilNadu":23000,"Karnataka":35000,"UP":75000,"AP":40000,"MP":20000})
print("Created Series is:")
print(s1)
ans='y'
# Repeat the menu until the user answers anything other than y or Y
while ans=='y' or ans=='Y':
    print("1. The biggest and the smallest three areas")
    print("2. Display the series in alphabetical order of state names")
    print("3. States having area above 25000 sq. km")
    print("4. Change the indices")
    ch=int(input("Enter your choice"))
    if ch==1:
        # After an ascending sort, the last three are the biggest and the first three the smallest
        s1.sort_values(inplace=True)
        print("Biggest three states are:")
        print(s1.tail(3))
        print("Smallest three states are")
        print(s1.head(3))
    elif ch==2:
        print("Given series in the alphabetical order of state names:")
        s1.sort_index(inplace=True)
        print(s1)
    elif ch==3:
        print("Details of the states having the area greater than 25000")
        print(s1[s1>25000])
    elif ch==4:
        print("Series after changing the index values:")
        s1.index=['State1','State2','State3','State4','State5','State6']
        print(s1)
    else:
        print("Invalid choice")
    ans=input("Do you wish to continue")

Sample output
Created Series is:
Kerala 50000
TamilNadu 23000
Karnataka 35000
UP 75000
AP 40000
MP 20000
dtype: int64
Biggest three states are:
AP 40000
Kerala 50000
UP 75000
dtype: int64
Smallest three states are
MP 20000
TamilNadu 23000
Karnataka 35000
dtype: int64
Given series in the alphabetical order of state names:
AP 40000
Karnataka 35000
Kerala 50000
MP 20000
TamilNadu 23000
UP 75000
dtype: int64
Details of the states having the area greater than 25000
AP 40000
Karnataka 35000
Kerala 50000
UP 75000
dtype: int64
Series after changing the index values:
State1 40000
State2 35000
State3 50000
State4 20000
State5 23000
State6 75000
dtype: int64
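
Because the listing above sorts the series in place, each menu choice changes the order seen by the next one. A minimal sketch of an alternative, assuming the original order should be preserved (the names `areas`, `biggest`, `smallest` and `by_name` are illustrative):

```python
import pandas as pd

areas = pd.Series({"Kerala": 50000, "TamilNadu": 23000, "Karnataka": 35000,
                   "UP": 75000, "AP": 40000, "MP": 20000})

# nlargest / nsmallest return new Series and leave areas untouched.
biggest = areas.nlargest(3)     # largest value first
smallest = areas.nsmallest(3)   # smallest value first
print(biggest)
print(smallest)

# sort_index() without inplace=True also returns a new Series.
by_name = areas.sort_index()
print(by_name)
```

Note that `nlargest` lists the biggest value first, whereas `tail(3)` on an ascending sort lists it last.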

Question – 4
Question 1(a)
Create a DataFrame as given below and also write the code to add or remove rows from the dataframe
according to the user's choice. Also store the dataframe to a CSV file.
        Yr1    Yr2    Yr3
Qtr1  34500  44900  54500
Qtr2  36000  46100  51000
Qtr3  47000  57000  58500
Solution
import pandas as pd
d1={'Yr1':34500,'Yr2':44900,'Yr3':54500}
d2={'Yr1':36000,'Yr2':46100,'Yr3':51000}
d3={'Yr1':47000,'Yr2':57000,'Yr3':58500}
lst=[d1,d2,d3]
df=pd.DataFrame(lst,index=['Qtr1','Qtr2','Qtr3'])
print("Created dataframe is:")
print(df)
ans='Y'
while ans=='Y' or ans=='y':
    print("1.Add a new row\n2.Remove a row")
    ch=int(input("Enter your choice(1 or 2)"))
    if ch==1:
        # int() is used instead of eval(), which would execute whatever the user types
        yr1=int(input("Enter the sales amount of year1: "))
        yr2=int(input("Enter the sales amount of year2: "))
        yr3=int(input("Enter the sales amount of year3: "))
        df.loc["Qtr"+str(len(df)+1)]=[yr1,yr2,yr3]
        print("Dataframe after insertion of new row:")
        print(df)
    elif ch==2:
        print(df.index)
        lab=input("Enter the row index")
        df=df.drop([lab])
        print("Dataframe after deletion of the specified row")
        print(df)
    else:
        print("Invalid choice")
    ans=input("Do you wish to continue")
# Write the final dataframe to a CSV file once the loop has ended
df.to_csv("D:\\SAPS\\sample.csv",sep=',')
print("A csv file created at the specified location using the dataframe")

Sample output
Created dataframe is:
Yr1 Yr2 Yr3
Qtr1 34500 44900 54500
Qtr2 36000 46100 51000
Qtr3 47000 57000 58500
1.Add a new row
2.Remove a row
Enter your choice(1 or 2)1
Enter the sales amount of year1: 25000

Enter the sales amount of year2: 32000
Enter the sales amount of year3: 40000
Dataframe after insertion of new row:
Yr1 Yr2 Yr3
Qtr1 34500 44900 54500
Qtr2 36000 46100 51000
Qtr3 47000 57000 58500
Qtr4 25000 32000 40000
Do you wish to continuey
1.Add a new row
2.Remove a row
Enter your choice(1 or 2)2
Index(['Qtr1', 'Qtr2', 'Qtr3', 'Qtr4'], dtype='object')
Enter the row indexQtr1
Dataframe after deletion of the specified row
Yr1 Yr2 Yr3
Qtr2 36000 46100 51000
Qtr3 47000 57000 58500
Qtr4 25000 32000 40000
Do you wish to continuen
A csv file created at the specified location using the dataframe
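
To confirm that the file holds what was intended, it can be read back. A minimal sketch, assuming a temporary folder is used in place of the `D:\SAPS` path so that it runs on any machine (`csv_path` and `restored` are illustrative names):

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame(
    {"Yr1": [34500, 36000, 47000],
     "Yr2": [44900, 46100, 57000],
     "Yr3": [54500, 51000, 58500]},
    index=["Qtr1", "Qtr2", "Qtr3"],
)

# Hypothetical location: a fresh temporary folder.
csv_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
df.to_csv(csv_path)

# index_col=0 turns the first CSV column back into the row labels.
restored = pd.read_csv(csv_path, index_col=0)
print(restored)
```

Without `index_col=0`, the quarter labels would come back as an ordinary unnamed column and the rows would get a new default index.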

Question – 5
Question 1(a)
DataFrame - Employee

Name   Position    Age  Projects
Rabia  Manager     30   NaN
Evan   Programmer  NaN  NaN
Jia    Manager     34   16
Lalit  Programmer  40   20

Write a program to create the above given DataFrame employee and perform the following operations on
it.

1. Display the details of employees having age greater than 30

2. Display the details of all programmers

3. Display the details of a particular employee

Solution
import numpy as np
import pandas as pd
# np.nan is used because the np.NaN alias was removed in NumPy 2.0
e1={'Name':'Rabia','Position':'Manager','Age':30,'Projects':np.nan}
e2={'Name':'Evan','Position':'Programmer','Age':np.nan,'Projects':np.nan}
e3={'Name':'Jia','Position':'Manager','Age':34,'Projects':16}
e4={'Name':'Lalit','Position':'Programmer','Age':40,'Projects':20}
lst=[e1,e2,e3,e4]
employee=pd.DataFrame(lst)
print("Created dataframe is:")
print(employee)
ans='y'
while ans=='y' or ans=='Y':
    print("1.Details of employees having age above 30\n2.Details of all Programmers\n3.Details of a particular employee")
    ch=int(input("Enter your choice (1-3)"))
    if ch==1:
        print("Employees having age greater than 30")
        print(employee[employee['Age']>30])
    elif ch==2:
        print("Details of programmers")
        print(employee[employee['Position']=='Programmer'])
    elif ch==3:
        print("Names of employees")
        print(employee.Name)
        nm=input("Enter the name of the employee whose details you want to display")
        print(employee[employee.Name==nm])
    else:
        print("Invalid choice")
    ans=input("Do you wish to continue")

Sample output

Created dataframe is:

Name Position Age Projects

0 Rabia Manager 30.0 NaN

1 Evan Programmer NaN NaN

2 Jia Manager 34.0 16.0

3 Lalit Programmer 40.0 20.0

1.Details of employees having age above 30

2.Details of all Programmers

3.Details of a particular employee

Enter your choice (1-3)1

Employees having age greater than 30

Name Position Age Projects

2 Jia Manager 34.0 16.0

3 Lalit Programmer 40.0 20.0

Do you wish to continuey

1.Details of employees having age above 30

2.Details of all Programmers

3.Details of a particular employee

Enter your choice (1-3)2

Details of programmers

Name Position Age Projects

1 Evan Programmer NaN NaN

3 Lalit Programmer 40.0 20.0

Do you wish to continuey

1.Details of employees having age above 30

2.Details of all Programmers

3.Details of a particular employee

Enter your choice (1-3)3

Names of employees

0 Rabia

1 Evan

2 Jia

3 Lalit

Name: Name, dtype: object

Enter the name of the employee whose details you want to displayRabia

Name Position Age Projects

0 Rabia Manager 30.0 NaN

Do you wish to continuen
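
The age filter above quietly skips Evan, whose age is NaN, because any comparison with NaN is False. A minimal sketch of how missing values could be listed separately, together with a case-insensitive name lookup (both are illustrative refinements and are not required by the question):

```python
import numpy as np
import pandas as pd

employee = pd.DataFrame({
    "Name": ["Rabia", "Evan", "Jia", "Lalit"],
    "Position": ["Manager", "Programmer", "Manager", "Programmer"],
    "Age": [30, np.nan, 34, 40],
    "Projects": [np.nan, np.nan, 16, 20],
})

# NaN > 30 evaluates to False, so rows with a missing age are left out.
older = employee[employee["Age"] > 30]
print(older)

# isna() picks out the rows whose age is not recorded.
age_missing = employee[employee["Age"].isna()]
print(age_missing)

# Compare lower-cased names so that "rabia" and "Rabia" both match.
wanted = "rabia"
match = employee[employee["Name"].str.lower() == wanted.lower()]
print(match)
```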

Question – 6

Question 1(a)
Write a program to create two series that stores the salary obtained by 3 employees for 2 months (using lists).
Calculate the sum, average and difference in their salaries using Series.

Solution
import pandas as pd
mth1=pd.Series([30000,35000,28000],index=["Ram","Shyam","Mohan"])
mth2=pd.Series([32000,40000,28000],index=["Ram","Shyam","Mohan"])
print("salary of the first month")
print(mth1)
print("Salary of the second month")
print(mth2)
ans='y'
while ans=='y' or ans=='Y':
    print("1.Sum of salaries")
    print("2. Average of salaries")
    print("3. Difference of salaries")
    ch=int(input("enter your choice"))
    if ch==1:
        print("Sum of the salaries")
        print(mth1+mth2)
    elif ch==2:
        print("Average of the salaries")
        print((mth1+mth2)/2)
    elif ch==3:
        print("Difference between the salaries")
        print(mth1-mth2)
    else:
        print("Invalid choice")
    ans=input("Do you wish to continue")

Sample output
salary of the first month
Ram 30000
Shyam 35000
Mohan 28000
dtype: int64
Salary of the second month
Ram 32000
Shyam 40000
Mohan 28000
dtype: int64
1.Sum of salaries
2. Average of salaries
3. Difference of salaries
enter your choice1
Sum of the salaries
Ram 62000
Shyam 75000
Mohan 56000
dtype: int64
Do you wish to continuey
1.Sum of salaries
2. Average of salaries
3. Difference of salaries
enter your choice2
Average of the salaries
Ram 31000.0
Shyam 37500.0
Mohan 28000.0
dtype: float64
Do you wish to continuey
1.Sum of salaries
2. Average of salaries
3. Difference of salaries
enter your choice3
Difference between the salaries
Ram -2000
Shyam -5000
Mohan 0
dtype: int64
Do you wish to continuen
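
The three results can also be viewed side by side. A minimal sketch, assuming the two series are combined into one DataFrame (the column names `Month1`, `Month2`, `Sum`, `Average` and `Difference` are illustrative):

```python
import pandas as pd

names = ["Ram", "Shyam", "Mohan"]
mth1 = pd.Series([30000, 35000, 28000], index=names)
mth2 = pd.Series([32000, 40000, 28000], index=names)

# One column per month; pandas aligns the rows on the employee names.
salaries = pd.DataFrame({"Month1": mth1, "Month2": mth2})
salaries["Sum"] = mth1 + mth2
salaries["Average"] = (mth1 + mth2) / 2
salaries["Difference"] = mth1 - mth2
print(salaries)
```

Series arithmetic matches values by index label, not by position, so the result stays correct even if the two series list the employees in a different order.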

Question – 7
Question 1(a)

Create a dataframe with RollNo, Name, Age and Marks of 5 subjects with default index. Write the
commands to do the following operations on the dataframes.
i) Calculate the total marks and display in the field 'Total'
ii) Change the index from default to RollNo.
iii) Display the details of 1st and 3rd students.
iv) Add a new row to the dataframe.
Solution
import pandas as pd
s1={'RollNo':1,'Name':'Arun','Age':15,'Mark1':78,'Mark2':85,'Mark3':90,'Mark4':92,'Mark5':88}
s2={'RollNo':2,'Name':'Bibin','Age':16,'Mark1':98,'Mark2':80,'Mark3':92,'Mark4':96,'Mark5':98}
s3={'RollNo':3,'Name':'Dijo','Age':15,'Mark1':70,'Mark2':80,'Mark3':90,'Mark4':95,'Mark5':97}
students=pd.DataFrame([s1,s2,s3])
print(students)
students['Total']=students.Mark1+students.Mark2+students.Mark3+students.Mark4+students.Mark5
print(students)
print("Details of the first and the third students")
print(students.iloc[0::2,::])
print("Dataframe after setting RollNo as index:")
students.set_index('RollNo',inplace=True)
print(students)
print("Details of the first and the third students")
print(students.iloc[0::2])
print("Adding new record")
rno=int(input("Enter roll number"))
nm=input("Enter name")
ag=int(input("Enter age"))
m1=int(input("Enter mark1: "))
m2=int(input("Enter mark2: "))
m3=int(input("Enter mark3: "))
m4=int(input("Enter mark4: "))
m5=int(input("Enter mark5: "))
tot=int(input("Enter total: "))
students.loc[rno]=[nm,ag,m1,m2,m3,m4,m5,tot]
print("Dataframe after insertion of a new row")
print(students)
Sample output
RollNo Name Age Mark1 Mark2 Mark3 Mark4 Mark5
0 1 Arun 15 78 85 90 92 88
1 2 Bibin 16 98 80 92 96 98
2 3 Dijo 15 70 80 90 95 97
RollNo Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
0 1 Arun 15 78 85 90 92 88 433

1 2 Bibin 16 98 80 92 96 98 464
2 3 Dijo 15 70 80 90 95 97 432
Details of the first and the third students
RollNo Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
0 1 Arun 15 78 85 90 92 88 433
2 3 Dijo 15 70 80 90 95 97 432
Dataframe after setting RollNo as index:
Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
RollNo
1 Arun 15 78 85 90 92 88 433
2 Bibin 16 98 80 92 96 98 464
3 Dijo 15 70 80 90 95 97 432
Details of the first and the third students
Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
RollNo
1 Arun 15 78 85 90 92 88 433
3 Dijo 15 70 80 90 95 97 432
Adding new record
Enter roll number4
Enter nameKevin
Enter age15
Enter mark1: 100
Enter mark2: 100
Enter mark3: 100
Enter mark4: 100
Enter mark5: 100
Enter total: 500
Dataframe after insertion of a new row
Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
RollNo
1 Arun 15 78 85 90 92 88 433
2 Bibin 16 98 80 92 96 98 464
3 Dijo 15 70 80 90 95 97 432
4 Kevin 15 100 100 100 100 100 500
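
The listing above asks the user to type the total for the new record, which can disagree with the marks entered. A minimal sketch of computing it instead, assuming the new student's details are already available as Python values (the `mark_cols` list and the fixed values for Kevin, taken from the sample run, are illustrative):

```python
import pandas as pd

students = pd.DataFrame([
    {"RollNo": 1, "Name": "Arun", "Age": 15, "Mark1": 78, "Mark2": 85, "Mark3": 90, "Mark4": 92, "Mark5": 88},
    {"RollNo": 2, "Name": "Bibin", "Age": 16, "Mark1": 98, "Mark2": 80, "Mark3": 92, "Mark4": 96, "Mark5": 98},
    {"RollNo": 3, "Name": "Dijo", "Age": 15, "Mark1": 70, "Mark2": 80, "Mark3": 90, "Mark4": 95, "Mark5": 97},
]).set_index("RollNo")

mark_cols = ["Mark1", "Mark2", "Mark3", "Mark4", "Mark5"]  # hypothetical helper list

# Row-wise sum over the five mark columns.
students["Total"] = students[mark_cols].sum(axis=1)

# New record: the total is derived from the marks instead of being typed in.
new_marks = [100, 100, 100, 100, 100]
students.loc[4] = ["Kevin", 15, *new_marks, sum(new_marks)]
print(students)
```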

Question – 8
Question 1(a)

Write a Python program to create 2 DataFrames that stores the marks secured by 5 students in 2
examinations and perform the following operations:

i) To create a new data frame containing total marks (adding marks secured
in both exams)

ii) To display the top 3 scorers details

Solution
import pandas as pd
exam1=pd.DataFrame({"Name":["Anu","Biju","Cino","Deljith"],"Marks":[50,45,24,48]},
                   index=["Rno1","Rno2","Rno3","Rno4"])
exam2=pd.DataFrame({"Name":["Anu","Biju","Cino","Deljith"],"Marks":[45,40,43,40]},
                   index=["Rno1","Rno2","Rno3","Rno4"])
print("Details of Exam1")
print(exam1)
print("Details of Exam2")
print(exam2)
print("Total marks of two exams")
df3=pd.DataFrame({"Name":exam1.Name,"Total":exam1.Marks+exam2.Marks})
print(df3)
print("Details of the top 3 scorers")
df3.sort_values(["Total"],ascending=False,inplace=True)
print(df3.head(3))

Sample output

Details of Exam1

Name Marks

Rno1 Anu 50

Rno2 Biju 45

Rno3 Cino 24

Rno4 Deljith 48

Details of Exam2

Name Marks

Rno1 Anu 45
Rno2 Biju 40

Rno3 Cino 43

Rno4 Deljith 40

Total marks of two exams

Name Total

Rno1 Anu 95

Rno2 Biju 85

Rno3 Cino 67

Rno4 Deljith 88

Details of the top 3 scorers

Name Total

Rno1 Anu 95

Rno4 Deljith 88

Rno2 Biju 85
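
The top scorers can also be picked without sorting the whole dataframe in place. A minimal sketch, assuming that a student who missed one exam should count as scoring zero in it (the `fill_value=0` choice and the names `total` and `top3` are illustrative assumptions):

```python
import pandas as pd

rolls = ["Rno1", "Rno2", "Rno3", "Rno4"]
names = ["Anu", "Biju", "Cino", "Deljith"]
exam1 = pd.DataFrame({"Name": names, "Marks": [50, 45, 24, 48]}, index=rolls)
exam2 = pd.DataFrame({"Name": names, "Marks": [45, 40, 43, 40]}, index=rolls)

# add() aligns on roll number; fill_value=0 treats a missing mark as zero.
total = pd.DataFrame({
    "Name": exam1["Name"],
    "Total": exam1["Marks"].add(exam2["Marks"], fill_value=0),
})

# nlargest returns the three highest totals, highest first, as a new DataFrame.
top3 = total.nlargest(3, "Total")
print(top3)
```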

Question – 9
Question 1(a)
a) Write a Python program to create the given DataFrame and also write the code to perform the
following operations.
         Population  Hospitals  Schools
Delhi      10927986        189     7916
Mumbai     12691836        208     8508
Kolkata     4631392        149     7226
Chennai     4328063        157     7617


i) Display the details of Mumbai

ii) Add one more column Colleges with appropriate data

iii) Change the no. of hospitals of Chennai to 160


iv) Add the details of the city 'Hyderabad'
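
No solution listing is given for this question, only the sample output. A minimal sketch that follows the pattern of the earlier answers; the Colleges figures and the Hyderabad row are taken from the sample output, while the code itself is an assumption about how that output could be produced:

```python
import pandas as pd

df = pd.DataFrame(
    {"Population": [10927986, 12691836, 4631392, 4328063],
     "Hospitals": [189, 208, 149, 157],
     "Schools": [7916, 8508, 7226, 7617]},
    index=["Delhi", "Mumbai", "Kolkata", "Chennai"],
)
print("Created dataframe is")
print(df)

# i) Select one row by its label.
print("Details of the city Mumbai")
print(df.loc["Mumbai"])

# ii) Add a new column, one value per existing row.
df["Colleges"] = [100, 200, 125, 170]
print("Dataframe after adding new field Colleges")
print(df)

# iii) Update a single cell by row and column label.
df.at["Chennai", "Hospitals"] = 160
print("Dataframe after changing no. of hospitals of Chennai")
print(df)

# iv) Assigning to a new row label appends the row.
df.loc["Hyderabad"] = [10238596, 200, 5250, 150]
print("Dataframe after adding details of the city Hyderabad")
print(df)
```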

Sample output

Created dataframe is

Population Hospitals Schools

Delhi 10927986 189 7916

Mumbai 12691836 208 8508

Kolkata 4631392 149 7226

Chennai 4328063 157 7617

Details of the city Mumbai

Population 12691836

Hospitals 208

Schools 8508

Name: Mumbai, dtype: int64

Dataframe after adding new field Colleges

Population Hospitals Schools Colleges

Delhi 10927986 189 7916 100

Mumbai 12691836 208 8508 200

Kolkata 4631392 149 7226 125

Chennai 4328063 157 7617 170

Dataframe after changing no. of hospitals of Chennai

Population Hospitals Schools Colleges

Delhi 10927986 189 7916 100

Mumbai 12691836 208 8508 200

Kolkata 4631392 149 7226 125

Chennai 4328063 160 7617 170

Dataframe after adding details of the city Hyderabad

Population Hospitals Schools Colleges

Delhi 10927986 189 7916 100

Mumbai 12691836 208 8508 200

Kolkata 4631392 149 7226 125

Chennai 4328063 160 7617 170


Hyderabad 10238596 200 5250 150

Question – 10
Question 1(a)

Write a Python program to create a series to store the amount of sales made by a salesman in the last year
(whole months) and perform the following operations.

i) Display the sales amount which is greater than 10000.


ii) Display the sales amount in the first four months.
iii) Display the series in the descending order of sales amount.

Solution

import pandas as pd

import numpy as np

sales=np.array([25000,10000,22500,21750,24000,25000,9500,22500,21750,24000,23000,22800])

ser=pd.Series(sales,index=['jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec'])

print("Created series is:")

print(ser)

print("Sales amount more than 10000")

print(ser[ser>10000])

print("Sales amount in the first four months")

print(ser.head(4))

print("Series in the descending order of sales amount")

ser.sort_values(ascending=False,inplace=True)

print(ser)

Sample output

Created series is:

jan 25000

feb 10000

mar 22500

apr 21750

may 24000

jun 25000

jul 9500

aug 22500

sep 21750

oct 24000

nov 23000

dec 22800

dtype: int32

Sales amount more than 10000

jan 25000

mar 22500

apr 21750

may 24000

jun 25000

aug 22500

sep 21750

oct 24000

nov 23000

dec 22800

dtype: int32

Sales amount in the first four months

jan 25000

feb 10000

mar 22500

apr 21750

dtype: int32

Series in the descending order of sales amount

jan 25000
jun 25000

may 24000

oct 24000

nov 23000

dec 22800

mar 22500

aug 22500

apr 21750

sep 21750

feb 10000

jul 9500

dtype: int32
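
The in-place sort above discards the calendar order of the series. A minimal sketch of a variant that keeps it, assuming the monthly order is still needed afterwards (the names `ranked`, `first_four` and `above` are illustrative):

```python
import pandas as pd

months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
ser = pd.Series([25000, 10000, 22500, 21750, 24000, 25000,
                 9500, 22500, 21750, 24000, 23000, 22800], index=months)

# i) Boolean mask: strictly greater than 10000, so feb (exactly 10000) is excluded.
above = ser[ser > 10000]

# ii) First four months by position.
first_four = ser.iloc[:4]

# iii) Without inplace=True, sort_values() returns a new Series
#      and ser itself stays in calendar order.
ranked = ser.sort_values(ascending=False)
print(above)
print(first_four)
print(ranked)
```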
