Question 1 (a)
Write the code to create a DataFrame 'df' and perform the following operations.
        Maths  Science    SST
Amit      100    100.0  60.00
Mohan      95     50.0  57.48
Sudha      85     90.0  53.58
Solution:
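import pandas as pd
# One dictionary per student; the keys become the column names
s1={'Maths':100,'Science':100.0,'SST':60.0}
s2={'Maths':95,'Science':50.0,'SST':57.48}
s3={'Maths':85,'Science':90.0,'SST':53.58}
lst=[s1,s2,s3]
df=pd.DataFrame(lst,index=['Amit','Mohan','Sudha'])
print(df)
# Add a 'Total' column holding the sum of the three subjects
df['Total']=df['Maths']+df['Science']+df['SST']
print(df)
# Add a new row labelled 'Kishor' (Maths, Science, SST, Total)
df.loc['Kishor',:]=[75.6,88.5,90.3,0]
print(df)
# Display the first two columns (Maths and Science) of every row
print(df.iloc[:,0:2])
# Change the Science mark in the third row (Sudha) to 85.0
df.iat[2,1]=85.0
print(df)
# Remove the row labelled 'Mohan'
df=df.drop(['Mohan'])
print(df)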
Sample output
[Output screenshot not reproduced; only the column header "Maths Science" survives.]
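The solution builds the 'Total' column by naming each subject. As a minimal sketch (the list name `subjects` is an assumption, not part of the original solution), the same column can be computed with a row-wise sum, and the same cell can be reached by label with `loc` or by position with `iloc`:

```python
import pandas as pd

# Same marks as in the question
df = pd.DataFrame(
    [{'Maths': 100, 'Science': 100.0, 'SST': 60.0},
     {'Maths': 95, 'Science': 50.0, 'SST': 57.48},
     {'Maths': 85, 'Science': 90.0, 'SST': 53.58}],
    index=['Amit', 'Mohan', 'Sudha'])

subjects = ['Maths', 'Science', 'SST']   # assumed list of mark columns
df['Total'] = df[subjects].sum(axis=1)   # axis=1 sums across each row
print(df)

# Label-based (loc) and position-based (iloc) access to the same cell
print(df.loc['Sudha', 'Science'], df.iloc[2, 1])
```

With `sum(axis=1)` the column list is the only thing that needs to change if another subject is added.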
Question – 2
Consider an array with values 10, 20, 30, 40, 50. Create a series from this array with default indexes
and write Python statements for the following.
i) Set the values of all elements to 100
ii) Add 10 to all elements of the series and display it.
iii) Display the 1st and the 4th elements of the series.
iv) Set the value of 3rd element to 500.
Solution
import numpy as np
import pandas as pd
# Array holding the five values given in the question
lst=[10,20,30,40,50]
ar=np.array(lst)
# Series built from the array; the default indexes are 0 to 4
sr=pd.Series(ar)
print("Created series is:")
print(sr)
# ii) Adding a scalar applies it to every element
print("Series after adding 10 to all elements:")
print(sr+10)
# iii) Index 0 is the 1st element and index 3 is the 4th
print("The first and the fourth elements of the series are:")
print(sr[0],sr[3])
# iv) Index 2 is the 3rd element
print("Series after setting the third value as 500")
sr[2]=500
print(sr)
Sample output
Created series is:
0    10
1    20
2    30
3    40
4    50
dtype: int32
Series after adding 10 to all elements:
0    20
1    30
2    40
3    50
4    60
dtype: int32
The first and the fourth elements of the series are:
10 40
Series after setting the third value as 500
0     10
1     20
2    500
3     40
4     50
dtype: int32
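Part (i) of the question (setting every element to 100) does not appear in the solution. A minimal sketch of one way to do it, assuming the change is made on a copy (the name `all_hundred` is hypothetical) so that the remaining parts still work on the original values:

```python
import numpy as np
import pandas as pd

sr = pd.Series(np.array([10, 20, 30, 40, 50]))

all_hundred = sr.copy()   # work on a copy so the original values are kept
all_hundred[:] = 100      # slice assignment sets every element
print(all_hundred)
print(sr)                 # the original series is unchanged
```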
Question – 3
a) Create a series that stores the name (as index) and the area in sq. km (as value) of six states (using
dictionaries).
i) Write the code to find the three biggest and the three smallest areas in the series.
ii) Display the series in alphabetical order of state names.
iii) Display the details of the states with an area greater than 25000 sq. km.
iv) Change the indices to 'State1', 'State2', 'State3', 'State4', 'State5' and 'State6'.
Solution
import pandas as pd
# State names are the index and the areas are the values
s1=pd.Series({"Kerala":50000,"TamilNadu":23000,"Karnataka":35000,"UP":75000,"AP":40000,"MP":20000})
print("Created Series is:")
print(s1)
ans='y'
# Repeat the menu for as long as the user answers y or Y
while ans=='y' or ans=='Y':
    print("1. The biggest and the smallest three areas")
    print("2. Display the series in alphabetical order of state names")
    print("3. States having area above 25000 sq. km")
    print("4. Change the indices")
    ch=int(input("Enter your choice"))
    if ch==1:
        # Ascending sort: the last three are the biggest, the first three the smallest
        s1.sort_values(inplace=True)
        print("Biggest three states are:")
        print(s1.tail(3))
        print("Smallest three states are")
        print(s1.head(3))
    elif ch==2:
        print("Given series in the alphabetical order of state names:")
        s1.sort_index(inplace=True)
        print(s1)
    elif ch==3:
        print("Details of the states having the area greater than 25000")
        # Boolean indexing keeps only the matching states
        print(s1[s1>25000])
    elif ch==4:
        print("Series after changing the index values:")
        s1.index=['State1','State2','State3','State4','State5','State6']
        print(s1)
    else:
        print("Invalid choice")
    ans=input("Do you wish to continue")
Sample output
Created Series is:
Kerala 50000
TamilNadu 23000
Karnataka 35000
UP 75000
AP 40000
MP 20000
dtype: int64
Biggest three states are:
AP 40000
Kerala 50000
UP 75000
dtype: int64
Smallest three states are
MP 20000
TamilNadu 23000
Karnataka 35000
dtype: int64
Given series in the alphabetical order of state names:
AP 40000
Karnataka 35000
Kerala 50000
MP 20000
TamilNadu 23000
UP 75000
dtype: int64
Details of the states having the area greater than 25000
AP 40000
Karnataka 35000
Kerala 50000
UP 75000
dtype: int64
Series after changing the index values:
State1 40000
State2 35000
State3 50000
State4 20000
State5 23000
State6 75000
dtype: int64
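Sorting with `inplace=True` reorders the series itself, so later menu choices see the changed order. As a hedged alternative sketch (not the method used in the solution), `nlargest` and `nsmallest` return the three biggest and smallest areas without touching the original series:

```python
import pandas as pd

s1 = pd.Series({"Kerala": 50000, "TamilNadu": 23000, "Karnataka": 35000,
                "UP": 75000, "AP": 40000, "MP": 20000})

biggest = s1.nlargest(3)     # three largest values, largest first
smallest = s1.nsmallest(3)   # three smallest values, smallest first
print(biggest)
print(smallest)
print(s1)                    # the original order is untouched
```

Note that `nlargest` lists the biggest value first, whereas `tail(3)` on an ascending sort lists it last.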
Question – 4
Create a DataFrame as given below and write the code to add or remove rows from the DataFrame
according to the user's choice. Also store the DataFrame in a CSV file.
        Yr1    Yr2    Yr3
Qtr1  34500  44900  54500
Qtr2  36000  46100  51000
Qtr3  47000  57000  58500
Solution
import pandas as pd
# One dictionary per quarter; the keys become the column names
d1={'Yr1':34500,'Yr2':44900,'Yr3':54500}
d2={'Yr1':36000,'Yr2':46100,'Yr3':51000}
d3={'Yr1':47000,'Yr2':57000,'Yr3':58500}
lst=[d1,d2,d3]
df=pd.DataFrame(lst,index=['Qtr1','Qtr2','Qtr3'])
print("Created dataframe is:")
print(df)
ans='Y'
while ans=='Y' or ans=='y':
    print("1.Add a new row\n2.Remove a row")
    ch=int(input("Enter your choice(1 or 2)"))
    if ch==1:
        # Read each sales amount as an integer
        yr1=int(input("Enter the sales amount of year1: "))
        yr2=int(input("Enter the sales amount of year2: "))
        yr3=int(input("Enter the sales amount of year3: "))
        # The label of the new row is built from the current number of rows
        df.loc["Qtr"+str(len(df)+1)]=[yr1,yr2,yr3]
        print("Dataframe after insertion of new row:")
        print(df)
    elif ch==2:
        # Show the existing labels so the user can pick one
        print(df.index)
        lab=input("Enter the row index")
        df=df.drop([lab])
        print("Dataframe after deletion of the specified row")
        print(df)
    else:
        print("Invalid choice")
    ans=input("Do you wish to continue")
# Store the final dataframe in a CSV file
df.to_csv("D:\\SAPS\\sample.csv",sep=',')
print("A csv file created at the specified location using the dataframe")
Sample output
Created dataframe is:
Yr1 Yr2 Yr3
Qtr1 34500 44900 54500
Qtr2 36000 46100 51000
Qtr3 47000 57000 58500
1.Add a new row
2.Remove a row
Enter your choice(1 or 2)1
Enter the sales amount of year1: 25000
Enter the sales amount of year2: 32000
Enter the sales amount of year3: 40000
Dataframe after insertion of new row:
Yr1 Yr2 Yr3
Qtr1 34500 44900 54500
Qtr2 36000 46100 51000
Qtr3 47000 57000 58500
Qtr4 25000 32000 40000
Do you wish to continuey
1.Add a new row
2.Remove a row
Enter your choice(1 or 2)2
Index(['Qtr1', 'Qtr2', 'Qtr3', 'Qtr4'], dtype='object')
Enter the row indexQtr1
Dataframe after deletion of the specified row
Yr1 Yr2 Yr3
Qtr2 36000 46100 51000
Qtr3 47000 57000 58500
Qtr4 25000 32000 40000
Do you wish to continuen
A csv file created at the specified location using the dataframe
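To check what was written, the CSV file can be read back. The sketch below is an illustration under stated assumptions: it writes to a temporary folder (a hypothetical location standing in for D:\SAPS) and uses `index_col=0` so that the quarter labels are restored as the row index.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({'Yr1': [34500, 36000, 47000],
                   'Yr2': [44900, 46100, 57000],
                   'Yr3': [54500, 51000, 58500]},
                  index=['Qtr1', 'Qtr2', 'Qtr3'])

# Hypothetical location: a temporary folder instead of D:\SAPS
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
df.to_csv(path)

# index_col=0 turns the first CSV column back into the row index
restored = pd.read_csv(path, index_col=0)
print(restored)
```

Without `index_col=0` the quarter labels would come back as an ordinary unnamed column next to a fresh default index.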
Question – 5
DataFrame - Employee
Name   Position    Age  Projects
Rabia  Manager     30   NaN
Evan   Programmer  NaN  NaN
Jia    Manager     34   16
Lalit  Programmer  40   20
Write a program to create the above DataFrame employee and perform the following operations on it.
Solution
import numpy as np
import pandas as pd
# np.nan marks the values that are missing
e1={'Name':'Rabia','Position':'Manager','Age':30,'Projects':np.nan}
e2={'Name':'Evan','Position':'Programmer','Age':np.nan,'Projects':np.nan}
e3={'Name':'Jia','Position':'Manager','Age':34,'Projects':16}
e4={'Name':'Lalit','Position':'Programmer','Age':40,'Projects':20}
lst=[e1,e2,e3,e4]
employee=pd.DataFrame(lst)
print("Created dataframe is:")
print(employee)
ans='y'
while ans=='y' or ans=='Y':
    print("1.Details of employees having age above 30\n2.Details of all Programmers\n3.Details of a particular employee")
    ch=int(input("Enter your choice (1-3)"))
    if ch==1:
        print("Employees having age greater than 30")
        # Boolean indexing on the Age column
        print(employee[employee['Age']>30])
    elif ch==2:
        print("Details of programmers")
        print(employee[employee['Position']=='Programmer'])
    elif ch==3:
        # List the names first so the user can choose one
        print("Names of employees")
        print(employee.Name)
        nm=input("Enter the name of the employee whose details you want to display")
        print(employee[employee.Name==nm])
    else:
        print("Invalid choice")
    ans=input("Do you wish to continue")
Sample output
1.Details of employees having age above 30
Details of programmers
Names of employees
0 Rabia
1 Evan
2 Jia
3 Lalit
Enter the name of the employee whose details you want to displayRabia
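Because the Age column contains a missing value, it helps to see how the age filter treats it. A minimal sketch, assuming the same four employees as in the solution (the names `above_30` and `missing_age` are hypothetical): a comparison with NaN is False, so the employee with no recorded age is left out of the result, and `isna()` can list such rows explicitly.

```python
import numpy as np
import pandas as pd

employee = pd.DataFrame({
    'Name': ['Rabia', 'Evan', 'Jia', 'Lalit'],
    'Position': ['Manager', 'Programmer', 'Manager', 'Programmer'],
    'Age': [30, np.nan, 34, 40],
    'Projects': [np.nan, np.nan, 16, 20]})

# NaN > 30 evaluates to False, so Evan (missing age) is excluded
above_30 = employee[employee['Age'] > 30]
print(above_30)

# Rows with a missing age can be listed explicitly
missing_age = employee[employee['Age'].isna()]
print(missing_age)
```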
Question – 6
Write a program to create two series that stores the salary obtained by 3 employees for 2 months (using lists).
Calculate the sum, average and difference in their salaries using Series.
Solution
import pandas as pd
# Both series use the employee names as index
mth1=pd.Series([30000,35000,28000],index=["Ram","Shyam","Mohan"])
mth2=pd.Series([32000,40000,28000],index=["Ram","Shyam","Mohan"])
print("salary of the first month")
print(mth1)
print("Salary of the second month")
print(mth2)
ans='y'
while ans=='y' or ans=='Y':
    print("1.Sum of salaries")
    print("2. Average of salaries")
    print("3. Difference of salaries")
    ch=int(input("enter your choice"))
    if ch==1:
        print("Sum of the salaries")
        # Arithmetic on two series is done element by element
        print(mth1+mth2)
    elif ch==2:
        print("Average of the salaries")
        print((mth1+mth2)/2)
    elif ch==3:
        print("Difference between the salaries")
        print(mth1-mth2)
    else:
        print("Invalid choice")
    ans=input("Do you wish to continue")
Sample output
salary of the first month
Ram 30000
Shyam 35000
Mohan 28000
dtype: int64
Salary of the second month
Ram 32000
Shyam 40000
Mohan 28000
dtype: int64
1.Sum of salaries
2. Average of salaries
3. Difference of salaries
enter your choice1
Sum of the salaries
Ram 62000
Shyam 75000
Mohan 56000
dtype: int64
Do you wish to continuey
1.Sum of salaries
2. Average of salaries
3. Difference of salaries
enter your choice2
Average of the salaries
Ram 31000.0
Shyam 37500.0
Mohan 28000.0
dtype: float64
Do you wish to continuey
1.Sum of salaries
2. Average of salaries
3. Difference of salaries
enter your choice3
Difference between the salaries
Ram -2000
Shyam -5000
Mohan 0
dtype: int64
Do you wish to continuen
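The sum, average and difference above work because pandas matches the values of two series by index label, not by position. A small sketch to illustrate this, assuming the second month is deliberately listed in a different order (that reordering is not in the original solution):

```python
import pandas as pd

mth1 = pd.Series([30000, 35000, 28000], index=["Ram", "Shyam", "Mohan"])
# Same employees and salaries as the second month, listed in a different order
mth2 = pd.Series([28000, 32000, 40000], index=["Mohan", "Ram", "Shyam"])

total = mth1 + mth2      # values are matched by label before adding
average = total / 2
print(total)
print(average)
```

Each employee still gets the correct total; only the order of the rows in the result may differ from the order in `mth1`.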
Question – 7
Create a dataframe with RollNo, Name, Age and Marks of 5 subjects with default index. Write the
commands to do the following operations on the dataframes.
i) Calculate the total marks and display them in the field 'Total'.
ii) Change the index from default to RollNo.
iii) Display the details of 1st and 3rd students.
iv) Add a new row to the dataframe.
Solution
import pandas as pd
# One dictionary per student; the dataframe gets the default index 0, 1, 2
s1={'RollNo':1,'Name':'Arun','Age':15,'Mark1':78,'Mark2':85,'Mark3':90,'Mark4':92,'Mark5':88}
s2={'RollNo':2,'Name':'Bibin','Age':16,'Mark1':98,'Mark2':80,'Mark3':92,'Mark4':96,'Mark5':98}
s3={'RollNo':3,'Name':'Dijo','Age':15,'Mark1':70,'Mark2':80,'Mark3':90,'Mark4':95,'Mark5':97}
students=pd.DataFrame([s1,s2,s3])
print(students)
# i) Total of the five marks
students['Total']=students.Mark1+students.Mark2+students.Mark3+students.Mark4+students.Mark5
print(students)
# iii) Rows at positions 0 and 2 are the first and the third students
print("Details of the first and the third students")
print(students.iloc[0::2,:])
# ii) Use RollNo as the index in place of the default one
print("Dataframe after setting RollNo as index:")
students.set_index('RollNo',inplace=True)
print(students)
print("Details of the first and the third students")
print(students.iloc[0::2])
# iv) Read a new record and add it under its roll number
print("Adding new record")
rno=int(input("Enter roll number"))
nm=input("Enter name")
ag=int(input("Enter age"))
m1=int(input("Enter mark1: "))
m2=int(input("Enter mark2: "))
m3=int(input("Enter mark3: "))
m4=int(input("Enter mark4: "))
m5=int(input("Enter mark5: "))
tot=int(input("Enter total: "))
students.loc[rno]=[nm,ag,m1,m2,m3,m4,m5,tot]
print("Dataframe after insertion of a new row")
print(students)
Sample output
RollNo Name Age Mark1 Mark2 Mark3 Mark4 Mark5
0 1 Arun 15 78 85 90 92 88
1 2 Bibin 16 98 80 92 96 98
2 3 Dijo 15 70 80 90 95 97
RollNo Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
0 1 Arun 15 78 85 90 92 88 433
1 2 Bibin 16 98 80 92 96 98 464
2 3 Dijo 15 70 80 90 95 97 432
Details of the first and the third students
RollNo Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
0 1 Arun 15 78 85 90 92 88 433
2 3 Dijo 15 70 80 90 95 97 432
Dataframe after setting RollNo as index:
Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
RollNo
1 Arun 15 78 85 90 92 88 433
2 Bibin 16 98 80 92 96 98 464
3 Dijo 15 70 80 90 95 97 432
Details of the first and the third students
Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
RollNo
1 Arun 15 78 85 90 92 88 433
3 Dijo 15 70 80 90 95 97 432
Adding new record
Enter roll number4
Enter nameKevin
Enter age15
Enter mark1: 100
Enter mark2: 100
Enter mark3: 100
Enter mark4: 100
Enter mark5: 100
Enter total: 500
Dataframe after insertion of a new row
Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
RollNo
1 Arun 15 78 85 90 92 88 433
2 Bibin 16 98 80 92 96 98 464
3 Dijo 15 70 80 90 95 97 432
4 Kevin 15 100 100 100 100 100 500
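In the run above the total of the new record is typed in by the user, which allows a value that does not match the marks. A minimal sketch of an alternative, assuming the total is derived from the marks instead (the names `mark_cols` and `new_marks` are hypothetical), and selecting the first and third students by roll number with `loc`:

```python
import pandas as pd

students = pd.DataFrame([
    {'RollNo': 1, 'Name': 'Arun', 'Age': 15, 'Mark1': 78, 'Mark2': 85,
     'Mark3': 90, 'Mark4': 92, 'Mark5': 88},
    {'RollNo': 2, 'Name': 'Bibin', 'Age': 16, 'Mark1': 98, 'Mark2': 80,
     'Mark3': 92, 'Mark4': 96, 'Mark5': 98},
    {'RollNo': 3, 'Name': 'Dijo', 'Age': 15, 'Mark1': 70, 'Mark2': 80,
     'Mark3': 90, 'Mark4': 95, 'Mark5': 97},
]).set_index('RollNo')

mark_cols = ['Mark1', 'Mark2', 'Mark3', 'Mark4', 'Mark5']
students['Total'] = students[mark_cols].sum(axis=1)   # row-wise total

# Record from the sample run; the total is computed, not entered
new_marks = [100, 100, 100, 100, 100]
students.loc[4] = ['Kevin', 15] + new_marks + [sum(new_marks)]

print(students.loc[[1, 3]])   # first and third students by roll number
print(students)
```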
Question – 8
Write a Python program to create 2 DataFrames that stores the marks secured by 5 students in 2
examinations and perform the following operations:
i) To create a new data frame containing total marks (adding marks secured
in both exams)
Solution
import pandas as pd
# Both dataframes share the same row labels, so the marks line up when added
exam1=pd.DataFrame({"Name":["Anu","Biju","Cino","Deljith"],"Marks":[50,45,24,48]},
                   index=["Rno1","Rno2","Rno3","Rno4"])
exam2=pd.DataFrame({"Name":["Anu","Biju","Cino","Deljith"],"Marks":[45,40,43,40]},
                   index=["Rno1","Rno2","Rno3","Rno4"])
print("Details of Exam1")
print(exam1)
print("Details of Exam2")
print(exam2)
# New dataframe holding the names and the total of both exams
print("Total marks of two exams")
df3=pd.DataFrame({"Name":exam1.Name,"Total":exam1.Marks+exam2.Marks})
print(df3)
# Sort by total, highest first, and keep the first three rows
print("Details of the top 3 scorers")
df3.sort_values(["Total"],ascending=False,inplace=True)
print(df3.head(3))
Sample output
Details of Exam1
Name Marks
Rno1 Anu 50
Rno2 Biju 45
Rno3 Cino 24
Rno4 Deljith 48
Details of Exam2
Name Marks
Rno1 Anu 45
Rno2 Biju 40
Rno3 Cino 43
Rno4 Deljith 40
Total marks of two exams
Name Total
Rno1 Anu 95
Rno2 Biju 85
Rno3 Cino 67
Rno4 Deljith 88
Details of the top 3 scorers
Name Total
Rno1 Anu 95
Rno4 Deljith 88
Rno2 Biju 85
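The top three scorers can also be obtained without sorting the dataframe in place. A hedged sketch using `DataFrame.nlargest`, with the same marks as in the solution (the names `total` and `top3` are hypothetical):

```python
import pandas as pd

names = ["Anu", "Biju", "Cino", "Deljith"]
rolls = ["Rno1", "Rno2", "Rno3", "Rno4"]
exam1 = pd.DataFrame({"Name": names, "Marks": [50, 45, 24, 48]}, index=rolls)
exam2 = pd.DataFrame({"Name": names, "Marks": [45, 40, 43, 40]}, index=rolls)

total = pd.DataFrame({"Name": exam1.Name,
                      "Total": exam1.Marks + exam2.Marks})

top3 = total.nlargest(3, "Total")   # three highest totals, highest first
print(top3)
print(total)                        # still in roll-number order
```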
Question – 9
a) Write a Python program to create the given DataFrame and also write the code to perform the
following operations.
Population Hospitals Schools
Sample output
Created dataframe is
Population 12691836
Hospitals 208
Schools 8508
Question – 10
Write a Python program to create a series to store the amount of sales made by a salesman in the last year
(whole months) and perform the following operations.
Solution
import pandas as pd
import numpy as np
# Monthly sales for the whole year, January to December
sales=np.array([25000,10000,22500,21750,24000,25000,9500,22500,21750,24000,23000,22800])
ser=pd.Series(sales,index=['jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec'])
print(ser)
# Months with sales above 10000
print(ser[ser>10000])
# Sales of the first four months
print(ser.head(4))
# Sort the sales from highest to lowest
ser.sort_values(ascending=False,inplace=True)
print(ser)
Sample output
jan 25000
feb 10000
mar 22500
apr 21750
may 24000
jun 25000
jul 9500
aug 22500
sep 21750
oct 24000
nov 23000
dec 22800
dtype: int32
jan 25000
mar 22500
apr 21750
may 24000
jun 25000
aug 22500
sep 21750
oct 24000
nov 23000
dec 22800
dtype: int32
jan 25000
feb 10000
mar 22500
apr 21750
dtype: int32
jan 25000
jun 25000
may 24000
oct 24000
nov 23000
dec 22800
mar 22500
aug 22500
apr 21750
sep 21750
feb 10000
jul 9500
dtype: int32
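Sorting with `inplace=True` changes the series itself, so any later `head(4)` would no longer return January to April. A minimal sketch of a non-mutating variant, assuming the sorted result is stored under a separate, hypothetical name `sorted_ser`:

```python
import numpy as np
import pandas as pd

sales = np.array([25000, 10000, 22500, 21750, 24000, 25000,
                  9500, 22500, 21750, 24000, 23000, 22800])
ser = pd.Series(sales, index=['jan', 'feb', 'mar', 'apr', 'may', 'jun',
                              'jul', 'aug', 'sep', 'oct', 'nov', 'dec'])

# Without inplace=True, sort_values returns a new series
sorted_ser = ser.sort_values(ascending=False)
print(sorted_ser)

# The original series keeps its calendar order
print(ser.head(4))
```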