
ICT0513 (Assignment #2-PHYTON) Semester 2 2020/2021 31

Name: Mohamad Luqman Bin Mohamad Shahril Matric No.: 203905 Group No.: 557

Instructions:
● Read the questions carefully and answer ALL questions in this booklet.
● Create ONE folder on the desktop.
● Download the csv file from Google Classroom (File name is Bestseller books.csv)
● Write and paste your answers in the space provided in this question booklet.
● Submit your files by attaching them to Google Classroom and clicking Submit.

Dataset name : Bestseller books.csv Column name : Reviews


Question 1: Write the library needed to read your csv file. 1 mark

import pandas as pd

df=pd.read_csv("bestseller books.csv")

display(df)

Question 2: Write the code(s) to load the content from the dataset. 2 marks
import pandas as pd

df=pd.read_csv("bestseller books.csv")

display(df)
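
The loading step can be sketched on a small scale as follows. The inline sample rows and the names `sample_csv` and `df_sample` are placeholders, not part of the assignment; they stand in for the downloaded file so that the snippet runs without it. With the real file, the same `pd.read_csv` call would receive the file path instead.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the real csv file.
sample_csv = io.StringIO(
    "Name,Reviews\n"
    "Book A,100\n"
    "Book B,250\n"
    "Book C,250\n"
)

# read_csv accepts a file path or a file-like object such as StringIO.
df_sample = pd.read_csv(sample_csv)

print(df_sample.shape)             # number of rows and columns
print(df_sample.head())            # first rows of the table
print(df_sample["Reviews"].dtype)  # type pandas inferred for the column
```

Checking the shape and the column type right after loading is a quick way to confirm that the Reviews column was read as numbers before any statistics are computed on it.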

Question 3: Find the Mean, Median and Mode for your column. Write the code(s) used for
the following tasks: 4 marks
Import needed library
import statistics as stats

Write the codes and display in required decimal places

Mean codes (1 decimal place)
mean=stats.mean(df.Reviews)
print(mean)
print("{:.1f}".format(mean))

Median codes (4 decimal places)
median=stats.median(df.Reviews)
print(median)
print("{:.4f}".format(median))

Mode codes (0 decimal place)
mode=stats.mode(df.Reviews)
print(mode)
print("{:.0f}".format(mode))

Write down the values or result ONLY for : 3 marks


Mean (1 decimal place): 11953.3
Median (4 decimal places): 8580.0000
Mode (0 decimal place): 8580
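
To see how the three measures and the format strings behave, here is a minimal sketch on a short list of made-up review counts (the list `reviews` is hypothetical and not taken from the dataset). It uses the same `statistics` functions and the same `"{:.Nf}".format(...)` pattern as the answers above.

```python
import statistics as stats

# Hypothetical review counts, not taken from the dataset.
reviews = [100, 250, 250, 400, 1000]

mean = stats.mean(reviews)      # sum 2000 divided by 5 values
median = stats.median(reviews)  # middle value of the sorted list
mode = stats.mode(reviews)      # most frequent value

# The digit before "f" sets the number of decimal places shown.
mean_text = "{:.1f}".format(mean)
median_text = "{:.4f}".format(median)
mode_text = "{:.0f}".format(mode)

print(mean_text, median_text, mode_text)  # prints: 400.0 250.0000 250
```

The single large value (1000) pulls the mean above the median, which is the same pattern the Reviews results show.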

Question 4: Find the Standard Deviation and Variance for your column. Write the code(s)
used for the following tasks: 4 marks
Write the codes and display in 2 decimal places

Standard Deviation codes
stdev=stats.stdev(df.Reviews)
print(stdev)
print("{:.2f}".format(stdev))

Variance codes
variance=stats.variance(df.Reviews)
print(variance)
print("{:.2f}".format(variance))

Write down the values ONLY for : 2 marks


Standard Deviation (2 decimal places) Variance (2 decimal places)
11731.13 137619458.41
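
The two values are linked: the variance is the square of the standard deviation. The sketch below shows this on a small made-up list (`reviews` is hypothetical). It also contrasts the sample versions used in the answer, `stats.stdev` and `stats.variance`, which divide by n - 1, with the population versions `stats.pstdev` and `stats.pvariance`, which divide by n.

```python
import math
import statistics as stats

# Hypothetical values, not taken from the dataset.
reviews = [2, 4, 4, 4, 5, 5, 7, 9]

sample_var = stats.variance(reviews)  # squared deviations divided by n - 1
sample_sd = stats.stdev(reviews)      # square root of the sample variance
pop_var = stats.pvariance(reviews)    # squared deviations divided by n
pop_sd = stats.pstdev(reviews)        # square root of the population variance

print("{:.2f}".format(sample_sd), "{:.2f}".format(sample_var))  # prints: 2.14 4.57
print("{:.2f}".format(pop_sd), "{:.2f}".format(pop_var))        # prints: 2.00 4.00

# Squaring the standard deviation recovers the variance.
print(math.isclose(sample_sd ** 2, sample_var))  # prints: True
```

Which pair is appropriate depends on whether the column is treated as a sample or as the whole population; the answer above treats it as a sample.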

Question 5: Find the First Quartile, Second Quartile and Third Quartile for your column.
Write using the code(s) used for the following tasks: 2 marks
Import needed library
import numpy as np

Calculate and display in 1 decimal place

Quartiles codes
# minimum and maximum are used as names so the built-in min() and max() are not overwritten
minimum = np.quantile(df.Reviews, 0)
quartile1 = np.quantile(df.Reviews, 0.25)
quartile2 = np.quantile(df.Reviews, 0.5)
quartile3 = np.quantile(df.Reviews, 0.75)
maximum = np.quantile(df.Reviews, 1)
interquartile = quartile3 - quartile1

print(minimum)
print(quartile1)
print(quartile2)
print(quartile3)
print(maximum)

Write down the values ONLY for : 3 marks


First Quartile Second Quartile Third Quartile
(0 decimal place) (0 decimal place) (0 decimal place)
4058 8580 17253
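
As a small illustration of the quartile calculation, the sketch below runs `np.quantile` on nine made-up values (the array `reviews` is hypothetical). It passes all three fractions in one call and formats the results to 1 decimal place; NumPy's default linear interpolation is assumed.

```python
import numpy as np

# Hypothetical values, not taken from the dataset.
reviews = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# One call returns the first, second and third quartiles together.
q1, q2, q3 = np.quantile(reviews, [0.25, 0.5, 0.75])
iqr = q3 - q1  # interquartile range

print("{:.1f}".format(q1))   # prints: 3.0
print("{:.1f}".format(q2))   # prints: 5.0
print("{:.1f}".format(q3))   # prints: 7.0
print("{:.1f}".format(iqr))  # prints: 4.0
```

The second quartile is the median, which is why the Second Quartile above matches the median found in Question 3.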

Question 6: Plot a histogram for your column. Insert the most suitable title and label the x
and y axes, based on your understanding for the given set of data.
Write the code(s) used. 3 marks

import matplotlib.pyplot as plt
import seaborn as sns

plt.hist(df.Reviews,bins=10,edgecolor='black',linewidth=3, color='grey')

plt.title('Histogram of Reviews')

plt.xlabel('Total Reviews')

plt.ylabel('Number of Books')

plt.show()

Question 7: Label some data into the histogram. The label should be showing mean,
median and mode that have been calculated earlier.
Write the code(s) used for plotting mean, median and mode for the histogram. 3 marks
median_reviews=stats.median(df.Reviews)

plt.axvline(median_reviews, color='yellow', label='Total_Median')

plt.legend()
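
Since the question asks for the mean, median and mode together, one possible way to mark all three is sketched below. The data in `reviews`, the `markers` dictionary, the colours and the label format are illustrative assumptions rather than the required answer. The vertical lines are drawn on the same axes as the histogram, before the figure is shown, so they appear on the same plot.

```python
import statistics as stats

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display; omit in a notebook
import matplotlib.pyplot as plt

# Hypothetical review counts, not taken from the dataset.
reviews = [100, 250, 250, 400, 1000]

fig, ax = plt.subplots()
ax.hist(reviews, bins=5, edgecolor="black", color="grey")

# Name, value and line colour for each statistic to mark.
markers = {
    "Mean": (stats.mean(reviews), "red"),
    "Median": (stats.median(reviews), "yellow"),
    "Mode": (stats.mode(reviews), "blue"),
}
for name, (value, colour) in markers.items():
    # axvline draws a vertical line at the given x position.
    ax.axvline(value, color=colour, linestyle="--",
               label="{}: {:.1f}".format(name, value))

ax.set_title("Histogram of Reviews")
ax.set_xlabel("Total Reviews")
ax.set_ylabel("Number of Books")
ax.legend()  # shows the label of each vertical line
```

In a notebook, calling `plt.show()` after `ax.legend()` displays the figure. With this sample the median and mode lines coincide at 250, so only one of them is visible on the plot, although both appear in the legend.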

Question 8: Plot a boxplot for your column. Insert the title for the boxplot.
Write the code(s) used 3 marks
sns.boxplot(y=df.Reviews, color='red')

plt.title('Boxplot of Reviews')

plt.ylabel('Reviews')

plt.show()
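
To make clear what a boxplot draws, the sketch below uses matplotlib's own `ax.boxplot` in place of seaborn (a swapped-in technique, chosen only to keep the example to one plotting library) on ten made-up values. The array `reviews` is hypothetical, and the default whisker rule of 1.5 times the interquartile range is assumed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical values with one unusually large entry.
reviews = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])

fig, ax = plt.subplots()
box = ax.boxplot(reviews)  # returns a dict of the drawn parts
ax.set_title("Boxplot of Reviews")
ax.set_ylabel("Reviews")

# The line inside the box is the median.
median_drawn = box["medians"][0].get_ydata()[0]
# Points beyond the whiskers are drawn individually as outliers.
outliers = list(box["fliers"][0].get_ydata())

# The same outlier limit computed by hand from the quartiles.
q1, q3 = np.quantile(reviews, [0.25, 0.75])
upper_limit = q3 + 1.5 * (q3 - q1)

print(median_drawn)  # prints: 5.5
print(outliers)      # the single value above upper_limit
print(upper_limit)   # prints: 14.5
```

The box spans the first to third quartile from Question 5, so a boxplot is a picture of the same five-number summary with outliers marked separately.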

Important: Save your file under the name Phyton Assignment, together with your matric
number and ICT group number, as in the example shown below:
1 mark for following the correct format
