Sarthak Python

MID TERM TEST – SECTION –B (Data Analytics with Python)
Time Limit: 40 Minutes

HANDS-ON ASSESSMENT (30)
Note: Attempt all questions on a Python platform and paste the code
for each question, with appropriate comments, into a single Word
file. Submit the Word file as an assignment through Google Classroom
before 12:15 PM. The submission window is scheduled, so late
submissions will not be accepted.
Q1. Write a program to find the maximum and minimum element in each row and
column, and the cumulative sum along each row.
arr = np.array([[1, 5, 6],
[4, 7, 2],
[3, 1, 9]])
(4)
Q2. The Python code below demonstrates sorting in NumPy.
Complete it with NumPy functions so that it produces the output
described by each comment statement:
# Python program to demonstrate sorting in numpy
import numpy as np

a = np.array([[1, 4, 2],
[3, 4, 6],
[0, -1, 5]])

# to get Array elements in Sorted Order

# to get Row Wise Sorted array
# Column wise sort by applying merge sort

dtypes = [('name', 'S10'), ('grad_year', int), ('cgpa', float)]

values = [('Hrithik', 2009, 8.5), ('Ajay', 2008, 8.7),
('Pankaj', 2008, 7.9), ('Aakash', 2009, 9.0)]

arr = np.array(values, dtype = dtypes)
# to sort an array by name
# To Sort an array by graduation year and then cgpa

(5)
Q3. Case Study (20)
In this case study, you are expected to apply the NumPy Python library
to explore a dataset. The dataset we'll be using is a medical dataset
with information about some patients on metrics like glucose, insulin
levels, and other metrics related to diabetes. The assignment will
serve two primary objectives - (a) practice NumPy on a realistic task,
and (b) learn how to get a feel for a large dataset (also known as
data cleaning and data exploration).
Dataset description
The following are the column names: Pregnancies, Glucose,
BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction,
Age, Outcome
Perform the following tasks on the given dataset using the Python
library NumPy:
1. Import the Data Set
2. How many patients does the dataset have information about?
3. What is the blood pressure of the patient number 5 (0-indexed)?
4. What is the age of the patient number 112 (0-indexed)?
In this dataset, Outcome = 0 denotes that the patient does not have
diabetes. And Outcome = 1 denotes that the patient has diabetes.
5. Does patient number 227 (0-indexed) have diabetes?
6. Out of the total patients, how many have diabetes?
7. For the features Glucose, BloodPressure, SkinThickness, Insulin and
BMI (columns 1, 2, 3, 4 and 5, 0-indexed), the values are missing for
some of the patients. Instead of the actual value, the dataset simply
has a 0. Find the total number of missing values.
8. For how many patients is at least one of the features missing? (Be
careful: it is okay for someone to have been pregnant 0 times.)
9. What is the total number of patients who have diabetes in the
dataset?
10. What is the average glucose level in the dataset?
11. What is the average glucose level among the diabetes patients?
12. What is the average glucose level among the non-diabetic people?
ANSWERS
CODE
#################################################### Question 1
import numpy as np
arr = np.array([[1, 5, 6],[4, 7, 2],[3, 1, 9]])
# maximum element of the whole array
print("Largest element is", arr.max())
# maximum and minimum of each row (axis = 1 reduces across the columns)
print("Row-wise max elements:", arr.max(axis = 1))
print("Row-wise min elements:", arr.min(axis = 1))
# maximum and minimum of each column (axis = 0 reduces across the rows)
print("Column-wise max elements:", arr.max(axis = 0))
print("Column-wise min elements:", arr.min(axis = 0))
# sum of all array elements
print("Sum of all array elements:", arr.sum())
# cumulative sum along each row
print("Cumulative sum along each row:\n", arr.cumsum(axis = 1))
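One way to sanity-check the cumulative sum is that its last column should equal the plain row sums. A minimal sketch on the same array (the variable names row_cumsum and row_totals are illustrative and not part of the original answer):

```python
import numpy as np

arr = np.array([[1, 5, 6],
                [4, 7, 2],
                [3, 1, 9]])

# Running total along each row (axis=1 moves left to right within a row)
row_cumsum = arr.cumsum(axis=1)

# The final entry of each running total is that row's sum
row_totals = arr.sum(axis=1)

print(row_cumsum[:, -1])                              # [12 13 13]
print(row_totals)                                     # [12 13 13]
print(np.array_equal(row_cumsum[:, -1], row_totals))  # True
```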
####################################################### Question 2
a = np.array([[1, 4, 2],
[3, 4,6],
[0, -1, 5]])
# Sorted array
print ("Array elements in sorted order:\n", np.sort(a, axis = None))
# Row-wise sorted array

print ("Row-wise sorted array:\n", np.sort(a, axis = 1))
# Column-wise Merge Sort

print ("Column wise sort by applying merge-sort:\n", np.sort(a, axis = 0,
kind = 'mergesort'))
dtypes = [('name', 'S10'), ('grad_year', int), ('cgpa', float)]

values = [('Hrithik', 2009, 8.5), ('Ajay', 2008, 8.7), ('Pankaj', 2008, 7.9),
('Aakash', 2009, 9.0)]
## creating array
arr = np.array(values, dtype = dtypes)
## Sort by names
print ("\nArray sorted by names:\n", np.sort(arr, order = 'name'))
## Sort by grad year and then cgpa
print ("Array sorted by grauation year and then cgpa:\n", np.sort(arr, order
= ['grad_year', 'cgpa']))
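The 'S10' field stores bytes, which is why the sorted names print with a b prefix. A minimal sketch, assuming a Unicode field ('U10') would be acceptable for this exercise, which also uses np.argsort to obtain the ordering as indices rather than as a sorted copy:

```python
import numpy as np

# 'U10' (Unicode, up to 10 characters) is an assumption; the test uses 'S10'
dtypes = [('name', 'U10'), ('grad_year', int), ('cgpa', float)]
values = [('Hrithik', 2009, 8.5), ('Ajay', 2008, 8.7),
          ('Pankaj', 2008, 7.9), ('Aakash', 2009, 9.0)]
arr = np.array(values, dtype=dtypes)

# Sort by graduation year, breaking ties by cgpa
by_year_cgpa = np.sort(arr, order=['grad_year', 'cgpa'])
print(by_year_cgpa['name'])

# argsort returns the positions that would sort the array by name
idx = np.argsort(arr, order='name')
print(idx)
print(arr['name'][idx])
```

The index form is handy when a second array has to be reordered in the same way as the first.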
####################################################### CASE STUDY
###### Question 1
# Importing dataset
dbdata = np.loadtxt('C:/Users/Sarthak Kaushik/Desktop/diabetes.csv',
skiprows=1, delimiter=',')
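np.loadtxt is given skiprows=1 because the header row is text and cannot be parsed as numbers. A minimal sketch using an in-memory file with two made-up rows (the values are illustrative and not taken from diabetes.csv):

```python
import io
import numpy as np

# Hypothetical stand-in for diabetes.csv: a header plus two invented rows
csv_text = (
    "Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,"
    "BMI,DiabetesPedigreeFunction,Age,Outcome\n"
    "2,120,70,30,0,25.5,0.35,40,1\n"
    "0,95,64,0,0,22.1,0.20,23,0\n"
)

# skiprows=1 drops the header line; delimiter=',' splits fields on commas
sample = np.loadtxt(io.StringIO(csv_text), skiprows=1, delimiter=',')

print(sample.shape)  # (2, 9)
print(sample.dtype)  # float64
```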
###### 2 Data set information: shape is (patients, columns)
print(dbdata.shape)
###### 3 Blood pressure (column 2) of patient 5
print(dbdata[5, 2])
###### 4 Age (column 7) of patient 112
print(dbdata[112, 7])
###### 5 Outcome (column 8) of patient 227; 1.0 means the patient has diabetes
print(dbdata[227, 8])
##### 6 No. of patients having diabetes
pat = np.sum(dbdata[:, 8] == 1)
print(pat)
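##### 7 Missing values
# This counts the zeros in column 4 (Insulin) only, not in all of
# columns 1 to 5 named in the question
miss = np.sum(dbdata[:, 4] == 0)
print(miss)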
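Question 7 asks about columns 1 to 5, whereas the line above inspects a single column. A minimal sketch of counting zeros across all five columns with a slice, on a small made-up array (the values are illustrative and not from the dataset):

```python
import numpy as np

# Hypothetical 3-patient sample with the same 9-column layout
sample = np.array([
    [2, 120, 70, 30,   0, 25.5, 0.35, 40, 1],
    [0,  95,  0,  0,   0, 22.1, 0.20, 23, 0],
    [1, 140, 80, 35, 130, 31.0, 0.60, 51, 1],
])

# Columns 1..5: Glucose, BloodPressure, SkinThickness, Insulin, BMI
features = sample[:, 1:6]

# Zeros per column, then the grand total over all five columns
zeros_per_column = np.sum(features == 0, axis=0)
total_missing = np.sum(features == 0)

print(zeros_per_column)  # [0 1 1 2 0]
print(total_missing)     # 4
```

The slice 1:6 stops before column 6, so Pregnancies (column 0) and the last three columns are left out of the count.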
#### 8 Patients with at least one feature missing
# Column 0 (Pregnancies) is left out because 0 is a valid value there
print(np.sum((dbdata[:, 1] == 0) |
(dbdata[:, 2] == 0) |
(dbdata[:, 3] == 0) |
(dbdata[:, 4] == 0) |
(dbdata[:, 5] == 0) |
(dbdata[:, 6] == 0) |
(dbdata[:, 7] == 0)))
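The chain of | comparisons can also be written with np.any along axis = 1. A minimal sketch on a made-up array, assuming the check is limited to columns 1 to 5 as listed in question 7:

```python
import numpy as np

# Hypothetical 3-patient sample with the same 9-column layout
sample = np.array([
    [2, 120, 70, 30,   0, 25.5, 0.35, 40, 1],
    [0,  95,  0,  0,   0, 22.1, 0.20, 23, 0],
    [1, 140, 80, 35, 130, 31.0, 0.60, 51, 1],
])

# True for each patient with a zero in any of columns 1..5
has_missing = np.any(sample[:, 1:6] == 0, axis=1)

# Number of patients with at least one missing feature
n_missing = np.count_nonzero(has_missing)
print(n_missing)  # 2
```

The second patient has 0 pregnancies, but column 0 lies outside the slice, so that zero does not count as missing.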
##### 9 Total no. of patients having diabetes
p = np.sum(dbdata[:, 8] == 1)
print(p)
##### 10 Average of every column; the glucose average is the entry at index 1
avg = np.mean(dbdata, axis=0)
print(avg)
#### 11 Column averages among diabetic patients (glucose at index 1)
d = dbdata[ (dbdata[:, 8] == 1) , :]
a = np.mean(d, axis=0)
print(a)
##### 12 Column averages among non-diabetic patients (glucose at index 1)
nd = dbdata[ (dbdata[:, 8] == 0) , :]
avg = np.mean(nd, axis=0)
print(avg)
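Questions 10 to 12 ask only for glucose, which is column 1. A minimal sketch of selecting that column before averaging, on a small made-up array (the values are illustrative and not from the dataset):

```python
import numpy as np

# Hypothetical 4-patient sample with the same 9-column layout
sample = np.array([
    [2, 120, 70, 30,   0, 25.5, 0.35, 40, 1],
    [0,  95, 64,  0,   0, 22.1, 0.20, 23, 0],
    [1, 140, 80, 35, 130, 31.0, 0.60, 51, 1],
    [3, 105, 72, 28,  90, 27.4, 0.25, 33, 0],
])

glucose = sample[:, 1]   # Glucose column
outcome = sample[:, 8]   # 1 = diabetic, 0 = non-diabetic

mean_all = glucose.mean()
mean_diabetic = glucose[outcome == 1].mean()
mean_non_diabetic = glucose[outcome == 0].mean()

print(mean_all)           # 115.0
print(mean_diabetic)      # 130.0
print(mean_non_diabetic)  # 100.0
```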
OUTPUT
Largest element is 9
Row-wise max elements: [6 7 9]
Row-wise min elements: [1 2 1]
Column-wise max elements: [4 7 9]
Column-wise min elements: [1 1 2]
Sum of all array elements: 38
Cumulative sum along each row:
 [[ 1  6 12]
 [ 4 11 13]
 [ 3  4 13]]
Array elements in sorted order:
 [-1  0  1  2  3  4  4  5  6]
Row-wise sorted array:
 [[ 1  2  4]
 [ 3  4  6]
 [-1  0  5]]
Column wise sort by applying merge-sort:
 [[ 0 -1  2]
 [ 1  4  5]
 [ 3  4  6]]
Array sorted by names:

[(b'Aakash', 2009, 9. ) (b'Ajay', 2008, 8.7) (b'Hrithik', 2009, 8.5)
(b'Pankaj', 2008, 7.9)]
Array sorted by grauation year and then cgpa:
[(b'Pankaj', 2008, 7.9) (b'Ajay', 2008, 8.7) (b'Hrithik', 2009, 8.5)
(b'Aakash', 2009, 9. )]
(768, 9)
74.0
23.0
1.0
268
374
376
268
[ 3.84505208 120.89453125 69.10546875 20.53645833 79.79947917
31.99257812 0.4718763 33.24088542 0.34895833]
[ 4.86567164 141.25746269 70.82462687 22.1641791 100.3358209
35.14253731 0.5505 37.06716418 1. ]
[ 3.298 109.98 68.184 19.664 68.792 30.3042
0.429734 31.19 0. ]
