CS3361 Set2

B.E / B.Tech.
PRACTICAL END SEMESTER EXAMINATIONS, NOVEMBER/DECEMBER 2022

Third Semester
CS3361 – DATA SCIENCE LABORATORY
(Regulations 2021)
Time: 3 Hours        Answer any one Question        Max. Marks: 100

Aim/Principle/Apparatus required/Procedure | Tabulation/Circuit/Program/Drawing | Calculation & Results | Viva-Voce | Record | Total
20 | 30 | 30 | 10 | 10 | 100
1. a. Write a NumPy program to convert an array to a float type
b. Write a NumPy program to add a border (filled with 0's) around an existing array
c. Write a NumPy program to convert a list and tuple into arrays
d. Write a NumPy program to append values to the end of an array
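One possible approach to the four parts of Question 1 is sketched below. The sample arrays are illustrative assumptions, since the question does not supply any input data.

```python
import numpy as np

# a. Convert an integer array to a float type
a = np.array([1, 2, 3, 4])
a_float = a.astype(float)

# b. Add a border of 0's around an existing array (here a 3x3 array of ones)
inner = np.ones((3, 3))
bordered = np.pad(inner, pad_width=1, mode="constant", constant_values=0)

# c. Convert a list and a tuple into arrays
from_list = np.asarray([1, 2, 3])
from_tuple = np.asarray((4, 5, 6))

# d. Append values to the end of an array (np.append returns a new array)
appended = np.append(np.array([10, 20, 30]), [40, 50])

print(a_float)
print(bordered)
print(from_list, from_tuple)
print(appended)
```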
2. a. Write a NumPy program to convert an array to a float type
b. Write a NumPy program to create an empty and a full array
c. Write a NumPy program to convert a list and tuple into arrays
d. Write a NumPy program to find the real and imaginary parts of an array of complex numbers
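Parts a and c of Question 2 repeat Question 1. For parts b and d, a minimal sketch could look like the following; the shapes, fill value and complex numbers are assumptions chosen for illustration.

```python
import numpy as np

# b. Create an empty and a full array.
# np.empty only allocates memory, so its contents are arbitrary until assigned.
empty = np.empty((3, 4))
full = np.full((3, 3), 6)

# d. Real and imaginary parts of an array of complex numbers
z = np.array([1 + 2j, 3 - 4j])
real_part = z.real
imag_part = z.imag

print(empty.shape)
print(full)
print(real_part, imag_part)
```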
3. Write a Pandas program to create and display a DataFrame from a specified dictionary data which
has the index labels.
Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura',
'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Expected Output:
   attempts       name qualify  score
a         1  Anastasia     yes   12.5
b         3       Dima      no    9.0
....
i         2      Kevin      no    8.0
j         1      Jonas     yes   19.0
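A minimal sketch of one way to answer Question 3, using the sample data given above. Recent pandas versions keep the dictionary's insertion order for columns, so the alphabetical column order shown in the Expected Output is reproduced here by sorting the column names; that extra step is an assumption about how the expected output was produced.

```python
import numpy as np
import pandas as pd

exam_data = {
    'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
             'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
    'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
    'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
    'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes'],
}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

# Build the DataFrame with the given index labels
df = pd.DataFrame(exam_data, index=labels)

# Optional: reorder columns alphabetically to match the Expected Output layout
df_sorted = df[sorted(df.columns)]
print(df_sorted)
```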
4. Write a Pandas program to select the rows where the number of attempts in the examination is
greater than 2.
Sample Python dictionary data and list labels: use exam_data and labels as given in Question 3.
Expected Output:
Number of attempts in the examination is greater than 2:
name score attempts qualify
b Dima 9.0 3 no
d James NaN 3 no
f Michael 20.0 3 yes
5. Write a Pandas program to get the first 3 rows of a given DataFrame.

Sample Python dictionary data and list labels: use exam_data and labels as given in Question 3.
Expected Output:
First three rows of the data frame:
a 1 Anastasia yes 12.5
b 3 Dima no 9.0
c 2 Katherine yes 16.5
6. Write a Pandas program to select the rows where the score is missing, i.e. is NaN.

Sample Python dictionary data and list labels: use exam_data and labels as given in Question 3.
Expected Output:
Rows where score is missing:
d 3 James no NaN
h 1 Laura no NaN
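Questions 4 to 6 all select rows from the same DataFrame, so they can be sketched together. The following is one possible approach using the sample data from Question 3; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

exam_data = {
    'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
             'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
    'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
    'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
    'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes'],
}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data, index=labels)

# Question 4: rows where the number of attempts is greater than 2
more_than_two = df[df['attempts'] > 2]

# Question 5: first 3 rows (df.iloc[:3] is an equivalent alternative)
first_three = df.head(3)

# Question 6: rows where the score is missing (NaN)
missing_score = df[df['score'].isna()]

print("Number of attempts in the examination is greater than 2:")
print(more_than_two)
print("First three rows of the data frame:")
print(first_three)
print("Rows where score is missing:")
print(missing_score)
```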
7. Read data from text files, Excel and the web, and explore various commands for performing
descriptive analytics on the Iris data set.
8. Use the diabetes data set from the UCI repository to perform the following:
Apply Univariate analysis:
• Frequency
• Mean
• Median
• Mode
• Variance
• Standard Deviation
• Skewness and Kurtosis
Apply Bivariate analysis:
• Linear and logistic regression modeling
Apply Bivariate analysis:
• Multiple Regression analysis
11. Apply and explore various plotting functions on the Pima Indians Diabetes data set for performing the
following:
a) Normal values
b) Density and contour plots
c) Three-dimensional plotting
12. Apply and explore various plotting functions on the Pima Indians Diabetes data set for performing the
following:
a) Correlation and scatter plots
b) Histograms
13. Apply and explore various plotting functions on the UCI data set for performing the following:
a) Normal values
b) Density and contour plots
14. Apply and explore various plotting functions on the UCI data set for performing the following:
a) Correlation and scatter plots
b) Histograms
15. Write a Pandas program to get the numeric representation of an array by identifying distinct values
of a given column of a dataframe.
Sample Output:
Original DataFrame:
             Name Date_Of_Birth   Age
0  Alberto Franco    17/05/2002  18.5
1    Gino Mcneill    16/02/1999  21.2
2     Ryan Parkes    25/09/1998  22.5
3    Eesha Hinton    11/05/2002  22.0
4    Gino Mcneill    15/09/1997  23.0
Numeric representation of an array by identifying distinct values:
[0 1 2 3 1]
Index(['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes', 'Eesha Hinton'], dtype='object')
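A minimal sketch of one way to answer Question 15, assuming the "given column" is Name and rebuilding the DataFrame from the values shown in the Sample Output. pd.factorize returns integer codes in order of first appearance together with the distinct values.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes',
             'Eesha Hinton', 'Gino Mcneill'],
    'Date_Of_Birth': ['17/05/2002', '16/02/1999', '25/09/1998',
                      '11/05/2002', '15/09/1997'],
    'Age': [18.5, 21.2, 22.5, 22.0, 23.0],
})

# codes: one integer per row; uniques: the distinct names in order of first appearance
codes, uniques = pd.factorize(df['Name'])

print("Original DataFrame:")
print(df)
print("Numeric representation of an array by identifying distinct values:")
print(codes)
print(uniques)
```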
16. Write a Pandas program to check for inequality of two given DataFrames.
Sample Output:
Original DataFrames:
      W     X   Y   Z
0  68.0  78.0  84  86
1  75.0  85.0  94  97
2  86.0   NaN  89  96
3  80.0  80.0  83  72
4   NaN  86.0  86  83
      W   X   Y   Z
0  78.0  78  84  86
1  75.0  85  84  97
2  86.0  96  89  96
3  80.0  80  83  72
4   NaN  76  86  83
Check for inequality of the said dataframes:
       W      X      Y      Z
0   True  False  False  False
1  False  False   True  False
2  False   True  False  False
3  False  False  False  False
4   True   True  False  False
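One possible approach to Question 16 is an element-wise comparison with DataFrame.ne, sketched below with the two DataFrames rebuilt from the Sample Output. Note that NaN compares as unequal to NaN, which is why row 4 of column W is True.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'W': [68, 75, 86, 80, np.nan],
    'X': [78, 85, np.nan, 80, 86],
    'Y': [84, 94, 89, 83, 86],
    'Z': [86, 97, 96, 72, 83],
})
df2 = pd.DataFrame({
    'W': [78, 75, 86, 80, np.nan],
    'X': [78, 85, 96, 80, 76],
    'Y': [84, 84, 89, 83, 86],
    'Z': [86, 97, 96, 72, 83],
})

# Element-wise "not equal" comparison; the result is a DataFrame of booleans
not_equal = df1.ne(df2)

print("Check for inequality of the said dataframes:")
print(not_equal)
```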
17. Write a Pandas program to get first n records of a DataFrame.

Sample Output:
Original DataFrame
   col1  col2  col3
0     1     4     7
1     2     5     5
2     3     6     8
3     4     9    12
4     7     5     1
5    11     0    11
First 3 rows of the said DataFrame:
   col1  col2  col3
0     1     4     7
1     2     5     5
2     3     6     8
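A minimal sketch for Question 17, with the DataFrame rebuilt from the Sample Output and n = 3 assumed as in that output.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 7, 11],
    'col2': [4, 5, 6, 9, 5, 0],
    'col3': [7, 5, 8, 12, 1, 11],
})

n = 3  # number of records to return (assumed value)
first_n = df.head(n)

print("Original DataFrame")
print(df)
print(f"First {n} rows of the said DataFrame:")
print(first_n)
```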
18. Write a Pandas program to select all columns, except one given column in a DataFrame.
Sample Output:
Original DataFrame
   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6    12
3     4     9     1
4     7     5    11
All columns except 'col3':
   col1  col2
0     1     4
1     2     5
2     3     6
3     4     9
4     7     5
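For Question 18, one possible approach is a boolean mask over the column labels, sketched below with the DataFrame rebuilt from the Sample Output. df.drop(columns='col3') would be an equivalent alternative.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 7],
    'col2': [4, 5, 6, 9, 5],
    'col3': [7, 8, 12, 1, 11],
})

# Keep every column whose label is not 'col3'
without_col3 = df.loc[:, df.columns != 'col3']

print("All columns except 'col3':")
print(without_col3)
```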
19. Write a NumPy program to convert a Python dictionary to a NumPy ndarray.

Sample Output:
Original dictionary:
{'column0': {'a': 1, 'b': 0.0, 'c': 0.0, 'd': 2.0},
'column1': {'a': 3.0, 'b': 1, 'c': 0.0, 'd': -1.0},
'column2': {'a': 4, 'b': 1, 'c': 5.0, 'd': -1.0},
'column3': {'a': 3.0, 'b': -1.0, 'c': -1.0, 'd': -1.0}}
Type: <class 'dict'>
ndarray:
[[ 1. 0. 0. 2.]
[ 3. 1. 0. -1.]
[ 4. 1. 5. -1.]
[ 3. -1. -1. -1.]]
Type: <class 'numpy.ndarray'>
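A minimal sketch for Question 19, using the dictionary from the Sample Output. It assumes each outer key ('column0' to 'column3') becomes one row of the result and that the inner keys are read in the order a, b, c, d, which is what the sample ndarray suggests.

```python
import numpy as np

udict = {
    'column0': {'a': 1, 'b': 0.0, 'c': 0.0, 'd': 2.0},
    'column1': {'a': 3.0, 'b': 1, 'c': 0.0, 'd': -1.0},
    'column2': {'a': 4, 'b': 1, 'c': 5.0, 'd': -1.0},
    'column3': {'a': 3.0, 'b': -1.0, 'c': -1.0, 'd': -1.0},
}

inner_keys = ['a', 'b', 'c', 'd']  # fixed order for the values within each row

# One row per outer key, values taken in inner_keys order, stored as floats
arr = np.array([[inner[k] for k in inner_keys] for inner in udict.values()],
               dtype=float)

print("Type:", type(udict))
print("ndarray:")
print(arr)
print("Type:", type(arr))
```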
20. Write a NumPy program to search the index of a given array in another given array.
Sample Output:
Original NumPy array:
[[ 1 2 3]
[ 4 5 6]
[ 7 8 9]
[10 11 12]]
Searched array:
[4 5 6]
Index of the searched array in the original array:
[1]
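One way to approach Question 20 is to compare every row of the original array with the searched array and keep the indices of rows where all elements match. The sketch below uses the arrays from the Sample Output; the variable names are illustrative.

```python
import numpy as np

original = np.array([[1, 2, 3],
                     [4, 5, 6],
                     [7, 8, 9],
                     [10, 11, 12]])
searched = np.array([4, 5, 6])

# Broadcasting compares `searched` with each row; all(axis=1) flags full-row matches
row_matches = (original == searched).all(axis=1)
index = np.where(row_matches)[0]

print("Index of the searched array in the original array:")
print(index)
```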