All Merged IDS Quiz - Merged
Quiz 1
Due Dec 19 at 23:59 Points 10 Questions 40 Available Dec 18 at 19:00 - Dec 19 at 23:59
Time Limit 60 Minutes
Instructions
The Quiz I can be attempted only once. You will not be provided with a make-up quiz if you miss this quiz.
The quiz is timed for one hour and has to be completed once you start. You will not be allowed to go back to a previous question if you skip a
question.
The answers will be visible only three days after the quiz has ended.
Attempt History
Attempt Time Score
LATEST Attempt 1 48 minutes 9.38 out of 10
Springboot
Python
R
SAS
A city conducted a new bi-annual census of its residents. Which of the following most strongly suggests a
cognitive bias in their collected dataset?
The census data includes new data on how many cars are owned by the residents.
Some rows have date of birth in DD-MM-YY and some in DD-MM-YYYY formats
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 2/25
18/12/2022, 23:58 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
The average income in the dataset is 40% higher than the average income of the city’s population from the
previous census 2 years ago.
Hadoop
MongoDB
Amazon S3
Flask
Business Analyst
Visualization Engineer
Data Architect
Data Scientist
Which of the following best describes the difference between the data analyst and data scientist?
Data analysts and data scientists play the same role in the project
Data analysts are more proficient in R whereas data scientists are more proficient in Python
A data analyst does estimation whereas a data scientist predicts and explains it as well
Data analysts just deal with numbers whereas data scientists deal with algorithms
Compute the depth of each bin for the data given below, if the number of bins is 5.
[47, 38, 40, 42, 47, 42, 41, 39, 42, 45, 36, 37, 36, 40, 43, 40, 45, 36, 43, 46]
2.02
2.17
2.2
2.3
y= (2,1,1,2,1,2,1,1,0,0,0)
0.034
50.2
0.64
0.99
Suppose that the minimum and maximum values for an attribute are 4.3 and 7.6, respectively. Compute the
scaled value of 5.4 if min-max normalization is applied to scale [0.0,1.0]. (Answer should have a precision of
X.XX)
0.36
0.33
0.43
0.45
Suppose that the minimum and maximum values for an attribute are 4.3 and 7.6, respectively. Compute the
scaled value of 5.4 if min-max normalization is applied to scale [-1.0,+1.0]. (Answer should have a precision
of X.XX)
0.33
0.76
0.66
-0.33
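The two min-max questions above can be checked with a short sketch. The function name `min_max_scale` and its parameter names are illustrative assumptions; the formula is the standard linear rescaling from the old range to the new one.

```python
def min_max_scale(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map value from [old_min, old_max] to [new_min, new_max]."""
    if old_max == old_min:
        raise ValueError("old_min and old_max must differ")
    fraction = (value - old_min) / (old_max - old_min)
    return new_min + fraction * (new_max - new_min)


# Values taken from the two questions: min 4.3, max 7.6, value 5.4.
unit_scaled = min_max_scale(5.4, 4.3, 7.6)                # target range [0.0, 1.0]
signed_scaled = min_max_scale(5.4, 4.3, 7.6, -1.0, 1.0)   # target range [-1.0, +1.0]
print(f"{unit_scaled:.2f}", f"{signed_scaled:.2f}")
```

Under this formula the value lies one third of the way through the old range, so it maps to about 0.33 on [0.0, 1.0] and about -0.33 on [-1.0, +1.0].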
Identify the data analytics task for the following scenario. A car service showroom manager wants to analyze
his marketing and sales data to understand the reason for drop in sales.
Predictive Analytics
Descriptive Analytics
Diagnostic Analytics
Prescriptive Analytics
The branch of statistics which deals with the development of statistical
methods is classified as ___.
Industry statistics
Economic statistics
Applied statistics
Identify the data analytics task for the following scenario. A physics teacher is analyzing the answer scripts of
the students to identify the areas that he/she should concentrate on so that the students understand the
concepts better.
Predictive Analytics
Descriptive Analytics
Prescriptive Analytics
Diagnostic Analytics
Analyzing the data to determine why some phenomena related to learning happened is a type of ___ analytics.
Descriptive
Prescriptive
Predictive
Diagnostic
What is the name of the Google-developed programming framework that enables the creation of applications
for processing big data sets in a distributed computing environment?
ZooKeeper
Hive
MapReduce
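For readers unfamiliar with the MapReduce programming model, the following is a toy, single-process sketch of its three conceptual steps (map, shuffle, reduce) applied to word counting. It is only an illustration of the idea and does not use Hadoop or any distributed framework; the function names and the input documents are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: combine the values of each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}


documents = ["big data needs big clusters", "data beats opinion"]  # hypothetical input
emitted = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle(emitted))
print(word_counts)
```

In a real cluster the map and reduce calls would run on many machines in parallel; here they run sequentially to keep the data flow visible.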
prescriptive analytics
Predictive analytics
machine learning
The process through which businesses analyse customer data or other types of information in order to find
patterns and links between various data items is known as:
Data mining
Data digging
Consumer engagement
Identify the data analytics task for the following scenario. A project manager is analysing past projects to
identify the software, hardware and human resources that will be required to complete a new project
successfully on time.
Descriptive Analytics
Prescriptive Analytics
Diagnostic Analytics
Predictive Analytics
Fortis-Apollo hospital is planning to design a model which maps patients to the best possible treatments
based on the diagnosis. Identify the data analytics task for this scenario
Diagnostic Analytics
Predictive Analytics
Cognitive analytics
Descriptive Analytics
We are predicting the weather condition as foggy, warm, cloudy, and misty at Bangalore using the data
collected in the last one month. This task is an example of
Regression
Association Rule
Classification
Clustering
Subsetting can be used to select and exclude variables and observations.
True
False
It places less emphasis on the initial planning phases covered in CRISP-DM (Business Understanding and Data
Understanding phases) and omits entirely the Deployment phase.
The SEMMA model also emphasizes data mining as a non-linear, adaptive process.
SEMMA is a logical organisation of the functional tool set of SAS Enterprise Miner for carrying out the core tasks of
data mining.
Ordinal attribute
Numeric attribute
Continuous attribute
Nominal attribute
"Order Fulfilment Date" should come after "Order Creation Date". This is an example of which data quality
aspect:
Timeliness
Consistency
Integrity
Conformity
NULL
Null
Empty
NaN
Partial
Question 29 0.13 / 0.25 pts
Incorrect
Question 31 0 / 0.25 pts
In Python, the function used to find whether there are missing values is:
dropna()
isna()
imputena()
fillna()
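As a hedged illustration of the missing-value question above, the sketch below assumes the question refers to the pandas library. In pandas, `isna()` reports where values are missing, while `dropna()` and `fillna()` act on missing values rather than detect them. The small DataFrame is hypothetical.

```python
import pandas as pd

# Hypothetical records; None marks a missing entry.
df = pd.DataFrame({
    "price": [5118.0, None, 45400.0],
    "num_of_doors": ["four", "two", None],
})

missing_mask = df.isna()                       # same shape, True where a value is missing
missing_per_column = missing_mask.sum()        # count of missing values per column
has_missing = bool(missing_mask.any().any())   # True if any cell is missing
print(missing_per_column)
```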
mean−mode ≈ 3×(mean−median).
mean−median ≈ 3×(mean−mode)
mean−mode ≈ 3×(median-mean).
median-mean ≈ 3×(mean−mode)
A and B
The statistical description (x1 + x2 + . . . + xN)/N, for the data values x1, x2, . . . , xN, is called their
________________
mean
IQR
mode
median
In which phase are the duplicates (of the data) removed? Choose the best possible answer.
Data Preparation
Data Exploration
Data Collection
Data Understanding
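In CRISP-DM, cleaning steps such as removing duplicate rows are usually counted under Data Preparation. A minimal sketch of duplicate removal, using hypothetical customer records and plain Python rather than any particular data library, might look like this:

```python
# Hypothetical customer records; the third row repeats the first.
records = [
    ("C001", "Asha", "Bangalore"),
    ("C002", "Ravi", "Pune"),
    ("C001", "Asha", "Bangalore"),
]

# dict.fromkeys keeps the first occurrence of each key and preserves order.
deduplicated = list(dict.fromkeys(records))
removed = len(records) - len(deduplicated)
print(deduplicated, removed)
```

This treats two records as duplicates only when every field matches exactly; near-duplicates (for example, different spellings of a name) would need a separate matching rule.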
In a box-and-whisker plot of data, point out the FALSE statement about outliers.
Outliers are beyond 1.5 times the Inter Quartile Range (IQR) from the lower quartile
Outliers are beyond 1.5 times the Inter Quartile Range (IQR) from the median
Outliers are beyond the lowest or the highest value in the dataset
Outliers are beyond 1.5 times the Inter Quartile Range (IQR) from the upper quartile
Consider the data set below. How will you (most appropriately) handle the missing values for records 1, 4 and 6,
respectively?
Data set description.
price: continuous from 5118 to 45400.(class label)
normalized-losses: continuous from 65 to 256.
num-of-doors: four, two.
Incorrect
Question 39 0 / 0.25 pts
Equal-frequency binning
Histogram analysis
Correlation analysis
For exploring continuous data using descriptive statistics, which of the following methods is used?
Range
Frequency
Percentage
Histogram
12/18/22, 11:21 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
Due Dec 19 at 23:59 Points 10 Questions 40
Available Dec 18 at 19:00 - Dec 19 at 23:59 Time Limit 60 Minutes
Attempt History
Attempt Time Score
LATEST Attempt 1 57 minutes 8.25 out of 10
data mining
data manipulation
data analytics
Data Wrangling
Recommendation Systems
Privacy Checker
Centralized
Consulting
Decentralized
Federated
i, ii, iii
iv, v
i, iv, v
ii, iii
Hadoop
Flask
Amazon S3
MongoDB
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 3.2. [Answer should have a precision of X.XXXX]
0.8679
0.8769
-0.8679
-0.8769
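The z-score question above can be worked through with a short sketch. The helper name `z_score` is an illustrative assumption; the formula is the usual (value - mean) / standard deviation.

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations value lies from the mean."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (value - mean) / std_dev


# Values from the question: mean 8.9, standard deviation 6.5, value 3.2.
z = z_score(3.2, mean=8.9, std_dev=6.5)
print(f"{z:.4f}")
```

Because 3.2 lies below the mean, the result is negative: (3.2 - 8.9) / 6.5 is roughly -0.8769.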
Compute the depth of each bin for the data given below, if the number of
bins is 5.
[47, 38, 40, 42, 47, 42, 41, 39, 42, 45, 36, 37, 36, 40, 43, 40, 45, 36, 43,
46]
2.3
2.17
2.02
2.2
Compute the depth of each bin for data given below, if the number of bins
is 5.
[47, 38, 40, 42, 47, 42, 41, 39, 42, 45, 36, 37, 36, 40, 43, 40, 45, 36, 43,
46]
3
6
5
4
20/5
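The binning question can be explored with the sketch below. It assumes that "depth" means the number of values per bin under equal-frequency (equal-depth) partitioning, and it also computes the equal-width bin width for comparison, since the two quantities are easy to confuse. The variable names are illustrative.

```python
values = [47, 38, 40, 42, 47, 42, 41, 39, 42, 45,
          36, 37, 36, 40, 43, 40, 45, 36, 43, 46]
num_bins = 5

# Equal-frequency (equal-depth) binning: every bin holds the same number of values.
depth = len(values) // num_bins
ordered = sorted(values)
bins = [ordered[i:i + depth] for i in range(0, len(ordered), depth)]

# Equal-width binning, for comparison: every bin spans the same value range.
width = (max(values) - min(values)) / num_bins

print(depth, width)
print(bins)
```

With 20 values and 5 bins, each equal-depth bin holds 20/5 = 4 values, while the equal-width bin width is (47 - 36)/5 = 2.2.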
Suppose that the minimum and maximum values for the attribute income
are $12,000 and $98,000, respectively. The new range is [0.0,1.0]. Apply
min-max normalization to a value of $73,600.
0.716
0.561
0.856
0.758
Diagnostic
Prescriptive
Descriptive
Predictive
Incorrect
Question 12 0 / 0.25 pts
Diagnostic
Descriptive
Prescriptive
Predictive
A scenario where you visited a doctor for your fever, and the doctor
asked you questions about your condition and tried to understand how
it might have happened. Now the doctor comes to a conclusion based
on your responses as normal fever. But to be sure about it and ensure
there are no underlying health problems or other risk factors that make
it more likely that you won't recover as quickly, the doctor performs
some tests and asks to see previous medical reports. Now, based on
this, the doctor is sure that it is a regular fever and that it will go away
in 5 days. The last part of the narration, which is derived from above-
Prescriptive
Predictive
Diagnostic
Descriptive
Assess Situation - "Costs and Benefits" - falls into which phase of CRISP-DM?
Data modeling
Evaluation
Business Understanding
Data Preparation
Calculating a room's temperature (in Celsius) based on other
environmental factors (such as atmospheric pressure, humidity etc).
Predicting if a cricket player is a batsman or bowler, given his playing
record.
Finding the shorter path between two already-existing routes between two
locations.
Predict traffic congestion along a specific route between two locations
using vehicle journey times.
Incorrect
Question 17 0 / 0.25 pts
In an exam, a mark below 40 (out of 100) is considered a fail. The task of
finding the names of the failed students with marks between 10
and 25 is
Failure Analytics
Diagnostic Analytics
Descriptive Analytics
Regression is
(1) Prediction of a value of a given continuous valued variable based on
the values of other variables.
(2) Regression works with a linear or nonlinear model of dependency
among the variables.
Option 2
Option 1
None
Both 1& 2
Identify the data analytics task for the following scenario. A project
manager is analysing past projects to identify the software, hardware and
human resources that will be required to complete a new project
successfully on time.
Prescriptive Analytics
Predictive Analytics
Descriptive Analytics
Diagnostic Analytics
The best method for predicting the number of deaths due to covid is
Regression
Correlation
Classification
Clustering
Incorrect
Question 22 0 / 0.25 pts
We are predicting the weather condition as foggy, warm, cloudy, and misty
at Bangalore using the data collected in the last one month. This task is
an example of
Classification
Clustering
Regression
Association Rule
Incorrect
Question 23 0 / 0.25 pts
Nominal attribute
Ordinal attribute
Interval attribute
Ratio attribute
Exam Grades
Academic ranks
Zip codes
Military ranks
Structured format
none
both
Raw format
Numeric attribute
Nominal attribute
Ordinal attribute
Continuous attribute
Interval attribute
Nominal attribute
Ratio attribute
Ordinal attribute
Null
Empty
NaN
NULL
In a boxplot, where Q1, Q2 and Q3 are the first, second and third quartiles
respectively, the interquartile range IQR is calculated as:
IQR = Q3 – Q1
IQR = Q2-Q1
IQR = Q3-Q2
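To make the interquartile range concrete, here is a small sketch that computes the quartiles, IQR = Q3 - Q1, and the 1.5 × IQR fences commonly used to flag boxplot outliers. The sample data is hypothetical, and the "inclusive" quartile convention is one assumption among several; other conventions give slightly different cut points.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # hypothetical sample with one extreme value

# method="inclusive" treats the data as the whole population; other quartile
# conventions give slightly different cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Points beyond 1.5 * IQR below Q1 or above Q3 are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(q1, q3, iqr, outliers)
```

Note that the fences are measured from the lower and upper quartiles, not from the median.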
True
False
In which phase are the duplicates (of the data) removed? Choose the
best possible answer.
Data Exploration
Data Preparation
Data Collection
Data Understanding
Suppose that the mean and standard deviation of the values for an
attribute are 6500 and 2000, respectively. Apply z-score normalization to
value of 8000.
.5
.75
.7
.6
For a data analytics task to analyse feedback on her subject for a class of
60 students, a school teacher decided to use the survey submitted by the
ten students who come for tuitions for that subject, at her home. Identify
the type of sampling she is doing.
Stratified sampling
Systematic Sampling
“Similarity” means:
The number of tuples in a database whose attributes have similar values.
True
False
Systematic
Probabilistic
Representative
Qualitative
5
5.5
6
2
2
3
1
4
There are two sets X={10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24} and Y = {-30, -31, -32, -33, -34, -35, -36, -37, -38, -39, -40, -41,
-42, -43, -44}. What is TRUE about the standard deviations of X and Y i.e.
σX and σY respectively?
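The answer options for this question did not survive in the text, but the comparison itself is easy to check. Both sets consist of 15 consecutive integers, and standard deviation is unaffected by shifting or negating the values. A minimal sketch, using the population standard deviation as an assumption about which definition is intended:

```python
import math
import statistics

x = list(range(10, 25))          # 10, 11, ..., 24
y = [-v for v in range(30, 45)]  # -30, -31, ..., -44

sigma_x = statistics.pstdev(x)   # population standard deviation
sigma_y = statistics.pstdev(y)
print(sigma_x, sigma_y, math.isclose(sigma_x, sigma_y))
```

The same equality holds for the sample standard deviation (`statistics.stdev`), since both sets have the same spread around their own means.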
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
Due Dec 19 at 11:59pm Points 10 Questions 40
Available Dec 18 at 7pm - Dec 19 at 11:59pm Time Limit 60 Minutes
Attempt History
Attempt Time Score
processing data
analysing data
organizing data
Incorrect
Question 3 0 / 0.25 pts
i, ii, iii
iv, v
ii, iii
i, iv, v
True
False
data mining
data analytics
data manipulation
5
8
6
7
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 3.2. [Answer should have a precision of X.XXXX]
0.8769
-0.8679
0.8679
-0.8769
Compute the width of each bin for data given below, if the number of
bins is 4.
[39, 45, 49, 45, 31, 37, 38, 41, 37, 41, 39, 34, 35, 30, 47, 43, 44, 46,
48, 36]
4
3
5
(49-30)/4
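The bin-width question can be worked through as follows, assuming equal-width partitioning where the width is (max - min) / number of bins. The variable names are illustrative.

```python
values = [39, 45, 49, 45, 31, 37, 38, 41, 37, 41,
          39, 34, 35, 30, 47, 43, 44, 46, 48, 36]
num_bins = 4

low, high = min(values), max(values)
width = (high - low) / num_bins   # (49 - 30) / 4

# Bin edges: low, low + width, ..., high
edges = [low + i * width for i in range(num_bins + 1)]
print(width, edges)
```

Under this definition the width is 19/4 = 4.75; some textbooks round this up to a whole number so that the bins have integer boundaries.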
6
5
4
7
6
Classification
Clustering
Regression
Optimization
A scenario where you feel unwell and go to a doctor. The doctor asks
you questions like were you exposed to rain or cold climate or did you
have contact with a sick person or did you have food from outside etc.
Based on your answers, the doctor comes to a conclusion. This can be
considered analogous to which stage of data analytics?
Descriptive
Predictive
Prescriptive
Diagnostic
Classification
Association Rule
Clustering
Regression
A company wants to find the target segment of people for one of its
products. Which modelling technique would be generally appropriate
Clustering
Classification
Ranking
Regression
True
False
Data pre-processing improves data quality and makes data mining
algorithms efficient and effective. What are the data pre-processing
tasks along with data cleaning?
Data Transformation
Data Integration
Data Reduction
Assess Situation - "Costs and Benefits" - falls into which phase of CRISP-DM?
Data Preparation
Data modeling
Evaluation
Business Understanding
Incorrect
Question 19 0 / 0.25 pts
Identify the data analytics task for the following scenario. A software
programmer develops a tool to convert programs written in Verilog to
Python.
Predictive Analytics
Prescriptive Analytics
Diagnostic Analytics
Descriptive Analytics
Incorrect
Question 20 0 / 0.25 pts
Identify the data analytics task for the following scenario. A student is
analyzing various blogs and vlogs to find out the skill set that has to be
acquired to become a data scientist in the future.
Predictive Analytics
Prescriptive Analytics
Diagnostic Analytics
Descriptive Analytics
In 2001, Big Data created the three Vs: volume, velocity, and variety.
The V's have grown to encompass veracity and value in the years
since. Big data is sometimes subjected to a fifth V, which is:
Volatile
Vector
Variability
Vulnerability
Applied statistics
Industry statistics
Economic statistics
What are the best practices for implementing big data analytics
programmes?
Focusing on business goals and how to use big data analytics
technologies to meet them
Adopting data analysis tools based on a laundry list of their capabilities
Zip codes
Exam Grades
Military ranks
Academic ranks
BeautifulSoup
WebCrawler
Scraper
WebSpider
The distance between categories is equal across the range of
interval/ratio data
Ordinal variables have a fixed zero point, whereas interval/ratio
variables do not
Variables
Features
Dimensions
Data point
Symmetric attribute
Discrete attribute
Continuous attribute
Asymmetric attribute
Data retrieval
Data cleansing
Data transformation
Data combining
MEDIAN
MEAN
MODE
RANGE
In which phase are the duplicates (of the data) removed? Choose the
best possible answer.
Data Preparation
Data Collection
Data Understanding
Data Exploration
Matrix
TreeMap
Lexeme
ConeTree
There are two sets X={10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24} and Y = {-30, -31, -32, -33, -34, -35, -36, -37, -38, -39, -40, -41,
-42, -43, -44}. What is TRUE about the standard deviations of X and Y
i.e. σX and σY respectively?
Systematic Sampling
Stratified sampling
Which among the following are valid methods of handling missing data
Q and R
R only
P, Q and R
For the same real world entity, resolving attribute values from different
sources
I. The smaller data sets resulting from data reduction require less
memory and processing time.
Statement I and II
Statement I and IV
mean of a value
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 18/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 19/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
Due Dec 19 at 11:59pm Points 10 Questions 40
Available Dec 18 at 7pm - Dec 19 at 11:59pm Time Limit 60 Minutes
Instructions
The Quiz I can be attempted only once. You will not be provided with a make-up Quiz, if you miss
this Quiz.
The quiz is timed for one hour and has to be completed once you start. You will not be allowed to go
back to the previous question if you skip a question.
The answers will be visible only after three days once the quiz has ended.
Attempt History
Attempt Time Score
processing data
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 1/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
analysing data
organizing data
Incorrect
Question 3 0 / 0.25 pts
i, ii, iii
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 2/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
iv, v
ii, iii
i, iv, v
True
False
data mining
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 3/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
data analytics
data manipulation
5
8
6
7
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 3.2. [Answer should have a precision of X.XXXX]
0.8769
-0.8679
0.8679
-0.8769
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 4/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Compute the width of each bin for data given below, if the number of
bins is 4.
[39, 45, 49, 45, 31, 37, 38, 41, 37, 41, 39, 34, 35, 30, 47, 43, 44, 46,
48, 36]
4
3
5
(49-30)/4
6
5
4
7
6
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 5/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Classification
Clustering
Regression
Optimization
A scenario where you feel unwell and go to a doctor. The doctor asks
you questions like were you exposed to rain or cold climate or did you
have contact with a sick person or did you have food from outside etc.
Based on your answers,doctor came to the conclusion. This can be
considered analogous to which stage of data analytics?
Descriptive
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 6/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Predictive
Prescriptive
Diagnostic
Classification
Association Rule
Clustering
Regression
A company wants to find the target segment of people for one of its
products. Which modelling technique would be generally appropriate
Clustering
Classification
Ranking
Regression
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 7/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
True
False
Data pre-processing improves the data quality and make data mining
algorithms efficient and effective. What are the data pre-processing
tasks along with Data cleaning?
Data Transformation
Data Integration
Data Reduction
Access Situation - "Cost and Benefits" - Falls into which phase of Crisp
- DM
Data Preparation
Data modeling
Evaluation
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 8/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Business Understanding
Incorrect
Question 19 0 / 0.25 pts
Identify the data analytics task for the following scenario. A software
programmer develops a tool to convert programs written in Verilog to
Python.
Predictive Analytics
Prescriptive Analytics
Diagnostic Analytics
Descriptive Analytics
Incorrect
Question 20 0 / 0.25 pts
Identify the data analytics task for the following scenario. A student is
analyzing various blogs and vlogs to find out the skill set that has to be
acquired to become a data scientist in the future.
Predictive Analytics
Prescriptive Analytics
Diagnostic Analytics
Descriptive Analytics
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 9/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
In 2001, Big Data created the three Vs: volume, velocity, and variety.
The V's have grown to encompass veracity and value in the years
since. Big data is sometimes subjected to a fifth V, which is:
Volatile
Vector
Variability
Vulnerability
Applied statistics
Industry statistics
Economic statistics
What are the best practices for implementing big data analytics
programmes?
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 10/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Focusing on business goals and how to use big data analytics
technologies to meet them
Adopting data analysis tools based on a laundry list of their capabilities
Zip codes
Exam Grades
Military ranks
Academic ranks
BeautifulSoup
WebCrawler
Scraper
WebSpider
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 11/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
The distance between categories is equal across the range of
interval/ratio data
Ordinal variables have a fixed zero point, whereas interval/ratio
variables do not
Variables
Features
Dimensions
Data point
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 12/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Symmetric attribute
Discrete attribute
Continuous attribute
Asymmetric attribute
Data retrieval
Data cleansing
Data transformation
Data combining
MEDIAN
MEAN
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 13/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
MODE
RANGE
In which phase, the duplicates (of the data) are removed? Choose the
best possible answer.
Data Preparation
Data Collection
Data Understanding
Data Exploration
Matrix
TreeMap
Lexeme
ConeTree
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 14/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
There are two sets X={10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24} and Y = {-30, -31, -32, -33, -34, -35, -36, -37, -38, -39, -40, -41,
-42, -43, -44}. What is TRUE about the standard deviations of X and Y
i.e. σX and σY respectively?
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 15/19
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Systematic Sampling
Stratified sampling
Which among the following are valid methods of handling missing data?
Q and R
R only
P, Q and R
For the same real world entity, resolving attribute values from different
sources
I. The smaller data sets resulting from data reduction require less
memory and processing time.
Statement I and II
Statement I and IV
mean of a value
18/12/2022, 23:02 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
Attempt History
Attempt Time Score
LATEST Attempt 1 60 minutes 6.63 out of 10
Feature selection
Deployment
Data warehousing
True
False
Incorrect
Question 4 0 / 0.25 pts
"Take last week's data and predict the sales for the next six months within
the next few days with a two-person team" is a good example of which of the
following data science challenges?
Unrealistic expectations
Data sizing
Data Analytics
Data manipulation
Data mining
Suppose that the minimum and maximum values for an attribute are 4.3
and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale it to [0.0, 1.0]. (Answer should have a
precision of X.XX.)
0.45
0.36
0.43
0.33
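The min-max computation in the question above can be sketched as follows. This is a minimal illustration, not part of the quiz; the function name `min_max_scale` and its parameters are assumptions made for the example.

```python
def min_max_scale(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map value from [old_min, old_max] to [new_min, new_max]."""
    ratio = (value - old_min) / (old_max - old_min)
    return new_min + ratio * (new_max - new_min)


# Values from the question: min 4.3, max 7.6, value 5.4, target range [0.0, 1.0].
scaled = min_max_scale(5.4, 4.3, 7.6)
print(f"{scaled:.2f}")  # prints 0.33
```

The same function covers a target range other than [0.0, 1.0] by passing `new_min` and `new_max`.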
1.1
1.2
1.0
1.3
Compute the depth of each bin for data given below, if the number of
bins is 5.
[47, 38, 40, 42, 47, 42, 41, 39, 42, 45, 36, 37, 36, 40, 43, 40, 45, 36, 43,
46]
4
20/5
5
6
3
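For the bin-depth question above, equal-depth (equal-frequency) binning can be sketched as below. The sketch assumes the number of values divides evenly by the number of bins, which holds for the 20 values given; the variable names are illustrative.

```python
# Data and bin count from the question.
data = [47, 38, 40, 42, 47, 42, 41, 39, 42, 45,
        36, 37, 36, 40, 43, 40, 45, 36, 43, 46]
num_bins = 5

# In equal-depth binning every bin holds the same number of values.
depth = len(data) // num_bins

# Sort the values, then cut the sorted list into consecutive chunks of that size.
ordered = sorted(data)
bins = [ordered[i:i + depth] for i in range(0, len(ordered), depth)]

print(depth)
for b in bins:
    print(b)
```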
11
8
10
7
Identify the data analytics task for the following scenario. A physics
teacher is analyzing the answer scripts of the students to identify the
areas that he/she should concentrate on so that the students
understand the concepts better.
Predictive Analytics
Descriptive Analytics
Prescriptive Analytics
Diagnostic Analytics
Data digging
Data mining
Consumer engagement
Prescriptive
Diagnostic
Predictive
Descriptive
In 2001, big data was characterized by three Vs: volume, velocity, and variety.
The Vs have since grown to encompass veracity and value. Big data is
sometimes described with a further V, which is:
Variability
Volatile
Vulnerability
Vector
Descriptive analysis
Diagnostic analysis
Predictive analysis
Prescriptive analysis
Subsetting can be used to select and exclude variables and
observations
Merging concerns combining datasets on the same observations to
produce a result
Regression
Clustering
Association Rule
Classification
Applied statistics
Industry statistics
Economic statistics
Incorrect
Question 19 0 / 0.25 pts
Descriptive analytics
Predictive analytics
Prescriptive analytics
True
False
A company wants to find the target segment of people for one of its
products. Which modelling technique would generally be appropriate?
Regression
Ranking
Classification
Clustering
Various modeling techniques are selected and applied, and their
parameters are calibrated to optimal values
Thorough evaluation indeed is needed, yet the CRISP-DM methodology
does not prescribe how to do this.
It very much underestimates the amount of real experimentation that is
needed to arrive at viable results
The end-users of the analytical model are required to post-rationalize
the model, which leads to a lot of dissatisfaction
Incorrect
Question 23 0 / 0.25 pts
True
False
Structured
Semi-structured
Quasi-structured
Unstructured
Incorrect
Question 25 0 / 0.25 pts
Continuous attribute
Symmetric attribute
Asymmetric attribute
Discrete attribute
Incorrect
Question 26 0 / 0.25 pts
It places less emphasis on the initial planning phases covered in
CRISP-DM (Business Understanding and Data Understanding phases)
and omits entirely the Deployment phase.
SEMMA is focused on the model development aspects of data mining.
SEMMA is a logical organisation of the functional tool set of SAS
Enterprise Miner for carrying out the core tasks of data mining.
The SEMMA model also emphasizes data mining as a non-linear,
adaptive process.
Response to 'Do you own a car?' can be considered symmetric binary
Response to 'Do you have a rare disease?' can be considered
symmetric binary.
mean − median ≈ 3 × (mean − mode)
mean − mode ≈ 3 × (median − mean)
median − mean ≈ 3 × (mean − mode)
mean − mode ≈ 3 × (mean − median)
Data combining
Data cleansing
Data transformation
Data retrieval
Incorrect
Question 31 0 / 0.25 pts
Identify the sampling technique used in the following use case. For an ML
classification task, the algorithm requires that the test set has an equal
number of examples from each of the three categories.
Simple Random
Cluster Sampling
Stratified Random
Systematic Sampling
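One way to obtain an equal number of examples from each category is stratified random sampling with equal allocation: group the records by class, then draw the same number at random from each group. The sketch below uses only the standard library; the function `stratified_sample`, the labels "a", "b", "c", and the record layout are hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(records, label_of, per_class, seed=0):
    """Draw per_class records at random from every class (stratum)."""
    rng = random.Random(seed)            # seeded for a repeatable draw
    groups = defaultdict(list)
    for record in records:
        groups[label_of(record)].append(record)
    sample = []
    for label in sorted(groups):
        sample.extend(rng.sample(groups[label], per_class))
    return sample


# Hypothetical data set with three unevenly sized categories.
records = ([("a", i) for i in range(15)]
           + [("b", i) for i in range(10)]
           + [("c", i) for i in range(5)])

test_set = stratified_sample(records, label_of=lambda r: r[0], per_class=3)
counts = {label: sum(1 for r in test_set if r[0] == label) for label in "abc"}
print(counts)
```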
Incorrect
Question 32 0 / 0.25 pts
Partial
Question 33 0.13 / 0.25 pts
Find the mean salary of the available 90% of the data and use that to fill
in all the missing data.
Find the field wise mean salary for the available data and fill in the
missing salary data with the applicable mean salary.
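The two options above describe mean imputation over the whole data set and mean imputation within each field. A minimal sketch of the difference follows; the records, field names, and salary figures are hypothetical and serve only to illustrate the two strategies.

```python
from statistics import mean

# Hypothetical records; None marks a missing salary.
employees = [
    {"field": "engineering", "salary": 100.0},
    {"field": "engineering", "salary": 120.0},
    {"field": "engineering", "salary": None},
    {"field": "sales", "salary": 60.0},
    {"field": "sales", "salary": 80.0},
    {"field": "sales", "salary": None},
]

# Strategy 1: one overall mean computed from all available salaries.
known = [e["salary"] for e in employees if e["salary"] is not None]
overall_mean = mean(known)

# Strategy 2: a separate mean for each field.
field_means = {}
for field in {e["field"] for e in employees}:
    field_means[field] = mean(
        e["salary"] for e in employees
        if e["field"] == field and e["salary"] is not None
    )

global_fill = [e["salary"] if e["salary"] is not None else overall_mean
               for e in employees]
field_fill = [e["salary"] if e["salary"] is not None else field_means[e["field"]]
              for e in employees]

print(global_fill)
print(field_fill)
```

In this toy data the field-wise fill keeps the gap between the two fields, while the overall fill pulls both missing values toward the same number.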
Data Integration is a
Generalization technique
Pre-processing technique
Incorrect
Question 36 0 / 0.25 pts
“Similarity” means:
The number of tuples in a database whose attributes have similar
values.
Incorrect
Question 38 0 / 0.25 pts
5
5.5
6
2
In which phase are the duplicates of the data removed? Choose the
best possible answer.
Data Collection
Data Requirements
Data Understanding
Data Preparation
Unanswered
Question 40 0 / 0.25 pts
Mostly True
True
False
12/18/22, 8:59 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
False
True
True
False
Data Wrangling
Which of the following are the reasons for the sudden growth of
analytics?
Large number of user friendly analytics tools available for data
processing
Data manipulation
Data mining
Data Analytics
Feature selection
Data warehousing
Deployment
Compute the width of each bin for data given below, if the number of
bins is 5.
[39, 45, 49, 45, 31, 37, 38, 41, 37, 41, 39, 34, 35, 30, 47, 43, 44, 46,
48, 36]
5
4
(49−30)÷5
6
3
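For the bin-width question above, equal-width binning divides the range of the data by the number of bins. The sketch below shows that calculation on the given values; the variable names are illustrative, and some textbooks round the width up to a whole number, which this sketch does not do.

```python
# Data and bin count from the question.
data = [39, 45, 49, 45, 31, 37, 38, 41, 37, 41,
        39, 34, 35, 30, 47, 43, 44, 46, 48, 36]
num_bins = 5

# Width of each bin: (max - min) / number of bins, here (49 - 30) / 5.
width = (max(data) - min(data)) / num_bins

# Bin boundaries, starting at the minimum and stepping by the width.
edges = [min(data) + i * width for i in range(num_bins + 1)]

print(width)
print(edges)
```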
56.739
5673.9
5.6739
567.39
150W
100W
90W
98W
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 3.2. [Answer should have a precision of X.XXXX]
-0.8679
-0.8769
0.8679
0.8769
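The z-score computation in the question above subtracts the mean and divides by the standard deviation. A minimal sketch, with the function name `z_score` chosen for illustration:

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev


# Values from the question: mean 8.9, standard deviation 6.5, value 3.2.
z = z_score(3.2, 8.9, 6.5)
print(f"{z:.4f}")  # prints -0.8769
```

The sign is negative because 3.2 lies below the mean.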
None
Operationalize
Model building
Model planning
In 2001, big data was characterized by three Vs: volume, velocity, and variety.
The Vs have since grown to encompass veracity and value. Big data is
sometimes described with a further V, which is:
Vector
Variability
Vulnerability
Volatile
Predictive analysis
Prescriptive analysis
Diagnostic analysis
Descriptive analysis
Descriptive Analytics
Predictive Analytics
Prescriptive Analytics
Diagnostic Analytics
Identify the data analytics task for the following scenario. The sales of
various products of your organization per month per geographical area
is reported using an interactive visual tool.
Descriptive Analytics
Prescriptive Analytics
Predictive Analytics
Diagnostic Analytics
Finding the shorter path between two already-existing routes between
two locations.
Predict traffic congestion along a specific route between two locations
using vehicle journey times.
Calculating a room's temperature (in Celsius) based on other
environmental factors (such as atmospheric pressure, humidity etc).
Predicting if a cricket player is a batsman or bowler, given his playing
record.
Consider a scenario where you feel unwell and go to a doctor. The doctor asks
you questions such as whether you were exposed to rain or a cold climate, had
contact with a sick person, or ate food from outside. Based on your answers,
the doctor comes to a conclusion. This can be considered analogous to which
stage of data analytics?
Diagnostic
Descriptive
Predictive
Prescriptive
Incorrect
Question 19 0 / 0.25 pts
Data Preparation, Modeling
Data Understanding, Evaluation
Modeling, Data Preparation
Evaluation, Business understanding
Cassandra Big-data
Tableau Visualization
SAS Statistics
Incorrect
Question 21 0 / 0.25 pts
Data analytics is the pursuit of extracting meaning from raw data using
specialized computer systems.
Data Analytics refers to the techniques used to analyze data to enhance
productivity and business gain.
Analytics is a process in which a computer examines information using
mathematical methods to find useful patterns.
Incorrect
Question 22 0 / 0.25 pts
It focuses on transforming the data’s format by converting raw data into
another format
WebSpider
BeautifulSoup
Scraper
WebCrawler
True
False
Ordinal attribute
Continuous attribute
Numeric attribute
Nominal attribute
True
False
In which phase are the duplicates of the data removed? Choose the
best possible answer.
Data Requirements
Data Preparation
Data Understanding
Data Collection
Find the mean salary of the available 90% of the data and use that to fill
in all the missing data.
Find the field wise mean salary for the available data and fill in the
missing salary data with the applicable mean salary.
A. it can’t be done
One Hot Encoding scales well as the number of class labels increases
Data cleansing
Data retrieval
Data combining
Data transformation
Which among the following are valid methods of handling missing data?
Q and R
P, Q and R
R only
Identify the sampling technique used in the following use case. For an ML
classification task, the algorithm requires that the test set has an equal
number of examples from each of the three categories.
Stratified Random
Systematic Sampling
Simple Random
Cluster Sampling
Data Integration is a
Generalization technique
Pre-processing technique
Probabilistic
Systematic
Qualitative
Representative
12/18/22, 9:41 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
Consulting
Coordinational
Functional
Center of Excellence
Data Wrangling
Hadoop
MongoDB
Flask
Amazon S3
True
False
Recommendation Systems
Privacy Checker
Compute the depth of each bin for the data given below, if the number
of bins is 5.
[47, 38, 40, 42, 47, 42, 41, 39, 42, 45, 36, 37, 36, 40, 43, 40, 45, 36,
43, 46]
2.2
2.3
2.17
2.02
56.739
567.39
5673.9
5.6739
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 10.7.
0.2679
-0.2679
0.2769
-0.2769
Suppose that the minimum and maximum values for an attribute are
4.3 and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale it to [0.0, 1.0]. (Answer should have a
precision of X.XX.)
0.38
0.30
0.33
0.29
Match with the most appropriate answer, related to the tools available
to a Data Science life cycle.
SMAM 8 stages
SEMMA 5 stages
CRISP-DM 6 stages
Analytics is a process in which a computer examines information using
mathematical methods to find useful patterns.
Data Analytics refers to the techniques used to analyze data to enhance
productivity and business gain.
Data analytics is the pursuit of extracting meaning from raw data using
specialized computer systems.
home with medicines. He also has instructions to rest and drink plenty
of fluids and return to the hospital if the fever doesn't subside in a
week. This can be considered analogous to which stage of data
analytics
Prescriptive
Predictive
Descriptive
Diagnostic
a gateway that offers access to a variety of vital information from many
different sources on one screen
methods for predicting future behavior using statistical analysis and
data mining, particularly to maximize the strategic value of corporate
intelligence
a method that uses feature analysis to predict people in photos and
tags them to other photos on its own
software applications, also known as "bots," that are dispatched to carry
out a mission and gather data from web pages on behalf of a user
Prescriptive
Predictive
Descriptive
Diagnostic
Data Wrangling
Artificial Intelligence
Deep Learning
Machine Learning
Incorrect
Question 17 0 / 0.25 pts
It focuses on transforming the data’s format by converting raw data into
another format
Model building
None
Model planning
Operationalize
Incorrect
Question 19 0 / 0.25 pts
Identify the data analytics task for the following scenario. A project
manager is analysing past projects to identify the software, hardware
and human resources that will be required to complete a new project
successfully on time.
Diagnostic Analytics
Prescriptive Analytics
Descriptive Analytics
Predictive Analytics
Identify the data analytics task for the following scenario. A program
manager wants to analyze user clickstream data from the mobile app to
understand how many users were using a particular feature that was
rolled out.
Diagnostic Analytics
Descriptive Analytics
Predictive Analytics
Prescriptive Analytics
True
False
Identify the data analytics task for the following scenario. A mother
analysed the answer scripts of her daughter and found that they should
concentrate on reading comprehension to score better in the upcoming
exams.
Prescriptive Analytics
Predictive Analytics
Diagnostic Analytics
Descriptive Analytics
Duplicate Data
Inconsistent data
Noisy data
Incomplete data
Nominal attribute
Ratio attribute
Ordinal attribute
Interval attribute
Nominal attribute
Ordinal attribute
Interval attribute
Ratio attribute
Variables
Data point
Features
Dimensions
[0, 6, 24]
True
False
Find the mean salary of the available 90% of the data and use that to fill
in all the missing data.
Find the field wise mean salary for the available data and fill in the
missing salary data with the applicable mean salary.
[0,6,24]
[30,60,63]
[0,24,30]
[87,87,90]
Identify the sampling technique used in the following use case. For an ML
classification task, the algorithm requires that the test set has an equal
number of examples from each of the three categories.
Cluster Sampling
Simple Random
Stratified Random
Systematic Sampling
[0,1,1,0]
[1,0,0,1]
Histogram
Range
Percentage
Frequency
The area surrounding an object where the object is able to exert its
influence.
In a box-and-whisker plot of data, identify the FALSE statement about
outliers.
Outliers are beyond 1.5 times the Inter Quartile Range (IQR) from the
upper quartile
Outliers are beyond 1.5 times the Inter Quartile Range (IQR) from the
lower quartile
Outliers are beyond 1.5 times the Inter Quartile Range (IQR) from the
median
Outliers are beyond the lowest or the highest value in the dataset
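The usual box-plot convention measures the 1.5 × IQR distance from the lower and upper quartiles. The sketch below applies that rule with `statistics.quantiles` from the standard library (Python 3.8 or later); the function name `iqr_outliers` and the sample values are hypothetical, and quartile conventions differ slightly between tools.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr   # measured from the lower quartile
    upper_fence = q3 + k * iqr   # measured from the upper quartile
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical sample with one value far above the rest.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
outliers = iqr_outliers(sample)
print(outliers)
```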
Find the Jaccard coefficient for the 2 data objects with the below
feature vectors.
0
NONE
1
.7
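The feature vectors for this question are not reproduced in the text, so the sketch below uses hypothetical binary vectors. It computes the Jaccard coefficient as the number of positions where both vectors are 1, divided by the number of positions where at least one is 1; positions where both are 0 are ignored. The function name `jaccard_coefficient` is an assumption for the example.

```python
def jaccard_coefficient(a, b):
    """Jaccard coefficient of two equal-length binary (0/1) vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        return None              # undefined when neither vector has a 1
    return both / either


# Hypothetical feature vectors, not the ones from the quiz.
p = [1, 0, 1, 1, 0]
q = [1, 1, 0, 1, 0]
print(jaccard_coefficient(p, q))  # prints 0.5
```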
A. it can’t be done
12/18/22, 10:29 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
SAS
Python
R
Springboot
Challenge results
Data Engineer
Data Architect
Data Journalist
Flask
Hadoop
MongoDB
Amazon S3
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 3.2. [Answer should have a precision of X.XXXX]
-0.8679
0.8769
0.8679
-0.8769
100W
98W
150W
110W
Suppose that the minimum and maximum values for an attribute are
4.3 and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale it to [0.0, 1.0]. (Answer should have a
precision of X.XX.)
0.29
0.33
0.38
0.30
Suppose that the minimum and maximum values for the attribute
income are $12,000 and $98,000, respectively. The new range is
[0.0,1.0]. Apply min-max normalization to a value of $73,600.
0.758
0.716
0.561
0.856
Association Rule
Clustering
Classification
Regression
Finding the shorter path between two already-existing routes between
two locations.
Predict traffic congestion along a specific route between two locations
using vehicle journey times.
Calculating a room's temperature (in Celsius) based on other
environmental factors (such as atmospheric pressure, humidity etc).
Predicting if a cricket player is a batsman or bowler, given his playing
record.
Reinforcement
Association
Classification
Regression
True
False
ZooKeeper
MapReduce
Hive
Prescriptive analysis
Diagnostic analysis
Descriptive analysis
Predictive analysis
Incorrect
Question 18 0 / 0.25 pts
Prescriptive Analytics
Diagnostic Analytics
Predictive Analytics
Descriptive Analytics
Predictive
Diagnostic
Descriptive
Prescriptive
Incorrect
Question 20 0 / 0.25 pts
A retail store realizes its sales were lower than expected in the last
quarter. A data scientist helps identify whether the sales were
affected uniformly across all segments or only in one segment.
What kind of analytics is this?
Predictive
Prescriptive
Descriptive
Diagnostic
Data Analytics refers to the techniques used to analyze data to enhance
productivity and business gain.
Data analytics is the pursuit of extracting meaning from raw data using
specialized computer systems.
Analytics is a process in which a computer examines information using
mathematical methods to find useful patterns.
Incorrect
Question 22 0 / 0.25 pts
Prescriptive
Predictive
Descriptive
Diagnostic
Features
Dimensions
Data point
Variables
Incorrect
Question 24 0 / 0.25 pts
Interval attribute
Ordinal attribute
Nominal attribute
Ratio attribute
True
False
Response to 'Do you own a car?' can be considered symmetric binary
Response to 'Do you have a rare disease?' can be considered
symmetric binary.
Continuous
Normal
Asymmetric
Discrete
Mathematical models of epidemics show that the initial
phase exhibits exponential growth. This can be verified by plotting the
number of infections on a log scale. This is a use case for which
technique?
Sampling
Discretization
Aggregation
Transformation
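The log-scale check described in the question can be sketched numerically: if counts grow exponentially, their logarithms increase by a constant step, which is what a straight line on a log-scale plot shows. The daily counts below are hypothetical (doubling each day) and serve only as an illustration.

```python
import math

# Hypothetical infection counts that double every day.
infections = [10 * 2 ** day for day in range(8)]

# Apply a log transformation to each count.
log_counts = [math.log(count) for count in infections]

# Day-to-day increase of the transformed values.
steps = [later - earlier for earlier, later in zip(log_counts, log_counts[1:])]

# Exponential growth shows up as a constant step (here log 2) on the log scale.
is_linear_on_log_scale = all(math.isclose(step, math.log(2)) for step in steps)
print(is_linear_on_log_scale)
```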
Lexeme
TreeMap
Matrix
ConeTree
Suppose that the mean and standard deviation of the values for an
attribute are 6500 and 2000, respectively. Apply z-score normalization
to value of 8000.
.7
.75
.6
.5
True
False
From the table given below, the requirement is to get the name,
gender, and marks of the top-scoring students only. Which of the
following functionalities of data wrangling is used?
Replace
Data exploration
Filter
Reshape
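The table for this question is not reproduced in the text, so the sketch below uses hypothetical student records. It shows a row filter that keeps only the top-scoring students, followed by a selection of the three requested columns; the extra "city" column is an assumption added to make the column selection visible.

```python
# Hypothetical records standing in for the missing table.
students = [
    {"name": "Asha", "gender": "F", "marks": 92, "city": "Pune"},
    {"name": "Ravi", "gender": "M", "marks": 78, "city": "Delhi"},
    {"name": "Meera", "gender": "F", "marks": 92, "city": "Chennai"},
    {"name": "Karan", "gender": "M", "marks": 65, "city": "Mumbai"},
]

# Highest mark in the data set.
top_marks = max(s["marks"] for s in students)

# Filter the rows to the top scorers and keep only name, gender, and marks.
top_students = [
    {"name": s["name"], "gender": s["gender"], "marks": s["marks"]}
    for s in students
    if s["marks"] == top_marks
]

print(top_students)
```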
RANGE
MODE
MEDIAN
MEAN
[0,24,30]
[87,87,90]
[30,60,63]
[0,6,24]
[1,0,0,1]
[0,1,1,0]
Do Nothing
Histogram analysis
Equal-frequency binning
Correlation analysis
12/18/22, 10:29 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
Due Dec 19 at 23:59 Points 10 Questions 40
Available Dec 18 at 19:00 - Dec 19 at 23:59 Time Limit 60 Minutes
Instructions
The Quiz I can be attempted only once. You will not be provided with a make-up Quiz, if you miss
this Quiz.
The quiz is timed for one hour and has to be completed once you start. You will not be allowed to go
back to the previous question if you skip a question.
The answers will be visible only after three days once the quiz has ended.
Attempt History
Attempt Time Score
SAS
Python
R
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 1/18
12/18/22, 10:29 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Springboot
Challenge results
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 2/18
12/18/22, 10:29 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Data Engineer
Data Architect
Data Journalist
Flask
Hadoop
MongoDB
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 3/18
12/18/22, 10:29 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Amazon S3
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 3.2. [Answer should have a precision of X.XXXX]
-0.8679
0.8769
0.8679
-0.8769
100W
98W
150W
110W
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 4/18
12/18/22, 10:29 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Suppose that the minimum and maximum values for an attribute are
4.3 and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale [0.0,1.0]. (Answer should have a
precision of X.XX]
0.29
0.33
0.38
0.30
Suppose that the minimum and maximum values for the attribute
income are $12,000 and $98,000, respectively. The new range is
[0.0,1.0]. Apply min-max normalization to a value of $73,600.
0.758
0.716
0.561
0.856
Association Rule
Clustering
Classification
Regression
Finding the shorter path between two already-existing routes between
two locations.
Predict traffic congestion along a specific route between two locations
using vehicle journey times.
Calculating a room's temperature (in Celsius) based on other
environmental factors (such as atmospheric pressure, humidity etc).
Predicting if a cricket player is a batsman or bowler, given his playing
record.
Reinforcement
Association
Classification
Regression
True
False
ZooKeeper
MapReduce
Hive
Prescriptive analysis
Diagnostic analysis
Descriptive analysis
Predictive analysis
Incorrect
Question 18 0 / 0.25 pts
Prescriptive Analytics
Diagnostic Analytics
Predictive Analytics
Descriptive Analytics
Predictive
Diagnostic
Descriptive
Prescriptive
Incorrect
Question 20 0 / 0.25 pts
A retail store realizes its sales were lower than expected in the last
quarter. A data scientist helps identify whether sales were affected
uniformly across all segments or only in one segment. What kind of
analytics is this?
Predictive
Prescriptive
Descriptive
Diagnostic
Data Analytics refers to the techniques used to analyze data to enhance
productivity and business gain.
Data analytics is the pursuit of extracting meaning from raw data using
specialized computer systems.
Analytics is a process in which a computer examines information using
mathematical methods to find useful patterns.
Incorrect
Question 22 0 / 0.25 pts
Prescriptive
Predictive
Descriptive
Diagnostic
Features
Dimensions
Data point
Variables
Incorrect
Question 24 0 / 0.25 pts
Interval attribute
Ordinal attribute
Nominal attribute
Ratio attribute
True
False
Response to 'Do you own a car?' can be considered symmetric binary.
Response to 'Do you have a rare disease?' can be considered
symmetric binary.
Continuous
Normal
Asymmetric
Discrete
Mathematical models of epidemics show that the initial phase exhibits
exponential growth. This can be verified by plotting the number of
infections on a log scale. This is a use case for which technique?
Sampling
Discretization
Aggregation
Transformation
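The log-scale check described in the epidemic question can be illustrated numerically. The sketch below uses made-up infection counts that double every day (an assumption for illustration, not real data) and shows that after a log transform the day-to-day increments become constant, which is what a straight line on a log-scale plot means.

```python
import math

# Hypothetical daily infection counts that double each day (not real data).
infections = [100 * 2 ** day for day in range(8)]

# Apply a log10 transformation to every count.
log_infections = [math.log10(count) for count in infections]

# Under exponential growth the transformed series rises by a constant step.
steps = [later - earlier
         for earlier, later in zip(log_infections, log_infections[1:])]
print([round(step, 4) for step in steps])
```

Each step is approximately log10(2), so the transformed series is linear in time.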
Lexeme
TreeMap
Matrix
ConeTree
Suppose that the mean and standard deviation of the values for an
attribute are 6500 and 2000, respectively. Apply z-score normalization
to value of 8000.
.7
.75
.6
.5
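The z-score for this question follows from the standard definition. A short worked sketch, with the symbols v, μ, and σ introduced here for the value, mean, and standard deviation:

```latex
z = \frac{v - \mu}{\sigma} = \frac{8000 - 6500}{2000} = \frac{1500}{2000} = 0.75
```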
True
False
In the table given below, the requirement is to get the name, gender,
and marks of only the top-scoring students. Which of the following
data wrangling functionalities is used?
Replace
Data exploration
Filter
Reshape
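Selecting only the rows that meet a condition can be sketched in plain Python. The table from the question did not survive extraction, so the records, the extra `city` column, and the `threshold` value below are all hypothetical.

```python
# Hypothetical student records; the original table is not available.
students = [
    {"name": "Asha", "gender": "F", "marks": 92, "city": "Pune"},
    {"name": "Ravi", "gender": "M", "marks": 67, "city": "Delhi"},
    {"name": "Meena", "gender": "F", "marks": 88, "city": "Chennai"},
    {"name": "Arun", "gender": "M", "marks": 54, "city": "Kochi"},
]

threshold = 85  # assumed cut-off for "top-scoring"

# Keep only the rows that satisfy the condition, and only the wanted fields.
top_scorers = [
    {"name": s["name"], "gender": s["gender"], "marks": s["marks"]}
    for s in students
    if s["marks"] >= threshold
]
print(top_scorers)
```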
RANGE
MODE
MEDIAN
MEAN
[0,24,30]
[87,87,90]
[30,60,63]
[0,6,24]
[1,0,0,1]
[0,1,1,0]
Do Nothing
Histogram analysis
Equal-frequency binning
Correlation analysis
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
Due Dec 19 at 11:59pm Points 10 Questions 40
Available Dec 18 at 7pm - Dec 19 at 11:59pm Time Limit 60 Minutes
Instructions
The Quiz I can be attempted only once. You will not be provided with a make-up Quiz, if you miss
this Quiz.
The quiz is timed for one hour and has to be completed once you start. You will not be allowed to go
back to the previous question if you skip a question.
The answers will be visible only after three days once the quiz has ended.
Attempt History
Attempt Time Score
processing data
analysing data
organizing data
Incorrect
Question 3 0 / 0.25 pts
i, ii, iii
iv, v
ii, iii
i, iv, v
True
False
data mining
data analytics
data manipulation
5
8
6
7
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 3.2. [Answer should have a precision of X.XXXX]
0.8769
-0.8679
0.8679
-0.8769
Compute the width of each bin for data given below, if the number of
bins is 4.
[39, 45, 49, 45, 31, 37, 38, 41, 37, 41, 39, 34, 35, 30, 47, 43, 44, 46,
48, 36]
4
3
5
(49-30)/4
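Assuming the question refers to equal-width binning, the bin width is the data range divided by the number of bins. The sketch below applies that to the list given in the question; the variable names are illustrative.

```python
# Data and bin count as given in the question.
data = [39, 45, 49, 45, 31, 37, 38, 41, 37, 41,
        39, 34, 35, 30, 47, 43, 44, 46, 48, 36]
num_bins = 4

# Equal-width binning: width = (max - min) / number of bins.
bin_width = (max(data) - min(data)) / num_bins
print(min(data), max(data), bin_width)
```

In practice the width is often rounded up to a convenient number so that the bins cover the whole range.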
6
5
4
7
6
Classification
Clustering
Regression
Optimization
Consider a scenario where you feel unwell and go to a doctor. The doctor
asks questions such as whether you were exposed to rain or a cold climate,
had contact with a sick person, or ate food from outside. Based on
your answers, the doctor reaches a conclusion. This can be considered
analogous to which stage of data analytics?
Descriptive
Predictive
Prescriptive
Diagnostic
Classification
Association Rule
Clustering
Regression
A company wants to find the target segment of people for one of its
products. Which modelling technique would generally be appropriate?
Clustering
Classification
Ranking
Regression
True
False
Data pre-processing improves data quality and makes data mining
algorithms more efficient and effective. Which data pre-processing
tasks accompany data cleaning?
Data Transformation
Data Integration
Data Reduction
Assess Situation - "Cost and Benefits" - falls into which phase of
CRISP-DM?
Data Preparation
Data modeling
Evaluation
Business Understanding
Incorrect
Question 19 0 / 0.25 pts
Identify the data analytics task for the following scenario. A software
programmer develops a tool to convert programs written in Verilog to
Python.
Predictive Analytics
Prescriptive Analytics
Diagnostic Analytics
Descriptive Analytics
Incorrect
Question 20 0 / 0.25 pts
Identify the data analytics task for the following scenario. A student is
analyzing various blogs and vlogs to find out the skill set that has to be
acquired to become a data scientist in the future.
Predictive Analytics
Prescriptive Analytics
Diagnostic Analytics
Descriptive Analytics
In 2001, big data was characterized by three Vs: volume, velocity, and
variety. The Vs have since grown to encompass veracity and value.
Big data is sometimes described with an additional V, which is:
Volatile
Vector
Variability
Vulnerability
Applied statistics
Industry statistics
Economic statistics
What are the best practices for implementing big data analytics
programmes?
Focusing on business goals and how to use big data analytics
technologies to meet them
Adopting data analysis tools based on a laundry list of their capabilities
Zip codes
Exam Grades
Military ranks
Academic ranks
BeautifulSoup
WebCrawler
Scraper
WebSpider
The distance between categories is equal across the range of
interval/ratio data
Ordinal variables have a fixed zero point, whereas interval/ratio
variables do not
Variables
Features
Dimensions
Data point
Symmetric attribute
Discrete attribute
Continuous attribute
Asymmetric attribute
Data retrieval
Data cleansing
Data transformation
Data combining
MEDIAN
MEAN
MODE
RANGE
In which phase are duplicates (of the data) removed? Choose the
best possible answer.
Data Preparation
Data Collection
Data Understanding
Data Exploration
Matrix
TreeMap
Lexeme
ConeTree
There are two sets X={10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24} and Y = {-30, -31, -32, -33, -34, -35, -36, -37, -38, -39, -40, -41,
-42, -43, -44}. What is TRUE about the standard deviations of X and Y
i.e. σX and σY respectively?
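Standard deviation measures spread around the mean, so shifting or negating every value leaves it unchanged. The sketch below checks this numerically for the two sets in the question, using the population standard deviation from Python's `statistics` module; whether the quiz intends the population or the sample version is an assumption, but the comparison comes out the same either way.

```python
import math
import statistics

# The two sets from the question: 15 consecutive integers each.
x = list(range(10, 25))        # 10, 11, ..., 24
y = list(range(-30, -45, -1))  # -30, -31, ..., -44

sigma_x = statistics.pstdev(x)
sigma_y = statistics.pstdev(y)

# Both sets have the same spacing and size, so their spreads should match.
print(sigma_x, sigma_y, math.isclose(sigma_x, sigma_y))
```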
Systematic Sampling
Stratified sampling
Which among the following are valid methods of handling missing data?
Q and R
R only
P, Q and R
For the same real world entity, resolving attribute values from different
sources
I. The smaller data sets resulting from data reduction require less
memory and processing time.
Statement I and II
Statement I and IV
mean of a value
12/18/22, 10:04 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
Due Dec 19 at 23:59 Points 10 Questions 40
Available Dec 18 at 19:00 - Dec 19 at 23:59 Time Limit 60 Minutes
Instructions
The Quiz I can be attempted only once. You will not be provided with a make-up Quiz, if you miss
this Quiz.
The quiz is timed for one hour and has to be completed once you start. You will not be allowed to go
back to the previous question if you skip a question.
The answers will be visible only after three days once the quiz has ended.
Attempt History
Attempt Time Score
analysing data
processing data
organizing data
Data Architect
Data Scientist
Visualization Engineer
Business Analyst
Python
Springboot
SAS
R
Which of the following best describes the difference between a data
analyst and a data scientist?
A data analyst does estimation, whereas a data scientist predicts and
explains it as well
Data analysts are more proficient in R, whereas data scientists are more
proficient in Python
Data analysts just deal with numbers, whereas data scientists deal with
algorithms
Data analysts and data scientists play the same role in the project
True
False
Storyteller
Data Analyst
150W
98W
100W
110W
10
11
7
8
y= (2,1,1,2,1,2,1,1,0,0,0)
0.034
50.2
0.64
0.99
Suppose that the minimum and maximum values for the attribute
income are $12,000 and $98,000, respectively. The new range is
[0.0,1.0]. Apply min-max normalization to a value of $73,600.
0.716
0.758
0.856
0.561
Clustering
Regression
Association Rule
Classification
Identify the data analytics task for the following scenario. A mother
analysed the answer scripts of her daughter and found that they should
concentrate on reading comprehension to score better in the upcoming
exams.
Prescriptive Analytics
Diagnostic Analytics
Predictive Analytics
Descriptive Analytics
Predictive
Descriptive
Prescriptive
Diagnostic
Identify the data analytics task for the following scenario. A car service
showroom manager wants to analyze his marketing and sales data to
understand the reason for the drop in sales.
Descriptive Analytics
Diagnostic Analytics
Prescriptive Analytics
Predictive Analytics
For a given data set, the following data preprocessing techniques are
used to improve the quality of data:
Match with the most appropriate answer, related to the tools available
to a Data Science life cycle.
SMAM 8 stages
SEMMA 5 stages
CRISP-DM 6 stages
Consider a scenario where you feel unwell and go to a doctor. The doctor
asks questions such as whether you were exposed to rain or a cold climate,
had contact with a sick person, or ate food from outside. Based on
your answers, the doctor reaches a conclusion. This can be considered
analogous to which stage of data analytics?
Predictive
Descriptive
Prescriptive
Diagnostic
Diagnostic Analytics
Descriptive Analytics
Predictive Analytics
Prescriptive Analytics
Association Rule
Clustering
Regression
Classification
Identify the data analytics task for the following scenario. Google is
using tools to suggest texts or phrases while composing emails.
Descriptive Analytics
Predictive Analytics
Prescriptive Analytics
Diagnostic Analytics
Incorrect
Question 21 0 / 0.25 pts
Data collection
Deployment
Data preparation
Data Modelling
Data pre-processing improves data quality and makes data mining
algorithms more efficient and effective. Which data pre-processing
tasks accompany data cleaning?
Data Transformation
Data Reduction
Data Integration
Incorrect
Question 24 0 / 0.25 pts
Nominal attribute
Ordinal attribute
Ratio attribute
Interval attribute
Nominal attribute
Ordinal attribute
Ratio attribute
Interval attribute
Continuous attribute
Discrete attribute
Asymmetric attribute
Symmetric attribute
Ordinal attribute
Numeric attribute
Nominal attribute
Continuous attribute
Percentage
Histogram
Range
Frequency
“Similarity” means:
The number of tuples in a database whose attributes have similar
values.
It may not be a good idea to drop a data field with missing values.
As a data scientist, even if you know that certain outliers are valid data,
you might still omit them from model construction.
As a data scientist, while cleaning the data, you are not concerned with
why the data values are missing. You just fix them.
As a data scientist, even if you know that certain outliers are valid data,
you might still omit them from model construction.
mean−median ≈ 3×(mean−mode)
mean−mode ≈ 3×(median-mean).
median-mean ≈ 3×(mean−mode)
mean−mode ≈ 3×(mean−median).
2
4
1
3
Boxplot
Tabulation
Histogram
Pareto diagram
In which phase are duplicates (of the data) removed? Choose the
best possible answer.
Data Understanding
Data Collection
Data Preparation
Data Exploration
Choose the possible combinations for drawing a scatter plot for given
data.
18/12/2022, 19:57 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
Due Dec 19 at 23:59 Points 10 Questions 40
Available Dec 18 at 19:00 - Dec 19 at 23:59 Time Limit 60 Minutes
Instructions
The Quiz I can be attempted only once. You will not be provided with a make-up Quiz, if you miss
this Quiz.
The quiz is timed for one hour and has to be completed once you start. You will not be allowed to go
back to the previous question if you skip a question.
The answers will be visible only after three days once the quiz has ended.
Attempt History
Attempt Time Score
LATEST Attempt 1 54 minutes 8.63 out of 10
Storyteller
Data Analyst
Privacy Checker
Recommendation Systems
Communicative
Punctual
Creative
Technical
data wrangling
Functional
Center of Excellence
Coordinational
Consulting
98W
100W
150W
110W
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 3.2. [Answer should have a precision of X.XXXX]
0.8679
0.8769
-0.8769
-0.8679
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 10.7.
-0.2679
0.2679
0.2769
-0.2769
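For a value above the mean the z-score comes out positive. A worked sketch for the value 10.7, assuming the standard definition and using v, μ, and σ as illustrative symbols for the value, mean, and standard deviation:

```latex
z = \frac{v - \mu}{\sigma} = \frac{10.7 - 8.9}{6.5} = \frac{1.8}{6.5} \approx 0.2769
```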
Suppose that the minimum and maximum values for an attribute are 4.3
and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale [-1.0,+1.0]. (Answer should have a
precision of X.XX)
0.66
0.76
0.33
-0.33
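When the target range is not [0, 1], min-max normalization rescales the fraction to the new width and shifts it by the new minimum. A worked sketch for the target range [-1.0, +1.0], with v and v' as illustrative symbols for the original and scaled values:

```latex
v' = \frac{v - \min}{\max - \min}\,(\mathit{new\_max} - \mathit{new\_min}) + \mathit{new\_min}
   = \frac{5.4 - 4.3}{7.6 - 4.3}\,\bigl(1.0 - (-1.0)\bigr) + (-1.0)
   = \frac{1.1}{3.3} \times 2 - 1
   \approx -0.33
```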
SMAM
CRISP-DM
SEMMA
Artificial Intelligence
Data Wrangling
Deep Learning
Machine Learning
Prescriptive
Predictive
Diagnostic
Descriptive
Predictive
Prescriptive
Diagnostic
Descriptive
Prescriptive Analytics
Diagnostic Analytics
Predictive Analytics
Descriptive Analytics
Regression
Classification
Association Rule
Clustering
Classification
Association Rule
Regression
Clustering
None
Model planning
Operationalize
Model building
Incorrect
Question 21 0 / 0.25 pts
Descriptive Analytics
Predictive Analytics
Diagnostic Analytics
Cognitive analytics
Alerts
Ad hoc reports
Predictive model
Standard report
Variables
Data point
Features
Dimensions
Nominal attribute
Numeric attribute
Ordinal attribute
Continuous attribute
The distance between categories is equal across the range of
interval/ratio data
Ordinal variables have a fixed zero point, whereas interval/ratio
variables do not
Incorrect
Question 27 0 / 0.25 pts
Unsupervised Learning
Feature Selection
Quasi-structured
Semi-structured
Unstructured
Structured
Partial
Question 29 0.13 / 0.25 pts
Incorrect
Question 30 0 / 0.25 pts
1
4
3
2
Incorrect
Question 31 0 / 0.25 pts
Identify the sampling technique used in the following use case. For ML
classification task, the algorithm requires that the test set has equal
examples from the three categories.
Systematic Sampling
Stratified Random
Cluster Sampling
Simple Random
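Drawing the same number of examples from every category can be sketched as follows. The data, the helper name `stratified_sample`, and the group sizes are hypothetical; the sketch samples randomly within each category, so every category contributes equally to the test set regardless of how imbalanced the full data set is.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(records, key, per_group, seed=0):
    """Draw `per_group` random records from each category given by `key`."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    sample = []
    for label in sorted(groups):
        # random.sample raises ValueError if a group has too few records.
        sample.extend(rng.sample(groups[label], per_group))
    return sample


# Hypothetical, deliberately imbalanced labelled examples.
records = ([("cat", i) for i in range(50)]
           + [("dog", i) for i in range(30)]
           + [("bird", i) for i in range(10)])

test_set = stratified_sample(records, key=lambda r: r[0], per_group=5)
counts = Counter(label for label, _ in test_set)
print(counts)
```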
Mostly True
False
True
Data transformation
Data retrieval
Data cleansing
Data combining
“Similarity” means:
The number of tuples in a database whose attributes have similar
values.
Noise
Inconsistencies
Integration
Redundancy
A. it can’t be done
In a boxplot, where Q1, Q2 and Q3 are the first, second and third
quartiles respectively, the interquartile range IQR is calculated as:
IQR = Q3 – Q1
IQR = Q3-Q2
IQR = Q2-Q1
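The interquartile range can be computed from the quartiles of a sample. The sketch below uses a made-up data set and `statistics.quantiles` from the Python standard library; the choice of the "inclusive" method is an assumption, and other quartile conventions can give slightly different cut points.

```python
import statistics

# Hypothetical sorted sample, chosen only for illustration.
data = [2, 4, 6, 8, 10, 12, 14, 16, 18]

# With n=4, statistics.quantiles returns the three quartile cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# The interquartile range spans the middle half of the data.
iqr = q3 - q1
print(q1, q2, q3, iqr)
```

Q2 is the median and does not enter the IQR calculation.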
2
3
4
1
Mathematical models of epidemics show that the initial phase exhibits
exponential growth. This can be verified by plotting the number of
infections on a log scale. This is a use case for which technique?
Transformation
Aggregation
Discretization
Sampling
18/12/2022, 22:38 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
Due Dec 19 at 23:59 Points 10 Questions 40
Available Dec 18 at 19:00 - Dec 19 at 23:59 Time Limit 60 Minutes
Instructions
The Quiz I can be attempted only once. You will not be provided with a make-up Quiz, if you miss
this Quiz.
The quiz is timed for one hour and has to be completed once you start. You will not be allowed to go
back to the previous question if you skip a question.
The answers will be visible only after three days once the quiz has ended.
Attempt History
Attempt Time Score
LATEST Attempt 1 60 minutes 8.63 out of 10
Correct answers will be available on Dec 22 at 0:00.
Question 1 0.25 / 0.25 pts
Privacy Checker
Online Price Comparison
Recommendation Systems
Image & Speech Recognition
Question 2 0.25 / 0.25 pts
data wrangling
machine learning/deep learning
all of the options
probability and statistics
Question 3 0.25 / 0.25 pts
Data Architect
Data Analyst
Data Journalist
Data Scientist
Question 4 0.25 / 0.25 pts
Visualization Engineer
Business Analyst
Data Scientist
Data Architect
Question 5 0.25 / 0.25 pts
Which of the following best describes the difference between a data
analyst and a data scientist?
A data analyst does estimation, whereas a data scientist predicts and
explains it as well
Data analysts and data scientists play the same role in the project
Data analysts just deal with numbers, whereas data scientists deal with
algorithms
Data analysts are more proficient in R, whereas data scientists are more
proficient in Python
Question 6 0.25 / 0.25 pts
"Take last week's data and predict the sales for the next six months within
the next few days with a 2-person team" is a good example of which of the
following data science challenges?
Lack of professionals with sound knowledge of data science skills
Unrealistic expectations
Data sizing
Insufficient time for project completion
Question 7 0.25 / 0.25 pts
0.166
0.455
0.765
0.234
Question 8 0.25 / 0.25 pts
90W
98W
100W
150W
Question 9 0.25 / 0.25 pts
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 3.2. [Answer should have a precision of X.XXXX]
-0.8679
0.8769
0.8679
-0.8769
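The z-score calculation in this question can be reproduced with a few lines. The helper name `z_score` is an assumption for illustration; the numbers are those given in the question.

```python
def z_score(value, mean, std):
    """Standardize a value: how many standard deviations it lies from the mean."""
    return (value - mean) / std


# Values from the question: mean 8.9, standard deviation 6.5, value 3.2.
z = z_score(3.2, mean=8.9, std=6.5)
print(f"{z:.4f}")  # -0.8769
```

The result is negative because 3.2 lies below the mean.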
Incorrect
Question 10 0 / 0.25 pts
1.0
1.3
1.1
1.2
Question 11 0.25 / 0.25 pts
Which of the following data science project steps is the most critical
for the success of the project?
Model Selection
Data preprocessing
Model Evaluation
Model Building
Question 12 0.25 / 0.25 pts
Assess Situation - "Cost and Benefits" - falls into which phase of
CRISP-DM?
Evaluation
Data modeling
Business Understanding
Data Preparation
Question 13 0.25 / 0.25 pts
True
False
Incorrect
Question 14 0 / 0.25 pts
Prescriptive
Descriptive
Predictive
Diagnostic
Question 15 0.25 / 0.25 pts
A scenario where you feel unwell and go to a doctor. The doctor asks
you questions such as whether you were exposed to rain or a cold climate,
had contact with a sick person, or ate food from outside.
Based on your answers, the doctor comes to a conclusion. This can be
considered analogous to which stage of data analytics?
Diagnostic
Prescriptive
Descriptive
Predictive
Question 16 0.25 / 0.25 pts
Identify the data analytics task for the following scenario. A physics
teacher is analyzing the answer scripts of the students to identify the
areas that he/she should concentrate on so that the students
understand the concepts better.
Predictive Analytics
Prescriptive Analytics
Diagnostic Analytics
Descriptive Analytics
Question 17 0.25 / 0.25 pts
Data integration
Data cleaning
Data replication
All of them
Question 18 0.25 / 0.25 pts
In 2001, big data was characterized by three Vs: volume, velocity, and variety.
The Vs have since grown to encompass veracity and value. Big data is
sometimes described with a further V, which is:
Vector
Variability
Vulnerability
Volatile
Question 19 0.25 / 0.25 pts
A company wants to find the target segment of people for one of its
products. Which modelling technique would generally be appropriate?
Regression
Classification
Clustering
Ranking
Question 20 0.25 / 0.25 pts
Diagnostic, Predictive, Prescriptive, Descriptive
Descriptive, Diagnostic, Predictive, Prescriptive
Prescriptive, Descriptive, Diagnostic, Predictive
Predictive, Diagnostic, Prescriptive, Descriptive
Incorrect
Question 21 0 / 0.25 pts
Identify the data analytics task for the following scenario. A project
manager is analysing past projects to identify the risk involved and how
they were mitigated.
Diagnostic Analytics
Prescriptive Analytics
Descriptive Analytics
Predictive Analytics
Question 22 0.25 / 0.25 pts
Predict traffic congestion along a specific route between two locations
using vehicle journey times.
Finding the shorter path between two already-existing routes between
two locations.
Predicting if a cricket player is a batsman or bowler, given his playing
record.
Filtering spam from emails
Calculating a room's temperature (in Celsius) based on other
environmental factors (such as atmospheric pressure, humidity, etc.).
Question 23 0.25 / 0.25 pts
Unstructured
Semi-structured
Quasi-structured
Structured
Question 24 0.25 / 0.25 pts
Structured format
Raw format
both
none
Question 25 0.25 / 0.25 pts
None of the mentioned
Preprocessed data is original source of data
Raw data is the data obtained after processing steps
Raw data is original source of data
Question 26 0.25 / 0.25 pts
WebSpider
Scraper
WebCrawler
BeautifulSoup
Question 27 0.25 / 0.25 pts
It places less emphasis on the initial planning phases covered in
CRISP-DM (Business Understanding and Data Understanding phases)
and omits entirely the Deployment phase.
The SEMMA model also emphasizes data mining as a non-linear,
adaptive process.
SEMMA is a logical organisation of the functional tool set of SAS
Enterprise Miner for carrying out the core tasks of data mining.
SEMMA is focused on the model development aspects of data mining.
Question 28 0.25 / 0.25 pts
Interval attribute
Ratio attribute
Ordinal attribute
Nominal attribute
Question 29 0.25 / 0.25 pts
Mostly True
False
None of the answers
True
Partial
Question 30 0.13 / 0.25 pts
Question 31 0.25 / 0.25 pts
Representative
Probabilistic
Qualitative
Systematic
Question 32 0.25 / 0.25 pts
import sklearn.preprocessing
# Brand names to encode (renamed from `input` to avoid shadowing the built-in)
brands = ['Havells', 'Philips', 'Syska', 'Eveready', 'Lloyd']
le = sklearn.preprocessing.LabelEncoder()
le.fit(brands)
# transform() expects an array-like, so the single label is wrapped in a list
print(le.transform(['Syska']))
1
2
3
4
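The snippet in this question relies on scikit-learn's `LabelEncoder`, which stores its classes in sorted order and uses each class's position as its integer code. The pure-Python sketch below mimics that behaviour without the library; the names `classes` and `code_of` are assumptions for illustration, not part of scikit-learn.

```python
labels = ['Havells', 'Philips', 'Syska', 'Eveready', 'Lloyd']

# Sort the distinct labels; each label's index in this list is its code.
classes = sorted(set(labels))
code_of = {name: code for code, name in enumerate(classes)}

print(classes)           # ['Eveready', 'Havells', 'Lloyd', 'Philips', 'Syska']
print(code_of['Syska'])  # 4
```

Because the codes follow alphabetical order rather than the order of appearance, 'Syska' gets the highest code even though it is third in the input list.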
Question 33 0.25 / 0.25 pts
There are two sets X={10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24} and Y = {-30, -31, -32, -33, -34, -35, -36, -37, -38, -39, -40, -41,
-42, -43, -44}. What is TRUE about the standard deviations of X and Y
i.e. σX and σY respectively?
Will be the same.
Magnitude will be the same but the sign will be different
σX will be smaller than σY.
σY will be smaller than σX.
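A quick numerical check of the two sets in this question, using the population standard deviation from the standard library. The variable names are illustrative; the values are the ones listed in the question.

```python
import math
import statistics

X = list(range(10, 25))           # 10, 11, ..., 24
Y = [-v for v in range(30, 45)]   # -30, -31, ..., -44

# Standard deviation measures spread around the mean, so shifting a set
# or flipping its sign does not change it, and it is never negative.
sd_x = statistics.pstdev(X)
sd_y = statistics.pstdev(Y)
print(round(sd_x, 4), round(sd_y, 4))
print(math.isclose(sd_x, sd_y))
```

Both sets are 15 consecutive integers, so their spreads are identical.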
Question 34 0.25 / 0.25 pts
Continous_Attribute
130.2
125
126.75
Equal Width binning
Equal frequency binning.
Incorrect
Question 35 0 / 0.25 pts
Entropy based discretization
Correlation analysis
Equal-frequency binning
Histogram analysis
Question 36 0.25 / 0.25 pts
Inconsistencies
Noise
Redundancy
Accuracy & Efficiency
Integration
Incorrect
Question 37 0 / 0.25 pts
Identify the sampling technique used in the following use case. For an ML
classification task, an engineer used every 8th example to generate the
test set.
Simple Random
Stratified Random
Systematic Sampling
Cluster Sampling
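Picking every 8th example, as described in this use case, can be sketched with list slicing. The 40 example ids are made up, and the fixed starting point is a simplification; in practice the start is often chosen at random within the first interval.

```python
examples = list(range(1, 41))  # 40 hypothetical example ids: 1..40
k = 8                          # sampling interval

# Take the 8th, 16th, 24th, ... example (index k-1, then every k-th).
test_set = examples[k - 1::k]
print(test_set)  # [8, 16, 24, 32, 40]
```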
Question 38 0.25 / 0.25 pts
Drop missing rows or columns
All of the given options
Assign a unique category to missing values
Replace missing values with mean/median/mode
Question 39 0.25 / 0.25 pts
Finding out the data type of a variable
Finding statistical estimates of a variable
In univariate and bivariate variable analysis
Derivation of new attributes
Question 40 0.25 / 0.25 pts
True
False
Quiz Score:
8.63 out of 10
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
Due Dec 19 at 23:59 Points 10 Questions 40
Available Dec 18 at 19:00 - Dec 19 at 23:59 Time Limit 60 Minutes
Instructions
The Quiz I can be attempted only once. You will not be provided with a make-up Quiz, if you miss
this Quiz.
The quiz is timed for one hour and has to be completed once you start. You will not be allowed to go
back to the previous question if you skip a question.
The answers will be visible only after three days once the quiz has ended.
Attempt History
Attempt Time Score
LATEST Attempt 1 60 minutes 6.5 out of 10
data wrangling
Data Analyst
Data Scientist
Data Architect
Data Journalist
data manipulation
data mining
data analytics
Privacy Checker
Recommendation Systems
An organization that has little need for data analytics and is just
embarking on the analytics path will most likely structure its data team
in a
Centralized model
Federated model
Consulting model
Functional model
Suppose that the minimum and maximum values for an attribute are 4.3
and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale to [0.0, 1.0]. (Answer should have a
precision of X.XX)
0.33
0.43
0.36
0.45
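The min-max normalization in this question can be worked through as follows. The helper name `min_max_scale` and its parameter names are assumptions for illustration; the numbers come from the question.

```python
def min_max_scale(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map `value` from [old_min, old_max] to [new_min, new_max]."""
    return (value - old_min) / (old_max - old_min) * (new_max - new_min) + new_min


# Values from the question: min 4.3, max 7.6, value 5.4, target range [0.0, 1.0].
scaled = min_max_scale(5.4, old_min=4.3, old_max=7.6)
print(f"{scaled:.2f}")  # 0.33
```

Here the numerator is 5.4 - 4.3 = 1.1 and the denominator is 7.6 - 4.3 = 3.3, giving one third.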
0.765
0.166
0.455
0.234
Compute the width of each bin for data given below, if the number of
bins is 5.
[39, 45, 49, 45, 31, 37, 38, 41, 37, 41, 39, 34, 35, 30, 47, 43, 44, 46, 48,
36]
6
4
(49−30)÷5
3
5
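For equal-width binning, a common definition of the bin width is the data range divided by the number of bins. The sketch below applies that definition to the data in this question. Whether the result should then be rounded to an integer depends on a convention the source does not state, so that step is left out.

```python
data = [39, 45, 49, 45, 31, 37, 38, 41, 37, 41,
        39, 34, 35, 30, 47, 43, 44, 46, 48, 36]
num_bins = 5

# Equal-width binning: every bin spans the same interval of values.
width = (max(data) - min(data)) / num_bins

# Bin boundaries implied by that width, starting at the minimum.
edges = [min(data) + i * width for i in range(num_bins + 1)]
print(width)
print([round(e, 1) for e in edges])
```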
Suppose that the minimum and maximum values for an attribute are 4.3
and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale to [0.0, 1.0]. (Answer should have a
precision of X.XX)
0.33
0.29
0.30
0.38
Deployment
Data Modelling
Data preparation
Data collection
What is the average score of all students in the CBSE 10th Math Exam?
Incorrect
Question 13 0 / 0.25 pts
CRISP-DM
SEMMA
SMAM
Descriptive analysis
Prescriptive analysis
Diagnostic analysis
Predictive analysis
Identify the data analytics task for the following scenario. A physics
teacher is analyzing the answer scripts of the students to identify the
areas that he/she should concentrate on so that the students
understand the concepts better.
Predictive Analytics
Diagnostic Analytics
Descriptive Analytics
Prescriptive Analytics
Subsetting can be used to select and exclude variables and
observations
Merging concerns combining datasets on the same observations to
produce a result
Incorrect
Question 17 0 / 0.25 pts
Optimization
Classification
Regression
Clustering
Clustering
Association Rule
Classification
Regression
Incorrect
Question 19 0 / 0.25 pts
Which of the following data science project steps is the most critical
for the success of the project?
Model Selection
Model Evaluation
Model Building
Data preprocessing
Diagnostic Analytics
Descriptive Analytics
Prescriptive Analytics
Predictive Analytics
A company wants to find the target segment of people for one of its
products. Which modelling technique would generally be appropriate?
Classification
Clustering
Regression
Ranking
Identify the data analytics task for the following scenario. A student is
analyzing various blogs and vlogs to find out the skill set that has to be
acquired to become a data scientist in the future.
Diagnostic Analytics
Descriptive Analytics
Predictive Analytics
Prescriptive Analytics
WebSpider
WebCrawler
BeautifulSoup
Scraper
Quantitative values
Qualitative Values
P) Distinctness
Q) Order
P and R
P and S
P,Q and R
Interval attribute
Ordinal attribute
Nominal attribute
Ratio attribute
True
False
Incorrect
Question 29 0 / 0.25 pts
Incorrect
Question 30 0 / 0.25 pts
2
4
1
3
For a data analytics task to analyse feedback on her subject for a class
of 60 students, a school teacher decided to use the survey submitted
by the ten students who come for tuitions for that subject, at her home.
Identify the type of sampling she is doing.
Systematic Sampling
Stratified sampling
Incorrect
Question 32 0 / 0.25 pts
Incorrect
Question 33 0 / 0.25 pts
The statistical description (x1 + x2 + . . . + xN)/N, for the data values x1, x2, . . . , xN,
is called their ________________
mean
IQR
median
mode
Which of the following statements are true with respect to data quality
issues?
The given data set should not miss any values or attributes
Pre-processing of data is required to address the problems of
inconsistency, incompleteness.
If data are not updated from time to time, there will be a negative impact on
data quality.
Incorrect
Question 36 0 / 0.25 pts
[0,1,1,0]
[1,0,0,1]
As a data scientist, while cleaning the data, you are not concerned with
why the data values are missing. You just fix them.
As a data scientist, even if you know that certain outliers are valid data,
you might still omit them from model construction.
It may not be a good idea to drop a data field with missing values.
As a data scientist, even if you know that certain outliers are valid data,
you might still omit them from model construction.
Incorrect
Question 38 0 / 0.25 pts
A. it can’t be done
In a boxplot, where Q1, Q2 and Q3 are the first, second and third
quartiles respectively, the interquartile range IQR is calculated as:
IQR = Q3 – Q1
IQR = Q2-Q1
IQR = Q3-Q2
Incorrect
Question 40 0 / 0.25 pts
MEAN
RANGE
MEDIAN
MODE
Quiz 1
Due Dec 19 at 23:59 Points 10 Questions 40
Available Dec 18 at 19:00 - Dec 19 at 23:59 Time Limit 60 Minutes
Instructions
The Quiz I can be attempted only once. You will not be provided with a make-up Quiz, if you miss
this Quiz.
The quiz is timed for one hour and has to be completed once you start. You will not be allowed to go
back to the previous question if you skip a question.
The answers will be visible only after three days once the quiz has ended.
Attempt History
Attempt Time Score
LATEST Attempt 1 (https://bits-pilani.instructure.com/courses/1704/quizzes/3453/history?version=1) 31 minutes 8.48 out of 10
True
False
Storyteller
Data Analyst
processing data
analysing data
organizing data
Data mining
Data manipulation
Data analytics
Compute the width of each bin for data given below, if the number of
bins is 5.
[39, 45, 49, 45, 31, 37, 38, 41, 37, 41, 39, 34, 35, 30, 47, 43, 44, 46, 48,
36]
4
(49−30)÷5
3
6
5
7
8
5
6
Compute the depth of each bin for data given below, if the number of
bins is 4.
[47, 38, 40, 42, 47, 42, 41, 39, 42, 45, 36, 37, 36, 40, 43, 40, 45, 36, 43,
46]
6
5
3
4
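For equal-depth (equal-frequency) binning, the depth is the number of values placed in each bin. A minimal sketch for the data in this question, assuming the values divide evenly across the bins:

```python
data = [47, 38, 40, 42, 47, 42, 41, 39, 42, 45,
        36, 37, 36, 40, 43, 40, 45, 36, 43, 46]
num_bins = 4

# Equal-depth binning: each bin holds the same number of values.
depth = len(data) // num_bins

# Sort first, then cut the ordered values into consecutive chunks of `depth`.
ordered = sorted(data)
bins = [ordered[i:i + depth] for i in range(0, len(ordered), depth)]
print(depth)
for b in bins:
    print(b)
```

With 20 values and 4 bins, each bin receives 5 values.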
0.234
0.166
0.455
0.765
Descriptive
Prescriptive
Predictive
Diagnostic
Descriptive Analytics
Diagnostic Analytics
Cognitive analytics
Predictive Analytics
Assess Situation - "Cost and Benefits" - falls into which phase of
CRISP-DM?
Data modeling
Data Preparation
Evaluation
Business Understanding
Operationalize
None
Model building
Model planning
A scenario where you feel unwell and go to a doctor. The doctor asks
you questions such as whether you were exposed to rain or a cold climate,
had contact with a sick person, or ate food from outside.
Based on your answers, the doctor comes to a conclusion. This can be
considered analogous to which stage of data analytics?
Diagnostic
Prescriptive
Predictive
Descriptive
Identify the data analytics task for the following scenario. A mother
analysed the answer scripts of her daughter and found that they should
concentrate on reading comprehension to score better in the upcoming
exams.
Predictive Analytics
Diagnostic Analytics
Prescriptive Analytics
Descriptive Analytics
Data Preparation,Modeling
Evaluation,Business understanding
Data Understanding,Evaluation
Modeling,Data Preparation
Incorrect
Question 18 0 / 0.25 pts
Regression
Association Rule
Classification
Clustering
A company wants to find the target segment of people for one of its
products. Which modelling technique would generally be appropriate?
Clustering
Ranking
Classification
Regression
Incorrect
Question 21 0 / 0.25 pts
Which data analytic approach can show the relationships in the various
elements in your data?
Descriptive
Prescriptive
Predictive
Diagnostic
Symmetric attribute
Discrete attribute
Asymmetric attribute
Continuous attribute
Incorrect
Question 24 0 / 0.25 pts
True
False
Bank loan approval data consists of a field called loan type. It is stored
as an integer in the database. Its values mean the following:- 1 -
personal loan, 2 - home loan, 3 - business loan. What type of data type
is the loan type?
Ratio
Interval
Nominal
Ordinal
True
False
Integration
Redundancy
Inconsistencies
Noise
Incorrect
Question 30 0 / 0.25 pts
I. The smaller data sets resulting from data reduction require less
memory and processing time.
Statement I and II
Statement I and IV
[0,0,0,1,1]
[0,0,1,0,0]
[1,1,0,1,1]
[1,1,1,0,0]
In which phase are duplicates in the data removed? Choose the
best possible answer.
Data Collection
Data Requirements
Data Preparation
Data Understanding
Identify the sampling technique used in the following use case. For an ML
classification task, the algorithm requires that the test set has an equal
number of examples from each of the three categories.
Simple Random
Stratified Random
Cluster Sampling
Systematic Sampling
True
Mostly True
False
6
2
5
5.5
Quiz 1
Due Dec 19 at 23:59 Points 10 Questions 40
Available Dec 18 at 19:00 - Dec 19 at 23:59 Time Limit 60 Minutes
Instructions
The Quiz I can be attempted only once. You will not be provided with a make-up Quiz, if you miss
this Quiz.
The quiz is timed for one hour and has to be completed once you start. You will not be allowed to go
back to the previous question if you skip a question.
The answers will be visible only after three days once the quiz has ended.
Attempt History
Attempt Time Score
LATEST Attempt 1 60 minutes 6.5 out of 10
data wrangling
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 1/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Data Analyst
Data Scientist
Data Architect
Data Journalist
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 2/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
data manipulation
data mining
data analytics
Privacy Checker
Recommendation Systems
For an organization, having not much data analytics need and just
embarking on the analytics path will most likely structure its data team
in a
Centralized model
Federated model
Consulting model
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 3/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Functional model
Suppose that the minimum and maximum values for an attribute are 4.3
and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale [0.0,1.0]. (Answer should have a
precision of X.XX]
0.33
0.43
0.36
0.45
0.765
0.166
0.455
0.234
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 4/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Compute the width of each bin for data given below, if the number of
bins is 5.
[39, 45, 49, 45, 31, 37, 38, 41, 37, 41, 39, 34, 35, 30, 47, 43, 44, 46, 48,
36]
6
4
(49−30)÷5
3
5
Suppose that the minimum and maximum values for an attribute are 4.3
and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale [0.0,1.0]. (Answer should have a
precision of X.XX]
0.33
0.29
0.30
0.38
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 5/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Deployment
Data Modelling
Data preparation
Data collection
what is the average score of all students in the CBSE 10th Math Exam
Incorrect
Question 13 0 / 0.25 pts
CRISP-DM
SEMMA
SMAM
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 6/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Descriptive analysis
Prescriptive analysis
Diagnositic Analysis
Predictive analysis
Identify the data analytics task for the following scenario. A physics
teacher is analyzing the answer scripts of the students to identify the
areas that he/she should concentrate on so that the students
understand the concepts better.
Predictive Analytics
Diagnostic Analytics
Descriptive Analytics
Prescriptive Analytics
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 7/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Sub setting can be used to select and exclude variables and
observations
Merging concerns combining datasets on the same observations to
produce a result
Incorrect
Question 17 0 / 0.25 pts
Optimization
Classification
Regression
Clustering
Clustering
Association Rule
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 8/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Classification
Regression
Incorrect
Question 19 0 / 0.25 pts
Which of the following data science project step is the most critical step
for the success of the project?
Model Selection
Model Evaluation
Model Building
Data preprocessing
Diagnostic Analytics
Descriptive Analytics
Prescriptive Analytics
Predictive Analytics
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 9/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
A company wants to find the target segment of people for one of its
products. Which modelling technique would be generally appropriate
Classification
Clustering
Regression
Ranking
Identify the data analytics task for the following scenario. A student is
analyzing various blogs and vlogs to find out the skill set that has to be
acquired to become a data scientist in the future.
Diagnostic Analytics
Descriptive Analytics
Predictive Analytics
Prescriptive Analytics
WebSpider
WebCrawler
BeautifulSoup
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 10/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Scraper
Quantitative values
Qualitative Values
P) Distinctness
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 11/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Q) Order
P and R
P and S
P,Q and R
Interval attribute
Ordinal attribute
Nominal attribute
Ratio attribute
True
False
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 12/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Incorrect
Question 29 0 / 0.25 pts
Incorrect
Question 30 0 / 0.25 pts
2
4
1
3
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 13/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
For a data analytics task to analyse feedback on her subject for a class
of 60 students, a school teacher decided to use the survey submitted
by the ten students who come for tuitions for that subject, at her home.
Identify the type of sampling she is doing.
Systematic Sampling
Stratified sampling
Incorrect
Question 32 0 / 0.25 pts
Incorrect
Question 33 0 / 0.25 pts
The statistical description (x1,x2, . . . ,xN)/N, for the data values x1,x2, .
. . ,xN is called as their ________________
mean
https://bits-pilani.instructure.com/courses/1704/quizzes/3453 14/18
18/12/2022, 20:34 Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
IQR
median
mode
Which of the following statements are true with respect to data quality
issues?
The given data set should not miss any values or attributes
Pre-processing of data is required to address the problems of
inconsistency, incompleteness.
If data are not updated from time to time, there will be a negative impact
on data quality.
Incorrect
Question 36 0 / 0.25 pts
[0,1,1,0]
[1,0,0,1]
As a data scientist, while cleaning the data, you are not concerned with
why the data values are missing. You just fix them.
As a data scientist, even if you know that certain outliers are valid data,
you might still omit them from model construction.
It may not be a good idea to drop a data field with missing values.
As a data scientist, even if you know that certain outliers are valid data,
you might still omit them from model construction.
Incorrect
Question 38 0 / 0.25 pts
A. it can’t be done
In a boxplot, where Q1, Q2 and Q3 are the first, second and third
quartiles respectively, the interquartile range IQR is calculated as:
IQR = Q3 – Q1
IQR = Q2-Q1
IQR = Q3-Q2
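As a rough illustration of the boxplot question above, the interquartile range can be sketched in Python as the distance between the first and third quartiles. The helper name `interquartile_range` and the sample values are assumptions made for this example and do not come from the quiz. The quartiles use the default method of `statistics.quantiles`, so other quartile conventions may give slightly different numbers.

```python
import statistics


def interquartile_range(values):
    """Return Q3 - Q1 using the standard library's default quartile method."""
    # statistics.quantiles with n=4 returns the three cut points Q1, Q2, Q3.
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


# Hypothetical sample data, chosen only for illustration.
sample = [1, 2, 3, 4, 5, 6, 7]
iqr = interquartile_range(sample)
print(iqr)
```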
Incorrect
Question 40 0 / 0.25 pts
MEAN
RANGE
MEDIAN
MODE
Quiz 1
Due Dec 19 at 23:59 Points 10 Questions 40
Available Dec 18 at 19:00 - Dec 19 at 23:59 Time Limit 60 Minutes
Attempt History
Attempt Time Score
LATEST Attempt 1 60 minutes 6.5 out of 10
data wrangling
Data Analyst
Data Scientist
Data Architect
Data Journalist
data manipulation
data mining
data analytics
Privacy Checker
Recommendation Systems
An organization that has little need for data analytics and is just
embarking on the analytics path will most likely structure its data team
in a
Centralized model
Federated model
Consulting model
Functional model
Suppose that the minimum and maximum values for an attribute are 4.3
and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale [0.0,1.0]. (Answer should have a
precision of X.XX)
0.33
0.43
0.36
0.45
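The min-max question above can be checked with a short sketch. The function name `min_max_scale` is an assumption made for this example. The formula is the usual one for rescaling to [0.0, 1.0]: subtract the minimum and divide by the range.

```python
def min_max_scale(value, old_min, old_max):
    """Rescale value from [old_min, old_max] to [0.0, 1.0]."""
    return (value - old_min) / (old_max - old_min)


# Values taken from the question: min 4.3, max 7.6, value 5.4.
scaled = min_max_scale(5.4, 4.3, 7.6)
print(f"{scaled:.2f}")
```

Here the numerator is 1.1 and the denominator is 3.3, so the scaled value is about one third.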
0.765
0.166
0.455
0.234
Compute the width of each bin for data given below, if the number of
bins is 5.
[39, 45, 49, 45, 31, 37, 38, 41, 37, 41, 39, 34, 35, 30, 47, 43, 44, 46, 48,
36]
6
4
(49−30)÷5
3
5
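For the bin-width question above, a minimal sketch of equal-width binning follows, assuming the common definition width = (max - min) / number of bins. The function name `equal_width` is hypothetical. Some course material rounds the width up to a whole number, which this sketch does not do.

```python
def equal_width(values, bins):
    """Width of each bin when the value range is split into equal intervals."""
    return (max(values) - min(values)) / bins


# Data from the question.
data = [39, 45, 49, 45, 31, 37, 38, 41, 37, 41,
        39, 34, 35, 30, 47, 43, 44, 46, 48, 36]

width = equal_width(data, 5)  # (49 - 30) / 5
print(width)
```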
Suppose that the minimum and maximum values for an attribute are 4.3
and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale [0.0,1.0]. (Answer should have a
precision of X.XX)
0.33
0.29
0.30
0.38
Deployment
Data Modelling
Data preparation
Data collection
What is the average score of all students in the CBSE 10th Math Exam
Incorrect
Question 13 0 / 0.25 pts
CRISP-DM
SEMMA
SMAM
Descriptive analysis
Prescriptive analysis
Diagnostic analysis
Predictive analysis
Identify the data analytics task for the following scenario. A physics
teacher is analyzing the answer scripts of the students to identify the
areas that he/she should concentrate on so that the students
understand the concepts better.
Predictive Analytics
Diagnostic Analytics
Descriptive Analytics
Prescriptive Analytics
Subsetting can be used to select and exclude variables and
observations
Merging concerns combining datasets on the same observations to
produce a result
Incorrect
Question 17 0 / 0.25 pts
Optimization
Classification
Regression
Clustering
Clustering
Association Rule
Classification
Regression
Incorrect
Question 19 0 / 0.25 pts
Which of the following data science project steps is the most critical
for the success of the project?
Model Selection
Model Evaluation
Model Building
Data preprocessing
Diagnostic Analytics
Descriptive Analytics
Prescriptive Analytics
Predictive Analytics
A company wants to find the target segment of people for one of its
products. Which modelling technique would be generally appropriate?
Classification
Clustering
Regression
Ranking
Quiz 1
Due Dec 19 at 23:59 Points 10 Questions 40
Available Dec 18 at 19:00 - Dec 19 at 23:59 Time Limit 60 Minutes
Attempt History
Attempt Time Score
LATEST Attempt 1 60 minutes 5.38 out of 10
Correct answers will be available on Dec 22 at 0:00.
Incorrect Question 1 0 / 0.25 pts
Statement 1 is wrong and Statement 2 is correct
Both statements are correct
Both statements are wrong
Incorrect
Question 2 0 / 0.25 pts
Which among the following is a more flexible model with right balance
of
Functional
Consulting
Federated
Centralized
Question 3 0.25 / 0.25 pts
True
False
Question 4 0.25 / 0.25 pts
Storyteller
Data Analyst
Big Data Engineer
Machine Learning Engineer
Question 5 0.25 / 0.25 pts
Machine Learning
Data Science
Deep Learning
IoT
Question 6 0.25 / 0.25 pts
Data Scientist, Analyst
Data Scientist, Architect, SME, Sponsor, Programmer
Data Scientist, Analyst, SME, Programmer
Data Scientist, Analyst, Sponsor
Question 7 0.25 / 0.25 pts
Suppose that the minimum and maximum values for an attribute are
4.3 and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale [0.0,1.0]. (Answer should have a
precision of X.XX)
0.43
0.33
0.45
0.36
Question 8 0.25 / 0.25 pts
Compute the depth of each bin for data given below, if the number of
bins is 5.
[47, 38, 40, 42, 47, 42, 41, 39, 42, 45, 36, 37, 36, 40, 43, 40, 45, 36,
43, 46]
4
20/5
3
5
6
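The bin-depth question above can be illustrated with a sketch of equal-depth (equal-frequency) binning, where each bin holds the same number of sorted values. The function name `equal_depth_bins` is an assumption made for this example, and the sketch assumes the number of values divides evenly by the number of bins.

```python
def equal_depth_bins(values, bins):
    """Split the sorted values into `bins` groups of equal size (depth)."""
    ordered = sorted(values)
    depth = len(ordered) // bins  # assumes len(values) is divisible by bins
    groups = [ordered[i:i + depth] for i in range(0, len(ordered), depth)]
    return depth, groups


# Data from the question: 20 values, 5 bins.
data = [47, 38, 40, 42, 47, 42, 41, 39, 42, 45,
        36, 37, 36, 40, 43, 40, 45, 36, 43, 46]

depth, groups = equal_depth_bins(data, 5)
print(depth)
for group in groups:
    print(group)
```

With 20 values and 5 bins the depth is 20 / 5. The same call with 4 bins gives a depth of 20 / 4.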
Question 9 0.25 / 0.25 pts
Compute the depth of each bin for data given below, if the number of
bins is 4.
[47, 38, 40, 42, 47, 42, 41, 39, 42, 45, 36, 37, 36, 40, 43, 40, 45, 36,
43, 46]
4
3
5
6
Question 10 0.25 / 0.25 pts
Compute the width of each bin for data given below, if the number of
bins is 4.
[39, 45, 49, 45, 31, 37, 38, 41, 37, 41, 39, 34, 35, 30, 47, 43, 44, 46,
48, 36]
6
3
4
5
(49-30)/4
Question 11 0.25 / 0.25 pts
Association
Regression
Classification
Reinforcement
Incorrect Question 12 0 / 0.25 pts
Diagnostic analysis
Descriptive analysis
Predictive analysis
Prescriptive analysis
Question 13 0.25 / 0.25 pts
Examining the data they're keeping and reviewing how it's being used
has little or no value for firms who aren't currently aiming to undertake
big data analytics.
False
True
Question 14 0.25 / 0.25 pts
Data collection
Deployment
Data preparation
Data Modelling
Incorrect
Question 15 0 / 0.25 pts
Diagnostic Analytics
Prescriptive Analytics
Descriptive Analytics
Predictive Analytics
Incorrect
Question 16 0 / 0.25 pts
Which of the sentences below best describes predictive analytics?
methods for predicting future behavior using statistical analysis and
data mining, particularly to maximize the strategic value of corporate
intelligence
software applications, also known as "bots," that are dispatched to carry
out a mission and gather data from web pages on behalf of a user
a gateway that offers access to a variety of vital information from many
different sources on one screen
a method that uses feature analysis to predict people in photos and
tags them to other photos on its own
Incorrect
Question 17 0 / 0.25 pts
It focuses on removing inaccurate data from your data set
It enhances the data’s accuracy and integrity
All of the given options
It focuses on transforming the data’s format by converting raw data into
another format
Partial
Question 18 0.13 / 0.25 pts
Question 19 0.25 / 0.25 pts
Identify the data analytics task for the following scenario. The team
leader aggregates the sales data from various geographical areas and
reports the penetration of each product.
Prescriptive Analytics
Descriptive Analytics
Diagnostic Analytics
Predictive Analytics
Incorrect
Question 20 0 / 0.25 pts
Identify the data analytics task for the following scenario. A project
manager is analysing past projects to identify the risk involved and how
they were mitigated.
Predictive Analytics
Prescriptive Analytics
Diagnostic Analytics
Descriptive Analytics
Question 21 0.25 / 0.25 pts
Identify the data analytics task for the following scenario. Google is
using tools to suggest texts or phrases while composing emails.
Descriptive Analytics
Diagnostic Analytics
Prescriptive Analytics
Predictive Analytics
Question 22 0.25 / 0.25 pts
Identify the data analytics task for the following scenario. A student is
analyzing various blogs and vlogs to find out the skill set that has to be
acquired to become a data scientist in the future.
Prescriptive Analytics
Descriptive Analytics
Diagnostic Analytics
Predictive Analytics
Question 23 0.25 / 0.25 pts
Raw data is the data obtained after processing steps
None of the mentioned
Preprocessed data is original source of data
Raw data is original source of data
Incorrect
Question 24 0 / 0.25 pts
"Order Fulfilment Date" should come after "Order Creation Date". This
is an example of which data quality aspect:
Integrity
Conformity
Timeliness
Consistency
Question 25 0.25 / 0.25 pts
Data point
Variables
Dimensions
Features
Question 26 0.25 / 0.25 pts
True
False
Question 27 0.25 / 0.25 pts
preprocessed data is original source of data
none of the options
raw data is the data obtained after processing steps
raw data is original source of data
Question 28 0.25 / 0.25 pts
Continuous attribute
Ordinal attribute
Nominal attribute
Numeric attribute
Question 29 0.25 / 0.25 pts
Binarisation
Normalisation
Standardisation
Sampling
Question 30 0.25 / 0.25 pts
Data Integration is a
Data Normalization Technique
Pre-processing technique
Generalization technique
None of the answers
Incorrect Question 31 0 / 0.25 pts
Clustering of similar data from different sources
For the same real world entity, resolving attribute values from different
sources
None of the above
Identification of missing rows for identified key values
Unanswered Question 32 0 / 0.25 pts
mean of a value
most frequent attribute value
least frequent attribute value
none of the above
Unanswered Question 33 0 / 0.25 pts
Distance between the first and third quartile
Distance between the first and second quartile
Distance between the first and fourth quartile
Distance between the second and third quartile
Unanswered Question 34 0 / 0.25 pts
For a data analytics task to analyse feedback on her subject for a class
of 60 students, a school teacher decided to use the survey submitted
by the ten students who come for tuitions for that subject, at her home.
Identify the type of sampling she is doing.
Systematic Sampling
Sampling without replacement
Stratified sampling
Non Probabilistic sampling
Unanswered Question 35 0 / 0.25 pts
lb = sklearn.preprocessing.LabelBinarizer()
[0,1,1,0]
[true, false, false, true]
[True, False, False, True]
[1,0,0,1]
Unanswered Question 36 0 / 0.25 pts
Redundancy
Accuracy & Efficiency
Integration
Noise
Inconsistencies
Unanswered Question 37 0 / 0.25 pts
False
Unanswered Question 38 0 / 0.25 pts
There are two sets X={10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24} and Y = {-30, -31, -32, -33, -34, -35, -36, -37, -38, -39, -40, -41,
-42, -43, -44}. What is TRUE about the standard deviations of X and Y
i.e. σX and σY respectively?
σX will be smaller than σY.
Magnitude will be the same but the sign will be different
σY will be smaller than σX.
Will be the same.
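The comparison in the question above can be explored with a short sketch using the standard library. It assumes the population standard deviation (`statistics.pstdev`); the sample version (`statistics.stdev`) leads to the same comparison. The variable names are chosen only for this example.

```python
import math
import statistics

# The two sets from the question: 15 consecutive integers each.
x = list(range(10, 25))         # 10, 11, ..., 24
y = list(range(-30, -45, -1))   # -30, -31, ..., -44

sigma_x = statistics.pstdev(x)
sigma_y = statistics.pstdev(y)

# Standard deviation measures spread around the mean, so shifting or
# mirroring the values does not change it, and it is never negative.
print(sigma_x, sigma_y, math.isclose(sigma_x, sigma_y))
```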
Unanswered Question 39 0 / 0.25 pts
Treatment of outliers depends on the problem statement
Outliers should always be addressed in the dataset
Outliers should be addressed only in the training dataset
Outliers should be addressed in the test dataset
Unanswered Question 40 0 / 0.25 pts
Matrix
Lexeme
ConeTree
TreeMap
Quiz Score:
5.38 out of 10
12/18/22, 8:45 PM Quiz 1: Introduction to Data Science (S1-22_DSECLZG532)
Quiz 1
Due Dec 19 at 11:59pm Points 10 Questions 40
Available Dec 18 at 7pm - Dec 19 at 11:59pm Time Limit 60 Minutes
Attempt History
Attempt Time Score
processing data
analysing data
organizing data
Incorrect
Question 3 0 / 0.25 pts
i, ii, iii
iv, v
ii, iii
i, iv, v
True
False
data mining
data analytics
data manipulation
5
8
6
7
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 3.2. [Answer should have a precision of X.XXXX]
0.8769
-0.8679
0.8679
-0.8769
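The z-score question above can be worked through with a small sketch. The function name `z_score` is an assumption for this example. The formula is the standard one: subtract the mean and divide by the standard deviation.

```python
def z_score(value, mean, std_dev):
    """Standardize value given the attribute's mean and standard deviation."""
    return (value - mean) / std_dev


# Values from the question: mean 8.9, standard deviation 6.5, value 3.2.
z = z_score(3.2, 8.9, 6.5)
print(f"{z:.4f}")
```

Because 3.2 lies below the mean, the result is negative: (3.2 - 8.9) / 6.5 = -5.7 / 6.5.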
Compute the width of each bin for data given below, if the number of
bins is 4.
[39, 45, 49, 45, 31, 37, 38, 41, 37, 41, 39, 34, 35, 30, 47, 43, 44, 46,
48, 36]
4
3
5
(49-30)/4
6
5
4
7
6
Classification
Clustering
Regression
Optimization
Consider a scenario where you feel unwell and go to a doctor. The doctor
asks you questions such as whether you were exposed to rain or a cold
climate, had contact with a sick person, or had food from outside.
Based on your answers, the doctor comes to a conclusion. This can be
considered analogous to which stage of data analytics?
Descriptive
Predictive
Prescriptive
Diagnostic
Classification
Association Rule
Clustering
Regression
A company wants to find the target segment of people for one of its
products. Which modelling technique would be generally appropriate?
Clustering
Classification
Ranking
Regression
True
False
Data pre-processing improves data quality and makes data mining
algorithms efficient and effective. What are the data pre-processing
tasks along with data cleaning?
Data Transformation
Data Integration
Data Reduction
Assess Situation - "Cost and Benefits" - falls into which phase of
CRISP-DM?
Data Preparation
Data modeling
Evaluation
Business Understanding
Incorrect
Question 19 0 / 0.25 pts
Identify the data analytics task for the following scenario. A software
programmer develops a tool to convert programs written in Verilog to
Python.
Predictive Analytics
Prescriptive Analytics
Diagnostic Analytics
Descriptive Analytics
Incorrect
Question 20 0 / 0.25 pts
Identify the data analytics task for the following scenario. A student is
analyzing various blogs and vlogs to find out the skill set that has to be
acquired to become a data scientist in the future.
Predictive Analytics
Prescriptive Analytics
Diagnostic Analytics
Descriptive Analytics
In 2001, Big Data created the three Vs: volume, velocity, and variety.
The V's have grown to encompass veracity and value in the years
since. Big data is sometimes subjected to a fifth V, which is:
Volatile
Vector
Variability
Vulnerability
Applied statistics
Industry statistics
Economic statistics
What are the best practices for implementing big data analytics
programmes?
Focusing on business goals and how to use big data analytics
technologies to meet them
Adopting data analysis tools based on a laundry list of their capabilities
Zip codes
Exam Grades
Military ranks
Academic ranks
BeautifulSoup
WebCrawler
Scraper
WebSpider
The distance between categories is equal across the range of
interval/ratio data
Ordinal variables have a fixed zero point, whereas interval/ratio
variables do not
Variables
Features
Dimensions
Data point
Symmetric attribute
Discrete attribute
Continuous attribute
Asymmetric attribute
Data retrieval
Data cleansing
Data transformation
Data combining
MEDIAN
MEAN
MODE
RANGE
In which phase are duplicates (of the data) removed? Choose the
best possible answer.
Data Preparation
Data Collection
Data Understanding
Data Exploration
Matrix
TreeMap
Lexeme
ConeTree
There are two sets X={10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24} and Y = {-30, -31, -32, -33, -34, -35, -36, -37, -38, -39, -40, -41,
-42, -43, -44}. What is TRUE about the standard deviations of X and Y
i.e. σX and σY respectively?
Systematic Sampling
Stratified sampling
Which among the following are valid methods of handling missing data?
Q and R
R only
P, Q and R
For the same real world entity, resolving attribute values from different
sources
I. The smaller data sets resulting from data reduction require less
memory and processing time.
Statement I and II
Statement I and IV
mean of a value
Question 1
0.25 / 0.25 pts
Pattern Recognition is a sub-field of Data Science.
True
False
Partial
Question 2
0.17 / 0.25 pts
Which of the following are the reasons for the sudden growth of analytics?
Large number of user friendly analytics tools available for data processing
Question 3
0.25 / 0.25 pts
A city conducted a new bi-annual census of its residents. Which of the following most
strongly suggests a cognitive bias in their collected dataset?
The census data includes new data on how many cars are owned by the residents.
Some rows have date of birth in DD-MM-YY and some in DD-MM-YYYY formats
Question 4
0.25 / 0.25 pts
Due to market expectations, businesses are having difficulty retaining highly trained data
scientists and engineers.
False
True
Question 5
0.25 / 0.25 pts
Data science is an interdisciplinary field that has minimal overlap with which of the below?
Software Engineering
Machine Learning
Statistical Analysis
Artificial Intelligence
Question 6
0.25 / 0.25 pts
Statement 1: The role of a Business Analyst usually requires expertise in building ML models.
Statement 2: The role of a Data Scientist usually requires expertise in performing descriptive
analysis.
Which of the following is right?
Question 7
0.25 / 0.25 pts
Suppose that the minimum and maximum values for an attribute are 4.3 and 7.6,
respectively. Compute the scaled value of 5.4 if min-max normalization is applied to scale
[-1.0,+1.0]. (Answer should have a precision of X.XX)
0.76
0.66
0.33
-0.33
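The arithmetic can be reproduced with a short script. This is a minimal sketch; the helper name `min_max_scale` and its parameters are illustrative and not part of the quiz.

```python
def min_max_scale(value, old_min, old_max, new_min=-1.0, new_max=1.0):
    """Linearly map value from [old_min, old_max] to [new_min, new_max]."""
    fraction = (value - old_min) / (old_max - old_min)
    return fraction * (new_max - new_min) + new_min


# Values from the question: min 4.3, max 7.6, value 5.4, target range [-1.0, +1.0].
scaled = min_max_scale(5.4, 4.3, 7.6)
print(f"{scaled:.2f}")  # -0.33
```

Step by step: (5.4 - 4.3) / (7.6 - 4.3) = 1.1 / 3.3 ≈ 0.3333, and 0.3333 × 2 - 1 ≈ -0.33.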
Question 8
0.25 / 0.25 pts
Suppose that the mean and standard deviation of the values for an attribute are 8.9 and
6.5, respectively. Apply z-score normalization to a value of 10.7.
0.2679
-0.2769
-0.2679
0.2769
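Z-score normalization subtracts the mean and divides by the standard deviation. A minimal sketch, with `z_score` as an illustrative helper name:

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev


# Values from the question: mean 8.9, standard deviation 6.5, value 10.7.
z = z_score(10.7, 8.9, 6.5)
print(f"{z:.4f}")  # 0.2769
```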
Question 9
0.25 / 0.25 pts
Compute the Euclidean distance between A(2,3) and B(5,7).
7
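The distance can be verified with the standard library. A minimal sketch, assuming Python 3.8 or later for `math.dist`; the Manhattan distance is computed only for contrast.

```python
import math

A = (2, 3)
B = (5, 7)

# Euclidean (straight-line) distance: sqrt((5 - 2)^2 + (7 - 3)^2).
euclidean = math.dist(A, B)

# Manhattan (city-block) distance, shown only for contrast.
manhattan = sum(abs(a - b) for a, b in zip(A, B))

print(euclidean, manhattan)  # 5.0 7
```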
Question 11
0.25 / 0.25 pts
Assess Situation - "Cost and Benefits" - falls into which phase of CRISP-DM?
Evaluation
Data modeling
Business Understanding
Data Preparation
Question 12
0.25 / 0.25 pts
Suppose a web user visits Flipkart during big billion-day sales. Predicting whether he / she
makes a purchase of a smartphone is a ___________________ task.
Association
Classification
Reinforcement
Regression
Question 15
0.25 / 0.25 pts
A course instructor has data about students' attendance in her course in the past
semester. What kind of analytics is she performing when she creates a line graph based
on this data?
Predictive
Descriptive
Diagnostic
Prescriptive
Question 18
0.25 / 0.25 pts
Analyzing the data to determine why some phenomena related to learning happened is a
type of ___ analytics.
Diagnostic
Descriptive
Prescriptive
Predictive
Question 19
0.25 / 0.25 pts
Regression is
(1) Prediction of a value of a given continuous valued variable based on the values of other
variables.
(2) Regression works with a linear or nonlinear model of dependency among the variables.
Option 1
None
Option 2
Both 1 & 2
Question 20
0.25 / 0.25 pts
The branch of statistics which deals with the development of statistical methods is
classified as ___.
Applied statistics
Industry statistics
Economic statistics
Question 21
0.25 / 0.25 pts
Google tries to differentiate emails as spam or non-spam. This is an example of:
Clustering
Classification
Regression
Association Rule
Question 22
0.25 / 0.25 pts
Identify the data analytics task for the following scenario. An e-commerce platform is
recommending products to their customers to improve the shopping experience.
Diagnostic Analytics
Predictive Analytics
Descriptive Analytics
Prescriptive Analytics
Question 23
0.25 / 0.25 pts
In a dataset, CarColor is one of the attributes, and it can take the following values: {Red,
Green, Yellow, Black}. What type of attribute is CarColor?
Interval attribute
Ratio attribute
Ordinal attribute
Nominal attribute
Question 24
0.25 / 0.25 pts
What are the best practices for implementing big data analytics programmes?
Focusing on business goals and how to use big data analytics technologies to meet them
Question 25
0.25 / 0.25 pts
Is it possible to rescale continuous data for better data understanding?
True
False
Question 26
0.25 / 0.25 pts
As part of a survey in a large organization, one of the features that you capture is
designation. This type of data has the characteristic
Question 27
0.25 / 0.25 pts
"Order Fulfilment Date" should come after "Order Creation Date". This is an example of
which data quality aspect:
Consistency
Integrity
Conformity
Timeliness
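A rule of this kind can be checked mechanically. The sketch below uses hypothetical order records and field names (`created`, `fulfilled`) and assumes that same-day fulfilment is acceptable; none of these details come from the quiz.

```python
from datetime import date

# Hypothetical order records; field names are illustrative only.
orders = [
    {"id": 1, "created": date(2022, 12, 1), "fulfilled": date(2022, 12, 3)},
    {"id": 2, "created": date(2022, 12, 5), "fulfilled": date(2022, 12, 4)},
    {"id": 3, "created": date(2022, 12, 7), "fulfilled": date(2022, 12, 7)},
]

# Flag every record whose fulfilment date precedes its creation date.
# Assumption: same-day fulfilment (order 3) is treated as valid.
violations = [o["id"] for o in orders if o["fulfilled"] < o["created"]]
print(violations)  # [2]
```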
Question 28
0.25 / 0.25 pts
Data lake mainly stores data in
none
Structured format
Raw format
both
Question 29
0.25 / 0.25 pts
What does discretization do?
Question 32
0.25 / 0.25 pts
Consider the following Python code.
import numpy as np
from sklearn.preprocessing import Binarizer
exam1 = np.array([41, 43, 45, 47, 49])
b1 = Binarizer(threshold=exam1.mean())
exam1_b = b1.fit_transform(exam1.reshape(-1, 1))
print(exam1_b)
The last line of the code snippet will print
[0,0,1,0,0]
[1,1,0,1,1]
[0,0,0,1,1]
[1,1,1,0,0]
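scikit-learn's `Binarizer` maps values strictly greater than the threshold to 1 and all other values to 0. The following dependency-free sketch reproduces that rule for the data in the question; it illustrates the rule and is not the library's implementation. Because of `reshape(-1, 1)`, the real output is a column with one row per sample, which the nested lists mirror here.

```python
exam1 = [41, 43, 45, 47, 49]
threshold = sum(exam1) / len(exam1)  # mean of the scores = 45.0

# Same rule as Binarizer: strictly greater than the threshold -> 1, else 0.
# One inner list per sample, mirroring the column shape from reshape(-1, 1).
exam1_b = [[1 if score > threshold else 0] for score in exam1]
print(exam1_b)  # [[0], [0], [0], [1], [1]]
```

Note that 45 equals the threshold and therefore maps to 0.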
Question 33
0.25 / 0.25 pts
The scatterplot implies that
Question 34
0.25 / 0.25 pts
Which data visualization is appropriate to explore the relationship between two
attributes out of many attributes in a data frame?
Histogram
Scatter plot
Box-plot
Heat maps
Question 35
0.25 / 0.25 pts
A sample is ------------------ if it has approximately the same property of interest
as the original data set.
Systematic
Qualitative
Representative
Probabilistic
Question 37
0.25 / 0.25 pts
Converting the raw values of a numeric attribute is ?
Sampling
Normalization
Discretization
Smoothing
Question 38
0.25 / 0.25 pts
In the table given below, the requirement is to get the name, gender, and marks of the
top-scoring students only. Which of the following functionalities of data wrangling is
used?
Data exploration
Reshape
Replace
Filter
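The table from the question is not reproduced in this document, so the sketch below uses a hypothetical student table and an assumed cutoff of 90 marks for "top-scoring". It shows the operation described: keeping only the rows that meet a condition and only the requested columns.

```python
# Hypothetical student table; the original table is not available here.
students = [
    {"name": "Asha", "gender": "F", "marks": 92, "city": "Pune"},
    {"name": "Ravi", "gender": "M", "marks": 78, "city": "Delhi"},
    {"name": "Meena", "gender": "F", "marks": 95, "city": "Chennai"},
    {"name": "Kiran", "gender": "M", "marks": 64, "city": "Mumbai"},
]

CUTOFF = 90  # assumed definition of "top-scoring"

# Keep only the rows above the cutoff and only the requested columns.
top_scorers = [
    {"name": s["name"], "gender": s["gender"], "marks": s["marks"]}
    for s in students
    if s["marks"] >= CUTOFF
]
print(top_scorers)
```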
Question 39
0.25 / 0.25 pts
“Proximity” in Data Science terms means:
The area surrounding an object where the object is able to exert its influence.
Partial
Question 40
0.13 / 0.25 pts
Which of the following methods are considered to be the best practice for data cleaning?
Quiz 1
Due Dec 19 at 23:59 Points 10 Questions 40
Available Dec 18 at 19:00 - Dec 19 at 23:59 Time Limit 60 Minutes
Attempt History
Attempt Time Score
LATEST Attempt 1 54 minutes 8.63 out of 10
Storyteller
Data Analyst
Privacy Checker
Recommendation Systems
Communicative
Punctual
Creative
Technical
data wrangling
Functional
Center of Excellence
Coordinational
Consulting
98W
100W
150W
110W
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 3.2. [Answer should have a precision of X.XXXX]
0.8679
0.8769
-0.8769
-0.8679
Suppose that the mean and standard deviation of the values for an
attribute are 8.9 and 6.5, respectively. Apply z-score normalization to a
value of 10.7.
-0.2679
0.2679
0.2769
-0.2769
Suppose that the minimum and maximum values for an attribute are 4.3
and 7.6, respectively. Compute the scaled value of 5.4 if min-max
normalization is applied to scale [-1.0,+1.0]. (Answer should have a
precision of X.XX)
0.66
0.76
0.33
-0.33
SMAM
CRISP-DM
SEMMA
Artificial Intelligence
Data Wrangling
Deep Learning
Machine Learning
Prescriptive
Predictive
Diagnostic
Descriptive
Predictive
Prescriptive
Diagnostic
Descriptive
Prescriptive Analytics
Diagnostic Analytics
Predictive Analytics
Descriptive Analytics
Regression
Classification
Association Rule
Clustering
Classification
Association Rule
Regression
Clustering
None
Model planning
Operationalize
Model building
Incorrect
Question 21 0 / 0.25 pts
Descriptive Analytics
Predictive Analytics
Diagnostic Analytics
Cognitive analytics
Alerts
Adhoc reports
Predictive model
Standard report
Variables
Data point
Features
Dimensions
Nominal attribute
Numeric attribute
Ordinal attribute
Continuous attribute
The distance between categories is equal across the range of
interval/ratio data
Ordinal variables have a fixed zero point, whereas interval/ratio
variables do not
Incorrect
Question 27 0 / 0.25 pts
Unsupervised Learning
Feature Selection
Quasi-structured
Semi-structured
Unstructured
Structured
Partial
Question 29 0.13 / 0.25 pts
Incorrect
Question 30 0 / 0.25 pts
1
4
3
2
Incorrect
Question 31 0 / 0.25 pts
Identify the sampling technique used in the following use case. For an ML
classification task, the algorithm requires that the test set has an equal
number of examples from each of the three categories.
Systematic Sampling
Stratified Random
Cluster Sampling
Simple Random
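Drawing the same number of examples from each category means splitting the data by category first and then sampling randomly within each group. The following is a minimal sketch using only the standard library; the function name, the data, and the labels "a", "b", "c" are hypothetical.

```python
import random
from collections import Counter, defaultdict


def equal_per_class_sample(records, label_of, per_class, seed=0):
    """Draw per_class random records from every category (stratum)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_class = defaultdict(list)
    for record in records:
        by_class[label_of(record)].append(record)
    sample = []
    for label in sorted(by_class):
        sample.extend(rng.sample(by_class[label], per_class))
    return sample


# Hypothetical imbalanced data set with three categories.
data = (
    [("a", i) for i in range(50)]
    + [("b", i) for i in range(30)]
    + [("c", i) for i in range(20)]
)

test_set = equal_per_class_sample(data, label_of=lambda r: r[0], per_class=10)
counts = Counter(label for label, _ in test_set)
print(counts["a"], counts["b"], counts["c"])  # 10 10 10
```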
Mostly True
False
True
Data transformation
Data retrieval
Data cleansing
Data combining
“Similarity” means:
The number of tuples in a database whose attributes have similar
values.
Noise
Inconsistencies
Integration
Redundancy
A. it can’t be done
In a boxplot, where Q1, Q2 and Q3 are the first, second and third
quartiles respectively, the interquartile range IQR is calculated as:
IQR = Q3 – Q1
IQR = Q3-Q2
IQR = Q2-Q1
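The quartiles and the interquartile range can be computed with the standard library. A minimal sketch on hypothetical data, assuming Python 3.8 or later for `statistics.quantiles`; note that different quartile conventions (here `method="inclusive"`) can give slightly different values on other data sets.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical sample

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# The interquartile range spans the middle 50% of the data.
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 3.0 5.0 7.0 4.0
```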
2
3
4
1
From mathematical models for epidemics, one observes that the initial
phase shows exponential growth. This can be verified by plotting the
number of infections on a log scale. This is a use case for which
technique?
Transformation
Aggregation
Discretization
Sampling
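The log-scale check can be illustrated numerically: if counts grow exponentially, their logarithms increase by a constant amount per time step, which shows up as a straight line on a log-scale plot. A minimal sketch with hypothetical case counts that double every day:

```python
import math

# Hypothetical case counts that double every day (pure exponential growth).
infections = [10 * 2 ** day for day in range(8)]

# Take the logarithm of each count.
log_counts = [math.log(n) for n in infections]

# Constant step between consecutive log values means a straight line on a log scale.
steps = [b - a for a, b in zip(log_counts, log_counts[1:])]
is_linear = all(math.isclose(step, math.log(2)) for step in steps)
print(is_linear)  # True
```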