
03/12/2022, 23:22 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due Dec 4 at 19:00 Points 5 Questions 20
Available Dec 3 at 19:00 - Dec 4 at 19:00 24 hours Time Limit 30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1 2 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5
Submitted Dec 3 at 23:22
This attempt took 2 minutes.

Question 1 0.25
/ 0.25 pts

_________________ is the science of collecting, analyzing, presenting,
and interpreting data. Its objective is to draw conclusions about the population.

 
Data Analytics

 
Business Intelligence

 
Statistics

 
Computational Intelligence

https://bits-pilani.instructure.com/courses/1726/quizzes/3386 1/10

Question 2 0.25
/ 0.25 pts

___________ is considered to be a sub-field of, or one of the tools of,
AI, which provides machines with the capability of learning from
experience.

 
Data analysis

 
Machine Learning

 
Artificial Intelligence

 
Data Science

Question 3 0.25
/ 0.25 pts

Which one of these belongs to a Data Science career?

 
Data Creator

 
Data Administrator

 
None

 
Mathematical Analyst

Question 4 0.25
/ 0.25 pts


How is data science understood in terms of interpretability and
accuracy?

 
High Accuracy, Low Interpretability

 
low Accuracy, high Interpretability

Question 5 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Data related

 
Organization related

 
All three

 
Technology related

Question 6 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
False

 
True


Question 7 0.25
/ 0.25 pts

In which phase are duplicates (of the data) removed? Choose the
best possible answer.

 
Data Preparation

 
Data Understanding

 
Data Requirements

 
Data Collection
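Note: duplicate removal, as asked about in this question, can be sketched in a few lines of Python. This is only an illustration; the function name `drop_duplicates` and the sample records are assumptions made for the example and do not come from the course material.

```python
def drop_duplicates(records):
    """Return the records with exact duplicate rows removed.

    The first occurrence of each row is kept and input order is preserved.
    """
    seen = set()
    unique = []
    for rec in records:
        # A dict is not hashable, so use a sorted tuple of its items as a fingerprint.
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical raw data containing one exact duplicate row.
raw = [
    {"id": 1, "city": "Pune"},
    {"id": 2, "city": "Delhi"},
    {"id": 1, "city": "Pune"},
]
clean = drop_duplicates(raw)
print(len(raw), "rows before,", len(clean), "rows after")
```

Only exact duplicates are dropped here; near-duplicates (for example, differing capitalization) would need an extra normalization step.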

Question 8 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Label

 
Data Wrangling

 
Data Annotation

 
Data Writing

Question 9 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Creation of new fiscal policy


 
Product recommender systems

 
Fraud detection and prevention system in a bank

 
Targeted advertising as per customer's need

Question 10 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Collection

 
Data Preparation

 
Data Understanding

 
Data Cleaning

Question 11 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-
functional requirements such as security, fault tolerance and efficiency
in data processing applications?

 
Data Analyst

 
Data Scientist

 
Data Engineer

 
Business Intelligence Analyst


Question 12 0.25
/ 0.25 pts

While playing cards, predicting whether the next card will be a joker is a

 
Machine Learning Activity

 
None of the above

 
Probability Calculation activity

 
Data Science Activity
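Note: the chance of drawing a joker can be worked out directly, without any model or data. The sketch below assumes a 54-card deck with 2 jokers, which the question does not state; `probability_of_joker` is a hypothetical helper name.

```python
from fractions import Fraction


def probability_of_joker(jokers_left, cards_left):
    """Probability that the next card drawn is a joker."""
    if cards_left <= 0 or not 0 <= jokers_left <= cards_left:
        raise ValueError("invalid card counts")
    return Fraction(jokers_left, cards_left)


# Assumption: a full 54-card deck that includes 2 jokers.
p_first_draw = probability_of_joker(2, 54)

# After 10 non-joker cards have been dealt, 44 cards remain, 2 of them jokers.
p_later_draw = probability_of_joker(2, 44)

print(p_first_draw, p_later_draw)
```

Using `Fraction` keeps the result exact instead of a rounded float.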

Question 13 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are negatively correlated

 
The features are independent

 
The features are positively correlated
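Note: the scatterplot itself is not reproduced in this printout. As a rough sketch of how the direction of a linear relationship could be checked numerically, the code below computes the Pearson correlation coefficient. The sample data, the function names and the 0.3 cut-off are all illustrative assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def describe(r, threshold=0.3):
    """Map a coefficient to a coarse label; the threshold is an arbitrary choice."""
    if r > threshold:
        return "positively correlated"
    if r < -threshold:
        return "negatively correlated"
    return "no clear linear relationship"


# Hypothetical feature values.
feature_a = [1, 2, 3, 4, 5]
feature_b = [2, 4, 5, 4, 5]
r = pearson_r(feature_a, feature_b)
print(round(r, 3), describe(r))
```

A coefficient near zero only rules out a linear relationship; it does not by itself show that the features are independent.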


Question 14 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic
process?

 
Data Warehouse

 
Data Mining

 
Business Intelligence

 
Data Science

Question 15 0.25
/ 0.25 pts

Suppose you visit a doctor for a fever and the doctor asks you
questions such as: Were you exposed to rain or a cold climate? Did you
have contact with a sick person? Did you eat food from outside? The
doctor then reaches a conclusion based on your responses. This
scenario is analogous to which stage of data analytics?

 
Diagnostic

 
Predictive

 
Descriptive

 
Prescriptive


Question 16 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Recommendation Systems

 
Image & Speech Recognition

 
Privacy Checker

 
Online Price Comparison

Question 17 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Prescriptive Analytics

 
Predictive Analytics

 
Diagnostic Analytics

 
Descriptive Analytics

Question 18 0.25
/ 0.25 pts

Big data is defined as ______________?


 
The process of extracting and creating information from raw data by
using different techniques

 
An engineering discipline that is concerned with all aspects of software
production.

 
Collections of datasets whose volume, velocity or variety is so large that
it is difficult to store, manage, process and analyze the data using
traditional databases and data processing tools

 
The process of making machines capable of mimicking human behavior

Question 19 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Volume

 
Velocity

 
Variety

 
Vision

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?


 
What will happen?

 
What happened?

 
What should I do?

 
Why did it happen?
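Note: the four analytics types that recur in this quiz are conventionally paired with one guiding question each. The lookup below is a minimal sketch of that pairing; the dictionary and function names are illustrative assumptions.

```python
# Conventional pairing of analytics types with the question each one answers.
ANALYTICS_QUESTIONS = {
    "descriptive": "What happened?",
    "diagnostic": "Why did it happen?",
    "predictive": "What will happen?",
    "prescriptive": "What should I do?",
}


def analytics_type_for(question):
    """Return the analytics type whose guiding question matches the input."""
    wanted = question.strip().lower()
    for kind, guiding_question in ANALYTICS_QUESTIONS.items():
        if guiding_question.lower() == wanted:
            return kind
    raise KeyError(question)


for kind, guiding_question in ANALYTICS_QUESTIONS.items():
    print(f"{kind:>12}: {guiding_question}")
```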

Quiz Score:
5 out of 5

12/4/22, 8:21 AM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due Dec 4 at 19:00 Points 5 Questions 20
Available Dec 3 at 19:00 - Dec 4 at 19:00 24 hours Time Limit 30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1 12 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5
Submitted Dec 4 at 8:21
This attempt took 12 minutes.

Question 1 0.25
/ 0.25 pts


What is the process of training machine learning models, evaluating their performance, and using them to
make predictions?

 
Visualization

 
Feature Engineering

 
Predictive Modelling

 
Data Exploration
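Note: the three steps named in this question (train, evaluate, predict) can be sketched with a one-variable least-squares line. The data, the helper names and the choice of mean absolute error as the metric are assumptions made for illustration only.

```python
def fit_line(xs, ys):
    """Train: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict: apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Evaluate: average absolute gap between predictions and held-out targets."""
    return sum(abs(predict(model, x) - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data that follows y = 2x + 1 exactly.
train_x, train_y = [1, 2, 3, 4], [3, 5, 7, 9]
test_x, test_y = [5, 6], [11, 13]

model = fit_line(train_x, train_y)                  # train
mae = mean_absolute_error(model, test_x, test_y)    # evaluate on unseen data
forecast = predict(model, 10)                       # predict
print(model, mae, forecast)
```

Evaluating on data that was not used for fitting is what separates this from simply describing the training set.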

Question 2 0.25
/ 0.25 pts

The Data Warehouse is a “Schema-on-Load” approach.

 
True

 
False
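Note: "schema-on-load" means the structure of the data is enforced when it is written into the store. The sketch below contrasts that with parsing raw text only at read time. The schema, the function names and the comma-separated record format are hypothetical and chosen only to make the contrast visible.

```python
# Hypothetical table schema: column name -> required Python type.
SCHEMA = {"order_id": int, "amount": float}


def load_into_warehouse(table, row):
    """Schema-on-load: a row is validated against SCHEMA before it is stored."""
    if set(row) != set(SCHEMA):
        raise ValueError(f"columns {sorted(row)} do not match the schema")
    for column, expected_type in SCHEMA.items():
        if not isinstance(row[column], expected_type):
            raise TypeError(f"{column} must be {expected_type.__name__}")
    table.append(row)


def read_raw_records(raw_lines):
    """Schema-on-read: raw text was stored as-is and is parsed only when queried."""
    for line in raw_lines:
        order_id, amount = line.split(",")
        yield {"order_id": int(order_id), "amount": float(amount)}


warehouse_table = []
load_into_warehouse(warehouse_table, {"order_id": 1, "amount": 9.5})

parsed = list(read_raw_records(["2,19.0", "3,4.25"]))
print(warehouse_table, parsed)
```

With validation at load time, malformed rows are rejected before they reach the table; with parsing at read time, errors surface only when someone queries the data.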


Question 3 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
All of the listed

 
Client Requirement understanding

 
High level understanding of Technical know-how specific to the domain.

Question 4 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Standard & dynamic

 
Static & comparative

 
Descriptive & Static


 
Predictive & Prescriptive

Question 5 0.25
/ 0.25 pts

_____________ is the practice of designing and building systems for collecting, storing, and analyzing data
at scale.

 
Data engineering

 
Machine Learning

 
Data Mining

 
Data Analysis

Question 6 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.


 
True

 
False

Question 7 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a meal. Which phase of the data
science process does the analogy fit correctly? (Choose the best answer)

 
Data Understanding

 
Data Preparation

 
Data Cleaning

 
Data Collection

Question 8 0.25
/ 0.25 pts


Data preprocessing improves data quality and makes data mining algorithms efficient and effective. What
are the data preprocessing tasks along with data cleaning?

 
All of the above

 
Data Reduction

 
Data Transformation

 
Data Integration
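Note: the preprocessing tasks listed as options can each be sketched as a small function. This is an illustration under simplifying assumptions: the records, field names and helper names are invented, and real pipelines would usually rely on a data-frame library instead.

```python
def integrate(customers, orders):
    """Data integration: join two sources on a shared customer id."""
    by_id = {c["id"]: c for c in customers}
    return [
        {**by_id[o["customer_id"]], "amount": o["amount"]}
        for o in orders
        if o["customer_id"] in by_id
    ]


def min_max_scale(values):
    """Data transformation: rescale numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def reduce_columns(rows, keep):
    """Data reduction: keep only the attributes needed for the analysis."""
    return [{k: row[k] for k in keep} for row in rows]


# Hypothetical source tables.
customers = [
    {"id": 1, "name": "A", "city": "Pune"},
    {"id": 2, "name": "B", "city": "Delhi"},
]
orders = [
    {"customer_id": 1, "amount": 100.0},
    {"customer_id": 2, "amount": 300.0},
    {"customer_id": 1, "amount": 200.0},
]

merged = integrate(customers, orders)
scaled = min_max_scale([row["amount"] for row in merged])
reduced = reduce_columns(merged, ["city", "amount"])
print(scaled, reduced)
```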

Question 9 0.25
/ 0.25 pts

Exploratory data analysis does not help in

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

 
Derivation of new attributes


 
In univariate and bivariate variable analysis

Question 10 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Annotation

 
Data Writing

 
Data Wrangling

 
Data Label

Question 11 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the voice is male or female a Data
Science activity?


 
True

 
False
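Note: to make "extracting frequencies from audio signals" concrete, the sketch below finds the strongest frequency in a short signal with a naive discrete Fourier transform and compares it with a cut-off. The synthetic tones, the 165 Hz cut-off and the function names are all assumptions for illustration; this is not a validated voice classifier.

```python
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest DFT bin, ignoring the DC bin."""
    n = len(samples)
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        power = re * re + im * im
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n


def synth_tone(freq_hz, sample_rate=2000, n=400):
    """Generate a pure sine tone as a stand-in for a recorded voice."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def guess_voice_label(freq_hz, cutoff_hz=165.0):
    """Label by pitch alone; the cut-off is a hypothetical, illustrative value."""
    return "male" if freq_hz < cutoff_hz else "female"


low_pitch = dominant_frequency(synth_tone(120), 2000)
high_pitch = dominant_frequency(synth_tone(220), 2000)
print(low_pitch, guess_voice_label(low_pitch))
print(high_pitch, guess_voice_label(high_pitch))
```

A practical system would use many more features and a model trained on labelled recordings; a single pitch threshold is only the simplest possible stand-in.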

Question 12 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic process?

 
Business Intelligence

 
Data Warehouse

 
Data Mining

 
Data Science

Question 13 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that the Tamil version of the movie was
viewed the most among the 3 dubbed languages. The data science team plans to develop a recommendation
system for future releases based on this trend. Which of the following ‘analytical’ approaches would
you suggest?

 
Predictive analytics

 
None of the above

 
Diagnostic analytics

 
Descriptive analytics

Question 14 0.25
/ 0.25 pts

The scatterplot implies that


 
The features are independent

 
The features are positively correlated

 
The features are negatively correlated

Question 15 0.25
/ 0.25 pts

Suppose you visit a doctor for a fever and the doctor asks you questions such as: Were you
exposed to rain or a cold climate? Did you have contact with a sick person? Did you eat food from outside?
The doctor then reaches a conclusion based on your responses. This scenario is
analogous to which stage of data analytics?


 
Descriptive

 
Prescriptive

 
Diagnostic

 
Predictive

Question 16 0.25
/ 0.25 pts

The type of problem in statistics will be

 
None of these

 
Well Structured

 
Semi-Structured

 
Unstructured

Question 17 0.25
/ 0.25 pts


Which of the following are the main challenges in Data Science?

 
Data Security

 
Data Availability

 
All of these

 
Data quantity

Question 18 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Online Price Comparison

 
Recommendation Systems

 
Image & Speech Recognition

 
Privacy Checker


Question 19 0.25
/ 0.25 pts

Big data is defined as ______________?

 
The process of extracting and creating information from raw data by using different techniques

 
The process of making machines capable of mimicking human behavior

 
An engineering discipline that is concerned with all aspects of software production.

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process
and analyze the data using traditional databases and data processing tools

Question 20 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Variety


 
Volume

 
Velocity

 
Vision

Quiz Score:
5 out of 5

12/3/22, 11:27 PM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due Dec 4 at 19:00 Points 5 Questions 20
Available Dec 3 at 19:00 - Dec 4 at 19:00 24 hours Time Limit 30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score

LATEST Attempt 1 18 minutes 5 out of 5

 Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5


Submitted Dec 3 at 23:26
This attempt took 18 minutes.

Question 1 0.25 / 0.25 pts

The Data Warehouse is a “Schema-on-Load” approach.

  False

  True

Question 2 0.25 / 0.25 pts


Which one of these belongs to a Data Science career?

  Data Administrator

  None

  Mathematical Analyst

  Data Creator

Question 3 0.25 / 0.25 pts

How is data science understood in terms of interpretability and
accuracy?

  low Accuracy, high Interpretability

  High Accuracy, Low Interpretability

Question 4 0.25 / 0.25 pts

___________ comprises the strategies and technologies used by
enterprises for data analysis and management.

  Machine Learning

  Artificial Intelligence

  None of the Above.

  Business Intelligence


Question 5 0.25 / 0.25 pts

_____________ is the practice of designing and building systems for
collecting, storing, and analyzing data at scale.

  Data Analysis

  Machine Learning

  Data engineering

  Data Mining

Question 6 0.25 / 0.25 pts

_________________ is the science of collecting, analyzing, presenting,
and interpreting data. Its objective is to draw conclusions about the population.

  Statistics

  Computational Intelligence

  Data Analytics

  Business Intelligence

Question 7 0.25 / 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

  Data Cleaning

  Data Collection

  Data Understanding

  Data Preparation

Question 8 0.25 / 0.25 pts

Exploratory data analysis does not help in

  Derivation of new attributes

  In univariate and bivariate variable analysis

  Finding out the data type of a variable

  Finding statistical estimates of a variable

Question 9 0.25 / 0.25 pts

Data preprocessing improves data quality and makes data mining
algorithms efficient and effective. What are the data preprocessing
tasks along with data cleaning?

  Data Reduction

  Data Transformation

  All of the above


  Data Integration

Question 10 0.25 / 0.25 pts

In which phase are duplicates (of the data) removed? Choose the
best possible answer.

  Data Collection

  Data Understanding

  Data Requirements

  Data Preparation

Question 11 0.25 / 0.25 pts

Among the following, which user role is more concerned with non-
functional requirements such as security, fault tolerance and efficiency
in data processing applications?

  Data Engineer

  Data Scientist

  Data Analyst

  Business Intelligence Analyst

Question 12 0.25 / 0.25 pts


While playing cards, predicting whether the next card will be a joker is a

  Machine Learning Activity

  Data Science Activity

  Probability Calculation activity

  None of the above

Question 13 0.25 / 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most among the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this trend.
Which of the following ‘analytical’ approaches would you suggest?

  None of the above

  Descriptive analytics

  Diagnostic analytics

  Predictive analytics

Question 14 0.25 / 0.25 pts

Suppose you visit a doctor for a fever and the doctor asks you
questions such as: Were you exposed to rain or a cold climate? Did you
have contact with a sick person? Did you eat food from outside? The
doctor then reaches a conclusion based on your responses. This
scenario is analogous to which stage of data analytics?

  Descriptive

  Prescriptive

  Predictive

  Diagnostic

Question 15 0.25 / 0.25 pts

Which among the following advocates an exploratory and dynamic
process?

  Data Science

  Business Intelligence

  Data Mining

  Data Warehouse

Question 16 0.25 / 0.25 pts

Which of the following are the main challenges in Data Science?

  Data Security

  Data quantity

  Data Availability


  All of these

Question 17 0.25 / 0.25 pts

Which of the following is not an application of Data Science?

  Image & Speech Recognition

  Privacy Checker

  Online Price Comparison

  Recommendation Systems

Question 18 0.25 / 0.25 pts

Diagnostic Analysis is often referred to as

  Root Cause Analysis

  Cognitive Analytics

  Pattern Recognition

  Statistical Analysis

Question 19 0.25 / 0.25 pts

Big data is defined as ______________?


 
The process of making machines capable of mimicking human behavior

 
Collections of datasets whose volume, velocity or variety is so large that
it is difficult to store, manage, process and analyze the data using
traditional databases and data processing tools

 
The process of extracting and creating information from raw data by
using different techniques

 
An engineering discipline that is concerned with all aspects of software
production.

Question 20 0.25 / 0.25 pts

Which of the following is not a characteristic of Big Data?

  Vision

  Variety

  Velocity

  Volume

Quiz Score: 5 out of 5

12/4/22, 8:48 AM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due Dec 4 at 19:00 Points 5 Questions 20 Available Dec 3 at 19:00 - Dec 4 at 19:00 24 hours
Time Limit 30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score

LATEST Attempt 1 15 minutes 5 out of 5

 Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5


Submitted Dec 4 at 8:47
This attempt took 15 minutes.

Question 1 0.25 / 0.25 pts


Data science is a ___________ Analysis.

  Static & Comparative.

  Descriptive & Static

  Standard & dynamic

  Predictive & Prescriptive

Question 2 0.25 / 0.25 pts

Data science is an Interdisciplinary field.

  True

  False

Question 3 0.25 / 0.25 pts


Data Engineers are also known as Data Miners.

  True

  False

Question 4 0.25 / 0.25 pts

_________________ is the science of collecting, analyzing, presenting, and interpreting data. Its objective is to
draw conclusions about the population.

  Statistics

  Business Intelligence

  Data Analytics

  Computational Intelligence

Question 5 0.25 / 0.25 pts


___________ comprises the strategies and technologies used by enterprises for data analysis and
management.

  None of the Above.

  Artificial Intelligence

  Business Intelligence

  Machine Learning

Question 6 0.25 / 0.25 pts

Which one of these belongs to a Data Science career?

  None

  Mathematical Analyst

  Data Administrator

  Data Creator


Question 7 0.25 / 0.25 pts

Which one refers to the labelling of the data?   

  Data Writing

  Data Annotation

  Data Label

  Data Wrangling

Question 8 0.25 / 0.25 pts

Data preprocessing improves data quality and makes data mining algorithms efficient and effective. What
are the data preprocessing tasks along with data cleaning?

  Data Integration

  All of the above


  Data Reduction

  Data Transformation

Question 9 0.25 / 0.25 pts

A data scientist is not responsible for

  Data analytics

  Data manipulation

  Data mining

  Building continuous data stream

Question 10 0.25 / 0.25 pts

In which phase are duplicates (of the data) removed? Choose the best possible answer.


  Data Understanding

  Data Preparation

  Data Requirements

  Data Collection

Question 11 0.25 / 0.25 pts

While playing cards, predicting whether the next card will be a joker is a

  Probability Calculation activity

  Machine Learning Activity

  Data Science Activity

  None of the above

Question 12 0.25 / 0.25 pts


Suppose you visit a doctor for a fever and the doctor asks you questions such as: Were you
exposed to rain or a cold climate? Did you have contact with a sick person? Did you eat food from outside?
The doctor then reaches a conclusion based on your responses. This scenario is
analogous to which stage of data analytics?

  Diagnostic

  Prescriptive

  Predictive

  Descriptive

Question 13 0.25 / 0.25 pts

Is extracting frequencies from audio signals to determine whether the voice is male or female a Data
Science activity?

  True

  False


Question 14 0.25 / 0.25 pts

Among the following, which user role is more concerned with non-functional requirements such as security,
fault tolerance and efficiency in data processing applications?

  Data Analyst

  Business Intelligence Analyst

  Data Scientist

  Data Engineer

Question 15 0.25 / 0.25 pts

The scatterplot implies that


  The features are independent

  The features are positively correlated

  The features are negatively correlated

Question 16 0.25 / 0.25 pts

The type of problem in statistics will be

  Unstructured

  Semi-Structured


  None of these

  Well Structured

Question 17 0.25 / 0.25 pts

Which of the following is not a characteristic of Big Data?

  Velocity

  Variety

  Volume

  Vision

Question 18 0.25 / 0.25 pts

Which of the following questions is answered by Predictive Analytics?


  What happened?

  Why did it happen?

  What will happen?

  What should I do?

Question 19 0.25 / 0.25 pts

Big data is defined as ______________?

  The process of making machines capable of mimicking human behavior

  An engineering discipline that is concerned with all aspects of software production.

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process
and analyze the data using traditional databases and data processing tools

  The process of extracting and creating information from raw data by using different techniques


Question 20 0.25 / 0.25 pts

Diagnostic Analysis is often referred to as

  Root Cause Analysis

  Statistical Analysis

  Cognitive Analytics

  Pattern Recognition

Quiz Score: 5 out of 5

• Available Dec 3 at 19:00 - Dec 4 at 19:00 24 hours

• Time Limit 30 Minutes

Instructions
Instructions:
1. Each question carries 0.25 marks.
2. Time limit for answering the Quiz is 30 minutes.
3. Only one submission allowed per student.

Attempt History
Attempt Time Score

LATEST Attempt 1 12 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.
Score for this quiz: 5 out of 5
Submitted Dec 4 at 10:11
This attempt took 12 minutes.

Question 1
0.25 / 0.25 pts
Data Science domain knowledge needs

Client Requirement understanding

High level understanding of Technical know-how specific to the domain.

Understanding of domain concepts


All of the listed

Question 2
0.25 / 0.25 pts
_____________ is the practice of designing and building systems for collecting, storing,
and analyzing data at scale.

Data engineering

Machine Learning

Data Analysis

Data Mining

Question 3
0.25 / 0.25 pts
Which one of these belongs to a Data Science career?

Data Administrator

Data Creator

Mathematical Analyst

None

Question 4
0.25 / 0.25 pts
___________ is considered to be a sub-field of or one of the tools of AI, which provides
machines with the capability of learning from experience.

Artificial Intelligence

Machine Learning

Data Science

Data analysis

Question 5
0.25 / 0.25 pts
How is data science understood in terms of interpretability and Accuracy?

low Accuracy, high Interpretability

High Accuracy, Low Interpretability

Question 6
0.25 / 0.25 pts
Data science is an Interdisciplinary field.

True

False

Question 7
0.25 / 0.25 pts
Which of these is not an example of the application of data science?

Fraud detection and prevention system in a bank

Product recommender systems

Targeted advertising as per customer's need

Creation of new fiscal policy

Question 8
0.25 / 0.25 pts
In which phase are duplicates (of the data) removed? Choose the best possible
answer.

Data Collection

Data Preparation

Data Understanding

Data Requirements

Question 9
0.25 / 0.25 pts
Data preprocessing improves data quality and makes data mining algorithms
efficient and effective. What are the data preprocessing tasks along with data
cleaning?

Data Reduction

All of the above

Data Integration

Data Transformation

Question 10
0.25 / 0.25 pts
A person goes to a supermarket to purchase ingredients for making a meal. Which
phase of the data science process does the analogy fit correctly? (Choose the best
answer)

Data Cleaning

Data Understanding

Data Preparation

Data Collection

Question 11
0.25 / 0.25 pts
The scatterplot implies that

The features are positively correlated

The features are independent

The features are negatively correlated

Question 12
0.25 / 0.25 pts
While playing cards, predicting whether the next card will be a joker is a

Machine Learning Activity

None of the above

Probability Calculation activity

Data Science Activity

Question 13
0.25 / 0.25 pts
Among the following, which user role is more concerned with non-functional
requirements such as security, fault tolerance and efficiency in data processing
applications?

Data Scientist

Business Intelligence Analyst

Data Analyst

Data Engineer

Question 14
0.25 / 0.25 pts
Which among the following advocates an exploratory and dynamic process?

Data Mining

Data Science

Data Warehouse

Business Intelligence

Question 15
0.25 / 0.25 pts
Recently, the movie RRR was released on an OTT platform. It was observed that the Tamil
version of the movie was viewed the most among the 3 dubbed languages. The data
science team plans to develop a recommendation system for future releases based on
this trend. Which of the following ‘analytical’ approaches would you suggest?

Descriptive analytics

Predictive analytics

Diagnostic analytics

None of the above

Question 16
0.25 / 0.25 pts
The type of problem in statistics will be

Well Structured

None of these

Semi-Structured

Unstructured

Question 17
0.25 / 0.25 pts
Diagnostic Analysis is often referred to as

Root Cause Analysis

Statistical Analysis

Cognitive Analytics

Pattern Recognition

Question 18
0.25 / 0.25 pts
Which of the following is not an application of Data Science?

Online Price Comparison

Recommendation Systems

Privacy Checker

Image & Speech Recognition

Question 19
0.25 / 0.25 pts
Which of the following questions is answered by Predictive Analytics?

What will happen?

What should I do?

Why did it happen?

What happened?

Question 20
0.25 / 0.25 pts
Which of the following is not a characteristic of Big Data?

Variety

Volume

Velocity

Vision

Quiz Score: 5 out of 5


Submission Details:
Time: 12 minutes

Current Score: 5 out of 5

Kept Score: 5 out of 5


04/12/2022, 00:12 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due Dec 4 at 19:00 Points 5 Questions 20
Available Dec 3 at 19:00 - Dec 4 at 19:00 24 hours Time Limit 30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1 19 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 3 at 23:56
This attempt took 19 minutes.

Question 1 0.25
/ 0.25 pts


How is data science understood in terms of interpretability and Accuracy?

 
High Accuracy, Low Interpretability

 
low Accuracy, high Interpretability

Question 2 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by enterprises for data analysis and
management.

 
Artificial Intelligence

 
Business Intelligence

 
None of the Above.

 
Machine Learning

Question 3 0.25
/ 0.25 pts


The Data Warehouse is a “Schema-on-Load” Approach.

 
True

 
False

Question 4 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 5 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.


 
Static & comparative

 
Standard & dynamic

 
Predictive & Prescriptive

 
Descriptive & Static

Question 6 0.25
/ 0.25 pts

What is the process of training machine learning models, evaluating their performance, and using them to
make predictions?

 
Visualization

 
Predictive Modelling

 
Data Exploration

 
Feature Engineering


Question 7 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Fraud detection and prevention system in a bank

 
Creation of new fiscal policy

 
Targeted advertising as per customer's need

 
Product recommender systems

Question 8 0.25
/ 0.25 pts

Data preprocessing improves the data quality and makes data mining algorithms efficient and effective. What
are the data preprocessing tasks along with Data cleaning?

 
Data Transformation

 
Data Reduction


 
All of the above

 
Data Integration
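
As a rough illustration of the preprocessing tasks named in this question (cleaning, integration, transformation, reduction), here is a minimal Python sketch. The records, field names, and helper functions are hypothetical and are not part of the quiz or course material; duplicate removal is shown as part of cleaning.

```python
# Hypothetical toy records; all names and values are assumptions for illustration.
sales = [
    {"id": 1, "amount": "100", "region": "north"},
    {"id": 1, "amount": "100", "region": "north"},  # exact duplicate
    {"id": 2, "amount": None, "region": "south"},   # missing value
    {"id": 3, "amount": "300", "region": "north"},
]
customers = {1: "Asha", 2: "Ravi", 3: "Meena"}  # a second, separate source


def clean(rows):
    """Data cleaning: drop exact duplicates and rows with a missing amount."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen or row["amount"] is None:
            continue
        seen.add(key)
        out.append(row)
    return out


def integrate(rows, names):
    """Data integration: attach the customer name from the second source."""
    return [{**row, "name": names[row["id"]]} for row in rows]


def transform(rows):
    """Data transformation: convert amount strings to numbers."""
    return [{**row, "amount": float(row["amount"])} for row in rows]


def reduce_columns(rows, keep):
    """Data reduction: keep only the attributes needed downstream."""
    return [{k: row[k] for k in keep} for row in rows]


prepared = reduce_columns(
    transform(integrate(clean(sales), customers)), ["name", "amount"]
)
print(prepared)
```

The order of the steps here is one reasonable choice, not a prescribed pipeline; real projects often interleave them.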

Question 9 0.25
/ 0.25 pts

Data scientist is not responsible for 

 
Data mining

 
Data analytics

 
Building continuous data stream

 
Data manipulation

Question 10 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the best possible answer.


 
Data Collection

 
Data Understanding

 
Data Preparation

 
Data Requirements

Question 11 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-functional requirements such as security,
fault tolerance and efficiency in data processing applications?

 
Data Scientist

 
Business Intelligence Analyst

 
Data Engineer

 
Data Analyst


Question 12 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are positively correlated

 
The features are independent

 
The features are negatively correlated

Question 13 0.25
/ 0.25 pts


Which among the following advocates an exploratory and dynamic process?

 
Data Mining

 
Data Warehouse

 
Business Intelligence

 
Data Science

Question 14 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that the Tamil version of the movie was
viewed the most out of the 3 dubbed languages. The data science team plans to develop a recommendation
system for future releases based on this movie trend. Which of the following ‘analytical’ approaches would
you suggest?

 
None of the above

 
Predictive analytics

 
Diagnostic analytics


 
Descriptive analytics

Question 15 0.25
/ 0.25 pts

A scenario where you visited a doctor for your fever and the doctor asked you questions like: Were you
exposed to rain or cold climate; or did you have contact with a sick person; or did you have food from outside
etc., and then doctor comes to a conclusion based on your response to the queries can be considered
analogous to which stage of data analytics.

 
Descriptive

 
Diagnostic

 
Prescriptive

 
Predictive

Question 16 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?


 
All of these

 
Data Availability

 
Data quantity

 
Data Security

Question 17 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:

 
Cognitive Analytics

 
Statistical Analysis

 
Root Cause Analysis

 
Pattern Recognition

Question 18 0.25
/ 0.25 pts


Big data is defined as ______________?

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process and
analyze the data using traditional databases and data processing tools

 
An engineering discipline that is concerned with all aspects of software production.

 
The process of making machines capable of mimicking human behavior

 
The process of extracting and creating information from raw data by using different techniques

Question 19 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Diagnostic Analytics

 
Prescriptive Analytics

 
Descriptive Analytics

 
Predictive Analytics


Question 20 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Recommendation Systems

 
Privacy Checker

 
Image & Speech Recognition

 
Online Price Comparison

Quiz Score:
5 out of 5

03/12/2022, 22:49 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)


Attempt History
Attempt Time Score
LATEST Attempt 1
10 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 3 at 22:49
This attempt took 10 minutes.

Question 1 0.25
/ 0.25 pts

_____________ is the practice of designing and building systems for
collecting, storing, and analyzing data at scale.

 
Data Analysis

 
Data Mining

 
Machine Learning

 
Data engineering


Question 2 0.25
/ 0.25 pts

Which one of these belongs to the Data Science career?

 
Data Creator

 
Data Administrator

 
None

 
Mathematical Analyst

Question 3 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
False

 
True

Question 4 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
All of the listed


 
High level understanding of Technical know-how specific to the domain.

 
Client Requirement understanding

Question 5 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Organization related

 
All three

 
Technology related

 
Data related

Question 6 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 7 0.25
/ 0.25 pts

Data scientist is not responsible for 


 
Data analytics

 
Data manipulation

 
Data mining

 
Building continuous data stream

Question 8 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the
best possible answer.

 
Data Collection

 
Data Requirements

 
Data Preparation

 
Data Understanding

Question 9 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Collection

 
Data Cleaning

 
Data Understanding


 
Data Preparation

Question 10 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Writing

 
Data Wrangling

 
Data Label

 
Data Annotation

Question 11 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are negatively correlated

 
The features are positively correlated


 
The features are independent

Question 12 0.25
/ 0.25 pts

A scenario where you visited a doctor for your fever and the doctor
asked you questions like: Were you exposed to rain or cold climate; or
did you have contact with a sick person; or did you have food from
outside etc., and then doctor comes to a conclusion based on your
response to the queries can be considered analogous to which stage of
data analytics.

 
Descriptive

 
Predictive

 
Prescriptive

 
Diagnostic

Question 13 0.25
/ 0.25 pts

While playing cards, predicting that the next card will be a joker is

 
Probability Calculation activity

 
Machine Learning Activity

 
Data Science Activity

 
None of the above
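
If the prediction is treated as a probability calculation (one of the listed options), it can be sketched in a few lines of Python. The deck composition below (54 cards including 2 jokers) and the function name are assumptions for illustration; the quiz does not state them.

```python
from fractions import Fraction


def joker_probability(jokers_left: int, cards_left: int) -> Fraction:
    """Probability that the next card drawn is a joker, assuming a well-shuffled deck."""
    if cards_left <= 0 or not 0 <= jokers_left <= cards_left:
        raise ValueError("need 0 <= jokers_left <= cards_left and cards_left > 0")
    return Fraction(jokers_left, cards_left)


# Assumed deck: 54 cards, 2 of them jokers.
p_first = joker_probability(2, 54)

# After 10 non-joker cards have been dealt, 44 cards remain and both jokers are still in.
p_later = joker_probability(2, 44)

print(p_first, p_later)
```

No data or learned model is involved here: the answer follows directly from counting the remaining cards.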


Question 14 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-
functional requirements such as security, fault tolerance and efficiency
in data processing applications?

 
Data Engineer

 
Data Analyst

 
Business Intelligence Analyst

 
Data Scientist

Question 15 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that
the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this movie trend.
Which of the following ‘analytical’ approaches would you suggest?

 
Predictive analytics

 
Descriptive analytics

 
None of the above

 
Diagnostic analytics

Question 16 0.25
/ 0.25 pts


Which of the following are the main challenges in Data Science?

 
Data Availability

 
Data Security

 
Data quantity

 
All of these

Question 17 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:

 
Pattern Recognition

 
Statistical Analysis

 
Cognitive Analytics

 
Root Cause Analysis

Question 18 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Variety


 
Volume

Question 19 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Image & Speech Recognition

 
Online Price Comparison

 
Privacy Checker

 
Recommendation Systems

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What happened?

 
What will happen?

 
What should I do?

 
Why did it happen?

Quiz Score:
5 out of 5

QUIZ -1

Attempt History
Attempt Time Score
LATEST Attempt 1
3 minutes 4.75 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


4.75 out of 5
Submitted Dec 3 at 23:10
This attempt took 3 minutes.

Incorrect Question 1 0
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Descriptive & Static

 
Predictive & Prescriptive

 
Standard & dynamic

 
Static & comparative

Question 2 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
High level understanding of Technical know-how specific to the domain.

 
All of the listed

 
Client Requirement understanding

Question 3 0.25
/ 0.25 pts

___________ is considered to be a sub-field of or one of the tools of AI, which provides machines with the
capability of learning from experience.

 
Artificial Intelligence

 
Machine Learning

 
Data Science

 
Data analysis

Question 4 0.25
/ 0.25 pts

_________________ is the science of collecting, analyzing, presenting, and interpreting data. Objective to
draw conclusions on the population.

 
Business Intelligence

 
Data Analytics

 
Computational Intelligence

 
Statistics

Question 5 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 6 0.25
/ 0.25 pts

Which one of these belongs to the Data Science career?

 
Data Creator

 
Data Administrator

 
None

 
Mathematical Analyst

Question 7 0.25
/ 0.25 pts

Exploratory data analysis does not help in

 
In univariate and bivariate variable analysis

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

 
Derivation of new attributes

Question 8 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the best possible answer.

 
Data Requirements

 
Data Preparation

 
Data Understanding

 
Data Collection

Question 9 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a meal. Which phase of the data
science process does the analogy fit correctly? (Choose the best answer)

 
Data Understanding

 
Data Cleaning

 
Data Collection

 
Data Preparation

Question 10 0.25
/ 0.25 pts

Data scientist is not responsible for 


 
Data analytics

 
Data manipulation

 
Data mining

 
Building continuous data stream

Question 11 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic process?

 
Business Intelligence

 
Data Science

 
Data Mining

 
Data Warehouse

Question 12 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that the Tamil version of the movie was
viewed the most out of the 3 dubbed languages. The data science team plans to develop a recommendation
system for future releases based on this movie trend. Which of the following ‘analytical’ approaches would
you suggest?

 
Predictive analytics

 
Descriptive analytics

 
None of the above

 
Diagnostic analytics

Question 13 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-functional requirements such as security,
fault tolerance and efficiency in data processing applications?

 
Data Analyst

 
Data Scientist

 
Data Engineer

 
Business Intelligence Analyst

Question 14 0.25
/ 0.25 pts

While playing cards, predicting that the next card will be a joker is

 
None of the above

 
Machine Learning Activity

 
Data Science Activity

 
Probability Calculation activity

Question 15 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are positively correlated

 
The features are independent

 
The features are negatively correlated

Question 16 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:

 
Cognitive Analytics

 
Pattern Recognition

 
Statistical Analysis

 
Root Cause Analysis

Question 17 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Availability

 
Data quantity

 
All of these

 
Data Security

Question 18 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Volume

 
Variety

Question 19 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What should I do?

 
Why did it happen?

 
What will happen?

 
What happened?

Question 20 0.25
/ 0.25 pts

Big data is defined as ______________?

 
The process of making machines capable of mimicking human behavior

 
The process of extracting and creating information from raw data by using different techniques

 
An engineering discipline that is concerned with all aspects of software production.

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process
and analyze the data using traditional databases and data processing tools

Quiz Score:
4.75 out of 5
03/12/2022, 23:22 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)


Attempt History
Attempt Time Score
LATEST Attempt 1
2 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 3 at 23:22
This attempt took 2 minutes.

Question 1 0.25
/ 0.25 pts

_________________ is the science of collecting, analyzing, presenting,
and interpreting data. Objective to draw conclusions on the population.

 
Data Analytics

 
Business Intelligence

 
Statistics

 
Computational Intelligence


Question 2 0.25
/ 0.25 pts

___________ is considered to be a sub-field of or one of the tools of
AI, which provides machines with the capability of learning from
experience.

 
Data analysis

 
Machine Learning

 
Artificial Intelligence

 
Data Science

Question 3 0.25
/ 0.25 pts

Which one of these belongs to the Data Science career?

 
Data Creator

 
Data Administrator

 
None

 
Mathematical Analyst

Question 4 0.25
/ 0.25 pts


How is data science understood in terms of interpretability and
Accuracy?

 
High Accuracy, Low Interpretability

 
low Accuracy, high Interpretability

Question 5 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Data related

 
Organization related

 
All three

 
Technology related

Question 6 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
False

 
True


Question 7 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the
best possible answer.

 
Data Preparation

 
Data Understanding

 
Data Requirements

 
Data Collection

Question 8 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Label

 
Data Wrangling

 
Data Annotation

 
Data Writing

Question 9 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Creation of new fiscal policy


 
Product recommender systems

 
Fraud detection and prevention system in a bank

 
Targeted advertising as per customer's need

Question 10 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Collection

 
Data Preparation

 
Data Understanding

 
Data Cleaning

Question 11 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-
functional requirements such as security, fault tolerance and efficiency
in data processing applications?

 
Data Analyst

 
Data Scientist

 
Data Engineer

 
Business Intelligence Analyst


Question 12 0.25
/ 0.25 pts

While playing cards, predicting that the next card will be a joker is

 
Machine Learning Activity

 
None of the above

 
Probability Calculation activity

 
Data Science Activity

Question 13 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are negatively correlated

 
The features are independent

 
The features are positively correlated


Question 14 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic
process?

 
Data Warehouse

 
Data Mining

 
Business Intelligence

 
Data Science

Question 15 0.25
/ 0.25 pts

A scenario where you visited a doctor for your fever and the doctor
asked you questions like: Were you exposed to rain or cold climate; or
did you have contact with a sick person; or did you have food from
outside etc., and then doctor comes to a conclusion based on your
response to the queries can be considered analogous to which stage of
data analytics.

 
Diagnostic

 
Predictive

 
Descriptive

 
Prescriptive


Question 16 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Recommendation Systems

 
Image & Speech Recognition

 
Privacy Checker

 
Online Price Comparison

Question 17 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Prescriptive Analytics

 
Predictive Analytics

 
Diagnostic Analytics

 
Descriptive Analytics

Question 18 0.25
/ 0.25 pts

Big data is defined as ______________?


 
The process of extracting and creating information from raw data by
using different techniques

 
An engineering discipline that is concerned with all aspects of software
production.

 
Collections of datasets whose volume, velocity or variety is so large that
it is difficult to store, manage, process and analyze the data using
traditional databases and data processing tools

 
The process of making machines capable of mimicking human behavior

Question 19 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Volume

 
Velocity

 
Variety

 
Vision

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?


 
What will happen?

 
What happened?

 
What should I do?

 
Why did it happen?

Quiz Score:
5 out of 5

03/12/2022, 23:19 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)


Attempt History
Attempt Time Score
LATEST Attempt 1
30 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 3 at 23:18
This attempt took 30 minutes.

Question 1 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
All of the listed

 
High level understanding of Technical know-how specific to the domain.


 
Client Requirement understanding

Question 2 0.25
/ 0.25 pts

___________ is considered to be a sub-field of or one of the tools of AI,
which provides machines with the capability of learning from experience.

 
Machine Learning

 
Data analysis

 
Data Science

 
Artificial Intelligence

Question 3 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Standard & dynamic

 
Predictive & Prescriptive

 
Static & comparative

 
Descriptive & Static


Question 4 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by
enterprises for data analysis and management.

 
Artificial Intelligence

 
Business Intelligence

 
None of the Above.

 
Machine Learning

Question 5 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Technology related

 
All three

 
Organization related

 
Data related

Question 6 0.25
/ 0.25 pts

Data science is a ___________ Analysis.


 
Static & Comparative.

 
Standard & dynamic

 
Predictive & Prescriptive

 
Descriptive & Static

Question 7 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Creation of new fiscal policy

 
Product recommender systems

 
Fraud detection and prevention system in a bank

 
Targeted advertising as per customer's need

Question 8 0.25
/ 0.25 pts

Data preprocessing improves the data quality and makes data mining
algorithms efficient and effective. What are the data preprocessing tasks
along with Data cleaning?

 
Data Reduction

 
Data Transformation


 
Data Integration

 
All of the above

Question 9 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Cleaning

 
Data Collection

 
Data Preparation

 
Data Understanding

Question 10 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the best
possible answer.

 
Data Preparation

 
Data Understanding

 
Data Collection

 
Data Requirements


Question 11 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic
process?

 
Business Intelligence

 
Data Science

 
Data Warehouse

 
Data Mining

Question 12 0.25
/ 0.25 pts

While playing cards, predicting that the next card will be a joker is

 
Probability Calculation activity

 
Data Science Activity

 
Machine Learning Activity

 
None of the above

Question 13 0.25
/ 0.25 pts


Among the following, which user role is more concerned with non-
functional requirements such as security, fault tolerance and efficiency in
data processing applications?

 
Data Engineer

 
Data Analyst

 
Business Intelligence Analyst

 
Data Scientist

Question 14 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are independent

 
The features are positively correlated

 
The features are negatively correlated


Question 15 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the
voice is male or female a Data Science activity?

 
True

 
False

Question 16 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Security

 
Data quantity

 
All of these

 
Data Availability

Question 17 0.25
/ 0.25 pts

Type of problem in statistics will be

 
Unstructured


 
Well Structured

 
Semi-Structured

 
None of these

Question 18 0.25
/ 0.25 pts

Big data is defined as ______________?

 
An engineering discipline that is concerned with all aspects of software
production.

 
Collections of datasets whose volume, velocity or variety is so large that it
is difficult to store, manage, process and analyze the data using traditional
databases and data processing tools

 
The process of extracting and creating information from raw data by using
different techniques

 
The process of making machines capable of mimicking human behavior

Question 19 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:


 
Cognitive Analytics

 
Statistical Analysis

 
Pattern Recognition

 
Root Cause Analysis

Question 20 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Prescriptive Analytics

 
Diagnostic Analytics

 
Predictive Analytics

 
Descriptive Analytics

Quiz Score:
5 out of 5

12/3/22, 11:16 PM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)


Attempt History
Attempt Time Score
LATEST Attempt 1
4 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 3 at 23:16
This attempt took 4 minutes.

Question 1 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Organization related

 
Data related

 
All three

 
Technology related


Question 2 0.25
/ 0.25 pts

Data science is a ___________ Analysis.

 
Predictive & Prescriptive

 
Descriptive & Static

 
Standard & dynamic

 
Static & Comparative.

Question 3 0.25
/ 0.25 pts

_____________ is the practice of designing and building systems for
collecting, storing, and analyzing data at scale.

 
Data Analysis

 
Data engineering

 
Machine Learning

 
Data Mining

Question 4 0.25
/ 0.25 pts

What is the process of training machine learning models, evaluating
their performance, and using them to make predictions?

 
Data Exploration

 
Visualization

 
Predictive Modelling

 
Feature Engineering

Question 5 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
True

 
False

Question 6 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
High level understanding of Technical know-how specific to the domain.

 
Understanding of domain concepts

 
All of the listed

 
Client Requirement understanding

Question 7 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Collection

 
Data Cleaning

 
Data Preparation

 
Data Understanding

Question 8 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Product recommender systems

 
Creation of new fiscal policy

 
Fraud detection and prevention system in a bank

 
Targeted advertising as per customer's need

Question 9 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the
best possible answer.

 
Data Preparation

 
Data Understanding

 
Data Requirements

 
Data Collection

Question 10 0.25
/ 0.25 pts

Data preprocessing improves data quality and makes data mining
algorithms efficient and effective. What are the data preprocessing
tasks along with data cleaning?

 
All of the above

 
Data Integration

 
Data Transformation

 
Data Reduction

Question 11 0.25
/ 0.25 pts

Suppose you visit a doctor for a fever and the doctor asks questions
such as: Were you exposed to rain or a cold climate? Did you have
contact with a sick person? Did you eat food from outside? The doctor
then reaches a conclusion based on your responses. This scenario is
analogous to which stage of data analytics?

 
Prescriptive

 
Diagnostic

 
Predictive

 
Descriptive

Question 12 0.25
/ 0.25 pts

Among the following, which user role is more concerned with
non-functional requirements such as security, fault tolerance and
efficiency in data processing applications?

 
Business Intelligence Analyst

 
Data Analyst

 
Data Scientist

 
Data Engineer

Question 13 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this trend.
Which of the following ‘analytical’ approaches would you suggest?

 
Predictive analytics

 
Descriptive analytics

 
Diagnostic analytics

 
None of the above

Question 14 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are independent

 
The features are negatively correlated

 
The features are positively correlated

Question 15 0.25
/ 0.25 pts

While playing cards, predicting that the next card will be a joker is a:

 
Machine Learning Activity

 
Probability Calculation activity

 
Data Science Activity

 
None of the above

Question 16 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Descriptive Analytics

 
Predictive Analytics

 
Diagnostic Analytics

 
Prescriptive Analytics

Question 17 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Recommendation Systems

 
Privacy Checker

 
Online Price Comparison

 
Image & Speech Recognition

Question 18 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?


 
Vision

 
Velocity

 
Variety

 
Volume

Question 19 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
All of these

 
Data quantity

 
Data Security

 
Data Availability

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What happened?

 
Why did it happen?

 
What should I do?

 
What will happen?

Quiz Score:
5 out of 5

12/4/22, 8:21 AM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
12 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 4 at 8:21
This attempt took 12 minutes.

Question 1 0.25
/ 0.25 pts

What is the process of training machine learning models, evaluating their performance, and using them to
make predictions?

 
Visualization

 
Feature Engineering

 
Predictive Modelling

 
Data Exploration

Question 2 0.25
/ 0.25 pts

The Data Warehouse is a “Schema-on-Load” Approach.

 
True

 
False

Question 3 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
All of the listed

 
Client Requirement understanding

 
High level understanding of Technical know-how specific to the domain.

Question 4 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Standard & dynamic

 
Static & comparative

 
Descriptive & Static

 
Predictive & Prescriptive

Question 5 0.25
/ 0.25 pts

_____________ is the practice of designing and building systems for collecting, storing, and analyzing data
at scale.

 
Data engineering

 
Machine Learning

 
Data Mining

 
Data Analysis

Question 6 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 7 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a meal. Which phase of the data
science process does the analogy fit correctly? (Choose the best answer)

 
Data Understanding

 
Data Preparation

 
Data Cleaning

 
Data Collection

Question 8 0.25
/ 0.25 pts

Data preprocessing improves data quality and makes data mining algorithms efficient and effective. What
are the data preprocessing tasks along with data cleaning?

 
All of the above

 
Data Reduction

 
Data Transformation

 
Data Integration

Question 9 0.25
/ 0.25 pts

Exploratory data analysis does not help in

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

 
Derivation of new attributes

 
In univariate and bivariate variable analysis

Question 10 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Annotation

 
Data Writing

 
Data Wrangling

 
Data Label

Question 11 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the voice is male or female a Data
Science activity?

 
True

 
False

Question 12 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic process?

 
Business Intelligence

 
Data Warehouse

 
Data Mining

 
Data Science

Question 13 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that the Tamil version of the movie
was viewed the most out of the 3 dubbed languages. The data science team plans to develop a recommendation
system for future releases based on this trend. Which of the following ‘analytical’ approaches would you suggest?

 
Predictive analytics

 
None of the above

 
Diagnostic analytics

 
Descriptive analytics

Question 14 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are independent

 
The features are positively correlated

 
The features are negatively correlated

Question 15 0.25
/ 0.25 pts

Suppose you visit a doctor for a fever and the doctor asks questions such as: Were you exposed to rain or a
cold climate? Did you have contact with a sick person? Did you eat food from outside? The doctor then
reaches a conclusion based on your responses. This scenario is analogous to which stage of data analytics?

 
Descriptive

 
Prescriptive

 
Diagnostic

 
Predictive

Question 16 0.25
/ 0.25 pts

Type of problem in statistics will be

 
None of these

 
Well Structured

 
Semi-Structured

 
Unstructured

Question 17 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Security

 
Data Availability

 
All of these

 
Data quantity

Question 18 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Online Price Comparison

 
Recommendation Systems

 
Image & Speech Recognition

 
Privacy Checker

Question 19 0.25
/ 0.25 pts

Big data is defined as ______________?

 
The process of extracting and creating information from raw data by using different techniques

 
The process of making machines capable of mimicking human behavior

 
An engineering discipline that is concerned with all aspects of software production.

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process
and analyze the data using traditional databases and data processing tools

Question 20 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Variety

 
Volume

 
Velocity

 
Vision

Quiz Score:
5 out of 5

12/3/22, 9:41 PM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
10 minutes 4.25 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


4.25 out of 5
Submitted Dec 3 at 21:39
This attempt took 10 minutes.

Question 1 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
All of the listed

 
Client Requirement understanding

 
High level understanding of Technical know-how specific to the domain.

 
Understanding of domain concepts

Question 2 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by enterprises for data analysis and management.

 
Machine Learning

 
Business Intelligence

 
Artificial Intelligence

 
None of the Above.

Question 3 0.25
/ 0.25 pts

The Data Warehouse is a “Schema-on-Load” Approach.

 
True

 
False

Question 4 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 5 0.25
/ 0.25 pts

_________________ is the science of collecting, analyzing, presenting, and interpreting data. Objective to draw conclusions on the
population.

 
Business Intelligence

 
Computational Intelligence

 
Data Analytics

 
Statistics

Question 6 0.25
/ 0.25 pts

___________ is considered to be a sub-field of or one of the tools of AI, which provides machines with the capability of learning from
experience.

 
Data Science

 
Data analysis

 
Machine Learning

 
Artificial Intelligence

Incorrect
Question 7 0
/ 0.25 pts

Exploratory data analysis does not help in

 
In univariate and bivariate variable analysis

 
Finding statistical estimates of a variable

 
Derivation of new attributes

 
Finding out the data type of a variable

Question 8 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the best possible answer.

 
Data Preparation

 
Data Requirements

 
Data Collection

 
Data Understanding

Question 9 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Label

 
Data Writing

 
Data Annotation

 
Data Wrangling

Question 10 0.25
/ 0.25 pts

Data preprocessing improves data quality and makes data mining algorithms efficient and effective. What are the data preprocessing
tasks along with data cleaning?

 
Data Integration

 
Data Reduction

 
Data Transformation

 
All of the above

Question 11 0.25
/ 0.25 pts

While playing cards, predicting that the next card will be a joker is a:

 
None of the above

 
Probability Calculation activity

 
Machine Learning Activity

 
Data Science Activity

Incorrect Question 12 0
/ 0.25 pts

Which among the following advocates an exploratory and dynamic process?

 
Data Warehouse

 
Data Science

 
Business Intelligence

 
Data Mining

Question 13 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the voice is male or female a Data Science activity?

 
True

 
False

Incorrect Question 14 0
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a recommendation system for future releases based on this trend.
Which of the following ‘analytical’ approaches would you suggest?

 
Diagnostic analytics

 
Predictive analytics

 
None of the above

 
Descriptive analytics

Question 15 0.25
/ 0.25 pts

Suppose you visit a doctor for a fever and the doctor asks questions such as: Were you exposed to rain or a cold climate? Did you have
contact with a sick person? Did you eat food from outside? The doctor then reaches a conclusion based on your responses.
This scenario is analogous to which stage of data analytics?

 
Predictive

 
Prescriptive

 
Diagnostic

 
Descriptive

Question 16 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Availability

 
Data Security

 
All of these

 
Data quantity

Question 17 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:

 
Cognitive Analytics

 
Statistical Analysis

 
Pattern Recognition

 
Root Cause Analysis

Question 18 0.25
/ 0.25 pts

Type of problem in statistics will be

 
None of these

 
Well Structured

 
Semi-Structured

 
Unstructured

Question 19 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Velocity

 
Vision

 
Volume

 
Variety

Question 20 0.25
/ 0.25 pts

Big data is defined as ______________?

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process and analyze the data using traditional
databases and data processing tools

 
The process of extracting and creating information from raw data by using different techniques

 
The process of making machines capable of mimicking human behavior

 
An engineering discipline that is concerned with all aspects of software production.

Quiz Score:
4.25 out of 5

Question 1
0.17 / 0.17 pts
Data preprocessing improves data quality and makes data mining algorithms efficient
and effective. What are the data preprocessing tasks along with data cleaning?

Data Integration

Data Reduction

Data Transformation

All of the above.
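
The preprocessing tasks named in the options can be illustrated with a minimal sketch. It assumes a tiny in-memory dataset; the records, field names, and helper functions (`clean`, `integrate`, `transform`, `reduce_columns`) are hypothetical and chosen only for illustration, not taken from the course material.

```python
# Hypothetical toy records; all names and values are made up for illustration.
customers = [
    {"id": 1, "age": 25, "city": "Pune"},
    {"id": 2, "age": None, "city": "Delhi"},
    {"id": 1, "age": 25, "city": "Pune"},   # exact duplicate of the first row
    {"id": 3, "age": 45, "city": "Chennai"},
]
orders = {1: 250.0, 2: 100.0, 3: 400.0}     # second source: id -> total spend


def clean(rows):
    """Cleaning: drop exact duplicates and fill missing ages with the mean."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items(), key=lambda kv: kv[0]))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    known = [r["age"] for r in unique if r["age"] is not None]
    mean_age = sum(known) / len(known)
    for r in unique:
        if r["age"] is None:
            r["age"] = mean_age
    return unique


def integrate(rows, spend_by_id):
    """Integration: merge a second data source into each record."""
    return [{**r, "spend": spend_by_id[r["id"]]} for r in rows]


def transform(rows, field):
    """Transformation: min-max scale one numeric field into [0, 1]."""
    values = [r[field] for r in rows]
    lo, hi = min(values), max(values)
    return [{**r, field: (r[field] - lo) / (hi - lo)} for r in rows]


def reduce_columns(rows, keep):
    """Reduction: keep only the attributes needed downstream."""
    return [{k: r[k] for k in keep} for r in rows]


prepared = reduce_columns(
    transform(integrate(clean(customers), orders), "spend"),
    keep=("id", "age", "spend"),
)
```

In practice a library such as pandas would usually handle these steps; plain Python is used here only to keep the sketch self-contained.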

Question 2
0.17 / 0.17 pts
Which of the following artifacts are not considered in the Descriptive analytics?

Standard report

Alerts

Adhoc reports

Predictive model

Question 3
0.17 / 0.17 pts
If you were to arrange the following data analytics techniques in the increasing order of
complexity, which of the following is considered the correct order?
Predictive, Diagnostic, Prescriptive, Descriptive

Descriptive, Diagnostic, Predictive, Prescriptive

Diagnostic, Predictive, Prescriptive, Descriptive

Prescriptive, Descriptive, Diagnostic, Predictive

Question 4
0.17 / 0.17 pts
Data scientist is not responsible for

Data mining

Data manipulation

Building continuous data stream

Data analytics

Question 5
0.17 / 0.17 pts
Which of these is not an example of the application of data science?

Fraud detection and prevention system in a bank

Creation of new fiscal policy


Targeted advertising as per customer's need

Product recommender systems

Question 6
0.17 / 0.17 pts
If a person is coming from a software development background, which of the following
data science project roles will best suit them?

Big Data Engineer

Machine Learning Engineer

Data Analyst

Storyteller

Question 7
0.17 / 0.17 pts

A person goes to a supermarket to purchase ingredients for making a meal. Which
phase of the data science process does the analogy fit correctly? (Choose the best
answer)

Data Understanding

Data Collection

Data Cleaning
Data Preparation

Question 8
0.17 / 0.17 pts
Which of the following is a correct component of data science?

Data Engineering

Advanced Computing

Domain expertise

All of the options

Incorrect Question 9
0 / 0.17 pts
Which of the following is not an issue with CRISP-DM model?

The end-users of the analytical model are required to post-rationalize the model, which
leads to a lot of dissatisfaction

It very much underestimates the amount of real experimentation that is needed to get
viable results

Various modeling techniques are selected and applied, and their parameters are
calibrated to optimal values

Thorough evaluation indeed is needed, yet the CRISP-DM methodology does not
prescribe how to do this.

Question 10
0.17 / 0.17 pts
Which of the following curve analyses is conducted on each predictor for classification?

NOC

ROC

COC

All of the mentioned
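
As background for this question, the sketch below shows how a receiver operating characteristic (ROC) analysis of a single predictor can be computed by hand. The `scores` and `labels` values and the function names are hypothetical; this is one simple way to trace the curve, not the method used in the course.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by thresholding at each distinct score."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical predictor values and binary class labels (1 = positive class).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
area = auc(curve)   # 8/9 here: 8 of the 9 positive/negative pairs are ranked correctly
```

An area near 1 suggests the predictor separates the classes well on its own, while an area near 0.5 suggests it carries little ranking information.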

Question 11
0.17 / 0.17 pts
Which of the following best describes the difference between the data analyst and data
scientist?

Data analysts just deal with numbers, whereas data scientists deal with algorithms

A data analyst does estimation, whereas a data scientist predicts and explains it as well

Data analysts and data scientists play the same role in the project

Data analysts are more proficient in R, whereas data scientists are more proficient in
Python

Incorrect Question 12
0 / 0.17 pts
Which of the following is not an application of data science?

Recommendation Systems

Image & Speech Recognition

Online Price Comparison

Privacy Checker

Question 13
0.17 / 0.17 pts
Which one refers to the labelling of the data?

Data Writing

Data Annotation

Data Label

Data Wrangling

Question 14
0.17 / 0.17 pts
Data Science project steps are highly linear.

True
False

Question 15
0.17 / 0.17 pts
Which of the following can be a significant challenge in data science?

Irregular communication with stakeholders

Lack of project funding in data science projects

Data Quality

Inadequately defined organizational structure

Question 16
0.17 / 0.17 pts
CRISP-DM methodology is specifically built for IT(Information Technology) projects.

True

False

Question 17
0.17 / 0.17 pts
In "Business Understanding" phase of the data science process, the goals are identified
and the objectives defined.

True
False

Question 18
0.17 / 0.17 pts
Which of the following steps is performed by a data scientist after acquiring the data?

Data Cleaning

Data Integration

Data Replication

All of the options

Question 19
0.17 / 0.17 pts
Which one of the following is not a necessary characteristic of a Data Scientist?

Creative

Technical

Punctual

Communicative

Question 20
0.17 / 0.17 pts
Which of the following is not a characteristic of "Localized Analytics"?

Usable in functional silos

Only at functional or process level

Occurs in disconnected manner

Key data, technology and analysts are centralized

Question 21
0.17 / 0.17 pts
Raw data should be processed only one time.

True

False

Incorrect Question 22
0 / 0.17 pts
Which of the following data science project steps is the most critical for the success
of the project?

Model Evaluation

Model Selection
Model Building

Data preprocessing

Question 23
0.17 / 0.17 pts
Data science is the process of diverse set of data through ____________

organizing data

processing data

analysing data

All of the options

Question 24
0.17 / 0.17 pts
Correlation always implies causation.

True

False
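
A small simulation can make the distinction between correlation and causation concrete. In the sketch below, two variables are both driven by a hidden common cause, so they correlate strongly even though neither influences the other. The variable names, coefficients, and noise levels are invented for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hidden common cause (hypothetical): daily temperature.
temperature = [rng.uniform(15, 40) for _ in range(500)]

# Both quantities depend on temperature plus independent noise,
# but neither one is computed from the other.
ice_cream_sales = [2.0 * t + rng.gauss(0, 3) for t in temperature]
sunburn_cases = [0.5 * t + rng.gauss(0, 1) for t in temperature]

r = pearson(ice_cream_sales, sunburn_cases)  # strongly positive
```

The high coefficient reflects the shared dependence on `temperature`, not any effect of one variable on the other.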

Question 25
0.17 / 0.17 pts
Which one of the following statement(s) is correct (Choose the most appropriate
answer)?
Analytics is a process in which a computer examines information using mathematical
methods to find useful patterns.

Data Analytics refers to the techniques used to analyze data to enhance productivity
and business gain.

Data analytics is the pursuit of extracting meaning from raw data using specialized
computer systems.

All the statements

None of the statements

Incorrect Question 26
0 / 0.17 pts
The answer to the following question can be obtained by which type of analytics?
"What's the best that can happen?"

Diagnostic Analytics

Descriptive Analytics

Prescriptive Analytics

Predictive Analytics

Question 27
0.17 / 0.17 pts
Pattern Recognition is a sub-field of Data Science?
True

False

Partial Question 28
0.11 / 0.17 pts
Which of the following are the reasons for the sudden growth of analytics?

Data is growing at 40% compound annual rate

Large number of analysts available in the market

Large number of user friendly analytics tools available for data processing

Cost of storage has hugely dropped

Question 29
0.17 / 0.17 pts
Training and testing datasets are developed during the _________ phase.

Model planning

Model building

Operationalize

None
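
For readers unfamiliar with the term, developing training and testing datasets usually means partitioning the available records into two disjoint subsets. The sketch below is one simple way to do that; the helper name `train_test_split`, the 80/20 ratio, and the seed are assumptions made for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset: ten record identifiers.
data = list(range(10))
train, test = train_test_split(data)
```

The model is fitted on `train` only, and `test` is held back to estimate how well it generalizes to unseen records.
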
Question 30
0.17 / 0.17 pts
In the _________ phase, the final report/technical document of the process is prepared.

Model planning

Model building

Operationalize

None
12/4/22, 9:05 AM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
13 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 4 at 9:04
This attempt took 13 minutes.

Question 1 0.25
/ 0.25 pts

_____________ is the practice of designing and building systems for
collecting, storing, and analyzing data at scale.

 
Data Analysis

 
Data engineering

 
Data Mining

 
Machine Learning

Question 2 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 3 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
High level understanding of Technical know-how specific to the domain.

 
Client Requirement understanding

 
Understanding of domain concepts

 
All of the listed

Question 4 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Static & comparative

 
Standard & dynamic

 
Descriptive & Static

 
Predictive & Prescriptive

Question 5 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by
enterprises for data analysis and management.

 
None of the Above.

 
Business Intelligence

 
Artificial Intelligence

 
Machine Learning

Question 6 0.25
/ 0.25 pts

The Data Warehouse is a “Schema-on-Load” Approach.

 
True

 
False

Question 7 0.25
/ 0.25 pts

Data scientist is not responsible for 

 
Data analytics

 
Data manipulation

 
Building continuous data stream

 
Data mining

Question 8 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the
best possible answer.

 
Data Preparation

 
Data Collection

 
Data Requirements

 
Data Understanding

Question 9 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Writing

 
Data Annotation

 
Data Label

 
Data Wrangling

Question 10 0.25
/ 0.25 pts

Exploratory data analysis does not help in

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

 
In univariate and bivariate variable analysis

 
Derivation of new attributes

Question 11 0.25
/ 0.25 pts

Suppose you visit a doctor for a fever and the doctor asks questions
such as: Were you exposed to rain or a cold climate? Did you have
contact with a sick person? Did you eat food from outside? The doctor
then reaches a conclusion based on your responses. This scenario is
analogous to which stage of data analytics?

 
Diagnostic

 
Descriptive

 
Prescriptive

 
Predictive

Question 12 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the
voice is male or female a Data Science activity?

 
True

 
False

Question 13 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this trend.
Which of the following ‘analytical’ approaches would you suggest?

 
Predictive analytics

 
None of the above

 
Descriptive analytics

 
Diagnostic analytics

Question 14 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are positively correlated

 
The features are independent

 
The features are negatively correlated

Question 15 0.25
/ 0.25 pts

While playing cards, predicting that the next card will be a joker is a:

 
Machine Learning Activity

 
None of the above

 
Data Science Activity

 
Probability Calculation activity

Question 16 0.25
/ 0.25 pts

Type of problem in statistics will be

 
Unstructured

 
None of these

 
Semi-Structured

 
Well Structured

Question 17 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Security

 
Data quantity

 
Data Availability

 
All of these

Question 18 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:

 
Root Cause Analysis

 
Cognitive Analytics

 
Statistical Analysis

 
Pattern Recognition

Question 19 0.25
/ 0.25 pts

Big data is defined as ______________?

 
An engineering discipline that is concerned with all aspects of software
production.

 
The process of making machines capable of mimicking human behavior

 
Collections of datasets whose volume, velocity or variety is so large that
it is difficult to store, manage, process and analyze the data using
traditional databases and data processing tools

 
The process of extracting and creating information from raw data by
using different techniques

Question 20 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Predictive Analytics

 
Descriptive Analytics

 
Diagnostic Analytics

 
Prescriptive Analytics

Quiz Score:
5 out of 5

04/12/2022, 08:44 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
30 minutes 4.75 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


4.75 out of 5
Submitted Dec 4 at 8:41
This attempt took 30 minutes.

Question 1 0.25
/ 0.25 pts

Which one of these belongs to the Data Science career?

 
Data Administrator

 
Data Creator

 
None

 
Mathematical Analyst

Question 2 0.25
/ 0.25 pts

___________ is considered to be a sub-field of or one of the tools of AI,
which provides machines with the capability of learning from
experience.

 
Data analysis

 
Data Science

 
Machine Learning

 
Artificial Intelligence

Question 3 0.25
/ 0.25 pts

What is the process of training machine learning models, evaluating
their performance, and using them to make predictions?

 
Predictive Modelling

 
Visualization

 
Feature Engineering

 
Data Exploration

Question 4 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
High level understanding of Technical know-how specific to the domain.

 
Understanding of domain concepts

 
All of the listed

 
Client Requirement understanding

Incorrect Question 5 0
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
True

 
False

Question 6 0.25
/ 0.25 pts

How is data science understood in terms of interpretability and
Accuracy?

 
High Accuracy, Low Interpretability

 
low Accuracy, high Interpretability

Question 7 0.25
/ 0.25 pts

Exploratory data analysis does not help in


 
Derivation of new attributes

 
In univariate and bivariate variable analysis

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

Question 8 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Wrangling

 
Data Annotation

 
Data Label

 
Data Writing

Question 9 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Collection

 
Data Preparation

 
Data Cleaning

 
Data Understanding

Question 10 0.25
/ 0.25 pts

Data preprocessing improves data quality and makes data mining
algorithms efficient and effective. What are the data preprocessing
tasks along with data cleaning?

 
Data Integration

 
Data Reduction

 
Data Transformation

 
All of the above

Question 11 0.25
/ 0.25 pts

A scenario where you visited a doctor for your fever and the doctor
asked you questions like: Were you exposed to rain or cold climate; or
did you have contact with a sick person; or did you have food from
outside etc., and then doctor comes to a conclusion based on your
response to the queries can be considered analogous to which stage of
data analytics.

 
Predictive

 
Descriptive

 
Diagnostic

 
Prescriptive

Question 12 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are positively correlated

 
The features are independent

 
The features are negatively correlated

Question 13 0.25
/ 0.25 pts

While playing cards predicting the next card to be a joker is 

 
Machine Learning Activity

 
Data Science Activity

 
None of the above

 
Probability Calculation activity

Question 14 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this movie trend.
Which of the following ‘analytical’ approaches would you suggest?

 
Diagnostic analytics

 
Predictive analytics

 
Descriptive analytics

 
None of the above

Question 15 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the
voice is male or female a Data Science activity?

 
True

 
False

Question 16 0.25
/ 0.25 pts

Type of problem in statistics will be

 
Well Structured

 
Unstructured

 
Semi-Structured

 
None of these

Question 17 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Availability

 
Data Security

 
All of these

 
Data quantity

Question 18 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Image & Speech Recognition

 
Recommendation Systems

 
Online Price Comparison

 
Privacy Checker

Question 19 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:

 
Statistical Analysis

 
Pattern Recognition

 
Root Cause Analysis

 
Cognitive Analytics

Question 20 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Predictive Analytics

 
Descriptive Analytics

 
Prescriptive Analytics

 
Diagnostic Analytics

Quiz Score:
4.75 out of 5

04/12/2022, 08:52 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
14 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 4 at 8:52
This attempt took 14 minutes.

Question 1 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
True

 
False

Question 2 0.25
/ 0.25 pts

Data science is a ___________ Analysis.

 
Standard & dynamic

 
Descriptive & Static

 
Static & Comparative.

 
Predictive & Prescriptive

Question 3 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Data related

 
Technology related

 
All three

 
Organization related

Question 4 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by
enterprises for data analysis and management.

 
Business Intelligence

 
Artificial Intelligence

 
None of the Above.

 
Machine Learning

Question 5 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Predictive & Prescriptive

 
Descriptive & Static

 
Static & comparative

 
Standard & dynamic

Question 6 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 7 0.25
/ 0.25 pts

Exploratory data analysis does not help in

 
In univariate and bivariate variable analysis

 
Derivation of new attributes

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

Question 8 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Targeted advertising as per customer's need

 
Creation of new fiscal policy

 
Product recommender systems

 
Fraud detection and prevention system in a bank

Question 9 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the
best possible answer.

 
Data Preparation

 
Data Understanding

 
Data Requirements

 
Data Collection

Question 10 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Label

 
Data Wrangling

 
Data Annotation

 
Data Writing

Question 11 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the
voice is male or female a Data Science activity?

 
True

 
False

Question 12 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-
functional requirements such as security, fault tolerance and efficiency
in data processing applications?

 
Data Scientist

 
Data Engineer

 
Business Intelligence Analyst

 
Data Analyst

Question 13 0.25
/ 0.25 pts

While playing cards predicting the next card to be a joker is 

 
None of the above

 
Probability Calculation activity

 
Machine Learning Activity

 
Data Science Activity

Question 14 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic
process?

 
Business Intelligence

 
Data Mining

 
Data Warehouse

 
Data Science

Question 15 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this movie trend.
Which of the following ‘analytical’ approaches would you suggest?

 
None of the above

 
Diagnostic analytics

 
Descriptive analytics

 
Predictive analytics

Question 16 0.25
/ 0.25 pts

Type of problem in statistics will be

 
None of these

 
Well Structured

 
Unstructured

 
Semi-Structured

Question 17 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What should I do?

 
What happened?

 
What will happen?

 
Why did it happen?

Question 18 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data quantity

 
Data Security

 
All of these

 
Data Availability

Question 19 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Variety

 
Volume

Question 20 0.25
/ 0.25 pts

Big data is defined as ______________?

 
The process of making machines capable of mimicking human behavior

 
An engineering discipline that is concerned with all aspects of software
production.

 
The process of extracting and creating information from raw data by
using different techniques

 
Collections of datasets whose volume, velocity or variety is so large that
it is difficult to store, manage, process and analyze the data using
traditional databases and data processing tools

Quiz Score:
5 out of 5

12/4/22, 8:48 AM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due Dec 4 at 19:00 Points 5 Questions 20 Available Dec 3 at 19:00 - Dec 4 at 19:00 24 hours
Time Limit 30 Minutes

Instructions

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score

LATEST Attempt 1 15 minutes 5 out of 5

 Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5


Submitted Dec 4 at 8:47
This attempt took 15 minutes.

Question 1 0.25 / 0.25 pts

Data science is a ___________ Analysis.

  Static & Comparative.

  Descriptive & Static

  Standard & dynamic

  Predictive & Prescriptive

Question 2 0.25 / 0.25 pts

Data science is an Interdisciplinary field.

  True

  False

Question 3 0.25 / 0.25 pts

Data Engineers are also known as Data Miners.

  True

  False

Question 4 0.25 / 0.25 pts

_________________ is the science of collecting, analyzing, presenting, and interpreting data. Objective to
draw conclusions on the population.

  Statistics

  Business Intelligence

  Data Analytics

  Computational Intelligence

Question 5 0.25 / 0.25 pts

___________ comprises the strategies and technologies used by enterprises for data analysis and
management.

  None of the Above.

  Artificial Intelligence

  Business Intelligence

  Machine Learning

Question 6 0.25 / 0.25 pts

Which one of these belongs to the Data Science career?

  None

  Mathematical Analyst

  Data Administrator

  Data Creator

Question 7 0.25 / 0.25 pts

Which one refers to the labelling of the data?   

  Data Writing

  Data Annotation

  Data Label

  Data Wrangling

Question 8 0.25 / 0.25 pts

Data preprocessing improves data quality and makes data mining algorithms efficient and effective. What
are the data preprocessing tasks along with data cleaning?

  Data Integration

  All of the above

  Data Reduction

  Data Transformation

Question 9 0.25 / 0.25 pts

A data scientist is not responsible for

  Data analytics

  Data manipulation

  Data mining

  Building continuous data stream

Question 10 0.25 / 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the best possible answer.

  Data Understanding

  Data Preparation

  Data Requirements

  Data Collection

Question 11 0.25 / 0.25 pts

While playing cards predicting the next card to be a joker is 

  Probability Calculation activity

  Machine Learning Activity

  Data Science Activity

  None of the above

Question 12 0.25 / 0.25 pts

A scenario where you visited a doctor for your fever and the doctor asked you questions like: Were you
exposed to rain or cold climate; or did you have contact with a sick person; or did you have food from outside
etc., and then doctor comes to a conclusion based on your response to the queries can be considered
analogous to which stage of data analytics.

  Diagnostic

  Prescriptive

  Predictive

  Descriptive

Question 13 0.25 / 0.25 pts

Is extracting frequencies from audio signals to determine whether the voice is male or female a Data
Science activity?

  True

  False

Question 14 0.25 / 0.25 pts

Among the following, which user role is more concerned with non-functional requirements such as security,
fault tolerance and efficiency in data processing applications?

  Data Analyst

  Business Intelligence Analyst

  Data Scientist

  Data Engineer

Question 15 0.25 / 0.25 pts

The scatterplot implies that

  The features are independent

  The features are positively correlated

  The features are negatively correlated

Question 16 0.25 / 0.25 pts

Type of problem in statistics will be

  Unstructured

  Semi-Structured

  None of these

  Well Structured

Question 17 0.25 / 0.25 pts

Which of the following is not a characteristic of Big Data?

  Velocity

  Variety

  Volume

  Vision

Question 18 0.25 / 0.25 pts

Which of the following questions is answered by Predictive Analytics?

  What happened?

  Why did it happen?

  What will happen?

  What should I do?

Question 19 0.25 / 0.25 pts

Big data is defined as ______________?

  The process of making machines capable of mimicking human behavior

  An engineering discipline that is concerned with all aspects of software production.

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process
and analyze the data using traditional databases and data processing tools

  The process of extracting and creating information from raw data by using different techniques

Question 20 0.25 / 0.25 pts

Diagnostic Analysis is often referred to as:

  Root Cause Analysis

  Statistical Analysis

  Cognitive Analytics

  Pattern Recognition

Quiz Score: 5 out of 5

04/12/2022, 00:12 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
19 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 3 at 23:56
This attempt took 19 minutes.

Question 1 0.25
/ 0.25 pts

How is data science understood in terms of interpretability and Accuracy?

 
High Accuracy, Low Interpretability

 
low Accuracy, high Interpretability

Question 2 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by enterprises for data analysis and
management.

 
Artificial Intelligence

 
Business Intelligence

 
None of the Above.

 
Machine Learning

Question 3 0.25
/ 0.25 pts

The Data Warehouse is a “Schema-on-Load” Approach.

 
True

 
False

Question 4 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 5 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Static & comparative

 
Standard & dynamic

 
Predictive & Prescriptive

 
Descriptive & Static

Question 6 0.25
/ 0.25 pts

What is the process of training machine learning models, evaluating their performance, and using them to
make predictions?

 
Visualization

 
Predictive Modelling

 
Data Exploration

 
Feature Engineering

Question 7 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Fraud detection and prevention system in a bank

 
Creation of new fiscal policy

 
Targeted advertising as per customer's need

 
Product recommender systems

Question 8 0.25
/ 0.25 pts

Data preprocessing improves data quality and makes data mining algorithms efficient and effective. What
are the data preprocessing tasks along with data cleaning?

 
Data Transformation

 
Data Reduction

 
All of the above

 
Data Integration

Question 9 0.25
/ 0.25 pts

A data scientist is not responsible for

 
Data mining

 
Data analytics

 
Building continuous data stream

 
Data manipulation

Question 10 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the best possible answer.

 
Data Collection

 
Data Understanding

 
Data Preparation

 
Data Requirements

Question 11 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-functional requirements such as security,
fault tolerance and efficiency in data processing applications?

 
Data Scientist

 
Business Intelligence Analyst

 
Data Engineer

 
Data Analyst

Question 12 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are positively correlated

 
The features are independent

 
The features are negatively correlated

Question 13 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic process?

 
Data Mining

 
Data Warehouse

 
Business Intelligence

 
Data Science

Question 14 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that the Tamil version of the movie was
viewed the most out of the 3 dubbed languages. The data science team plans to develop a recommendation
system for future releases based on this movie trend. Which of the following ‘analytical’ approaches would
you suggest?

 
None of the above

 
Predictive analytics

 
Diagnostic analytics

 
Descriptive analytics

Question 15 0.25
/ 0.25 pts

A scenario where you visited a doctor for your fever and the doctor asked you questions like: Were you
exposed to rain or cold climate; or did you have contact with a sick person; or did you have food from outside
etc., and then doctor comes to a conclusion based on your response to the queries can be considered
analogous to which stage of data analytics.

 
Descriptive

 
Diagnostic

 
Prescriptive

 
Predictive

Question 16 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
All of these

 
Data Availability

 
Data quantity

 
Data Security

Question 17 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:

 
Cognitive Analytics

 
Statistical Analysis

 
Root Cause Analysis

 
Pattern Recognition

Question 18 0.25
/ 0.25 pts

Big data is defined as ______________?

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process and
analyze the data using traditional databases and data processing tools

 
An engineering discipline that is concerned with all aspects of software production.

 
The process of making machines capable of mimicking human behavior

 
The process of extracting and creating information from raw data by using different techniques

Question 19 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Diagnostic Analytics

 
Prescriptive Analytics

 
Descriptive Analytics

 
Predictive Analytics

Question 20 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Recommendation Systems

 
Privacy Checker

 
Image & Speech Recognition

 
Online Price Comparison

Quiz Score:
5 out of 5

03/12/2022, 22:49 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
10 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 3 at 22:49
This attempt took 10 minutes.

Question 1 0.25
/ 0.25 pts

_____________ is the practice of designing and building systems for
collecting, storing, and analyzing data at scale.

 
Data Analysis

 
Data Mining

 
Machine Learning

 
Data engineering

Question 2 0.25
/ 0.25 pts

Which one of these belongs to the Data Science career?

 
Data Creator

 
Data Administrator

 
None

 
Mathematical Analyst

Question 3 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
False

 
True

Question 4 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
All of the listed

 
High level understanding of Technical know-how specific to the domain.

 
Client Requirement understanding

Question 5 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Organization related

 
All three

 
Technology related

 
Data related

Question 6 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 7 0.25
/ 0.25 pts

A data scientist is not responsible for

 
Data analytics

 
Data manipulation

 
Data mining

 
Building continuous data stream

Question 8 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the
best possible answer.

 
Data Collection

 
Data Requirements

 
Data Preparation

 
Data Understanding

Question 9 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Collection

 
Data Cleaning

 
Data Understanding

 
Data Preparation

Question 10 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Writing

 
Data Wrangling

 
Data Label

 
Data Annotation

Question 11 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are negatively correlated

 
The features are positively correlated

 
The features are independent

Question 12 0.25
/ 0.25 pts

A scenario where you visited a doctor for your fever and the doctor
asked you questions like: Were you exposed to rain or cold climate; or
did you have contact with a sick person; or did you have food from
outside etc., and then doctor comes to a conclusion based on your
response to the queries can be considered analogous to which stage of
data analytics.

 
Descriptive

 
Predictive

 
Prescriptive

 
Diagnostic

Question 13 0.25
/ 0.25 pts

While playing cards predicting the next card to be a joker is 

 
Probability Calculation activity

 
Machine Learning Activity

 
Data Science Activity

 
None of the above

Question 14 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-
functional requirements such as security, fault tolerance and efficiency
in data processing applications?

 
Data Engineer

 
Data Analyst

 
Business Intelligence Analyst

 
Data Scientist

Question 15 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this movie trend.
Which of the following ‘analytical’ approaches would you suggest?

 
Predictive analytics

 
Descriptive analytics

 
None of the above

 
Diagnostic analytics

Question 16 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Availability

 
Data Security

 
Data quantity

 
All of these

Question 17 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:

 
Pattern Recognition

 
Statistical Analysis

 
Cognitive Analytics

 
Root Cause Analysis

Question 18 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Variety

 
Volume

Question 19 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Image & Speech Recognition

 
Online Price Comparison

 
Privacy Checker

 
Recommendation Systems

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What happened?

 
What will happen?

 
What should I do?

 
Why did it happen?

Quiz Score:
5 out of 5

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
3 minutes 4.75 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


4.75 out of 5
Submitted Dec 3 at 23:10
This attempt took 3 minutes.

Incorrect Question 1 0
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Descriptive & Static

 
Predictive & Prescriptive

 
Standard & dynamic

 
Static & comparative

Question 2 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
High level understanding of Technical know-how specific to the domain.

 
All of the listed

 
Client Requirement understanding

Question 3 0.25
/ 0.25 pts

___________ is considered to be a sub-field of or one of the tools of AI, which provides machines with the
capability of learning from experience.

 
Artificial Intelligence

 
Machine Learning

 
Data Science

 
Data analysis

Question 4 0.25
/ 0.25 pts

_________________ is the science of collecting, analyzing, presenting, and interpreting data. Its objective is to
draw conclusions about the population.

 
Business Intelligence

 
Data Analytics

 
Computational Intelligence

 
Statistics

Question 5 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 6 0.25
/ 0.25 pts

Which one of these belongs to a Data Science career?

 
Data Creator

 
Data Administrator

 
None

 
Mathematical Analyst

Question 7 0.25
/ 0.25 pts

Exploratory data analysis does not help in

 
Univariate and bivariate variable analysis

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

 
Derivation of new attributes

Question 8 0.25
/ 0.25 pts

In which phase are duplicates (of the data) removed? Choose the best possible answer.

 
Data Requirements

 
Data Preparation

 
Data Understanding

 
Data Collection

Question 9 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a meal. Which phase of the data
science process does the analogy fit correctly? (Choose the best answer)

 
Data Understanding

 
Data Cleaning

 
Data Collection

 
Data Preparation

Question 10 0.25
/ 0.25 pts

A data scientist is not responsible for

 
Data analytics

 
Data manipulation

 
Data mining

 
Building continuous data stream

Question 11 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic process?

 
Business Intelligence

 
Data Science

 
Data Mining

 
Data Warehouse

Question 12 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that the Tamil version of the movie was
viewed the most among the 3 dubbed languages. The data science team plans to develop a recommendation
system for future releases based on this trend. Which of the following ‘analytical’ approaches would
you suggest?

 
Predictive analytics

 
Descriptive analytics

 
None of the above

 
Diagnostic analytics

Question 13 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-functional requirements such as security,
fault tolerance and efficiency in data processing applications?

 
Data Analyst

 
Data Scientist

 
Data Engineer

 
Business Intelligence Analyst

Question 14 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is

 
None of the above

 
Machine Learning Activity

 
Data Science Activity

 
Probability Calculation activity

Question 15 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are positively correlated

 
The features are independent

 
The features are negatively correlated

Question 16 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:

 
Cognitive Analytics

 
Pattern Recognition

 
Statistical Analysis

 
Root Cause Analysis

Question 17 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Availability

 
Data quantity

 
All of these

 
Data Security

Question 18 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Volume

 
Variety

Question 19 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What should I do?

 
Why did it happen?

 
What will happen?

 
What happened?

Question 20 0.25
/ 0.25 pts

Big data is defined as ______________?

 
The process of making machines capable of mimicking human behavior

 
The process of extracting and creating information from raw data by using different techniques

 
An engineering discipline that is concerned with all aspects of software production.

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process
and analyze the data using traditional databases and data processing tools

Quiz Score:
4.75 out of 5
03/12/2022, 23:22 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due: Dec 4 at 19:00
Points: 5
Questions: 20
Available: Dec 3 at 19:00 - Dec 4 at 19:00 (24 hours)
Time Limit: 30 Minutes

Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student.

Attempt History
Attempt: LATEST Attempt 1
Time: 2 minutes
Score: 5 out of 5

Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5
Submitted Dec 3 at 23:22
This attempt took 2 minutes.

Question 1 0.25
/ 0.25 pts

_________________ is the science of collecting, analyzing, presenting,
and interpreting data. Its objective is to draw conclusions about the
population.

 
Data Analytics

 
Business Intelligence

 
Statistics

 
Computational Intelligence

Question 2 0.25
/ 0.25 pts

___________ is considered to be a sub-field of or one of the tools of
AI, which provides machines with the capability of learning from
experience.

 
Data analysis

 
Machine Learning

 
Artificial Intelligence

 
Data Science

Question 3 0.25
/ 0.25 pts

Which one of these belongs to a Data Science career?

 
Data Creator

 
Data Administrator

 
None

 
Mathematical Analyst

Question 4 0.25
/ 0.25 pts

How is data science understood in terms of interpretability and
accuracy?

 
High Accuracy, Low Interpretability

 
Low Accuracy, High Interpretability

Question 5 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Data related

 
Organization related

 
All three

 
Technology related

Question 6 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
False

 
True

Question 7 0.25
/ 0.25 pts

In which phase are duplicates (of the data) removed? Choose the
best possible answer.

 
Data Preparation

 
Data Understanding

 
Data Requirements

 
Data Collection

Question 8 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Label

 
Data Wrangling

 
Data Annotation

 
Data Writing

Question 9 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Creation of new fiscal policy

 
Product recommender systems

 
Fraud detection and prevention system in a bank

 
Targeted advertising as per customer's need

Question 10 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Collection

 
Data Preparation

 
Data Understanding

 
Data Cleaning

Question 11 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-
functional requirements such as security, fault tolerance and efficiency
in data processing applications?

 
Data Analyst

 
Data Scientist

 
Data Engineer

 
Business Intelligence Analyst

Question 12 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is

 
Machine Learning Activity

 
None of the above

 
Probability Calculation activity

 
Data Science Activity

Question 13 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are negatively correlated

 
The features are independent

 
The features are positively correlated

Question 14 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic
process?

 
Data Warehouse

 
Data Mining

 
Business Intelligence

 
Data Science

Question 15 0.25
/ 0.25 pts

You visit a doctor for a fever, and the doctor asks you questions such
as: Were you exposed to rain or a cold climate? Did you have contact
with a sick person? Did you eat food from outside? The doctor then
reaches a conclusion based on your responses. This scenario is
analogous to which stage of data analytics?

 
Diagnostic

 
Predictive

 
Descriptive

 
Prescriptive

Question 16 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Recommendation Systems

 
Image & Speech Recognition

 
Privacy Checker

 
Online Price Comparison

Question 17 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Prescriptive Analytics

 
Predictive Analytics

 
Diagnostic Analytics

 
Descriptive Analytics

Question 18 0.25
/ 0.25 pts

Big data is defined as ______________?

 
The process of extracting and creating information from raw data by
using different techniques

 
An engineering discipline that is concerned with all aspects of software
production.

 
Collections of datasets whose volume, velocity or variety is so large that
it is difficult to store, manage, process and analyze the data using
traditional databases and data processing tools

 
The process of making machines capable of mimicking human behavior

Question 19 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Volume

 
Velocity

 
Variety

 
Vision

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What will happen?

 
What happened?

 
What should I do?

 
Why did it happen?

Quiz Score:
5 out of 5

03/12/2022, 23:19 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due: Dec 4 at 19:00
Points: 5
Questions: 20
Available: Dec 3 at 19:00 - Dec 4 at 19:00 (24 hours)
Time Limit: 30 Minutes

Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student.

Attempt History
Attempt: LATEST Attempt 1
Time: 30 minutes
Score: 5 out of 5

Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5
Submitted Dec 3 at 23:18
This attempt took 30 minutes.

Question 1 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
All of the listed

 
High level understanding of Technical know-how specific to the domain.

 
Client Requirement understanding

Question 2 0.25
/ 0.25 pts

___________ is considered to be a sub-field of or one of the tools of AI,
which provides machines with the capability of learning from experience.

 
Machine Learning

 
Data analysis

 
Data Science

 
Artificial Intelligence

Question 3 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Standard & dynamic

 
Predictive & Prescriptive

 
Static & comparative

 
Descriptive & Static

Question 4 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by
enterprises for data analysis and management.

 
Artificial Intelligence

 
Business Intelligence

 
None of the Above.

 
Machine Learning

Question 5 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Technology related

 
All three

 
Organization related

 
Data related

Question 6 0.25
/ 0.25 pts

Data science is a ___________ Analysis.

 
Static & Comparative.

 
Standard & dynamic

 
Predictive & Prescriptive

 
Descriptive & Static

Question 7 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Creation of new fiscal policy

 
Product recommender systems

 
Fraud detection and prevention system in a bank

 
Targeted advertising as per customer's need

Question 8 0.25
/ 0.25 pts

Data preprocessing improves data quality and makes data mining
algorithms efficient and effective. Which data preprocessing tasks
accompany data cleaning?

 
Data Reduction

 
Data Transformation

 
Data Integration

 
All of the above

Question 9 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Cleaning

 
Data Collection

 
Data Preparation

 
Data Understanding

Question 10 0.25
/ 0.25 pts

In which phase are duplicates (of the data) removed? Choose the best
possible answer.

 
Data Preparation

 
Data Understanding

 
Data Collection

 
Data Requirements

Question 11 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic
process?

 
Business Intelligence

 
Data Science

 
Data Warehouse

 
Data Mining

Question 12 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is

 
Probability Calculation activity

 
Data Science Activity

 
Machine Learning Activity

 
None of the above

Question 13 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-
functional requirements such as security, fault tolerance and efficiency in
data processing applications?

 
Data Engineer

 
Data Analyst

 
Business Intelligence Analyst

 
Data Scientist

Question 14 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are independent

 
The features are positively correlated

 
The features are negatively correlated

Question 15 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the
voice is male or female a Data Science activity?

 
True

 
False

Question 16 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Security

 
Data quantity

 
All of these

 
Data Availability

Question 17 0.25
/ 0.25 pts

Type of problem in statistics will be

 
Unstructured

 
Well Structured

 
Semi-Structured

 
None of these

Question 18 0.25
/ 0.25 pts

Big data is defined as ______________?

 
An engineering discipline that is concerned with all aspects of software
production.

 
Collections of datasets whose volume, velocity or variety is so large that it
is difficult to store, manage, process and analyze the data using traditional
databases and data processing tools

 
The process of extracting and creating information from raw data by using
different techniques

 
The process of making machines capable of mimicking human behavior

Question 19 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:

 
Cognitive Analytics

 
Statistical Analysis

 
Pattern Recognition

 
Root Cause Analysis

Question 20 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Prescriptive Analytics

 
Diagnostic Analytics

 
Predictive Analytics

 
Descriptive Analytics

Quiz Score:
5 out of 5

12/3/22, 11:16 PM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due: Dec 4 at 19:00
Points: 5
Questions: 20
Available: Dec 3 at 19:00 - Dec 4 at 19:00 (24 hours)
Time Limit: 30 Minutes

Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student.

Attempt History
Attempt: LATEST Attempt 1
Time: 4 minutes
Score: 5 out of 5

Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5
Submitted Dec 3 at 23:16
This attempt took 4 minutes.

Question 1 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Organization related

 
Data related

 
All three

 
Technology related

Question 2 0.25
/ 0.25 pts

Data science is a ___________ Analysis.

 
Predictive & Prescriptive

 
Descriptive & Static

 
Standard & dynamic

 
Static & Comparative.

Question 3 0.25
/ 0.25 pts

_____________ is the practice of designing and building systems for
collecting, storing, and analyzing data at scale.

 
Data Analysis

 
Data engineering

 
Machine Learning

 
Data Mining

Question 4 0.25
/ 0.25 pts

What is the process of training machine learning models, evaluating
their performance, and using them to make predictions?

 
Data Exploration

 
Visualization

 
Predictive Modelling

 
Feature Engineering

Question 5 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
True

 
False

Question 6 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
High level understanding of Technical know-how specific to the domain.

 
Understanding of domain concepts

 
All of the listed

 
Client Requirement understanding

Question 7 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Collection

 
Data Cleaning

 
Data Preparation

 
Data Understanding

Question 8 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Product recommender systems

 
Creation of new fiscal policy

 
Fraud detection and prevention system in a bank

 
Targeted advertising as per customer's need

Question 9 0.25
/ 0.25 pts

In which phase are duplicates (of the data) removed? Choose the
best possible answer.

 
Data Preparation

 
Data Understanding

 
Data Requirements

 
Data Collection

Question 10 0.25
/ 0.25 pts

Data preprocessing improves data quality and makes data mining
algorithms efficient and effective. Which data preprocessing tasks
accompany data cleaning?

 
All of the above

 
Data Integration

 
Data Transformation

 
Data Reduction

Question 11 0.25
/ 0.25 pts

You visit a doctor for a fever, and the doctor asks you questions such
as: Were you exposed to rain or a cold climate? Did you have contact
with a sick person? Did you eat food from outside? The doctor then
reaches a conclusion based on your responses. This scenario is
analogous to which stage of data analytics?

 
Prescriptive

 
Diagnostic

 
Predictive

 
Descriptive

Question 12 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-
functional requirements such as security, fault tolerance and efficiency
in data processing applications?

 
Business Intelligence Analyst

 
Data Analyst

 
Data Scientist

 
Data Engineer

Question 13 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most among the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this trend. Which of
the following ‘analytical’ approaches would you suggest?

 
Predictive analytics

 
Descriptive analytics

 
Diagnostic analytics

 
None of the above

Question 14 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are independent

 
The features are negatively correlated

 
The features are positively correlated

Question 15 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is

 
Machine Learning Activity

 
Probability Calculation activity

 
Data Science Activity

 
None of the above

Question 16 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Descriptive Analytics

 
Predictive Analytics

 
Diagnostic Analytics

 
Prescriptive Analytics

Question 17 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Recommendation Systems

 
Privacy Checker

 
Online Price Comparison

 
Image & Speech Recognition

Question 18 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Variety

 
Volume

Question 19 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
All of these

 
Data quantity

 
Data Security

 
Data Availability

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What happened?

 
Why did it happen?

 
What should I do?

 
What will happen?

Quiz Score:
5 out of 5

12/4/22, 8:21 AM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due: Dec 4 at 19:00
Points: 5
Questions: 20
Available: Dec 3 at 19:00 - Dec 4 at 19:00 (24 hours)
Time Limit: 30 Minutes

Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student.

Attempt History
Attempt: LATEST Attempt 1
Time: 12 minutes
Score: 5 out of 5

Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5
Submitted Dec 4 at 8:21
This attempt took 12 minutes.

Question 1 0.25
/ 0.25 pts

What is the process of training machine learning models, evaluating their performance, and using them to
make predictions?

 
Visualization

 
Feature Engineering

 
Predictive Modelling

 
Data Exploration

Question 2 0.25
/ 0.25 pts

The Data Warehouse is a “Schema-on-Load” approach.

 
True

 
False

Question 3 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
All of the listed

 
Client Requirement understanding

 
High level understanding of Technical know-how specific to the domain.

Question 4 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Standard & dynamic

 
Static & comparative

 
Descriptive & Static

 
Predictive & Prescriptive

Question 5 0.25
/ 0.25 pts

_____________ is the practice of designing and building systems for collecting, storing, and analyzing data
at scale.

 
Data engineering

 
Machine Learning

 
Data Mining

 
Data Analysis

Question 6 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 7 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a meal. Which phase of the data
science process does the analogy fit correctly? (Choose the best answer)

 
Data Understanding

 
Data Preparation

 
Data Cleaning

 
Data Collection

Question 8 0.25
/ 0.25 pts

Data preprocessing improves data quality and makes data mining algorithms efficient and effective. Which
data preprocessing tasks accompany data cleaning?

 
All of the above

 
Data Reduction

 
Data Transformation

 
Data Integration

Question 9 0.25
/ 0.25 pts

Exploratory data analysis does not help in

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

 
Derivation of new attributes

 
Univariate and bivariate variable analysis

Question 10 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Annotation

 
Data Writing

 
Data Wrangling

 
Data Label

Question 11 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the voice is male or female a Data
Science activity?

 
True

 
False

Question 12 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic process?

 
Business Intelligence

 
Data Warehouse

 
Data Mining

 
Data Science

Question 13 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that the Tamil version of the movie was
viewed the most among the 3 dubbed languages. The data science team plans to develop a recommendation
system for future releases based on this trend. Which of the following ‘analytical’ approaches would
you suggest?

 
Predictive analytics

 
None of the above

 
Diagnostic analytics

 
Descriptive analytics

Question 14 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are independent

 
The features are positively correlated

 
The features are negatively correlated

Question 15 0.25
/ 0.25 pts

You visit a doctor for a fever, and the doctor asks you questions such as: Were you exposed to rain or a
cold climate? Did you have contact with a sick person? Did you eat food from outside? The doctor then
reaches a conclusion based on your responses. This scenario is analogous to which stage of data
analytics?

 
Descriptive

 
Prescriptive

 
Diagnostic

 
Predictive

Question 16 0.25
/ 0.25 pts

Type of problem in statistics will be

 
None of these

 
Well Structured

 
Semi-Structured

 
Unstructured

Question 17 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Security

 
Data Availability

 
All of these

 
Data quantity

Question 18 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Online Price Comparison

 
Recommendation Systems

 
Image & Speech Recognition

 
Privacy Checker

Question 19 0.25
/ 0.25 pts

Big data is defined as ______________?

 
The process of extracting and creating information from raw data by using different techniques

 
The process of making machines capable of mimicking human behavior

 
An engineering discipline that is concerned with all aspects of software production.

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process
and analyze the data using traditional databases and data processing tools

Question 20 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Variety

 
Volume

 
Velocity

 
Vision

Quiz Score:
5 out of 5

12/3/22, 9:41 PM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due: Dec 4 at 19:00
Points: 5
Questions: 20
Available: Dec 3 at 19:00 - Dec 4 at 19:00 (24 hours)
Time Limit: 30 Minutes

Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student.

Attempt History
Attempt: LATEST Attempt 1
Time: 10 minutes
Score: 4.25 out of 5

Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 4.25 out of 5
Submitted Dec 3 at 21:39
This attempt took 10 minutes.

Question 1 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
All of the listed

 
Client Requirement understanding

 
High level understanding of Technical know-how specific to the domain.

 
Understanding of domain concepts

Question 2 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by enterprises for data analysis and management.

 
Machine Learning

 
Business Intelligence

 
Artificial Intelligence

 
None of the Above.

Question 3 0.25
/ 0.25 pts

The Data Warehouse is a “Schema-on-Load” approach.

 
True

 
False


Question 4 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 5 0.25
/ 0.25 pts

_________________ is the science of collecting, analyzing, presenting, and interpreting data. The objective is to draw conclusions about the
population.

 
Business Intelligence

 
Computational Intelligence

 
Data Analytics

 
Statistics

Question 6 0.25
/ 0.25 pts

___________ is considered to be a sub-field of or one of the tools of AI, which provides machines with the capability of learning from
experience.


 
Data Science

 
Data analysis

 
Machine Learning

 
Artificial Intelligence

Incorrect
Question 7 0
/ 0.25 pts

Exploratory data analysis does not help in

 
In univariate and bivariate variable analysis

 
Finding statistical estimates of a variable

 
Derivation of new attributes

 
Finding out the data type of a variable

Question 8 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the best possible answer.

 
Data Preparation


 
Data Requirements

 
Data Collection

 
Data Understanding

Question 9 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Label

 
Data Writing

 
Data Annotation

 
Data Wrangling

Question 10 0.25
/ 0.25 pts

Data preprocessing improves the data quality and makes data mining algorithms efficient and effective. What are the data preprocessing
tasks along with Data cleaning?

 
Data Integration

 
Data Reduction


 
Data Transformation

 
All of the above

Question 11 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is

 
None of the above

 
Probability Calculation activity

 
Machine Learning Activity

 
Data Science Activity

Incorrect Question 12 0
/ 0.25 pts

Which among the following advocates an exploratory and dynamic process?

 
Data Warehouse

 
Data Science

 
Business Intelligence


 
Data Mining

Question 13 0.25
/ 0.25 pts

Can extracting frequencies from audio signals to determine whether the voice is male or female be considered a Data Science activity?

 
True

 
False

Incorrect Question 14 0
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a recommendation system for future releases based on this movie trend.
Which of the following ‘analytical’ approaches would you suggest?

 
Diagnostic analytics

 
Predictive analytics

 
None of the above

 
Descriptive analytics


Question 15 0.25
/ 0.25 pts

You visit a doctor for a fever, and the doctor asks questions such as: Were you exposed to rain or a cold climate? Did you have contact
with a sick person? Did you eat food from outside? The doctor then reaches a conclusion based on your responses.
This scenario is analogous to which stage of data analytics?

 
Predictive

 
Prescriptive

 
Diagnostic

 
Descriptive

Question 16 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Availability

 
Data Security

 
All of these

 
Data quantity


Question 17 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as

 
Cognitive Analytics

 
Statistical Analysis

 
Pattern Recognition

 
Root Cause Analysis

Question 18 0.25
/ 0.25 pts

The type of problem in statistics will be

 
None of these

 
Well Structured

 
Semi-Structured

 
Unstructured

Question 19 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Velocity

 
Vision

 
Volume

 
Variety

Question 20 0.25
/ 0.25 pts

Big data is defined as ______________?

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process and analyze the data using traditional
databases and data processing tools

 
The process of extracting and creating information from raw data by using different techniques

 
The process of making machines capable of mimicking human behavior

 
An engineering discipline that is concerned with all aspects of software production.

Quiz Score:
4.25 out of 5

Question 1
0.17 / 0.17 pts
Data preprocessing improves the data quality and makes data mining algorithms efficient
and effective. What are the data preprocessing tasks along with Data cleaning?

Data Integration

Data Reduction

Data Transformation

All of the above.

Question 2
0.17 / 0.17 pts
Which of the following artifacts is not considered in Descriptive analytics?

Standard report

Alerts

Adhoc reports

Predictive model

Question 3
0.17 / 0.17 pts
If you were to arrange the following data analytics techniques in the increasing order of
complexity, which of the following is considered the correct order?
Predictive, Diagnostic, Prescriptive, Descriptive

Descriptive, Diagnostic, Predictive, Prescriptive

Diagnostic, Predictive, Prescriptive, Descriptive

Prescriptive, Descriptive, Diagnostic, Predictive

Question 4
0.17 / 0.17 pts
Data scientist is not responsible for

Data mining

Data manipulation

Building continuous data stream

Data analytics

Question 5
0.17 / 0.17 pts
Which of these is not an example of the application of data science?

Fraud detection and prevention system in a bank

Creation of new fiscal policy


Targeted advertising as per customer's need

Product recommender systems

Question 6
0.17 / 0.17 pts
If a person comes from a software development background, which of the following
Data science project roles will best suit them?

Big Data Engineer

Machine Learning Engineer

Data Analyst

Storyteller

Question 7
0.17 / 0.17 pts

A person goes to a supermarket to purchase ingredients for making a meal. Which
phase of the data science process does the analogy fit correctly? (Choose the best
answer)

Data Understanding

Data Collection

Data Cleaning
Data Preparation

Question 8
0.17 / 0.17 pts
Which of the following is a correct component of data science?

Data Engineering

Advanced Computing

Domain expertise

All of the options

Incorrect Question 9
0 / 0.17 pts
Which of the following is not an issue with CRISP-DM model?

The end-users of the analytical model are required to post-rationalize the model, which
leads to a lot of dissatisfaction

It very much underestimates the amount of real experimentation that is needed to get at
viable results

Various modeling techniques are selected and applied, and their parameters are
calibrated to optimal values

Thorough evaluation indeed is needed, yet the CRISP-DM methodology does not
prescribe how to do this.

Question 10
0.17 / 0.17 pts
Which of the following curve analyses is conducted on each predictor for classification?

NOC

ROC

COC

All of the mentioned

Question 11
0.17 / 0.17 pts
Which of the following best describes the difference between the data analyst and data
scientist?

Data analysts just deal with numbers, whereas data scientists deal with algorithms

Data analysts do estimation, whereas data scientists predict and explain it as well

Data analysts and data scientists play the same role in the project

Data analysts are more proficient in R, whereas data scientists are more proficient in
Python

Incorrect Question 12
0 / 0.17 pts
Which of the following is not an application of data science?

Recommendation Systems

Image & Speech Recognition

Online Price Comparison

Privacy Checker

Question 13
0.17 / 0.17 pts
Which one refers to the labelling of the data?

Data Writing

Data Annotation

Data Label

Data Wrangling

Question 14
0.17 / 0.17 pts
Data Science project steps are highly linear.

True
False

Question 15
0.17 / 0.17 pts
Which of the following can be a significant challenge in data science?

Irregular communication with stakeholders

Lack of project funding in data science projects

Data Quality

Inadequately defined organizational structure

Question 16
0.17 / 0.17 pts
CRISP-DM methodology is specifically built for IT(Information Technology) projects.

True

False

Question 17
0.17 / 0.17 pts
In "Business Understanding" phase of the data science process, the goals are identified
and the objectives defined.

True
False

Question 18
0.17 / 0.17 pts
Which of the following steps is performed by a data scientist after acquiring the data?

Data Cleaning

Data Integration

Data Replication

All of the options

Question 19
0.17 / 0.17 pts
Which one of the following is not a necessary characteristic of a Data Scientist?

Creative

Technical

Punctual

Communicative

Question 20
0.17 / 0.17 pts
Which of the following is not a characteristic of "Localized Analytics"?

Usable in functional silos

Only at functional or process level

Occurs in disconnected manner

Key data, technology and analysts are centralized

Question 21
0.17 / 0.17 pts
Raw data should be processed only one time.

True

False

Incorrect Question 22
0 / 0.17 pts
Which of the following data science project steps is the most critical for the success
of the project?

Model Evaluation

Model Selection
Model Building

Data preprocessing

Question 23
0.17 / 0.17 pts
Data science is the process of diverse set of data through ____________

organizing data

processing data

analysing data

All of the options

Question 24
0.17 / 0.17 pts
Correlation always implies causation.

True

False

Question 25
0.17 / 0.17 pts
Which one of the following statement(s) is correct (Choose the most appropriate
answer)?
Analytics is a process in which a computer examines information using mathematical
methods to find useful patterns.

Data Analytics refers to the techniques used to analyze data to enhance productivity
and business gain.

Data analytics is the pursuit of extracting meaning from raw data using specialized
computer systems.

All the statements

None of the statements

Incorrect Question 26
0 / 0.17 pts
The answer to the following question can be obtained by which type of analytics?
"What's the best that can happen?"

Diagnostic Analytics

Descriptive Analytics

Prescriptive Analytics

Predictive Analytics

Question 27
0.17 / 0.17 pts
Pattern Recognition is a sub-field of Data Science.
True

False

Partial Question 28
0.11 / 0.17 pts
Which of the following are the reasons for the sudden growth of analytics?

Data is growing at 40% compound annual rate

Large number of analysts available in the market

Large number of user friendly analytics tools available for data processing

Cost of storage has hugely dropped

Question 29
0.17 / 0.17 pts
Training and testing datasets are developed during _________ phase.

Model planning

Model building

Operationalize

None
Question 30
0.17 / 0.17 pts
In the _________ phase, the final report/technical document of the process is prepared.

Model planning

Model building

Operationalize

None
12/4/22, 9:05 AM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
13 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 4 at 9:04
This attempt took 13 minutes.

Question 1 0.25
/ 0.25 pts

_____________ is the practice of designing and building systems for
collecting, storing, and analyzing data at scale.

 
Data Analysis

 
Data engineering

 
Data Mining

 
Machine Learning


Question 2 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 3 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
High level understanding of Technical know-how specific to the domain.

 
Client Requirement understanding

 
Understanding of domain concepts

 
All of the listed

Question 4 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Static & comparative

 
Standard & dynamic


 
Descriptive & Static

 
Predictive & Prescriptive

Question 5 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by
enterprises for data analysis and management.

 
None of the Above.

 
Business Intelligence

 
Artificial Intelligence

 
Machine Learning

Question 6 0.25
/ 0.25 pts

The Data Warehouse is a “Schema-on-Load” approach.

 
True

 
False

Question 7 0.25
/ 0.25 pts

Data scientist is not responsible for 


 
Data analytics

 
Data manipulation

 
Building continuous data stream

 
Data mining

Question 8 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the
best possible answer.

 
Data Preparation

 
Data Collection

 
Data Requirements

 
Data Understanding

Question 9 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Writing

 
Data Annotation

 
Data Label

 
Data Wrangling


Question 10 0.25
/ 0.25 pts

Exploratory data analysis does not help in

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

 
In univariate and bivariate variable analysis

 
Derivation of new attributes

Question 11 0.25
/ 0.25 pts

You visit a doctor for a fever, and the doctor asks questions such as:
Were you exposed to rain or a cold climate? Did you have contact with
a sick person? Did you eat food from outside? The doctor then reaches
a conclusion based on your responses. This scenario is analogous to
which stage of data analytics?

 
Diagnostic

 
Descriptive

 
Prescriptive

 
Predictive

Question 12 0.25
/ 0.25 pts


Can extracting frequencies from audio signals to determine whether the
voice is male or female be considered a Data Science activity?

 
True

 
False

Question 13 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this movie trend.
Which of the following ‘analytical’ approaches would you suggest?

 
Predictive analytics

 
None of the above

 
Descriptive analytics

 
Diagnostic analytics

Question 14 0.25
/ 0.25 pts

The scatterplot implies that


 
The features are positively correlated

 
The features are independent

 
The features are negatively correlated

Question 15 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is

 
Machine Learning Activity

 
None of the above

 
Data Science Activity

 
Probability Calculation activity

Question 16 0.25
/ 0.25 pts

The type of problem in statistics will be


 
Unstructured

 
None of these

 
Semi-Structured

 
Well Structured

Question 17 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Security

 
Data quantity

 
Data Availability

 
All of these

Question 18 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as

 
Root Cause Analysis

 
Cognitive Analytics

 
Statistical Analysis

 
Pattern Recognition


Question 19 0.25
/ 0.25 pts

Big data is defined as ______________?

 
An engineering discipline that is concerned with all aspects of software
production.

 
The process of making machines capable of mimicking human behavior

 
Collections of datasets whose volume, velocity or variety is so large that
it is difficult to store, manage, process and analyze the data using
traditional databases and data processing tools

 
The process of extracting and creating information from raw data by
using different techniques

Question 20 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Predictive Analytics

 
Descriptive Analytics

 
Diagnostic Analytics

 
Prescriptive Analytics


Quiz Score:
5 out of 5

04/12/2022, 08:44 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
30 minutes 4.75 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


4.75 out of 5
Submitted Dec 4 at 8:41
This attempt took 30 minutes.

Question 1 0.25
/ 0.25 pts

Which one of these belongs to a Data Science career?

 
Data Administrator

 
Data Creator

 
None

 
Mathematical Analyst


Question 2 0.25
/ 0.25 pts

___________ is considered to be a sub-field of or one of the tools of AI,
which provides machines with the capability of learning from
experience.

 
Data analysis

 
Data Science

 
Machine Learning

 
Artificial Intelligence

Question 3 0.25
/ 0.25 pts

What is the process of training machine learning models, evaluating
their performance, and using them to make predictions?

 
Predictive Modelling

 
Visualization

 
Feature Engineering

 
Data Exploration

Question 4 0.25
/ 0.25 pts

Data Science domain knowledge needs


 
High level understanding of Technical know-how specific to the domain.

 
Understanding of domain concepts

 
All of the listed

 
Client Requirement understanding

Incorrect Question 5 0
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
True

 
False

Question 6 0.25
/ 0.25 pts

How is data science understood in terms of interpretability and
Accuracy?

 
High Accuracy, Low Interpretability

 
low Accuracy, high Interpretability

Question 7 0.25
/ 0.25 pts

Exploratory data analysis does not help in



 
Derivation of new attributes

 
In univariate and bivariate variable analysis

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

Question 8 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Wrangling

 
Data Annotation

 
Data Label

 
Data Writing

Question 9 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Collection

 
Data Preparation

 
Data Cleaning


 
Data Understanding

Question 10 0.25
/ 0.25 pts

Data preprocessing improves the data quality and makes data mining
algorithms efficient and effective. What are the data preprocessing
tasks along with Data cleaning?

 
Data Integration

 
Data Reduction

 
Data Transformation

 
All of the above

Question 11 0.25
/ 0.25 pts

You visit a doctor for a fever, and the doctor asks questions such as:
Were you exposed to rain or a cold climate? Did you have contact with
a sick person? Did you eat food from outside? The doctor then reaches
a conclusion based on your responses. This scenario is analogous to
which stage of data analytics?

 
Predictive

 
Descriptive

 
Diagnostic

 
Prescriptive


Question 12 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are positively correlated

 
The features are independent

 
The features are negatively correlated

Question 13 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is

 
Machine Learning Activity

 
Data Science Activity

 
None of the above

 
Probability Calculation activity


Question 14 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this movie trend.
Which of the following ‘analytical’ approaches would you suggest?

 
Diagnostic analytics

 
Predictive analytics

 
Descriptive analytics

 
None of the above

Question 15 0.25
/ 0.25 pts

Can extracting frequencies from audio signals to determine whether the
voice is male or female be considered a Data Science activity?

 
True

 
False

Question 16 0.25
/ 0.25 pts

The type of problem in statistics will be

 
Well Structured


 
Unstructured

 
Semi-Structured

 
None of these

Question 17 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Availability

 
Data Security

 
All of these

 
Data quantity

Question 18 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Image & Speech Recognition

 
Recommendation Systems

 
Online Price Comparison

 
Privacy Checker


Question 19 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as

 
Statistical Analysis

 
Pattern Recognition

 
Root Cause Analysis

 
Cognitive Analytics

Question 20 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Predictive Analytics

 
Descriptive Analytics

 
Prescriptive Analytics

 
Diagnostic Analytics

Quiz Score:
4.75 out of 5

04/12/2022, 08:52 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
14 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 4 at 8:52
This attempt took 14 minutes.

Question 1 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
True

 
False

Question 2 0.25
/ 0.25 pts


Data science is a ___________ Analysis.

 
Standard & dynamic

 
Descriptive & Static

 
Static & Comparative.

 
Predictive & Prescriptive

Question 3 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Data related

 
Technology related

 
All three

 
Organization related

Question 4 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by
enterprises for data analysis and management.

 
Business Intelligence

 
Artificial Intelligence

 
None of the Above.


 
Machine Learning

Question 5 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Predictive & Prescriptive

 
Descriptive & Static

 
Static & comparative

 
Standard & dynamic

Question 6 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 7 0.25
/ 0.25 pts

Exploratory data analysis does not help in

 
In univariate and bivariate variable analysis


 
Derivation of new attributes

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

Question 8 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Targeted advertising as per customer's need

 
Creation of new fiscal policy

 
Product recommender systems

 
Fraud detection and prevention system in a bank

Question 9 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the
best possible answer.

 
Data Preparation

 
Data Understanding

 
Data Requirements

 
Data Collection


Question 10 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Label

 
Data Wrangling

 
Data Annotation

 
Data Writing

Question 11 0.25
/ 0.25 pts

Can extracting frequencies from audio signals to determine whether the
voice is male or female be considered a Data Science activity?

 
True

 
False

Question 12 0.25
/ 0.25 pts

Among the following, which user role is more concerned with
non-functional requirements such as security, fault tolerance and
efficiency in data processing applications?

 
Data Scientist

 
Data Engineer


 
Business Intelligence Analyst

 
Data Analyst

Question 13 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is

 
None of the above

 
Probability Calculation activity

 
Machine Learning Activity

 
Data Science Activity

Question 14 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic
process?

 
Business Intelligence

 
Data Mining

 
Data Warehouse

 
Data Science

Question 15 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this movie trend.
Which of the following ‘analytical’ approaches would you suggest?

 
None of the above

 
Diagnostic analytics

 
Descriptive analytics

 
Predictive analytics

Question 16 0.25
/ 0.25 pts

The type of problem in statistics will be

 
None of these

 
Well Structured

 
Unstructured

 
Semi-Structured

Question 17 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?


 
What should I do?

 
What happened?

 
What will happen?

 
Why did it happen?

Question 18 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data quantity

 
Data Security

 
All of these

 
Data Availability

Question 19 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Variety

 
Volume


Question 20 0.25
/ 0.25 pts

Big data is defined as ______________?

 
The process of making machines capable of mimicking human behavior

 
An engineering discipline that is concerned with all aspects of software
production.

 
The process of extracting and creating information from raw data by
using different techniques

 
Collections of datasets whose volume, velocity or variety is so large that
it is difficult to store, manage, process and analyze the data using
traditional databases and data processing tools

Quiz Score:
5 out of 5

12/4/22, 8:48 AM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due Dec 4 at 19:00 Points 5 Questions 20 Available Dec 3 at 19:00 - Dec 4 at 19:00 24 hours
Time Limit 30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score

LATEST Attempt 1 15 minutes 5 out of 5

 Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5


Submitted Dec 4 at 8:47
This attempt took 15 minutes.

Question 1 0.25 / 0.25 pts


Data science is a ___________ Analysis.

  Static & Comparative.

  Descriptive & Static

  Standard & dynamic

  Predictive & Prescriptive

Question 2 0.25 / 0.25 pts

Data science is an Interdisciplinary field.

  True

  False

Question 3 0.25 / 0.25 pts


Data Engineers are also known as Data Miners.

  True

  False

Question 4 0.25 / 0.25 pts

_________________ is the science of collecting, analyzing, presenting, and interpreting data. Objective to
draw conclusions on the population.

  Statistics

  Business Intelligence

  Data Analytics

  Computational Intelligence

Question 5 0.25 / 0.25 pts


___________ comprises the strategies and technologies used by enterprises for data analysis and
management.

  None of the Above.

  Artificial Intelligence

  Business Intelligence

  Machine Learning

Question 6 0.25 / 0.25 pts

Which one of these belongs to the Data Science career?

  None

  Mathematical Analyst

  Data Administrator

  Data Creator


Question 7 0.25 / 0.25 pts

Which one refers to the labelling of the data?   

  Data Writing

  Data Annotation

  Data Label

  Data Wrangling

Question 8 0.25 / 0.25 pts

Data preprocessing improves the data quality and makes data mining algorithms efficient and effective. What
are the data preprocessing tasks along with Data cleaning?

  Data Integration

  All of the above


  Data Reduction

  Data Transformation

Question 9 0.25 / 0.25 pts

Data scientist is not responsible for 

  Data analytics

  Data manipulation

  Data mining

  Building continuous data stream

Question 10 0.25 / 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the best possible answer.


  Data Understanding

  Data Preparation

  Data Requirements

  Data Collection

Question 11 0.25 / 0.25 pts

While playing cards, predicting the next card to be a joker is

  Probability Calculation activity

  Machine Learning Activity

  Data Science Activity

  None of the above

Question 12 0.25 / 0.25 pts


A scenario where you visited a doctor for your fever and the doctor asked you questions like: Were you
exposed to rain or cold climate; or did you have contact with a sick person; or did you have food from outside
etc., and then doctor comes to a conclusion based on your response to the queries can be considered
analogous to which stage of data analytics.

  Diagnostic

  Prescriptive

  Predictive

  Descriptive

Question 13 0.25 / 0.25 pts

Can extracting frequencies from audio signals to determine whether the voice is male or female be considered a Data
Science Activity?

  True

  False


Question 14 0.25 / 0.25 pts

Among the following, which user role is more concerned with non-functional requirements such as security,
fault tolerance and efficiency in data processing applications?

  Data Analyst

  Business Intelligence Analyst

  Data Scientist

  Data Engineer

Question 15 0.25 / 0.25 pts

The scatterplot implies that


  The features are independent

  The features are positively correlated

  The features are negatively correlated

Question 16 0.25 / 0.25 pts

Type of problem in statistics will be

  Unstructured

  Semi-Structured


  None of these

  Well Structured

Question 17 0.25 / 0.25 pts

Which of the following is not a characteristic of Big Data?

  Velocity

  Variety

  Volume

  Vision

Question 18 0.25 / 0.25 pts

Which of the following questions is answered by Predictive Analytics?


  What happened?

  Why did it happen?

  What will happen?

  What should I do?

Question 19 0.25 / 0.25 pts

Big data is defined as ______________?

  The process of making machines capable of mimicking human behavior

  An engineering discipline that is concerned with all aspects of software production.

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process
and analyze the data using traditional databases and data processing tools

  The process of extracting and creating information from raw data by using different techniques


Question 20 0.25 / 0.25 pts

Diagnostic Analysis is often referred to as:

  Root Cause Analysis

  Statistical Analysis

  Cognitive Analytics

  Pattern Recognition

Quiz Score: 5 out of 5

12/3/22, 11:27 PM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due Dec 4 at 19:00 Points 5 Questions 20
Available Dec 3 at 19:00 - Dec 4 at 19:00 24 hours Time Limit 30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score

LATEST Attempt 1 18 minutes 5 out of 5

 Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5


Submitted Dec 3 at 23:26
This attempt took 18 minutes.

Question 1 0.25 / 0.25 pts

The Data Warehouse is a “Schema-on-Load” Approach.

  False

  True

Question 2 0.25 / 0.25 pts


Which one of these belongs to the Data Science career?

  Data Administrator

  None

  Mathematical Analyst

  Data Creator

Question 3 0.25 / 0.25 pts

How is data science understood in terms of interpretability and
Accuracy?

  low Accuracy, high Interpretability

  High Accuracy, Low Interpretability

Question 4 0.25 / 0.25 pts

___________ comprises the strategies and technologies used by
enterprises for data analysis and management.

  Machine Learning

  Artificial Intelligence

  None of the Above.

  Business Intelligence


Question 5 0.25 / 0.25 pts

_____________ is the practice of designing and building systems for
collecting, storing, and analyzing data at scale.

  Data Analysis

  Machine Learning

  Data engineering

  Data Mining

Question 6 0.25 / 0.25 pts

_________________ is the science of collecting, analyzing, presenting,
and interpreting data. Objective to draw conclusions on the population.

  Statistics

  Computational Intelligence

  Data Analytics

  Business Intelligence

Question 7 0.25 / 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

  Data Cleaning

  Data Collection

  Data Understanding

  Data Preparation

Question 8 0.25 / 0.25 pts

Exploratory data analysis does not help in

  Derivation of new attributes

  In univariate and bivariate variable analysis

  Finding out the data type of a variable

  Finding statistical estimates of a variable

Question 9 0.25 / 0.25 pts

Data preprocessing improves the data quality and makes data mining
algorithms efficient and effective. What are the data preprocessing
tasks along with Data cleaning?

  Data Reduction

  Data Transformation

  All of the above


  Data Integration

Question 10 0.25 / 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the
best possible answer.

  Data Collection

  Data Understanding

  Data Requirements

  Data Preparation

Question 11 0.25 / 0.25 pts

Among the following, which user role is more concerned with non-
functional requirements such as security, fault tolerance and efficiency
in data processing applications?

  Data Engineer

  Data Scientist

  Data Analyst

  Business Intelligence Analyst

Question 12 0.25 / 0.25 pts


While playing cards, predicting the next card to be a joker is

  Machine Learning Activity

  Data Science Activity

  Probability Calculation activity

  None of the above

Question 13 0.25 / 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that
the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this trend.
Which of the following ‘analytical’ approaches would you suggest?

  None of the above

  Descriptive analytics

  Diagnostic analytics

  Predictive analytics

Question 14 0.25 / 0.25 pts

A scenario where you visited a doctor for your fever and the doctor
asked you questions like: Were you exposed to rain or cold climate; or
did you have contact with a sick person; or did you have food from
outside etc., and then doctor comes to a conclusion based on your
response to the queries can be considered analogous to which stage of
data analytics.

  Descriptive

  Prescriptive

  Predictive

  Diagnostic

Question 15 0.25 / 0.25 pts

Which among the following advocates an exploratory and dynamic
process?

  Data Science

  Business Intelligence

  Data Mining

  Data Warehouse

Question 16 0.25 / 0.25 pts

Which of the following are the main challenges in Data Science?

  Data Security

  Data quantity

  Data Availability


  All of these

Question 17 0.25 / 0.25 pts

Which of the following is not an application of Data Science?

  Image & Speech Recognition

  Privacy Checker

  Online Price Comparison

  Recommendation Systems

Question 18 0.25 / 0.25 pts

Diagnostic Analysis is often referred to as:

  Root Cause Analysis

  Cognitive Analytics

  Pattern Recognition

  Statistical Analysis

Question 19 0.25 / 0.25 pts

Big data is defined as ______________?


 
The process of making machines capable of mimicking human behavior

 
Collections of datasets whose volume, velocity or variety is so large that
it is difficult to store, manage, process and analyze the data using
traditional databases and data processing tools

 
The process of extracting and creating information from raw data by
using different techniques

 
An engineering discipline that is concerned with all aspects of software
production.

Question 20 0.25 / 0.25 pts

Which of the following is not a characteristic of Big Data?

  Vision

  Variety

  Velocity

  Volume

Quiz Score: 5 out of 5

INTRODUCTION TO DATA SCIENCE
MODULE #1: INTRODUCTION
Dr. Shreyas Rao
BITS Pilani
Profile of Instructor
Dr. Shreyas Rao
• 18+ Years of Experience in IT, Teaching and Research

• Working as Associate Professor (Off Campus), Dept. of CSIS, BITS-Pilani, WILP

• B.E from VTU, M.S in Software Systems from BITS (WILP) and PhD from MAHE

• Worked as Business Analyst and Team Lead at SLK Software Services for 7 years

• Previously worked in Presidency University and Sahyadri College, Mangaluru as R&D Head, CSE

• COE member in AI&ML and COE member in Data Science (Govt. Sponsored for 1.2 Cr)

Consultant:
• ISRO-SAC (Ahmedabad) funded research project titled “Ontology Enabled Disaster
Management Web Service using Data Integration” as Technical Consultant. Deployed in
ISRO.
• Designed and developed ‘Dhriti’, a mental health resource chatbot that caters to the mental
health needs of people during Covid, from the COE in AI&ML, SCEM. The bot is released in the
Dakshina Kannada region of Karnataka and answers user queries in English, Kannada and
Hindi. Deployed on the Web and Facebook Messenger channels.

Collaboration with Dept. of Health Innovation, Kasturba Hospital, MAHE
• Telemedicine effectiveness during Covid Wave-I at Kasturba Hospital, Manipal (Statistical
Analysis)
• Study on psychological implications of COVID-19 on Nursing professionals (Statistical
Analysis)
• Covid prediction using Patient Discharge Data (Deep Learning)
Dept. of Psychology, Montfort College:
• AI enabled tool for juvenile self-transformation (Mental Health domain, Deep Learning &
NLP)

*Published papers can be viewed at https://scholar.google.com.tw/citations?user=MFNrrlcAAAAJ&hl=en&oi=ao


TABLE OF CONTENTS

1 COURSE LOGISTICS
2 FUNDAMENTALS OF DATA SCIENCE
3 DATA SCIENCE REAL WORLD APPLICATIONS
4 DATA SCIENCE VS. BUSINESS INTELLIGENCE
5 DATA SCIENTIST
6 SOFTWARE ENGINEERING FOR DATA SCIENCE
7 DATA SCIENCE CHALLENGES

COURSE STRUCTURE

M1 Introduction to Data Science
M2 Data Analytics
M3 Data Science Process
M4 Data Science Teams
M5 Data and Data Models
M6 Data Wrangling and Feature Engineering
M7 Data Visualization
M8 Storytelling with Data
M9 Ethics for Data Science

TEXT AND REFERENCE BOOKS
TEXT BOOKS
T1 Introducing Data Science by Cielen, Meysman and Ali
T2 Storytelling with Data: A Data Visualization Guide for Business Professionals,
by Cole Nussbaumer Knaflic; Wiley
T3 Introduction to Data Mining, by Tan, Steinbach and Vipin Kumar

REFERENCE BOOKS
R1 The Art of Data Science by Roger D Peng and Elizabeth Matsui
R2 Ethics and Data Science by DJ Patil, Hilary Mason, Mike Loukides
R3 Python Data Science Handbook: Essential tools for working with data by Jake
VanderPlas
R4 KDD, SEMMA and CRISP-DM: A Parallel Overview, Ana Azevedo and M.F.
Santos, IADS-DM, 2008

CANVAS
Most relevant and up to date info on
• Course Handout
• Schedule for Webinar, Quiz, and Assignments [By 19-Nov-22]
• Lecture Slides
• Quiz
• Assignment

The video recording will be available in Microsoft Teams.

Evaluation

1. EC1- 30 marks
• Three quizzes (5 marks each) -10 marks (best 2 will be considered)
• One assignment - 20 marks
2. EC2 [Mid Term Exam] – 30 marks
3. EC3 [Comprehensive Exam] – 40 marks

WHAT IS SCIENCE?

Science is the systematic study of the structure and behavior of the world (phenomena)
through observation, experimentation and measurement.

Observe Experiment Measure

Prefixes to ‘Science’

Science: Observe, Experiment, Measure
Computer Science: Compute, Store, Visualize, Analyze, Automate
Biological Science: Research, Solve, Develop, Synthesize
Data Science: Capture, Prepare, Process, Analyze, Visualize, Predict, Uncover Insights, Enable Decision Making

DATA SCIENCE

Data Science is the "study of data".
Data Science is the art of uncovering the insights and trends hidden in
data.
Data Science helps to translate data into a story. Storytelling helps in uncovering
insights, and the insights help in making decisions or strategic choices.
Data Science is the process of using data to understand different things.
• Requires a major effort of preparing, cleaning, scrubbing, or standardizing the data.
• Algorithms are then applied to crunch pre-processed data.
• This process is iterative and requires analysts’ awareness of the best practices.
• The most important aspect of data science is interpreting the results of the analysis in
order to make decisions.
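
The preparation steps listed above can be sketched in a few lines. This is only an illustrative example, not part of the course material: the records, the `clean` and `standardize` helpers, and the choice of z-score standardization are all assumptions made for the sketch.

```python
from statistics import mean, pstdev

# Hypothetical raw records: one exact duplicate and one missing value.
raw_rows = [
    {"id": 1, "age": 25},
    {"id": 2, "age": None},
    {"id": 1, "age": 25},   # exact duplicate of the first row
    {"id": 3, "age": 35},
    {"id": 4, "age": 45},
]


def clean(rows):
    """Drop rows with missing values and exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen or any(value is None for value in row.values()):
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


rows = clean(raw_rows)                 # three rows survive cleaning
ages = [row["age"] for row in rows]    # [25, 35, 45]
z_scores = standardize(ages)           # pre-processed input for an algorithm
```

In practice this step is usually done with a library such as pandas; the plain-Python version is used here only to keep the sketch self-contained.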

DATA SCIENCE – Interdisciplinary Field

DATA SCIENCE – MULTIPLE DISCIPLINES

WHY DATA SCIENCE?

• “Data Scientist: The Sexiest Job of the 21st Century” – Harvard Business Review (2012).
• “Data is the New Oil” - Tesco marketing mastermind Clive Humby (2006)
• Data Science is one of the fastest growing fields in the world.
• According to the U.S. Bureau of Labor Statistics, 11.5 million new jobs will be created by
the year 2026.
• Even with the COVID-19 situation and the shortage of talent, data science is unlikely to see a
dip as a career option.

• In India, the average salary of a data scientist as of January 2022 is Rs. 10L/yr
[Glassdoor, 2022].
• The growth of data science as a career choice in 2022 will also see a rise in its
various job roles:
• Data Engineer
• Data Administrator
• Machine Learning Engineer
• Statistician
• Data and Analytics Manager

NEED OF DATA SCIENCE - DIGITAL DATA DELUGE

https://www.retailtouchpoints.com/resources/digital-data-deluge-becomes-a-tsunami-due-to-covid-19

NEED OF DATA SCIENCE

Data deluge – resulting in tons of data.

Supportive technologies:
• Powerful algorithms to support computation
[Ex: Transformer models like BERT, GPT-3]
• Open source software and tools [Python]
• Computational speed, accuracy and cost [Cloud Computing – Azure, AWS]
• Data storage in terms of capacity and cost.

DATA SCIENCE, AI AND ML Convergence
Artificial Intelligence
• AI involves making machines capable of mimicking human behavior, particularly
cognitive functions like facial recognition, automated driving, sorting mail based on
postal code.
Machine Learning
• Considered a sub-field of or one of the tools of AI.
• Involves providing machines with the capability of learning from experience.
• Experience for machines comes in the form of data.
Data Science
• Data science is the application of machine learning, artificial intelligence, and other
quantitative fields like statistics, visualization, and mathematics to uncover insights from
data to enable better decision making.

DATA SCIENCE, AI AND ML

https://www.sciencedirect.com/topics/physics-and-astronomy/artificial-intelligence
USE CASES OF DATA SCIENCE

Source: DataFlair

DATA SCIENCE IN FACEBOOK

Social Analytics
Utilizes quantitative research to gain insights about the social interactions among
people.
Makes use of deep learning, facial recognition, and text analysis.
In facial recognition, it uses powerful neural networks to classify faces in the
photographs.
In text analysis, it uses “DeepText” to understand people’s interest and aligns
photographs with texts.
It uses deep learning for targeted advertising.
Using the insights gained from data, it clusters users based on their preferences and
provides them with the advertisements that appeal to them.
DATA SCIENCE IN AMAZON

Improving E-Commerce Experience

Personalized recommendation
• Predictive analytics (a personalized recommender system) to increase customer
satisfaction.
• Purchase history of customers, other customer suggestions, and user ratings are
analyzed to recommend products. [Product recommendation]
Anticipatory shipping model [Inventory Update & Management]
• Predict the products that are most likely to be purchased by its users.
• Analyzes pattern of customer purchases and keeps products in the nearest warehouse
which the customers may utilize in the future. [Market Basket Analysis – Data Mining]
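
As a rough illustration of the market basket analysis mentioned above, the sketch below computes the support of item pairs and the confidence of a rule such as "bread → milk". The baskets and helper names are hypothetical, and this is not a description of Amazon's actual system.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, one set of items per order.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def pair_support(baskets):
    """Fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: count / total for pair, count in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


support = pair_support(transactions)
bread_to_milk = confidence(transactions, "bread", "milk")
```

Pairs with high support and confidence are candidates for being stocked together in the same warehouse; algorithms such as Apriori extend the same idea to larger item sets.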

DATA SCIENCE IN AMAZON – CONTD...

Improving E-Commerce Experience

Price discounts
• Using parameters such as the user activity, order history, prices offered by the
competitors, product availability, etc., Amazon provides discounts on popular items and
earns profits on less popular items.
Fraud Detection
• Detects fraudulent sellers and fraudulent purchases.

DATA SCIENCE IN UBER
Improving Rider Experience
Uber maintains a large database of drivers, customers, and several other records.
It makes extensive use of Big Data and crowdsourcing to derive insights and provide
the best services to its customers.
Dynamic pricing
• Uses Big Data and data science to calculate fares based on specific parameters.
• Uber matches the customer profile with the most suitable driver and charges the customer based on
the time it takes to cover the distance rather than the distance itself.
• The time of travel is calculated using algorithms that make use of data related to traffic
density and weather conditions.
• When demand (more riders) exceeds supply (fewer drivers), the price of the ride
goes up. [Rainy Season]
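
The demand and supply idea can be illustrated with a toy calculation. The formula, the cap of 3.0, and the function names below are assumptions invented for this sketch; they are not Uber's actual pricing algorithm.

```python
def surge_multiplier(riders, drivers, cap=3.0):
    """Hypothetical surge factor: the rider-to-driver ratio, clamped to [1.0, cap]."""
    if drivers <= 0:
        return cap
    return min(cap, max(1.0, riders / drivers))


def fare(minutes, rate_per_minute, riders, drivers, base_fare=0.0):
    """Time-based fare (not distance-based), scaled by the surge factor."""
    return (base_fare + minutes * rate_per_minute) * surge_multiplier(riders, drivers)


# A 20-minute ride with balanced supply, then the same ride when demand is 1.5x supply.
normal_fare = fare(20, 2.0, riders=100, drivers=100, base_fare=10.0)
rainy_fare = fare(20, 2.0, riders=150, drivers=100, base_fare=10.0)
```

In a real system the estimated `minutes` would itself come from a model that uses traffic density and weather data, as described in the bullets above.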

DATA SCIENCE IN BANK OF AMERICA
Improving Customer Experience
Erica – a virtual financial assistant (BoA)
• Erica serves as a customer advisor to over 45 million users around the world.
• Erica makes use of Speech Recognition to take customer inputs.
Fraud detection
• Uses data science and predictive analytics to detect frauds in payments,
insurance, credit cards, and customer information.
Customer segmentation
• Segments customers into high-value and low-value segments.
• Data scientists make use of clustering, logistic regression, and decision trees to
help the bank understand the Customer Lifetime Value (CLV) and group
customers into the appropriate segments.
• Customer segmentation helps in up-selling and cross-selling of products.
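
A minimal sketch of clustering-based segmentation is shown below, using a one-dimensional k-means on hypothetical yearly spend figures. The data, the starting centroids, and the `kmeans_1d` helper are assumptions for illustration; a real CLV model would use many more features and a library implementation.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on one feature: assign each value to its nearest centroid,
    then move each centroid to the mean of its cluster."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical yearly spend per customer.
spend = [120, 150, 90, 2000, 2300, 1800]

# Start from the smallest and largest values as initial centroids.
centroids, (low_value, high_value) = kmeans_1d(spend, [min(spend), max(spend)])
```

The two resulting groups correspond to the low-value and high-value segments, which can then be targeted differently for up-selling and cross-selling.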

DATA SCIENCE IN AIRBNB

Improving Customer Experience

Providing better search results
• Uses big data of customer and host information, homestays and lodge records, and
website traffic.
• Uses data science to provide better search results to its customers and find compatible
hosts.
Detecting bounce rates
• Use of demographic analytics to analyze bounce rates from their websites.
Providing ideal lodgings and localities
• Uses knowledge graphs where the user’s preferences are matched with the various
parameters to provide ideal lodgings and localities.

DATA SCIENCE IN SPOTIFY

Improving Customer Experience and Recommendation

Providing better music streaming experience
• Provide personalized music recommendations.
• Uses over 600 GBs of daily data generated by the users to build its algorithms to boost
user experience.
Improving experience for artists and managers
• Spotify for Artists application allows the artists and managers to analyze their streams,
fan approval and the hits they are generating through Spotify’s playlists.

DATA SCIENCE IN SPOTIFY ... CONTD..

Spotify uses data science to gain insights about which universities had the highest
percentage of party playlists and which ones spent the most time on it.
”Spotify Insights” publishes information about the ongoing trends in the music.
Spotify’s Niland, an API based product, uses machine learning to provide better
searches and recommendations to its users.
Spotify analyzes listening habits of its users to predict the Grammy Award Winners.

DATA SCIENCE IN Healthcare
Covid Patient Discharge Prediction (Dataset: 2nd Wave April-2021 to June 2021)
Type of Project: Machine Learning
Dataset size: 1233 patients suffering from Covid
Variables:
X: Age, Gender, Co_morbid, Admit Date, Discharge date, days of stay,
covid_severity
Y: Discharge Type (Recovered, Expired)
Exploratory Data Analysis: Univariate, Bivariate, Multivariate
Models applied: Support Vector Machine, Naïve Bayes, Logistic Regression,
Decision Trees, KNN, ANN, Random Forest
Best Accuracy: Random Forest (92%)

APPLICATIONS OF DATA SCIENCE

Sources: DataFlair, edureka.co
DATA SCIENCE VS. BUSINESS INTELLIGENCE

Business intelligence comprises the strategies and technologies used by enterprises for the data analysis and
management of business information. One of the key BI components is Data Warehouse.

DATA SCIENTIST VS. BI ANALYST

DATA SCIENCE VS. STATISTICS
• Statistics is the science of collecting, analyzing, presenting, and interpreting data. Its objective
is to draw conclusions about the population.
• The science of statistics enables Data Science.
• Data Science expands the application of statistics towards solving Big Data challenges.
• Data Science comprises the 4 As (data architecture, data acquisition, data analysis and data
archiving). The two types of statistics, ‘descriptive’ and ‘inferential’, are applied during the
‘Data Analysis’ phase of data science.

*Source - H. Hassani et al., “The science of statistics versus data science: What is the future?”, Technological
Forecasting and Social Change (Elsevier), Volume 173, 2021
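
The difference between descriptive and inferential statistics can be shown on a small sample. The survey responses below are hypothetical, and the 95% interval uses a normal approximation as a simplifying assumption (for a sample this small, a t-distribution would be the more careful choice).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical Likert-scale responses (1 to 5) from ten survey participants.
sample = [4, 5, 3, 4, 5, 4, 3, 5, 4, 4]

# Descriptive statistics: summarize the sample itself.
sample_mean = mean(sample)
sample_sd = stdev(sample)

# Inferential statistics: estimate the population mean from the sample
# with an approximate 95% confidence interval.
z = NormalDist().inv_cdf(0.975)
margin = z * sample_sd / sqrt(len(sample))
confidence_interval = (sample_mean - margin, sample_mean + margin)
```

The mean and standard deviation describe only the ten respondents, while the interval is a statement about the wider population they were drawn from.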

DATA SCIENCE VS. STATISTICS

Aspect | Statistics | Data Science
Theoretical Origins | Mathematical biology and biometry | Statistics and Probability
Main Focus | Theoretical Sophistication | Practical Solutions to real problems
Main Approach | Methodology / Model development and confirmation | Application of machine learning and data mining models
Focus of Model Building | Examination of correlations, causality between the variables | Hyperparameter optimization and feature selection
Interpretability vs Accuracy | High Interpretability, Low Accuracy | High Accuracy, Low Interpretability (XAI or Explainable AI)

DATA SCIENCE VS. STATISTICS

Aspect | Statistics | Data Science
Type of problem | Well structured [Survey – Likert Scale data] | Semi structured or unstructured
Size of dataset | Small | Large
 | Homogeneous | Heterogeneous
End Goal | Data Analysis | Data Analysis and Prediction

Source - H. Hassani et al., “The science of statistics versus data science: What is the future?”, Technological
Forecasting and Social Change (Elsevier), Volume 173, 2021

Data Mining vs Data Science
• Data Mining field started in 1989 as “Algorithms for Pattern Recognition”, later remodeled as a “Step in the KDD process”
• Data Mining is Goal-oriented and Process driven in nature!
• Understand the business goals first, then apply the DM process to arrive at a result!
• Process takes center stage!
• More of ‘mining’ the data to find insights using algorithms!

• Data Science term first coined in 1962, but remodeled in 2007 as “Derive insights from big data for making smarter
decisions”
• Data Science is Data-oriented and Exploratory in nature!
• Data exploration may help define the business goals or insights and arrive at results!
• Data takes the center stage!
• More work in ‘exploring or searching’ data, than actual mining!

WHO IS A DATA SCIENTIST?

A data scientist is someone who extracts insights from messy data.
A data scientist is responsible for guiding a data science project from start to finish.
Success in a data science project comes not just from any one tool, but from having
quantifiable goals, good methodology, cross-discipline interactions, and a repeatable
workflow.

ROLE OF A DATA SCIENTIST

Reframe business challenges as analytics challenges.
This is the skill of diagnosing the problem, considering its core, and
determining which kinds of candidate analytical methods can be applied to solve it.
Design, implement and deploy statistical models and data mining techniques on
data. This activity is the main role of a data scientist: applying complex or advanced
analytical methods to a variety of business problems using data.
Develop insights that lead to actionable recommendations.
Learn how to draw insights out of data and communicate them effectively.

Data Science – Hierarchy of Needs

Differences between roles

Data engineering is the practice of designing and building systems for collecting, storing, and analyzing data at
scale. Data engineers lay the foundation for data analysis and are concerned with the security, reliability, fault
tolerance, scalability and efficiency of the data processing systems.

SKILLS REQUIRED FOR A DATA SCIENTIST

Data scientist skills: Communicative, Qualitative, Curious, Technical, Creative, Skeptical

TOOLS AVAILABLE TO A DATA SCIENTIST

Tools: R, SQL, Python, Scala, SAS, Hadoop, Julia, Tableau, Weka

ALGORITHMS FOR A DATA SCIENTIST

Algorithms: Logistic Regression, Linear Regression, K-means clustering, PCA, Apriori, Decision Tree, SVM, ANN

SOFTWARE ENGINEERING

In general, software engineering is an engineering discipline that is concerned with
all aspects of software production.
Software includes computer programs, all associated documentation, and
configuration data that are needed for the software to work correctly.
Common process models: Waterfall model, Iterative models, Agile models

I N T RO DUC TIO N TO D AT A S C I E N C E
D ATA S C I E N C E P R O C E S S

SOFTWARE ENGINEERING VS. DATA SCIENCE

Software Engineering
• Concerned with creating useful applications
• Software engineers use the SDLC process
• Uses frameworks like Waterfall, Agile (Scrum, XP)
• Software engineers use programming languages like C#, Java and web frameworks like Django, Flask
• Skills are focused on coding languages

Data Science
• Involves collecting, analyzing and visualizing data
• Data scientists utilize the ETL (Extract, Transform, Load) process
• Uses methodologies like CRISP-DM, SMAM, SEMMA, Big Data Lifecycle etc.
• Data scientists use tools like Amazon S3, MongoDB, Hadoop, and MySQL
• Skills include machine learning, statistics, and data visualization

I N T RO DUC TIO N TO D AT A S C I E N C E
D AT A S C I E N C E C H A L L E N G E S

Data science challenges can be categorized as:


Data related
Organization related
Technology related
People related
Skill related

I N T RO DUC TIO N TO D AT A S C I E N C E
D AT A S C I E N C E C H A L L E N G E S

Source – Business Broadway Survey 2018

I N T RO DUC TIO N TO D AT A S C I E N C E
C OGNITIVE B IAS

Cognitive Biases are the distortions of reality because of the lens through which we
view the world. [Subjective vs Objective view of reality]
Each of us sees things differently based on our preconceptions, past experiences,
cultural, environmental, and social factors. This doesn’t necessarily mean that the
way we think or feel about something is truly representative of reality.

I N T RO DUC TIO N TO D AT A S C I E N C E
References:

• Introducing Data Science by Cielen, Meysman and Ali


• The Art of Data Science by Roger D Peng and Elizabeth Matsui

• https://data-flair.training/blogs/data-science-use-cases/
• https://www.northeastern.edu/graduate/blog/what-does-a-data-scientist-do/
• https://www.visual-paradigm.com/guide/software-development-process/what-is-a-software-process-model/
• https://www.sciencedirect.com/science/article/abs/pii/S0040162521005448

THANK YOU

INTRODUCTION TO DATA SCIENCE
MODULE #2: DATA ANALYTICS
IDS Course Team
BITS Pilani
TABLE OF C ONTENTS

1 A N A LY T I C S

2 B I G D ATA

3 D ATA A N A LY T I C S

4 C A S E S T U D I E S O N D ATA A N A LY T I C S

DEFINITION OF ANALYTICS – DICTIONARY

Oxford: Analytics is the systematic computational analysis of data or statistics.
Cambridge: Analytics is a process in which a computer examines information using
mathematical methods in order to find useful patterns.
Dictionary.com: Analytics is the analysis of data, typically large sets of business data,
by the use of mathematics, statistics, and computer software.

Analytics is treated as both a noun and a verb.

Source: Big Data Analytics – A Hands-on Approach by Arshdeep Bahga & Vijay Madisetti

DEFINITION OF ANALYTICS – WEBSITES

Oracle: Analytics is the process of discovering, interpreting, and communicating
significant patterns in data and using tools to empower your entire
organization to ask any question of any data in any environment on any
device.
Edureka: Data Analytics refers to the techniques used to analyze data to enhance
productivity and business gain.
Informatica: Data analytics is the pursuit of extracting meaning from raw data using
specialized computer systems.

Source: Big Data Analytics – A Hands-on Approach by Arshdeep Bahga & Vijay Madisetti
I N T RO DUC TIO N TO D AT A S C I E N C E
D EFINING A NALY T I C S

Analytics is the process of extracting and creating information from raw data by using
techniques such as:
• filtering, processing, categorizing, condensing and contextualizing the data.
Analytics is a broad term that encompasses the processes, technologies, frameworks
and algorithms to extract meaningful insights from data.
The information thus obtained is then used to infer knowledge about the system,
its users, and its operations, in order to make the system smarter and more efficient.
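
The filtering, categorizing and condensing steps named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sensor readings, the device names and the 30 °C threshold are hypothetical and do not come from the source text.

```python
from collections import Counter

# Hypothetical raw readings: (device_id, temperature in °C); None marks a failed read.
raw = [("a", 21.5), ("b", None), ("a", 35.2), ("c", 19.0), ("b", 40.1)]

# Filtering: drop records with missing values.
clean = [(dev, temp) for dev, temp in raw if temp is not None]

# Categorizing: label each reading using an illustrative 30 °C threshold.
labeled = [(dev, "hot" if temp >= 30.0 else "normal") for dev, temp in clean]

# Condensing: count how many readings fall into each category.
summary = Counter(label for _, label in labeled)
print(summary)
```

The condensed `summary` is the kind of information that could then be contextualized, for example by comparing it with the same count from a previous day.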

Source: Big Data Analytics – A Hands-on Approach by Arshdeep Bahga & Vijay Madisetti
I N T RO DUC TIO N TO D AT A S C I E N C E
G OA L S O F D AT A A N A L Y T I C S
To predict something
• whether a transaction is a fraud or not [Banking]
• whether it will rain on a particular day [Weather Forecast]
• whether a tumor is benign or malignant [Cancer Prediction, Healthcare]
To find patterns in the data
• finding the top 10 coldest days in the year [Weather Forecast]
• which pages are visited the most on a particular website [Web Traffic Rank]
• finding the most searched celebrity in a particular year [Awards]
To find relationships in the data
• finding similar news articles [Bing, Google]
• finding similar patients in an electronic health record system [Healthcare]
• finding related products on an e-commerce website [Recommendation]
• finding correlation between news items and stock prices
* https://www.cnbc.com/2022/04/04/twitter-shares-soar-more-than-25percent-after-elon-musk-takes-9percent-stake-in-social-media-company.html
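
Two of the "find patterns" goals listed above (coldest days, most visited pages) can be sketched with the Python standard library. The temperatures and page hits below are made-up sample values, and the variable names are illustrative assumptions.

```python
import heapq
from collections import Counter

# Hypothetical daily minimum temperatures in °C, keyed by date label.
daily_min_temp = {"01-Jan": 4.0, "02-Jan": 2.5, "15-Jan": -1.0, "03-Feb": 6.5, "20-Dec": 0.5}

# Top-k coldest days: the k smallest values, without sorting the whole dataset.
coldest = heapq.nsmallest(3, daily_min_temp.items(), key=lambda item: item[1])

# Hypothetical web log: one entry per page view.
page_hits = ["/home", "/pricing", "/home", "/docs", "/home", "/docs"]

# Most visited pages: frequency count, then take the top entries.
most_visited = Counter(page_hits).most_common(2)

print(coldest)
print(most_visited)
```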

BIG DATA

Big data is defined as collections of datasets whose volume, velocity or variety is so
large that it is difficult to store, manage, process and analyze the data using
traditional databases and data processing tools.
Big Data analytics deals with the collection, storage, processing and analysis of this
massive-scale data.
Specialized tools and frameworks are required for big data analysis when:
1 the volume of data involved is so large that it is difficult to store, process and analyze
data on a single machine
2 the velocity of data is very high and the data needs to be analyzed in real-time
3 there is variety of data involved, which can be structured, unstructured or
semi-structured, and is collected from multiple data sources
4 various types of analytics need to be performed to extract value from the data

I N T RO DUC TIO N TO D AT A S C I E N C E
B I G D AT A - E X A M P L E

CHARACTERISTICS OF BIG DATA

The 5 V's of Big Data: Volume, Velocity, Variety, Veracity, Value

I N T RO DUC TIO N TO D AT A S C I E N C E
C H A R A C T E R I S T I C S O F B I G D AT A

1 Volume
• Volume of data involved is so large that it is difficult to store, process and analyze data
on a single machine.
• The volume of data generated by IT / IoT systems is growing exponentially, driven by:
• lowering costs of data storage and processing architectures [possible due to Cloud]
• the need to extract valuable insights from the data to improve business processes, efficiency
and service to consumers.
2 Velocity
• Velocity of data refers to how fast the data is generated.
• High velocity causes the accumulated volume of data to become very large in a
short span of time.
• Need to consider parameters such as data provenance and accuracy

I N T RO DUC TIO N TO D AT A S C I E N C E
C H A R A C T E R I S T I C S O F B I G D AT A

3 Variety
• Variety refers to the forms / types of the data.
• Big data comes in different forms such as structured, unstructured or semi-structured,
including text data, image, audio, video and sensor data.
4 Veracity
• Veracity refers to how accurate the data is.
• To extract value from the data, the data needs to be cleaned to remove noise.
5 Value
• Value of data refers to the usefulness of data for the intended purpose.
• The value of the data is also related to the veracity or accuracy of the data.
• For some applications value also depends on how fast we are able to process the data.
[Static (Warehouse) vs Real Time (lecture)]

DATA ANALYTICS

Data analytics is defined as a process of cleaning, transforming, and modeling data
to discover useful information for business decision-making.
4 different types of analytics
1 Descriptive Analytics
2 Diagnostic Analytics
3 Predictive Analytics
4 Prescriptive Analytics

I N T RO DUC TIO N TO D AT A S C I E N C E
D ATA A N A LY T I C S

DESCRIPTIVE ANALYTICS

Answers the question of what happened.
Summarizes past data, usually in the form of dashboards.
Gives insights into the past.
Also known as statistical analysis.
Uses raw data from multiple data sources.

I N T RO DUC TIO N TO D AT A S C I E N C E
D E S C R I P T I V E A N A LY T I C S E X A M P L E - I

I N T RO DUC TIO N TO D AT A S C I E N C E
D E S C R I P T I V E A N A LY T I C S E X A M P L E - I I

Paper - Healthcare Delivery through Telemedicine during the COVID-19 Pandemic: Case Study from a Tertiary Care Center in South India
https://pubmed.ncbi.nlm.nih.gov/33528313/

I N T RO DUC TIO N TO D AT A S C I E N C E
D E S C R I P T IVE A NALY T I C S

Techniques:
• Descriptive Statistics - histogram, correlation
• Data Visualization
• Exploratory Analysis [Seaborn Library in Python]
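
The descriptive techniques listed above (summary statistics, histogram, correlation) can be sketched with the standard library alone. The two columns are hypothetical customer data, and `pearson` is a helper written here for illustration rather than a library function.

```python
import math
import statistics
from collections import Counter

# Hypothetical customer data: age in years and expected weekly miles walked.
ages = [23, 25, 31, 35, 35, 42, 47, 51]
weekly_miles = [10, 12, 9, 14, 15, 11, 8, 7]

# Descriptive statistics for a single variable.
summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "stdev": statistics.stdev(ages),
}

# Histogram: count of customers per 10-year age bin (20 means ages 20 to 29).
histogram = Counter((age // 10) * 10 for age in ages)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(summary)
print(sorted(histogram.items()))
print(round(pearson(ages, weekly_miles), 3))
```

In practice a plotting library such as Seaborn would draw the histogram; the counts computed here are the data behind such a chart.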

DIAGNOSTIC ANALYTICS

• Answers the question of why something happened.
• Gives in-depth insights into data.
• Identifies relationships in the data and patterns of behavior.
• Diagnostic analytics is a form of data analytics that builds on descriptive analytics to
help you understand why something happened in the past.
• Often, diagnostic analysis is referred to as root cause analysis. It involves processes
such as data discovery, data mining, and drill down and drill through.
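
A drill-down, as mentioned above, can be sketched as aggregating at a coarse level, locating the weakest group, and then re-aggregating only that group at a finer level. The sales records and the `total_by` helper are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical sales records: (region, product, units sold).
sales = [
    ("North", "bulbs", 120), ("North", "engines", 300),
    ("South", "bulbs", 40), ("South", "engines", 310),
]

def total_by(records, key_index):
    """Sum units sold, grouped by the field at key_index."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key_index]] += record[2]
    return dict(totals)

# Coarse view: total units per region.
by_region = total_by(sales, 0)

# Find the weakest region, then drill down into its products.
weakest_region = min(by_region, key=by_region.get)
drill_down = total_by([r for r in sales if r[0] == weakest_region], 1)

print(by_region)
print(weakest_region, drill_down)
```

Here the coarse view shows that one region lags, and the drill-down suggests which product accounts for the gap.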

I N T RO DUC TIO N TO D AT A S C I E N C E
D IA G N O S T I C A N A LY T I C S E X A M P L E
What is the effect of global warming in the Southwest monsoon?

DIAGNOSTIC ANALYTICS

Techniques:
• Pattern recognition to identify patterns.
• Linear / Logistic regression to identify relationships.
• Neural networks
• Deep Learning techniques

PREDICTIVE ANALYTICS

• Answers the question of what is likely to happen.
• Predicts future trends.
• Being able to predict allows one to make better decisions.
• Analysis is based on machine learning or deep learning.
• Accuracy of the forecast or prediction depends heavily on data quality and the stability
of the situation.

I N T RO DUC TIO N TO D AT A S C I E N C E
P R E D I C T I V E A N A LY T I C S E X A M P L E - I

I N T RO DUC TIO N TO D AT A S C I E N C E
P R E D I C T I V E A N A LY T I C S E X A M P L E - I I

Covid Patient Discharge Prediction (Dataset: 2nd Wave April-2021 to June 2021)
Type of Project: Machine Learning
Dataset size: 1233 patients suffering from Covid
Variables:
X: Age, Gender, Co_morbid, Admit Date, Discharge date, days of stay,
covid_severity
Y: Discharge Type (Recovered, Expired)
Exploratory Data Analysis: Univariate, Bivariate, Multivariate
Models applied: Support Vector Machine, Naïve Bayes, Logistic Regression,
Decision Trees, KNN, ANN, Random Forest
Best Accuracy: Random Forest (92%)

I N T RO DUC TIO N TO D AT A S C I E N C E
P R E D I C T I V E A NALY T I C S

Techniques / Algorithms:
• Regression
• Classification
• ML algorithms like Linear regression, Logistic regression, SVM
• Deep Learning techniques
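
As a minimal sketch of the regression technique listed above, the following fits a straight line by ordinary least squares and uses it to predict an unseen value. The data points are invented so that the fit is exact, and `fit_line` is an illustrative helper, not a library call.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: x = month number, y = units sold (perfectly linear on purpose).
months = [1, 2, 3, 4, 5]
units = [3, 5, 7, 9, 11]

slope, intercept = fit_line(months, units)

# Predict the next, unseen month.
forecast = slope * 6 + intercept
print(slope, intercept, forecast)
```

Real data is rarely this clean, which is why the accuracy of a prediction depends on data quality and on how stable the underlying situation is.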

PRESCRIPTIVE ANALYTICS

Answers the question of what should be done.
Supports data-driven decision making and corrective actions.
Prescribes what action to take to eliminate a future problem or take full advantage of a
promising trend.
Needs historical internal data and external information like trends.
Analysis is based on machine learning or deep learning, and business rules.
Uses AI to improve decision making.
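
One way to picture the step from prediction to prescription is to combine a predicted risk with a business rule and pick the action with the lowest expected cost. Everything below is a hypothetical sketch: the actions, costs, risk reductions, budget rule and loss figure are assumptions made for illustration.

```python
# Hypothetical interventions with a cost and the risk reduction each is assumed to give.
ACTIONS = [
    {"name": "no_action", "cost": 0, "risk_reduction": 0.0},
    {"name": "lifestyle_coaching", "cost": 50, "risk_reduction": 0.10},
    {"name": "medication", "cost": 200, "risk_reduction": 0.25},
]

def recommend(predicted_risk, budget, loss_if_event=2000):
    """Pick the affordable action with the lowest expected total cost."""
    # Business rule: only actions within budget are allowed.
    feasible = [a for a in ACTIONS if a["cost"] <= budget]

    def expected_cost(action):
        residual_risk = max(predicted_risk - action["risk_reduction"], 0.0)
        return action["cost"] + residual_risk * loss_if_event

    return min(feasible, key=expected_cost)["name"]

print(recommend(predicted_risk=0.40, budget=300))
print(recommend(predicted_risk=0.40, budget=100))
print(recommend(predicted_risk=0.01, budget=300))
```

The predicted risk would come from a predictive model; the prescriptive part is the rule that turns that number into a recommended action.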

I N T RO DUC TIO N TO D AT A S C I E N C E
P R E S C R I P T I V E A N A LY T I C S E X A M P L E - I
• Apollo Hospitals uses an AI tool to predict the risk of cardiovascular disease.
• The Apollo AI-powered “Cardiovascular Disease Risk” tool will help healthcare providers to predict
the risk of cardiac disease in their patients [Predictive Analytics]
• The prediction initiates intervention early enough to make a real difference. [Prescriptive]
• The cardiac risk scoring tool is remarkable for the speed in processing data and its accuracy at
predicting the probability of a patient developing coronary disease.
• Using the tool, physicians will be enabled to deliver proactive, pre-emptive and preventive care for at-
risk individuals, improving lives, while mitigating future pressure on healthcare systems.

https://www.apollohospitals.com/apollo-in-the-news/apollo-hospitals-has-launched-an-artificial-intelligence-tool-to-predict-the-risk-of-cardiovascular-disease/

I N T RO DUC TIO N TO D AT A S C I E N C E
P R E S C R I P T I V E A N A LY T I C S E X A M P L E - I I
How can we improve the crop production?

I N T RO DUC TIO N TO D AT A S C I E N C E
Types of Data Analytics

I N T RO DUC TIO N TO D AT A S C I E N C E
Types of Data Analytics

Exercise

Instagram Reels allows users to create fun videos and share with their contacts. Users can
record 15 second multi-clip videos with audio and effects. Some features include: exploring reels
based on subject; following, commenting and liking a reel; identifying trends to create new reels.
The reels are released in two versions – public (free for all), and premium (subscription basis).

 Discuss the four analytical tasks that can be performed with respect to the Instagram Reels?
[Descriptive, Diagnostic, Predictive and Prescriptive]

I N T RO DUC TIO N TO D AT A S C I E N C E
Types of Data Analytics

Instagram Reels

Descriptive - How many followers do you have, how many views, comments, likes for your
video [free], audience breakdown by country, follower activity per hour [premium]
Diagnostic - Why your video’s engagement rate is less. [premium users]
Predictive - Trending topics for you to make video on - their approximate engagement rates
[premium]
Prescriptive - Tips to increase average watch time of your videos [premium]

I N T RO DUC TIO N TO D AT A S C I E N C E
C O G N I T I V E A N A LY T I C S
Cognitive Analytics – What I Don’t Know?

https://www.10xds.com/blog/cognitive-analytics-to-reinvent-business/


I N T RO DUC TIO N TO D AT A S C I E N C E
C OGNITIVE A NALY T I C S
• Next level of Analytics
• Human cognition is based on the context and reasoning.
• Cognitive systems mimic how humans reason and process.
• Cognitive systems analyze information and draw inferences using probability.
• They continuously learn from data and reprogram themselves.
• According to one source:
• ”The essential distinction between cognitive platforms and artificial
intelligence systems is that you want an AI to do something for you. A
cognitive platform is something you turn to for collaboration or for advice.”

https://interestingengineering.com/cognitive-computing-more-human-than-artificial-intelligence
I N T RO DUC TIO N TO D AT A S C I E N C E
C OGNITIVE A NALY T I C S
• Involves Semantics, AI, Machine learning, Deep
Learning, Natural Language Processing, and Neural
Networks.
• Simulates human thought process to learn from the data
and extract the hidden patterns from data.
• Uses all types of data: audio, video, text, images in the
analytics process.
• Although this is the top tier of analytics maturity, Cognitive
Analytics can be used in the prior levels.
• According to Jean Francois Puget:
• ”It extends the analytics journey to areas that were
unreachable with more classical analytics techniques like
business intelligence, statistics, and operations research.”

I N T RO DUC TIO N TO D AT A S C I E N C E
C OGNITIVE A NALY T I C S

Example of Cognitive Analytics : Woebot Mental Health App

• Provides mental health support, using Cognitive Behavioral Therapy (CBT)


• NLP based self-learning App that advises / chats with users on mental health, developed by Stanford
University

Benefits:
• Using Woebot led to significant reductions in anxiety and depression among people aged 18-28 years
old, compared to an information-only control group.
• 85% of participants used Woebot on a daily or almost daily basis.

I N T RO DUC TIO N TO D AT A S C I E N C E
D AT A A N A L Y T I C S -B A S E D O N D OMAIN
Types of analytics according to the domain
1 Marketing Analytics
2 Financial Analytics
3 Healthcare Analytics
4 Sports Analytics
5 HR Analytics
6 Customer Analytics
7 Web Analytics
8 Social Analytics
9 Political Analytics

I N T RO DUC TIO N TO D AT A S C I E N C E
Sports Analytics - Powerbat

I N T RO DUC TIO N TO D AT A S C I E N C E
Web Analytics – Google Analytics

I N T RO DUC TIO N TO D AT A S C I E N C E
D AT A A N A L Y T I C S - T Y P E O F D AT A

Types of analytics according to the type of data


1 Text analytics
2 Real-time data analytics
3 Multimedia analytics
4 Geo analytics
5 Mobile analytics

I N T RO DUC TIO N TO D AT A S C I E N C E
Geo Analytics – Location Intelligence

https://medium.com/loctruth/unlock-the-power-of-location-intelligence-c0cea20d5a06

DESCRIPTIVE ANALYTICS – EXAMPLE #1

Problem Statement:
“Market research team at Aqua Analytics Pvt. Ltd is assigned a task to identify the
profile of a typical customer for a digital fitness band that is offered by Titanic Corp.
The market research team decides to investigate whether there are differences across
the usage patterns and product lines with respect to customer characteristics”

Data captured:
• Gender
• Age (in years)
• Education (in years)
• Relationship status (Single or Partnered)
• Annual household income
• Average number of times the customer tracks activity each week
• Number of miles the customer expects to walk each week
• Self-rated fitness on a scale of 1–5, where 1 is poor shape and 5 is excellent
• Models of the product purchased - IQ75, MZ65, DX87

https://medium.com/@ashishpahwa7/first-case-study-in-descriptive-analytics-a744140c39a4
I N T RO DUC TIO N TO D AT A S C I E N C E
D E S C R I P T I V E A N A LY T I C S – E X A M P L E # 1

I N T RO DUC TIO N TO D AT A S C I E N C E
D IA G N O S T I C A N A L Y T I C S – E X A M P L E # 1

Problem Statement :
“During the 1980s General Electric was selling different products to its customers such as
light bulbs, jet engines, windmills, and other related products. Also, they separately sell
parts and services this means they would sell you a certain product you would use it until it
needs repair either because of normal wear and tear or because it’s broken. And you would
come back to GE and then GE would sell you parts and services to fix it. Model for GE was
focusing on how much GE was selling, in sales of operational equipment, and in sales of
parts and services. And what does GE need to do to drive up those sales?”

https://medium.com/parrotai/understand-data-analytics-framework-with-a-case-study-in-the-business-world-15bfb421028d
I N T RO DUC TIO N TO D AT A S C I E N C E
D IA G N O S T I C A N A L Y T I C S – E X A M P L E # 1

https://www.sganalytics.com/blog/change-management-analytics-adoption/
I N T RO DUC TIO N TO D AT A S C I E N C E
P R E D I C T I V E A NALY T I C S – E X A M P L E

• Google launched Google Flu Trends (GFT), to collect predictive analytics regarding the
outbreaks of flu. It’s a great example of seeing big data analytics in action.
• So, did Google manage to predict influenza activity in real-time by aggregating search engine
queries with this big data and adopting predictive analytics?
• Even with a wealth of big data analytics on search queries, GFT overestimated the prevalence
of flu by over 50% in 2012-2013 and 2011-2012.
• They matched the search engine terms conducted by people in different regions of the world.
• And, when these queries were compared with traditional flu surveillance systems, Google found
that the predictive analytics of the flu season pointed towards a correlation with higher search
engine traffic for certain phrases.
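
An overestimate such as the one reported for GFT is usually expressed as a relative error against the observed value. The sketch below shows that arithmetic only; the two rates are hypothetical and are not actual GFT or surveillance figures.

```python
def relative_overestimate(predicted, observed):
    """Return how far the prediction exceeds the observation, as a fraction of the observation."""
    return (predicted - observed) / observed

# Hypothetical weekly flu-prevalence figures (percent of doctor visits), not real GFT data.
predicted_rate = 7.5
observed_rate = 5.0

print(f"{relative_overestimate(predicted_rate, observed_rate):.0%}")
```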

I N T RO DUC TIO N TO D AT A S C I E N C E
P R E D I C T I V E A NALY T I C S – E X A M P L E

https://www.slideshare.net/VasileiosLampos/usergenerated-content-collective-and-personalised-inference-tasks
I N T RO DUC TIO N TO D AT A S C I E N C E
P R E S C R I P T I V E A NALY T I C S

Whenever you go to Amazon, the site recommends dozens and dozens of products to
you. These are based not only on your previous shopping history (reactive), but also
based on what you’ve searched for online, what other people who’ve shopped for the
same things have purchased, and about a million other factors (proactive).
Amazon and other large retailers are taking descriptive, diagnostic, and predictive data
and then running it through a prescriptive analytics system to find products that you
have a higher chance of buying.
Every bit of data is broken down and examined with the end goal of helping the
company suggest products you may not have even known you wanted.

https://accent-technologies.com/2020/06/18/examples-of-prescriptive-analytics/


I N T RO DUC TIO N TO D AT A S C I E N C E
H E A LT H C A R E A NALY T I C S – C ASE S TUDY

Self study
https://integratedmp.com/4-key-healthcare-analytics-sources-is-your-practice-using-them/
https://www.youtube.com/watch?v=olpuyn6kemg

I N T RO DUC TIO N TO D AT A S C I E N C E
References:

Big Data Analytics – A Hands-on Approach by Arshdeep Bahga & Vijay Madisetti

https://blog.hootsuite.com/tiktok-analytics/
T HANK YOU

INTRODUCTION TO DATA SCIENCE
MODULE #3: DATA ANALYTICS - METHODOLOGIES
IDS Course Team
BITS Pilani
T ABLE OF C ONTENTS

1 D ATA A N A LY T I C S
2 D ATA A N A LY T I C S M E T H O D O L O G I E S
3 CRISP-DM
4 SEMMA
5 SMAM
6 B I G D ATA L I F E - C Y C L E
7 C H A L L E N G E S I N D ATA D R I V E N D E C I S I O N - M A K I N G

DATA ANALYTICS

Data Analytics is defined as a process of cleaning, transforming, and modeling data to
discover useful information for business decision-making.

DATA ANALYTICS METHODOLOGIES

• A methodology is a set of guiding principles and processes used to plan,
manage, and execute projects.
• It helps data analysts reduce risks, avoid duplication of effort, and
ultimately increase the impact of the project.
Use a standard methodology to ensure a good outcome:
1 CRISP-DM
2 SEMMA
3 SMAM
4 Big Data Life-cycle

NEED FOR A METHODOLOGY

• Framework for recording experience.
• Allows projects to be replicated.
• Aid to project planning and management.
• “Comfort factor” for new adopters
• Demonstrates maturity of Data Mining
• Encourage best practices and help to obtain better results.

DATA ANALYTICS METHODOLOGY
10 questions the process aims to answer
Problem to Approach
1 What is the problem that you are trying to solve?
2 Are there available solutions to similar problems?
Working with Data
3 What data do you need to answer the question?
4 Where is the data coming from? Identify all Sources. How will you acquire it?
5 Is the data that you collected representative of the problem to be solved?
6 What additional work is required to manipulate and work with the data?
Delivering the Answer
7 In what way can the data be visualized to get to the answer that is required?
8 Does the model used really answer the initial question or does it need to be adjusted?
9 Can you put the model into practice?
10 Can you get constructive feedback into answering the question?

CRISP-DM

• Cross Industry Standard Process for Data Mining
• People realized they needed a process to define data mining steps applicable across
any industry, such as Retail, E-Commerce, Healthcare etc.
• Conceived by Daimler-Benz and Integral Solutions Ltd in the year 1996
• 6 high-level phases
• Iterative approach to the development of analytical models.

[Figure: CRISP-DM Phases]

CRISP-DM PHASES

1. Business understanding – What does the business need?
• Understand project objectives and requirements.
• Based on domain knowledge and business strategies.
2. Data understanding – What data do we have / need? Is it clean?
• Initial data collection and familiarization.
• Identify data quality issues.
• Identify initial obvious results.
3. Data preparation – How do we organize the data for modeling?
• Record and attribute selection.
• Data cleansing.

CRISP-DM PHASES

4. Modeling – What modeling techniques should we apply?
• Run the data mining tools.
5. Evaluation – Which model best meets the business objectives?
• Determine if results meet business objectives.
• Identify business issues that should have been addressed earlier.
6. Deployment – How do stakeholders access the results?
• Put the resulting models into practice.
• Set up for continuous mining of the data.
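
The Evaluation phase asks which model best meets the objective. A minimal sketch, assuming accuracy on held-out labels is the agreed success measure, is shown below; the labels, the two sets of predictions and the model names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical held-out labels (1 = positive outcome, 0 = negative outcome).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]

# Hypothetical predictions from two candidate models on the same held-out records.
candidates = {
    "model_a": [1, 0, 1, 0, 0, 1, 0, 1],
    "model_b": [1, 0, 1, 1, 0, 1, 1, 0],
}

scores = {name: accuracy(y_true, preds) for name, preds in candidates.items()}
best_model = max(scores, key=scores.get)

print(scores)
print(best_model)
```

Accuracy is only one possible criterion; a business objective may call for a different measure, which is why Evaluation is tied back to Business understanding.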

I N T RO DUC TIO N TO D AT A S C I E N C E
C R I S P - D M P HASES AND T ASKS

WHY CRISP-DM?

1. Reliable and repeatable, even by people with limited data mining skills.
2. Evergreen [Applicable for Data Mining, Data Scientists, Data Analyst titles]
3. Most other methodologies have evolved from CRISP-DM over time, so
understanding this is essential
4. Thorough [Interdisciplinary – Work with Managers, SMEs, Other teams]
5. Practical
• Concept easy to understand
• Always ties investigation with application [Always tied to Business Problem]
• Flexible
• Free

I N T RO DUC TIO N TO D AT A S C I E N C E
Advantages and Disadvantages
Advantages:
• Clearly defined process (phases and tasks).
• Supports various data mining techniques
• Has documentation of several successful case studies following the approach

Disadvantages:
• Long and Complicated process
• Blind hand-off to IT from the Data Science team, without planning how the model will be operationalized
• No real measure of ROI, once all phases are completed

https://www.diva-portal.org/smash/get/diva2:1250897/FULLTEXT01.pdf

SEMMA

• SAS Institute developed SEMMA as the process for data mining.
• 5 stages - Sample, Explore, Modify, Model, Assess
• Used to solve a wide range of business
problems, including fraud identification,
customer retention and turnover, database
marketing, customer loyalty, bankruptcy
forecasting, market segmentation, as well as
risk, affinity, and portfolio analysis.

I N T RO DUC TIO N TO D AT A S C I E N C E
SEMMA

• SEMMA is a logical organization of the functional tool set of SAS Enterprise Miner
for carrying out the core tasks of data mining.
• Enterprise Miner is a Data Mining Software to create predictive and descriptive
models for large volumes of data.
• Enterprise Miner can be used as part of any iterative data mining methodology
adopted by the client. Naturally steps such as formulating a well defined business or
research problem and assembling quality representative data sources are critical to
the overall success of any data mining project.
• SEMMA is focused on the model development aspects of data mining.
• SEMMA overlaps with Data Preparation, Modelling and Evaluation phases of CRISP-DM

I N T RO DUC TIO N TO D AT A S C I E N C E
S E M M A S TAGES
1. Sample
• Sampling the data by extracting a portion of a large data set big enough to contain the
significant information, yet small enough to manipulate quickly.
• Partitioning the data to create training and test samples.
• Identifying dependent and independent variables influencing the process.
2. Explore
• Exploration of the data by searching for unanticipated trends and anomalies in order to
gain understanding and ideas.
• Perform Univariate analysis (single variable) and multivariate analysis (relationships)
3. Modify
• Modification of the data by creating, selecting, and transforming the variables to focus
the model selection process.
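
The Sample stage described above (drawing a manageable portion of a large data set, then partitioning it into training and test samples) can be sketched as follows. The function name, the 25% test fraction and the fixed seed are illustrative assumptions, not part of SEMMA itself.

```python
import random

def sample_and_split(records, sample_size, test_fraction=0.25, seed=42):
    """Draw a random sample without replacement, then split it into train and test parts."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    sample = rng.sample(records, sample_size)
    n_test = int(round(sample_size * test_fraction))
    return sample[n_test:], sample[:n_test]

# Hypothetical "large" data set: 1000 record identifiers.
records = list(range(1000))

train, test = sample_and_split(records, sample_size=200)
print(len(train), len(test))
```

Because the sample is drawn without replacement, no record appears in both partitions, which keeps the later assessment honest.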

I N T RO DUC TIO N TO D AT A S C I E N C E
S E M M A S TAGES

4. Model
• Apply a variety of data mining techniques to produce a projected model [ML, Deep Learning,
Transfer Learning]
5. Assess
• Assess the data by evaluating the usefulness and reliability of the findings from the
data mining process and estimating how well the model performs.
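
The Sample stage described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not SAS Enterprise Miner; the function name `sample_and_partition`, the 20% test fraction and the fixed seed are assumptions made for the example.

```python
import random


def sample_and_partition(records, sample_size, test_fraction=0.2, seed=42):
    """Draw a random sample, then split it into training and test sets."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    # Sample without replacement; never ask for more rows than exist.
    sample = rng.sample(records, min(sample_size, len(records)))
    n_test = int(len(sample) * test_fraction)
    # First n_test rows become the test set, the rest the training set.
    return sample[n_test:], sample[:n_test]


# Hypothetical data set: 10,000 record identifiers.
records = list(range(10_000))
train, test = sample_and_partition(records, sample_size=1_000)
print(len(train), len(test))
```

The sample is small enough to manipulate quickly, and the training and test sets never share a record because both are slices of one sample drawn without replacement.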

Advantages and Disadvantages
Advantages:
• Focuses only on the model aspects of data mining
• Useful in most machine learning projects where data comes from a single data source
Ex: Pima Indians Diabetes Dataset [Predict Diabetes], Titanic Dataset [Predict
Passenger Survival] from Kaggle

Disadvantages:
• Does not take into account the business understanding of a problem
• Disregards Data Collection and Processing from different data sources

https://www.diva-portal.org/smash/get/diva2:1250897/FULLTEXT01.pdf

SEMMA – Case Study
Covid Patient Discharge Prediction (Dataset: 2nd Wave April-2021 to June 2021)
Type of Project: Machine Learning
1. Sample : Dataset size: 1233 patients suffering from Covid
2. Explore: Univariate (null values, mean, basic statistics), Bivariate (correlation – Pearson, chi-square)
3. Modify : PCA (Principal Component Analysis)
4. Model : Feature Engineering, Subset selection
Final Variables:
X: Age, Gender, Co_morbid, Admit Date, Discharge date, days of stay, covid_severity
Y: Discharge Type (Recovered, Expired)
Models applied: Support Vector Machine, Naïve Bayes, Logistic Regression, Decision Trees, KNN, ANN,
Random Forest
5. Assess : Best Accuracy: Random Forest (92%)
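
The bivariate part of the Explore step can be illustrated with a Pearson correlation computed by hand. The sketch below is not the case study's code; the `age` and `days_of_stay` values are invented for illustration and are not taken from the 1233-patient dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


# Hypothetical values: days_of_stay grows linearly with age.
age = [30, 45, 60, 75]
days_of_stay = [3, 5, 7, 9]
r = pearson(age, days_of_stay)
print(round(r, 3))
```

A value of r close to +1 or -1 suggests a strong linear relationship between two numeric variables; categorical pairs such as Gender and Discharge Type would call for a chi-square test instead.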

SMAM (Standard Methodology for Analytical Models)

http://www.datascienceassn.org/content/standard-methodology-analytical-models

SMAM PHASES
Phase                          Description
Use-case identification        Selection of the ideal approach from a list of candidates
Model requirements gathering   Understanding the conditions required for the model to function
Data preparation               Getting the data ready for the modeling
Modeling experiments           Scientific experimentation to solve the business question
Insight creation               Visualization and dashboarding to provide insight
Proof of Value: ROI            Running the model in a small scale setting to prove the value
Operationalization             Embedding the analytical model in operational systems
Model life-cycle               Governance around model lifetime and refresh

SMAM Phases
Phase I - Use Case Identification
• Brainstorming of Business / Management / SMEs (Domain) / IT (Data Scientist)
teams
• Discussion revolves around:
• Business Needs
• Expert inputs on the domain
• Data Availability
• Analytical Model Complexity – time and effort
• Outcome: Selected Use Case and roadmap for next phases

SMAM Phases
Phase II – Model Requirements Gathering
• Involved parties include Business / End-users / Data Scientists / IT
• Preparation of Model Requirement Document
• Business requirements
• IT requirements
• End user requirements
• Data requirements
• Analytical model requirements

SMAM Phases
Phase III – Data Preparation
• Involved parties include IT / Data Administrators / DBA / Data Modelers and Data
Scientists
• Discussion on:
• Data Access
• Data Location
• Data Understanding
• Data Validation
• Data format [prepared by DBAs and consumed by Data Scientist]
• The process is agile; the data scientist tries out various approaches on smaller sets
and then may ask IT / DBAs to perform the required transformation at scale.

SMAM Phases
Phase IV – Modeling Experiments
• Data Scientist:
• Creates testable hypothesis [Prediction of heart disease]
• Model features [Identify X and Y variables]
• Creates Analytical Model [Regression / Classification / Clustering]
• Evaluates the Analytical Model
[Metrics – Accuracy, Precision, Sensitivity, Specificity etc.]
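
The evaluation metrics named above all derive from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the counts passed in are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from binary confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # share of all correct calls
        "precision": tp / (tp + fp),    # predicted positives that are right
        "sensitivity": tp / (tp + fn),  # actual positives that are found
        "specificity": tn / (tn + fp),  # actual negatives that are found
    }


# Hypothetical heart-disease test set of 100 patients.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy alone can mislead when classes are imbalanced, which is why sensitivity and specificity are reported alongside it.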

SMAM Phases
Phase V – Insight Creation
• Data Scientist:
• Analytical reporting [Inference] and Operational reporting [Prediction]
• Visualization and Dashboards
• Provide business usable insights

SMAM Phases
Phase VI – Proof of Value: ROI
• Quality of the analytical model is observed [Ex: Accuracy of the model is >90%]
• Analytical model is applied to new data and outcomes are measured to verify if
financially viable [for small POC].
• If ROI is positive for POC:
• Set up full-scale experiment with control groups
• Measure the model effectiveness
• Compute ROI and success criteria
• Involve Finance department / IT / End-users and Data Scientists in this phase
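
The ROI check in this phase reduces to simple arithmetic. The sketch below assumes the common definition ROI = (gain - cost) / cost, which the slides do not spell out, and the monetary figures are hypothetical.

```python
def roi(gain, cost):
    """Return on investment as a fraction of cost (assumed definition)."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (gain - cost) / cost


# Hypothetical POC: the model saved 150,000 and cost 100,000 to build and run.
poc_roi = roi(gain=150_000, cost=100_000)
proceed_to_full_scale = poc_roi > 0  # positive ROI -> set up the full-scale experiment
print(f"POC ROI: {poc_roi:.0%}, proceed: {proceed_to_full_scale}")
```

In practice the gain figure would come from comparing outcomes against a control group, as the slide notes.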

SMAM Phases
Phase VII – Operationalization
• Data Scientist works with the IT department to make experimentation with the model
repeatable and to hand the model over
• IT prepares the operational environment
• Integration with existing / legacy applications
• Possible software development as Mobile / Web App for end-user usage

SMAM Phases
Phase VIII – Model Lifecycle
• Involves maintenance of the analytical model in view of changing customer needs
• Two types of model changes:
a. Model Refresh – Model is trained with more recent data, leaving the model
structurally untouched
b. Model Upgrade – Initiated by availability of new data sources and a
business request to improve model performance.
• Involved are operational team, IT team, Data Scientists, DBAs, end-users

BIG DATA ANALYTICS LIFECYCLE

• Big data differs from traditional data primarily due to volume, velocity, variety, veracity and value
• A step-by-step methodology is required to acquire, process, analyze and visualize big data

Book - Big Data Fundamentals: Concepts, Drivers & Techniques


https://www.informit.com/articles/article.aspx?p=2473128&seqNum=11

BIG DATA ANALYTICS LIFECYCLE
Stage I : Business Case Evaluation
• Create a well-defined business case and get approval
• Identify KPIs that define the assessment criteria, to make business goals SMART
(specific, measurable, attainable, relevant, timely)
• Business case must qualify as a ‘big data’ problem – volume, velocity, variety,
veracity, value
• Outcome: Budget requirements, identify software (tools), hardware, training
requirements

BIG DATA ANALYTICS LIFECYCLE
Stage II : Data Identification
• Identify the datasets required for the project and their sources
• Guideline: Identify as many sources as possible, which help gain insights
• Sources can be internal / external to the enterprise
• Internal – Data marts, Data warehouses or operational systems
• External – Data within Blogs, websites etc.

BIG DATA ANALYTICS LIFECYCLE
Stage III : Data Acquisition and Filtering
• Data is gathered from all sources identified in the previous phase
• Data filtering is performed to remove corrupted / noisy data
• Corrupt – records with missing / nonsensical values / invalid data
types
• Create metadata, helps in data provenance, accuracy and quality
• Dataset size & structure
• Source information
• Date and time of creation
• Language specific information
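
Filtering corrupt records and attaching provenance metadata can be sketched as follows. This is an illustration only: the schema, the field names and the source label `claims_feed` are hypothetical, and "corrupt" is narrowed here to "missing value or wrong data type".

```python
from datetime import datetime, timezone


def is_corrupt(record, schema):
    """A record is corrupt if a field is missing or has the wrong type."""
    for field, expected_type in schema.items():
        value = record.get(field)
        if value is None or not isinstance(value, expected_type):
            return True
    return False


def acquire(records, schema, source):
    """Drop corrupt records and describe what was kept."""
    clean = [r for r in records if not is_corrupt(r, schema)]
    metadata = {
        "source": source,                    # source information
        "size": len(clean),                  # dataset size
        "structure": sorted(schema),         # field names
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    return clean, metadata


schema = {"id": int, "amount": float}
raw = [
    {"id": 1, "amount": 10.5},
    {"id": 2, "amount": None},   # missing value
    {"id": "x", "amount": 3.0},  # invalid data type
    {"id": 4},                   # missing field
]
clean, metadata = acquire(raw, schema, source="claims_feed")
print(clean, metadata["size"])
```

Keeping the metadata next to the dataset is what later makes questions about provenance and quality answerable.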

BIG DATA ANALYTICS LIFECYCLE
Stage IV : Data Extraction
• Extract disparate data and transform it into a format that the underlying Big Data
solution can use for the purpose of the data analysis .

[Figure: Extraction of latitude and longitude from a JSON document]
[Figure: User ID and comments extracted from an XML document]
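
The two extraction examples on this slide can be reproduced with the Python standard library. The JSON and XML snippets below, including every field and tag name, are made up for illustration; real feeds will have their own structure.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical JSON record with a nested location object.
raw_json = '{"user": "u1", "location": {"lat": 12.97, "lon": 77.59}}'
doc = json.loads(raw_json)
lat, lon = doc["location"]["lat"], doc["location"]["lon"]

# Hypothetical XML document with one <post> element per comment.
raw_xml = """<posts>
  <post><userId>u1</userId><comment>Great service</comment></post>
  <post><userId>u2</userId><comment>Slow claim</comment></post>
</posts>"""
root = ET.fromstring(raw_xml)
# Flatten each <post> into a (user id, comment) row.
rows = [(p.findtext("userId"), p.findtext("comment")) for p in root.iter("post")]

print(lat, lon)
print(rows)
```

Both results are plain tuples, a tabular shape that downstream analysis tools can consume directly.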

BIG DATA ANALYTICS LIFECYCLE
Stage V : Data Validation and Cleansing
• Big data solutions may receive redundant data across sources
• Redundancy can be used to interconnect datasets and fill missing values

• The first value in Dataset B is validated against its corresponding value in Dataset A.
• The second value in Dataset B is not validated against its corresponding value in Dataset A.
• If a value is missing, it is inserted from Dataset A.
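
The validation rule described above, checking Dataset B against Dataset A and filling gaps from A, can be sketched as below. The dictionaries and the function name `validate_and_fill` are hypothetical; Dataset A is assumed to be the trusted reference.

```python
def validate_and_fill(dataset_a, dataset_b):
    """Validate B against reference A; fill B's missing values from A.

    Returns the cleaned copy of B and the keys whose values disagree with A.
    """
    cleaned, mismatches = {}, []
    for key, reference in dataset_a.items():
        value = dataset_b.get(key)
        if value is None:
            cleaned[key] = reference      # missing value: insert from A
        else:
            cleaned[key] = value
            if value != reference:
                mismatches.append(key)    # present but not validated
    return cleaned, mismatches


dataset_a = {"c1": "Bangalore", "c2": "Chennai", "c3": "Pune"}
dataset_b = {"c1": "Bangalore", "c2": "Chenai", "c3": None}
cleaned, mismatches = validate_and_fill(dataset_a, dataset_b)
print(cleaned)
print(mismatches)
```

Mismatched values are reported rather than overwritten, since deciding which source is right is usually a business decision.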

BIG DATA ANALYTICS LIFECYCLE
Stage VI : Data Aggregation and Representation
• Integrating multiple datasets together to arrive at unified view
• Involves joining datasets based on common fields such as ID or Date
• Semantics standardization (Ex: Surname and Last name – Same value
labeled differently in different datasets)
• Represent using standard data format (row-oriented database)
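
Semantic standardization followed by a join on a common field can be sketched as follows. The records, the alias table and the helper names are assumptions for this example; a real pipeline would more likely use a database join or a dataframe library.

```python
def standardize(record, aliases):
    """Rename fields so the same concept has the same label everywhere."""
    return {aliases.get(name, name): value for name, value in record.items()}


def join_on(left, right, key):
    """Inner join of two lists of dicts on a shared key field."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


aliases = {"surname": "last_name"}  # 'Surname' and 'Last name' mean the same thing
policies = [{"id": 1, "surname": "Rao"}, {"id": 2, "surname": "Iyer"}]
claims = [{"id": 1, "last_name": "Rao", "claim": 5000}]

unified = join_on(
    [standardize(r, aliases) for r in policies],
    [standardize(r, aliases) for r in claims],
    key="id",
)
print(unified)
```

Standardizing field names before the join avoids ending up with both `surname` and `last_name` columns in the unified view.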

BIG DATA ANALYTICS LIFECYCLE
Stage VII : Data Analysis
• Perform EDA (Exploratory Data Analysis)
• Apply Analytics: Descriptive, Diagnostic, Predictive or Prescriptive

BIG DATA ANALYTICS LIFECYCLE
Stage VIII : Data Visualization
• Use tools to graphically visualize and communicate the insights to business users
• Present Dashboards
• Excel, Tableau, Power BI etc.

BIG DATA ANALYTICS LIFECYCLE
Stage IX : Utilization of Analysis Results
• Determining how and where the processed analysis data can be leveraged
• Results can be:
• Fed as input to enterprise systems (Customer analysis result fed into
OTT platform to assist recommendation)
• Refine the business process (Ex: Consolidate transportation routes as
part of supply chain process)
• Generate alerts (Send notification to users via Email or SMS about
impending events)

BIG DATA ANALYTICS LIFECYCLE
CASE STUDY: Background
• Company X is an Insurance Company that deals with health and home insurance
• The company has a ‘Claim Management System’ which contains the claim data,
incident photographs and claim notes
• The company wants to invest in Big Data Analytics to “detect fraudulent claims in the
building sector”
• Let us see how the company uses the ‘Big Data Analytics’ Lifecycle to achieve the
objective of ‘detecting fraudulent claims in the building sector’

* Building Insurance is a type of Home insurance that covers the structure of the house from any kinds of danger or risks

Case Study: Detect Fraudulent Claims

Phase I: Business Case Evaluation


• The use case is important because it reduces monetary loss for Company X
• It covers ‘opportunistic fraud’ such as lying and exaggeration, which accounts for the
majority of cases.
• KPI for success is set as – ‘reduction in fraudulent claims by 15%.’
• Regarding budget allocation and Infrastructure upgrade, Company X decides to
leverage Open Source Big Data Solution – Hadoop Ecosystem.
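
Checking the KPI above is a one-line calculation once a baseline exists. The claim counts in this sketch are hypothetical, and the KPI is assumed to mean a relative reduction measured against that baseline.

```python
def relative_reduction(before, after):
    """Fractional decrease from a baseline count."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (before - after) / before


# Hypothetical counts of fraudulent claims per year.
baseline_fraud_claims = 200
current_fraud_claims = 165
reduction = relative_reduction(baseline_fraud_claims, current_fraud_claims)
kpi_met = reduction >= 0.15  # KPI: reduction in fraudulent claims by 15%
print(f"reduction: {reduction:.1%}, KPI met: {kpi_met}")
```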

Case Study: Detect Fraudulent Claims

Phase II: Data Identification


• Internal datasets: Policy data, insurance application documents, claims data,
incident photographs, emails
• External datasets: Social Media Data (Twitter Feeds), Weather reports,
Geographical data (GIS), and census data.
• The claim data consists of historical claim data consisting of multiple fields where
one of the fields specifies if the claim was fraudulent or legitimate.

Case Study: Detect Fraudulent Claims

Phase III: Data Acquisition and Filtering


• Policy data obtained from Policy Administration System
• Claim data from Claims Management System
• Call center agent notes and emails from CRM system
• Social Media Data (Twitter Feeds), Weather reports, Geographical data (GIS),
and census data are obtained from third party vendors.
• To ensure provenance, metadata such as dataset name, source, size, format,
acquired date and number of records is attached to each dataset.
• Batch filtering jobs to remove corrupt records in external datasets.

Case Study: Detect Fraudulent Claims

Phase IV: Data Extraction


• Tweets dataset is in JSON format: User Id, Timestamp and Tweet Text are
extracted into tabular form.
• Weather dataset is in XML format: Timestamp, Temperature Forecast, Wind
Speed Forecast, Wind Direction Forecast, Snow Forecast and Flood Forecast
parameters extracted into tabular form.

Case Study: Detect Fraudulent Claims

Phase V: Data Validation and Cleaning


• Check the extracted fields from Twitter and Weather datasets for typographical
errors, incorrect data, data type validation and range validation
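
Data type and range validation for the extracted weather fields might look like the sketch below. The field names, units and bounds are assumptions for illustration; the case study does not specify them.

```python
def validate_weather(row):
    """Return a list of validation errors for one extracted weather row."""
    errors = []
    temperature = row.get("temperature_c")
    wind_speed = row.get("wind_speed_kmh")

    # Data type validation first, then range validation.
    if not isinstance(temperature, (int, float)):
        errors.append("temperature_c: wrong type")
    elif not -90 <= temperature <= 60:  # assumed plausible bounds
        errors.append("temperature_c: out of range")

    if not isinstance(wind_speed, (int, float)):
        errors.append("wind_speed_kmh: wrong type")
    elif wind_speed < 0:
        errors.append("wind_speed_kmh: out of range")
    return errors


good = validate_weather({"temperature_c": 21.5, "wind_speed_kmh": 12})
bad = validate_weather({"temperature_c": "hot", "wind_speed_kmh": -3})
print(good)
print(bad)
```

Rows with a non-empty error list can be dropped or sent for manual review, depending on how much data the analysis can afford to lose.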

Case Study: Detect Fraudulent Claims

Phase VI: Data Aggregation and Representation


• For meaningful analysis of data, join together policy data, claim data, call center
agent notes in a single dataset that is tabular, where each field can be
referenced through a user query.
• Resulting dataset is stored in RDBMS datastore.

Case Study: Detect Fraudulent Claims

Phase VII: Data Analysis


• Perform Exploratory Data Analysis
• This stage is repeated a number of times as the results generated after the first
pass are not conclusive enough to comprehend what makes a fraudulent claim
different from a legitimate claim.
• Machine learning models were developed using Naïve Bayes, Random Forest,
Decision Tree, Logistic Model Tree etc
• Metrics used: Accuracy, Precision, Recall, F-Measure, ROC
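
Precision, recall and F-measure can be computed directly from true and predicted labels. The five labelled claims below are invented for illustration, and F-measure is taken here to mean the F1 score (harmonic mean of precision and recall).

```python
def precision_recall_f1(y_true, y_pred, positive="fraud"):
    """Precision, recall and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels for five claims.
y_true = ["fraud", "fraud", "legit", "legit", "fraud"]
y_pred = ["fraud", "legit", "legit", "fraud", "fraud"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

For fraud detection, recall is often weighted heavily because a missed fraudulent claim tends to cost more than a false alarm.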

Case Study: Detect Fraudulent Claims

Phase VIII: Data Visualization


• The team has discovered some interesting findings and now needs to convey the
results to the Insurance experts.
• Different visualization methods are used including bar and line graphs and scatter
plots.
• Scatter plots are used to analyze groups of fraudulent and legitimate claims in the light
of different factors, such as customer age, age of policy, number of claims made
and value of claim.

Case Study: Detect Fraudulent Claims

Phase IX: Utilization of Analysis Results

• The machine learning model was incorporated into the existing claim
processing system to flag fraudulent claims.

When to use what Methodology?

CRISP-DM
• Start-up with no prior experience in data mining or data science
• Good documentation and case studies
• Suitable for both data mining and data science projects

SEMMA
• Have an identified dataset, preferably from a single data source
• Model development is the priority
• Maybe as a POC / MVP; no deployment clarity

SMAM
• Need a quick POC / MVP before taking the big bang approach
• Need proof of ROI before investment
• Need clarity on the division of roles and responsibilities of team members in project execution

Big Data Lifecycle
• Big data requirements (5Vs)
• Multiple data sources (provenance, quality aspects of data)
• Integrate the model with existing systems (operationalization)

Custom
• Have additional steps / phases in addition to the methodology
• Find the methodology constraining
• Ex: IBM / Netflix / Google customize big data lifecycle and CRISP-DM in many projects

DATA DRIVEN DECISION-MAKING

Mass Personalization of Menus
• Past history, weather, time of day, local events

Create new Blockbuster Hit series
• Analyze over 30 million plays a day, 4 million subscriber ratings and 3 million searches; developed ‘House of Cards’

People Analytics
• Diagnose HR issues, analyze employee performance reviews, manage workforce and talent better

Serving of Ads
• Image recognition (pattern of people drinking), demographics, background; offer personalized ads

https://unscrambl.com/blog/data-driven-companies-examples/

CHALLENGES IN DATA DRIVEN DECISION-MAKING

1. Discrimination
• Algorithmic discrimination can come from various sources.
• Data used to train algorithms may have biases that lead to discriminatory decisions.
• Discrimination may arise from the use of a particular algorithm.
• Algorithms can result in discrimination as a result of misuse of certain models in different
contexts.
• Biased data can be used both as evidence for the training of algorithms and as evidence
of their effectiveness.

CHALLENGES IN DATA DRIVEN DECISION-MAKING
Example 1: Racism embedded in US healthcare
In October 2019, researchers found that an algorithm used on more than 200
million people in US hospitals to predict which patients would likely need extra
medical care heavily favoured white patients over black patients. Race itself
was not a variable in this algorithm, but healthcare cost history, a variable
highly correlated with race, was. The rationale was that cost summarizes how
many healthcare needs a particular person has. For various reasons, black
patients incurred lower healthcare costs on average than white patients with
the same conditions.

CHALLENGES IN DATA DRIVEN DECISION-MAKING
Example 2: Amazon’s hiring algorithm
Amazon is one of the largest tech giants in the world, so it is no surprise
that it is a heavy user of machine learning and artificial intelligence. In
2015, Amazon realized that the algorithm it used for hiring employees was
biased against women. The algorithm was based on the resumes submitted over
the past ten years, and since most of the applicants were men, it was trained
to favor men over women.

CHALLENGES IN DATA DRIVEN DECISION-MAKING

2. Lack of transparency
• Transparency refers to the capacity to understand a computational model and therefore
contribute to the attribution of responsibility for consequences derived from its use.
• A model is transparent if a person can easily observe it and understand it.
• Three types of opacity (i.e. lack of transparency) in algorithmic decisions
• Intentional opacity – The objective of this type of opacity is to protect the algorithm
inventors’ intellectual property.
• Knowledge opacity – This type of opacity is due to the fact that most people lack the
technical skills to understand how algorithms and computational models are constructed.
• Intrinsic opacity – This type of opacity arises from the nature of certain computer learning
methods (e.g. deep learning models).

https://philpapers.org/rec/BURHTM

CHALLENGES IN DATA DRIVEN DECISION-MAKING
3. Violation of privacy
• Misuse of users’ personal data and data aggregation by entities such as data
brokers may have direct implications for people’s privacy. [Google faced a lawsuit
for privacy violation in 2020 – selling data to 3rd party companies]
4. Digital literacy
• Devote resources to digital and computer literacy programs for everyone, from children to the elderly.
• This enables society to make informed decisions about technologies that it would
otherwise not understand. [Cases of cyberbullying among the juvenile population]
5. Fuzzy responsibility
• As more and more decisions that affect millions of people are made automatically by
algorithms, we must be clear about who is responsible for the consequences of these
decisions. Transparency is often considered a fundamental factor in the clarity of
attribution of responsibility.

CHALLENGES IN DATA DRIVEN DECISION-MAKING
6. Lack of ethical frameworks
• Algorithmic data-based decision-making processes generate important ethical dilemmas
regarding what actions are appropriate in light of the inferences made by algorithms.
• It is therefore essential that decisions be made in accordance with a clearly defined and
accepted ethical framework.
• There is no single method for introducing ethical principles into algorithmic decision
processes.

On March 18, 2018, at around 10 p.m., Elaine Herzberg was wheeling her bicycle
across a street in Tempe, Arizona, when she was struck and killed by a self-driving
car. Although there was a human operator behind the wheel, an autonomous
system—artificial intelligence—was in full control.

CHALLENGES IN DATA DRIVEN DECISION-MAKING
7. Lack of diversity
• Data-based algorithms and artificial intelligence techniques for decision-making have
been developed by homogeneous groups of IT professionals.
• Ensure that teams are diverse in terms of areas of knowledge as well as demographic
factors [interdisciplinary – teaching medical doctors data science for self-computation]

REFERENCES

https://www.kdnuggets.com/2014/10/crisp-dm-top-methodology-analytics-data-mining-data-science-projecthtml
https://www.datasciencecentral.com/profiles/blogs/crisp-dm-a-standard-methodology-to-ensure-a-good-outcome
https://documentation.sas.com/?docsetId=emref&docsetTarget=n061bzurmej4j3n1jnj8bbjjm1a2.htm&docsetVersion=14.3&locale=en
http://jesshampton.com/2011/02/16/semma-and-crisp-dm-data-mining-methodologies/
https://www.kdnuggets.com/2015/08/new-standard-methodology-analytical-models.html
https://medium.com/illumination-curated/big-data-lifecycle-management-629dfe16b78d
https://www.esadeknowledge.com/view/7-challenges-and-opportunities-in-data-based-decision-making-193560

THANK YOU

03/12/2022, 22:49 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due: Dec 4 at 19:00
Points: 5
Questions: 20
Available: Dec 3 at 19:00 - Dec 4 at 19:00 (24 hours)
Time Limit: 30 Minutes

Instructions:
1. Each question carries 0.25 marks.
2. Time limit for answering the Quiz is 30 minutes.
3. Only one submission allowed per student.

Attempt History
Attempt            Time         Score
LATEST Attempt 1   10 minutes   5 out of 5

Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5
Submitted Dec 3 at 22:49
This attempt took 10 minutes.

Question 1 (0.25 / 0.25 pts)
_____________ is the practice of designing and building systems for collecting, storing, and analyzing data at scale.
• Data Analysis
• Data Mining
• Machine Learning
• Data engineering

https://bits-pilani.instructure.com/courses/1726/quizzes/3386

Question 2 (0.25 / 0.25 pts)
Which one of this belongs to the Data Science career?
• Data Creator
• Data Administrator
• None
• Mathematical Analyst

Question 3 (0.25 / 0.25 pts)
Data Engineers are also known as Data Miners.
• False
• True

Question 4 (0.25 / 0.25 pts)
Data Science domain knowledge needs
• Understanding of domain concepts
• All of the listed
• High level understanding of Technical know-how specific to the domain.
• Client Requirement understanding

Question 5 (0.25 / 0.25 pts)
Data science challenges can be categorized as
• Organization related
• All three
• Technology related
• Data related

Question 6 (0.25 / 0.25 pts)
Data science is an Interdisciplinary field.
• True
• False

Question 7 (0.25 / 0.25 pts)
Data scientist is not responsible for
• Data analytics
• Data manipulation
• Data mining
• Building continuous data stream

Question 8 (0.25 / 0.25 pts)
In which phase the duplicates (of the data) are removed? Choose the best possible answer.
• Data Collection
• Data Requirements
• Data Preparation
• Data Understanding

Question 9 (0.25 / 0.25 pts)
A person goes to a supermarket to purchase ingredients for making a meal. Which phase of the data science process does the analogy fit correctly? (Choose the best answer)
• Data Collection
• Data Cleaning
• Data Understanding
• Data Preparation

Question 10 (0.25 / 0.25 pts)
Which one refers to the labelling of the data?
• Data Writing
• Data Wrangling
• Data Label
• Data Annotation

Question 11 (0.25 / 0.25 pts)
The scatterplot implies that
• The features are negatively correlated
• The features are positively correlated
• The features are independent

Question 12 (0.25 / 0.25 pts)
A scenario where you visited a doctor for your fever and the doctor asked you questions like: Were you exposed to rain or cold climate; or did you have contact with a sick person; or did you have food from outside etc., and then doctor comes to a conclusion based on your response to the queries can be considered analogous to which stage of data analytics.
• Descriptive
• Predictive
• Prescriptive
• Diagnostic

Question 13 (0.25 / 0.25 pts)
While playing cards predicting the next card to be a joker is
• Probability Calculation activity
• Machine Learning Activity
• Data Science Activity
• None of the above

Question 14 (0.25 / 0.25 pts)
Among the following, which user role is more concerned with non-functional requirements such as security, fault tolerance and efficiency in data processing applications?
• Data Engineer
• Data Analyst
• Business Intelligence Analyst
• Data Scientist

Question 15 (0.25 / 0.25 pts)
Recently, the movie RRR released on OTT platform. It is observed that the Tamil version of the movie was viewed the most, out of the 3 dubbed languages. The data science team plans to develop a recommendation system for future releases based on this movie trend, which of the following ‘analytical’ approaches would you suggest.
• Predictive analytics
• Descriptive analytics
• None of the above
• Diagnostic analytics

Question 16 (0.25 / 0.25 pts)
Which of the following are the main challenges in Data Science?
• Data Availability
• Data Security
• Data quantity
• All of these

Question 17 (0.25 / 0.25 pts)
Diagnostic Analysis is often referred as,
• Pattern Recognition
• Statistical Analysis
• Cognitive Analytics
• Root Cause Analysis

Question 18 (0.25 / 0.25 pts)
Which of the following is not a characteristic of Big Data?
• Vision
• Velocity
• Variety
• Volume

Question 19 (0.25 / 0.25 pts)
Which of the following is not an application of Data Science?
• Image & Speech Recognition
• Online Price Comparison
• Privacy Checker
• Recommendation Systems

Question 20 (0.25 / 0.25 pts)
Which of the following questions is answered by Predictive Analytics?
• What happened?
• What will happen?
• What should I do?
• Why did it happen?

Quiz Score: 5 out of 5
03/12/2022, 22:49 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
10 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 3 at 22:49
This attempt took 10 minutes.

Question 1 0.25
/ 0.25 pts

_____________ is the practice of designing and building systems for


collecting, storing, and analyzing data at scale.

 
Data Analysis

 
Data Mining

 
Machine Learning

 
Data engineering

https://bits-pilani.instructure.com/courses/1726/quizzes/3386 1/9
03/12/2022, 22:49 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

Question 2 0.25
/ 0.25 pts

Which one of this belongs to the Data Science career?

 
Data Creator

 
Data Administrator

 
None

 
Mathematical Analyst

Question 3 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
False

 
True

Question 4 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
All of the listed

https://bits-pilani.instructure.com/courses/1726/quizzes/3386 2/9
03/12/2022, 22:49 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

 
High level understanding of Technical know-how specific to the domain.

 
Client Requirement understanding

Question 5 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Organization related

 
All three

 
Technology related

 
Data related

Question 6 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 7 0.25
/ 0.25 pts

Data scientist is not responsible for 

https://bits-pilani.instructure.com/courses/1726/quizzes/3386 3/9
03/12/2022, 22:49 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

 
Data analytics

 
Data manipulation

 
Data mining

 
Building continuous data stream

Question 8 0.25
/ 0.25 pts

In which phase the duplicates (of the data) are removed? Choose the
best possible answer.

 
Data Collection

 
Data Requirements

 
Data Preparation

 
Data Understanding

Question 9 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a


meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Collection

 
Data Cleaning

 
Data Understanding

https://bits-pilani.instructure.com/courses/1726/quizzes/3386 4/9
03/12/2022, 22:49 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

 
Data Preparation

Question 10 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Writing

 
Data Wrangling

 
Data Label

 
Data Annotation

Question 11 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are negatively correlated

 
The features are positively correlated

https://bits-pilani.instructure.com/courses/1726/quizzes/3386 5/9
03/12/2022, 22:49 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

 
The features are independent

Question 12 0.25
/ 0.25 pts

Consider a scenario where you visit a doctor for a fever and the doctor
asks you questions such as: Were you exposed to rain or a cold climate?
Did you have contact with a sick person? Did you have food from
outside? The doctor then comes to a conclusion based on your responses
to these queries. This can be considered analogous to which stage of
data analytics?

 
Descriptive

 
Predictive

 
Prescriptive

 
Diagnostic

Question 13 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is a

 
Probability Calculation activity

 
Machine Learning Activity

 
Data Science Activity

 
None of the above

Question 14 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-
functional requirements such as security, fault tolerance and efficiency
in data processing applications?

 
Data Engineer

 
Data Analyst

 
Business Intelligence Analyst

 
Data Scientist

Question 15 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It is observed
that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this movie trend.
Which of the following ‘analytical’ approaches would you suggest?

 
Predictive analytics

 
Descriptive analytics

 
None of the above

 
Diagnostic analytics

Question 16 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Availability

 
Data Security

 
Data quantity

 
All of these

Question 17 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as

 
Pattern Recognition

 
Statistical Analysis

 
Cognitive Analytics

 
Root Cause Analysis

Question 18 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Variety

 
Volume

Question 19 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Image & Speech Recognition

 
Online Price Comparison

 
Privacy Checker

 
Recommendation Systems

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What happened?

 
What will happen?

 
What should I do?

 
Why did it happen?

Quiz Score:
5 out of 5

QUIZ -1
Due Dec 4 at 19:00 Points 5 Questions 20
Available Dec 3 at 19:00 - Dec 4 at 19:00 24 hours Time Limit 30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student.

Attempt History
Attempt Time Score
LATEST Attempt 1 28 minutes 4 out of 5

Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 4 out of 5


Submitted Dec 3 at 22:19
This attempt took 28 minutes.

Question 1 0.25 / 0.25 pts

_____________ is the practice of designing and building systems for
collecting, storing, and analyzing data at scale.
Machine Learning

Question 4 0.25 / 0.25 pts

Which one of these belongs to the Data Science career?


True
Question 11 0.25 / 0.25 pts

The scatterplot implies that


Question 19 0.25 / 0.25 pts

Which of the following questions is answered by Predictive Analytics?


12/3/22, 9:41 PM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
10 minutes 4.25 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


4.25 out of 5
Submitted Dec 3 at 21:39
This attempt took 10 minutes.

Question 1 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
All of the listed

 
Client Requirement understanding

 
High level understanding of Technical know-how specific to the domain.

 
Understanding of domain concepts

Question 2 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by enterprises for data analysis and management.

 
Machine Learning

 
Business Intelligence

 
Artificial Intelligence

 
None of the Above.

Question 3 0.25
/ 0.25 pts

The Data Warehouse is a “Schema-on-Load” Approach.

 
True

 
False

Question 4 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 5 0.25
/ 0.25 pts

_________________ is the science of collecting, analyzing, presenting, and interpreting data. Its objective is to draw conclusions about the
population.

 
Business Intelligence

 
Computational Intelligence

 
Data Analytics

 
Statistics

Question 6 0.25
/ 0.25 pts

___________ is considered to be a sub-field of or one of the tools of AI, which provides machines with the capability of learning from
experience.

 
Data Science

 
Data analysis

 
Machine Learning

 
Artificial Intelligence

Incorrect
Question 7 0
/ 0.25 pts

Exploratory data analysis does not help in

 
Univariate and bivariate variable analysis

 
Finding statistical estimates of a variable

 
Derivation of new attributes

 
Finding out the data type of a variable

Question 8 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the best possible answer.

 
Data Preparation

 
Data Requirements

 
Data Collection

 
Data Understanding

Question 9 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Label

 
Data Writing

 
Data Annotation

 
Data Wrangling

Question 10 0.25
/ 0.25 pts

Data preprocessing improves the data quality and makes data mining algorithms efficient and effective. What are the data preprocessing
tasks along with Data cleaning?

 
Data Integration

 
Data Reduction

 
Data Transformation

 
All of the above

Question 11 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is a

 
None of the above

 
Probability Calculation activity

 
Machine Learning Activity

 
Data Science Activity

Incorrect Question 12 0
/ 0.25 pts

Which among the following advocates an exploratory and dynamic process?

 
Data Warehouse

 
Data Science

 
Business Intelligence

 
Data Mining

Question 13 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the voice is male or female a Data Science activity?

 
True

 
False

Incorrect Question 14 0
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It is observed that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a recommendation system for future releases based on this movie trend.
Which of the following ‘analytical’ approaches would you suggest?

 
Diagnostic analytics

 
Predictive analytics

 
None of the above

 
Descriptive analytics

Question 15 0.25
/ 0.25 pts

Consider a scenario where you visit a doctor for a fever and the doctor asks you questions such as: Were you exposed to rain or a cold climate?
Did you have contact with a sick person? Did you have food from outside? The doctor then comes to a conclusion based on your
responses to these queries. This can be considered analogous to which stage of data analytics?

 
Predictive

 
Prescriptive

 
Diagnostic

 
Descriptive

Question 16 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Availability

 
Data Security

 
All of these

 
Data quantity

Question 17 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as

 
Cognitive Analytics

 
Statistical Analysis

 
Pattern Recognition

 
Root Cause Analysis

Question 18 0.25
/ 0.25 pts

The type of problem in statistics will be

 
None of these

 
Well Structured

 
Semi-Structured

 
Unstructured

Question 19 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Velocity

 
Vision

 
Volume

 
Variety

Question 20 0.25
/ 0.25 pts

Big data is defined as ______________?

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process and analyze the data using traditional
databases and data processing tools

 
The process of extracting and creating information from raw data by using different techniques

 
The process of making machines capable of mimicking human behavior

 
An engineering discipline that is concerned with all aspects of software production.

Quiz Score:
4.25 out of 5

Question 1
0.17 / 0.17 pts
Data preprocessing improves the data quality and makes data mining algorithms efficient
and effective. What are the data preprocessing tasks along with Data cleaning?

Data Integration

Data Reduction

Data Transformation

All of the above.

Question 2
0.17 / 0.17 pts
Which of the following artifacts is not considered in Descriptive analytics?

Standard report

Alerts

Adhoc reports

Predictive model

Question 3
0.17 / 0.17 pts
If you were to arrange the following data analytics techniques in increasing order of
complexity, which of the following is considered the correct order?

Predictive, Diagnostic, Prescriptive, Descriptive

Descriptive, Diagnostic, Predictive, Prescriptive

Diagnostic, Predictive, Prescriptive, Descriptive

Prescriptive, Descriptive, Diagnostic, Predictive

Question 4
0.17 / 0.17 pts
A data scientist is not responsible for

Data mining

Data manipulation

Building continuous data stream

Data analytics

Question 5
0.17 / 0.17 pts
Which of these is not an example of the application of data science?

Fraud detection and prevention system in a bank

Creation of new fiscal policy


Targeted advertising as per customer's need

Product recommender systems

Question 6
0.17 / 0.17 pts
If a person comes from a software development background, which of the following
Data science project roles will best suit him?

Big Data Engineer

Machine Learning Engineer

Data Analyst

Storyteller

Question 7
0.17 / 0.17 pts

A person goes to a supermarket to purchase ingredients for making a meal. Which
phase of the data science process does the analogy fit correctly? (Choose the best
answer)

Data Understanding

Data Collection

Data Cleaning

Data Preparation

Question 8
0.17 / 0.17 pts
Which of the following is a correct component of data science?

Data Engineering

Advanced Computing

Domain expertise

All of the options

Incorrect Question 9
0 / 0.17 pts
Which of the following is not an issue with CRISP-DM model?

The end-users of the analytical model are required to post-rationalize the model, which
leads to a lot of dissatisfaction

It very much underestimates the amount of real experimentation that is needed to get at
viable results

Various modeling techniques are selected and applied, and their parameters are
calibrated to optimal values

Thorough evaluation is indeed needed, yet the CRISP-DM methodology does not
prescribe how to do this.

Question 10
0.17 / 0.17 pts
Which of the following curve analyses is conducted on each predictor for classification?

NOC

ROC

COC

All of the mentioned

Question 11
0.17 / 0.17 pts
Which of the following best describes the difference between the data analyst and data
scientist?

Data analysts just deal with numbers, whereas data scientists deal with algorithms

Data analysts do estimation, whereas data scientists predict and explain it as well

Data analysts and data scientists play the same role in the project

Data analysts are more proficient in R, whereas data scientists are more proficient in
Python

Incorrect Question 12
0 / 0.17 pts
Which of the following is not an application of data science?

Recommendation Systems

Image & Speech Recognition

Online Price Comparison

Privacy Checker

Question 13
0.17 / 0.17 pts
Which one refers to the labelling of the data?

Data Writing

Data Annotation

Data Label

Data Wrangling

Question 14
0.17 / 0.17 pts
Data Science project steps are highly linear.

True

False

Question 15
0.17 / 0.17 pts
Which of the following can be a significant challenge in data science?

Irregular communication with stakeholders

Lack of project funding in data science projects

Data Quality

Inadequately defined organizational structure

Question 16
0.17 / 0.17 pts
CRISP-DM methodology is specifically built for IT(Information Technology) projects.

True

False

Question 17
0.17 / 0.17 pts
In "Business Understanding" phase of the data science process, the goals are identified
and the objectives defined.

True

False

Question 18
0.17 / 0.17 pts
Which of the following steps is performed by a data scientist after acquiring the data?

Data Cleaning

Data Integration

Data Replication

All of the options

Question 19
0.17 / 0.17 pts
Which one of the following is not a necessary characteristic of a Data Scientist?

Creative

Technical

Punctual

Communicative

Question 20
0.17 / 0.17 pts
Which of the following is not a characteristic of "Localized Analytics"?

Usable in functional silos

Only at functional or process level

Occurs in disconnected manner

Key data, technology and analysts are centralized

Question 21
0.17 / 0.17 pts
Raw data should be processed only one time.

True

False

Incorrect Question 22
0 / 0.17 pts
Which of the following data science project steps is the most critical for the success
of the project?

Model Evaluation

Model Selection

Model Building

Data preprocessing

Question 23
0.17 / 0.17 pts
Data science is the process of diverse set of data through ____________

organizing data

processing data

analysing data

All of the options

Question 24
0.17 / 0.17 pts
Correlation always implies causation.

True

False

Question 25
0.17 / 0.17 pts
Which one of the following statement(s) is correct (Choose the most appropriate
answer)?

Analytics is a process in which a computer examines information using mathematical
methods to find useful patterns.

Data Analytics refers to the techniques used to analyze data to enhance productivity
and business gain.

Data analytics is the pursuit of extracting meaning from raw data using specialized
computer systems.

All the statements

None of the statements

Incorrect Question 26
0 / 0.17 pts
The answer to the following question can be obtained by which type of analytics?
"What's the best that can happen?"

Diagnostic Analytics

Descriptive Analytics

Prescriptive Analytics

Predictive Analytics

Question 27
0.17 / 0.17 pts
Pattern Recognition is a sub-field of Data Science.

True

False

Partial Question 28
0.11 / 0.17 pts
Which of the following are the reasons for the sudden growth of analytics?

Data is growing at 40% compound annual rate

Large number of analysts available in the market

Large number of user friendly analytics tools available for data processing

Cost of storage has hugely dropped

Question 29
0.17 / 0.17 pts
Training and testing datasets are developed during the _________ phase.

Model planning

Model building

Operationalize

None

Question 30
0.17 / 0.17 pts
In the _________ phase, the final report/technical document of the process is prepared.

Model planning

Model building

Operationalize

None
QUIZ -1
Due
Dec 4 at 19:00
Points
5
Questions
20
Available
Dec 3 at 19:00 - Dec 4 at 19:00
24 hours
Time Limit
30 Minutes

Instructions
Instructions:

1. Each question carries 0.25 marks.

2. Time limit for answering the Quiz is 30 minutes.

3. Only one submission allowed per student. 

Attempt History
Attempt Time Score
LATEST Attempt 1
22 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 4 at 12:57
This attempt took 22 minutes.

Question 1 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
All three

 
Organization related

 
Technology related

 
Data related

Question 2 0.25
/ 0.25 pts

Which one of these belongs to the Data Science career?

 
None

 
Mathematical Analyst

 
Data Administrator

 
Data Creator

Question 3 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 4 0.25
/ 0.25 pts

_________________ is the science of collecting, analyzing, presenting,
and interpreting data. Its objective is to draw conclusions about the
population.

 
Data Analytics

 
Statistics

 
Business Intelligence

 
Computational Intelligence

Question 5 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Descriptive & Static

 
Static & comparative

 
Predictive & Prescriptive

 
Standard & dynamic

Question 6 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
False

 
True

Question 7 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the
best possible answer.
 
Data Collection

 
Data Preparation

 
Data Requirements

 
Data Understanding

Question 8 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Wrangling

 
Data Writing

 
Data Annotation

 
Data Label

Question 9 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Creation of new fiscal policy

 
Targeted advertising as per customer's need

 
Fraud detection and prevention system in a bank

 
Product recommender systems

Question 10 0.25
/ 0.25 pts

Data preprocessing improves data quality and makes data mining
algorithms efficient and effective. What are the data preprocessing
tasks along with Data cleaning?

 
Data Reduction

 
Data Integration

 
Data Transformation

 
All of the above

Question 11 0.25
/ 0.25 pts

Among the following, which user role is more concerned with
non-functional requirements such as security, fault tolerance and
efficiency in data processing applications?

 
Business Intelligence Analyst

 
Data Scientist

 
Data Analyst

 
Data Engineer

Question 12 0.25
/ 0.25 pts

While playing cards, predicting that the next card will be a joker is


 
None of the above

 
Data Science Activity

 
Probability Calculation activity

 
Machine Learning Activity
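
For context, the chance that the next card is a joker can be worked out directly from counts, with no data or model involved. The sketch below assumes a 54-card deck containing 2 jokers and a uniformly random draw; both the deck composition and the `probability_of_joker` helper are assumptions made for illustration.

```python
from fractions import Fraction


def probability_of_joker(jokers_left, cards_left):
    """Probability that the next card drawn is a joker, assuming a uniform draw."""
    if cards_left <= 0 or not 0 <= jokers_left <= cards_left:
        raise ValueError("counts must satisfy 0 <= jokers_left <= cards_left, cards_left > 0")
    return Fraction(jokers_left, cards_left)


# Assumed deck: 52 standard cards plus 2 jokers, nothing drawn yet.
p = probability_of_joker(2, 54)
print(p)
```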

Question 13 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic
process?

 
Business Intelligence

 
Data Mining

 
Data Warehouse

 
Data Science

Question 14 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the
voice is male or female a Data Science activity?

 
True

 
False
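
To make "extracting frequencies from audio signals" concrete, the sketch below estimates the dominant frequency of a sampled signal with a naive discrete Fourier transform. The `dominant_frequency` helper, the synthetic 120 Hz tone, and the 1000 Hz sample rate are assumptions for illustration; an actual voice classifier would need labelled recordings and more features than a single peak frequency.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC bin of a naive DFT."""
    n = len(samples)
    best_bin, best_magnitude = 0, 0.0
    # Only bins below the Nyquist limit are meaningful for a real-valued signal.
    for k in range(1, n // 2):
        coefficient = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        if abs(coefficient) > best_magnitude:
            best_bin, best_magnitude = k, abs(coefficient)
    return best_bin * sample_rate / n


# Synthetic stand-in for a recording: a pure 120 Hz sine sampled at 1000 Hz.
sample_rate = 1000
tone = [math.sin(2 * math.pi * 120 * t / sample_rate) for t in range(200)]
freq = dominant_frequency(tone, sample_rate)
print(freq)
```

The naive transform is quadratic in the number of samples, which is fine for a short illustration; longer signals would normally use a fast Fourier transform.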

Question 15 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are negatively correlated

 
The features are independent

 
The features are positively correlated
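
The scatterplot itself is not reproduced in this record. As a general illustration of how the direction of a relationship between two features can be quantified, the sketch below computes the Pearson correlation coefficient. The `pearson_r` helper and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical data: one feature pair rising together, one moving in opposite directions.
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]
print(pearson_r(xs, rising), pearson_r(xs, falling))
```

A coefficient near +1 indicates a positive linear relationship and one near -1 a negative one. A value near 0 only indicates the absence of a linear relationship, which is weaker than statistical independence.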

Question 16 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Diagnostic Analytics

 
Predictive Analytics

 
Descriptive Analytics

 
Prescriptive Analytics

Question 17 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data quantity

 
Data Security

 
Data Availability

 
All of these

Question 18 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
Why did it happen?

 
What happened?

 
What will happen?

 
What should I do?

Question 19 0.25
/ 0.25 pts

Big data is defined as ______________?


 
Collections of datasets whose volume, velocity or variety is so large that
it is difficult to store, manage, process and analyze the data using
traditional databases and data processing tools

 
The process of making machines capable of mimicking human behavior

 
An engineering discipline that is concerned with all aspects of software
production.

 
The process of extracting and creating information from raw data by
using different techniques

Question 20 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Image & Speech Recognition

 
Online Price Comparison

 
Recommendation Systems

 
Privacy Checker

Quiz Score:
5 out of 5
03/12/2022, 22:49 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1

Attempt History
Attempt Time Score
LATEST Attempt 1
10 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 3 at 22:49
This attempt took 10 minutes.

Question 1 0.25
/ 0.25 pts

_____________ is the practice of designing and building systems for
collecting, storing, and analyzing data at scale.

 
Data Analysis

 
Data Mining

 
Machine Learning

 
Data engineering


Question 2 0.25
/ 0.25 pts

Which one of these belongs to a Data Science career?

 
Data Creator

 
Data Administrator

 
None

 
Mathematical Analyst

Question 3 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
False

 
True

Question 4 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
All of the listed


 
High level understanding of Technical know-how specific to the domain.

 
Client Requirement understanding

Question 5 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Organization related

 
All three

 
Technology related

 
Data related

Question 6 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 7 0.25
/ 0.25 pts

A data scientist is not responsible for


 
Data analytics

 
Data manipulation

 
Data mining

 
Building continuous data stream

Question 8 0.25
/ 0.25 pts

In which phase are duplicates (of the data) removed? Choose the
best possible answer.

 
Data Collection

 
Data Requirements

 
Data Preparation

 
Data Understanding

Question 9 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Collection

 
Data Cleaning

 
Data Understanding


 
Data Preparation

Question 10 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Writing

 
Data Wrangling

 
Data Label

 
Data Annotation

Question 11 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are negatively correlated

 
The features are positively correlated


 
The features are independent

Question 12 0.25
/ 0.25 pts

Suppose you visit a doctor for a fever and the doctor asks questions such
as: were you exposed to rain or a cold climate, did you have contact with
a sick person, or did you eat food from outside? The doctor then reaches a
conclusion based on your responses. This scenario is analogous to which
stage of data analytics?

 
Descriptive

 
Predictive

 
Prescriptive

 
Diagnostic

Question 13 0.25
/ 0.25 pts

While playing cards, predicting that the next card will be a joker is

 
Probability Calculation activity

 
Machine Learning Activity

 
Data Science Activity

 
None of the above


Question 14 0.25
/ 0.25 pts

Among the following, which user role is more concerned with
non-functional requirements such as security, fault tolerance and
efficiency in data processing applications?

 
Data Engineer

 
Data Analyst

 
Business Intelligence Analyst

 
Data Scientist

Question 15 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this trend. Which of
the following ‘analytical’ approaches would you suggest?

 
Predictive analytics

 
Descriptive analytics

 
None of the above

 
Diagnostic analytics

Question 16 0.25
/ 0.25 pts


Which of the following are the main challenges in Data Science?

 
Data Availability

 
Data Security

 
Data quantity

 
All of these

Question 17 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:

 
Pattern Recognition

 
Statistical Analysis

 
Cognitive Analytics

 
Root Cause Analysis

Question 18 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Variety


 
Volume

Question 19 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Image & Speech Recognition

 
Online Price Comparison

 
Privacy Checker

 
Recommendation Systems

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What happened?

 
What will happen?

 
What should I do?

 
Why did it happen?

Quiz Score:
5 out of 5

QUIZ -1

Attempt History
Attempt Time Score
LATEST Attempt 1
26 minutes 4.25 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


4.25 out of 5
Submitted Dec 4 at 13:06
This attempt took 26 minutes.

Question 1 0.25
/ 0.25 pts

How is data science understood in terms of interpretability and
accuracy?

 
low Accuracy, high Interpretability

 
High Accuracy, Low Interpretability

Question 2 0.25
/ 0.25 pts
What is the process of training machine learning models, evaluating
their performance, and using them to make predictions?

 
Predictive Modelling

 
Feature Engineering

 
Visualization

 
Data Exploration

Question 3 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
Client Requirement understanding

 
High level understanding of Technical know-how specific to the domain.

 
All of the listed

Question 4 0.25
/ 0.25 pts

Which one of these belongs to a Data Science career?

 
Data Creator

 
Mathematical Analyst

 
Data Administrator

 
None

Question 5 0.25
/ 0.25 pts

Data science is a ___________ Analysis.

 
Standard & dynamic

 
Predictive & Prescriptive

 
Descriptive & Static

 
Static & Comparative.

Incorrect Question 6 0
/ 0.25 pts

___________ comprises the strategies and technologies used by
enterprises for data analysis and management.

 
None of the Above.

 
Business Intelligence

 
Artificial Intelligence

 
Machine Learning

Incorrect Question 7 0
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Wrangling

 
Data Label

 
Data Annotation

 
Data Writing

Question 8 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Creation of new fiscal policy

 
Fraud detection and prevention system in a bank

 
Product recommender systems

 
Targeted advertising as per customer's need

Question 9 0.25
/ 0.25 pts

In which phase are duplicates (of the data) removed? Choose the
best possible answer.

 
Data Requirements

 
Data Understanding

 
Data Collection

 
Data Preparation

Incorrect Question 10 0
/ 0.25 pts

Exploratory data analysis does not help in

 
Derivation of new attributes

 
Finding statistical estimates of a variable

 
Univariate and bivariate variable analysis

 
Finding out the data type of a variable
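
As a rough illustration of the kind of inspection exploratory data analysis involves, the sketch below reports the data type and a few summary statistics for a single variable using Python's standard library. The sample values and the `summary` dictionary layout are hypothetical and are not part of the quiz.

```python
import statistics

# Hypothetical observations of one numeric variable.
values = [12, 15, 11, 19, 15]

summary = {
    "type": type(values[0]).__name__,      # data type of the variable
    "mean": statistics.mean(values),       # central tendency
    "median": statistics.median(values),   # robust central tendency
    "stdev": statistics.stdev(values),     # sample standard deviation
}
print(summary)
```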

Question 11 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this trend. Which of
the following ‘analytical’ approaches would you suggest?

 
None of the above

 
Descriptive analytics

 
Diagnostic analytics

 
Predictive analytics

Question 12 0.25
/ 0.25 pts

Among the following, which user role is more concerned with
non-functional requirements such as security, fault tolerance and
efficiency in data processing applications?

 
Data Engineer

 
Business Intelligence Analyst

 
Data Analyst

 
Data Scientist

Question 13 0.25
/ 0.25 pts

Suppose you visit a doctor for a fever and the doctor asks questions such
as: were you exposed to rain or a cold climate, did you have contact with
a sick person, or did you eat food from outside? The doctor then reaches a
conclusion based on your responses. This scenario is analogous to which
stage of data analytics?

 
Descriptive

 
Diagnostic

 
Predictive

 
Prescriptive

Question 14 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the
voice is male or female a Data Science activity?

 
True

 
False

Question 15 0.25
/ 0.25 pts

While playing cards, predicting that the next card will be a joker is

 
Data Science Activity

 
Machine Learning Activity

 
Probability Calculation activity

 
None of the above

Question 16 0.25
/ 0.25 pts

The type of problem in statistics will be

 
None of these

 
Semi-Structured

 
Well Structured

 
Unstructured

Question 17 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Variety

 
Volume

Question 18 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data quantity

 
Data Availability

 
Data Security

 
All of these

Question 19 0.25
/ 0.25 pts
Which of the following is not an application of Data Science?

 
Online Price Comparison

 
Image & Speech Recognition

 
Privacy Checker

 
Recommendation Systems

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What happened?

 
What should I do?

 
Why did it happen?

 
What will happen?

Quiz Score:
4.25 out of 5
03/12/2022, 23:19 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1

Attempt History
Attempt Time Score
LATEST Attempt 1
30 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 3 at 23:18
This attempt took 30 minutes.

Question 1 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
All of the listed

 
High level understanding of Technical know-how specific to the domain.


 
Client Requirement understanding

Question 2 0.25
/ 0.25 pts

___________ is considered to be a sub-field of, or one of the tools of, AI,
which provides machines with the capability of learning from experience.

 
Machine Learning

 
Data analysis

 
Data Science

 
Artificial Intelligence

Question 3 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Standard & dynamic

 
Predictive & Prescriptive

 
Static & comparative

 
Descriptive & Static


Question 4 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by
enterprises for data analysis and management.

 
Artificial Intelligence

 
Business Intelligence

 
None of the Above.

 
Machine Learning

Question 5 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Technology related

 
All three

 
Organization related

 
Data related

Question 6 0.25
/ 0.25 pts

Data science is a ___________ Analysis.


 
Static & Comparative.

 
Standard & dynamic

 
Predictive & Prescriptive

 
Descriptive & Static

Question 7 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Creation of new fiscal policy

 
Product recommender systems

 
Fraud detection and prevention system in a bank

 
Targeted advertising as per customer's need

Question 8 0.25
/ 0.25 pts

Data preprocessing improves data quality and makes data mining
algorithms efficient and effective. What are the data preprocessing tasks
along with Data cleaning?

 
Data Reduction

 
Data Transformation


 
Data Integration

 
All of the above

Question 9 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Cleaning

 
Data Collection

 
Data Preparation

 
Data Understanding

Question 10 0.25
/ 0.25 pts

In which phase are duplicates (of the data) removed? Choose the best
possible answer.

 
Data Preparation

 
Data Understanding

 
Data Collection

 
Data Requirements


Question 11 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic
process?

 
Business Intelligence

 
Data Science

 
Data Warehouse

 
Data Mining

Question 12 0.25
/ 0.25 pts

While playing cards, predicting that the next card will be a joker is

 
Probability Calculation activity

 
Data Science Activity

 
Machine Learning Activity

 
None of the above

Question 13 0.25
/ 0.25 pts


Among the following, which user role is more concerned with
non-functional requirements such as security, fault tolerance and
efficiency in data processing applications?

 
Data Engineer

 
Data Analyst

 
Business Intelligence Analyst

 
Data Scientist

Question 14 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are independent

 
The features are positively correlated

 
The features are negatively correlated


Question 15 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the
voice is male or female a Data Science activity?

 
True

 
False

Question 16 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Security

 
Data quantity

 
All of these

 
Data Availability

Question 17 0.25
/ 0.25 pts

The type of problem in statistics will be

 
Unstructured


 
Well Structured

 
Semi-Structured

 
None of these

Question 18 0.25
/ 0.25 pts

Big data is defined as ______________?

 
An engineering discipline that is concerned with all aspects of software
production.

 
Collections of datasets whose volume, velocity or variety is so large that it
is difficult to store, manage, process and analyze the data using traditional
databases and data processing tools

 
The process of extracting and creating information from raw data by using
different techniques

 
The process of making machines capable of mimicking human behavior

Question 19 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as:


 
Cognitive Analytics

 
Statistical Analysis

 
Pattern Recognition

 
Root Cause Analysis

Question 20 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Prescriptive Analytics

 
Diagnostic Analytics

 
Predictive Analytics

 
Descriptive Analytics

Quiz Score:
5 out of 5

12/3/22, 11:35 PM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1

Attempt History
Attempt Time Score
LATEST Attempt 1 2 minutes 5 out of 5

 Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz: 5 out of 5


Submitted Dec 3 at 23:34
This attempt took 2 minutes.

Question 1 0.25 / 0.25 pts


_____________ is the practice of designing and building systems for collecting, storing, and
analyzing data at scale.

  Data Mining

  Data engineering

  Data Analysis

  Machine Learning

Question 2 0.25 / 0.25 pts

Which one of these belongs to a Data Science career?

  Data Creator

  Mathematical Analyst

  None

  Data Administrator


Question 3 0.25 / 0.25 pts

What is the process of training machine learning models, evaluating their performance, and using
them to make predictions?

  Feature Engineering

  Visualization

  Predictive Modelling

  Data Exploration

Question 4 0.25 / 0.25 pts

Business Intelligence follows _________________ process.

  Descriptive & Static

  Standard & dynamic

  Static & comparative


  Predictive & Prescriptive

Question 5 0.25 / 0.25 pts

Data science challenges can be categorized as

  All three

  Technology related

  Data related

  Organization related

Question 6 0.25 / 0.25 pts

Data science is an Interdisciplinary field.

  True

  False


Question 7 0.25 / 0.25 pts

Exploratory data analysis does not help in

  Univariate and bivariate variable analysis

  Derivation of new attributes

  Finding statistical estimates of a variable

  Finding out the data type of a variable

Question 8 0.25 / 0.25 pts

Which one refers to the labelling of the data?   

  Data Wrangling

  Data Writing

  Data Annotation


  Data Label

Question 9 0.25 / 0.25 pts

In which phase are duplicates (of the data) removed? Choose the best possible answer.

  Data Understanding

  Data Requirements

  Data Preparation

  Data Collection

Question 10 0.25 / 0.25 pts

A person goes to a supermarket to purchase ingredients for making a meal. Which phase of the data
science process does the analogy fit correctly? (Choose the best answer)

  Data Understanding


  Data Preparation

  Data Collection

  Data Cleaning

Question 11 0.25 / 0.25 pts

The scatterplot implies that

  The features are negatively correlated

  The features are positively correlated

  The features are independent


Question 12 0.25 / 0.25 pts

Among the following, which user role is more concerned with non-functional requirements such as
security, fault tolerance and efficiency in data processing applications?

  Data Analyst

  Data Engineer

  Business Intelligence Analyst

  Data Scientist

Question 13 0.25 / 0.25 pts

Which among the following advocates an exploratory and dynamic process?

  Business Intelligence

  Data Mining


  Data Science

  Data Warehouse

Question 14 0.25 / 0.25 pts

While playing cards, predicting that the next card will be a joker is

  Machine Learning Activity

  Probability Calculation activity

  Data Science Activity

  None of the above

Question 15 0.25 / 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that the Tamil version of the
movie was viewed the most out of the 3 dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this trend. Which of the following
‘analytical’ approaches would you suggest?


  Predictive analytics

  Descriptive analytics

  None of the above

  Diagnostic analytics

Question 16 0.25 / 0.25 pts

Which of the following is also known as “Statistical Analysis”?

  Predictive Analytics

  Diagnostic Analytics

  Prescriptive Analytics

  Descriptive Analytics

Question 17 0.25 / 0.25 pts


Which of the following is not an application of Data Science?

  Privacy Checker

  Image & Speech Recognition

  Recommendation Systems

  Online Price Comparison

Question 18 0.25 / 0.25 pts

The type of problem in statistics will be

  Semi-Structured

  Unstructured

  Well Structured

  None of these


Question 19 0.25 / 0.25 pts

Which of the following are the main challenges in Data Science?

  Data Availability

  All of these

  Data quantity

  Data Security

Question 20 0.25 / 0.25 pts

Diagnostic Analysis is often referred to as:

  Cognitive Analytics

  Pattern Recognition

  Root Cause Analysis

  Statistical Analysis


Quiz Score: 5 out of 5

04/12/2022, 08:52 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)

QUIZ -1

Attempt History
Attempt Time Score
LATEST Attempt 1
14 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 4 at 8:52
This attempt took 14 minutes.

Question 1 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
True

 
False

Question 2 0.25
/ 0.25 pts


Data science is a ___________ Analysis.

 
Standard & dynamic

 
Descriptive & Static

 
Static & Comparative.

 
Predictive & Prescriptive

Question 3 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Data related

 
Technology related

 
All three

 
Organization related

Question 4 0.25
/ 0.25 pts

___________ comprises the strategies and technologies used by
enterprises for data analysis and management.

 
Business Intelligence

 
Artificial Intelligence

 
None of the Above.


 
Machine Learning

Question 5 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Predictive & Prescriptive

 
Descriptive & Static

 
Static & comparative

 
Standard & dynamic

Question 6 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 7 0.25
/ 0.25 pts

Exploratory data analysis does not help in

 
Univariate and bivariate variable analysis


 
Derivation of new attributes

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

Question 8 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Targeted advertising as per customer's need

 
Creation of new fiscal policy

 
Product recommender systems

 
Fraud detection and prevention system in a bank

Question 9 0.25
/ 0.25 pts

In which phase are duplicates (of the data) removed? Choose the
best possible answer.

 
Data Preparation

 
Data Understanding

 
Data Requirements

 
Data Collection


Question 10 0.25
/ 0.25 pts

Which one refers to the labelling of the data?   

 
Data Label

 
Data Wrangling

 
Data Annotation

 
Data Writing

Question 11 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the
voice is male or female a Data Science activity?

 
True

 
False

Question 12 0.25
/ 0.25 pts

Among the following, which user role is more concerned with
non-functional requirements such as security, fault tolerance and
efficiency in data processing applications?

 
Data Scientist

 
Data Engineer


 
Business Intelligence Analyst

 
Data Analyst

Question 13 0.25
/ 0.25 pts

While playing cards, predicting that the next card will be a joker is

 
None of the above

 
Probability Calculation activity

 
Machine Learning Activity

 
Data Science Activity

Question 14 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic
process?

 
Business Intelligence

 
Data Mining

 
Data Warehouse

 
Data Science

Question 15 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed
that the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this trend. Which of
the following ‘analytical’ approaches would you suggest?

 
None of the above

 
Diagnostic analytics

 
Descriptive analytics

 
Predictive analytics

Question 16 0.25
/ 0.25 pts

The type of problem in statistics will be

 
None of these

 
Well Structured

 
Unstructured

 
Semi-Structured

Question 17 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?


 
What should I do?

 
What happened?

 
What will happen?

 
Why did it happen?

Question 18 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data quantity

 
Data Security

 
All of these

 
Data Availability

Question 19 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Variety

 
Volume


Question 20 0.25
/ 0.25 pts

Big data is defined as ______________?

 
The process of making machines capable of mimicking human behavior

 
An engineering discipline that is concerned with all aspects of software
production.

 
The process of extracting and creating information from raw data by
using different techniques

 
Collections of datasets whose volume, velocity or variety is so large that
it is difficult to store, manage, process and analyze the data using
traditional databases and data processing tools

Quiz Score:
5 out of 5


Attempt History
Attempt Time Score
LATEST Attempt 1
26 minutes 4.25 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


4.25 out of 5
Submitted Dec 4 at 13:06
This attempt took 26 minutes.

Question 1 0.25
/ 0.25 pts

How is data science understood in terms of interpretability and Accuracy?

 
low Accuracy, high Interpretability

 
High Accuracy, Low Interpretability

Question 2 0.25
/ 0.25 pts
What is the process of training machine learning models, evaluating
their performance, and using them to make predictions?

 
Predictive Modelling

 
Feature Engineering

 
Visualization

 
Data Exploration

Question 3 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
Client Requirement understanding

 
High level understanding of Technical know-how specific to the domain.

 
All of the listed

Question 4 0.25
/ 0.25 pts

Which one of these belongs to a Data Science career?

 
Data Creator

 
Mathematical Analyst

 
Data Administrator

 
None

Question 5 0.25
/ 0.25 pts

Data science is a ___________ Analysis.

 
Standard & dynamic

 
Predictive & Prescriptive

 
Descriptive & Static

 
Static & Comparative.

Incorrect Question 6 0
/ 0.25 pts

___________ comprises the strategies and technologies used by
enterprises for data analysis and management.

 
None of the Above.

 
Business Intelligence

 
Artificial Intelligence

 
Machine Learning

Incorrect Question 7 0
/ 0.25 pts

Which one refers to the labelling of the data?

 
Data Wrangling

 
Data Label

 
Data Annotation

 
Data Writing

Question 8 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Creation of new fiscal policy

 
Fraud detection and prevention system in a bank

 
Product recommender systems

 
Targeted advertising as per customer's need

Question 9 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the
best possible answer.

 
Data Requirements

 
Data Understanding

 
Data Collection

 
Data Preparation

Incorrect Question 10 0
/ 0.25 pts

Exploratory data analysis does not help in

 
Derivation of new attributes

 
Finding statistical estimates of a variable

 
Univariate and bivariate variable analysis

 
Finding out the data type of a variable

Question 11 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that
the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this trend.
Which of the following analytical approaches would you suggest?

 
None of the above

 
Descriptive analytics

 
Diagnostic analytics

 
Predictive analytics

Question 12 0.25
/ 0.25 pts

Among the following, which user role is more concerned with
non-functional requirements such as security, fault tolerance and
efficiency in data processing applications?

 
Data Engineer

 
Business Intelligence Analyst

 
Data Analyst

 
Data Scientist

Question 13 0.25
/ 0.25 pts

Suppose you visit a doctor for a fever and the doctor asks you
questions such as: Were you exposed to rain or a cold climate? Did you
have contact with a sick person? Did you eat food from outside? The
doctor then reaches a conclusion based on your responses. This
scenario is analogous to which stage of data analytics?

 
Descriptive

 
Diagnostic

 
Predictive

 
Prescriptive

Question 14 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the
voice is male or female a Data Science Activity?

 
True

 
False

Question 15 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is

 
Data Science Activity

 
Machine Learning Activity

 
Probability Calculation activity

 
None of the above

Question 16 0.25
/ 0.25 pts

Type of problem in statistics will be

 
None of these

 
Semi-Structured

 
Well Structured

 
Unstructured

Question 17 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Variety

 
Volume

Question 18 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data quantity

 
Data Availability

 
Data Security

 
All of these

Question 19 0.25
/ 0.25 pts
Which of the following is not an application of Data Science?

 
Online Price Comparison

 
Image & Speech Recognition

 
Privacy Checker

 
Recommendation Systems

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What happened?

 
What should I do?

 
Why did it happen?

 
What will happen?

Quiz Score:
4.25 out of 5
12/3/22, 11:16 PM QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)


Attempt History
Attempt Time Score
LATEST Attempt 1
4 minutes 5 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


5 out of 5
Submitted Dec 3 at 23:16
This attempt took 4 minutes.

Question 1 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
Organization related

 
Data related

 
All three

 
Technology related


Question 2 0.25
/ 0.25 pts

Data science is a ___________ Analysis.

 
Predictive & Prescriptive

 
Descriptive & Static

 
Standard & dynamic

 
Static & Comparative.

Question 3 0.25
/ 0.25 pts

_____________ is the practice of designing and building systems for
collecting, storing, and analyzing data at scale.

 
Data Analysis

 
Data engineering

 
Machine Learning

 
Data Mining

Question 4 0.25
/ 0.25 pts

What is the process of training machine learning models, evaluating
their performance, and using them to make predictions?


 
Data Exploration

 
Visualization

 
Predictive Modelling

 
Feature Engineering

Question 5 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
True

 
False

Question 6 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
High level understanding of Technical know-how specific to the domain.

 
Understanding of domain concepts

 
All of the listed

 
Client Requirement understanding

Question 7 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Collection

 
Data Cleaning

 
Data Preparation

 
Data Understanding

Question 8 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Product recommender systems

 
Creation of new fiscal policy

 
Fraud detection and prevention system in a bank

 
Targeted advertising as per customer's need

Question 9 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the
best possible answer.

 
Data Preparation


 
Data Understanding

 
Data Requirements

 
Data Collection

Question 10 0.25
/ 0.25 pts

Data preprocessing improves data quality and makes data mining
algorithms efficient and effective. Which of the following are data
preprocessing tasks, along with data cleaning?

 
All of the above

 
Data Integration

 
Data Transformation

 
Data Reduction

Question 11 0.25
/ 0.25 pts

Suppose you visit a doctor for a fever and the doctor asks you
questions such as: Were you exposed to rain or a cold climate? Did you
have contact with a sick person? Did you eat food from outside? The
doctor then reaches a conclusion based on your responses. This
scenario is analogous to which stage of data analytics?

 
Prescriptive

 
Diagnostic


 
Predictive

 
Descriptive

Question 12 0.25
/ 0.25 pts

Among the following, which user role is more concerned with
non-functional requirements such as security, fault tolerance and
efficiency in data processing applications?

 
Business Intelligence Analyst

 
Data Analyst

 
Data Scientist

 
Data Engineer

Question 13 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that
the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this trend.
Which of the following analytical approaches would you suggest?

 
Predictive analytics

 
Descriptive analytics

 
Diagnostic analytics


 
None of the above

Question 14 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are independent

 
The features are negatively correlated

 
The features are positively correlated

Question 15 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is

 
Machine Learning Activity

 
Probability Calculation activity

 
Data Science Activity


 
None of the above

Question 16 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Descriptive Analytics

 
Predictive Analytics

 
Diagnostic Analytics

 
Prescriptive Analytics

Question 17 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Recommendation Systems

 
Privacy Checker

 
Online Price Comparison

 
Image & Speech Recognition

Question 18 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?



 
Vision

 
Velocity

 
Variety

 
Volume

Question 19 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
All of these

 
Data quantity

 
Data Security

 
Data Availability

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What happened?

 
Why did it happen?

 
What should I do?

 
What will happen?


Quiz Score:
5 out of 5


Attempt History
Attempt Time Score
LATEST Attempt 1
3 minutes 4.75 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


4.75 out of 5
Submitted Dec 3 at 23:10
This attempt took 3 minutes.

Incorrect Question 1 0
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Descriptive & Static

 
Predictive & Prescriptive

 
Standard & dynamic

 
Static & comparative

Question 2 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
Understanding of domain concepts

 
High level understanding of Technical know-how specific to the domain.

 
All of the listed

 
Client Requirement understanding

Question 3 0.25
/ 0.25 pts

___________ is considered to be a sub-field of or one of the tools of AI, which provides machines with the
capability of learning from experience.

 
Artificial Intelligence

 
Machine Learning

 
Data Science

 
Data analysis

Question 4 0.25
/ 0.25 pts

_________________ is the science of collecting, analyzing, presenting, and interpreting data. Its objective is to
draw conclusions about the population.

 
Business Intelligence

 
Data Analytics

 
Computational Intelligence

 
Statistics

Question 5 0.25
/ 0.25 pts

Data science is an Interdisciplinary field.

 
True

 
False

Question 6 0.25
/ 0.25 pts

Which one of these belongs to a Data Science career?

 
Data Creator

 
Data Administrator

 
None

 
Mathematical Analyst

Question 7 0.25
/ 0.25 pts

Exploratory data analysis does not help in

 
Univariate and bivariate variable analysis

 
Finding out the data type of a variable

 
Finding statistical estimates of a variable

 
Derivation of new attributes

Question 8 0.25
/ 0.25 pts

In which phase are the duplicates (of the data) removed? Choose the best possible answer.

 
Data Requirements

 
Data Preparation

 
Data Understanding

 
Data Collection

Question 9 0.25
/ 0.25 pts

A person goes to a supermarket to purchase ingredients for making a meal. Which phase of the data
science process does the analogy fit correctly? (Choose the best answer)

 
Data Understanding

 
Data Cleaning

 
Data Collection

 
Data Preparation

Question 10 0.25
/ 0.25 pts

A data scientist is not responsible for


 
Data analytics

 
Data manipulation

 
Data mining

 
Building continuous data stream

Question 11 0.25
/ 0.25 pts

Which among the following advocates an exploratory and dynamic process?

 
Business Intelligence

 
Data Science

 
Data Mining

 
Data Warehouse

Question 12 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that the Tamil version of the movie was
viewed the most out of the 3 dubbed languages. The data science team plans to develop a recommendation
system for future releases based on this trend. Which of the following analytical approaches would
you suggest?

 
Predictive analytics

 
Descriptive analytics

 
None of the above

 
Diagnostic analytics

Question 13 0.25
/ 0.25 pts

Among the following, which user role is more concerned with non-functional requirements such as security,
fault tolerance and efficiency in data processing applications?

 
Data Analyst

 
Data Scientist

 
Data Engineer

 
Business Intelligence Analyst

Question 14 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is

 
None of the above

 
Machine Learning Activity

 
Data Science Activity

 
Probability Calculation activity

Question 15 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are positively correlated

 
The features are independent

 
The features are negatively correlated

Question 16 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as

 
Cognitive Analytics

 
Pattern Recognition

 
Statistical Analysis

 
Root Cause Analysis

Question 17 0.25
/ 0.25 pts

Which of the following are the main challenges in Data Science?

 
Data Availability

 
Data quantity

 
All of these

 
Data Security

Question 18 0.25
/ 0.25 pts

Which of the following is not a characteristic of Big Data?

 
Vision

 
Velocity

 
Volume

 
Variety

Question 19 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What should I do?

 
Why did it happen?

 
What will happen?

 
What happened?

Question 20 0.25
/ 0.25 pts

Big data is defined as ______________?

 
The process of making machines capable of mimicking human behavior

 
The process of extracting and creating information from raw data by using different techniques

 
An engineering discipline that is concerned with all aspects of software production.

 
Collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process
and analyze the data using traditional databases and data processing tools

Quiz Score:
4.75 out of 5
04/12/2022, 01:56 QUIZ -1: Introduction to Data Science (S1-22_DSECLZG523)


Attempt History
Attempt Time Score
LATEST Attempt 1
20 minutes 4.75 out of 5


Correct answers will be available Dec 7 at 19:00 - Dec 8 at 19:00.

Score for this quiz:


4.75 out of 5
Submitted Dec 4 at 1:47
This attempt took 20 minutes.

Question 1 0.25
/ 0.25 pts

Business Intelligence follows _________________ process.

 
Predictive & Prescriptive

 
Static & comparative

 
Descriptive & Static

 
Standard & dynamic


Question 2 0.25
/ 0.25 pts

Data science challenges can be categorized as

 
All three

 
Data related

 
Organization related

 
Technology related

Question 3 0.25
/ 0.25 pts

Data Engineers are also known as Data Miners.

 
True

 
False

Question 4 0.25
/ 0.25 pts

_________________ is the science of collecting, analyzing, presenting,
and interpreting data. Its objective is to draw conclusions about the population.

 
Business Intelligence

 
Statistics

 
Data Analytics


 
Computational Intelligence

Question 5 0.25
/ 0.25 pts

What is the process of training machine learning models, evaluating
their performance, and using them to make predictions?

 
Feature Engineering

 
Data Exploration

 
Visualization

 
Predictive Modelling

Question 6 0.25
/ 0.25 pts

Data Science domain knowledge needs

 
High level understanding of Technical know-how specific to the domain.

 
Client Requirement understanding

 
All of the listed

 
Understanding of domain concepts

Question 7 0.25
/ 0.25 pts


A person goes to a supermarket to purchase ingredients for making a
meal. Which phase of the data science process does the analogy fit
correctly? (Choose the best answer)

 
Data Cleaning

 
Data Collection

 
Data Preparation

 
Data Understanding

Question 8 0.25
/ 0.25 pts

Which one refers to the labelling of the data?

 
Data Writing

 
Data Label

 
Data Annotation

 
Data Wrangling

Question 9 0.25
/ 0.25 pts

A data scientist is not responsible for

 
Data analytics

 
Data manipulation


 
Data mining

 
Building continuous data stream

Question 10 0.25
/ 0.25 pts

Which of these is not an example of the application of data science?

 
Product recommender systems

 
Fraud detection and prevention system in a bank

 
Targeted advertising as per customer's need

 
Creation of new fiscal policy

Question 11 0.25
/ 0.25 pts

The scatterplot implies that

 
The features are negatively correlated


 
The features are positively correlated

 
The features are independent

Question 12 0.25
/ 0.25 pts

While playing cards, predicting the next card to be a joker is

 
Data Science Activity

 
Probability Calculation activity

 
Machine Learning Activity

 
None of the above

Question 13 0.25
/ 0.25 pts

Suppose you visit a doctor for a fever and the doctor asks you
questions such as: Were you exposed to rain or a cold climate? Did you
have contact with a sick person? Did you eat food from outside? The
doctor then reaches a conclusion based on your responses. This
scenario is analogous to which stage of data analytics?

 
Diagnostic

 
Prescriptive

 
Descriptive

 
Predictive


Question 14 0.25
/ 0.25 pts

Recently, the movie RRR was released on an OTT platform. It was observed that
the Tamil version of the movie was viewed the most out of the 3
dubbed languages. The data science team plans to develop a
recommendation system for future releases based on this trend.
Which of the following analytical approaches would you suggest?

 
Descriptive analytics

 
Diagnostic analytics

 
Predictive analytics

 
None of the above

Question 15 0.25
/ 0.25 pts

Is extracting frequencies from audio signals to determine whether the
voice is male or female a Data Science Activity?

 
True

 
False

Question 16 0.25
/ 0.25 pts

Diagnostic Analysis is often referred to as


 
Cognitive Analytics

 
Statistical Analysis

 
Pattern Recognition

 
Root Cause Analysis

Question 17 0.25
/ 0.25 pts

Which of the following is not an application of Data Science?

 
Recommendation Systems

 
Privacy Checker

 
Online Price Comparison

 
Image & Speech Recognition

Incorrect Question 18 0
/ 0.25 pts

Type of problem in statistics will be

 
Well Structured

 
Unstructured

 
None of these

 
Semi-Structured


Question 19 0.25
/ 0.25 pts

Which of the following is also known as “Statistical Analysis”?

 
Prescriptive Analytics

 
Predictive Analytics

 
Diagnostic Analytics

 
Descriptive Analytics

Question 20 0.25
/ 0.25 pts

Which of the following questions is answered by Predictive Analytics?

 
What happened?

 
Why did it happen?

 
What will happen?

 
What should I do?

Quiz Score:
4.75 out of 5
