
CS8091 Big Data Analytics MCQ - Regulations 2017

NADAR SARASWATHI COLLEGE OF ENGINEERING AND TECHNOLOGY, THENI.


Course/Branch : B.E / CSE    Year / Semester : IVth YR / VII Sem    Format No. : NAC/TLP-07a.13
Subject Code : CS8091    Subject Name : Big Data Analytics    Rev. No. : 02
Unit No : 01    Unit Name : Introduction to Big Data    Date : 30.09.2020

OBJECTIVE TYPE QUESTION BANK


S. No. | Objective Questions (MCQ / True or False / Fill up with Choices) | BTL
1. Which of the following is not an example of social media? (BTL: L3)
a. Twitter
b. Google
c. Instagram
d. YouTube
2. By 2025, the volume of digital data will increase to (BTL: L1)
a. TB
b. YB
c. ZB
d. EB
3. What is needed to draw insights for business? (BTL: L5)
a. Collecting the data
b. Storing the data
c. Analysing the data
d. All of the above
4. Facebook uses "Big Data" to implement the concept of Flashback. Is this true or false? (BTL: L3)
a. TRUE
b. FALSE
5. The process of describing data that is too huge and complex to store and process is known as (BTL: L1)
a. Analytics
b. Data mining
c. Big Data
d. Data Warehouse
6. Data generated from online transactions is an example of the volume of big data. Is this true or false? (BTL: L3)
a. TRUE
b. FALSE
7. Velocity is the speed at which the data is processed. (BTL: L4)
a. TRUE
b. FALSE
8. _____________ data has a structure but cannot be stored in a database. (BTL: L2)
a. Structured
b. Semi-Structured
c. Unstructured
d. None of these
9. ____________ refers to the ability to turn your data into something useful for business. (BTL: L1)
a. Velocity
b. Variety
c. Value
d. Volume
Prepared By: Udhaya Kumar. R AP/ CSE., Page 1 of 6
10. Value tells the trustworthiness of data in terms of quality and accuracy. (BTL: L3)
a. TRUE
b. FALSE
11. GFS consists of a ____________ Master and ___________ Chunk Servers. (BTL: L1)
a. Single, Single
b. Multiple, Single
c. Single, Multiple
d. Multiple, Multiple
12. Files are divided into ____________ sized chunks. (BTL: L2)
a. Static
b. Dynamic
c. Fixed
d. Variable
13. ____________ is an open source framework for storing data and running applications on clusters of commodity hardware. (BTL: L1)
a. HDFS
b. Hadoop
c. MapReduce
d. Cloud
14. How much data (in MB) does HDFS store in each block, a size that can be scaled at any time? (BTL: L2)
a. 32
b. 64
c. 128
d. 256
15. Hadoop MapReduce allows you to perform distributed parallel processing on large volumes of data quickly and efficiently. Is this statement true or false? (BTL: L4)
a. TRUE
b. FALSE
16. Hortonworks was introduced by Cloudera and owned by Yahoo. (BTL: L1)
a. TRUE
b. FALSE
17. Hadoop YARN is used for cluster resource management in the Hadoop ecosystem. (BTL: L4)
a. TRUE
b. FALSE
18. Google introduced the MapReduce programming model in 2004. (BTL: L4)
a. TRUE
b. FALSE
19. The ______________ phase sorts the data and the ____________ phase creates logical clusters. (BTL: L2)
a. REDUCE, YARN
b. MAP, YARN
c. REDUCE, MAP
d. MAP, REDUCE

20. There is only one operation between mapping and reducing. Is this true or false? (BTL: L4)
a. TRUE
b. FALSE
21. __________ is a factor considered before adopting big data technology. (BTL: L3)
a. Validation
b. Verification
c. Data
d. Design
22. _________ analytics is used for improving supply chain management to optimize stock management, replenishment, and forecasting. (BTL: L3)
a. Descriptive
b. Diagnostic
c. Predictive
d. Prescriptive
23. Which among the following is not a data mining and analytical application? (BTL: L2)
a. Profile matching
b. Social network analysis
c. Facial recognition
d. Filtering
24. ________________ occurs as a result of data accessibility, data latency, data availability, or limits on bandwidth in relation to the size of inputs. (BTL: L1)
a. Computation-restricted throttling
b. Large data volumes
c. Data throttling
d. Benefits from data parallelization
25. As an example, an expectation of using a recommendation engine would be to increase same-customer sales by adding more items into the market basket. This is an example of (BTL: L2)
a. Lowering costs
b. Increasing revenues
c. Increasing productivity
d. Reducing risk
26. Which property lets a storage subsystem support massive data volumes of increasing size? (BTL: L5)
a. Extensibility
b. Fault tolerance
c. Scalability
d. High-speed I/O capacity
27. ______________ provides performance through distribution of data and fault tolerance through replication. (BTL: L3)
a. HDFS
b. PIG
c. HIVE
d. HADOOP

28. ______________ is a programming model for writing applications that can process Big Data in parallel on multiple nodes. (BTL: L1)
a. HDFS
b. MAPREDUCE
c. HADOOP
d. HIVE
29. _____________________ takes the grouped key-value paired data as input and runs a Reducer function on each one of them. (BTL: L2)
a. MAPPER
b. REDUCER
c. COMBINER
d. PARTITIONER
30. _______________ is a type of local Reducer that groups similar data from the map phase into identifiable sets. (BTL: L3)
a. MAPPER
b. REDUCER
c. COMBINER
d. PARTITIONER
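The roles named in questions 28 to 30 can be illustrated with a toy, single-process word count. This is only a sketch in plain Python, not Hadoop's API: the function names (`mapper`, `combiner`, `partitioner`, `reducer`, `run_job`), the byte-sum partitioning rule, and the sample input lines are all assumptions made for the example.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combiner(pairs):
    """Local reduce: pre-aggregate the pairs produced by a single map task."""
    partial = defaultdict(int)
    for key, value in pairs:
        partial[key] += value
    return list(partial.items())


def partitioner(key, num_reducers):
    """Pick the reducer for a key; the same key always maps to the same reducer."""
    return sum(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    """Reduce phase: collapse all values grouped under one key."""
    return key, sum(values)


def run_job(lines, num_reducers=2):
    # Shuffle step: one dict of key -> [values] per (hypothetical) reducer.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:  # each line stands in for one map task
        for key, value in combiner(mapper(line)):
            buckets[partitioner(key, num_reducers)][key].append(value)
    result = {}
    for bucket in buckets:  # each bucket stands in for one reduce task
        for key, values in bucket.items():
            word, total = reducer(key, values)
            result[word] = total
    return result


counts = run_job(["big data big insight", "data beats opinion"])
print(counts)
```

The combiner runs on the output of a single map task, so it only reduces the amount of data shuffled; the final totals still come from the reducer.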

31. While installing Hadoop, how many XML files are edited? List them. (BTL: L4)
i. core-site.xml
ii. hdfs-site.xml
iii. mapred-site.xml
iv. yarn-site.xml

32. Write the code for core-site.xml. (BTL: L6)

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Base directory for Hadoop's temporary files -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>D:\hadoop\temp</value>
  </property>
  <!-- URI of the default file system (the NameNode) -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:50071</value>
  </property>
</configuration>
33. Write the code for hdfs-site.xml. (BTL: L3)

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Number of replicas kept for each block -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- Local directory where the NameNode keeps its metadata -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/hadoop2.6.0/data/name</value>
    <final>true</final>
  </property>
  <!-- Local directory where the DataNode keeps its blocks -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/hadoop2.6.0/data/data</value>
    <final>true</final>
  </property>
</configuration>

34. Write the code for mapred-site.xml. (BTL: L3)

<?xml version="1.0"?>
<configuration>
  <!-- Run MapReduce jobs on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <!-- Classpath entries made available to MapReduce applications -->
  <property>
    <name>mapreduce.application.classpath</name>
    <value>/hadoop-2.6.0/share/hadoop/mapreduce/*,
      /hadoop-2.6.0/share/hadoop/mapreduce/lib/*,
      /hadoop-2.6.0/share/hadoop/common/*,
      /hadoop-2.6.0/share/hadoop/common/lib/*,
      /hadoop-2.6.0/share/hadoop/yarn/*,
      /hadoop-2.6.0/share/hadoop/yarn/lib/*,
      /hadoop-2.6.0/share/hadoop/hdfs/*,
      /hadoop-2.6.0/share/hadoop/hdfs/lib/*
    </value>
  </property>
</configuration>

35. Write the code for yarn-site.xml. (BTL: L3)

<?xml version="1.0"?>
<configuration>
  <!-- Auxiliary shuffle service needed by MapReduce on YARN -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <!-- NodeManager log and working directories -->
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>D:\hadoop\userlog</value>
    <final>true</final>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>D:\hadoop\temp\nm-localdir</value>
  </property>
  <property>
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>600</value>
  </property>
  <!-- Classpath entries made available to YARN applications -->
  <property>
    <name>yarn.application.classpath</name>
    <value>/hadoop-2.6.0/,/hadoop-2.6.0/share/hadoop/common/*,/hadoop-2.6.0/share/hadoop/common/lib/*,/hadoop-2.6.0/share/hadoop/hdfs/*,/hadoop-2.6.0/share/hadoop/hdfs/lib/*,/hadoop-2.6.0/share/hadoop/mapreduce/*,/hadoop-2.6.0/share/hadoop/mapreduce/lib/*,/hadoop-2.6.0/share/hadoop/yarn/*,/hadoop-2.6.0/share/hadoop/yarn/lib/*</value>
  </property>
</configuration>
36. What are the environment variables set for Hadoop? (BTL: L1)
i. User variables:
- Variable: HADOOP_HOME
- Value: D:\hadoop-2.6.0
ii. System variable:
- Variable: Path
- Value: D:\hadoop-2.6.0\bin
  D:\hadoop-2.6.0\sbin
  D:\hadoop-2.6.0\share\hadoop\common\*
  D:\hadoop-2.6.0\share\hadoop\hdfs
  D:\hadoop-2.6.0\share\hadoop\hdfs\lib\*
  D:\hadoop-2.6.0\share\hadoop\hdfs\*
  D:\hadoop-2.6.0\share\hadoop\yarn\lib\*
  D:\hadoop-2.6.0\share\hadoop\yarn\*
  D:\hadoop-2.6.0\share\hadoop\mapreduce\lib\*
  D:\hadoop-2.6.0\share\hadoop\mapreduce\*
  D:\hadoop-2.6.0\share\hadoop\common\lib\*

Unit No : 02 Unit Name : Clustering and Classification Date 30.09.2020

OBJECTIVE TYPE QUESTION BANK


S. No. | Objective Questions (MCQ / True or False / Fill up with Choices) | BTL
1. Movie recommendation systems are an example of (BTL: L3)
1. Classification 2. Clustering 3. Reinforcement Learning 4. Regression
a. 2 only
b. 1 and 2
c. 1 and 3
d. 2 and 3
2. Sentiment analysis is an example of (BTL: L3)
1. Regression 2. Classification 3. Clustering 4. Reinforcement Learning
a. 1, 2 and 4
b. 1 and 3
c. 1, 2 and 3
d. 1 and 2
3. Can decision trees be used for performing clustering? (BTL: L4)
a. True
b. False
4. What is the minimum number of variables/features required to perform clustering? (BTL: L1)
1. 0
2. 1
3. 2
4. 3
5. For two runs of K-Means clustering, is it expected to get the same clustering results? (BTL: L3)
1. Yes
2. No
6. Which of the following can act as possible termination conditions in K-Means? (BTL: L1)
1. For a fixed number of iterations.
2. Assignment of observations to clusters does not change between iterations, except for cases with a bad local minimum.
3. Centroids do not change between successive iterations.
4. Terminate when RSS falls below a threshold.
a. 1, 3 and 4
b. 1, 2 and 3
c. 1, 2 and 4
d. All of the above
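The four stopping rules listed in question 6 can be seen side by side in a small K-Means loop. The sketch below is a plain-Python illustration under stated assumptions: the function name `kmeans`, its parameters (`max_iter`, `rss_threshold`, `seed`), the random initialisation, and the sample points are all hypothetical choices for the example, not a reference implementation.

```python
import math
import random


def kmeans(points, k, max_iter=100, rss_threshold=0.0, seed=0):
    """Return (centroids, assignment, reason the loop stopped)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialisation: k random points
    assignment = None
    for _ in range(max_iter):  # rule 1: fixed number of iterations
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:  # rule 2: assignments did not change
            return centroids, assignment, "assignments unchanged"
        assignment = new_assignment

        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:  # rule 3: centroids did not change
            return centroids, assignment, "centroids unchanged"
        centroids = new_centroids

        # Residual sum of squares: squared distance of each point to its centroid.
        rss = sum(math.dist(p, centroids[a]) ** 2 for p, a in zip(points, assignment))
        if rss <= rss_threshold:  # rule 4: RSS fell below the threshold
            return centroids, assignment, "rss below threshold"
    return centroids, assignment, "iteration limit"


pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
final_centroids, labels, reason = kmeans(pts, k=2)
print(labels, reason)
```

Because the starting centroids are drawn at random, a different `seed` can give different cluster labels, which is the point behind question 5.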
7. Which of the following algorithms is most sensitive to outliers? (BTL: L3)
1. K-means clustering algorithm
2. K-medians clustering algorithm
3. K-modes clustering algorithm
4. K-medoids clustering algorithm
8. After performing K-Means clustering analysis on a dataset, you observed the following dendrogram. Which of the following conclusions can be drawn from the dendrogram? (BTL: L6)
[Figure: dendrogram, not reproduced here]
a. There were 28 data points in the clustering analysis
b. The best number of clusters for the analyzed data points is 4
c. The proximity function used is average-link clustering
d. The above dendrogram interpretation is not possible for K-Means clustering analysis
9. In the figure below, if you draw a horizontal line at y = 2 on the y-axis, what will be the number of clusters formed? (BTL: L6)
[Figure: dendrogram, not reproduced here]
1. 1
2. 2
3. 3
4. 4
10. In which of the following cases will K-Means clustering fail to give good results? (BTL: L4)
1. Data points with outliers
2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes
a. 1 and 2
b. 2 and 3
c. 2 and 4
d. 1, 2 and 4
11. Discrete variables and continuous variables are two types of (BTL: L1)
a. Open end classification
b. Time series classification
c. Qualitative classification
d. Quantitative classification

12. A Bayesian classifier is (BTL: L1)
1. A class of learning algorithm that tries to find an optimum classification of a set of examples using probabilistic theory.
2. Any mechanism employed by a learning system to constrain the search space of a hypothesis
3. An approach to the design of learning algorithms that is inspired by the fact that when people encounter new situations, they often explain them by reference to familiar experiences, adapting the explanations to fit the new situation.
4. None of these
13. Classification accuracy is (BTL: L1)
1. A subdivision of a set of examples into a number of classes
2. A measure of the accuracy of the classification of a concept that is given by a certain theory
3. The task of assigning a classification to a set of examples
4. None of these
14. Classification task refers to (BTL: L1)
1. A subdivision of a set of examples into a number of classes
2. A measure of the accuracy of the classification of a concept that is given by a certain theory
3. The task of assigning a classification to a set of examples
4. None of these
15. Euclidean distance measure is (BTL: L1)
1. A stage of the KDD process in which new data is added to the existing selection.
2. The process of finding a solution for a problem simply by enumerating all possible solutions according to some pre-defined order and then testing them
3. The distance between two points as calculated using the Pythagorean theorem
4. None of these
16. _____________________ is good at handling missing data and supports both kinds of attributes (i.e., categorical and continuous attributes). (BTL: L4)
a. ID3
b. C4.5
c. CART
d. Naïve Bayes
17. Decision trees use ______________________, in that they always choose the option that seems the best available at that moment. (BTL: L2)
a. Greedy Algorithms
b. Divide and Conquer
c. Backtracking
d. Shortest Path Method
18. Decision trees cannot handle categorical attributes with many distinct values, such as country codes for telephone numbers. (BTL: L4)
a. TRUE
b. FALSE
19. __________________ are easy to implement and can execute efficiently even without prior knowledge of the data; they are among the most popular algorithms for classifying text documents. (BTL: L2)
a. ID3
b. Naïve Bayes classifiers
c. CART
d. None of these
20. High entropy means that the partitions in classification are (BTL: L2)
a. Pure
b. Not pure
c. Useful
d. Useless
21. Which of the following statements about Naive Bayes is incorrect? (BTL: L4)
a. Attributes are equally important.
b. Attributes are statistically dependent on one another given the class value.
c. Attributes are statistically independent of one another given the class value.
d. Attributes can be nominal or numeric.
22. The maximum value for entropy depends on the number of classes. If we have 8 classes, what will be the maximum entropy? (BTL: L6)
a. Max entropy is 1
b. Max entropy is 2
c. Max entropy is 3
d. Max entropy is 4
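For question 22, entropy in bits is $H = -\sum_i p_i \log_2 p_i$, and it is largest when all classes are equally likely, which gives $\log_2 n$ for $n$ classes. A minimal Python check, assuming base-2 logarithms (the helper name `entropy` is made up for this sketch):

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


uniform_8 = [1 / 8] * 8          # eight equally likely classes
print(entropy(uniform_8))        # 3.0, the same as math.log2(8)
print(entropy([0.5, 0.5]))       # 1.0 for two equally likely classes
```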
23. John flies frequently and likes to upgrade his seat to first class. He has determined that if he checks in for his flight at least two hours early, the probability that he will get an upgrade is 0.75; otherwise, the probability that he will get an upgrade is 0.35. With his busy schedule, he checks in at least two hours before his flight only 40% of the time. Suppose John did not receive an upgrade on his most recent attempt. What is the probability that he did not arrive two hours early? (BTL: L6)
a. 0.892
b. 0.796
c. 0.685
d. 0.999
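Question 23 is an application of Bayes' theorem: $P(\text{late} \mid \text{no upgrade}) = P(\text{no upgrade} \mid \text{late})\,P(\text{late}) / P(\text{no upgrade})$. The arithmetic can be traced in a few lines of Python; the variable names are chosen only for this sketch, and "late" is shorthand for checking in less than two hours early.

```python
p_early = 0.40                    # checks in at least two hours early
p_upgrade_given_early = 0.75
p_upgrade_given_late = 0.35

p_late = 1 - p_early
p_no_upgrade_given_early = 1 - p_upgrade_given_early   # 0.25
p_no_upgrade_given_late = 1 - p_upgrade_given_late     # 0.65

# Total probability of not being upgraded (law of total probability).
p_no_upgrade = (p_no_upgrade_given_early * p_early
                + p_no_upgrade_given_late * p_late)    # 0.10 + 0.39

# Bayes' theorem: probability he was late, given that he got no upgrade.
p_late_given_no_upgrade = p_no_upgrade_given_late * p_late / p_no_upgrade
print(round(p_late_given_no_upgrade, 3))  # 0.796
```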
24. Point out the wrong statement. (BTL: L4)
a. k-nearest neighbor is the same as k-means
b. k-means clustering is a method of vector quantization
c. k-means clustering aims to partition n observations into k clusters
d. None of the mentioned
25. Consider the following example: "How can we divide a set of articles such that the articles in each group have the same theme (we do not know the themes of the articles ahead of time)?" This is an example of: (BTL: L3)
1. Clustering
2. Classification
3. Regression
4. None of these
26. Can we use K-Means clustering to identify the objects in a video? (BTL: L4)
1. Yes
2. No
27. Clustering techniques are _________________ in the sense that the data scientist does not determine, in advance, the labels to apply to the clusters. (BTL: L2)
1. Unsupervised
2. Supervised
3. Reinforcement
4. Neural network

Unit No : 03 Unit Name : Association and Recommendation Date 30.09.2020

OBJECTIVE TYPE QUESTION BANK


S. No. | Objective Questions (MCQ / True or False / Fill up with Choices) | BTL
1. The ______________________ metric is examined to determine a reasonably optimal value of k. (BTL: L5)
1. Mean Square Error
2. Within Sum of Squares (WSS)
3. Speed
4. None of these
2. If an itemset is considered frequent, then any subset of the frequent itemset must also be frequent. This is known as the (BTL: L2)
1. Apriori Property
2. Downward Closure Property
3. Either 1 or 2
4. Both 1 & 2
3. If {bread, eggs, milk} has a support of 0.15 and {bread, eggs} also has a support of 0.15, the confidence of the rule {bread, eggs} → {milk} is (BTL: L6)
1. 0
2. 1
3. 2
4. 3
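For question 3, the confidence of a rule X → Y is the support of X ∪ Y divided by the support of X. A short Python sketch of that ratio (the function name `confidence` is an assumption for the example):

```python
def confidence(support_x_and_y, support_x):
    """Confidence(X -> Y) = Support(X and Y together) / Support(X)."""
    return support_x_and_y / support_x


# {bread, eggs} -> {milk}: both itemsets have support 0.15.
print(confidence(0.15, 0.15))  # 1.0
```

A confidence of 1 means every transaction that contains bread and eggs also contains milk.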
4. Confidence is a measure of how X and Y are really related rather than coincidentally happening together. (BTL: L4)
a. True
b. False
5. A high-confidence rule can sometimes be misleading because confidence does not consider the support of the itemset in the rule consequent. Is this true? (BTL: L4)
a. Yes
b. No
6. _________________ recommend items based on similarity measures between users and/or items. (BTL: L2)
1. Content Based Systems
2. Hybrid System
3. Collaborative Filtering Systems
4. None of these
7. There are _____ major classifications of collaborative filtering mechanisms. (BTL: L1)
1. 1
2. 2
3. 3
4. None of these
8. Movie recommendation to people is an example of (BTL: L3)
1. User Based Recommendation
2. Item Based Recommendation
3. Knowledge Based Recommendation
4. Content Based Recommendation
9. __________________ recommenders rely on an explicitly defined set of recommendation rules. (BTL: L2)
1. Constraint Based
2. Case Based
3. Content Based
4. User Based
10. Parallelized hybrid recommender systems operate dependently of one another and produce separate recommendation lists. (BTL: L4)
1. True
2. False
11. Association rules are sometimes referred to as (BTL: L1)
a. Market basket analysis
b. Itemset filtering
c. Frequent itemset analysis
d. None of these
12. If 80% of all transactions contain itemset {bread}, then the support of {bread} is 0.8. Similarly, if 60% of all transactions contain itemset {bread, butter}, then the support of {bread, butter} is (BTL: L6)
a. 0.4
b. 0.5
c. 0.6
d. 0.7
13. Lift is defined as the measure of certainty or trustworthiness associated with each discovered rule. (BTL: L4)
a. TRUE
b. FALSE
14. _______________ is able to identify trustworthy rules, but it cannot tell whether a rule is coincidental. (BTL: L1)
a. Lift
b. Confidence
c. Support
d. Leverage
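The four measures named in questions 12 to 14 can be compared on a toy basket. The sketch below uses the standard definitions (lift = Support(X ∪ Y) / (Support(X) · Support(Y)), leverage = Support(X ∪ Y) − Support(X) · Support(Y)); the ten transactions are invented for illustration and only chosen so that {bread} has support 0.8 and {bread, butter} has support 0.6, as in question 12.

```python
# Hypothetical transactions: 6 with bread and butter, 2 with bread only,
# 1 with butter only, 1 with neither.
transactions = (
    [{"bread", "butter"}] * 6 + [{"bread"}] * 2 + [{"butter"}] + [{"milk"}]
)


def support(itemset):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(x, y):
    return support(x | y) / support(x)


def lift(x, y):
    # > 1: X and Y occur together more often than if they were independent.
    return support(x | y) / (support(x) * support(y))


def leverage(x, y):
    # > 0: same reading as lift > 1, expressed as a difference.
    return support(x | y) - support(x) * support(y)


x, y = {"bread"}, {"butter"}
print(round(support(x), 2), round(support(x | y), 2))
print(round(confidence(x, y), 3), round(lift(x, y), 3), round(leverage(x, y), 3))
```

Confidence alone does not use Support(Y), which is why lift and leverage are consulted to judge whether a high-confidence rule is more than a coincidence.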
15. _____________________ recommend items based on similarity measures between users and/or items. The items recommended to a user are those preferred by similar users. (BTL: L2)
a. Collaborative Filtering System
b. Content Based Recommendation
c. Knowledge Based Recommendation
d. Hybrid Approaches
16. An approach takes a matrix of given user–item ratings as its only input and produces its output from that matrix. Is it a pure collaborative approach? (BTL: L4)
a. Yes
b. No
17. With respect to the determination of the set of similar users, one common measure used in recommender systems is (BTL: L1)
a. Cosine Similarity Measure
b. Pearson’s correlation coefficient
c. Mean Squared Error Method
d. None of these
18. Large-scale e-commerce sites often implement a different technique, _____________________, which is more apt for offline preprocessing and thus allows for the computation of recommendations in real time even for a very large rating matrix. (BTL: L2)
a. Item-Based Recommendation
b. User-Based Recommendation
c. Content-Based Recommendation
d. None of these
19. Here are two very short texts to compare. What is their cosine similarity? (BTL: L6)
I. Julie loves me more than Linda loves me
II. Jane likes me more than Julie loves me
a. 0.6
b. 0.7
c. 0.8
d. 0.9
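For question 19, one common way to compare two texts is to turn each into a vector of word counts and take the cosine of the angle between the vectors. The sketch below assumes whitespace tokenisation and case folding, which the question does not spell out; the function name `cosine_similarity` is chosen for the example.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b)


text_1 = "Julie loves me more than Linda loves me"
text_2 = "Jane likes me more than Julie loves me"
print(round(cosine_similarity(text_1, text_2), 3))  # 0.822
```

Here the dot product is 9 and the vector lengths are √12 and √10, so the similarity is 9 / √120 ≈ 0.82.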
20. ___________________ is based on the availability of item descriptions and a profile that assigns importance to these characteristics. (BTL: L2)
a. Item-Based Recommendation
b. User-Based Recommendation
c. Content-Based Recommendation
d. None of these
21. Which feature of a movie is not relevant to a recommendation system? (BTL: L3)
a. The set of actors of the movie
b. The director
c. The year in which the movie was made
d. The budget of the movie
22. A ____________________ has been implemented for similarity-based retrieval under nearest neighbors. (BTL: L2)
a. k-nearest-neighbor method (kNN)
b. Convolutional Neural Network (CNN)
c. Bayes Theorem
d. Naïve Bayes Classifier
23. Case-based recommenders focus on the retrieval of similar items on the basis of different types of similarity measures. (BTL: L4)
a. TRUE
b. FALSE
24. In ________________ recommendation approaches, items are retrieved using similarity measures that describe to which extent item properties match some given user’s requirements. (BTL: L2)
a. Item-Based
b. Case-Based
c. Content-Based
d. User-Based
25. _______________ are based on a sequenced order of techniques, in which each succeeding recommender only refines the recommendations of its predecessor. (BTL: L1)
a. Weighted Hybrids
b. Mixed Hybrids
c. Cascade Hybrids
d. Switching Hybrids
26. ____________________ require an oracle that decides which recommender should be used in a specific situation, depending on the user profile and/or the quality of recommendation. (BTL: L1)
a. Weighted Hybrids
b. Mixed Hybrids
c. Cascade Hybrids
d. Switching Hybrids

Unit No : 04 Unit Name : Stream Concepts Date 30.09.2020

OBJECTIVE TYPE QUESTION BANK


S. No. | Objective Questions (MCQ / True or False / Fill up with Choices) | BTL
1. Which one does not belong to the applications of data streams? (BTL: L1)
1. Web traffic
2. Internet
3. Sensor data
4. None of these
2. These queries are, in a sense, permanently executing and produce outputs at appropriate times. (BTL: L1)
a. Standing query
b. Ad-hoc query
3. Google wants to know what queries are more frequent today than yesterday. This is an example of (BTL: L4)
1. Mining Query Stream
2. Mining Login Stream
3. Mining Search Stream
4. Mining Click Stream
4. Yahoo wants to know which of its pages are getting an unusual number of hits in the past hour. This is an example of (BTL: L4)
1. Mining Query Stream
2. Mining Login Stream
3. Mining Search Stream
4. Mining Click Stream
5. Surveillance cameras produce images with higher resolution than satellites. (BTL: L4)
1. True
2. False
6. In the data stream model, individual data items may be __________________, e.g., network measurements, call records, web page visits, sensor readings, and so on. (BTL: L2)
a. Key-Value Pairs
b. Relational Tuples
c. Variables
d. Databases
7. A data stream is a real-time, continuous, and ordered sequence of items. It is possible to control the order in which the items arrive, but it is not feasible to locally store a stream in its entirety in any memory device. Is this statement true? (BTL: L4)
a. YES
b. NO
8. Long-running queries are registered in the _________________ and placed into groups for shared processing. (BTL: L1)
a. Query Repository
b. Archival Storage
c. Limited Working Storage
d. Main Memory
9. Which among the following is not an example of data stream concepts? (BTL: L3)
a. Financial Applications
b. Network Monitoring
c. Fraud Detection
d. Web Application
10. In a DSMS, the data model and query processor will allow either order-based or time-based operations. Is this statement true? (BTL: L4)
a. YES
b. NO
11. In streaming queries, alerting the user when a stock crosses over a price point is an example of ______________ (BTL: L1)
a. Continuous Queries
b. One Time Queries
c. Sampling Queries
d. None of these
12. For example, an increase in queries like “dengue fever symptoms” enables us to predict the number of sufferers. To which stream does this belong? (BTL: L3)
a. Query Stream
b. Click Stream
c. User Stream
d. Content Stream
13. Standing queries are executed whenever the user prefers and produce output at appropriate times. Is this true? (BTL: L4)
a. YES
b. NO
14. For an ocean surface temperature sensor, the data stream model reports the maximum temperature ever recorded. What kind of query is this? (BTL: L2)
a. Standing Query
b. Ad-hoc Query
15. A useful model of stream processing is that queries are about a window of length N, the N most recent elements received. N is so large it cannot be stored in memory, or even on disk. Is this statement true or false? (BTL: L4)
a. TRUE
b. FALSE
16. The stream-processing algorithm is executed in the query processor, without access to main memory or with only rare accesses to secondary storage. Is this statement true? (BTL: L3)
a. YES
b. NO
17. Web sites often like to report the number of unique users over the past month. Complete the SQL query: (BTL: L6)
SELECT ________(________(name))
FROM Logins
WHERE time _____ t;
a. SUM, UNIQUE, ==
b. COUNT, UNIQUE, <=
c. COUNT, DISTINCT, >=
d. SUM, DISTINCT, <=
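The query in question 17 can be tried against an in-memory SQLite database. The sketch below uses Python's standard `sqlite3` module; the `Logins` rows, the integer timestamps, and the cutoff `t` are invented for the example, and the query shown counts distinct names with a login time at or after `t`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Logins (name TEXT, time INTEGER)")
conn.executemany(
    "INSERT INTO Logins VALUES (?, ?)",
    [("ann", 5), ("bob", 12), ("ann", 15), ("cy", 20), ("bob", 25)],
)

t = 10  # hypothetical start of the reporting window
(unique_users,) = conn.execute(
    "SELECT COUNT(DISTINCT name) FROM Logins WHERE time >= ?", (t,)
).fetchone()
print(unique_users)  # 3: ann, bob and cy each logged in at or after t
conn.close()
```

`COUNT(DISTINCT name)` counts each user once no matter how many times they logged in during the window.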
18. A search engine receives a stream of queries, and it would like to study the behavior of typical users. Assume that the stream consists of tuples (user, query, time). Which kind of query does this call for? (BTL: L3)
a. Standing Query
b. Ad-hoc Query
19. More generally, we can obtain a sample consisting of any rational fraction a/b of the users by hashing user names to b buckets, 0 through b − 1. Add the search query to the sample if the hash value is less than a. Is this statement true? (BTL: L4)
a. YES
b. NO
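The sampling scheme described in question 19 can be sketched in a few lines: hash the user name into one of b buckets and keep the tuple when the bucket number is below a. The choice of SHA-256 as the hash function, the helper name `in_sample`, and the toy stream are assumptions made for this illustration.

```python
import hashlib


def in_sample(user, a, b):
    """Keep a user's tuples if the user hashes to one of the first a of b buckets."""
    digest = hashlib.sha256(user.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % b  # bucket in 0 .. b-1
    return bucket < a


# Hypothetical stream of (user, query, time) tuples.
stream = [
    ("alice", "big data", 1),
    ("bob", "hadoop install", 2),
    ("alice", "hdfs block size", 3),
    ("carol", "bloom filter", 4),
    ("bob", "yarn-site.xml", 5),
]

# Aim for roughly 1/10 of the users; every query of a chosen user is kept.
sample = [item for item in stream if in_sample(item[0], a=1, b=10)]
print(sample)
```

Because the decision depends only on the user name, a user is either fully in the sample or fully out of it, which is what makes per-user statistics on the sample meaningful.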
20. Which filtering eliminates most of the tuples that do not meet the criteria? (BTL: L5)
a. Bloom Filtering
b. AMS Filtering
c. DGIM Filtering
d. None of these
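As background for question 20, a Bloom filter keeps a bit array and several hash functions: members always pass the test, while most (not all) non-members are rejected. The class below is a minimal sketch, assuming SHA-256 with a per-hash prefix as the hash family and one byte per bit for readability; the names `BloomFilter`, `add`, and `might_contain` and the sample addresses are made up for the example.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely not added"; True means "probably added".
        return all(self.bits[pos] for pos in self._positions(item))


allowed = BloomFilter()
for address in ["ann@example.com", "bob@example.com"]:
    allowed.add(address)

print(allowed.might_contain("ann@example.com"))  # True: no false negatives
```

Tuples whose key fails `might_contain` can be dropped immediately; the few false positives that slip through can be checked against the full list later.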
21. ________________ is reactive because it waits for users to request a query and then delivers the analytics. (BTL: L3)
a. On Demand Real Time Analytics
b. Continuous Real Time Analytics
c. Time Based Analytics
d. Content Based Analytics
22. Monitoring stock market trends provides analytics to help users make a decision to buy or sell, all in real time. This is an example of (BTL: L3)
a. On Demand Real Time Analytics
b. Continuous Real Time Analytics
c. Time Based Analytics
d. Content Based Analytics
23. Sentiment analysis is widely applied to reviews and social media for a variety of applications ranging from marketing to customer service. Is it true? (BTL: L2)
a. TRUE
b. FALSE
24. __________________ is based on a model of representing individual entities and numerous kinds of relationships that connect those entities. (BTL: L1)
a. Graph analytics
b. Real time analytics
c. Sentiment analysis
d. Stock market prediction
25. A ____________ can be represented using a triple format consisting of a
subject (the source point of the relationship), an object (the target), and a predicate
(that models the type of the relationship). L2
a. Directed Graph
b. Undirected Graph
c. Weighted Graph
d. Un Weighted Graph
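The triple format described in question 25 can be sketched in a few lines. The entity names, the predicates, and the helper `build_adjacency` are hypothetical; the tuples are written as (subject, predicate, object), one common ordering of the three parts.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object): the edge runs from the
# subject (source) to the object (target) and is labelled by the predicate.
triples = [
    ("Alice", "follows", "Bob"),
    ("Bob", "follows", "Carol"),
    ("Alice", "works_at", "Acme"),
]

def build_adjacency(triples):
    """Group outgoing edges by subject as (predicate, object) pairs."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph

graph = build_adjacency(triples)
```

An edge is stored only under its subject, so "Alice follows Bob" does not imply "Bob follows Alice"; the direction of the relationship is part of the data.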
Unit No : 05 Unit Name : NoSQL Data Management for Big Data and Visualization Date 30.09.2020
OBJECTIVE TYPE QUESTION BANK
S. No. Objective Questions (MCQ /True or False / Fill up with Choices ) BTL
1. Among the top 10 ranked database systems using the relational DBMS model, which one is first? L1
a. MySQL
b. Oracle
c. MongoDB
d. Cassandra
2. Which among the following databases is the most popular? L2
a. MongoDB
b. Oracle
c. MySQL
d. Cassandra
3. Which among the following is incorrect with regard to NoSQL? L4
a. It is easy and ready to manage with clusters.
b. It is suitable for upcoming data explosions.
c. It requires keeping track of the data structure.
d. It provides an easy and flexible system.
4. Which database administrator job was leading in job trends? L2
a. MongoDB
b. CouchDB
c. SimpleDB
d. Redis
5. NoSQL means _________________ L1
a. Not SQL
b. No Usage of SQL
c. Not Only SQL
d. Not for SQL
6. In a Relational Database Management System, scaling is possible. L4
a. TRUE
b. FALSE
7. Which among the following is not an example of NoSQL? L3
a. Google
b. Netflix
c. Amazon
d. CERN
8. Carlo Strozzi used the term NoSQL in ________ to name his lightweight, open-source
relational database that did not expose the standard SQL interface. L1
a. 1965
b. 1989
c. 1998
d. 2007
9. In Brewer’s CAP Theorem, which among the following was not considered? L2
a. Consistency
b. Availability
c. Partition Tolerance
d. None of these
10. “If the network is broken, your database won’t work” because RDBMSs have network
partitions. Is this statement true? L4
a. Yes
b. No
11. If CAP considers only Availability and Partition Tolerance, which is an apt real-world example? L1
a. BigTable
b. Dynamo
c. Postgres
d. None of these
12. If CAP considers only Consistency and Partition Tolerance, which is an apt real-world example? L1
a. BigTable
b. Dynamo
c. Postgres
d. None of these
13. If CAP considers only Consistency and Availability, which is an apt real-world example? L1
a. BigTable
b. Dynamo
c. Postgres
d. None of these
14. Scalability and better performance in NoSQL are achieved by sacrificing ACID
compatibility. Is it true? L4
a. TRUE
b. FALSE
15. In document-based NoSQL, all documents are usually organized into collections or
databases with unique structure. Is this true? L4
a. TRUE
b. FALSE
16. In which real-time application was a key-value store used? L1
a. BigTable
b. Dynamo
c. Postgres
d. None of these
17. The graph model of NoSQL was used in ________ L3
a. Twitter
b. Facebook
c. Google
d. Whatsapp
18. The column-based model of NoSQL was not supported in ________ L3
a. Twitter
b. Facebook
c. Google
d. BigTable
19. The document-based model was used in ________ L3
a. MongoDB
b. CouchDB
c. SimpleDB
d. Redis
20. MongoDB is __________________ L2
a. Column Based
b. Key Value Based
c. Document Based
d. Graph Based
21. ____________ is the process of storing data records across multiple machines. L1
a. Sharding
b. HDFS
c. HIVE
d. HBASE
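The idea of sharding in question 21 can be illustrated with a hash-based placement rule. This is only a sketch under assumptions: the helper `shard_for`, the three in-memory "machines", and the sample records are invented for the example, and real systems may use range-based or other placement schemes instead.

```python
import hashlib

NUM_SHARDS = 3  # hypothetical number of machines

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a record key to a shard number in 0..num_shards-1."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Each dict stands in for the storage of one machine.
shards = {i: {} for i in range(NUM_SHARDS)}

records = {
    "user:1": {"name": "Alice"},
    "user:2": {"name": "Bob"},
    "user:3": {"name": "Carol"},
    "user:4": {"name": "Dave"},
}

# Write path: each record is stored on exactly one shard.
for key, value in records.items():
    shards[shard_for(key)][key] = value

def lookup(key: str):
    """Read path: recompute the shard from the key and look there only."""
    return shards[shard_for(key)].get(key)
```

Since the shard is computed from the key alone, a read goes straight to one machine without consulting the others.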
22. The results of a Hive query can be stored as L2
a. Local File
b. HDFS File
c. Both
d. Cannot be stored
23. The position of a specific column in a Hive table L3
a. can be anywhere in the table creation clause
b. must match the position of the corresponding data in the data file
c. must match the position only for date time data type in the data file
d. must be arranged alphabetically
24. The HBase tables are L1
a. Made read-only by setting the read-only option
b. Always writeable
c. Always read-only
d. Made read-only using the query to the table
25. HBase creates a new version of a record during L2
a. Creation of a record
b. Modification of a record
c. Deletion of a record
d. All the above
