
1. Business intelligence (BI) is a broad category of application programs which
includes __

1. Decision support
2. Data mining
3. OLAP
4. All of the mentioned
Show Answer
All of the mentioned

2. BI can catalyze a business’s success in terms of _

1. Distinguish the products and services that drive revenues


2. Rank customers and locations based on profitability
3. Ranks customers and locations based on probability
4. All of the mentioned
Show Answer
All of the mentioned

3. Which of the following areas are affected by BI?

1. Revenue
2. CRM
3. Sales
4. All of the mentioned
Show Answer
CRM (Customer relationship management)

4. ___ is a performance management tool that recapitulates an organization’s
performance from several standpoints on a single page

1. Balanced Scorecard
2. Data Cube
3. Dashboard
4. All of the mentioned
Show Answer
Balanced Scorecard

5. __ is a system where operations like data extraction, transformation and
loading are executed.

1. Data staging
2. Data integration
3. ETL
4. None of the mentioned
Show Answer
Data staging

6. ______ is a category of applications and technologies for presenting and
analyzing corporate and external data.

1. Data warehouse
2. MIS
3. EIS
4. All of the mentioned
Show Answer
EIS (Enterprise Information System)

7. Which of the following is the process of basing an organization’s actions and
decisions on actual measured results of performance?

1. Institutional performance management


2. Gap analysis
3. Slice and Dice
4. None of the mentioned
Show Answer
Institutional performance management

8. Which of the following does not form part of BI Stack in SQL Server?

1. SSRS
2. SSIS
3. SSAS
4. OBIEE
Show Answer
OBIEE

9. BI can catalyze a business’s success in terms of ____

1. Distinguish the products and services that drive revenues


2. Rank customers and locations based on profitability
3. Ranks customers and locations based on probability
4. All of the mentioned
Show Answer
All of the mentioned

10. This is an approach to selling goods and services in which a prospect
explicitly agrees in advance to receive marketing information

1. customer managed relationship


2. data mining
3. permission marketing
4. one-to-one marketing
Show Answer
permission marketing

Data analytics mcq with answers pdf


11. In an Internet context, this is the practice of tailoring Web pages to
individual users’ characteristics or preferences.

1. Web services
2. customer-facing
3. client/server
4. personalization
Show Answer
personalization

12. This is the processing of data about customers and their relationship with the
enterprise in order to improve the enterprise’s future sales and service and lower
cost.

1. clickstream analysis
2. database marketing
3. customer relationship management
4. CRM analytics
Show Answer
CRM analytics

13. This is a broad category of applications and technologies for gathering,
storing, analyzing, and providing access to data to help enterprise users make
better business decisions.

1. best practice
2. data mart
3. business information warehouse
4. business intelligence
Show Answer
business intelligence

14. This is a systematic approach to the gathering, consolidation, and processing
of consumer data (both for customers and potential customers) that is
maintained in a company’s databases

1. database marketing
2. marketing encyclopedia
3. application integration
4. service oriented integration
Show Answer
database marketing

15. This is an arrangement in which a company outsources some or all of its
customer relationship management functions to an application service provider
(ASP).

1. spend management
2. supplier relationship management
3. hosted CRM
4. Customer Information Control System
Show Answer
hosted CRM

SPPU mcqs for High Performance Computing

16. This is an XML-based metalanguage developed by the Business Process
Management Initiative (BPMI) as a means of modeling business processes, much
as XML is, itself, a metalanguage with the ability to model enterprise data.

1. BizTalk
2. BPML
3. e-biz
4. ebXML
Show Answer
BPML

17. This is a central point in an enterprise from which all customer contacts are
managed.

1. contact center
2. help system
3. multichannel marketing
4. call center
Show Answer
contact center

18. This is the practice of dividing a customer base into groups of individuals
that are similar in specific ways relevant to marketing, such as age, gender,
interests, spending habits, and so on.
1. customer service chat
2. customer managed relationship
3. customer life cycle
4. customer segmentation
Show Answer
customer segmentation

19. In data mining, this is a technique used to predict future behavior and
anticipate the consequences of change.

1. predictive technology
2. disaster recovery
3. phase change
4. predictive modeling
Show Answer
predictive modeling

20. According to analysts, for what can traditional IT systems provide a
foundation when they’re integrated with big data technologies like Hadoop?

1. Big data management and data mining


2. Data warehousing and business intelligence
3. Management of Hadoop clusters
4. Collecting and storing unstructured data
Show Answer
Big data management and data mining

data analytics mcq questions and answers

21. All of the following accurately describe Hadoop, EXCEPT:

1. Open source
2. Real-time
3. Java-based
4. Distributed computing approach
Show Answer
Real-time

22. ____ has the world’s largest Hadoop cluster

1. Apple
2. Datamatics
3. Facebook
4. None of the mentioned
Show Answer
Facebook

23. What are the five V’s of Big Data?

1. Volume
2. Velocity
3. Variety
4. All of the above
Show Answer
All of the above

24. ____ hides the limitations of Java behind a powerful and concise Clojure API
for Cascading.

1. Scalding
2. Cascalog
3. Hcatalog
4. Hcalding
Show Answer
Cascalog

25. What are the main components of Big Data?

1. MapReduce
2. HDFS
3. YARN
4. All of these
Show Answer
All of these

26. What are the different features of Big Data Analytics?

1. Open-Source
2. Scalability
3. Data Recovery
4. All the above
Show Answer
All the above

27. Define the Port Numbers for NameNode, Task Tracker and Job Tracker

1. NameNode
2. Task Tracker
3. Job Tracker
4. All of the above
Show Answer
All of the above

28. Facebook Tackles Big Data With ____ based on Hadoop

1. Project Prism
2. Prism
3. ProjectData
4. ProjectBid
Show Answer
Project Prism

29. What is a unit of data that flows through a Flume agent?

1. Record
2. Event
3. Row
4. Log
Show Answer
Event

30. A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of
students from a college. Which of the following statement is true in the
following case

1. Feature F1 is an example of nominal variable


2. Feature F1 is an example of ordinal variable
3. It doesn’t belong to any of the above category
4. Both of these
Show Answer
Feature F1 is an example of ordinal variable

data analytics mcq with answers


31. Which of the following is an example of a deterministic algorithm?

1. PCA
2. K-Means
3. None of the above
4. all of the above
Show Answer
PCA
32. What is the entropy of the target variable?

1. -(5/8 log(5/8) + 3/8 log(3/8))


2. 5/8 log(5/8) + 3/8 log(3/8)
3. 5/8 log(5 8) + 3/8 log(3/8)
4. 5/8 log(3/8) – 3/8 log(5/8)
Show Answer
-(5/8 log(5/8) + 3/8 log(3/8))
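
The formula in the answer above can be evaluated directly. The sketch below is a minimal illustration that assumes base-2 logarithms (the question writes only "log", so the base is an assumption) and takes the class proportions 5/8 and 3/8 from the answer; the function name `entropy` is hypothetical.

```python
import math

def entropy(probabilities, base=2):
    """Shannon entropy -sum(p * log(p)) of a discrete distribution."""
    # Terms with p == 0 contribute nothing and are skipped to avoid log(0).
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)

# Class proportions taken from the answer above: 5/8 and 3/8.
h = entropy([5 / 8, 3 / 8])
print(round(h, 4))
```

The leading minus sign matters: each log of a proportion below 1 is negative, so without it the result would be negative, which is why the unsigned options are wrong.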

33. Point out the correct statement.

1. OLAP is an umbrella term that refers to an assortment of software applications


for analyzing an organization’s raw data for intelligent decision making
2. Business intelligence equips enterprises to gain business advantage from data
3. BI makes an organization agile thereby giving it a lower edge in today’s
evolving market condition
4. None of the mentioned
Show Answer
Business intelligence equips enterprises to gain business advantage from data

34. BI can catalyze a business’s success in terms of ____

1. Distinguish the products and services that drive revenues


2. Rank customers and locations based on profitability
3. Ranks customers and locations based on probability
4. All of the mentioned
Show Answer
All of the mentioned

data analytics multiple choice questions

35. Heuristic is

1. A set of databases from different vendors, possibly using different database


paradigms
2. An approach to a problem that is not guaranteed to work but performs well in
most cases
3. Information that is hidden in a database and that cannot be recovered by a
simple SQL query.
4. None of these
Show Answer
An approach to a problem that is not guaranteed to work but performs well in most
cases
36. Heterogeneous databases refers to

1. A set of databases from different vendors, possibly using different database
paradigms
2. An approach to a problem that is not guaranteed to work but performs well in
most cases.
3. Information that is hidden in a database and that cannot be recovered by a
simple SQL query.
4. None of these
Show Answer
A set of databases from different vendors, possibly using different database
paradigms

1. Is it possible that the assignment of observations to clusters does not change
between successive iterations in K-Means?

1. Yes
2. No
3. Can’t say
4. None of these
Show Answer
Yes

2. Which of the following can act as possible termination conditions in K-Means?

1. For a fixed number of iterations.


2. Assignment of observations to clusters does not change between iterations,
except for cases with a bad local minimum.
3. Centroids do not change between successive iterations.
4. Terminate when RSS falls below a threshold.
5. All of the above
Show Answer
All of the above
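
The stopping rules listed above can be seen side by side in a small sketch. This is a minimal pure-Python version of Lloyd's algorithm on 1-D data, not a reference implementation; the function name `kmeans_1d`, the tolerance, and the sample points are assumptions for illustration, and the RSS-threshold rule is left out for brevity.

```python
def kmeans_1d(points, centroids, max_iter=100, tol=1e-9):
    """Lloyd's algorithm on 1-D data with explicit stopping rules."""
    assignment = None
    for iteration in range(1, max_iter + 1):      # rule: fixed iteration budget
        new_assignment = [
            min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:           # rule: assignments unchanged
            return centroids, assignment, iteration
        assignment = new_assignment
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(sum(members) / len(members) if members else centroids[k])
        shift = max(abs(a - b) for a, b in zip(new_centroids, centroids))
        centroids = new_centroids
        if shift < tol:                            # rule: centroids unchanged
            return centroids, assignment, iteration
    return centroids, assignment, max_iter

# Hypothetical data: two well-separated groups and deliberately poor starting centroids.
centroids, labels, n_iter = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
print(centroids, labels, n_iter)
```

Here the run stops on the "assignments unchanged" rule in the third pass, long before the iteration budget is used up.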

3. Which of the following clustering algorithms suffers from the problem of
convergence at local optima?

1. K-Means clustering algorithm


2. Agglomerative clustering algorithm
3. Expectation-Maximization clustering algorithm
4. Diverse clustering algorithm
5. both a and c
Show Answer
both a and c
4. How can Clustering (Unsupervised Learning) be used to improve the accuracy
of Linear Regression model (Supervised Learning):

1. Creating different models for different cluster groups.


2. Creating an input feature for cluster ids as an ordinal variable.
3. Creating an input feature for cluster centroids as a continuous variable.
4. Creating an input feature for cluster size as a continuous variable.
5. All of the above
Show Answer
All of the above

5. What could be the possible reason(s) for producing two different
dendrograms using agglomerative clustering algorithm for the same dataset?

1. Proximity function used
2. No. of data points used
3. No. of variables used
4. All of the above
Show Answer
All of the above

6. In which of the following cases will K-Means clustering fail to give good
results?

1. Data points with outliers


2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes
5. a, b and d
Show Answer
a, b and d

7. Which of the following is/are valid iterative strategy for treating missing
values before clustering analysis?

1. Imputation with mean


2. Nearest Neighbor assignment
3. Imputation with Expectation-Maximization algorithm
4. All of the above
Show Answer
Imputation with Expectation-Maximization algorithm

8. Feature scaling is an important step before applying the K-Means algorithm.
What is the reason behind this?
1. In distance calculation it will give the same weights for all features
2. You always get the same clusters whether or not you use feature scaling
3. In Manhattan distance it is an important step but in Euclidean it is not
4. None of these
Show Answer
In distance calculation it will give the same weights for all features
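
A small numeric sketch of why scaling equalizes feature weights in a distance calculation. The customer rows, the helper names `min_max_scale` and `income_share`, and the choice of min-max scaling are all assumptions made for illustration; other scalers would serve equally well.

```python
def min_max_scale(rows):
    """Rescale each column to [0, 1]; assumes no column is constant."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def income_share(a, b):
    """Fraction of the squared Euclidean distance due to the income column."""
    d_age, d_income = a[0] - b[0], a[1] - b[1]
    return d_income ** 2 / (d_age ** 2 + d_income ** 2)

# Hypothetical customers: (age in years, income in currency units).
rows = [[25, 50_000], [60, 51_000], [26, 90_000]]
scaled = min_max_scale(rows)

raw_share = income_share(rows[0], rows[1])
scaled_share = income_share(scaled[0], scaled[1])
print(round(raw_share, 4), round(scaled_share, 4))
```

The first two customers differ by the full age range but only slightly in income. On raw values the income column still supplies almost all of the distance; after scaling, the age difference dominates, as it should.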

9. Which of the following methods is used for finding the optimal number of
clusters in the K-Means algorithm?

1. Elbow method
2. Manhattan method
3. Euclidean method
4. All of the above
Show Answer
Elbow method

10. What is true about K-Mean Clustering?

1. K-means is extremely sensitive to cluster center initializations


2. Bad initialization can lead to Poor convergence speed
3. Bad initialization can lead to bad overall clustering
4. All of the above
Show Answer
All of the above

Data Analytics mcq sppu


11. Which of the following can be applied to get good results for K-means
algorithm corresponding to global minima?

1. Try to run algorithm for different centroid initialization


2. Adjust number of iterations
3. Find out the optimal number of clusters
4. All of the above
Show Answer
All of the above

12. If you are using Multinomial mixture models with the
expectation-maximization algorithm for clustering a set of data points into two
clusters, which of the assumptions are important:

1. All the data points follow two Gaussian distribution


2. All the data points follow n Gaussian distribution (n >2)
3. All the data points follow two multinomial distribution
4. All the data points follow n multinomial distribution (n >2)
Show Answer
All the data points follow two multinomial distribution

13. Which of the following is/are not true about Centroid based K-Means
clustering algorithm and Distribution based expectation-maximization
clustering algorithm:

1. Both starts with random initializations


2. Both are iterative algorithms
3. Both have strong assumptions that the data points must fulfill
4. Expectation maximization algorithm is a special case of K-Means
Show Answer
Expectation maximization algorithm is a special case of K-Means

14. Which of the following is/are not true about DBSCAN clustering algorithm:

1. For data points to be in a cluster, they must be in a distance threshold to a


core point
2. It has strong assumptions for the distribution of data points in dataspace
3. It has substantially high time complexity of order O(n^3)
4. It does not require prior knowledge of the no. of desired clusters
5. both b and c
Show Answer
both b and c

15. Which of the following are the high and low bounds for the existence of
F-Score?

1. [0,1]
2. (0,1)
3. [-1,1]
4. None of the above
Show Answer
[0,1]
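
The closed interval [0,1] follows from the F-score being a harmonic mean of precision and recall, both of which lie in [0,1]. The sketch below computes F1 from confusion-matrix counts; the function name and the convention of returning 0 when there are no true positives are assumptions for illustration.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from confusion-matrix counts."""
    if tp == 0:
        return 0.0  # no true positives: the score is taken as 0 by convention here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

print(f1_score(10, 0, 0))  # perfect classifier reaches the upper bound
print(f1_score(0, 5, 5))   # no correct positives reaches the lower bound
```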

16. All of the following increase the width of a confidence interval except:

1. Increased confidence level


2. Increased variability
3. Increased sample size
4. Decreased sample size
Show Answer
Increased sample size
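
The effect of each factor on interval width can be checked numerically. This sketch assumes a z-based interval for a mean with known standard deviation, where the width is 2·z·σ/√n; the function name `ci_width` and the sample numbers are hypothetical.

```python
import math
from statistics import NormalDist

def ci_width(sigma, n, confidence=0.95):
    """Width of a z-based confidence interval for a mean with known sigma."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sigma / math.sqrt(n)

base = ci_width(sigma=10, n=100)
print(round(base, 2))
print(ci_width(10, 100, confidence=0.99) > base)  # higher confidence: wider
print(ci_width(20, 100) > base)                   # more variability: wider
print(ci_width(10, 400) < base)                   # larger sample: narrower
```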
17. The p-value in hypothesis testing represents which of the following: Please
select the best answer of those provided below

1. The probability of failing to reject the null hypothesis, given the observed
results
2. The probability that the null hypothesis is true, given the observed results
3. The probability that the observed results are statistically significant, given that
the null hypothesis is true
4. The probability of observing results as extreme or more extreme than currently
observed, given that the null hypothesis is true
Show Answer
The probability of observing results as extreme or more extreme than currently
observed, given that the null hypothesis is true

18. Assume that the difference between the observed, paired sample values is
defined in the same manner and that the specified significance level is the same
for both hypothesis tests. Using the same data, the statement that “a
paired/dependent two sample t-test is equivalent to a one sample t-test on the
paired differences, resulting in the same test statistic, same p-value, and same
conclusion” is: Please select the best answer of those provided below.

1. Always True
2. Never True
3. Sometimes True
4. Not Enough Information
Show Answer
Always True
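
The equivalence can be verified on a small example. The sketch below computes the paired statistic directly from the two samples and compares it with a one-sample t statistic on the differences; the data values and function names are hypothetical, and only the test statistic is compared (the p-value then agrees because both tests use n - 1 degrees of freedom).

```python
import math
from statistics import mean, stdev

def one_sample_t(values, mu0=0.0):
    """t statistic for H0: population mean equals mu0."""
    n = len(values)
    return (mean(values) - mu0) / (stdev(values) / math.sqrt(n))

def paired_t(before, after):
    """Paired t statistic computed from the two samples themselves."""
    n = len(before)
    ma, mb = mean(after), mean(before)
    var_a = sum((a - ma) ** 2 for a in after) / (n - 1)
    var_b = sum((b - mb) ** 2 for b in before) / (n - 1)
    cov = sum((a - ma) * (b - mb) for a, b in zip(after, before)) / (n - 1)
    # Variance of the differences: var(a) + var(b) - 2 cov(a, b).
    return (ma - mb) / math.sqrt((var_a + var_b - 2 * cov) / n)

# Hypothetical paired measurements.
before = [12, 15, 11, 14, 13]
after = [14, 18, 12, 15, 16]
diffs = [a - b for a, b in zip(after, before)]

t_paired = paired_t(before, after)
t_one_sample = one_sample_t(diffs)
print(round(t_paired, 3), round(t_one_sample, 3))
```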

19. Green sea turtles have normally distributed weights, measured in kilograms,
with a mean of 134.5 and a variance of 49.0. A particular green sea turtle’s
weight has a z-score of -2.4. What is the weight of this green sea turtle? Round
to the nearest whole number.

1. 17 kg
2. 151 kg
3. 118 kg
4. 252 kg
Show Answer
118 kg
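
The arithmetic behind this answer is x = mean + z × standard deviation, where the standard deviation is the square root of the variance. The sketch assumes the variance in the question is 49 (so the standard deviation is 7); the variable names are illustrative.

```python
import math

mean_kg = 134.5
variance = 49.0   # assumed reading of the variance given in the question
z = -2.4

weight = mean_kg + z * math.sqrt(variance)
print(round(weight, 1), round(weight))
```

Using the variance itself in place of the standard deviation gives 134.5 - 2.4 × 49 = 16.9, which is how the distractor of 17 kg arises.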

Data analytics mcq with answers


20. What percentage of measurements in a dataset fall above the median?

1. 49%
2. 50%
3. 51%
4. Cannot Be Determined
Show Answer
Cannot Be Determined

21. The proportion of variation in 5k race times that can be explained by the
variation in the age of competitive male runners was approximately 0.663. What
is the value of the sample linear correlation coefficient? Round to 3 decimal
places.

1. 0.663
2. 0.814
3. -0.814
4. 0.440
Show Answer
-0.814
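
The magnitude of r is the square root of the proportion of explained variation (r squared); the sign has to come from the direction of the fitted slope, which the excerpted question does not show. The sketch below takes the sign as an explicit input; the function name is hypothetical, and the negative slope is an assumption inferred only from the listed answer.

```python
import math

def correlation_from_r_squared(r_squared, slope_sign):
    """Recover r from r^2, taking the sign from the regression slope."""
    return math.copysign(math.sqrt(r_squared), slope_sign)

r = correlation_from_r_squared(0.663, slope_sign=-1)
print(round(r, 3))
```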

22. Using all of the results provided, is it reasonable to predict the 5k race time
(minutes) of a competitive male runner 73 years of age?

1. Yes; linear correlation between age and 5k race times is statistically significant
2. Yes; both the sample linear regression equation and an age in years is
provided
3. No; linear correlation between age and 5k race times is not statistically
significant
4. No; the age provided is beyond the scope of our available sample data
Show Answer
No; the age provided is beyond the scope of our available sample data

23. If an itemset is considered frequent, then any subset of the frequent itemset
must also be frequent.

1. Apriori Property
2. Downward Closure Property
3. Either 1 or 2
4. Both 1 and 2
Show Answer
Both 1 and 2
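
The property can be checked on a toy dataset: every subset of an itemset appears in at least as many transactions as the itemset itself, so its support can never be lower. The baskets and the helper name `support` below are hypothetical.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

# Hypothetical market baskets.
transactions = [
    {"bread", "milk", "eggs"},
    {"bread", "milk"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]

frequent = ("bread", "milk", "eggs")
full_support = support(frequent, transactions)
subset_supports = {
    subset: support(subset, transactions)
    for size in range(1, len(frequent))
    for subset in combinations(frequent, size)
}
print(full_support, subset_supports)
```

Apriori-style algorithms use the contrapositive for pruning: if any subset is infrequent, the larger itemset cannot be frequent and need not be counted.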

24. Algorithm is

1. It uses machine-learning techniques. Here program can learn from past


experience and adapt themselves to new situations
2. Computational procedure that takes some value as input and produces some
value as output
3. Science of making machines performs tasks that would require intelligence
when performed by humans
4. None of these
Show Answer
Computational procedure that takes some value as input and produces some value as
output

25. Bias is

1. A class of learning algorithm that tries to find an optimum classification of a


set of examples using the probabilistic theory
2. Any mechanism employed by a learning system to constrain the search space
of a hypothesis
3. An approach to the design of learning algorithms that is inspired by the fact
that when people encounter new situations, they often explain them by
reference to familiar experiences, adapting the explanations to fit the new
situation.
4. None of these
Show Answer
Any mechanism employed by a learning system to constrain the search space of a
hypothesis

26. Classification is

1. A subdivision of a set of examples into a number of classes


2. A measure of the accuracy, of the classification of a concept that is given by a
certain theory
3. The task of assigning a classification to a set of examples
4. None of these
Show Answer
A subdivision of a set of examples into a number of classes

27. Binary attributes are

1. This takes only two values. In general, these values will be 0 and 1 and they
can be coded as one bit
2. The natural environment of a certain species
3. Systems that can be used without knowledge of internal operations
4. None of these
Show Answer
This takes only two values. In general, these values will be 0 and 1 and they can be
coded as one bit

28. Cluster is

1. Group of similar objects that differ significantly from other objects


2. Operations on a database to transform or simplify data in order to prepare it
for a machine-learning algorithm
3. Symbolic representation of facts or ideas from which information can
potentially be extracted
4. None of these
Show Answer
Group of similar objects that differ significantly from other objects

29. A definition of a concept is ______ if it recognizes all the instances of that
concept

1. Complete
2. Consistent
3. Constant
4. None of these
Show Answer
Complete

30. A definition of a concept is _______ if it classifies any examples as coming
within the concept

1. Complete
2. Consistent
3. Constant
4. None of these
Show Answer
Consistent

31. Data selection is

1. The actual discovery phase of a knowledge discovery process


2. The stage of selecting the right data for a KDD process
3. A subject-oriented integrated time variant non-volatile collection of data in
support of management
4. None of these
Show Answer
The stage of selecting the right data for a KDD process

32. Classification task refers to

1. A subdivision of a set of examples into a number of classes


2. A measure of the accuracy, of the classification of a concept that is given by a
certain theory
3. The task of assigning a classification to a set of examples
4. None of these
Show Answer
The task of assigning a classification to a set of examples


Q1) Chart tips in MS-Excel can
1. x Show the formatting of a data label
2. Show the name of a data series
3. Show the value of data point
4. b and c
Q2) What defines the mean
1. the statistical or arithmetic average.
2. x the middlemost score.
3. x the most frequently occurring score.
4. x the best representation for every set of data.
Q3) Classification method in which upper and lower limits of interval is also in
clas
1. x stock-out cost
2. ✓ ordering costs
3. x carrying costs
4. x purchasing costs

Q4) If sample size is greater than or equal to 30 then sample standard
deviation can be approximated to population standard deviatio
1. x known standard deviation
2. ✓ unknown standard deviation
3. x standard interval deviation
4. x population interval theorem
Q5) Type of cumulative frequency distribution in which class intervals are
added in bottom to top order is classified as
1. more than type distribution
2. x marginal distribution
3. x variation distribution
4. x less than type distribution
Q6) Histograms, pie charts and frequency polygons are all types of
1. ✓ one dimension diagrams
2. x two dimension diagrams
3. x cumulative diagrams
Q7) What are general tables of data used to show data in an orderly manner
called
1. x single characteristics tables
2. repository tables
3. x manifold tables
4. x double characteristic table
Q8) While using a pivot table in Excel, you drop the fields of information that
you want in
areas
1. x Report Filter
2. x Column Labels
3. x Row Labels
4. Values
Q9) What chart object in MS-Excel, is horizontal or vertical line that extends
across the plot
1. Category axis
2. x Data marker
ANS: Gridline

Q10) What is valid for a parameter and a statistic associated with repeated
random samples of the same size f
1. x Values of a parameter will vary from sample to sample but values of a
statistic will not.
2. x Values of both a parameter and a statistic may vary from sample to sample.
3. x Values of a parameter will vary according to the sampling distribution for
that parameter.
4. Values of a statistic will vary according to the sampling distribution for that
statistic.
Q11) What is to be used when faced with the decision of how to arrange
furniture in a room
1. x Mathematical model
2. x Mental model
3. x Physical model
4. Visual model
Q12) What are the frequencies of all specific values of x and y variables with
total calculated frequencies classified as
1. x variate frequencies
2. x unconditional frequencies
3. x conditional frequencies
4. marginal frequencies
Q13) What will be the mean of data for a random variable x having
probabilities of (1+r)/3, (1+2r)/3 and (0.2+3r)/3 for values of 1,2
1. x 0.8
2. x 1.3
3. x 1.5
4. 1.8
Q14) What is the method in which a sample statistic is used to estimate the value
of parameters of a population classified as?
1. ✓ estimation
2. x valuation
3. x probability calculation
4. x limited theorem estimation
Q15) Type of cumulative frequency distribution in which class intervals are
added in top to bottom order is classified as
1. x variation distribution
less than type distribution

Q16) Which MS-Excel function converts miles to kilometers, kilograms to
pounds, and so on
1. Convert
2. x Product
3. x Change
4. All of above
Q17) While using a pivot table in Excel, when you drop information into the
"Values" area in the lower-rig
by summation. Which of the following tools will you use to change the default
summarization
1. Value Field Settings
2. x PivotTable Field List
3. x What-If Analysis
4. x Column Labels
Q18) Which term refers to the risk of a type I error in a hypothesis test
1. x Power
2. x Confidence level
3. Level of significance

Q19) Which type of analytics uses statistical and machine learning techniques
1. x Decisive
2. x Descriptive
3. Predictive
4. x Prescriptive
Q20) Three dimensional diagrams are named so because they consider
both
1. x length and breadth
2. x breadth and depth
3. depth, length and breadth
4. x depth and length
Q21) Which type of analytics, supports human decisions with visual analytics
the user models to
1. ✓ Decisive
2. x Descriptive
3. x Predictive
4. x Prescriptive

Q22) What is the value of any sample statistic which is used to estimate
parameter
1. point estimate
2. x population estimate
3. x sample estimate
4. x parameter estimate
Q23) An MS-Excel function inside another function is called a ____ function.
1. Nested
2. Round
3. x Sum
4. x Text
Q24) The essence of decision analysis is
1. breaking down complex situations into manageable elements.
2. ✓ choosing the best course of action among alternatives.
3. x finding the root cause of why something has gone wrong.

Q25) Which type of analytics, gain insight from historical data with reporting,
sca
1. x Decisive
2. Descriptive
3. x Predictive
4. x Prescriptive
Q26) Discrete variables and continuous variables are two types of
1. x open end classification
2. x time series classification
3. x qualitative classification
4. ✓ quantitative classification
Q27) Number of observations are 30 and value of arithmetic mean is 15 then
what is the
1. x 15
2. 450
3. x 200

Q28) What is P(A/B) if the probability of event A is 0.2 and the probability of
event B is 0.4
1. ✓ P(A) = 0.2
2. x P(A)/P(B) = 0.2/0.4 = 1/2
3. x P(A) * P(B) = (0.2)(0.4) = 0.08
4. x None of the above.
Q29) 3-D reference in a formula of MS-Excel
1. Can not be modified
2. Only appears on summary worksheets
3. Limits the formatting options
4. Spans worksheets
Q30) Considering types of diagrams; what are squares, circles and rectangles
classified as
1. x cumulative diagram
2. x dispersion diagrams
3. x one dimension diagrams
4. two dimension diagram

Q31) What is the use of a spreadsheet model


1. x To implement a computer model.
2. Because spreadsheets are convenient.
3. x To analyze decision alternatives.
4. All of these
Q32) What is the class interval when largest value is 60 and smallest value is -
1. x nominal distribution
2. ordinal distribution
3. x chronological distribution
4. x frequency distribution
Q33) What does the stem considered as in stem and leaf display diagrams
used in
1. x central digits
2. x trailing digits
3. leading digits
4. x dispersed digits

Q34) What is the variance for the following discrete data [2 6 8 3 7 9 1 4]?
1. x 40.0
2. x 5.0
3. x 2.74
4. 7.5
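The options in Q34 can be checked with the standard library. This sketch treats the eight listed values as the whole population (dividing by n), which is the reading that reproduces the unmarked option 7.5; the sample variance (dividing by n - 1) is shown for contrast.

```python
from statistics import pstdev, pvariance, variance

data = [2, 6, 8, 3, 7, 9, 1, 4]

population_variance = pvariance(data)   # divides by n
sample_variance = variance(data)        # divides by n - 1
population_sd = pstdev(data)            # square root of the population variance

print(sum(data), sum(data) / len(data))
print(population_variance, round(sample_variance, 2), round(population_sd, 2))
```

The other options correspond to nearby quantities: 40.0 is the sum of the values, 5.0 is their mean, and 2.74 is the population standard deviation.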
Q35) Criteria of selecting point estimator must include information of
1. x consistency
2. x unbiasedness
3. x efficiency
4. ✓ all of above
Q36) What are Upper and lower boundaries of interval of confidence classified
as
1. x error biased limits
2. x marginal limits
3. x estimate limits
4. confidence limits

Q37) What will be the standard deviation for the process having an exponential
distributi
1. x 0.4
2. x 5.0
3. X 12.5
4. 25.0
Q38) Which type of analytics recommends decisions using optimization,
simulation etc.
1. x Decisive
2. x Descriptive
3. x Predictive
4. Prescriptive
Q39) What is the median for the following data -- [2 4 3 6 1 8 9 2 5 7]
1. x 2.0
2. x 4.7
3. 4.5
4. x 10.0
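A quick check of Q39, assuming the data are the ten single-digit values 2, 4, 3, 6, 1, 8, 9, 2, 5, 7. That reading is an assumption, chosen because it reproduces both the unmarked option 4.5 (the median) and the distractor 4.7 (the mean).

```python
from statistics import mean, median

# Assumed reading of the data in Q39: ten single-digit observations.
data = [2, 4, 3, 6, 1, 8, 9, 2, 5, 7]

print(sorted(data))
print(median(data))  # average of the 5th and 6th sorted values
print(mean(data))
```

With an even number of observations the median is the average of the two middle values after sorting, here 4 and 5.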

Q40) If vertical lines are drawn at every point of straight line in frequency
1. x width diagram
2. x length diagram
3. histogram
4. X dimensional bar charts
Q41) Which statement is valid if a fair coin is flipped 10 times
1. x The number of heads will equal the number of tails.
2. The probability of all heads is greater than the probability of all tails.
3. The probability of HHHHHHHHHH = the probability of HTHTHTHTHT.
4. x The probability of HHHHHHHHHH < the probability of HTHTHTHTHT.
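For Q41, any specific sequence of 10 independent fair flips has the same probability, (1/2)^10. The sketch below multiplies the per-flip probabilities using exact fractions; the function name `sequence_probability` is hypothetical.

```python
from fractions import Fraction

def sequence_probability(sequence, p_heads=Fraction(1, 2)):
    """Probability of one exact sequence of independent coin flips."""
    prob = Fraction(1)
    for flip in sequence:
        prob *= p_heads if flip == "H" else 1 - p_heads
    return prob

all_heads = sequence_probability("HHHHHHHHHH")
alternating = sequence_probability("HTHTHTHTHT")
print(all_heads, alternating)
```

The alternating pattern only looks more "typical"; as an exact sequence it is no more likely than ten heads in a row.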
Q42) Types of histograms includes
1. x deviation bar charts
2. x paired bar charts
3. x grouped charts
4. all of above

Q43) What of the following refers to 95% confidence interval of a 20% a
1. 6% to 34%
2. x 8% to 32%
3. x 13% to 27%
4. x 17% to 23%
Q44) If standard deviation of population 1 is 3 with sample size 8, and popu
the standard deviation of sampling distribution?
1. x 4.044
2. x 3.044
3. 1.044
4. x 2.044
Q45) What will be degree of freedom for the error term under one-way ANOVA
h
and 2 and 4 observations at level 3
1. x 3.0
2. x 5.0
3. 11.0

Q46) What percentage of the total population will be present in a normal di


1. 0.47
2. 0.68
3. 0.95
4. x 0.99
Q47) While using a pivot table in Excel, the ____ area is where you will drop a
1. Report Filter
2. x Column Labels
3. x Row Labels
4. x Values
Q48) Which hypothesis test should be used to ascertain improvement of the
wor
1. x 2-sample z test
2. x 2-sample t test
3. ✓ Paired t test
4. x F test

Q49) What is the characteristic of a variable being termed independent


1. The function value depends upon their values.
2. x The decision maker has no control over them.
3. x The variables have no relationship to one another.
4. The variable is described as an output of the spreadsheet model.
Q50) How do you classify Diagrams used to represent grouped and ungrouped
1. x breadth diagrams
2. bar diagrams
3. x width diagrams
4. x length diagrams

1. How many types of BI users are there?


A. 2
B. 3
C. 4
D. 5

View Answer

Ans : C

Explanation: There are four types of BI users: the professional data analyst,
the IT users, the head of the company, and the business users.
2. Which of the following statements is true about Business
Intelligence?
A. BI converts raw data into meaningful information
B. BI has a direct impact on organization's strategic, tactical and
operational business decisions.
C. BI tools perform data analysis and create reports, summaries,
dashboards, maps, graphs, and charts
D. All of the above

View Answer

Ans : D

Explanation: All of the above statements are true.

3. KPI stands for?


A. Key Performance Indicators
B. Key Performance Identifier
C. Key Processes Identifier
D. Key Processes Indicators

View Answer

Ans : A

Explanation: BI involves creating KPIs (Key Performance Indicators) based on
historic data.

4. Which of the following does not form part of BI Stack in SQL
Server?
A. SSRS
B. SSIS
C. SSAS
D. OBIEE
View Answer

Ans : D

Explanation: Oracle Business Intelligence Enterprise Edition Plus, also
termed OBI EE Plus, is Oracle Corporation’s set of business intelligence
tools.

5. _________ is a category of applications and technologies for
presenting and analyzing corporate and external data.
A. MIS
B. DIS
C. EIS
D. CIS

View Answer

Ans : C

Explanation: EIS stands for Enterprise Information System.

6. Which of the following areas are affected by BI?


A. Revenue
B. CRM
C. Sales
D. CPM

View Answer

Ans : B

Explanation: Customer relationship management (CRM) is a system for
managing a company’s interactions with current and future customers.
It often involves using technology to organize, automate and synchronize
sales, marketing, customer service, and technical support.
7. Business intelligence (BI) is a broad category of application
programs which includes _____________
A. Decision support
B. Data Mining
C. OLAP
D. All of the above

View Answer

Ans : D

Explanation: Business intelligence (BI) is a broad category of application
programs and technologies for gathering, storing, analyzing, and
providing access to data from various data sources.

8. __________ is a system where operations like data extraction,
transformation and loading operations are executed.
A. Data staging
B. Data integration
C. ETL
D. None of the above

View Answer

Ans : A

Explanation: A staging area, or landing zone, is an intermediate storage
area used for data processing during the extract, transform and load
process. The data staging area sits between the data source and the data
target, which are often data warehouses, data marts, or other data
repositories.

9. Business intelligence equips enterprises to gain business
advantage from data
A. TRUE
B. FALSE
C. Can be true or false
D. Can not say

View Answer

Ans : A

Explanation: Once an organization is powered with BI, it can anticipate
enhanced turnaround time on data collection and come up with fresh ideas
for novel business initiatives.

10. BI is a category of database software that provides an
interface to help users quickly and interactively scrutinize the
results in a variety of dimensions of the data.
A. TRUE
B. FALSE
C. Can be true or false
D. Can not say

View Answer

Ans : B

Explanation: Online Analytical Processing (OLAP), not BI as a whole, is a
category of database software that provides an interface to help users
quickly and interactively scrutinize the results in a variety of
dimensions of the data.

1. __________ is a subject-oriented, integrated,
time-variant, nonvolatile collection of data
in support of management decisions.
A. Data Mining

B. Data Warehousing
C. Web Mining

D. Text Mining

Discussion

B. Data Warehousing
2. The data Warehouse is __________.
A. Read only

B. Write only

C. Read write only

D. None

Discussion

A. Read only
3. Expansion for DSS in DW is _________.
A. Decision Support system

B. Decision Single System

C. Data Storable System

D. Data Support System

Discussion

A. Decision Support system


4. The important aspect of the data warehouse
environment is that data found within the
data warehouse is __________.
A. subject-oriented

B. time-variant

C. integrated

D. All of the above


Discussion

D. All of the above

5. The time horizon in Data warehouse is
usually__________.
A. 1-2 years

B. 3-4 years

C. 5-6 years

D. 5-10 years

Discussion

D. 5-10 years
6. The data is stored, retrieved & updated
in___________.
A. OLAP

B. OLTP

C. SMTP

D. FTP

Discussion

B. OLTP
7. __________describes the data contained in the
data warehouse.
A. Relational data

B. Operational data

C. Metadata

D. Informational data

Discussion
C. Metadata
8. __________predicts future trends & behaviors,
allowing business managers to make proactive,
knowledge-driven decisions.
A. Data warehouse

B. Data mining

C. Data marts

D. Metadata

Discussion

B. Data mining

9. _________is the heart of the warehouse.


A. Data mining database servers

B. Data warehouse database servers

C. Data mart database servers

D. Relational data base servers

Discussion

B. Data warehouse database servers


10. _________is the specialized data warehouse
database.
A. Oracle

B. DBZ

C. Informix

D. Redbrick

Discussion

D. Redbrick
11. ___________defines the structure of the data
held in operational databases and used
by operational applications.
A. User-level metadata

B. Data warehouse metadata

C. Operational metadata

D. Data mining metadata

Discussion

C. Operational metadata
12. _________is held in the catalog of the
warehouse database system.
A. Application level metadata

B. Algorithmic level metadata

C. Departmental level metadata

D. Core warehouse metadata

Discussion

B. Algorithmic level metadata


13. __________maps the core warehouse metadata
to business concepts, familiar and useful to end
users.
A. Application level metadata

B. User level metadata

C. Enduser level metadata

D. Core level metadata

Discussion
A. Application level metadata
14. ____________ consists of formal definitions,
such as a COBOL layout or a database schema.
A. Classical metadata

B. Transformation metadata

C. Historical metadata

D. Structural metadata

Discussion

A. Classical metadata
15. ________consists of information in the
enterprise that is not in classical form.
A. Mushy metadata

B. Differential metadata

C. Data warehouse

D. Data mining

Discussion

A. Mushy metadata
16. ________ Databases are owned by particular
departments or business groups.
A. Informational

B. Operational

C. Both informational and operational

D. Flat

Discussion

B. Operational
17. The star schema is composed of __________
fact table.
A. one

B. Two

C. Three

D. four

Discussion

A. one
18. The time horizon in operational environment
is ________.
A. 30-60 days

B. 60-90 days

C. 90-120 days

D. 120-150 days

Discussion

B. 60-90 days
19. The key used in operational environment may
not have an element of _________.
A. Time

B. Cost

C. Frequency

D. Quality

Discussion

A. Time
20. Data can be updated in ______ environment.
A. Data warehouse

B. Data mining

C. Operational

D. Informational

Discussion

C. Operational

21. Record cannot be updated in__________.


A. OLTP

B. Files

C. RDBMS

D. data warehouse

Discussion

D. data warehouse
22. The source of all data warehouse data is
the__________.
A. Operational environment

B. Informal environment

C. Formal environment

D. Technology environment

Discussion

A. Operational environment

23. Data warehouse contains ______data that is
never found in the operational environment.
A. Normalized

B. Informational
C. Summary

D. Denormalized

Discussion

C. Summary
24. The modern CASE tools belong
to_________category.
A. Analysis

B. Development

C. Coding

D. Delivery

Discussion

A. Analysis

25. Bill Inmon has estimated___________of the
time required to build a data warehouse is
consumed in the conversion process.
A. 10 percent

B. 20 percent

C. 30 percent

D. 40 percent

Discussion

D. 40 percent

26. Detail data in single fact table is otherwise
known as__________.
A. Monoatomic data
B. Diatomic data

C. Atomic data

D. Multiatomic data

Discussion

C. Atomic data

27. ______test is used in an online transactional
processing environment.
A. MEGA

B. MICRO

C. MACRO

D. ACID

Discussion

D. ACID
28. ______is a good alternative to the star
schema.
A. Star schema

B. Snowflake schema

C. Fact constellation

D. Star-snowflake schema

Discussion

C. Fact constellation
29. The biggest drawback of the level indicator
in the classic star-schema is
that it limits_________.
A. Quantify
B. Qualify

C. Flexibility

D. Ability

Discussion

C. Flexibility

30. A data warehouse is__________.


A. Updated by end users

B. Contains numerous naming conventions and formats

C. Organized around important subject areas

D. Contains only current data

Discussion

C. Organized around important subject areas


31. An operational system is________.
A. Used to run the business in real time and is based on historical data

B. Used to run the business in real time and is based on current data

C. Used to support decision making and is based on current data

D. Used to support decision making and is based on historical data

Discussion

B. Used to run the business in real time and is based on current data
32. The generic two-level data warehouse
architecture includes________.
A. At least one data mart

B. Data that can be extracted from numerous internal and external
sources

C. Near real-time updates


D. Far real-time updates

Discussion

B. Data that can be extracted from numerous internal and external
sources


33. The active data warehouse architecture
includes _________.
A. At least one data mart

B. Data that can be extracted from numerous internal and external
sources

C. Near real-time updates

D. All of the above

Discussion

D. All of the above


34. Reconciled data is_________.
A. Data stored in the various operational systems throughout the
organization

B. Current data intended to be the single source for all decision
support systems

C. Data stored in one operational system in the organization

D. Data that has been selected and formatted for end-user support
applications

Discussion

B. Current data intended to be the single source for all decision
support systems
35. Transient data is___________.
A. Data in which changes to existing records cause the previous version
of the records to be eliminated
B. Data in which changes to existing records do not cause the previous
version of the records to be eliminated

C. Data that are never altered or deleted once they have been added

D. Data that are never deleted once they have been added

Discussion

A. Data in which changes to existing records cause the previous version
of the records to be eliminated
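
The difference between transient data (option A) and periodic data (option B) can be sketched in a few lines of Python. This is only an illustration: the record layout and the function names `update_transient` and `update_periodic` are assumptions made for the example, not part of the question source.

```python
def update_transient(store, key, new_value):
    """Transient data: the new version overwrites (eliminates) the old one."""
    store[key] = new_value
    return store


def update_periodic(store, key, new_value):
    """Periodic data: each version is appended, so earlier versions survive."""
    store.setdefault(key, []).append(new_value)
    return store


# Hypothetical customer record that changes address once.
transient = update_transient({"cust-1": "old address"}, "cust-1", "new address")
periodic = update_periodic({"cust-1": ["old address"]}, "cust-1", "new address")
```

After the update, the transient store holds only the new address, while the periodic store still holds both versions.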
36. The extract process is __________.
A. Capturing all of the data contained in various operational systems

B. Capturing a subset of the data contained in various operational
systems

C. Capturing all of the data contained in various decision support
systems

D. Capturing a subset of the data contained in various decision
support systems

Discussion

B. Capturing a subset of the data contained in various operational
systems
37. Data scrubbing is___________.
A. A process to reject data from the data warehouse and to create the
necessary indexes

B. A process to load the data in the data warehouse and to create the
necessary indexes

C. A process to upgrade the quality of data after it is moved into a
data warehouse

D. A process to upgrade the quality of data before it is moved into a
data warehouse

Discussion

D. A process to upgrade the quality of data before it is moved into a
data warehouse
38. The load and index is_______.
A. A process to reject data from the data warehouse and to create the
necessary indexes

B. A process to load the data in the data warehouse and to create the
necessary indexes

C. A process to upgrade the quality of data after it is moved into a
data warehouse

D. A process to upgrade the quality of data before it is moved into a
data warehouse

Discussion

B. A process to load the data in the data warehouse and to create the
necessary indexes
39. Data transformation includes___________.
A. A process to change data from a detailed level to a summary level

B. A process to change data from a summary level to a detailed level

C. Joining data from one source into various sources of data

D. Separating data from one source into various sources of data

Discussion

A. A process to change data from a detailed level to a summary level
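
As a rough illustration of moving from a detailed level to a summary level, the sketch below rolls daily sales rows up to monthly totals per store. The row layout `(date, store, amount)` and the function name `summarize_by_month` are assumptions chosen for the example.

```python
from collections import defaultdict


def summarize_by_month(rows):
    """Aggregate detail rows (ISO date, store, amount) into monthly totals."""
    totals = defaultdict(float)
    for date, store, amount in rows:
        month = date[:7]  # "YYYY-MM" prefix of an ISO date string
        totals[(month, store)] += amount
    return dict(totals)


# Hypothetical detail-level data.
detail = [
    ("2024-01-03", "S1", 10.0),
    ("2024-01-15", "S1", 5.0),
    ("2024-02-01", "S1", 7.5),
]
summary = summarize_by_month(detail)
```

The summary keeps one row per month and store, which is the kind of lightly summarized data a warehouse typically stores alongside the detail.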


40. ______ is called a multifield
transformation.
A. Converting data from one field into multiple fields

B. Converting data from fields into field


C. Converting data from double fields into multiple fields

D. Converting data from one field to one field

Discussion

A. Converting data from one field into multiple fields
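
A minimal sketch of a multifield transformation, assuming a source field that stores names as "Last, First". The function name `split_full_name` and the target field names are hypothetical.

```python
def split_full_name(full_name):
    """Split one 'Last, First' source field into two target fields."""
    last, _, first = full_name.partition(",")
    return {"first_name": first.strip(), "last_name": last.strip()}


record = split_full_name("Kimball, Ralph")
```

Here one source field yields two target fields, `first_name` and `last_name`.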


41. The type of relationship in star schema
is_________.
A. Many-to-many

B. One-to-one

C. One-to-many

D. Many-to-one

Discussion

C. One-to-many

42. Fact tables are_______.


A. Completely denormalized

B. Partially denormalized

C. Completely normalized

D. Partially normalized

Discussion

C. Completely normalized
43. ____________is the goal of data mining.
A. To explain some observed event or condition

B. To confirm that data exists

C. To analyze data for expected relationships

D. To create a new data warehouse


Discussion

A. To explain some observed event or condition

44. Business Intelligence and data warehousing
is used for________.
A. Forecasting

B. Data Mining

C. Analysis of large volumes of product sales data

D. All of the above

Discussion

D. All of the above


45. The data administration subsystem helps you
perform all of the following, except_________.
A. Backups and recovery

B. Query optimization

C. Security management

D. Create, change, and delete information

Discussion

D. Create, change, and delete information


46. The most common source of change data in
refreshing a data warehouse is__________.
A. Queryable change data

B. Cooperative change data

C. Logged change data

D. Snapshot change data

Discussion
A. Queryable change data
47. ___________are responsible for running
queries and reports against data warehouse
tables.
A. Hardware

B. Software

C. End users

D. Middle ware

Discussion

C. End users

48. Query tool is meant for_________.


A. Data acquisition

B. Information delivery

C. Information exchange

D. Communication

Discussion

A. Data acquisition
49. Classification rules are extracted
from____________.
A. Root node

B. Decision tree

C. Siblings

D. Branches

Discussion

B. Decision tree
50. Dimensionality reduction reduces the data
set size by removing_____________.
A. Relevant attributes

B. Irrelevant attributes

C. Derived attributes

D. Composite attributes

Discussion

B. Irrelevant attributes

51. ___________is a method of incremental
conceptual clustering.
A. CORBA

B. OLAP

C. COBWEB

D. STING

Discussion

C. COBWEB
52. The assumption that the effect of one attribute value on a given class
is independent of the values of the other attributes
is called____________.
A. Value independence

B. Class conditional independence

C. Conditional independence

D. Unconditional independence

Discussion
B. Class conditional independence
53. The main organizational justification for
implementing a data warehouse is
to provide__________.
A. Cheaper ways of handling transportation

B. Decision support

C. Storing large volume of data

D. Access to data

Discussion

B. Decision support

54. Multidimensional database is otherwise known
as___________.
A. RDBMS

B. DBMS

C. EXTENDED RDBMS

D. EXTENDED DBMS

Discussion

B. DBMS

55. Data warehouse architecture is based
on___________.
A. DBMS

B. RDBMS

C. Sybase

D. SQL Server

Discussion
B. RDBMS
56. Source data from the warehouse comes
from__________.
A. ODS

B. TDS

C. MDDB

D. ORDBMS

Discussion

A. ODS
57. ___________is a data transformation process.
A. Comparison

B. Projection

C. Selection

D. Filtering

Discussion

D. Filtering

58. The technology area associated with CRM
is__________.
A. Specialization

B. Generalization

C. Personalization

D. Summarization

Discussion

C. Personalization
59. SMP stands for________.
A. Symmetric Multiprocessor

B. Symmetric Multiprogramming

C. Symmetric Metaprogramming

D. Symmetric Microprogramming

Discussion

A. Symmetric Multiprocessor

60. _____________are designed to overcome any
limitations placed on the warehouse by the nature
of the relational data model.
A. Operational database

B. Relational database

C. Multidimensional database

D. Data repository

Discussion

C. Multidimensional database
61. __________are designed to overcome any
limitations placed on the warehouse by the
nature of the relational data model.
A. Operational database

B. Relational database

C. Multidimensional database

D. Data repository

Discussion

C. Multidimensional database
62. MDDB stands for _______________.
A. Multiple data doubling

B. Multidimensional databases

C. Multiple double dimension

D. Multi-dimension doubling

Discussion

B. Multidimensional databases

63. ____________is data about data.


A. Metadata

B. Microdata

C. Minidata

D. Multidata

Discussion

A. Metadata
64. _____________is an important functional
component of the metadata.
A. Digital directory

B. Repository

C. Information directory

D. Data dictionary

Discussion

C. Information directory

65. EIS stands for____________.


A. Extended interface system

B. Executive interface system

C. Executive information system


D. Extendable information system

Discussion

C. Executive information system


66. ____________is data collected from natural
systems.
A. MRI scan

B. ODS data

C. Statistical data

D. Historical data

Discussion

A. MRI scan

67. __________is an example of application
development environments.
A. Visual Basic

B. Oracle

C. Sybase

D. SQL Server

Discussion

A. Visual Basic
68. The term that is not associated with data
cleaning process is____________.
A. Domain consistency

B. Deduplication

C. Disambiguation

D. Segmentation
Discussion

D. Segmentation

69. __________are some popular OLAP tools.


A. Metacube, Informix

B. Oracle Express, Essbase

C. HOLAP

D. MOLAP

Discussion

A. Metacube, Informix
70. Capability of data mining is to build
__________models.
A. Retrospective

B. Interrogative

C. Predictive

D. Imperative

Discussion

C. Predictive
71. ________is a process of determining the
preference of the majority of customers.
A. Association

B. Preferencing

C. Segmentation

D. Classification

Discussion

B. Preferencing
72. Strategic value of data mining is_______.
A. Cost-sensitive

B. Work-sensitive

C. Time-sensitive

D. Technical-sensitive

Discussion

C. Time-sensitive
73. ___________proposed the approach for data
integration issues.
A. Ralph Campbell

B. Ralph Kimball

C. John Raphlin

D. James Gosling

Discussion

B. Ralph Kimball
74. The terms equality and roll up are associated
with__________.
A. OLAP

B. Visualization

C. Data mart

D. Decision tree

Discussion

C. Data mart
75. Exceptional reporting in data warehousing is
otherwise called as__________.
A. Exception

B. Alerts

C. Errors

D. Bugs

Discussion

B. Alerts

76. _________is a metadata repository.


A. Prism solution directory manager

B. CORBA

C. STUNT

D. COBWEB

Discussion

A. Prism solution directory manager


77. _________is an expensive process in building
an expert system.
A. Analysis

B. Study

C. Design

D. Information collection

Discussion

D. Information collection
78. The full form of KDD is__________.
A. Knowledge database

B. Knowledge discovery in database


C. Knowledge data house

D. Knowledge data definition

Discussion

B. Knowledge discovery in database


79. The first International conference on KDD was
held in the year_________.
A. 1996

B. 1997

C. 1995

D. 1994

Discussion

C. 1995

80. Removing duplicate records is a process
called____________.
A. Recovery

B. Data cleaning

C. Data cleansing

D. Data pruning

Discussion

B. Data cleaning
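
Removing duplicate records can be sketched as follows. The example assumes records are `(name, email)` pairs and that two records count as duplicates when they match after trimming whitespace and ignoring case; both the layout and the function name `deduplicate` are illustrative assumptions.

```python
def deduplicate(records):
    """Keep the first occurrence of each record, comparing normalized keys."""
    seen = set()
    unique = []
    for name, email in records:
        key = (name.strip().lower(), email.strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append((name, email))
    return unique


# Hypothetical input with one near-duplicate of the first record.
raw = [("Ann", "a@x.org"), ("ann ", "A@X.org"), ("Bob", "b@x.org")]
cleaned = deduplicate(raw)
```

Real cleaning tools usually apply fuzzier matching rules; exact matching on a normalized key is only the simplest case.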
81. __________contains information that gives
users an easy-to-understand perspective of
the information stored in the data warehouse.
A. Business metadata

B. Technical metadata
C. Operational metadata

D. Financial metadata

Discussion

A. Business metadata
82. _________helps to integrate, maintain and
view the contents of the data warehousing system.
A. Business directory

B. Information directory

C. Data dictionary

D. Database

Discussion

B. Information directory

83. Discovery of cross-sales opportunities is
called _________.
A. Segmentation

B. Visualization

C. Correction

D. Association

Discussion

D. Association
84. Data marts that incorporate data mining tools
to extract sets of data are called_________.
A. Independent data mart

B. Dependent data marts

C. Intra-entry data mart


D. Inter-entry data mart

Discussion

B. Dependent data marts


85. __________can generate programs itself,
enabling it to carry out new tasks.
A. Automated system

B. Decision making system

C. Self-learning system

D. Productivity system

Discussion

C. Self-learning system

86. The power of self-learning system lies
in___________.
A. Cost

B. Speed

C. Accuracy

D. Simplicity

Discussion

C. Accuracy
87. Building the informational database is done
with the help of ___________.
A. Transformation or propagation tools

B. Transformation tools only

C. Propagation tools only

D. Extraction tools
Discussion

A. Transformation or propagation tools

88. How many components are there in a data
warehouse?
A. Two

B. Three

C. Four

D. Five

Discussion

D. Five
89. Which of the following is not a component of
a data warehouse?
A. Metadata

B. Current detail data

C. Lightly summarized data

D. Component Key

Discussion

D. Component Key
90. ___________is data that is distilled from the
low level of detail found at the current
detailed level.
A. Highly summarized data

B. Lightly summarized data

C. Metadata

D. Older detail data


Discussion

B. Lightly summarized data

91. Highly summarized data is __________.


A. Compact and easily accessible

B. Compact and expensive

C. Compact and hardly accessible

D. Compact

Discussion

A. Compact and easily accessible


92. A directory to help the DSS analyst locate
the contents of the data warehouse is seen
in__________.
A. Current detail data

B. Lightly summarized data

C. Metadata

D. Older detail data

Discussion

C. Metadata
93. Metadata contains at least__________.
A. The structure of the data

B. The algorithms used for summarization

C. The mapping from the operational environment to the data
warehouse

D. All of the above

Discussion
D. All of the above
94. Which of the following is not an old detail
storage medium?
A. Photo optical storage

B. RAID

C. Microfiche

D. Pen drive

Discussion

D. Pen drive
95. The data from the operational environment
enter_____________of data warehouse.
A. Current detail data

B. Older detail data

C. Lightly summarized data

D. Highly summarized data

Discussion

A. Current detail data


96. The data in current detail level resides
till__________event occurs.
A. Purge

B. Summarization

C. Archive

D. All of the above

Discussion

D. All of the above


97. The dimension tables describe the
__________.
A. Entities

B. Facts

C. Keys

D. Units of measures

Discussion

B. Facts
98. The granularity of the fact is the
___________ of detail at which it is recorded.
A. Transformation

B. Summarization

C. Level

D. Tr

Discussion

C. Level
99. Which of the following is not a primary grain
in analytical modeling.
A. Transaction

B. Periodic snapshot

C. Accumulating snapshot

D. All of the above

Discussion

B. Periodic snapshot
100. Granularity is determined by___________.
A. Number of parts to a key

B. Granularity of those parts

C. Both A and B

D. None of the above

Discussion

C. Both A and B

101. ________of data means that the attributes
within a given entity are fully dependent on
the entire primary key of the entity.
A. Additivity

B. Granularity

C. Functional dependency

D. Dimensionality

Discussion

C. Functional dependency
102. A fact is said to be fully additive
if_________.
A. It is additive over every dimension of its dimensionality

B. Additive over at least one but not all of the dimensions

C. Not additive over any dimension

D. None of the above

Discussion

A. It is additive over every dimension of its dimensionality


103. A fact is said to be partially additive
if_______.
A. It is additive over every dimension of its dimensionality

B. Additive over at least one but not all of the dimensions

C. Not additive over any dimension

D. None of the above

Discussion

B. Additive over at least one but not all of the dimensions


104. A fact is said to be non-additive if_______.
A. It is additive over every dimension of its dimensionality

B. Additive over at least one but not all of the dimensions

C. Not additive over any dimension

D. None of the above

Discussion

C. Not additive over any dimension


105. Non-additive measures can often be combined
with additive measures to create new_________.
A. Additive measures

B. Non-additive measures

C. Partially additive

D. All of the above

Discussion

A. Additive measures
106. A fact representing cumulative sales units
over a day at a store for a product is a_________.
A. Additive fact

B. Fully additive fact

C. Partially additive fact

D. Non-additive fact

Discussion

B. Fully additive fact
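
The additivity questions above can be made concrete with a small sketch. It assumes a tiny fact table with a `units` measure (additive) and a `unit_price` measure (non-additive); the column names and values are invented for illustration.

```python
# Hypothetical fact rows at the grain: one row per day, store, and product.
rows = [
    {"day": 1, "store": "S1", "units": 10, "unit_price": 2.0},
    {"day": 1, "store": "S2", "units": 5, "unit_price": 3.0},
    {"day": 2, "store": "S1", "units": 20, "unit_price": 2.0},
]

# Fully additive: summing units over any dimension gives a meaningful total.
total_units = sum(r["units"] for r in rows)

# Non-additive: a sum of unit prices has no business meaning.
sum_of_prices = sum(r["unit_price"] for r in rows)

# Combining the non-additive price with the additive units yields revenue,
# a new measure that is additive again.
revenue = sum(r["units"] * r["unit_price"] for r in rows)
```

Summing `units` gives 35 and summing revenue gives 75.0, both meaningful totals; the 7.0 obtained by summing prices is not.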

107. Which of the following is the other name of
Data mining?
A. Exploratory data analysis

B. Data driven discovery

C. Deductive learning

D. All of the above

Discussion

D. All of the above

108. Which of the following is a predictive
model?
A. Clustering

B. Regression

C. Summarization

D. Association rules

Discussion

B. Regression
109. Which of the following is a descriptive
model?
A. Classification
B. Regression

C. Sequence discovery

D. Association rules

Discussion

C. Sequence discovery

110. A_________model identifies patterns or
relationships.
A. Descriptive

B. Predictive

C. Regression

D. Time series analysis

Discussion

A. Descriptive
111. A predictive model makes use of______.
A. Current data.

B. Historical data.

C. Both current and historical data.

D. Assumptions

Discussion

B. Historical data.

112. ______ maps data into predefined groups.


A. Regression

B. Time series analysis

C. Prediction

D. Classification
Discussion

D. Classification

113. _____ is used to map a data item to a real
valued prediction variable.
A. Regression

B. Time series analysis

C. Prediction

D. Classification

Discussion

A. Regression


114. In _____ , the value of an attribute is
examined as it varies over time.
A. Regression

B. Time series analysis

C. Sequence discovery

D. Prediction

Discussion

B. Time series analysis


115. In ______ the groups are not predefined.
A. Association rules

B. Summarization

C. Clustering

D. Prediction

Discussion

C. Clustering
116. Link Analysis is otherwise called as ____.
A. Affinity analysis

B. Association rules

C. Both A & B

D. Prediction

Discussion

C. Both A & B
117. ______ is the input to KDD.
A. Data

B. Information

C. Query

D. Process

Discussion

A. Data

118. The output of KDD is ______.


A. Data

B. Information

C. Query

D. Useful information

Discussion

D. Useful information

119. The KDD process consists of ____steps.


A. Three

B. Four

C. Five
D. Six

Discussion

C. Five
120. Treating incorrect or missing data is called
as________.
A. Selection

B. Preprocessing

C. Transformation

D. Interpretation

Discussion

B. Preprocessing

121. Converting data from different sources into
a common format for processing is called as____ .
A. Selection

B. Preprocessing

C. Transformation

D. Interpretation

Discussion

C. Transformation
122. Various visualization techniques are used
in_________step of KDD.
A. Selection

B. Transformation

C. Data mining

D. Interpretation
Discussion

D. Interpretation

123. Extreme values that occur infrequently are
called as___________.
A. Outliers

B. Rare values

C. Dimensionality reduction

D. All of the above

Discussion

A. Outliers
124. Box plot and scatter diagram techniques
are_________.
A. Graphical

B. Geometric

C. Icon-based

D. Pixel-based

Discussion

B. Geometric
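
A box plot conventionally flags as outliers the values lying more than 1.5 times the interquartile range (IQR) beyond the quartiles. The sketch below applies that rule using the standard library; the function name `iqr_outliers` and the sample values are assumptions, and the 1.5 multiplier is the usual convention rather than something stated in the question.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements with one extreme value.
outliers = iqr_outliers([10, 12, 11, 13, 12, 11, 10, 95])
```

Only the extreme value 95 falls outside the fences for this sample.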
125. _____ is used to proceed from very specific
knowledge to more general information.
A. Induction

B. Compression

C. Approximation

D. Substitution

Discussion
A. Induction

126. Describing some characteristics of a set of
data by a general model is viewed as___________.
A. Induction.

B. Compression

C. Approximation

D. Summarization

Discussion

B. Compression
127. ______ helps to uncover hidden information
about the data.
A. Induction

B. Compression

C. Approximation

D. Summarization

Discussion

C. Approximation
128. ______ are needed to identify training data
and desired results.
A. Programmers

B. Designers

C. Users

D. Administrators
Discussion

C. Users

129. Over fitting occurs when a model_________.


A. Does fit in future states

B. Does not fit in future states

C. Does fit in current state

D. Does not fit in current state

Discussion

B. Does not fit in future states


130. The problem of dimensionality curse
involves___________.
A. The use of some attributes may interfere with the correct
completion of a data mining task.

B. The use of some attributes may simply increase the overall
complexity.

C. Some may decrease the efficiency of the algorithm.

D. All of the above

Discussion

D. All of the above


131. Incorrect or invalid data is known as
_______.
A. Changing data

B. Noisy data

C. Outliers

D. Missing data
Discussion

B. Noisy data

132. ROI is an acronym of _______.


A. Return on Investment

B. Return on Information

C. Repetition of Information

D. Runtime of Instruction

Discussion

A. Return on Investment
133. The ______of data could result in the
disclosure of information that is deemed to
be confidential.
A. Authorized use

B. Unauthorized use

C. Authenticated use

D. Unauthenticated use

Discussion

B. Unauthorized use
134. _________data are noisy and have many
missing attribute values.
A. Preprocessed

B. Cleaned

C. Real-world

D. Tr

Discussion
C. Real-world
135. The rise of DBMS occurred in early _______.
A. 1950's

B. 1960's

C. 1970's

D. 1980's

Discussion

C. 1970's

136. SQL stand for_________.


A. Standard Query Language

B. Structured Query Language

C. Standard Quick List.

D. Structured Query list

Discussion

B. Structured Query Language


137. Which of the following is not a data mining
metric?
A. Space complexity

B. Time complexity

C. ROI

D. All of the above

Discussion

D. All of the above


138. Reducing the number of attributes to solve
the high dimensionality problem is
called as_____________.
A. Dimensionality curse

B. Dimensionality reduction

C. Cleaning

D. Over fitting

Discussion

B. Dimensionality reduction
139. Data that are not of interest to the data
mining task is called as _____.
A. Missing data

B. Changing data

C. Irrelevant data

D. Noisy data

Discussion

C. Irrelevant data
140. _________are effective tools to attack the
scalability problem.
A. Sampling

B. Parallelization

C. Both A & B

D. None of the above

Discussion

C. Both A & B
141. Market-basket problem was formulated
by____________.
A. Agrawal et al

B. Steve et al

C. Toda et al

D. Simon et al

Discussion

A. Agrawal et al
142. Data mining helps in________.
A. Inventory management

B. Sales promotion strategies

C. Marketing strategies

D. All of the above

Discussion

D. All of the above


143. The proportion of transaction supporting X
in T is called_____________.
A. Confidence

B. Support

C. Support count

D. All of the above

Discussion

B. Support
144. The absolute number of transactions
supporting X in T is called _______.
A. Confidence

B. Support

C. Support count

D. None of the above

Discussion

C. Support count

145. The value that says that transactions in D
that support X also support Y is
called__________.
A. Confidence

B. Support

C. Support count

D. None of the above

Discussion

A. Confidence
146. If T consists of 500000 transactions, 20000
transactions contain bread, 30000 transactions
contain jam, 10000 transactions contain both
bread and jam. Then the support of bread and jam
is_________.
A. 2%

B. 20%

C. 3%

D. 30%

Discussion
A. 2%
147. If T consists of 500000 transactions, 20000
transactions contain bread, 30000 transactions
contain jam, 10000 transactions contain both
bread and jam. Then the confidence of buying
bread with jam is____________.
A. 33.33%

B. 66.66%

C. 45%

D. 50%

Discussion

D. 50%
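
The arithmetic behind questions 146 and 147 can be checked with a short sketch. The variable names are illustrative. Note that the 50% answer corresponds to reading the rule as bread => jam (bread is the antecedent); reading it as jam => bread would give about 33.33% instead.

```python
# Counts taken from the question text.
total = 500_000
bread = 20_000
jam = 30_000
both = 10_000

# Support of {bread, jam}: share of all transactions containing both items.
support_both = both / total

# Confidence of bread => jam: of the transactions with bread, the share
# that also contain jam.
conf_bread_to_jam = both / bread

# Confidence of jam => bread, shown for comparison.
conf_jam_to_bread = both / jam

print(f"support = {support_both:.2%}")
print(f"confidence(bread => jam) = {conf_bread_to_jam:.2%}")
print(f"confidence(jam => bread) = {conf_jam_to_bread:.2%}")
```

The support count of {bread, jam} is the raw 10000, while the support is that count divided by the number of transactions.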
148. The left hand side of an association rule
is called________.
A. Consequent

B. Onset

C. Antecedent

D. Precedent

Discussion

C. Antecedent
149. The right hand side of an association rule
is called__________.
A. Consequent

B. Onset

C. Antecedent

D. Precedent
Discussion

A. Consequent

150. Which of the following is not a desirable
feature of any efficient algorithm?
A. To reduce number of input operation

B. To reduce number of output operations

C. To be efficient in computing

D. To have maximal code length

Discussion

D. To have maximal code length


151. All set of items whose support is greater
than the user-specified minimum support are
called as_____________
A. Border set

B. Frequent set

C. Maximal frequent set

D. Lattice

Discussion

B. Frequent set

152. If a set is a frequent set and no superset
of this set is a frequent set, then it
is called____________
A. Maximal frequent set

B. Border set

C. Lattice

D. Infrequent sets
Discussion

A. Maximal frequent set
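
To make the definitions of frequent set and maximal frequent set concrete, here is a brute-force sketch that enumerates every itemset. It assumes that "frequent" means a support count of at least `min_count` (some texts use a strict inequality), and the function names and sample transactions are invented for illustration. It is not an efficient algorithm.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Map each itemset with support count >= min_count to that count."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            cset = frozenset(candidate)
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_count:
                result[cset] = count
    return result


def maximal_frequent(frequent):
    """Frequent itemsets that have no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


# Hypothetical transaction database.
T = [
    {"bread", "jam"},
    {"bread", "jam", "milk"},
    {"bread", "milk"},
    {"jam"},
]
freq = frequent_itemsets(T, min_count=2)
maximal = maximal_frequent(freq)
```

With a minimum count of 2, the frequent sets are the three single items plus {bread, jam} and {bread, milk}; only the two pairs are maximal.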

153. Any subset of a frequent set is a frequent
set. This is_________
A. Upward closure property

B. Downward closure property

C. Maximal frequent set

D. Border set

Discussion

B. Downward closure property


154. Any superset of an infrequent set is an
infrequent set. This is ___________
A. Maximal frequent set

B. Border set

C. Upward closure property

D. Downward closure property

Discussion

C. Upward closure property


155. If an itemset is not a frequent set and no
superset of this is a frequent set, then it is
A. Maximal frequent set

B. Border set

C. Upward closure property

D. Downward closure property

Discussion
B. Border set
156. A priori algorithm is otherwise called
as_________
A. Width-wise algorithm

B. Level-wise algorithm

C. Pincer-search algorithm

D. FP growth algorithm

Discussion

B. Level-wise algorithm
157. The A Priori algorithm is a____________
A. Top-down search

B. Breadth first search

C. Depth first search

D. Bottom-up search

Discussion

D. Bottom-up search

158. The first phase of A Priori algorithm
is___________
A. Candidate generation

B. Itemset generation

C. Pruning

D. Partitioning

Discussion

A. Candidate generation
159. The second phase of A Priori algorithm
is____________
A. Candidate generation

B. Itemset generation

C. Pruning

D. Partitioning

Discussion

C. Pruning
160. The ________ step eliminates the extensions of
(k-1)-itemsets that are not found to be frequent
from being considered for counting support.
A. Candidate generation

B. Pruning

C. Partitioning

D. Itemset eliminations

Discussion

B. Pruning
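
The two phases asked about in questions 158 to 160 (candidate generation, then pruning) can be sketched as follows. The function name `apriori_gen`, the simplified join (the union of any two frequent (k-1)-itemsets that yields a k-itemset), and the sample input are assumptions for illustration, not the textbook's exact procedure.

```python
from itertools import combinations


def apriori_gen(frequent_prev, k):
    """Build candidate k-itemsets from frequent (k-1)-itemsets, then prune."""
    prev = set(frequent_prev)
    # Candidate generation: union pairs of (k-1)-itemsets that give a k-itemset.
    joined = {a | b for a in prev for b in prev if len(a | b) == k}
    # Pruning: drop any candidate that has an infrequent (k-1)-subset,
    # so its support never has to be counted against the database.
    pruned = {
        c
        for c in joined
        if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
    }
    return joined, pruned


# Hypothetical frequent 2-itemsets from an earlier pass.
L2 = {frozenset(p) for p in [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]}

if __name__ == "__main__":
    joined, pruned = apriori_gen(L2, 3)
    print("candidates:", sorted(sorted(c) for c in joined))
    print("after pruning:", sorted(sorted(c) for c in pruned))
```

Here {a, b, d} and {b, c, d} are discarded because {a, d} and {c, d} are not frequent, leaving only {a, b, c} to be counted.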
161. The a priori frequent itemset discovery
algorithm moves ________ in the lattice
A. Upward

B. Downward

C. Breadthwise

D. Both upward and downward

Discussion

A. Upward
162. After the pruning step of the a priori
algorithm, __________ will remain
A. Only candidate set

B. No candidate set

C. Only border set

D. No border set

Discussion

B. No candidate set
163. The number of iterations in a priori ________
A. Increases with the size of the maximum frequent set

B. Decreases with increase in size of the maximum frequent set

C. Increases with the size of the data

D. Decreases with the increase in size of the data

Discussion

A. Increases with the size of the maximum frequent set


164. MFCS is the acronym of____________
A. Maximum Frequency Control Set

B. Minimal Frequency Control Set

C. Maximal Frequent Candidate Set

D. Minimal Frequent Candidate Set

Discussion

C. Maximal Frequent Candidate Set


165. The Dynamic Itemset Counting algorithm was
proposed by ________
A. Brin et al.

B. Agrawal et al.

C. Toda et al.

D. Simon et al.

Discussion

A. Brin et al.

166. Itemsets in the ________ category of structures have
a counter and the stop number with them
A. Dashed

B. Circle

C. Box

D. Solid

Discussion

A. Dashed
167. The itemsets in the _________ category of
structures are not subjected to any counting
A. Dashed

B. Box

C. Solid

D. Circle

Discussion

C. Solid
168. Certain itemsets in the dashed circle whose
support count reaches the support value during an
iteration move into the ______________
A. Dashed box
B. Solid circle

C. Solid box

D. None of the above

Discussion

A. Dashed box

169. Certain itemsets enter afresh into the
system and get into the ________; these are essentially
the supersets of the itemsets that move from the
dashed circle to the dashed box
A. Dashed box

B. Solid circle

C. Solid box

D. Dashed circle

Discussion

D. Dashed circle

170. The itemsets that have completed one full
pass move from the dashed circle to the ________
A. Dashed box

B. Solid circle

C. Solid box

D. None of the above

Discussion

B. Solid circle
171. The FP-growth algorithm has ________ phases
A. One
B. Two

C. Three

D. Four

Discussion

B. Two

172. A frequent pattern tree is a tree structure
consisting of ________
A. An item-prefix-tree

B. A frequent-item-header table

C. A frequent-item-node

D. Both A & B

Discussion

D. Both A & B
173. The non-root node of the item-prefix-tree
consists of ________ fields
A. Two

B. Three

C. Four

D. Five

Discussion

B. Three
174. The frequent-item-header table consists of
________ fields
A. Only one.

B. Two.
C. Three.

D. Four

Discussion

B. Two.
175. The paths from root node to the nodes
labelled 'a' are called_________
A. Transformed prefix path

B. Suffix subpath

C. Transformed suffix path

D. Prefix subpath

Discussion

D. Prefix subpath

176. The transformed prefix paths of a node 'a'
form a truncated database of patterns that
co-occur with 'a'. This is called the ________
A. Suffix path

B. FP-tree

C. Conditional pattern base

D. Prefix path

Discussion

C. Conditional pattern base

177. The goal of ________ is to discover both the
dense and sparse regions of a data set
A. Association rule

B. Classification
C. Clustering

D. Genetic Algorithm

Discussion

C. Clustering
178. Which of the following is a clustering
algorithm?
A. A priori

B. CLARA

C. Pincer-Search

D. FP-growth

Discussion

B. CLARA

179. ________ clustering techniques start with as many
clusters as there are records, with each cluster
having only one record
A. Agglomerative

B. Divisive

C. Partition

D. Numeric

Discussion

A. Agglomerative

180. ________ clustering techniques start with all
records in one cluster and then try to split that cluster
A. Agglomerative.

B. Divisive.
C. Partition.

D. Numeric

Discussion

B. Divisive.
181. Which of the following is a data set in the
popular UCI machine-learning repository?
A. CLARA.

B. CACTUS.

C. STIRR.

D. MUSHROOM

Discussion

D. MUSHROOM

182. In the ________ algorithm, each cluster is represented by
the center of gravity of the cluster
A. K-medoid

B. K-means

C. Stirr

D. Rock

Discussion

B. K-means
183. In ________, each cluster is represented by one of the
objects of the cluster located near the center
A. K-medoid

B. K-means

C. Stirr
D. Rock

Discussion

A. K-medoid
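
The difference between the two representatives in questions 182 and 183 can be shown on one cluster. This is a minimal sketch: the points and function names are hypothetical, and Euclidean distance is an assumed choice.

```python
import math


def centroid(points):
    """Center of gravity: the coordinate-wise mean (the k-means representative)."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))


def medoid(points):
    """Actual member with the smallest total distance to all other members
    (the k-medoid representative)."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))


# Hypothetical cluster with one far-away point.
CLUSTER = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0), (15.0, 15.0)]

if __name__ == "__main__":
    print("centroid:", centroid(CLUSTER))
    print("medoid:", medoid(CLUSTER))
```

The centroid is pulled towards the far-away point and need not coincide with any record, whereas the medoid is always one of the cluster's own objects.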
184. Pick out a k-medoid algorithm
A. DBSCAN

B. BIRCH

C. PAM

D. CURE

Discussion

C. PAM
185. Pick out a hierarchical clustering
algorithm
A. DBSCAN

B. BIRCH

C. PAM

D. CURE

Discussion

B. BIRCH (CURE is also hierarchical; DBSCAN is density-based)

186. CLARANS stands for


A. CLARA Net Server

B. Clustering Large Application Range Network Search

C. Clustering Large Applications based on Randomized Search

D. Clustering Application Randomized Search

Discussion

C. Clustering Large Applications based on Randomized Search


187. BIRCH is a________
A. Agglomerative clustering algorithm

B. Hierarchical algorithm

C. Hierarchical-agglomerative algorithm

D. Divisive

Discussion

C. Hierarchical-agglomerative algorithm
188. The cluster features of different
subclusters are maintained in a tree
called_________
A. CF tree

B. FP tree

C. FP growth tree

D. B tree

Discussion

A. CF tree
189. The _______ algorithm is based on the
observation that the frequent sets are normally
very few in number compared to the set of all
itemsets
A. A priori

B. Clustering

C. Association rule

D. Partition

Discussion
D. Partition
190. The partition algorithm uses ________ scans of the
database to discover all frequent sets
A. Two

B. Four

C. Six

D. Eight

Discussion

A. Two
191. The basic idea of the Apriori algorithm is
to generate _____ itemsets of a particular size
and then scan the database
A. Candidate

B. Primary

C. Secondary

D. Superkey

Discussion

A. Candidate

192. ________ is the most well-known association rule
algorithm and is used in most commercial products
A. Apriori algorithm

B. Partition algorithm

C. Distributed algorithm

D. Pincer-search algorithm

Discussion
A. Apriori algorithm
193. An algorithm called ________ is used to
generate the candidate itemsets for each pass
after the first
A. Apriori

B. Apriori-gen

C. Sampling

D. Partition

Discussion

B. Apriori-gen

194. The basic partition algorithm reduces the
number of database scans to __________ and
divides the database into partitions
A. One

B. Two

C. Three

D. Four

Discussion

B. Two
195. ________ and prediction may be viewed as types of
classification
A. Decision.

B. Verification.

C. Estimation.

D. Illustration
Discussion

C. Estimation.

196. ________ can be thought of as classifying an
attribute value into one of a set of possible
classes
A. Estimation.

B. Prediction.

C. Identification.

D. Clarification

Discussion

B. Prediction.
197. Prediction can be viewed as forecasting a
________ value
A. Non-continuous.

B. Constant.

C. Continuous.

D. variable

Discussion

C. Continuous.

198. ________ data consists of sample input data as well
as the classification assignment for the data
A. Missing.

B. Measuring.

C. Non-training.

D. Training
Discussion

D. Training

199. Rule-based classification algorithms
generate _________ rules to perform the
classification
A. If-then

B. While

C. Do while

D. Switch

Discussion

A. If-then
200. ________ are a different paradigm for computing, which
draws its inspiration from neuroscience
A. Computer networks

B. Neural networks

C. Mobile networks

D. Artificial networks

Discussion

B. Neural networks

Expense list

Q1) Which type of analytics gains insight from historical data with reporting,
scorecards, clustering
1. Decisive
2. Descriptive
3. x Predictive
4. Prescriptive
Q2) What is to be used when faced with the decision of how to arrange
furniture in a room
1. x Mathematical model
2. x Mental model
3. x Physical model
4. Visual model
Q3) Which type of analytics supports human decisions with visual analytics
the user models to reflect re
1. Decisive
2. x Descriptive
Q4) What is the characteristic of the best models
1. accurately reflect relevant characteristics of the real-world object or decision.
2. x are mathematical models.
3. x replicate all aspects of the real-world object or decision.
4. x replicate the characteristics of a component in isolation from the rest of the
system.
Q5) Which type of analytics uses statistical and machine learning techniques
1. x Decisive
2. x Descriptive
3. Predictive
4. Prescriptive
Q6) The essence of decision analysis is
1. x breaking down complex situations into manageable elements.
2. choosing the best course of action among alternatives.

Q7) Which type of analytics recommends decisions using optimization,
simulation etc.
1. x Decisive
2. x Descriptive
3. x Predictive
4. Prescriptive

1. Information can be converted into knowledge about ___ patterns and
future trends.
Ans: Historical

2. Data about data is called ___.


Ans: Metadata

3. Facts, numbers, or text is called ___.


Ans: Data

4. ___ and ___ are the key to emerging Business Intelligence technologies.
Ans: Data warehouse and data mining
5. Data mining is also called ___.
Ans: Knowledge discovery

6. Online Analytical Processing (OLAP) is a technology that is used to create
___ software.
Ans: Decision support

7. OLAP Supports ___ user access and multiple queries.


Ans: Multiple

8. Statistical techniques are incorporated into data mining methods.
(True/False).
Ans: True

9. ___ optimization techniques are based on the concepts of genetic
combination, mutation, and natural selection.
Ans: Genetic algorithms

10. What is Mineset?


Ans: MineSet is software that provides tools for searching, sorting, filtering
and drilling down enabling previously complex data models to be viewed
intuitively through real-time 3-D graphical representation.

11. A data warehouse refers to a database that is maintained separately from
an organization’s operational databases. (True/False)
Ans: True

12. A data warehouse is usually constructed by integrating multiple
heterogeneous sources. (True/False)
Ans: True

13. ___ system is customer-oriented and is used for transaction and query
processing by clerks, clients, and information technology professionals.
Ans: OLTP

14. A ___ allows data to be modelled and viewed in multiple Dimensions.


Ans: Data cube

15. In ___ schema some dimension tables are normalized, thereby further
splitting the data into additional tables.
Ans: Snowflake

16. The ___ data model is commonly used in the design of relational
databases.
Ans: Entity-relationship

17. Data warehouses and OLAP tools are based on ___ data model.
Ans: Multidimensional
18. The ___ exposes the information being captured, stored, and managed by
operational systems.
Ans: Data source view

19. ___ are the intermediate servers that stand in between a relational
back-end server and client front-end tools.
Ans: Relational OLAP (ROLAP) servers

20. A ___ is a set of views over operational databases.


Ans: Virtual warehouse

21. The ___ software gives the user the opportunity to look at the data from a
variety of different dimensions.
Ans: Multidimensional Analysis

22. Which of the following statements defines Business Intelligence?


A. Converting data into knowledge and making it available throughout the
organization
B. Analytical software and solutions for gathering, consolidating, analyzing
and providing access to information in a way that is supposed to let the users
of an enterprise make better business decisions.
C. Both A & B
Ans: C. Both A & B

23. Based on the overall requirements of business intelligence, the ___ layer
is required to extract, cleanse and transform data into load files for the
information warehouse.
Ans: Data integration

24. Data Mining is not a business solution; it is just a technology. (True/False)


Ans: True

25. ___ is a random error or variance in measured variables.


Ans: Noise

26. State true or false


I. BI applications can also help managers to be better informed about actions
that a company’s competitors are taking
II. BI can help companies share selected strategic information with business
partners.
III. “BI 2.0” is used to describe the acquisition, provision and analysis of
“real-time” data
A. i-T, ii-F, iii-F
B. i-T, ii-T, iii-F
C. i-T, ii-F, iii-T
D. i-T, ii-T, iii-T
Ans: D.
27. ___ routines attempt to fill in missing values, smooth out noise while
identifying outliers, and correct inconsistencies in the data.
Ans: Data cleaning

28. ___ is used to refer to systems and technologies that provide the business
with the means for decision-makers to extract personalized meaningful
information about their business and industry.
Ans: Business Intelligence

29. In ___ each value in a bin is replaced by the mean value of the bin.
Ans: Smoothing by bin means
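
Smoothing by bin means can be sketched in a few lines. The sample values, the bin size of 3, and the function name are illustrative assumptions; the sketch uses equal-depth bins over sorted data.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into equal-depth bins of `bin_size`,
    and replace every value in a bin by that bin's mean."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        mean = sum(bin_values) / len(bin_values)
        smoothed.extend([mean] * len(bin_values))
    return smoothed


# Hypothetical sample data.
PRICES = [4, 8, 15, 21, 21, 24, 25, 28, 34]

if __name__ == "__main__":
    print(smooth_by_bin_means(PRICES, 3))
```

The three bins (4, 8, 15), (21, 21, 24) and (25, 28, 34) are replaced by their means 9, 22 and 29.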

30. ___ regression involves finding the “best” line to fit two variables so that
one variable can be used to predict the other.
Ans: Linear
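
Finding the "best" line for two variables is usually done by least squares. The sketch below assumes that criterion; the data and the function name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for two variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical data lying exactly on y = 2x + 1.
    slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(slope, intercept)
    # Use the fitted line to predict y for a new x.
    print(slope * 5 + intercept)
```

Once the slope and intercept are known, one variable can be predicted from the other, which is the point the question makes.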

31. ___ works to remove the noise from the data that includes techniques like
binning, clustering, and regression.
Ans: Smoothing

32. Redundancies can be detected by correlation analysis. (True/False)


Ans: True

33. The ___ technique uses encoding mechanisms to reduce the data set size.
Ans: Data compression

34. In which data reduction strategy are redundant attributes detected?
A. Data cube aggregation
B. Numerosity reduction
C. Data compression
D. Dimension reduction
Ans: D. Dimension reduction

35. ___ hierarchies can be used to reduce the data by collecting and replacing
low-level concepts by higher-level concepts.
Ans: Concept

36. The ___ rule can be used to segment numeric data into relatively uniform,
“natural” intervals.
Ans: 3-4-5

37. Oracle, SQL/Server, DB2 are examples for ___.


Ans: DBMS

38. A Database Management System (DBMS) supports query languages.
(True/False)
Ans: True
39. ___ discovery finds all sets of items (itemsets) whose support is
greater than the user-specified minimum support, σ.
Ans: Frequent set

40. An itemset is a ___ if it is a frequent set and no superset of it is a
frequent set.
Ans: Maximal frequent set

41. ___ techniques are used to detect relationships or associations between
specific values of categorical variables in large data sets.
Ans: Association rule mining

42. A Decision Tree is a ___ model.


Ans: Predictive model

43. Using a decision tree, only categorical variables would be modelled.
(True/False).
Ans: False

44. Clustering is an unsupervised learning method (True/False).
Ans: True

45. Neural networks are made up of many ___.


Ans: Artificial neurons

46. For a given transaction database T, a ___ is an expression of the form
X => Y, where X and Y are subsets of A, and X => Y holds with confidence τ if
τ% of the transactions in T that support X also support Y.
Ans: Association rule
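
The confidence in question 46 is the share of transactions supporting X that also support Y. A minimal sketch follows; the transactions and function names are hypothetical.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions)


def confidence(x, y, transactions):
    """Confidence of the rule X => Y: count(X and Y together) / count(X)."""
    count_x = support_count(x, transactions)
    if count_x == 0:
        raise ValueError("the antecedent never occurs in the transactions")
    return support_count(set(x) | set(y), transactions) / count_x


# Hypothetical toy transaction database.
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]

if __name__ == "__main__":
    # Four transactions contain a; three of those also contain b.
    print(confidence({"a"}, {"b"}, TRANSACTIONS))
```

Here the rule {a} => {b} holds with 75% confidence.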

47. The ___ rule describes associations between quantitative items or
attributes.
Ans: Quantitative association

48. The ___ step eliminates the extensions of (k-1)-itemsets that are not
found to be frequent from being considered for counting support.
Ans: Pruning

49. In the first phase of the Partition algorithm, the algorithm logically divides
the database into a number of ___.
Ans: Non-overlapping partitions

50. The a priori algorithm operates in a ___ and ___.


Ans: bottom-up, breadth-first search method.

51. The ___ algorithm works like a train running over the data, with stops at
intervals M between transactions. When the train reaches the end of the
transaction file, it has completed one pass.
Ans: DIC Algorithm
52. The FP-tree growth algorithm can be implemented in ___ phases.
Ans: Two

53. FP-tree stands for ___.


Ans: Frequent pattern tree

54. Data mining systems should provide capabilities to mine association rules
at multiple levels of abstraction and traverse easily among different
abstraction spaces (True/False).
Ans: True

55. Which of the following are alternative search strategies for mining
multiple-level associations with reduced support?
a) Level-by-level independent
b) Level cross-filtering by single item
c) Level cross-filtering by k-itemset
d) All the above
Ans: d) All the above

56. Which of the following is NOT a common binning strategy?


a) Equiwidth binning
b) Equidepth binning
c) Homogeneity-based binning
d) Equilength binning
Ans: d) Equilength binning

57. Association rules that involve two or more dimension or predicates can be
referred to as ___.
Ans: Multidimensional association rules.

58. An algorithm that performs a series of “walks” through itemset space is
called a ___.
Ans: Random walk algorithm.

59. What are knowledge type constraints?


Ans: They specify the type of knowledge to be mined.

60. A standard measure of within-cluster similarity is ___.


Ans: variance

61. The process of grouping a set of physical or abstract objects into classes of
similar objects is called ___.
Ans: Clustering

62. Clustering may also be considered as ___.


Ans: Segmentation

63. Clustering is also called:


a. Segmentation
b. Compression
c. Partitions with similar objects
d. All the above
Ans: d. All the above

64. Clustering is used only in data mining (True/False).
Ans: False

65. Clustering is a form of learning by observation rather than ___.


Ans: By example

66. Weight and height of an individual fall into ___ kind of variables.
Ans: Continuous

67. In the K-means algorithm for partitioning, each cluster is represented by
the ___ of objects in the cluster.
Ans: Means

68. K-means clustering requires prior knowledge of the number of clusters
as its input. (True/False).
Ans: True

69. One form of unsupervised learning is ___.


Ans: Clustering

70. ___ software provides a set of partitioned clustering algorithms that treat
the clustering problem as an optimization process.
Ans: CLUTO

71. Data classification is a ___ step process.


Ans: Two

72. ___ can be viewed as the construction and use of a model to assess the
class of an unlabeled sample, or to assess the value or value ranges of an
attribute that a given sample is likely to have.
Ans: Prediction

73. ___ of data removes or reduces noise (by applying smoothing techniques)
and the treatment of missing values.
Ans: Pre-processing

74. ___ method refers to the ability to construct the model efficiently given a
large amount of data.
Ans: Scalability

75. What is a decision tree?


Ans: A decision tree is a flowchart-like tree structure, where each internal node
denotes a test on an attribute, each branch represents an outcome of the test,
and leaf nodes represent classes or class distributions.
76. The basic algorithm for decision tree induction is a ___ algorithm.
Ans: greedy

77. The ___ measure is used to select the test attribute at each node in the
tree.
Ans: information gain
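
Information gain is the reduction in class entropy obtained by splitting on an attribute. The sketch below uses hypothetical rows, attribute names, and function names.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, attribute, target):
    """Entropy of the target before the split minus the weighted
    entropy of the target after splitting on `attribute`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical training rows.
ROWS = [
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "rain", "windy": "yes", "play": "yes"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
]

if __name__ == "__main__":
    for attribute in ("outlook", "windy"):
        print(attribute, information_gain(ROWS, attribute, "play"))
```

A greedy tree builder would test "outlook" at the root here, since it separates the classes completely while "windy" tells nothing about them.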

78. A user session is a ___ record spanning the entire Web.


Ans: Clickstream record

79. ___ are simple text files that are automatically generated every time
someone accesses a website.
Ans: Log files

80. ___ files are frequently used in sequential mining.


Ans: Web log files

81. ___ is used to examine the structure of a particular website and collate
and analyze related data.
Ans: Structural mining

82. Which of the following techniques is concerned with how users navigate
and access a website?
a. Web structural mining
b. Web usage mining
c. Web content mining
d. Web data definition mining
Ans: b. Web usage mining

83. Web data is ___.


a. Structured data
b. Un-structured data
c. Only text data
d. Binary data
Ans: b. Un-structured data

84. The ___ to Web mining involves the development of sophisticated artificial
intelligence systems.
Ans: an agent-based approach

85. The ___ approaches to Web mining have generally focused on techniques
for integrating and organizing the heterogeneous and semi-structured data on
the Web into more structured and high-level collections of resources.
Ans: database

86. Association rules involving multimedia objects can be mined in ___ and
___ databases.
Ans: Image and video
87. In ___ approach, the signature of an image includes color histograms
based on the color composition of an image regardless of its scale or
orientation.
Ans: Color histogram-based signature

88. Which of the following are the measures of the text retrieval documents?
a. Precision
b. Recall
c. F-score
d. a,b,c
Ans: d. a,b,c
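
The three retrieval measures in question 88 can be computed from the set of retrieved documents and the set of relevant documents. The document identifiers and the function name are hypothetical; the F-score shown is the balanced F1 variant.

```python
def precision_recall_f1(retrieved, relevant):
    """Precision, recall and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical result list and relevance judgements.
    retrieved = {"d1", "d2", "d3", "d4"}
    relevant = {"d1", "d2", "d5"}
    print(precision_recall_f1(retrieved, relevant))
```

Two of the four retrieved documents are relevant (precision 1/2), and two of the three relevant documents were retrieved (recall 2/3); F1 is the harmonic mean of the two.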

89. Data stored in most text databases are ___.


Ans: Semi-structured

90. Which of the following is the first step in text retrieval systems?
a. Stemming
b. Term words finding
c. Tokenization
d. Replacing the null data with keywords
Ans: c. Tokenization

91. Which of the following are the stop words?


a. A
b. The
c. of
d. a,b,c
Ans: d. a,b,c
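
Tokenization followed by stop-word removal, as asked in questions 90 and 91, can be sketched as follows. The regular expression, the tiny stop list, and the function names are illustrative assumptions; real systems use much longer stop lists.

```python
import re

# Tiny stop list taken from the options in question 91.
STOP_WORDS = {"a", "the", "of"}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens that carry little meaning for retrieval."""
    return [token for token in tokens if token not in stop_words]


if __name__ == "__main__":
    tokens = tokenize("The mining of a data warehouse")
    print(tokens)
    print(remove_stop_words(tokens))
```

Stemming would normally follow these two steps; it is left out to keep the sketch short.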

92. Text databases are also called ___.


Ans: Document databases

93. Insurance and direct mail are two industries that rely on ___ to make
profitable business decisions.
Ans: data analysis

94. To aid decision-making, analysts construct ___ models using warehouse
data to predict the outcomes of a variety of decision alternatives.
Ans: predictive

95. A ___ profile is a model that predicts the future purchasing behaviour of
an individual customer, given historical transaction data for both the
individual and for the larger population of all of a particular company’s
customers.
Ans: predictive

96. Data mining can be used to help predict future patient behaviour and to
improve treatment programs (True/False).
Ans: True
98. Data mining in the telecommunication industry helps to understand the
business involved, identify telecommunication patterns (True/False).
Ans: True

99. GDP stands for ___.


Ans: gross domestic product

100. ___ is proving to be a critical link between theory, simulation, and
experiment.
Ans: data-intensive computing

101. IDS are based on ___ that are developed by the manual encoding of
expert knowledge.
Ans: Handcrafted signatures

102. Choose the correct option.


Data mining can be used to improve ___.
a) Efficiency
b) Quality of data
c) Marketing
d) All the above
Ans: D. All the above.

103. To improve accuracy, data mining programs are used to analyze audit
data and extract features that can distinguish normal activities from
intrusions. (True/False)
Ans: True

104. Data mining-based IDSs (especially anomaly detection systems) have
higher false-positive rates than traditional handcrafted signature-based
methods. (True/False)
Ans: True

105. ___ is a new class of intrusion detection algorithms that do not rely on
labelled data.
Ans: Unsupervised anomaly detection

106. ___ algorithm uses the frequency distribution of each feature’s values to
proportionally generate a sufficient amount of anomalies.
Ans: Distribution Based Artificial Anomaly

107. OLAP typically includes the following kinds of analyses: simple,
comparison, trend, ___ and ___.
Ans: Variance and ranking

108. Patient Rule Induction Method (PRIM) and Weighted Item Sets (WIS) are
types of ___ technique.
Ans: Association rule
109. ___ tools cannot discover high average regions or find new patterns in
data.
Ans: OLAP

110. The ___ method is useful for finding patterns or associations between
attributes.
Ans: WIS

1. In the research process, the management question has the following critical activities in
sequence.

a. Origin, selection, statement, exploration and refinement
b. Origin, statement, selection, exploration and refinement
c. Origin, exploration, selection, refinement, and statement
d. Origin, exploration, refinement, selection and statement

2. The chapter that details the way in which the research was conducted is the _________
chapter

a. Introduction
b. Literature review
c. Research methodology
d. Data analysis
e. Conclusion and recommendations

3. Business research has an inherent value to the extent that it helps management make
better decisions. Interesting information about consumers, employees, or competitors
might be pleasant to have, but its value is limited if the information cannot be applied to a
critical decision.

a. True
b. False

4. The researcher should never report flaws in procedural design and estimate their effect
on the findings.

a. True
b. False

5. Adequate analysis of the data is the least difficult phase of research for the novice.

a. True
b. False

6. The validity and reliability of the data should be checked occasionally

a. True
b. False

7. Researchers are tempted to rely too heavily on data collected in a prior study and use it
in the interpretation of a new study

a. True
b. False

8. What is good research? The following are correct except

a. Purpose clearly defined
b. Research process detailed
c. Research design thoroughly planned
d. Findings presented ambiguously

9. Greater confidence in the research is warranted if the researcher is experienced, has a
good reputation in research, and is a person of integrity

a. True
b. False

10. A complete disclosure of methods and procedures used in the research study is
required. Such openness to scrutiny has a positive effect on the quality of research.
However, competitive advantage often militates against methodology disclosure in
business research.

a. True
b. False

11. Research is any organized inquiry carried out to provide information for solving
problems.

a. True
b. False

12. In deduction, the conclusion must necessarily follow from the reasons given. In
inductive argument there is no such strength of relationship between reasons and
conclusions.

a. True
b. False

13. Conclusions must necessarily follow from the premises. Identify the type of argument
that follows the above condition.

a. Induction
b. Combination of Induction and Deduction
c. Deduction
d. Variables

14. Eminent scientists who claim there is no such thing as the scientific method, or if it exists,
it is not revealed by what they write, caution researchers about using template-like
approaches

a. True
b. False

15. One of the terms given below is defined as a bundle of meanings or characteristics
associated with certain events, objects, conditions, situations, and the like

a. Construct
b. Definition
c. Concept
d. Variable

16. This is an idea or image specifically invented for a given research and/or theory
building purpose

a. Concept
b. Construct
c. Definition
d. Variables

17. The following are the synonyms for independent variable except

a. Stimulus
b. Manipulated
c. Consequence
d. Presumed Cause

18. The following are the synonyms for dependent variable except

a. Presumed effect
b. Measured Outcome
c. Response
d. Predicted from…

19. In the research process, a management dilemma triggers the need for a decision.

a. True
b. False

20. Every research proposal, regardless of length, should include two basic sections. They
are:

a. Research question and research methodology
b. Research proposal and bibliography
c. Research method and schedule
d. Research question and bibliography

21. The purpose of the research proposal is:

a. To generate monetary sources for the organization
b. To present management question to be researched and its importance
c. To discuss the research efforts of others who have worked on related management
question.

22. A proposal is also known as a:

a. Work plan
b. Prospectus
c. Outline
d. Draft plan
e. All of the above

23. Non-response error occurs when you cannot locate the person or could not encourage
the respondent to participate in answering.

a. True
b. False

24. Secondary data can almost always be obtained more quickly and at a lower cost than
__________ data.

a. Tertiary
b. Collective
c. Research
d. Primary

25. The purpose of __________________ research is to help in the process of developing
a clear and precise statement of the research problem rather than in providing a definitive
answer.

a. Marketing
b. Causal
c. Exploratory
d. Descriptive

Answers
1. a. 2. c. 3. a. 4. b. 5. b. 6. b. 7. a. 8. d. 9. a. 10. a. 11. a. 12. a. 13. C. 14. a. 15. c. 16. b.
17. c. 18. d. 19. a. 20. a. 21. a. 22. e. 23. a. 24. d. 25. c.

1. R was created by?


A. Ross Ihaka
B. Robert Gentleman
C. Both A and B
D. Ross Gentleman

Ans : C

Explanation: R was created by Ross Ihaka and Robert Gentleman at the
University of Auckland, New Zealand, and is currently developed by the
R Development Core Team.

2. R allows integration with the procedures written in the?


A. C
B. Ruby
C. Java
D. Basic

Ans : A
Explanation: R allows integration with the procedures written in the C,
C++, .Net, Python or FORTRAN languages for efficiency.

3. R is free software distributed under a GNU-style copyleft, and
an official part of the GNU project called?
A. GNU A
B. GNU S
C. GNU L
D. GNU R

Ans : B

Explanation: R is free software distributed under a GNU-style copyleft,
and an official part of the GNU project called GNU S.

4. R made its first appearance in?


A. 1992
B. 1995
C. 1993
D. 1994

Ans : C

Explanation: R made its first appearance in 1993.

5. Which of the following is true about R?


A. R is a well-developed, simple and effective programming
language
B. R has an effective data handling and storage facility
C. R provides a large, coherent and integrated collection of tools
for data analysis.
D. All of the above

Ans : D

Explanation: All of the above statements are true.

6. Point out the wrong statement?

A. Setting up a workstation to take full advantage of the
customizable features of R is a straightforward thing
B. q() is used to quit the R program
C. R has an inbuilt help facility similar to the man facility of UNIX
D. Windows versions of R have other optional help systems also

Ans : B

Explanation: The help command is used to look up the details of a particular
command in R.

7. Command lines entered at the console are limited to about
________ bytes
A. 4095
B. 4096
C. 4097
D. 4098

Ans : A

Explanation: Elementary commands can be grouped together into one
compound expression by braces (‘{’ and ‘}’).
8. R language is a dialect of which of the following languages?
A. s
B. c
C. sas
D. matlab

Ans : A

Explanation: The R language is a dialect of S, which was designed in the
1970s. Since the early 90’s the life of the S language has gone down a
rather winding path. The scoping rules for R are the main feature that
makes it different from the original S language.

9. How many atomic vector types does R have?


A. 3
B. 4
C. 5
D. 6

Ans : D

Explanation: R has 6 atomic vector types: logical, integer, double (real),
complex, character (string) and raw. Raw objects are not commonly used
directly in data analysis.

10. R files have the extension _____.


A. .S
B. .RP
C. .R
D. .SP

Ans : C

Explanation: All R files have the extension .R. R provides a mechanism for
recalling and re-executing previous commands. All S program files
have the extension .S. R has more functions than S.

This set of R Programming Language Multiple Choice Questions & Answers
(MCQs) focuses on “Overview of R – 1”.

1. The primary R system is available from the ______


a) CRAN
b) CRWO
c) GNU
d) CRDO

Answer: a
Explanation: CRAN stands for Comprehensive R Archive Network.

2. Point out the wrong statement?


a) Key feature of R was that its syntax is very similar to S
b) R runs only on Windows computing platform and operating system
c) R has been reported to be running on modern tablets, phones, PDAs, and
game consoles
d) R functionality is divided into a number of Packages

Answer: b
Explanation: R runs on almost any standard computing platform and
operating system.

3. R functionality is divided into a number of ________


a) Packages
b) Functions
c) Domains
d) Classes

Answer: a
Explanation: CRAN also hosts many add-on packages that can be used to
extend the functionality of R.

4. Which Package contains most fundamental functions to run R?


a) root
b) child
c) base
d) parent

Answer: c
Explanation: base package in R contains the most fundamental functions.

5. Point out the wrong statement?
a) One nice feature that R shares with many popular open source projects is
frequent releases
b) R has sophisticated graphics capabilities
c) S’s base graphics system allows for very fine control over essentially every
aspect of a plot or graph
d) All of the mentioned
View Answer

Answer: c
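Explanation: It is R’s base graphics system, not S’s, that allows for very
fine control over essentially every aspect of a plot or graph. R has
maintained the original S philosophy of providing a language that is useful
for interactive work and also contains a powerful programming language for
developing new tools.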
6. Which of the following is a base package for R language?
a) util
b) lang
c) tools
d) spatial
View Answer

Answer: c
Explanation: The other packages contained in the “base” system include
utils, stats, datasets, graphics, grDevices, grid, methods, parallel, compiler,
splines, tcltk, stats4.


7. Which of the following is “Recommended” package in R?


a) util
b) lang
c) stats
d) spatial
View Answer

Answer: d
Explanation: “Recommended” packages also include boot, class, cluster,
codetools, foreign, KernSmooth, lattice, mgcv, nlme, rpart, survival,
MASS, nnet, Matrix.

8. What is the output of getOption(“defaultPackages”) in R studio?


a) Installs a new package
b) Shows default packages in R
c) Error
d) Nothing will print
View Answer

Answer: b
Explanation: There are base packages (which come with R automatically),
and contributed packages. The base packages are maintained by a select
group of volunteers called R Core. In addition to the base packages, there
are over ten thousand additional contributed packages written by
individuals all over the world.

9. Advanced users can write ___ code to manipulate R objects directly.


a) C, C++
b) C++, Java
c) Java, C
d) Java
View Answer

Answer: a
Explanation: For computationally-intensive tasks, C, C++ and Fortran
code can be linked and called at run time.

10. Which of the following is used for Statistical analysis in R language?


a) RStudio
b) Studio
c) Heck
d) KStudio
View Answer

Answer: a
Explanation: R

What will be the output of the following R program?

r<-0:10

r[2]

a) 0
b) 1
c) 2
d) 3
View Answer
Answer: b
Explanation: The output is 1 because indexing in R starts from 1, so r[2]
is the second element of the sequence 0, 1, ..., 10. The output can be
viewed in the R console.
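
The indexing rule can be checked directly at the console. This is a minimal sketch that reuses the vector r from the question; nothing else is assumed.

```r
# Build the integer sequence 0, 1, ..., 10
r <- 0:10

first_element  <- r[1]      # indexing starts at 1, so this is 0
second_element <- r[2]      # the second element is 1
n_elements     <- length(r) # the sequence has 11 elements

first_element
second_element
n_elements
```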

2. Which of the following operator is used to create integer sequences?
a) :
b) ;
c) –
d) ~
View Answer

Answer: a
Explanation: The “:” operator is used to create an integer sequence; the
other operators serve other purposes. The [ operator can be used to extract
multiple elements of a vector by passing it an integer sequence.
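
A short sketch of both points, creating a sequence with “:” and then using another sequence inside [ to extract several elements. The variable names are only illustrative.

```r
x <- 10:20                       # integer sequence from 10 to 20
x_class <- class(x)              # "integer"

first_three <- x[1:3]            # elements 1 to 3: 10 11 12
ends <- x[c(1, length(x))]       # first and last element: 10 20

x_class
first_three
ends
```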

3. What will be the output of the following R program?


y<-0:5

vector(y)

y[3]

a) Error in vector(y): invalid ‘mode’ argument


b) 1
c) 4
d) 3
View Answer

Answer: a
Explanation: y is already a vector, and vector() expects a mode such as
“numeric” as its first argument, so the second line raises an error.
Entered interactively, the third line would still print the third element
of y. When an R vector is printed you will notice that an index for the
vector is printed in square brackets [] on the side.

4. In R, a vector can only contain objects of the ________
a) Same class
b) Different class
c) Similar class
d) Any class
View Answer

Answer: a
Explanation: A vector can only contain objects of the same class; it cannot
contain objects of different classes. The most basic type of R object is a
vector. Empty vectors can be created with the vector() function.

5. A list is represented as a vector but can contain objects of ___________


a) Same class
b) Different class
c) Similar class
d) Any class
View Answer

Answer: b
Explanation: A list can contain objects of different classes, whereas a
vector can only contain objects of the same class.
6. How can we define ‘undefined value’ in R language?
a) Inf
b) Sup
c) Und
d) NaN
View Answer

Answer: d
Explanation: NaN represents an “undefined” value in the R language, such as
the result of an undefined mathematical operation. Missing values are
denoted by NA. A NaN value is also NA but the converse is not true.
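
The relationship between NA and NaN can be verified with is.na() and is.nan(). A minimal sketch, using 0 / 0 as one example of an undefined operation:

```r
undefined <- 0 / 0   # an undefined operation yields NaN
missing   <- NA      # a plain missing value

nan_is_nan <- is.nan(undefined)  # TRUE
nan_is_na  <- is.na(undefined)   # TRUE: a NaN also counts as NA
na_is_nan  <- is.nan(missing)    # FALSE: an NA is not a NaN
na_is_na   <- is.na(missing)     # TRUE
```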

7. What is NaN called?


a) Not a Number
b) Not a Numeric
c) Number and Number
d) Number a Numeric
View Answer

Answer: a
Explanation: NaN stands for Not a Number and represents an undefined
value. A NaN value is also NA but the converse is not true.

8. How can we define ‘infinity’ in R language?


a) Inf
b) Sup
c) Und
d) NaN
View Answer

Answer: a
Explanation: Inf is used to represent “Infinity” in R. Inf is a special
number, so it can be used in ordinary calculations.
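
A few calculations show how Inf behaves in arithmetic. This is only an illustrative sketch with made-up variable names.

```r
pos_inf <- 1 / 0      # dividing a positive number by zero gives Inf
neg_inf <- -1 / 0     # and a negative number gives -Inf

reciprocal <- 1 / pos_inf          # 0
finite     <- is.finite(pos_inf)   # FALSE
difference <- pos_inf - pos_inf    # NaN, because Inf - Inf is undefined
```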

9. What will be the output of the following R code?

y <- c(TRUE, 2)

a) [1] “TRUE” “2”


b) [1] “TRUE” 2
c) [1] “0” “2”
d) [1] 1 2
View Answer

Answer: d
Explanation: Here TRUE is taken as 1. Then it will give output as 1 and 2.
FALSE can be taken as 0. T and F are short-hand ways to specify TRUE
and FALSE.

10. What is the class defined by the following R code?

y <- c(2, "t")

a) Character
b) Numeric
c) Logical
d) Integer
View Answer

Answer: a
Explanation: Here 2 is coerced to the character string “2”, because y is an
atomic vector and an atomic vector can hold only one type. Combining a numer

What is the class defined in the following R code?

y<-c(FALSE,2)

a) Character
b) Numeric
c) Logical
d) Integer
View Answer

Answer: b
Explanation: FALSE is coerced to 0, so y is a numeric vector and the
console reports its class as numeric. A vector can only contain objects of
the same class. A list is represented as a vector but can contain objects
of different classes.
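
The coercion rules from the last few questions can be compared side by side. A small sketch; the variable names are arbitrary.

```r
logical_and_number <- c(TRUE, 2)   # TRUE becomes 1
number_and_string  <- c(2, "t")    # 2 becomes "2"
false_and_number   <- c(FALSE, 2)  # FALSE becomes 0

class(logical_and_number)  # "numeric"
class(number_and_string)   # "character"
class(false_and_number)    # "numeric"

# A list keeps each element's own class instead of coercing
mixed_list <- list(FALSE, 2, "t")
sapply(mixed_list, class)  # "logical" "numeric" "character"
```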

2. Which one of the following is not a basic datatype?
a) Numeric
b) Character
c) Data frame
d) Integer
View Answer

Answer: c
Explanation: Data frame is not the basic data type of R. Numeric,
character, integer are the basic types of R. The basic data types are used
many times. Data frames are used to store tabular data in R. They are an
important type of object in R and are used in a variety of statistical
modelling applications.

3. How do you create the integer 5 in R?


a) 5L
b) 5l
c) 5i
d) 5d
View Answer
Answer: a
Explanation: To create an integer, append the L suffix to the number, as in
5L. Without the suffix, R treats 5 as a numeric (double) value. If you
explicitly want an integer, you need to specify the L suffix.
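
The effect of the L suffix can be seen by comparing the classes of the two literals. A minimal sketch:

```r
with_suffix    <- 5L   # integer literal
without_suffix <- 5    # stored as a double

class(with_suffix)       # "integer"
class(without_suffix)    # "numeric"
is.integer(without_suffix)  # FALSE

# as.integer() converts an existing numeric value
converted <- as.integer(without_suffix)
identical(converted, with_suffix)  # TRUE
```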


4. What will be the output of the following R code?

x <- c("a", " b")

as.numeric(x)

a) [1] 1 2
b) [1] TRUE TRUE
c) [1] NA NA (Warning message: NAs introduced by coercion)
d) [1] NAN
View Answer

Answer: c
Explanation: Character strings that do not represent numbers cannot be
converted to numeric, so NA is returned for each element. NA marks the
missing elements in the vector. When nonsensical coercion takes place, you
will usually get a warning from R.
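
A sketch contrasting nonsensical and sensible coercion. suppressWarnings() is used here only to keep the example quiet; without it R prints the coercion warning mentioned above.

```r
letters_only <- c("a", " b")
as_numbers <- suppressWarnings(as.numeric(letters_only))  # NA NA

# Strings that do look like numbers convert without any warning
digits_only <- c("1", "2.5")
parsed <- as.numeric(digits_only)  # 1.0 2.5
```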

5. The dimension attribute is itself an integer vector of length _______


a) 1
b) 2
c) 3
d) 4
View Answer

Answer: b
Explanation: Matrices are vectors with a dimension attribute. The dimension
attribute is itself an integer vector of length 2 (number of rows, number
of columns).
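
The dimension attribute can be inspected with dim(), and assigning it turns a plain vector into a matrix. A minimal sketch:

```r
m <- matrix(1:6, nrow = 2, ncol = 3)
dims <- dim(m)        # 2 3
dims_type <- typeof(dims)  # "integer"

# Setting dim on a plain vector produces the same matrix
v <- 1:6
dim(v) <- c(2, 3)
same_values <- all(v == m)  # TRUE
```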

6. How is the matrix constructed by the following R code?

m <- matrix(1:6, nrow = 2, ncol = 3)

a) row-wise
b) column-wise
c) any manner
d) data insufficient
View Answer

Answer: b
Explanation: If nothing is specified, the matrix is filled column-wise. To
fill it row-wise, we have to pass the argument byrow = TRUE to
matrix(). The filter( ) function is used to extract
subsets of rows from a data frame. This function is similar to the existing
subset( ) function.
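
The difference between column-wise and row-wise filling is easiest to see by comparing the first row of each matrix. A short sketch using the matrix from the question:

```r
by_column <- matrix(1:6, nrow = 2, ncol = 3)                # default fill
by_row    <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)  # row-wise fill

by_column[1, ]  # 1 3 5
by_row[1, ]     # 1 2 3
```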

7. Matrices can be created by row-binding with the help of which of the
following functions?
a) rjoin()
b) rbind()
c) rowbind()
d) rbinding()
View Answer

Answer: b
Explanation: rbind() is used to create a matrix by row-binding. Matrices
can be created by column-binding or row-binding with the cbind() and
rbind() functions.
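
A small sketch of both binding functions applied to the same two vectors; x and y are illustrative names.

```r
x <- 1:3
y <- 10:12

rows <- rbind(x, y)  # each vector becomes a row: 2 x 3
cols <- cbind(x, y)  # each vector becomes a column: 3 x 2

dim(rows)       # 2 3
dim(cols)       # 3 2
rownames(rows)  # "x" "y"
```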

8. What is the function used to test objects (returns a logical value) if they
are NA?
a) is.na()
b) is.nan()
c) as.na()
d) as.nan()
View Answer

Answer: a
Explanation: is.na() is the function used to test objects if they are NA.
We can check for NAs at any stage of the code. Generally, we remove the NAs
before operations in R such as mean.

9. What is the function used to test objects (returns a logical value) if they
are NaN?
a) as.nan()
b) is.na()
c) as.na()
d) is.nan()
View Answer

Answer: d
Explanation: is.nan() is used to test objects if they are NaN. We can check
for NaNs at any stage of the code.

10. What is the function to set column names for a matrix?


a) names()
b) colnames()
c) col.names()
d) column name cannot be set for a matrix
View Answer

Answer: b
Explanation: colnames() is the function to set column names for a matrix.
rownames() is the function to set row names for a mat
1. The most convenient way to use R is at a graphics workstation running
a ________ system.
a) windowing
b) running
c) interfacing
d) matrix
View Answer

Answer: a
Explanation: Most classical statistics and much of the latest methodology
is available for use with R.

2. Point out the wrong statement?


a) Setting up a workstation to take full advantage of the customizable
features of R is a straightforward thing
b) q() is used to quit the R program
c) R has an inbuilt help facility similar to the man facility of UNIX
d) Windows versions of R have other optional help systems also
View Answer

Answer: b
Explanation: help command is used for knowing details of particular
command in R.

3. Which of the following is default prompt for UNIX environment?


a) >
b) >>
c) <
d) <<
View Answer

Answer: a
Explanation: When you use the R program it issues a prompt when it
expects input commands.
4. Which of the following will start the R program?
a) $ R
b) > R
c) * R
d) @ R
View Answer

Answer: a
Explanation: At this point R commands may be issued.

5. Point out the wrong statement?
a) Windows versions of R have other optional help system also
b) The help.search command (alternatively ??) allows searching for help in
various ways
c) R is case insensitive as are most UNIX based packages, so A and a are
different symbols and would refer to different variables
d) $ R is used to start the R program
View Answer

Answer: c
Explanation: R is case sensitive, so A and a are different symbols and
would refer to different variables. R is an expression language with a very
simple syntax.

6. Which of the following statements is an alternative to the command below?

?solve

a) help(solve)
b) print(solve)
c) bind(solve)
d) matrix(solve)
View Answer

Answer: a
Explanation: help is used to get more information on any specific named
function.

7. Elementary commands in R consist of either _______ or assignments.


a) utilstats
b) language
c) expressions
d) packages
View Answer

Answer: c
Explanation: If an expression is given as a command, it is evaluated,
printed (unless specifically made invisible), and the value is lost.

8. If a command is not complete at the end of a line, R will give a different
prompt, by default it is ____________
a) *
b) –
c) +
d) /
View Answer

Answer: c
Explanation: Comments can be put almost anywhere, starting with a
hashmark (‘#’), everything to the end of the line is a comment.

9. Command lines entered at the console are limited to about ________ bytes.
a) 3000
b) 4095
c) 5000
d) 6000
View Answer

Answer: b
Explanation: Elementary commands can be grouped together into one
compound expression by braces (‘{’ and ‘}’).

10._____ text editor provides more general support mechanisms via ESS for
working interactively with R.
a) EAC
b) Emacs
c) Shell
d) ECAP
View Answer

Answer: b
Explanation: The recall and editing capabili

1. What is output of getOption(“defaultPackages”) in R studio?


a) Installs a new package
b) Shows default packages in R
c) Error
d) Nothing will print
View Answer

Answer: b
Explanation: There are base packages (which come with R automatically),
and contributed packages. The base packages are maintained by a select
group of volunteers, called R Core. In addition to the base packages, there
are over ten thousand additional contributed packages written by
individuals all over the world.

2. What will be the output of the following R code?

x <- c(3, 7, NA, 4, 7)


y <- c(5, NA, 1, 2, 2)

x + y

a) Symbol
b) Missing Data
c) 5
d) 15.5
View Answer

Answer: b
Explanation: Missing data are a persistent and prevalent problem in many
statistical analyses, especially those associated with the social sciences. R
reserves the special symbol NA to represent missing data. Ordinary
arithmetic with NA value gives NA’s (addition, subtraction, etc.) and
applying a function to a vector that has a NA in it will usually give a NA.
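
Running the code from the question shows how NA propagates through element-wise arithmetic. The na.rm argument at the end is one common way to skip missing values.

```r
x <- c(3, 7, NA, 4, 7)
y <- c(5, NA, 1, 2, 2)

total <- x + y   # 8 NA NA 6 9: any sum involving NA is NA

sum(x)                             # NA
sum_without_na <- sum(x, na.rm = TRUE)  # 21, missing value dropped first
```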

3. R language is a dialect of which of the following languages?
a) S
b) C
c) MATLAB
d) SAS
View Answer

Answer: a
Explanation: The R language is a dialect of S which was designed in the
1980s. Since the early 90’s the life of the S language has gone down a
rather winding path. The scoping rules for R are the main feature that
makes it different from the original S language.


4. R language has superficial similarity with _________


a) C
b) Python
c) MATLAB
d) SAS
View Answer

Answer: a
Explanation: The language syntax has a superficial similarity with C, but
the semantics are of the FPL (functional programming language) variety
with stronger affinities with Lisp and APL. Many syntactic constructs in C
closely resemble those in R.

5. What is the mode of ‘a’ in the following R code?

a <- c(1, " a", FALSE)

a) Numeric
b) Character
c) Integer
d) Logical
View Answer

Answer: b
Explanation: All three elements can be expressed as a character, so the
vector is coerced to character mode. Both
paste() and cat() will printout text to the console by combining multiple
character vectors together. The original data are formatted as character
strings so we convert them to R’s Date format for easier manipulation.

6. What is the length of b?

b <- 2:7

a) 4
b) 5
c) 6
d) 0
View Answer

Answer: c
Explanation: Length of b [1] 2 3 4 5 6 7 is 6. We can also create an empty
list of a prespecified length with the vector() function. Data frames are
represented as a special type of list where every element of the list has to
have the same length.

7. What is the mode of b in the following R code?

b <- c(TRUE, TRUE, 1)

a) Numeric
b) Character
c) Integer
d) Logical
View Answer

Answer: a
Explanation: All the elements in ‘b’ can be expressed as numeric, so TRUE
is coerced to 1. Both paste() and cat() will print out text to the console
by combining multiple character vectors together. The original data are
formatted as character strings so we convert them to R’s Date format for
easier manipulation.

8. What are the typeof(x) and mode(x) in the following R syntax?

x<-1:3

a) Numeric, Integer
b) Integer, Numeric
c) Integer, Integer
d) Numeric, Numeric
View Answer

Answer: b
Explanation: typeof() reports the internal storage type of an object, which
is integer for 1:3, while mode() groups the integer and double types
together as numeric. You can determine an object’s type with the typeof
function.
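
The three inspection functions can be compared on the same object. A minimal sketch using the vector from the question:

```r
x <- 1:3

typeof(x)  # "integer": the internal storage type
mode(x)    # "numeric": integer and double share this mode
class(x)   # "integer"

# A number typed without the L suffix is stored as a double
typeof(1)  # "double"
mode(1)    # "numeric"
```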
9. How many atomic vector types does R have?
a) 5
b) 6
c) 8
d) 10
View Answer

Answer: b
Explanation: R has 6 atomic vector types: logical, integer, double (real),
complex, character (string) and raw. Raw vectors are not commonly used
directly in data analysis.

10. What is the function to set row names for a data frame?
a) row.names()
b) colnames()
c) col.names()
d) column name cannot be set for a data frame
View Answer

Answer: a
Explanation: row.names() is the function to set row names for a data
frame. Data frames have a special attribute called row.name

1. Is it possible to inspect the source code of R?


a) Yes
b) No
c) Can’t say
d) Some times
View Answer

Answer: a
Explanation: Anybody is free to download and install these packages and
even inspect the source code. The instructions for obtaining R largely
depend on the user’s hardware and operating system.
2. How do you install a package named foo and all of the other packages on
which foo depends?
a) install.packages (foo, depends = TRUE)
b) R.install.packages (“foo”, depends = TRUE)
c) install.packages (“foo”, depends = TRUE)
d) install (“foo”, depends = FALSE)
View Answer

Answer: c
Explanation: To install a package named foo, open up R and type
install.packages(“foo”). To install foo and additionally install all of the
other packages on which foo depends, instead type install.packages (“foo”,
depends = TRUE). In current versions of R this argument is spelled
dependencies = TRUE.

3. __________ function is used to list all available packages in the library.


a) lib()
b) fun.lib()
c) libr()
d) library()
View Answer

Answer: d
Explanation: Type library() at the command prompt to see a list of all
available packages in the library. For total information about the
installation of R and add-on packages, see the R Installation and
Administration manual.

4. The longer programs are called ____________
a) Files
b) Structures
c) Scripts
d) Data
View Answer

Answer: c
Explanation: Longer programs are called scripts; they contain too much code
to write all at once at the command prompt. Furthermore, for longer
scripts, it is convenient to be able to only modify a certain piece of the
script and run it again in R.

5. Scripts will run on ___________________


a) Script Editors
b) Console
c) Terminal
d) GCC Compiler
View Answer

Answer: a
Explanation: script editors are designed to aid the communication and
code writing process. They have all sorts of features including R syntax
highlighting, automatic code completion, delimiter matching, and
dynamic help on the R functions.


6. Which of the following is a “Recommended” package in R?


a) Util
b) Lang
c) Stats
d) Spatial
View Answer

Answer: d
Explanation: “Recommended” packages also include boot, class, cluster,
codetools, foreign, KernSmooth, lattice, mgcv, nlme, rpart, survival,
MASS, nnet, Matrix. There are about ten thousand packages in R now.
7. Full Form of GUI is ___________________
a) Guided User Interface
b) Graphical User Interface
c) Guided Used Interface
d) Graphical User Interval
View Answer

Answer: b
Explanation: GUI elements are usually accessed through a device. All
programs running a GUI use a consistent set of graphical elements so that
once the user learns a particular interface.

8. ____________ provides a point-and-click interface to many basic statistic


problems.
a) Commander
b) GUI
c) Console
d) Terminal
View Answer

Answer: a
Explanation: R Commander provides a point-and-click interface to
statistical problems. It is called the “Commander” because every time one
makes a selection, the code corresponding to the task is listed in the output
window.

9. What will be the output of the following R code?

options(digits = 16)
20/6

a) 3.33
b) 3.333
c) 3.3333333
d) 3.3333333333333333
View Answer
Answer: d
Explanation: 20/6 is a repeating decimal. We can change the number of
significant digits displayed with options(digits = ...), which extends the
printed value to the requested precision.

10. In which IDE can we interact with R?


a) R studio
b) Console
c) GCC
d) Power shell
View Answer

Answer: a
Explanation: R studio is an IDE tailored to the needs of interactive data
analysis and statistical programming. In R studio we can directly interact
with R through the inbuilt functions and packages. We can also download new
packages.

11. Which programming language is more based on the results?


a) R
b) C
c) C++
d) Java
View Answer

Answer: a
Explanation: Compared to other programming languages, the R
community tends to be more focussed on results instead of processes.
Knowledge of software engineering best practice.

12. Why learning R becomes tough?


a) Special files
b) Functions
c) Packages
d) Special Cases
View Answer

Answer: d
Explanation: You are confronted with over 20 years of evolution every
time you use R. Learning R can be hard because there are many special
cases in R to remember. R can also use a lot of memory.

13. R is mostly used in ______________


a) Problem solving
b) Statistics
c) Probability
d) All of the mentioned
View Answer

Answer: d
Explanation: Statistics for relatively advanced users. R has thousands of
packages, designed, maintained, and widely used by statisticians. We can
code ourselves if a command is not present.

14. Why is it needed for R studio to update regularly?


a) Bugs
b) More Functions
c) Methods
d) For more packages
View Answer

Answer: a
Explanation: RStudio is very popular with a nice interface and well
thought out, especially for more advanced usage. It can be a bit buggy, so
make sure you update it regularly. Available on all platforms.

15. What is the meaning of “<-“?


a) Functions
b) Loops
c) Addition
d) Assignment
View Answer

Answer: d
Explanation: The expression a <- 16 creates a variable called a and gives
it the value 16; this is called assignment. The value on the right is
assigned to the variable on the left. The left side should contain only a
single name.

16. In the expression x <- 4 in R, what is the class of ‘x’ as determined by
the class() function?
a) Character
b) Numeric
c) Integer
d) Word
View Answer

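Answer: b
Explanation: A number written without the L suffix, such as 4, is stored as
a double, so class(x) returns “numeric”. In R, there is an extension of the
numeric or character
vectors. They are not a separate type of object but simply an atomic vector
with dimensi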

1. What will be the output of the following R code?

> x <- 1
> print(x)

a) 1
b) 2
c) 3
d) 5
View Answer

Answer: a
Explanation: When a complete expression is entered at the prompt, it is
evaluated and the result of the evaluated expression is returned.
2. Point out the wrong statement?
a) The grammar of the language determines whether an expression is
complete or not
b) The <- symbol is the assignment operator in R
c) The ## character indicates a comment
d) R does not support multi-line comments or comment blocks
View Answer

Answer: c
Explanation: A single # character indicates a comment. Unlike some other
languages, R does not support multi-line comments or comment blocks.

3. Which of the following R code snippets is an example of explicit printing?
a)

> x <- 5
> x

b)

> x <- 5
> print(x)

c)

> x <- "auto"
> x

d)

> x <- "auto"
> x <- "auto"

View Answer
Answer: b
Explanation: Print command is used for outputting the value.
4. Files containing R scripts end with the extension ____________
a) .S
b) .R
c) .Rp
d) .SP
View Answer

Answer: b
Explanation: Under many versions of UNIX and on Windows, R provides a
mechanism for recalling and re-executing previous commands.

5. Point out the wrong statement?


a) : operator is used to create integer sequences
b) The numbers in the square brackets are part of the vector itself
c) There is a difference between the actual R object and the manner in which
that R object is printed to the console
d) Files containing R scripts ends with extension .R
View Answer

Answer: b
Explanation: They are merely part of the printed output.

6. If commands are stored in an external file, say commands.R in the
working directory work, they may be executed at any time in an R session
with the command ____________
a) source(“commands.R”)
b) exec(“commands.R”)
c) execute(“commands.R”)
d) exect(“command.R”)
View Answer

Answer: a
Explanation: For Windows, Source is also available on the File menu.
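
A minimal sketch of source() in action. To keep the example self-contained it first writes a small commands.R into a temporary directory; the file contents and the variable names in it are made up for illustration.

```r
# Create a small script file in a temporary directory
script_path <- file.path(tempdir(), "commands.R")
writeLines(c("x <- 1:5", "total <- sum(x)"), script_path)

# Execute every command stored in the file
source(script_path)

total  # 15, the variable was created by the sourced script
```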
7. _______ will divert all subsequent output from the console to an external file.
a) sink
b) div
c) exp
d) exc
View Answer

Answer: a
Explanation: Calling sink() with no arguments restores output to the
console once again.
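
The diversion and its reversal can be sketched as follows. The temporary file name is an assumption made for the example; any writable path would do.

```r
out_path <- tempfile(fileext = ".txt")

sink(out_path)   # from here on, console output goes to the file
print(1:3)
sink()           # output returns to the console

captured <- readLines(out_path)
captured  # "[1] 1 2 3"
```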

8. The entities that R creates and manipulates are known as ________


a) objects
b) task
c) container
d) packages
View Answer

Answer: a
Explanation: These may be variables, arrays of numbers, character strings,
functions, or more general structures built from such components.

9. Which of the following can be used to display the names of (most of) the
objects which are currently stored within R?
a) object()
b) objects()
c) list()
d) class()
View Answer

Answer: b
Explanation: During an R session, objects are created and stored by name.

10. The collection of objects currently stored in R is called the ________________


a) package
b) workspace
c) list
d) task
View Answer

Answer: b
Explanation: All objects created during an R session can be stored
permanently in a file for use in future R sessions.

1. Which of the following is used where the target variable is of
categorical nature?
A. Keras
B. Knime
C. Logistic Regression
D. MXNet

View Answer

Ans : C

Explanation: It’s a classification algorithm that is used where the target
variable is of categorical nature. The main objective behind Logistic
Regression is to determine the relationship between features and the
probability of a particular outcome.
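
As a rough illustration of relating a feature to the probability of an outcome, the sketch below fits a binary logistic regression with glm(). The choice of the built-in mtcars data set, the predictor wt, the 0/1 target am and the 0.5 cut-off are all assumptions made for the example.

```r
# mtcars ships with R; am is 0 (automatic) or 1 (manual)
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted probability of a manual transmission for each car
prob_manual <- predict(fit, type = "response")

# Turn probabilities into class labels with a 0.5 threshold
predicted_class <- ifelse(prob_manual > 0.5, 1, 0)

coef(fit)  # intercept and the weight coefficient on the log-odds scale
```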

2. How many different types of Logistic Regression are there?


A. 2
B. 3
C. 4
D. 5

View Answer

Ans : B
Explanation: Three different types of Logistic Regression are as follows:
Binary Logistic Regression, Multinomial Logistic Regression and Ordinal
Logistic Regression

3. In _________, the target variable can have three or more possible
values without any order.
A. Multinomial Logistic Regression
B. Binary Logistic Regression
C. Ordinal Logistic Regression
D. All of the above

View Answer

Ans : A

Explanation: Multinomial Logistic Regression: In this, the target variable
can have three or more possible values without any order.

4. _______ are defined as the ratio of the probability of an event
occurring to the probability of the event not occurring.
A. Simple
B. Even
C. Regex
D. Odds

View Answer

Ans : D

Explanation: Odds are defined as the ratio of the probability of an event
occurring to the probability of the event not occurring.
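
The definition translates into a one-line function. A small sketch; the helper name odds and the example probabilities are illustrative only.

```r
# Odds = P(event) / P(no event)
odds <- function(p) p / (1 - p)

odds(0.75)  # 3: the event is three times as likely to occur as not
odds(0.5)   # 1: both outcomes are equally likely

# Logistic regression models the logarithm of the odds
log(odds(0.5))  # 0
```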

5. SVM is insensitive to individual samples.


A. Yes
B. No
C. Can be yes or no
D. Can not say

View Answer

Ans : A

Explanation: Yes, SVM is insensitive to individual samples. So, to
accommodate an outlier there will not be a major shift in the linear
boundary. SVM comes with inbuilt complexity controls, which take care
of overfitting, which is not true in the case of Logistic Regression.

6. Which of the following are advantages of the logistic


regression?
A. Logistic Regression is very easy to understand
B. It requires less training
C. It performs well for simple datasets as well as when the data
set is linearly separable
D. All of the above

View Answer

Ans : D

Explanation: All of the above are advantages of Logistic Regression.

7. 0 and 1, or pass and fail or true and false is an example of?


A. Multinomial Logistic Regression
B. Binary Logistic Regression
C. Ordinal Logistic Regression
D. None of the above

View Answer

Ans : B
Explanation: Binary Logistic Regression: In this, the target variable has
only 2 possible outcomes. For example, 0 and 1, or pass and fail, or
true and false.

8. Mean Square Error (MSE) is suitable for Logistic Regression.


A. TRUE
B. FALSE
C. Can be true or false
D. Can not say

View Answer

Ans : B

Explanation: MSE is not suitable for Logistic Regression.

9. What are the disadvantages of Logistic Regression?


A. Sometimes a lot of Feature Engineering is required
B. It is quite sensitive to noise and overfitting
C. Both A and B
D. None of the above

View Answer

Ans : C

Explanation: Both A and B are the disadvantages of Logistic Regression.

10. Can we solve the multiclass classification problems using


Logistic Regression?
A. Yes
B. No
C. Can be yes or no
D. Can not say
View Answer

Ans : A

Explanati

1: Which of the following is not an assumption for simple linear regression?

a. Normally distributed variables


b. Multicollinearity
c. Linear relationship
d. Constant variance
e. Normally distributed residuals

Ans: B

2: Continuous predictors influence the ______ of the regression line, while


categorical predictors influence the _____________.

a. slope, intercept
b. intercept, slope
c. R2, p-value
d. p-value, R2

Ans: A

3: Which of the following is true about the adjusted R2?

a. It is usually larger than the R2


b. It is only used when there is just one predictor
c. It is usually smaller than the R2
d. It is used to determine whether residuals are normally distributed

Ans: C

4: Significance for the coefficients (b) is determined by

a. an F-test.
b. an R2 test.
c. a correlation coefficient.
d. a t-test.

Ans: D

5: The R2 is the squared correlation of which two values?

a. y and the predicted values of y


b. y and each continuous x
c. b and t
d. b and se

Ans: A

1. We should use Simple Linear Regression to predict the winner of a


football game
 ​ True
 ​ False
2. Which of the following formulas is not a simple linear regression model ?

 ​ Salary = a * Experience
 ​ Salary = a * Experience + b
 ​ Salary = a * Experience + b * Age
3. What is the class used in Python to create a simple linear regressor ?

 ​ SimpleLinear
 ​ LinearRegression
 ​ LinReg
 ​ SimpleLinearRegression
4. What is the function used in R to create a simple linear regressor ?

 ​ lr
 ​ slr
 ​ lm
 ​ slm
5. What is the correct way of writing a simple linear regression equation in
the formula parameter in R ?
 ​ Salary = YearsExperience
 ​ Salary ~ YearsExperience
 ​ Salary == YearsExperience
 ​ Salary = a * YearsExperience + b

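A minimal sketch of fitting a simple linear regression with lm() and the formula syntax asked about above. The data frame and its values are hypothetical, chosen so that the fitted line is easy to read off; only the column names come from the question.

```r
# Hypothetical data: salary rises by 5000 per year of experience
salaries <- data.frame(
  YearsExperience = c(1, 2, 3, 4, 5),
  Salary          = c(40000, 45000, 50000, 55000, 60000)
)

# The formula reads "Salary is modelled by YearsExperience"
fit <- lm(Salary ~ YearsExperience, data = salaries)

coef(fit)  # intercept 35000, slope 5000

# Predict the salary for 6 years of experience
predict(fit, newdata = data.frame(YearsExperience = 6))  # 65000
```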
Multiple Choice Questions on


Multiple Regression
6. Which of the following formula is not a multiple linear regression model ?

 ​ Salary = a * Experience + b * Age + c


 ​ Salary = a * Experience + b * Age + c * Level
 ​ Salary = a * Experience + b * Age + c * Level + d
 ​ Salary = a * Experience + b * Age^2
 ​ Salary = a * Experience + b * Age
7. Which library to import in Python for Multiple Linear Regression ?

 ​ MultipleLinearRegression
 ​ MultipleRegression
 ​ LinearRegression
 ​ LinearModel
8. In Python, the code to create a multiple linear regressor is exactly the
same as the one to create a simple linear regressor.

 ​ True
 ​ False
9. In R, which multiple linear regression equation can we input in the
formula parameter ?

 ​ Salary ~ *
 ​ Salary = *
 ​ Salary ~ .
 ​ Salary = Experience + Age
10. We should use Multiple Linear Regression to predict a dependent
variable that is growing exponentially with time.

 ​ Yes
 ​ No
Multiple Choice Questions on
Logistic Regression
11. Logistic Regression is a linear classifier

 ​ True
 ​ False
12. Logistic Regression returns probabilities

 ​ True
 ​ False
13. In Python, what is the class used to create a logistic regression
classifier ?

 ​ GLM
 ​ LogisticRegression
 ​ Logit
 ​ LogReg
14. In R, what is the function used to create a Logistic Regression classifier ?

 ​ lr
 ​ lm
 ​ glm
 ​ glr
15. In R, what value do we need to input for the family parameter ?

 ​ family = linear
 ​ family = logistic
 ​ family = binomial
 ​ family = response

1) Which of the following is an example of time series problem?

1. Estimating the number of hotel room bookings in the next 6 months.
2. Estimating the total sales of an insurance company in the
next 3 years.
3. Estimating the number of calls for the next one week.
A) Only 3
B) 1 and 2
C) 2 and 3
D) 1 and 3
E) 1,2 and 3

Solution: (E)

All the above options have a time component associated.

2) Which of the following is not an example of a time series
model?

A) Naive approach
B) Exponential smoothing
C) Moving Average
D) None of the above

Solution: (D)

Naïve approach: An estimating technique in which the last
period’s actuals are used as this period’s forecast, without
adjusting them or attempting to establish causal factors. It is
used only for comparison with the forecasts generated by
more sophisticated techniques.

In exponential smoothing, older data is given progressively less
relative importance, whereas newer data is given progressively
greater importance.

In time series analysis, the moving-average (MA) model is a
common approach for modeling univariate time series. The
moving-average model specifies that the output variable
depends linearly on the current and various past values of a
stochastic (imperfectly predictable) term.

All three options are therefore time series models.

3) Which of the following can’t be a component for a time
series plot?

A) Seasonality
B) Trend
C) Cyclical
D) Noise
E) None of the above

Solution: (E)

A seasonal pattern exists when a series is influenced by
seasonal factors (e.g., the quarter of the year, the month, or
day of the week). Seasonality is always of a fixed and known
period. Hence, seasonal time series are sometimes called
periodic time series.

A cyclic pattern exists when data exhibit rises and falls that
are not of fixed period.

Trend is defined as the ‘long term’ movement in a time series
without calendar related and irregular effects, and is a
reflection of the underlying level. It is the result of influences
such as population growth, price inflation and general
economic changes.

[Figure: Quarterly Gross Domestic Product, a series with an
obvious upward trend over time]

Noise: In discrete time, white noise is a discrete signal whose
samples are regarded as a sequence of serially uncorrelated
random variables with zero mean and finite variance.

Thus all of the above mentioned are components of a time
series.

4) Which of the following is relatively easier to estimate in time
series modeling?

A) Seasonality
B) Cyclical
C) No difference between Seasonality and Cyclical

Solution: (A)

As we saw in the previous solution, seasonality exhibits a fixed
structure, so it is easier to estimate.

5) The below time series plot contains both Cyclical and
Seasonality component.

A) TRUE
B) FALSE

Solution: (B)

There is a repeated pattern in the plot above at regular intervals
of time, so it is only seasonal in nature.

6) Adjacent observations in time series data (excluding white
noise) are independent and identically distributed (IID).

A) TRUE

B) FALSE

Solution: (B)

Observations are frequently correlated, and the correlation
becomes stronger as the time interval between them becomes
shorter. Time series forecasting relies on this dependence:
forecasts are built from previous observations of the same
series, unlike classification or regression, which typically
treat observations as independent.

7) Smoothing parameter close to one gives more weight or
influence to recent observations over the forecast.

A) TRUE
B) FALSE

Solution: (A)

It may be sensible to attach larger weights to more recent
observations than to observations from the distant past. This
is exactly the concept behind simple exponential smoothing.
Forecasts are calculated using weighted averages where the
weights decrease exponentially as observations come from
further in the past; the smallest weights are associated with
the oldest observations:

$$\hat{y}_{T+1|T} = \alpha y_T + \alpha(1-\alpha) y_{T-1} + \alpha(1-\alpha)^2 y_{T-2} + \cdots \qquad (7.1)$$

where $0 \le \alpha \le 1$ is the smoothing parameter. The
one-step-ahead forecast for time $T+1$ is a weighted
average of all the observations in the series $y_1, \dots, y_T$.
The rate at which the weights decrease is controlled by the
parameter $\alpha$.

8) Sum of weights in exponential smoothing is _____.

A) <1
B) 1
C) >1
D) None of the above

Solution: (B)

Table 7.1 shows the weights attached to observations for four
different values of α when forecasting using simple
exponential smoothing. Note that the sum of the weights even
for a small α will be approximately one for any reasonable
sample size.

Observation   α=0.2          α=0.4          α=0.6          α=0.8
yT            0.2            0.4            0.6            0.8
yT−1          0.16           0.24           0.24           0.16
yT−2          0.128          0.144          0.096          0.032
yT−3          0.1024         0.0864         0.0384         0.0064
yT−4          (0.2)(0.8)^4   (0.4)(0.6)^4   (0.6)(0.4)^4   (0.8)(0.2)^4
yT−5          (0.2)(0.8)^5   (0.4)(0.6)^5   (0.6)(0.4)^5   (0.8)(0.2)^5
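
The weights in Table 7.1 can be reproduced with a few lines of Python. This is a minimal sketch; the function name `ses_weights` and the choice of 50 observations are illustrative assumptions, not part of the original solution. It computes the weight α(1−α)^j for each lag j and shows that the sum approaches one.

```python
def ses_weights(alpha, n_obs):
    """Weights alpha * (1 - alpha)**j attached to y_T, y_{T-1}, ...
    by simple exponential smoothing (j = 0 is the newest observation)."""
    return [alpha * (1 - alpha) ** j for j in range(n_obs)]


for alpha in (0.2, 0.4, 0.6, 0.8):
    weights = ses_weights(alpha, 50)
    # The partial sum equals 1 - (1 - alpha)**n_obs, which tends to 1.
    print(alpha, [round(w, 4) for w in weights[:4]], round(sum(weights), 6))
```

Even for α = 0.2, the slowest-decaying case in the table, 50 observations already bring the sum to within about 0.00002 of one.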

9) The last period’s forecast was 70 and demand was 60.
What is the simple exponential smoothing forecast with alpha
of 0.4 for the next period?

A) 63.8
B) 65
C) 62
D) 66

Solution: (D)

Previous forecast F(t-1) = 70

Actual demand D(t-1) = 60

Alpha = 0.4

Substituting the values into F(t) = Alpha * D(t-1) + (1 - Alpha) * F(t-1), we get

0.4*60 + 0.6*70 = 24 + 42 = 66
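
The same substitution can be written as a one-line update rule. A minimal sketch; the function and argument names are illustrative assumptions:

```python
def ses_update(previous_forecast, actual, alpha):
    """One step of simple exponential smoothing: blend the latest
    actual value with the previous forecast."""
    return alpha * actual + (1 - alpha) * previous_forecast


next_forecast = ses_update(previous_forecast=70, actual=60, alpha=0.4)
print(round(next_forecast, 2))
```

With alpha closer to one, the result moves toward the actual demand (60); with alpha closer to zero, it stays near the old forecast (70).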

10) What does autocovariance measure?

A) Linear dependence between multiple points on the
different series observed at different times
B) Quadratic dependence between two points on the same
series observed at different times
C) Linear dependence between two points on different series
observed at same time
D) Linear dependence between two points on the same series
observed at different times

Solution: (D)

Option D is the definition of autocovariance.

11) Which of the following is not a necessary condition for
weakly stationary time series?

A) Mean is constant and does not depend on time
B) Autocovariance function depends on s and t only through
their difference |s-t| (where t and s are moments in time)
C) The time series under consideration is a finite variance
process
D) Time series is Gaussian

Solution: (D)

Gaussianity is not required for weak stationarity. For a
Gaussian time series, weak stationarity implies strict
stationarity.

12) Which of the following is not a technique used in smoothing
time series?

A) Nearest Neighbour Regression
B) Locally weighted scatter plot smoothing
C) Tree based models like (CART)
D) Smoothing Splines

Solution: (C)

Time series smoothing and filtering can be expressed in terms
of local regression models. Polynomials and regression splines
also provide important techniques for smoothing. CART based
models do not provide an equation to superimpose on time
series and thus cannot be used for smoothing. All the other
techniques are well documented smoothing techniques.

13) If the demand is 100 during October 2016, 200 in
November 2016, 300 in December 2016, 400 in January
2017. What is the 3-month simple moving average for
February 2017?

A) 300
B) 350
C) 400
D) Need more information

Solution: (A)

$$\hat{x}_t = \frac{x_{t-3} + x_{t-2} + x_{t-1}}{3}$$

(200+300+400)/3 = 900/3 = 300
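
The calculation can be checked with a short Python sketch. The helper name `simple_moving_average` is an illustrative assumption; it forecasts the next period as the mean of the last `window` observations.

```python
def simple_moving_average(values, window):
    """Forecast the next period as the mean of the last `window` observations."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return sum(values[-window:]) / window


demand = [100, 200, 300, 400]  # Oct 2016, Nov 2016, Dec 2016, Jan 2017
print(simple_moving_average(demand, 3))  # 300.0
```

Only the three most recent months enter the average, so the October value of 100 has no effect on the February forecast.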


14) Looking at the below ACF plot, would you suggest to apply
AR or MA in ARIMA modeling technique?

A) AR
B) MA
C) Can’t Say

Solution: (A)

An MA model is considered in the following situation: if
the autocorrelation function (ACF) of the differenced series
displays a sharp cutoff and/or the lag-1 autocorrelation
is negative (i.e., if the series appears slightly
“overdifferenced”), then consider adding an MA term to the
model. The lag beyond which the ACF cuts off is the indicated
number of MA terms.

But as there is no observable sharp cutoff, the AR model
should be preferred.
15) Suppose you are a data scientist at Analytics Vidhya, and
you observe that the views on the articles increase during
Jan-Mar, whereas the views decrease during Nov-Dec.

Does the above statement represent seasonality?

A) TRUE
B) FALSE
C) Can’t Say

Solution: (A)

Yes, this is a definite seasonal pattern, as there is a change in
the views at particular times of the year.

Remember, seasonality is the presence of variations at specific
periodic intervals.

16) Which of the following graphs can be used to detect
seasonality in time series data?

1. Multiple box plots
2. Autocorrelation plot

A) Only 1
B) Only 2
C) 1 and 2
D) None of these
Solution: (C)

Seasonality is the presence of variations at specific periodic
intervals.

The variation of the distribution can be observed in multiple box
plots, so seasonality can be easily spotted. An autocorrelation
plot should show spikes at lags equal to the period.

17) Stationarity is a desirable property for a time series
process.

A) TRUE
B) FALSE

Solution: (A)

When the following conditions are satisfied, a time series
is stationary.

1. Mean is constant and does not depend on time
2. Autocovariance function depends on s and t only through
their difference |s-t| (where t and s are moments in
time)
3. The time series under consideration is a finite variance
process

These conditions are essential prerequisites for
mathematically representing a time series to be used for
analysis and forecasting. Thus stationarity is a desirable
property.

18) Suppose you are given a time series dataset which has only
4 columns (id, Time, X, Target).

What would be the rolling mean of feature X if you are given
the window size 2?

Note: X column represents rolling mean.


A)
B)
C)

D) None of the above

Solution: (B)

$$\hat{x}_t = \frac{x_{t-2} + x_{t-1}}{2}$$

Based on the above formula: (100 + 200)/2 = 150;
(200 + 300)/2 = 250 and so on.

19) Imagine you are working on a time series dataset. Your
manager has asked you to build a highly accurate model. You
started to build two types of models which are given below.

Model 1: Decision Tree model
Model 2: Time series regression model

At the end of evaluation of these two models, you found that
model 2 is better than model 1. What could be the possible
reason for your inference?

A) Model 1 couldn’t map the linear relationship as well as
Model 2
B) Model 1 will always be better than Model 2
C) You can’t compare decision tree with time series regression
D) None of these

Solution: (A)

A time series model is similar to a regression model, so it is
good at finding simple linear relationships. A tree based
model, though efficient, will not be as good at finding and
exploiting linear relationships.

20) What type of analysis could be most effective for
predicting temperature on the following type of data?
A) Time Series Analysis
B) Classification
C) Clustering
D) None of the above

Solution: (A)

The data is obtained on consecutive days and thus the most
effective type of analysis will be time series analysis.

21) What is the first difference of temperature / precipitation
variable?
A) 15,12.2,-43.2,-23.2,14.3,-7
B) 38.17,-46.11,-4.98,14.29,-22.61
C) 35,38.17,-46.11,-4.98,14.29,-22.61
D) 36.21,-43.23,-5.43,17.44,-22.61

Solution: (B)

73.17-35 = 38.17

27.05-73.17 = – 46.11 and so on..

13.75 – 36.36 = -22.61

22) Consider the following set of data:

{23.32 32.33 32.88 28.98 33.16 26.33 29.88 32.69
18.98 21.23 26.66 29.89}

What is the lag-one sample autocorrelation of the time series?

A) 0.26
B) 0.52
C) 0.13
D) 0.07
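Solution: (C)

$$\hat{\rho}_1 = \frac{\sum_{t=2}^{T} (x_{t-1} - \bar{x})(x_t - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$$

$$= \frac{(23.32 - \bar{x})(32.33 - \bar{x}) + (32.33 - \bar{x})(32.88 - \bar{x}) + \cdots}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$$

= 0.130394786

where $\bar{x}$ is the mean of the series, which is 28.0275.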
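
The lag-one value can be verified numerically. The following is a minimal pure-Python sketch of the formula above; the function name `sample_autocorrelation` is an illustrative assumption.

```python
def sample_autocorrelation(series, lag):
    """Sample autocorrelation at `lag`: lagged cross-products of deviations
    from the mean, divided by the total sum of squared deviations."""
    n = len(series)
    mean = sum(series) / n
    numerator = sum(
        (series[t - lag] - mean) * (series[t] - mean) for t in range(lag, n)
    )
    denominator = sum((x - mean) ** 2 for x in series)
    return numerator / denominator


data = [23.32, 32.33, 32.88, 28.98, 33.16, 26.33,
        29.88, 32.69, 18.98, 21.23, 26.66, 29.89]
print(round(sum(data) / len(data), 4))            # mean of the series
print(round(sample_autocorrelation(data, 1), 4))  # lag-one autocorrelation
```

Note that the denominator sums over all T observations while the numerator has only T−1 terms, which is the convention used in the formula above.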

23) Any stationary time series can be approximated by the
random superposition of sines and cosines oscillating at various
frequencies.
A) TRUE
B) FALSE

Solution: (A)

A weakly stationary time series, x_t, is a finite variance process
such that (i) the mean value function, µ_t, is constant and does
not depend on time t, and (ii) the autocovariance function,
γ(s,t), depends on s and t only through their difference
|s−t|.

A random superposition of sines and cosines oscillating at various
frequencies is white noise. White noise is weakly stationary.
If the white noise variates are also normally
distributed (Gaussian), the series is also strictly stationary.
24) Autocovariance function for weakly stationary time series
does not depend on _______ ?

A) Separation of xs and xt
B) h = | s – t |
C) Location of point at a particular time

Solution: (C)

By the definition of a weakly stationary time series described in
the previous question.

25) Two time series are jointly stationary if _____ ?

1. They are each stationary
2. Cross-covariance function is a function only of lag h

A) Only 1
B) Both 1 and 2

Solution: (B)

Joint stationarity is defined by the two conditions mentioned
above.

26) In autoregressive models _______ ?

A) Current value of dependent variable is influenced by
current values of independent variables
B) Current value of dependent variable is influenced by
current and past values of independent variables
C) Current value of dependent variable is influenced by past
values of both dependent and independent variables
D) None of the above

Solution: (C)

Autoregressive models are based on the idea that the current
value of the series, x_t, can be explained as a function of p past
values, x_{t−1}, x_{t−2}, …, x_{t−p}, where p determines the number of
steps into the past needed to forecast the current value. Ex.
x_t = x_{t−1} − 0.90 x_{t−2} + w_t,

where x_{t−1} and x_{t−2} are past values of the dependent variable,
and w_t, the white noise, can represent values of independent
variables.

The example can be extended to include multiple series,
analogous to multivariate linear regression.

27) For MA (Moving Average) models the pair σ² = 1 and θ = 5
yields the same autocovariance function as the pair σ² = 25
and θ = 1/5.
A) TRUE
B) FALSE

Solution: (A)

True, because the MA representation is not unique.

Note that for an MA(1) model, ρ(h) is the same for θ and 1/θ;
try 5 and 1/5, for example. In addition, the pair σ²_w = 1 and
θ = 5 yields the same autocovariance function as the pair σ²_w
= 25 and θ = 1/5.
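
A quick numerical check of this claim, as a sketch: for the MA(1) model x_t = w_t + θ w_{t−1} with Var(w_t) = σ²_w, the autocovariances are γ(0) = σ²_w (1 + θ²) and γ(1) = σ²_w θ, with γ(h) = 0 for h > 1. The function name below is an illustrative assumption.

```python
def ma1_autocovariance(theta, sigma2):
    """Return (gamma(0), gamma(1)) for x_t = w_t + theta * w_{t-1},
    where w_t is white noise with variance sigma2."""
    gamma0 = sigma2 * (1 + theta ** 2)
    gamma1 = sigma2 * theta
    return gamma0, gamma1


print(ma1_autocovariance(theta=5, sigma2=1))
print(ma1_autocovariance(theta=1 / 5, sigma2=25))
```

Both parameter pairs give γ(0) = 26 and γ(1) = 5 (up to floating-point rounding), so the two models cannot be told apart from the autocovariance function alone.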

28) How many AR and MA terms should be included for the
time series by looking at the above ACF and PACF plots?

A) AR (1) MA(0)
B) AR(0)MA(1)
C) AR(2)MA(1)
D) AR(1)MA(2)
E) Can’t Say

Solution: (B)
A strong negative correlation at lag 1 suggests an MA term, and
there is only 1 significant lag.

29) Which of the following is true for white noise?

A) Mean =0
B) Zero autocovariances
C) Zero autocovariances except at lag zero
D) Quadratic Variance

Solution: (C)

A white noise process must have a constant mean, a constant
variance and no autocovariance structure (except at lag zero,
which is the variance).

30) For the following MA(3) process
y_t = μ + ε_t + θ1 ε_{t-1} + θ2 ε_{t-2} + θ3 ε_{t-3}, where ε_t is a
zero mean white noise process with variance σ²

A) ACF = 0 at lag 3
B) ACF =0 at lag 5
C) ACF =1 at lag 1
D) ACF =0 at lag 2
E) ACF = 0 at lag 3 and at lag 5

Solution: (B)
Recall that an MA(q) process only has memory of length q. This
means that all of the autocorrelation coefficients will have a
value of zero beyond lag q. This can be seen by examining the
MA equation, and seeing that only the past q disturbance
terms enter into the equation, so that if we iterate this
equation forward through time by more than q periods, the
current value of the disturbance term will no longer affect y.
Finally, since the autocorrelation function at lag zero is the
correlation of y at time t with y at time t (i.e. the correlation
of y_t with itself), it must be one by definition.
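
The cutoff can be seen by computing the theoretical ACF directly. This is a sketch under the standard result γ(h) = σ² Σ_{j=0}^{q−h} θ_j θ_{j+h} with θ_0 = 1; the coefficient values 0.5, 0.4, 0.3 are made-up illustrative numbers, not taken from the question.

```python
def ma_acf(thetas, max_lag):
    """Theoretical autocorrelations rho(0), ..., rho(max_lag) of an MA(q)
    process. `thetas` holds theta_1..theta_q; theta_0 = 1 is prepended."""
    coeffs = [1.0] + list(thetas)
    q = len(thetas)

    def autocov(h):
        # gamma(h) / sigma^2 = sum_{j=0}^{q-h} theta_j * theta_{j+h}; zero once h > q
        if h > q:
            return 0.0
        return sum(coeffs[j] * coeffs[j + h] for j in range(q - h + 1))

    gamma0 = autocov(0)
    return [autocov(h) / gamma0 for h in range(max_lag + 1)]


acf = ma_acf([0.5, 0.4, 0.3], max_lag=5)  # hypothetical MA(3) coefficients
print([round(r, 4) for r in acf])
```

The printed list starts at 1 for lag 0, is non-zero up to lag 3, and is exactly zero at lags 4 and 5, which is why "ACF = 0 at lag 5" is the option that must hold.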

31) Consider the following AR(1) model with the disturbances
having zero mean and unit variance.

y_t = 0.4 + 0.2 y_{t-1} + u_t

The (unconditional) variance of y will be given by ?

A) 1.5
B) 1.04
C) 0.5
D) 2

Solution: (B)

Variance of the disturbances divided by (1 minus the square of
the autoregressive coefficient),

which in this case is: 1/(1 − 0.2^2) = 1/0.96 ≈ 1.04
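
As a small sketch of this formula in Python (the function name and the stationarity check are illustrative additions):

```python
def ar1_unconditional_variance(phi, sigma2=1.0):
    """Unconditional variance of y_t = c + phi * y_{t-1} + u_t,
    where Var(u_t) = sigma2. Requires |phi| < 1 for stationarity."""
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")
    return sigma2 / (1 - phi ** 2)


print(round(ar1_unconditional_variance(0.2), 4))
```

The intercept 0.4 does not appear because it shifts the mean of the process, not its variance.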


32) The PACF (partial autocorrelation function) is necessary for
distinguishing between ______ ?

A) An AR and MA model
B) An AR and an ARMA
C) An MA and an ARMA
D) Different models from within the ARMA family

Solution: (B)

33) Second differencing in time series can help to eliminate
which trend?
A) Quadratic Trend
B) Linear Trend
C) Both A & B
D) None of the above

Solution: (A)

The first difference is denoted as ∇x_t = x_t − x_{t−1}. (1)

As we have seen, the first difference eliminates a linear trend.
A second difference, that is, the difference of (1), can
eliminate a quadratic trend, and so on.
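
The effect of differencing on polynomial trends can be illustrated with a short sketch. The two series below are synthetic examples chosen for illustration, and `difference` is a hypothetical helper name.

```python
def difference(series, order=1):
    """Apply the difference operator x_t - x_{t-1} `order` times."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


linear = [3 + 2 * t for t in range(8)]                 # linear trend
quadratic = [1 + 2 * t + 3 * t * t for t in range(8)]  # quadratic trend

print(difference(linear))        # constant: the linear trend is gone
print(difference(quadratic))     # still increasing linearly
print(difference(quadratic, 2))  # constant: the quadratic trend is gone
```

Each round of differencing shortens the series by one observation and lowers the degree of a polynomial trend by one.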

34) Which of the following cross validation techniques is better
suited for time series data?

A) k-Fold Cross Validation
B) Leave-one-out Cross Validation
C) Stratified Shuffle Split Cross Validation
D) Forward Chaining Cross Validation

Solution: (D)
Time series is ordered data, so the validation data must be
ordered too. Forward chaining ensures this. It works as follows:

 fold 1 : training [1], test [2]


 fold 2 : training [1 2], test [3]
 fold 3 : training [1 2 3], test [4]
 fold 4 : training [1 2 3 4], test [5]
 fold 5 : training [1 2 3 4 5], test [6]
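
The fold layout above can be generated programmatically. A minimal sketch, assuming the data has already been cut into consecutive, time-ordered blocks numbered from 1; the generator name is hypothetical.

```python
def forward_chaining_splits(n_blocks):
    """Yield (train_blocks, test_block) pairs using 1-based block labels.
    Each fold trains on all blocks up to k and tests on block k + 1."""
    for k in range(1, n_blocks):
        yield list(range(1, k + 1)), k + 1


for fold, (train, test) in enumerate(forward_chaining_splits(6), start=1):
    print(f"fold {fold}: training {train}, test [{test}]")
```

The test block always lies strictly after every training block, so no future information leaks into training. scikit-learn's `TimeSeriesSplit` offers a comparable expanding-window scheme.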

35) BIC penalizes complex models more strongly than the AIC.

A) TRUE
B) FALSE

Solution: (A)

AIC = -2*ln(likelihood) + 2*k,

BIC = -2*ln(likelihood) + ln(N)*k,

where:

k = model degrees of freedom

N = number of observations

At relatively low N (7 and less) BIC is more tolerant of free
parameters than AIC, but less tolerant at higher N (as the
natural log of N overcomes 2).
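
The crossover at N = 8 follows from ln(7) ≈ 1.946 < 2 < ln(8) ≈ 2.079. The sketch below implements the two formulas as stated; the log-likelihood of −100 and k = 3 are arbitrary illustrative values.

```python
import math


def aic(log_likelihood, k):
    """AIC = -2 * ln(likelihood) + 2 * k."""
    return -2 * log_likelihood + 2 * k


def bic(log_likelihood, k, n):
    """BIC = -2 * ln(likelihood) + ln(N) * k."""
    return -2 * log_likelihood + math.log(n) * k


for n in (5, 7, 8, 60):
    # True means BIC penalizes the same model more heavily than AIC.
    print(n, round(math.log(n), 3), bic(-100.0, 3, n) > aic(-100.0, 3))
```

For any realistic sample size the comparison prints True, which is why the statement in the question is treated as true.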

36) The figure below shows the estimated autocorrelation and
partial autocorrelations of a time series of n = 60 observations.
Based on these plots, we should:
A) Transform the data by taking logs
B) Difference the series to obtain stationary data
C) Fit an MA(1) model to the time series

Solution: (B)

The autocorrelation shows a definite trend and the partial
autocorrelation shows a choppy trend. In such a scenario
taking a log would be of no use. Differencing the series to
obtain a stationary series is the only option.

Question Context (37-38)


37) Use the estimated exponential smoothing given above
and predict temperature for the next 3 years (1998-2000).

These results summarize the fit of a simple exponential smooth
to the time series.

A) 0.2,0.32,0.6
B) 0.33, 0.33,0.33
C) 0.27,0.27,0.27
D) 0.4,0.3,0.37

Solution: (B)
The predicted value from the exponential smooth is the same
for all 3 years, so all we need is the value for next year. The
expression for the smooth is

smootht = α yt + (1 – α) smooth t-1

Hence, for the next point, the next value of the smooth (the
prediction for the next observation) is

smoothn = α yn + (1 – α) smooth n-1

= 0.3968*0.43 + (1 – 0.3968)* 0.3968

= 0.3297

38) Find 95% prediction intervals for the predictions of
temperature in 1999.

These results summarize the fit of a simple exponential smooth
to the time series.

A) 0.3297 ± 2 * 0.1125
B) 0.3297 ± 2 * 0.121
C) 0.3297 ± 2 * 0.129
D) 0.3297 ± 2 * 0.22

Solution: (B)

The sd of the prediction errors is

1 period out: 0.1125
2 periods out: 0.1125 * sqrt(1 + α^2) = 0.1125 * sqrt(1 + 0.3968^2) ≈ 0.121
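
The two-periods-out figure can be reproduced as follows. This sketch assumes the usual simple exponential smoothing result that the h-step forecast error sd is s * sqrt(1 + (h − 1) α²); the solution above states only the h = 2 case, so the general form and the function name are assumptions.

```python
import math


def ses_forecast_sd(one_step_sd, alpha, horizon):
    """Standard deviation of the h-step-ahead forecast error for simple
    exponential smoothing: s * sqrt(1 + (h - 1) * alpha**2)."""
    return one_step_sd * math.sqrt(1 + (horizon - 1) * alpha ** 2)


sd_two_out = ses_forecast_sd(0.1125, 0.3968, horizon=2)
forecast = 0.3297
print(round(sd_two_out, 3))
print(round(forecast - 2 * sd_two_out, 4), round(forecast + 2 * sd_two_out, 4))
```

The point forecast stays at 0.3297 for every horizon; only the width of the interval grows as the horizon increases.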

39) Which of the following statement is correct?

1. If autoregressive parameter (p) in an ARIMA model is 1, it
means that there is no auto-correlation in the series.
2. If moving average component (q) in an ARIMA model is 1,
it means that there is auto-correlation in the series with lag 1.
3. If integrated component (d) in an ARIMA model is 0, it
means that the series is not stationary.

A) Only 1
B) Both 1 and 2
C) Only 2
D) All of the statements

Solution: (C)

Autoregressive component: AR stands for
autoregressive. The autoregressive parameter is denoted by
p. When p = 0, it means that there is no auto-correlation in
the series. When p = 1, it means that the series
auto-correlation extends up to one lag.

Integrated: In ARIMA time series analysis, the integrated
component is denoted by d. Integration is the inverse of
differencing. When d = 0, it means the series is stationary and
we do not need to take the difference of it. When d = 1, it
means that the series is not stationary and to make it
stationary, we need to take the first difference. When d = 2, it
means that the series has been differenced twice. Usually,
differencing more than twice is not reliable.

Moving average component: MA stands for moving
average, and its order is denoted by q. In ARIMA, q = 1
means that there is an error term with
auto-correlation at one lag.

40) In a time-series forecasting problem, if the seasonal indices
for quarters 1, 2, and 3 are 0.80, 0.90, and 0.95 respectively.
What can you say about the seasonal index of quarter 4?

A) It will be less than 1
B) It will be greater than 1
C) It will be equal to 1
D) Seasonality does not exist
E) Data is insufficient

1. An orderly set of data arranged in accordance with their
time of occurrence is called:

(a) Arithmetic series
(b) Harmonic series
(c) Geometric series
(d) Time series
2. A time series consists of:

(a) Short-term variations
(b) Long-term variations
(c) Irregular variations
(d) All of the above

3. Secular trend can be measured by:

(a) Two methods
(b) Three methods
(c) Four methods
(d) Five methods

4. The secular trend is measured by the method of
semi-averages when:

(a) Time series based on yearly values
(b) Trend is linear
(c) Time series consists of even number of values
(d) None of them

5. Increase in the number of patients in the hospital due to
heat stroke is:

(a) Secular trend
(b) Irregular variation
(c) Seasonal variation
(d) Cyclical variation
6. In time series seasonal variations can occur within a period
of:

(a) Four years
(b) Three years
(c) Less than one year
(d) Nine years

7. Wheat crops badly damaged on account of rains is:

(a) Cyclical movement
(b) Random movement
(c) Secular trend
(d) Seasonal movement

8. The method of moving average is used to find the:

(a) Secular trend
(b) Seasonal variation
(c) Cyclical variation
(d) Irregular variation

9. A complete cycle consists of a period of:

(a) Prosperity and depression
(b) Prosperity and recovery
(c) Prosperity and recession
(d) Recession and recovery

10. A complete cycle passes through:

(a) Two stages
(b) Three stages
(c) Four stages
(d) Difficult to tell

11. Most frequently used mathematical model of a time series
is:

(a) Additive model
(b) Mixed model
(c) Multiplicative model
(d) Regression model

12. In a straight line equation Y = a + bX; a is the:

(a) X-intercept
(b) Slope
(c) Y-intercept
(d) None of them

13. In a straight line equation Y = a + bX; b is the:

(a) Y-intercept
(b) Slope
(c) X-intercept
(d) Trend

14. Value of b in the trend line Y = a + bX is:

(a) Always negative
(b) Always positive
(c) Always zero
(d) Either negative or positive

15. In semi averages method, we divide the data into:

(a) Two parts
(b) Two equal parts
(c) Three parts
(d) depending on size of data

16. In fitting a straight line, the value of slope b remains
unchanged with the change of:

(a) Scale
(b) Origin
(c) Both (a) and (b)
(d) Neither (a) nor (b)

17. Moving average method is used for measurement of trend
when:

(a) Trend is linear
(b) Trend is non linear
(c) Trend is curvilinear
(d) None of them

18. Indicate which of the following is an example of seasonal
variation:

(a) Death rate decreased due to advances in science
(b) The sale of air conditioners increases during summer
(c) Recovery in business
(d) Sudden changes caused by wars

19. The most commonly used mathematical method for
measuring the trend is:

(a) Moving average method
(b) Semi average method
(c) Method of least squares
(d) None of them

20. The better-fitted trend is the one for which the sum of
squares of residuals is:

(a) Maximum
(b) Minimum
(c) Positive
(d) Negative

21. Decomposition of time series is called:

(a) Histogram
(b) Analysis of time series
(c) Histogram
(d) Detrending

22. The fire in a factory is an example of:

(a) Secular trend
(b) Seasonal movements
(c) Cyclical variations
(d) Irregular variations

23. Increased demand of admission in the subject of computer
in Pakistan is:

(a) Secular trend
(b) Cyclical trend
(c) Seasonal trend
(d) Irregular trend

24. Damages due to floods, droughts, strikes, fires and political
disturbances are:

(a) Trend
(b) Seasonal
(c) Cyclical
(d) Irregular

25. The general pattern of increase or decrease in economic
or social phenomena is shown by:

(a) Seasonal trend
(b) Cyclical trend
(c) Secular trend
(d) Irregular trend

26. In moving average method, we cannot find the trend
values of some:

(a) Middle periods
(b) End periods
(c) Starting periods
(d) Extreme periods

27. The best fitting trend is one for which the sum of squares of
residuals is:

(a) Negative
(b) Least
(c) Zero
(d) Maximum

28. In fitting of a straight line, the value of slope remains
unchanged by change of:

(a) Scale
(b) Origin
(c) Both origin and scale
(d) None of them

29. Depression in business is:

(a) Secular trend
(b) Cyclical
(c) Seasonal
(d) Irregular

30. In fitting of straight line = 0

(a) All the observed Y values lie on the line
(b) All the Y values are greater than corresponding values
(c) All the Y values are positive
(d) None of them

31. The rise and fall of a time series over periods longer than
one year is called:

(a) Secular trend
(b) Seasonal variation
(c) Cyclical variation
(d) Irregular variation

32. A time series has _____ components.

(a) Two
(b) Three
(c) Four
(d) Five

33. The multiplicative time series model is:

(a) Y = T + S + C + I
(b) Y = TSCI
(c) Y = a + bX
(d) Y = a + bX + cX²

34. The additive model of the time series is:

(a) Y = T + S + C + I
(b) Y = TSCI
(c) Y = a + bX
(d) Y = a + bX + cX²
35. The difference between the actual value of the time series
and the forecasted value is called:

(a) Residual
(b) Sum of variation
(c) Sum of squares of residual
(d) All of the above

36. A pattern that is repeated throughout a time series and
has a recurrence period of at most one year is called:

(a) Cyclical variation
(b) Irregular variation
(c) Seasonal variation
(d) Long term variation

37. When the production of a thing is maximum, this stage is
called:

(a) Boom
(b) Recovery
(c) Recession
(d) Depression

38. When the production of a thing is minimum, this stage is
called:

(a) Prosperity
(b) Recession
(c) Recovery
(d) Depression

39. When the production of a thing is increasing towards
prosperity, this stage is called:

(a) Recession
(b) Recovery
(c) Boom
(d) Depression

40. When the production of a thing is decreasing, this stage is
called:

(a) Recession
(b) Recovery
(c) Prosperity
(d) Depression

41. For odd number of years, formula to code the values of X
by taking origin at center is:

(a) X = year – Middle of years
(b) X = year – first year
(c) X = year – last year
(d) X = year – ½ average of years

42. For even number of years when origin is in the center and
the unit of X being one year, then X can be coded as:

(a) X = (year – average of years)/2
(b) X = year – average of Middle two years
(c) X = year – 0.5 average of years
(d) X = average of years – year

43. For even number of years when origin is in the center and
the unit of X being half year, then X can be coded as:

(a) X = year – average of years
(b) X = 2(year – average of Middle two years)
(c) X = (year – average year)/2
(d) X = year – ½ average of years

44. In semi averages method, if the number of values is odd
then we drop:

(a) First value
(b) Last value
(c) Middle value
(d) Middle two values

45. The trend values in freehand curve method are obtained
by:

(a) Equation of straight line
(b) Graph
(c) Second degree parabola
(d) All of the above

46. The most important factors causing seasonal variations
are ______.

(a) growth in population
(b) technological improvements
(c) weather and social customs
(d) change in fashions

47. The most widely used method of measuring seasonal
variations is _______.

(a) ratio-to-moving average method
(b) ratio-to-trend method
(c) link relative method
(d) method of simple average

48. In the least square linear trend equation Y = a + bX, if b is
positive, it indicates ______.

(a) declining trend
(b) rising trend
(c) no trend at all
(d) all of these

49. Cyclical fluctuations are caused by _________.

(a) wars
(b) earthquakes
(c) floods
(d) none

50. Time-series analysis is based on the assumption that
_______.

(a) random error terms are normally distributed
(b) there are dependable correlations between the variable to
be forecast and other independent variables.
(c) past patterns in the variable to be forecast will continue
unchanged into the future.
(d) the data do not exhibit a trend.

51. Which of the following is not one of the four types of
variation that is estimated in time-series analysis?

(a) Predictable
(b) Trend
(c) cyclical
(d) Irregular

52. In time-series analysis, which source of variation can be
estimated by the ratio-to-trend method?

(a) seasonal
(b) Trend
(c) cyclical
(d) Irregular

53. Number of periods included in a group for moving averages
depends on _______ in a time series data

(a) Curvilinear trend
(b) Cyclic fluctuations
(c) Seasonal fluctuations
(d) Period of oscillation
54. A rise in price before Eid is a _________.

(a) seasonal trend
(b) secular Trend
(c) cyclical trend
(d) Irregular trend

55. The most commonly used mathematical method for
measuring the trend is _______.

(a) Moving average method
(b) semi average method
(c) least square
(d) ratio to trend

56. The best-fitted trend line is one for which sum of squares
of residuals or errors is

(a) positive
(b) negative
(c) zero
(d) minimum

57. The 3 yearly moving average for the year 2005 is given by
_______.

Year  2004  2005  2006  2007  2008
Y     3     6     9     3     4

(a) 3
(b) 9
(c) 6
(d) 0

58. For the given data semi averages for the first half is given
by _______.

Year  2010  2011  2012  2013  2014  2015  2016  2017
Y     20    16    9     11    40    23    21    12

(a) 13
(b) 15
(c) 16
(d) 14

59. For the straight-line equation Y = 56 + 12X, the trend value
for X = -3 is given by _____.

(a) 23
(b) 36
(c) 20
(d) 68

60. For the given data short term fluctuations for the year
2013 by using additive model is given by _______.

Year         2013  2014  2015  2016  2017
Y            80    90    92    83    94
Trend value  84    86    88    90    92

(a) 4
(b) -4
(c) 0.95
(d) -0.95

61. For the given data short term fluctuations for the year
2014 by using Multiplicative model is given by _______.

Year         2013  2014  2015  2016  2017
Y            80    90    92    83    94
Trend value  84    86    88    90    92

(a) 1.046
(b) -1.046
(c) -4
(d) 1

62. Shortage of certain consumer goods before annual budget
is due to

(a) Secular trend
(b) Irregular variation
(c) Seasonal variation
(d) Cyclical variation

63. Linear trend of a time series indicates _______.

(a) constant rate of growth
(b) constant rate of change
(c) change in geometric progression
(d) all the above
64. The method of moving average is used to find the

(a) Seasonal trend
(b) Irregular trend
(c) Secular trend
(d) Cyclical trend

65. If trend is absent in the data then the ________ method is
used for computing seasonal indices.

(a) ratio to trend
(b) Simple Average
(c) semi average
(d) ratio to moving average

66. The method of least squares indicates that we choose the
regression line where the sum of the squares of deviations of the
points from the line is _______.

(a) positive
(b) Maximum
(c) Zero
(d) Minimum

67. _______ variations in a time series are caused by the
increase in sales of air conditioners during summer.

(a) Seasonal
(b) Irregular
(c) Secular
(d) Cyclic

68. Trend in a time series means _______.

(a) long-term regular movement
(b) short-term regular movement
(c) both (a) and (b)
(d) neither (a) nor (b)

69. A time series is a set of data recorded _______.

(a) at successive points of time
(b) periodically
(c) at time or space intervals
(d) all the above

70. Which of the following is not present in a time series?

(a) Seasonality
(b) Operational variations
(c) Trend
(d) Cycle

71. A lock-out in a factory for a month belongs to the _______
component of a time series.

(a) Irregular variation


(b) Secular trend
(c) Cyclical variation
(d) None

72. The sales of a department store on Dussehra and Diwali are
associated with the _________ variation component of a time series.

(a) Trend
(b) Seasonal
(c) Irregular
(d) Cyclical

73. The consistent increase in production of cereals constitutes
the component of a time series ________.

(a) Seasonal Variation


(b) Cyclical Variation
(c) Secular Trend
(d) None

74. Secular trend is indicative of long-term variation towards
___________.

(a) Increase only


(b) Decrease only
(c) Either increase or decrease
(d) None

75. Irregular variations in a time series are caused by
___________.

(a) Lockouts and strikes


(b) Epidemics
(c) Floods
(d) All the above

76. The least squares method of estimating a trend line in a time series
consists of estimating _______ constants.

(a) Zero
(b) Two
(c) Less than two
(d) More than two.

77. Simple average method is used to find ________.

(a) Seasonal Variation


(b) Cyclic Variation
(c) Secular trend
(d) None

78. The most frequently used mathematical model of a time series is
the __________ model.

(a) Additive
(b) Multiplicative
(c) Mixed
(d) Regression

79. The additive model assumes the components of a time series
are _____ of each other.

(a) dependent
(b) inter connected
(c) independent
(d) None

80. The ________ model of time series assumes that all the
components of a time series are dependent.

(a) Additive
(b) Multiplicative
(c) Mixed
(d) regression

81. The linear trend of sales of a company is Rs. 6,50,000 in
1995 and it rises by Rs. 16,500 per year. The trend equation is
given by __________

(a) Y = 6,50,000+16,500 X with origin year 1995


(b) Y = 16,500+6,50,000 X with origin year 1995
(c) Y = 6,50,000 - 16,500 X with origin year 1995
(d) Y = 6,50,000 - 16,500 t

82. The linear trend equation of sales of a company is given by
Y = 6,50,000 + 16,500 X, with origin year 1995 and X unit = 1 year.
Predicted sales for the year 2000 are __________

(a) 6,50,000
(b) 16,5000
(c) 33,650,000
(d) 7,32,500
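
For question 82, the only step beyond substitution is converting the calendar year into X, the number of years since the origin. A minimal sketch, assuming the hypothetical helper name `predicted_sales`:

```python
ORIGIN_YEAR = 1995
INTERCEPT = 650_000   # Rs. 6,50,000: trend value in the origin year
SLOPE = 16_500        # Rs. 16,500: rise per year


def predicted_sales(year):
    """Trend value Y = INTERCEPT + SLOPE * X, where X = year - ORIGIN_YEAR."""
    x = year - ORIGIN_YEAR
    return INTERCEPT + SLOPE * x


sales_2000 = predicted_sales(2000)
print(sales_2000)  # 732500, i.e. Rs. 7,32,500
```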

83. For the trend equation y = 85.6 + 2.4x (origin 2000; x unit = 1 year;
y = annual production of sugar in '000 quintals), the slope of the line is ___________.

(a) 2.4 quintals


(b) 85.6 quintals
(c) 2400 quintals
(d) 24000 quintals

84. The trend equation y = 85.6 - 2.4x (origin 2000; x unit =
1 year) shows ____________ trend

(a) increasing
(b) Decreasing
(c) constant
(d) non linear

85. The trend equation y = 85.6 + 2.4x (origin 2000; x unit =
1 year; y = annual production of sugar in '000 quintals) has
___________ as the monthly increase in production.

(a) 0.2 quintals


(b) 200 quintals
(c) 2400 quintals
(d) 24000 quintals
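
Questions 83 and 85 both turn on unit conversion: the slope is stated in thousands of quintals per year. The sketch below converts it to quintals per year and then to quintals per month; the variable names are illustrative assumptions.

```python
slope_thousand_quintals_per_year = 2.4

# y is measured in '000 quintals, so multiply by 1000 to get quintals.
annual_increase_quintals = round(slope_thousand_quintals_per_year * 1000)

# The x unit is one year, so divide by 12 for the monthly increase.
monthly_increase_quintals = annual_increase_quintals / 12

print(annual_increase_quintals, monthly_increase_quintals)  # 2400 200.0
```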

86. For the additive model in time series analysis, for annual
data the difference Y – T represents ___________ fluctuations.

(a) seasonal, cyclical and irregular


(b) seasonal and cyclical
(c) cyclical and irregular
(d) seasonal and irregular

Sample Multiple Choice Questions

1 The despatch department of a mail order book company have run out of small
padded envelopes that they use to send books to their customers. Unfortunately, there is a
shortage of these envelopes at their supplier and it will be 5 days before they will be
available. As a result of this, there is a backlog of 3 days of picked orders that are temporarily
stored in the despatch department which cannot be sent to customers until the envelopes
arrive.
Which of the following types of Lean waste are there in this scenario?

A Transport and Over-processing.


B Inventory and Waiting.
C Waiting and Defects.
D Over-production and Motion.

2 The Director of the Information Systems department in a large organisation that
traditionally uses a waterfall approach to development is looking to trial Agile development
on a small project. The project is for a single, small team to develop the first mobile
application for the organisation’s employees and whilst the development team are fully
competent with the development language, they are not familiar with mobile application
development.
The Director has asked the management team to consider which might be an appropriate
Agile method to use for this project.
Which of the following would be best suited to this project?

A DSDM.
B XP.
C Lean Software Development.
D SAFe.

Agile Business Analysis 10 Sample MCQs v0.2 1 © Assist Knowledge Development


3 Dominique is the Head of Recruitment within the Human Resources (HR) department.
Following a review of recruitment processes and systems in the organisation, a business
transformation project has been initiated. The director of HR, Sarah will be providing the
project budget and managing the project finances. Dominique will be the key HR
representative on the Agile project and will be responsible for the day to day decisions.
Which of the following roles is Dominique playing on this project?

A Product owner.
B End user.
C Project sponsor.
D Subject matter expert.



4 An initial IT system use case diagram has been developed for a school Student
Assessment System.

[Use case diagram. System boundary: Student Assessment System.
Use cases: Submit Assessment, Mark Assessment, View Results, Assessment Administration.
Actors: Student, Teacher, Parent, School Administrator, School Inspector, School Principal.]

In reviewing the use case diagram, the business analyst has suggested that the following
elements in the diagram could be incorrect:
i. The ‘Assessment Administration’ use case
ii. The ‘View Results’ use case
iii. The ‘School Administrator’ actor
iv. The ‘Teacher’ actor
v. The System name (within the boundary)

Which of the following combinations identifies what is incorrect in this diagram?

A i and iv.
B ii and iii.
C i and iii.
D iii only.



5 A project team is working on the development of a mobile phone-based expenses
management system for a national mental health charity. Volunteers currently submit their
travel expenses by completing a paper form and sending it with receipts via the post. One of
the requirements is to use GPS tracking to automatically and accurately determine mileage
for visits which is considered useful but is not a mandatory requirement for the project. The
project is nearing completion and this is the final iteration for the project.
Which of the following MoSCoW priorities SHOULD be allocated to the automatic GPS
tracking requirement for the final iteration?

A M.
B S.
C C.
D W.

6 A small independent photography company has a business goal of providing a
scanning and enhancement service to produce premium quality prints from customers' old
printed photographs, negatives and slides.
The business analyst has decomposed this goal into the following potential sub-goals.
i. Buy professional scanner.
ii. Produce prints from customers’ old photographs.
iii. Install scanner.
iv. Provide photographic enhancement service.
v. Produce prints from customers’ photographic negatives.

Which of these are valid sub-goals?

A i and iii only.


B ii, iii and v only.
C ii and v only.
D i, iii and iv only.



7 The following user story is being discussed in the Agile team.

011: Customer fingerprint login

As a customer I want to be able to log in using my fingerprint reader on my mobile phone so
that I can access my account without needing to type my password.

During the conversation it was determined that there is an associated non-functional
requirement that if there are 3 failed attempts to log in using the fingerprint reader then the
customer will need to provide their username and password. This non-functional
requirement is specific to this user story.
Requirements could be documented in the following areas.
i. User story acceptance criteria for “Customer fingerprint login”.
ii. Requirements catalogue.
iii. Use case description for “Customer fingerprint login”.
iv. A separate user story.

Where should the non-functional requirement be documented?

A ii or iv.
B iv only.
C ii only.
D i or iii.

8 As part of a university assessment system project, the project team are reviewing the
following user story.

049: Show my exam results on social media

As a student I want to be able to share my exam certificate on social media so that my friends
and potential employers can see my achievements.
In the review, it was discussed that there are multiple social media channels that could be
used and that no channels had been specified. Which of the INVEST rules for writing a
quality user story are not being met?

A Small and Estimatable.


B Testable and Independent.
C Estimatable and Valuable.
D Valuable and Negotiable.

9 The following burn down chart has been produced showing the situation 7 days
through the latest iteration in the project.



Burn down chart (vertical axis: number of story points, 0 to 60; horizontal axis: iteration timeline in days)

Day                      0   1   2   3   4   5   6   7   8   9  10
Ideal burndown          50  45  40  35  30  25  20  15  10   5   0
Story points remaining  50  50  49  45  39  39  30  30

What does this chart tell us about the current iteration?


i. The iteration will not deliver all the remaining story points by the end of the
iteration.
ii. No user stories were completed on days 1, 5 and 7.
iii. The team has completed 30 story points so far in the iteration.
iv. No work was done on days 1, 5 and 7.
v. There are 30 user story points left to complete in the iteration.

A i and iv only.
B ii and iii only.
C i, iii and iv only.
D ii and v only.
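
The statements in question 9 can be checked directly against the series behind the chart. The sketch below is only an illustration with assumed variable names. It finds the days on which the remaining total did not drop, and the points completed and left at day 7. A flat day shows that no story points were burned down; it does not show that no work was done, and the chart alone cannot say how the remaining days will go.

```python
# Story points remaining at the end of days 0 to 7, read from the chart data.
remaining = [50, 50, 49, 45, 39, 39, 30, 30]

# Days on which the remaining total is unchanged from the previous day.
flat_days = [
    day for day in range(1, len(remaining))
    if remaining[day] == remaining[day - 1]
]

completed_so_far = remaining[0] - remaining[-1]
left_to_complete = remaining[-1]

print(flat_days)         # [1, 5, 7]
print(completed_so_far)  # 20
print(left_to_complete)  # 30
```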



10 The project team are holding a meeting and during the meeting they discuss the
following:
• What went well?
• What didn’t go so well?
• What could we do differently next time?
• Actions

Which Agile ceremony does this describe?

A Show and Tell.


B Iteration Planning Meeting.
C Retrospective.
D Daily stand-up.



Multiple choice question answers

Question  Answer  Question  Answer
1         B       6         C
2         B       7         D
3         A       8         A
4         A       9         D
5         C       10        C



Business Research Methods-104
Multiple Choice Questions

1. Research is
(A) Searching again and again
(B) Finding solution to any problem
(C) Working in a scientific way to search for truth of any problem
(D) None of the above

2. Which of the following is the first step in starting the research process?
(A) Searching sources of information to locate problem.
(B) Survey of related literature
(C) Identification of problem
(D) Searching for solutions to the problem

3. A common test in research demands much priority on


(A) Reliability
(B) Usability
(C) Objectivity
(D) All of the above

4. Action research means


(A) A longitudinal research
(B) An applied research
(C) A research initiated to solve an immediate problem
(D) A research with socioeconomic objective

5. A reasoning where we start with certain particular statements and conclude with a universal
statement is called
(A) Deductive Reasoning
(B) Inductive Reasoning
(C) Abnormal Reasoning
(D) Transcendental Reasoning

6. The essential qualities of a researcher are


(A) Spirit of free enquiry
(B) Reliance on observation and evidence
(C) Systematization or theorizing of knowledge
(D) All the above

7. In the process of conducting research ‘Formulation of Hypothesis’ is followed by


(A) Statement of Objectives
(B) Analysis of Data
(C) Selection of Research Tools
(D) Collection of Data

8. A research paper is a brief report of research work based on


(A) Primary Data only
(B) Secondary Data only
(C) Both Primary and Secondary Data
(D) None of the above

9. An appropriate source to find out descriptive information is................ .


(A) Bibliography
(B) Directory
(C) Encyclopedia
(D) Dictionary

10. “Controlled Group” is a term used in.............. .


(A) Survey research
(B) Historical research
(C) Experimental research
(D) Descriptive research

11. Hypothesis testing is a part of


(A) Inferential statistics
(B) Descriptive statistics
(C) Data preparation
(D) Data analysis

12. The method that consists of collection of data through observation and experimentation,
formulation and testing of hypothesis is called
(A) Empirical method
(B) Scientific method
(C) Scientific information
(D) Practical knowledge

13. Information acquired by experience or experimentation is called
(A) Empirical
(B) Scientific
(C) Facts
(D) Scientific evidence

14. “All living things are made up of cells. The blue whale is a living being. Therefore, the blue whale is
made up of cells”. The reasoning used here is
(A) Inductive
(B) Deductive
(C) Both A and B
(D) Hypothetic-Deductive

15. The reasoning that uses a general principle to predict a specific result is called
(A) Inductive
(B) Deductive
(C) Both A and B
(D) Hypothetic-Deductive

16. Every research process starts with


(A) Hypothesis
(B) Experiments to test hypothesis
(C) Observation
(D) All of these
17. The quality of a research study is primarily assessed on:
(A) The place of publication.
(B) The ways in which the recommendations are implemented
(C) The rigor with which it was conducted
(D) The number of times it is replicated.

18. Which of the following is not an appropriate source for academic research?
(A) An online encyclopedia
(B) A government-based research organization database
(C) A peer reviewed journal article
(D) A text book

19. 'Research methodology' refers to:


(A) The sampling technique
(B) The tools that the researcher uses
(C) The chain of association between the research question and the research design
(D) Qualitative methods

20. A researcher should:


(A) Be constrained by the research of others
(B) Use even anonymous sources if they appear relevant
(C) Use only sources that appear credible

21. Research is
(A) A purposeful, systematic activity
(B) Primarily conducted for purely academic purposes
(C) Primarily conducted to answer questions about practical issues
(D) A random, unplanned process of discovery

22. When conducting a review of literature on a particular subject, the researcher should
(A) Read all available material on the subject
(B) Read the whole journal article and then decide whether or not it is useful
(C) Read strategically and critically
(D) Read fully only those texts that appear to agree with his/her point of view

23. The two main styles of research are


(A) Data collection and data coding
(B) Surveys and questionnaires
(C) Sampling and recording
(D) Qualitative and quantitative
24. Qualitative research is:
(A) Not as rigorous as quantitative research
(B) Primarily concerned with the collection and analysis of numerical data
(C) Primarily concerned with in-depth exploration of phenomena
(D) Primarily concerned with the quality of the research

25. Quantitative research involves


(A) Interviewing people to find out their deeply held views about issues
(B) Collecting data in numerical form
(C) More rigor than qualitative research
(D) Interviewing every member of the target population

26. What is the basis of the scientific method?

(A) To test a hypothesis in conditions that are conducive to its success
(B) To formulate a research problem and disprove the hypothesis
(C) To formulate a research problem, test the hypothesis in carefully controlled conditions
that challenge the hypothesis
(D) To test hypotheses and, if they are disproved, abandon them completely

27. Of all the steps in the research process, the one that typically takes the most time is
(A) Data collection
(B) Formulating the problem
(C) Selecting a research method
(D) Developing a hypothesis

28. The concepts in a hypothesis are stated as


(A) Variables
(B) Theories
(C) Indices
(D) Ideas

29. In order for a variable to be measured, a researcher must provide a


(A) Operational definition
(B) Hypothesis
(C) Theory
(D) Scale

30. Which of the following was not identified as a major research design?
(A) Secondary research
(B) Surveys
(C) Field research
(D) Ethnography

31. When a number of researchers use the same operational definition to measure a variable and
achieve the same results, the measure is said to be
(A) Instrumental
(B) Reliable
(C) Valid
(D) Factual

32. There are various types of research designed to obtain different types of information. What
type of research is used to define problems and suggest hypotheses?
(A) Descriptive Research
(B) Primary research
(C) Secondary research
(D) Causal research

33. What type of research would be appropriate in the following situation?


Nestlé wants to investigate the impact of children on their parents' decisions to buy
breakfast foods.
(A) Quantitative research.
(B) Qualitative research
(C) Secondary Research
(D) Mixed methodology

34. What type of research would be appropriate in the following situation?


A college or university bookshop wants to get some insights into how students feel about the
shop's merchandise, prices and service.
(A) Secondary data
(B) Qualitative research
(C) Focus groups
(D) Quantitative research

35. The Internet is a powerful mechanism for conducting research. However it does have its
drawbacks. Which of the following signify these drawbacks?
(A) The possible inclusion of individuals not being targeted, that could skew the results
(B) Lack of information about the population responding to the questionnaire.
(C) Eye contact and body language (two useful research indicators) are excluded from the
analysis
(D) All of the above

36. _____________ research is the gathering of primary data by watching people.


(A) Experimental
(B) Causal
(C) Informative
(D) Observational

37. Which is the best type of research approach for gathering causal information?
(A) Observational
(B) Informative
(C) Experimental
(D) Survey

38. The outcome of what is being measured is termed:


(A) Independent Variable
(B) Dependent Variable
(C) Predictor variable
(D) Hypothetical Variable

39. Which of the following would occur in a longitudinal study:


(A) Measures are taken from different participants over an extended period of time
(B) Participation is expected to last for a minimum of 24 hours
(C) Measures are taken from same participants on different occasions usually over
extended period of time
(D) Measures are taken from participants in at least 6 different countries
40. Endeavors to explain, predict, and/or control phenomena are the goal of
(A) Scientific method
(B) Tradition
(C) Inductive logic
(D) Deductive logic

41. Ms. Casillas has been coordinating the Halloween Festival at her school for the last several
years. She wants to be sure the students and parents enjoy the festival again this year. On which
source is she LEAST likely to rely when making decisions about what to do?
(A) Tradition
(B) Research
(C) Personal experience
(D) Expert opinion

42. The scientific method is preferred over other ways of knowing because it is more
(A) Reliable
(B) Systematic
(C) Accurate
(D) All of these

43. Which of the following steps of the scientific method is exemplified by the researcher
reviewing the literature and focusing on a specific problem that has yet to be resolved?
(A) Describe the procedures to collect information
(B) Identify a topic.
(C) Analyze the collected information
(D) State the results of the data analysis

44. Which of the following is the LEAST legitimate research problem? The purpose of this study
is to
(A) understand what it means to be a part of a baseball team at a high school known for its
championship teams.
(B) study whether physical education should be taught in elementary parochial schools.
(C) examine the relationship between the number of hours spent studying and students' test scores
(D) examine the effect of using advanced organizers on fifth-grade students' reading
comprehension

45. The research process is best described as a


(A) Method to select a frame of reference
(B) Set of rules that govern the selection of subjects
(C) Series of steps completed in a logical order
(D) Plan that directs the research design

46. A research proposal is best described as a


(A) Framework for data collection and analysis
(B) Argument for the merit of the study
(C) Description of how the researcher plans to maintain an ethical perspective during the study
(D) Description of the research process for a research project

47. The purpose of a literature review is to:


(A) Use the literature to identify present knowledge and what is unknown
(B) Assist in defining the problem and operational definition
(C) Identify strengths and weaknesses of previous studies
(D) All of the above

48. The statement 'To identify the relationship between the time the patient spends on the
operating table and the development of pressure ulcers' is best described as a research:
(A) Objective
(B) Aim
(C) Question
(D) Hypothesis

49. An operational definition specifies


(A) The data analysis techniques to be used in the study
(B) The levels of measurement to be used in the study
(C) How a variable or concept will be defined and measured in the study
(D) How the outcome of the research objectives for the study will be measured

50. A statement of the expected relationship between two or more variables is known as the:
(A) Concept definition
(B) Hypothesis
(C) Problem statement
(D) Research question

51. 'There is no difference in the incidence of phlebitis around intravenous cannulae changed
every 72 hours and those changed at 96 hours' is an example of a:
(A) Null hypothesis
(B) Directional hypothesis
(C) Non-directional hypothesis
(D) Simple hypothesis

52. Which of the following statements meets the criteria for a researchable question?
(A) Is the use of normal saline to cleanse wounds harmful to patients?
(B) Do generalist registered nurses meet the mental health needs of general patients?
(C) Do palliative care patients have spiritual needs?
(D) What are the patients' perceptions of the effectiveness of pre-operative education for
total hip replacement?

53. The researcher needs to clearly identify the aim of the study; the question to be answered; the
population of interest; information to be collected, and feasibility in order to decide on the
research
(A) Design and method
(B) Design and assumptions
(C) Purpose and data analysis
(D) Purpose and assumptions

54. A variable that changes due to the action of another variable is known as the
(A) Independent variable
(B) Extraneous variable
(C) Complex variable
(D) Dependent Variable

55. When planning to do social research, it is better to


(A) Approach the topic with an open mind
(B) Do a pilot study before getting stuck into it
(C) Be familiar with the literature on the topic
(D) Forget about theory because this is a very practical undertaking

56. Which comes first, theory or research?


(A) Theory because otherwise you are working in the dark
(B) Research because that is only the way you can develop a theory
(C) It depends on your point of view
(D) The question is meaningless, because you cannot have one without the other

57. We review the relevant literature to know


(A) What is already known about the topic
(B) What concepts and theories have been applied to the topic
(C) Who are the key contributors to the topic
(D) All of the above

58. A deductive theory is one that:


(A) Allows theory to emerge out of the data
(B) Involves testing an explicitly defined hypothesis
(C) Allows for findings to feed back into the stock of knowledge
(D) Uses qualitative methods whenever possible

59. Which of the following is not a type of research question?


(A) Predicting an outcome
(B) Evaluating a phenomenon
(C) Developing good practice
(D) A hypothesis

60. Because of the number of things that can go wrong in research there is a need for:
(A) Flexibility and Perseverance
(B) Sympathetic supervisors
(C) An emergency source of finance
(D) Wisdom to know the right time to quit

61. __________ research seeks to investigate an under-researched area, with
preliminary data that helps shape the direction for future research.
(A) Descriptive
(B) Exploratory
(C) Explanatory
(D) Positivist

62. Research questions in qualitative studies typically begin with which of the following words?
(A) Why
(B) How
(C) What
(D) All of the above

63. Qualitative researchers seek to analyze which of the following?


(A) Numerical data derived from the frequency of particular behaviors
(B) Statistical associations between variables
(C) The social meaning people attribute to their experiences and circumstances
(D) All of the above

64. Which of the following is not a qualitative research methodology?


(A) Randomized control trial
(B) Ethnography
(C) Grounded Theory
(D) Phenomenology

65. Which of the following data collecting methods is not normally used in qualitative research?
(A) Participant observation
(B) Focus groups
(C) Questionnaire
(D) Semi-structured interview

66. The following journal article would be an example of _________ research.

“The benefits of fluorescent lighting on production in a factory setting”
(A) Applied
(B) Basic
(C) Interview
(D) Stupid

67. The scientific method is preferred over other ways of knowing because it is more
(A) Reliable
(B) Systematic
(C) Accurate
(D) All of the above

68. Quantitative researchers' preoccupation with generalization is an attempt to:

(A) Develop the law-like findings of the natural sciences
(B) Boost their chances of publication
(C) Enhance the internal validity of the research
(D) Demonstrate the complex techniques of statistical analysis

69. What is the basis of the scientific method?
(A) To test hypotheses in conditions that are conducive to their success
(B) To formulate a research problem and disprove the hypothesis
(C) To formulate a research problem, test the hypothesis in carefully controlled conditions
that challenge the hypothesis
(D) To test hypotheses and, if they are disproved, abandon them completely

70. A literature review requires


(A) Planning
(B) Clear writing
(C) Good writing
(D) All of the above

71. The facts that should be collected to measure a variable depend upon the
(A) Conceptual understanding
(B) Dictionary meaning
(C) Operational definition
(D) All of the above

72. Which of the following is the BEST hypothesis?


(A) Students taking formative quizzes will perform better on chapter exams than students not
taking these quizzes
(B) Taller students will have higher test scores than shorter students
(C) Students taught in a cooperative group setting should do better than students in a
traditional class
(D) Students using laptops will do well

73. Which of the following is the best hypothesis statement to address the research question
“What impact will the new advertising campaign have on use of brand B?”
(A) The new advertising campaign will impact brand B image
(B) The new advertising campaign will impact brand B image trial
(C) The new advertising campaign will impact brand B usage at the expense of brand C
(D) The new advertising campaign will impact brand B’s market penetration

74. Qualitative and quantitative research are the classifications of research on the basis of
(A) Use of the research
(B) Time dimension
(C) Techniques used
(D) Purpose of the research

75. Rationalism is the application of


(A) Research Solution
(B) Logic and arguments
(C) Reasoning
(D) Previous findings

76. Why do you need to review the existing literature?


(A) To give your dissertation a proper academic appearance with lots of references
(B) Because without it, you could never reach the required word count
(C) To find out what is already known about your area of interest
(D) To help in your general studying

77. The application of the scientific method to the study of business problems is called
(A) Inductive reasoning
(B) Deductive reasoning
(C) Business research
(D) Grounded Theory

78. An operational definition is


(A) One that bears no relation to the underlying concept
(B) An abstract, theoretical definition of a concept
(C) A definition of a concept in terms of specific, empirical measures
(D) One that refers to opera singers and their work

79. According to empiricism, which of the following is the ultimate source of all our concepts
and knowledge?
(A) Perceptions
(B) Theory
(C) Sensory experiences
(D) Logic and arguments

80. Which of the following is most beneficial to read in an article?


(A) Methods
(B) Introduction
(C) Figures
(D) References

81. Which of the following is not a function of clearly identified research questions?
(A) They guide your literature search
(B) They keep you focused throughout the data collection period
(C) They make the scope of your research as wide as possible
(D) They are linked together to help you construct a coherent argument

82. Hypothesis refers to


A. The outcome of an experiment
B. A conclusion drawn from an experiment
C. A form of bias in which the subject tries to outguess the experimenter
D. A tentative statement about the relationship

83. Statistics is used by researchers to


A. Analyze the empirical data collected in a study
B. Make their findings sound better
C. Operationally define their variables
D. Ensure the study comes out the way it was intended

84. A literature review is based on the assumption that


A. Copy from the work of others
B. Knowledge accumulates and we learn from the work of others
C. Knowledge disaccumulates
D. None of the above options

85. A theoretical framework


A. Elaborates the relationships among the variables
B. Explains the logic underlying these relationships
C. Describes the nature and direction of the relationships
D. All of the above

86. Which of the following statements is not true?


A. A research proposal is a document that presents a plan for a project
B. A research proposal shows that the researcher is capable of successfully conducting
the proposed research project
C. A research proposal is an unorganized and unplanned project
D. A research proposal is just like a research report and written before the research
project

87. Preliminary data collection is a part of the


A. Descriptive research
B. Exploratory research
C. Applied research
D. Explanatory research

88. Conducting surveys is the most common method of generating


A. Primary data
B. Secondary data
C. Qualitative data
D. None of the above

89. After identifying the important variables and establishing the logical reasoning in
the theoretical framework, the next step in the research process is
A. To conduct surveys
B. To generate the hypothesis
C. To hold focus group discussions
D. To use experiments in an investigation

90. The appropriate analytical technique is determined by


A. The research design
B. Nature of the data collected
C. Nature of the hypothesis
D. Both A & B

91. A discrete variable is also called……….


A. Categorical variable
B. Discontinuous variable
C. Both A & B
D. None of the above

92. “Officers in my organization have a higher than average level of commitment.” Such a
hypothesis is an example of……….
A. Descriptive Hypothesis
B. Directional Hypothesis
C. Relational Hypothesis
D. All of the above
93. ‘Science’ refers to……….
A. A system for producing knowledge
B. The knowledge produced by a system
C. Both A & B
D. None of the above
94. Which one of the following is not a characteristic of scientific method?
A. Deterministic
B. Rationalism
C. Empirical
D. Abstraction

95. The theoretical framework discusses the interrelationships among the……….


A. Variables
B. Hypothesis
C. Concept
D. Theory

96. ………research is based on naturalism.


A. Field research
B. Descriptive research
C. Basic research
D. Applied research

97. Rationalism is the application of which of the following?


A. Logic and arguments
B. Research solution
C. Reasoning
D. Previous findings

98. On which of the following does scientific knowledge mostly rely?


A. Logical understanding
B. Identification of events
C. Prior knowledge
D. All of the given options

99. Which of the following refers to research supported by measurable evidence?
A. Opinion
B. Empiricism
C. Speculation
D. Rationalism
100. Research method is applicable in all of the following fields, EXCEPT;
A. Health care
B. Religion
C. Business
D. Government offices

101. All of the following are true statements about action research, EXCEPT;
A. Data are systematically analyzed
B. Data are collected systematically
C. Results are generalizable
D. Results are used to improve practice

102. Which of the following is a characteristic of action research?


A. Variables are tightly controlled
B. Results are generalizable
C. Data are usually qualitative
D. Results demonstrate cause-and-effect relationships

103. “Income distribution of employees” in a specific organization is an example of which of the
following types of variable?

A. Discontinuous variable
B. Continuous variable
C. Dependent variable
D. Independent variable

104. “There is no relationship between higher motivation level and higher efficiency” is an
example of which type of hypothesis?
A. Alternative
B. Null
C. Correlational
D. Research

105. Which of the following is not a role of hypothesis?


A. Guides the direction of the study
B. Determine feasibility of conducting the study
C. Identifies relevant and irrelevant facts
D. Provides framework for organizing the conclusions

106. Which of the following is not a source of information for exploratory
research?
A. Content analysis
B. Survey
C. Case study
D. Pilot study

107. Which of the following is the main quality of a good theory?


A. A theory that has survived attempts at falsification
B. A theory that is proven to be right
C. A theory that has been disproved
D. A theory that has been falsified

108. A variable that is presumed to cause a change in another variable is known as:
A. Discontinuous variable
B. Dependent variable
C. Independent variable
D. Intervening variable

109. Internal validity refers to.


a. Researcher’s degree of confidence.
b. Generalisability
c. Operationalization
d. All of the above

110. In ___________, the researcher attempts to control and/ or manipulate the variables in
the study.
a. Experiment
b. Hypothesis
c. Theoretical framework
d. Research design

111. In an experimental research study, the primary goal is to isolate and identify the effect
produced by the ____.
a. Dependent variable
b. Extraneous variable
c. Independent variable
d. Confounding variable

112. A measure is reliable if it provides consistent ___________.


a. Hypothesis
b. Results
c. Procedure
d. Sensitivity

113. ______ is the evidence that the instrument, techniques, or process used to measure
concept does indeed measure the intended concepts.
a. Reliability
b. Replicability
c. Scaling
d. Validity

114. Experimental design is the only appropriate design where_________ relationship can
be established.
a. Strong
b. Linear
c. Weak
d. Cause and Effect
115. In which one of the following stages does the researcher consult the literature?
a. Operation test
b. Response analysis survey
c. Document design analysis
d. Pretest interviews

116. Two variables may be said to be causally related if


a.they show a strong positive correlation.
b.all extraneous variables are controlled, and the independent variable creates consistent
differences in behavior of the experimental group.
c.they are observed to co-vary on many separate occasions.
d.they have been observed in a laboratory setting.

117. Theories explain results, predict future outcomes, and


a.rely only on naturalistic observations.
b.guide research for future studies.
c.rely only on surveys.
d.rely only on case studies.

118. Characteristics of the scientific method include


a. anecdotal definition.
b. controlled observation.
c. analysis formulation.
d. adherence to inductive thinking or common sense reasoning.

119. A scientific explanation that remains tentative until it has been adequately tested is called
a(n)
a.theory.
b.law.
c.hypothesis.
d.experiment.

120. The phrase "a theory must also be falsifiable" means


a.researchers misrepresent their data.
b.a theory must be defined so it can be disconfirmed.
c.theories are a rich array of observations regarding behavior but with few facts to support them.
d.nothing.

121. The products of naturalistic observation are best described in terms of


a.explanation.
b.theory.
c.prediction.
d.description.

122. A psychologist watches the rapid eye movements of sleeping subjects and wakes them to
find they report that they were dreaming. She concludes that dreams are linked to rapid eye
movements. This conclusion is based on
a.pure speculation.
b.direct observation.
c.deduction from direct observation.
d.prior prediction.

123. We wish to test the hypothesis that music improves learning. We compare test scores of
students who study to music with those who study in silence. Which of the following is an
extraneous variable in this experiment?
a.the presence or absence of music
b.the students' test scores
c.the amount of time allowed for the studying
d.silence

124. An experiment is performed to see if background music improves learning. Two groups
study the same material, one while listening to music and another without music. The
independent variable is
a.learning.
b.the size of the group.
c.the material studied.
d.music.

125. The most powerful research tool is a (an)


a.clinical study.
b.experiment.
c.survey.
d.correlational study.

126. A major disadvantage of the experimental method is that


a.private funding can never be obtained.
b.APA Ethical Review Committees often do not approve of the research techniques.
c.there is a certain amount of artificiality attached to it.
d.subjects are difficult to find for research projects.

127. In the traditional learning experiment, the effect of practice on performance is investigated.
Performance is the __________ variable.
a.independent
b.extraneous
c.dependent
d.control

128. Collection of observable evidence, precise definition, and replication of results all form the
basis for
a.scientific observation.
b.the scientific method.
c.defining a scientific problem.
d.hypothesis generation.

129. Which of the following is not characteristic of qualitative data?


a. Rich descriptions
b. Concise
c. Voluminous
d. Unorganized

130. An interview conducted by a trained moderator among a small group of respondents in an
unstructured and natural manner is a ----------
a. Depth Interview
b. Case Study
c. Focus Group
d. None of the above

131. Which of the following is not a longitudinal study?


a. Cohort Study
b. Trend Study
c. Panel Study
d. Census Study

132. A measure is reliable if it provides consistent --------


a. Hypothesis
b. Results
c. Procedure
d. Sensitivity

133. Following are the characteristics of the research except:


a. Systematic
b. Data Based
c. Subjective Approach
d. Scientific Inquiry

134. Which of the following similarity is found in qualitative research and survey research?
a. Examine topics primarily from the participant’s perspectives
b. They are guided by predetermined variables to study
c. They are descriptive research methods
d. Have large sample sizes

135. As a researcher, you need not:


a. Master the literature
b. Take numerous detailed notes
c. Create a bibliography list
d. Learn your findings

136. The final step in the research process is to:


a. Conduct a statistical analysis of data
b. Report the research results
c. Dismantle the apparatus
d. Clean the laboratory

137. The two functions of a research design are

(A) theory testing and model building.


(B) exploratory data collection and hypothesis testing
(C) hypothesis testing and theory testing
(D) exploratory data collection and model building
138. __________ involves evaluating potential explanations for observed behavior
(A) Exploratory data collection
(B) Data analysis
(C) Theory testing
(D) Hypothesis testing
139. In a __________ relationship, changes in one variable produce changes in another..
(A) causal
(B) correlational
(C) confounded
(D) unidirectional
140. The two defining characteristics of experimental research are:
(A) measuring predictor and criterion variables.
(B) random assignment of participants and measuring dependent variables
(C) manipulation of independent variables and control over extraneous variables
(D) random assignment of participants and control over extraneous variables
141. In an experiment on the effects of noise on problem solving, you have some participants
solve a problem while being exposed to noise, whereas other participants do the same problems
while not being exposed to noise. In this example, exposing or not exposing participants to the
noise constitutes a(n).
(A) independent variable
(B) dependent variable
(C) extraneous variable
(D) correlational variable.
142. In an experiment on visual perception, you make sure that your laboratory is the same
temperature and has the same level of lighting throughout the experiment. This is an example of:.
(A) holding extraneous variables constant.
(B) manipulating an independent variable
(C) randomly assigning participants to conditions
(D) ignoring extraneous variables
143. According to your text, extraneous variables can be dealt with in an experiment by:
(A) holding their values constant across conditions
(B) random assignment of subjects to condition
(C) increasing the power of your independent variables
(D) All of the above
(E) Both a and b
144. Which of the following is the greatest strength of the experimental approach?
(A) the ability to study relationships under naturally occurring conditions
(B) the ability to identify and describe causal relationships
(C) the ability to generalize results beyond the original research situation
(D) All of the above
145. A disadvantage of the experimental approach is that
(A) you cannot adequately control extraneous variables.
(B) causal relationships among variables cannot be established
(C) your results may have limited generality
(D) All of the above
146. If your experimental design measures what it is intended to measure, we say that the design
has a high level of:
(A) reliability.
(B) internal validity.
(C) ecological validity.
(D) external validity
147. Alternative explanations for the findings of a study that may become viable because of
flaws in the design are termed:
(A) rival hypotheses
(B) experimental hypotheses
(C) theoretical possibilities
(D) goofs
148. Which of the following was listed in your text as a factor affecting external validity?
(A) history
(B) reactive testing
(C) statistical regression
(D) All of the above
149. You would be most concerned with external validity if you were conducting
(A) applied research
(B) basic research
(C) a demonstration
(D) None of the above
150. Power of the test of significance means probability of what?

(a) Incorrect rejection of the null hypothesis


(b) Correct rejection of the null hypothesis
(c) Incorrect acceptance of the null hypothesis
(d) Correct acceptance of the null hypothesis.
151. In evaluating the significance of the research problem, an important social consideration is

(a) The genuine interest of the researcher in the problem.


(b) Practical value of the findings to educationists, parents and social workers, etc.
(c) Necessary skills, abilities and background of knowledge of the researcher.
(d) Possibility of obtaining reliable and valid data by the researchers.
152. Thinking analogously about hypothesis, a researcher should

(a) First bet and then roll the dice.


(b) First roll the dice and then bet.
(c) Change his bet after the data are in.
(d) Have no bets, but dice only.
153. Why is research in education important for teachers?
(a) It adds to their academic qualifications.
(b) It makes them wiser
(c) It makes them better teachers
(d) It enables them to make best possible judgments about what should be taught and how.
154. Action research is ordinarily concerned with problems
(a) Of general nature.
(b) Constituting universal truths.
(c) Are of immediate concern and call for immediate solutions.
(d) Have long-range implications.
155. Which of the following is not a correct statement?

(a) A test can be reliable without being valid


(b) A test cannot be valid without being reliable
(c) A test can be reliable and valid both
(d) A test can be valid without being reliable.
156. Projective technique is used for measuring

(a) Individual’s need for self-actualization.


(b) Individual’s inventoried interests.
(c) Individual’s dominant feelings, emotions, conflicts, needs which are, generally,
repressed by the individual and are stored up in the unconscious mind.
(d) Individual’s value-system.
157. Which of the following is not a projective technique?
(a) Rorschach
(b) T.A.T.
(c) Sentence-Completion Test
(d) Maudsley Personality Inventory (MPI).
158. Which of the following is not measured by the T.A.T. test?

(a) Personality needs


(b) Emotions
(c) Personality adjustment.
(d) Reasoning ability.
159. Which is a projective test?
(a) Edwards Personal Preference Schedule (EPPS)
(b) Allport-Vernon-Lindzey Study of Values.
(c) Rorschach Test
(d) Minnesota Multiphasic Personality Inventory (MMPI).
160. What is a research describing developmental changes in personality characteristics by
studying the same group at different age- levels?
(a) Developmental study
(b) Trend study
(c) Longitudinal growth study
(d) Cross-sectional growth study.
161. What is studying different groups of children of different ages simultaneously and
describing their developmental characteristics?
(a) Longitudinal growth study
(b) Trend study
(c) Time series study
(d) Cross-sectional growth study.
162. When is type-I error increased?
(a) When alpha-level decreases
(b) When alpha-level increases
(c) When the sample size increases
(d) When the sample size decreases.
163. What is the modern method of acquiring knowledge?
(a) Authority
(b) Personal experience
(c) Scientific method
(d) Expert opinion
164. What is NOT the goal of scientific method of acquiring knowledge?
(a) Explanation
(b) Fact-finding
(c) Control
(d) Prediction.
165. Theory, as an aspect of research, does not
(a) Serve as a tool for providing a guiding framework for observation and discovery.
(b) Describe the facts and relationships that exist.
(c) Serve as a goal providing explanation for specific phenomena with maximal probability and
exactitude.
(d) Discard facts, specific and concrete observations.
166. “Theory” helps the researcher in
(a) Understanding the research procedure.
(b) Identifying the facts needed to be considered in the context of the research problem.
(c) Understanding the technical terms used in research.
(d) Determining how to make or record observations.
167. Exploratory investigation of a management question adopts the following approaches except
a. Films, photographs, and videotape
b. In-depth interviewing
c. Document analysis
d. Street ethnography
e. Survey method

168. ------------ are questions the researcher must answer to satisfactorily arrive at a conclusion
about the research question.
a) Investigative questions
b) Research question
c) Measurement question
d) Fine-tuning the research question

169. Research should be _________________


a) accessible
b) transparent
c) transferable
d) all of the above
170. A ___________________ is conducted to detect weaknesses in research instrument’s
design
a) Pilot study
b) Questionnaire
c) Interview
d) Sampling

171. What are the two types of arguments


a) deduction and induction
b) exploratory and deductive
c) dejection and injection
d) none of the above

172. What are the qualities of a good hypothesis


a) adequate for the purpose
b) testable
c) better than its rivals
d) all of the above
173. Data collection that focuses on providing an accurate description of the variables in a
situation forms the basis of which type of study
a) exploratory study
b) descriptive study
c) causal study
d) All of the above

174. A condition that exists when an instrument measures what it is supposed to measure is
called
a) validity
b) accuracy
c) reliability
d) none of the above
175. The major disadvantage with in-depth interviews is that because of their time-consuming
nature it is usually only possible to carry out a relatively small number of such interviews and
as such the results are likely to be highly ____________
a) subjective
b) objective
c) questionable
d) objectionable

176. A critical review of the information, pertaining to the research study, already available in
various sources is called

a) Research review
b) Research design
c) Data review
d) Literature review

177. ____________________ presents a problem, discusses related research efforts, outlines the
data needed for solving the problem and shows the design used to gather and analyze the data.

a.) Research Question


b.) Research Proposal
c.) Research Design
d.) Research Methodology
178. The purpose of __________________ research is to help in the process of developing a
clear and precise statement of the research problem rather than in providing a definitive answer.

a.) Marketing
b.) Causal
c.) Exploratory
d.) Descriptive

179. A systematic, controlled, empirical, and critical investigation of natural phenomena guided
by theory and hypothesis is called _____________

a.) Applied Research


b.) Basic Research
c.) Scientific Research
d.) None Of The Above

180. __________________ is the determination of the plan for conducting the research and as
such it involves the specification of approaches and procedures..

a.) Strategy
b.) Research Design
c.) Hypothesis
d.) Deductive
181. A proposal is also known as a:

a) Work plan
b) Prospectus
c) Outline
d) Draft plan
e) All of the above
182. Every research proposal, regardless of length should include two basic sections. They are:

a) Research question and research methodology


b) Research proposal and bibliography
c) Research method and schedule
d) Research question and bibliography
183. The following are the synonyms for independent variable except

a) Stimulus
b) Manipulated
c) Consequence
d) Presumed Cause

184. The following are the synonyms for dependent variable except

a) Presumed effect
b) Measured Outcome
c) Response
d) Predicted from…
185. Which of the following is not a characteristic of research?

a. It requires the collection of new data


b. It is reiterative
c. It requires reasoned arguments to develop conclusions
d. It aims to increase understanding

186. How many stages are there to the research process?


a. 5
b. 6
c. 7
d. 8

187. What would NOT be a consideration during the research design stage?
a. The availability of literature
b. The availability of participants
c. The type of methods that would be used
d. The type of analysis that would take place

188. Your conceptual framework is normally developed?


a. Before your literature review
b. During your literature review
c. After data collection
d. After data analysis

190. When assessing a research question, which is not an element of ‘CAFÉ’?


a. Control
b. Access
c. Facilities and resources
d. Expertise

191. What should not be included in a research proposal?


a. A summary of existing work in the area
b. The proposed methods to collect data
c. The results that will be obtained
d. An acknowledgement of any ethical issues

192. Reliability in quantitative research refers to


a. The consistency of any measure
b. The suitability of any measure
c. Both A and B
d. Neither A nor B

193. Reliability in qualitative research refers to


a. The consistency of any measure
b. The consistency of the methods used to collect data
c. The suitability of the measure used
d. All of these
194. An experimental research design normally involves
a. Manipulating the independent variable
b. Manipulating the dependent variable
c. A number of repeated measures
d. Data collected over an extended time period

195. Which of the following are not normally a requirement for experimental research design?
a. Demonstrating covariation
b. Demonstrating time order
c. Demonstrating repeated measures
d. Demonstrating non-spuriousness
DNYANSAGAR INSTITUTE OF MANAGEMENT AND RESEARCH

Specialization : Business Analytics

Course Code : 205 Course Name: Business Analytics using R Programming

MCQ

Sr Question Answer
No
1 Which of these measures are used to analyse the central tendency of data? B
a. Mean and Normal Distribution
b. Mean, Median and Mode
c. Mode, Alpha & Range
d. Standard Deviation, Range and Mean
e. Median, Range and Normal Distribution
2 Five numbers are given: (5, 10, 15, 5, 15). Now, what would be the sum of D
deviations of individual data points from their mean?

A) 10 B)25 C) 50 D) 0 E) None of the above
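
The arithmetic behind this question can be checked with a minimal R sketch. The variable names are illustrative and not part of the question sheet; the point is that deviations from the mean always cancel out.

```r
# Data from the question
x <- c(5, 10, 15, 5, 15)

m <- mean(x)             # 50 / 5 = 10
deviations <- x - m      # -5  0  5 -5  5
total <- sum(deviations)
print(total)             # 0
```

The same result holds for any numeric vector, because the mean is the balance point of the data.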


3 A test is administered annually. The test has a mean score of 150 and a A
standard deviation of 20. If Ravi’s z-score is 1.50, what was his score on the
test?

A) 180 B) 130 C) 30 D) 150


E) None of the above
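
A short R sketch of the z-score relationship used here, z = (score - mean) / sd, rearranged to recover the raw score. The variable names are illustrative.

```r
test_mean <- 150
test_sd <- 20
z <- 1.50

# Rearranging z = (score - mean) / sd gives score = mean + z * sd
score <- test_mean + z * test_sd
print(score)   # 180
```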
4 Business intelligence (BI) is a broad category of application programs which D
includes _____________
a) Decision support b) Data mining c) OLAP
d) All of the mentioned

5 Point out the correct statement. A


a) OLAP is an umbrella term that refers to an assortment of software
applications for analyzing an organization’s raw data for intelligent decision
making
b) Business intelligence equips enterprises to gain business advantage from
data
c) BI makes an organization agile thereby giving it a lower edge in today’s
evolving market condition
d) None of the mentioned
6 BI can catalyze a business’s success in terms of _____________ d
a) Distinguish the products and services that drive revenues
b) Rank customers and locations based on profitability
c) Ranks customers and locations based on probability
d) All of the mentioned

7 Which of the following areas are affected by BI? D


a) Revenue b) CRM c) Sales d) All of the mentioned
8 Business intelligence (BI) is a broad category of application programs which D

Prof. Dhananjay Bhavsar www.dimr.edu.in



includes _____________
a) Decision support b) Data mining c) OLAP
d) All of the mentioned

9 Which of the following measures of central tendency will always change if a A


single value in the data changes?

A) Mean B) Median C) Mode D) All of these
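
To see why the mean is the measure that reacts to every change, here is a small R sketch with made-up numbers (the vectors are hypothetical, chosen only for illustration). Changing one value shifts the mean, while the median and mode may stay the same.

```r
before <- c(1, 2, 2, 4, 100)
after  <- c(1, 2, 2, 4, 200)   # only the last value changed

mean_changed   <- mean(before) != mean(after)       # 21.8 vs 41.8
median_changed <- median(before) != median(after)   # both are 2

print(mean_changed)     # TRUE
print(median_changed)   # FALSE
```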


10 Strong assessment items are made up of five elements: A
a) Standard b) Stimulus c) Stem d) Key e) Distractors

11 A good question is ---------------. It focuses on recall of only the material covered A
in your lesson and aligns well with the overall learning objectives.
a) relevant b) clear c) concise d) purpose

12 A good question is framed in a -----------, easily understandable language, A
without any vagueness. Students should understand what is wanted from the
question even when they don’t know the answer to it.
a) clear b) relevant c) concise d) purpose

13 A good question is usually crisp and ----------. It omits any unnecessary A
information that requires students to spend time understanding it correctly.
The idea is not to trick learners but assess their knowledge.
a) concise b) clear c) relevant d) purpose
14 _____ programming language is a dialect of S. C
a) B b) C c) R d) K

15 Point out the WRONG statement? C


a) Early versions of the S language contain functions for statistical modeling
b) The book Programming with Data by John Chambers documents S version of
the language
c) In 1993 Bell Labs gave StatSci (later Insightful Corp.) an exclusive license to
develop and sell the S language
d) The book Programming with Data by IBM documents S version of the
language

16 In 1991, R was created by Ross Ihaka and Robert Gentleman in the Department D
of Statistics at the University of _________
a) John Hopkins
b) California
c) Harvard
d) Auckland


17 Point out the wrong statement? A


a) R is a language for data analysis and graphics
b) K is language for statistical modelling and graphics
c) One key limitation of the S language was that it was only available in a
commercial package, S-PLUS
d) C is a language for data and graphics

18 Business analytics results in which of these? D


a. Evidence Based Decisions
b. Data Driven Decisions
c. Better Decisions
d. All of these are correct
19 Which one of the following is not a type of Business Analytics? D
a. Descriptive Analytics
b. Diagnostic Analytics
c. Predictive Analytics
d. Performance Analytics
20 What will be the output of the following R code snippet? C

> paste("a", "b", se = ":")

a) “a+b”
b) “a=b”
c) “a b :”
d) none of the mentioned
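
A minimal R sketch of what happens with the misspelled argument. In `paste()`, `sep` comes after `...`, so it has to be named in full; `se = ":"` is not matched to `sep` and is treated as one more string to paste, joined with the default single-space separator.

```r
# 'se' is not partially matched to 'sep', so ":" becomes a third string
res_partial <- paste("a", "b", se = ":")
print(res_partial)   # "a b :"

# Spelling the argument in full uses ":" as the separator
res_full <- paste("a", "b", sep = ":")
print(res_full)      # "a:b"
```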

21 Point out the correct statement? D


a) In R, a function is an object which has the mode function
b) R interpreter is able to pass control to the function, along with arguments that
may be necessary for the function to accomplish the actions that are desired
c) Functions are also often written when code must be shared with others or the
public
d) All of the mentioned

22 The __________ function returns a list of all the formal arguments of a function. A
a) formals()
b) funct()
c) formal()
d) fun()
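
A brief sketch of `formals()` on a made-up function (the function `f` and its arguments are hypothetical, used only to show the shape of the result):

```r
f <- function(a, b = 2, ...) NULL

arg_list <- formals(f)      # pairlist of the formal arguments
arg_names <- names(arg_list)
print(arg_names)            # "a" "b" "..."
print(arg_list$b)           # 2, the default value of b
```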

23 What will be the output of the following R code snippet? A

> f <- function(num = 1) {
+ hello <- "Hello, world!\n"
+ for(i in seq_len(num)) {
+ cat(hello)
+ }
+ chars <- nchar(hello) * num
+ chars
+ }
> f()

a)

Hello, world!

[1] 14

b) Hello, world!

[1] 15

c) Hello, world!

[1] 16

d) Error
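
The count in this question can be verified with a short sketch. It restates the function from the question without console prompts; the only assumption is that it runs in a plain R session. The string has 13 visible characters plus the newline, so `nchar()` reports 14.

```r
f <- function(num = 1) {
  hello <- "Hello, world!\n"
  for (i in seq_len(num)) {
    cat(hello)               # prints the greeting num times
  }
  nchar(hello) * num         # total characters printed
}

res <- f()     # prints "Hello, world!" once
print(res)     # 14
```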

24 Point out the wrong statement? A


a) A formal argument can be a symbol, a statement of the form ‘symbol =
expression’, or the special formal argument
b) The first component of the function declaration is the keyword function
c) The value returned by the call to function is not a function
d) Functions are also often written when code must be shared with others or the
public

25 You can check to see whether an R object is NULL with the _________ function. A
a) is.null()
b) is.nullobj()
c) null()
d) as.nullobj()
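
A small sketch of `is.null()` on a few values, for illustration only:

```r
null_check  <- is.null(NULL)      # TRUE
list_check  <- is.null(list())    # FALSE: an empty list is not NULL
empty_check <- is.null(c())       # TRUE: c() with no arguments returns NULL

print(c(null_check, list_check, empty_check))
```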

Which of the following commands will print NULL? A

a) > args(paste)
b) > arg(paste)
c) > args(pastebin)
d) > arg(bin)
26

What will be the output of the following R code snippet? C

> paste("a", "b", sep = ":")

a) “a+b”
b) “a=b”
c) “a:b”
d) a*b
27
What will be the output of the following R code snippet? A

> f <- function(a, b) {
+ print(a)
+ print(b)
+ }
> f(45)

a) 32
b) 42
c) 52
d) 45
28
What will be the output of the following R code snippet? A

> f <- function(a, b) {
+ a^2
+ }
> f(2)

a) 4
b) 3
c) 2
d) 5
29
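
Both snippets above rely on lazy evaluation: R only evaluates an argument when the function body actually uses it. The sketch below is a hedged illustration with renamed functions (`square_first` and `print_both` are hypothetical names, not from the question sheet).

```r
# 'b' is never used, so omitting it causes no error
square_first <- function(a, b) {
  a^2
}
res <- square_first(2)
print(res)   # 4

# Here 'b' is used after 'a', so 45 is printed before the error is raised
print_both <- function(a, b) {
  print(a)
  print(b)
}
msg <- tryCatch(
  { print_both(45); "no error" },
  error = function(e) conditionMessage(e)
)
print(msg)   # message reporting that argument "b" is missing
```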
Which of the following is a base package for R language? C

a) util
b) lang
c) tools
d) All of the above
30


R comes with a ________ to help you optimize your code and improve its performance. C

a) Debugger
b) Monitor
c) Profiler
d) None of the above

31
debug() flags a function for ______ mode in R. A

a) debug
b) run
c) compile
d) None of the above
32
______ suspends the execution of a function wherever it is called and puts the function C
in debug mode

a) recover()
b) browser()
c) Both of the above
33
A matrix is a ___-dimensional rectangular data set. D

a) 5
b) 4
c) 3
d) 2
34
The _____ function takes a vector or other objects and splits it into groups determined B
by a factor or list of factors.

a) apply()
b) split()
c) isplit()
d) mapply()
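
A minimal sketch of `split()` on made-up data (the vector and grouping factor are hypothetical):

```r
scores <- c(10, 20, 30, 40)
group <- factor(c("a", "b", "a", "b"))

# split() returns a list with one element per factor level
parts <- split(scores, group)
print(parts$a)   # 10 30
print(parts$b)   # 20 40
```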
35
lapply function takes___ arguments in R language C

a) 1
b) 3
c) 4
d) 5
36
____ is used to apply a function over subsets of a vector d

a) apply()
b) lapply()
c) mapply()
d) tapply()
37

_______applies a function over the margins of an array A

a) apply()
b) lapply()
c) tapply()
d) mapply()
38
____ function is the same as lapply() in R C

a) apply()
b) lapply()
c) sapply()
d) tapply()

39
_______ loops over a list and evaluates a function on each element B

a) apply()
b) lapply()
c) sapply()
d) tapply()
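
The four apply-family functions asked about above can be compared side by side. This is a sketch on small made-up inputs, not an exhaustive reference.

```r
m <- matrix(1:6, nrow = 2)            # rows: (1, 3, 5) and (2, 4, 6)

# apply(): function over the margins of an array (1 = rows)
row_sums <- apply(m, 1, sum)          # 9 12

# lapply(): loops over a list or vector, always returns a list
as_list <- lapply(1:3, function(i) i^2)

# sapply(): same as lapply() but simplifies the result where possible
as_vec <- sapply(1:3, function(i) i^2)   # 1 4 9

# tapply(): function over subsets of a vector defined by a grouping
by_group <- tapply(c(10, 20, 30, 40), c("a", "b", "a", "b"), sum)

print(row_sums)
print(as_vec)
print(by_group)                       # a = 40, b = 60
```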
40
__________ is proprietary tool for predictive analytics. B

a) R
b) SAS
c) SSAS
d) SPSS

41
Data frames can be converted to a matrix by calling data._______ C

a) matr()
b) mat()
c) matrix()
d) None of the above

42
Which of the following methods makes a vector of repeated values? A

a) rep()
b) data()
c) view()
d) None of the above
43
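
A short sketch of `rep()` with its two most common arguments:

```r
whole <- rep(1:2, times = 3)   # repeat the whole vector: 1 2 1 2 1 2
each  <- rep(1:2, each = 2)    # repeat each element:     1 1 2 2

print(whole)
print(each)
```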

R objects can have attributes, which are like ________ for the object A

a) metadata
b) features
c) expressions
44
Attributes of an object (if any) can be accessed using the ______ function. C

a) objects()
b) attrib()
c) attributes()
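
A minimal sketch of attributes as metadata on an R object. The `"source"` attribute is a hypothetical example added only for illustration.

```r
m <- matrix(1:6, nrow = 2)

dims <- attributes(m)$dim        # a matrix stores its shape as the "dim" attribute
print(dims)                      # 2 3

attr(m, "source") <- "example"   # attach a custom attribute
attr_names <- names(attributes(m))
print(attr_names)                # "dim" "source"
```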
45
_________ involves predicting a response with meaningful magnitude, such as quantity A
sold, stock price, or return on investment.

a) Regression
b) Clustering
c) Summarization
46
________ provides needed string operators in R C

a) str
b) forcast
c) stringr
47
______ splits a data frame and results in an array (hence the da). Hopefully, you’re B
getting the idea here.

a) apply
b) daply
c) stats
48
The system.time() function returns an object of class _______ which contains two useful bits C
of information.

a) debug_time
b) procedure_time
c) proc_time
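
A small sketch of `system.time()`; the loop being timed is arbitrary and the measured values will differ from machine to machine.

```r
timing <- system.time(for (i in 1:1000) sqrt(i))

timing_class <- class(timing)
print(timing_class)            # "proc_time"
print(timing[["elapsed"]])     # wall-clock seconds, machine dependent
```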
49
Which of the following will start the R program? a

a) $ R
b) & R
c) Rb
50

Unit 2
The third step in decision making process is C
a linear predictions
b dependent predictions
c making predictions
1 d independent predictions
The decision making step, which consists of organization goals, predicting C
alternatives and communicating goals is called
a organization
b alternation
c planning
2 d valuing
The fourth step in decision making process is B
a linear correlation
b making decisions
c implement decisions
d evaluate performance
3
The costs that behaves as irrelevant costs in process of decision making are A
classified as
a past costs
b future costs
c expected costs
4 d sunk costs
Which of these is not a topic covered in a typical Business Analyst Aptitude D
Test?
a. Analytical Thinking c. Data Interpretation
5 b. Listening Skills d. Risk Management
If the test should be 30 minutes, Analytical Thinking is taken in how many C
minutes?
a. 5 c. 10
b. 7 d. 15
6
Primary objective of a business analyst is to help businesses implement B
a. Business systems
b. Business solutions
c. Technology systems
d. Technology solutions
7
Which business professional performs cost-benefit analyses of existing and C
potential customers
a) Marketer
b) Financial Analyst
c) Business Analyst
d) Sales Representative
8
A Use Case is a set of steps, typically defining interactions between a role, A
True or False
a. True
b. False
9
Any fact that the solution can assume to be true when the use case begins is C
what?

a. A win
b. A Failure
c. A success
d. A Precondition
10
A State Diagram is used for what? D
a. Which Events cause a transition between states
b. Which events cause a success between states
c. Allowable behaviour
d. All
11
A Solution Requirement is comprised of two types of requirements. What are A
they?
a. Functional
b. Hard
c. Existing
d. Non-Functional
12
Which of the following is used for Statistical analysis in R language? B

a) Studio
b) RStudio
c) Heck
13
R functionality is divided into a number of ________ A

a) Packages
b) Functions
c) Domains
14
Which of the following is an example of vectorized operation as far as subtraction is b
concerned?

> x <- 1:4


> y <- 6:9

a) x+y
15 b) x-y


c) x/y
d) x*y

What would be the output of the following code? A

> x <- 1:4

> y <- 6:9

> z <- x + y

> z

a) 7 9 11 13
b) 7 9 11 13 14
c) 9 11 13
d) Null
16
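A minimal sketch of the vectorized arithmetic used in the two questions above, with the same `x` and `y`. The names `sum_xy` and `diff_xy` are illustrative only.

```r
# Arithmetic operators work element by element on vectors of equal length.
x <- 1:4
y <- 6:9

sum_xy  <- x + y   # element-wise addition
diff_xy <- x - y   # element-wise subtraction

print(sum_xy)      # 7 9 11 13
print(diff_xy)     # -5 -5 -5 -5
```

No explicit loop is needed; the operator is applied to each pair of elements in turn.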
What would be the output of the following code? A

> x <- 1:4

> x > 2

a) FALSE FALSE TRUE TRUE


b) 1 2 3 4
c) 1 2 3 4 5

17
What would be the value of the following expression? A

log(-1)

a) NaN, with "Warning in log(-1): NaNs produced"


b) 1
c) Null
d) 0
18
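A short sketch of what `log(-1)` returns. The value is `NaN` and R also raises a warning; the `tryCatch()` call below only captures that warning so it can be inspected, and the variable names are illustrative.

```r
# The value itself is NaN (not a number); the warning is suppressed here.
res <- suppressWarnings(log(-1))
print(res)          # NaN
print(is.nan(res))  # TRUE

# Capture the warning instead of the value to confirm that one is raised.
msg <- tryCatch(log(-1), warning = function(w) conditionMessage(w))
print(msg)
```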
What will be the output of the following code? d

> g <- function(x) {
+   a <- 3
+   x + a + y
+   ## 'y' is a free variable
+ }
> g(2)

a) 8
b) 9
c) 42
d) Error
19
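A minimal sketch of why `g(2)` fails, assuming a fresh session in which no `y` has been defined. Once a global `y` exists, the same call succeeds; the value `y <- 3` is chosen only for illustration.

```r
g <- function(x) {
  a <- 3
  x + a + y   # 'y' is a free variable: it is looked up outside the function
}

# With no 'y' anywhere on the search path, the call signals an error.
res_err <- tryCatch(g(2), error = function(e) "error")
print(res_err)  # "error"

# After defining 'y' in the global environment, the lookup succeeds.
y <- 3
res_ok <- g(2)
print(res_ok)   # 8
```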

What will be the output of the following code? C

function(p) {
  params[!fixed] <- p
  mu <- params[1]
  sigma <- params[2]
  ## Calculate the Normal density
  a <- -0.5*length(data)*log(2*pi*sigma^2)
  b <- -0.5*sum((data-mu)^2) / (sigma^2)
  -(a + b)
}

> ls(environment(nLL))

a) “data” “fixed” “param”


b) “data” “variable” “params”
c) “data” “fixed” “params”
d) None of the above
20
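The question above does not show how `nLL` was created. The sketch below assumes it was returned by a constructor function, here called `make.NegLogLik` (a hypothetical name, as are the sample data), so that the inner function's environment holds the constructor's arguments and local variables.

```r
# Hypothetical constructor: returns a function that remembers 'data',
# 'fixed' and 'params' through its enclosing environment.
make.NegLogLik <- function(data, fixed = c(FALSE, FALSE)) {
  params <- fixed
  function(p) {
    params[!fixed] <- p
    mu <- params[1]
    sigma <- params[2]
    ## Calculate the Normal density
    a <- -0.5 * length(data) * log(2 * pi * sigma^2)
    b <- -0.5 * sum((data - mu)^2) / (sigma^2)
    -(a + b)
  }
}

nLL <- make.NegLogLik(c(1.2, 0.8, 1.5, 0.9))
env_names <- ls(environment(nLL))
print(env_names)  # "data" "fixed" "params"
```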
Which of the following is a principle of analytic graphics? D

a) Don’t plot more than two variables at a time


b) Make judicious use of color in your scatterplots
21 c) Show box plots (univariate summaries)


d) Show causality, mechanism, explanation

R is an__________programming language? C

a) Closed source
b) GPL
c) Open source
d) Definite source
22
Who developed R? A

a) Dennis Ritchie
b) John Chambers
c) Bjarne Stroustrup
23
R was named partly after the first names of____R authors? B

a) One
b) Two
c) Three
d) Four
24
Packages are useful in collecting sets into a_____unit ? C

a) Single
b) Multiple
25
Many quantitative analysts use R as their____tool? D

a) Leading tool
b) Programming tool
c) Both the above
26
Predictive analysis is the branch of __________analysis? B

a) Advanced
b) Core
c) Both the above
27
___________ is used to make predictions about unknown future events? C

a) Descriptive analysis
b) Predictive analysis
c) Both the above
28
How many steps does the predictive analysis process contain? d

29 a) 5


b) 6
c) 7
d) 8

Descriptive analysis tells about________? A

a) Past
b) Present
c) Future
30
How many types of R objects are present in R data type? C

a) 4
b) 5
c) 6
d) 7
31
How many types of data types are present in R? A

a) 4
b) 5
c) 6
d) 7
32
Which of the following is a primary tool for debugging? B

a) debug()
b) trace()
c) browser()
d) None of the above
33
Which function is used to create the vector with more than one element? C

a) Library()
b) plot()
c) c()
d) par()
34
In R every operation has a ______ call? B

a) System
b) Function
c) None of the above
35
The ____________ in R is a vector. b

a) Basic data structure


36 b) Basic datatypes


c) Both

R is an interpreted language, so it can be accessed through a _____________? D

a) Disk operating system


b) User interface operating system
c) Operating system
d) Command line interpreter
37
Vectors come in two types: _____ and _____. C

a) Atomic vectors and matrix


b) Atomic vectors and array
c) Atomic vectors and list
38
How many types of atomic vectors are present? C

a) 3
b) 4
c) 5
d) 6
39
How many types of vertices functions are present? B

a) 1
b) 2
c) 3
d) 4
40
_________and_________ are types of matrices functions? C

a) Apply and sapply


b) Apply and lapply
c) Both

41
How many control statements are present in R? A

a) 6
b) 7
c) 8
d) 9
42
Which of the following finds the maximum value in the vector x, excluding missing C
values?

a) rm(x)
43 b) all(x)


c) max(x, na.rm=TRUE)
d) x%in%y
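A minimal sketch of the `na.rm` argument from the question above, using a small made-up vector.

```r
x <- c(3, NA, 7, 1)

max_with_na    <- max(x)                # NA: a missing value propagates
max_without_na <- max(x, na.rm = TRUE)  # 7: missing values are dropped first

print(max_with_na)
print(max_without_na)
```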

Which of the following sorts a dataframe by the order of the elements in B? C

a) x[rev(order(x$B)),]
b) x[ordersort(x$B),]
c) x[order(x$B),]
44
_________ initiates an infinite loop right from the start. B

a) Never
b) Repeat
c) Break
d) Set
45
_______ is used to skip an iteration of a loop. A

a) Next
b) Skip
c) Group
46
47 _____ programming language is a dialect of S. A

a) B
b) C
c) D
d) S

48 In 1991, R was created by Ross Ihaka and Robert Gentleman in the Department of A
Statistics at the University of _________.

a) Auckland
b) Harvard
c) California
d) Johns Hopkins

49 Finally, in _________ R version 1.0.0 was released to the public. A

a) 2000
b) 2005
c) 2010
d) 2012

50 R is technically much closer to the Scheme language than it is to the original _____ B
language.

a) B
b) S
c) C
d) C++

Unit-3

The primary R system is available from the ______ A


a) CRAN
b) CRWO
c) GNU
d) CRDO
1
Point out the wrong statement? B
a) Key feature of R was that its syntax is very similar to S
b) R runs only on Windows computing platform and operating system
c) R has been reported to be running on modern tablets, phones, PDAs, and
game consoles
2 d) R functionality is divided into a number of Packages
R functionality is divided into a number of ________ A
a) Packages
b) Functions
c) Domains
d) Classes

3
Which Package contains most fundamental functions to run R? C
a) root
b) child
c) base
4 d) parent
Which language is best for the statistical environment? B
a) C
b) R
c) Java
5 d) Python
In order to use the R-related functionality in Dundas BI, you must have D
access to an existing _________
a) Console
b) Terminal
c) Packages
6 d) R server
The open source _________ software is available for Unix, Linux, and Windows A
7 platforms.


a) Rserve
b) BServe
c) CServe
d) Dserve
Modification in Dundas BI is done ______________ A
a) Directly
b) Indirectly
c) Need access to Server
8 d) Not known
Is It possible to inspect the source code of R? A
a) Yes
b) No
c) Can’t say
9 d) Some times
__________ function is used to list all available packages in the library. D
a) lib()
b) fun.lib()
c) libr()
10 d) library()
The longer programs are called ____________ C
a) Files
b) Structures
c) Scripts
11 d) Data
Scripts will run on ___________________ A
a) Script Editors
b) Console
c) Terminal
12 d) GCC Compiler
What will be the output of the following R function? A

ab <- list(1, 2, 3, "X", "Y", "Z")


dim(ab) <- c(3,2)
print(ab)
a. 123
Xyz
b. Error
c. Xyz123
13 d. 123xyz
What is the meaning of the following R function? A
x <- c(4, 5, 1, 2, 3, 3, 4, 4, 5, 6)
x <- as.factor(x)
a) x becomes a factor
b) x is a factor
14 c) x does not exist


d) x is not a vector
What will be the output of the following R code? B
print( sqrt(2) )
a) 1.414314
b) 1.414214
c) Error
15 d) 14.1414
What will be the output of the following R function? C
d <- date()
a) Prints todays date
b) Prints some date
c) Prints exact present time and date
16 d) Error
Which of the following commands will correctly read the above csv file with B
5 rows in a dataframe?
A) read.csv('Dataframe.csv')
B) read.csv('Dataframe.csv',header=TRUE)
C) read.dataframe('Dataframe.csv')
D) read.csv2('Dataframe.csv',header=FALSE,sep=',')
17
R functionality is divided into a number of ________ A

a) Packages
b) Functions
c) Domains
18
Consider the following function. A

f <- function(x) {
  g <- function(y) {
    y + z
  }
  z <- 4
  x + g(x)
}

If we execute following commands (written below), what will be the output?

z <- 10

f(4)
19


A) 12

B) 7

C) 4

D) 16
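A minimal sketch of the lexical scoping behind the question above. The inner function `g` finds `z` in the environment where it was defined (the body of `f`), not in the global environment, so the global `z <- 10` is ignored. The name `result` is illustrative.

```r
f <- function(x) {
  g <- function(y) {
    y + z          # 'z' is resolved in f's environment first
  }
  z <- 4
  x + g(x)
}

z <- 10            # global 'z' is shadowed by the 'z' inside f
result <- f(4)     # 4 + (4 + 4)
print(result)      # 12
```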

The iris dataset has different species of flowers such as Setosa, Versicolor and B
Virginica with their sepal length. Now, we want to understand the
distribution of sepal length across all the species of flowers. One way to do
this is to visualise this relation through the graph shown below.

Which function can be used to produce the graph shown above?

A) xyplot()
B) stripplot()
C) barchart()
D) bwplot()
20
The plot above is of type strip whereas the options a, c and d will produce a scatter, D
bar and box whisker plot respectively. Therefore, option B is the correct solution.

Alpha 125.5 0

Beta 235.6 1

Beta 212.03 0

Beta 211.30 0

Alpha 265.46 1

File Name – Dataframe.csv


Which of the following commands will correctly read the above csv file with 5
rows in a dataframe?
A) read.csv('Dataframe.csv')
B) read.csv('Dataframe.csv',header=TRUE)
C) read.dataframe('Dataframe.csv')
21 D) read.csv2('Dataframe.csv',header=FALSE,sep=',')
Excel file format is one of the most common formats used to store datasets. It D
is important to know how to import an excel file into R. Below is an excel file
in which data has been entered in the third sheet.
Alpha 125.5 0
22


Beta 235.6 1

Beta 212.03 0

Beta 211.30 0

Alpha 265.46 1

File Name – Dataframe.xlsx


Which of the following codes will read the above data in the third sheet into a
dataframe in R?
A) openxlsx::read.xlsx("Dataframe.xlsx",sheet=3,colNames=FALSE)
B) xlsx::read.xlsx("Dataframe.xlsx",sheetIndex=3,header=FALSE)
C) XLConnect::readWorksheetFromFile("Dataframe.xlsx",sheet=3,header=FALSE)
D) All of the above

C
A 10 Sam

B 20 Peter

C 30 Harry

D ! ?

E 50 Mark

File Name – Dataframe.csv


Missing values in this csv file have been represented by an exclamation mark
(“!”) and a question mark (“?”). Which of the codes below will read the above
csv file correctly into R?
A) read.csv('Dataframe.csv')
B) read.csv('Dataframe.csv',header=FALSE, sep=',',na.strings=c('?'))
C) read.csv2('Dataframe.csv',header=FALSE,sep=',',na.strings=c('?','!'))
23 D) read.dataframe('Dataframe.csv')
Column 1 Column 2 Column 3 d

Row 1 15.5 14.12 69.5

Row 2 18.6 56.23 52.4

Row 3 21.4 47.02 63.21


24


Row 4 36.1 56.63 36.12

File Name – Dataframe.csv


The above csv file has row names as well as column names. Which of the
following codes will read the above csv file properly into R?
A) read.delim('Train.csv',header=T,sep=',',row.names=TRUE)
B) read.csv2('Train.csv',header=TRUE, row.names=TRUE)
C) read.dataframe('Train.csv',header=TRUE,sep=',')
D) read.csv('Train.csv',,header=TRUE,sep=',')

A
Column 1 Column 2 Column 3

Row 1 15.5 14.12 69.5

Row 2 18.6 56.23 52.4

Row 3 21.4 47.02 63.21

Row 4 36.1 56.63 36.12

File Name – Dataframe.csv


Which of the following codes will read only the first two rows of the csv file?
A) read.csv('Dataframe.csv',header=TRUE,row.names=1,sep=',',nrows=2)
B) read.csv2('Dataframe.csv',row.names=1,nrows=2)
C) read.delim2('Dataframe.csv',header=T,row.names=1,sep=',',nrows=2)
D) read.dataframe('Dataframe.csv',header=TRUE,row.names=1,sep=',',skip.last=2)
25
Dataframe1 Dataframe2 D

Feature1 Feature2 Feature3 Feature4 Feature1 Feature2 Feature3

A 1000 25.5 10 E 5000 65.5

B 2000 35.5 34 F 6000 75.5

C 3000 45.5 78 G 7000 85.5

D 4000 55.5 3 H 8000 95.5

There are two dataframes stored Dataframe1 and Dataframe2 shown above.
26 Which of the following codes will produce the output shown below?


Feature1 Feature2 Feature3

A 1000 25.5

B 2000 35.5

C 3000 45.5

D 4000 55.5

E 5000 65.5

F 6000 75.5

G 7000 85.5

H 8000 95.5

A) merge(dataframe1[,1:3],dataframe2)
B) merge(dataframe1,dataframe2)[,1:3]
C) merge(dataframe1,dataframe2,all=TRUE)
D) Both A and B
E) All of the above

e
V1 V2

1 121.5 461

2 516 1351

3 451 6918

4 613 112

5 112.36 230

6 25.23 1456

7 12 457
27


dataframe
A data set has been read in R and stored in a variable “dataframe”. Which of
the below codes will produce a summary (mean, mode, median) of the entire
dataset in a single line of code?
A) summary(dataframe)
B) stats(dataframe)
C) summarize(dataframe)
D) summarise(dataframe)
E) None of the above

D
A dataset has been read in R and stored in a variable “dataframe”. Missing
values have been read as NA.
A 10 Sam

B NA Peter

C 30 Harry

D 40 NA

E 50 Mark

dataframe
Which of the following codes will not give the number of missing
values in each column?
A) colSums(is.na(dataframe))
B) apply(is.na(dataframe),2,sum)
C) sapply(dataframe,function(x) sum(is.na(x)))
D) table(is.na(dataframe))
28
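A minimal sketch comparing the four options above on the data shown in the question. The column names `id`, `value` and `name` are assumptions, since the original table has no header.

```r
dataframe <- data.frame(
  id    = c("A", "B", "C", "D", "E"),
  value = c(10, NA, 30, 40, 50),
  name  = c("Sam", "Peter", "Harry", NA, "Mark"),
  stringsAsFactors = FALSE
)

# These three give one count of missing values per column.
per_col_a <- colSums(is.na(dataframe))
per_col_b <- apply(is.na(dataframe), 2, sum)
per_col_c <- sapply(dataframe, function(x) sum(is.na(x)))

# table() pools every cell, so it gives one overall TRUE/FALSE tally instead.
overall <- table(is.na(dataframe))

print(per_col_a)
print(overall)
```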
One of the important phases in a data analytics pipeline is univariate analysis D
of the features which includes checking for the missing values and the
distribution, etc. Below is a dataset and we wish to plot histogram
for “Value” variable.
Parameter State Value Dependents

Alpha Active 50 2

Beta Active 45 5

Beta Passive 25 0

Alpha Passive 21 0
29


Alpha Passive 26 1

Beta Active 30 2

Beta Passive 18 0

dataframed
Which of the following commands will help us perform that task?
A) hist(dataframed$Value)
B) ggplot2::qplot(dataframed$Value,geom="histogram")
C) ggplot2::ggplot(data=dataframed,aes(dataframed$Value))+geom_histogram()
D) All of the above

D
Parameter State Value Usage

Alpha Active 50 0

Beta Active 45 1

Beta Passive 25 0

Alpha Passive 21 0

Alpha Passive 26 1

Beta Active 30 1

Beta Passive 18 0

Certain algorithms like XGBoost work only with numerical data. In that case,
categorical variables present in the dataset are first converted to dummy
variables, which represent the presence or absence of a level of a categorical
variable in the dataset. For example, after creating the dummy variables for
the feature “Parameter”, the dataset looks like below.
Parameter_Alpha Parameter_Beta State Value Usage

1 0 Active 50 0

0 1 Active 45 1
30


0 1 Passive 25 0

1 0 Passive 21 0

1 0 Passive 26 1

0 1 Active 30 1

0 1 Passive 18 0

Which of the following commands will help us to achieve this?


A) dummies::dummy.data.frame(dataframe,names=c('Parameter'))
B) dataframe$Parameter_Alpha=0
dataframe$Parameter_Beta=0
dataframe$Parameter_Alpha[which(dataframe$Parameter=='Alpha')]=1
dataframe$Parameter_Beta[which(dataframe$Parameter=='Alpha')]=0
dataframe$Parameter_Alpha[which(dataframe$Parameter=='Beta')]=0
dataframe$Parameter_Beta[which(dataframe$Parameter=='Beta')]=1
C) contrasts(dataframe$Parameter)
D) Both A and B

d
Column1 Column2 Column3 Column4 Column5 Column6

Name1 Alpha 12 24 54 0 Alpha

Name2 Beta 16 32 51 1 Beta

Name3 Alpha 52 104 32 0 Gamma

Name4 Beta 36 72 84 1 Delta

Name5 Beta 45 90 32 0 Phi

Name6 Alpha 12 24 12 0 Zeta

Name7 Beta 32 64 64 1 Sigma

Name8 Alpha 42 84 54 0 Mu

Name9 Alpha 56 112 31 1 Eta


31


Dataframe
We wish to calculate the correlation between “Column2” and “Column3” of a
“dataframe”. Which of the below codes will achieve the purpose?
A) corr(dataframe$column2,dataframe$column3)
B) (cov(dataframe$column2,dataframe$column3))/(var(dataframe$column2)*sd(dataframe$column3))
C) (sum(dataframe$Column2*dataframe$Column3)-(sum(dataframe$Column2)*sum(dataframe$Column3)/nrow(dataframe)))/(sqrt((sum(dataframe$Column2*dataframe$Column2)-(sum(dataframe$Column2)^3)/nrow(dataframe))*(sum(dataframe$Column3*dataframe$Column3)-(sum(dataframe$Column3)^2)/nrow(dataframe))))
D) None of the Above
D
Parameter State Value Dependents

Alpha Active 50 2

Beta Active 45 5

Beta Passive 25 0

Alpha Passive 21 0

Alpha Passive 26 1

Beta Active 30 2

Beta Passive 18 0

Dataframe
The above dataset has been loaded for you in R in a variable
named “dataframe” with first row representing the column name. Which of
the following code will select only the rows for which parameter is Alpha?
A) subset(dataframe, Parameter=’Alpha’)
B) subset(dataframe, Parameter==’Alpha’)
C) filter(dataframe,Parameter==’Alpha’)
32 D) Both B and C
E) All of the above
Which of the following functions is used to view the dataset in a spreadsheet-like B
format?
A) disp()
B) View()
C) seq()
33 D) All of the Above


B
The below dataframe is stored in a variable named data.
A B

1 Right

2 Wrong

3 Wrong

4 Right

5 Right

6 Wrong

7 Wrong

8 Right

Data
Suppose B is a categorical variable and we wish to draw a boxplot for every
level of the categorical variable. Which of the below commands will help us
achieve that?
A) boxplot(A,B,data=data)
B) boxplot(A~B,data=data)
C) boxplot(A|B,data=data)
D) None of the above
34
Which of the following commands will split the plotting window into 4 X 3 B
windows, with the plots entering the window column-wise?
A) par(split=c(4,3))
B) par(mfcol=c(4,3))
C) par(mfrow=c(4,3))
D) par(col=c(4,3))
35
A Dataframe “df” has the following data: D
Dates
2017-02-28
2017-02-27
2017-02-26
2017-02-25
2017-02-24
36 2017-02-23


2017-02-22
2017-02-21
After reading above data, we want the following output:
Dates
28 Tuesday Feb 17
27 Monday Feb 17
26 Sunday Feb 17
25 Saturday Feb 17
24 Friday Feb 17
23 Thursday Feb 17
22 Wednesday Feb 17
21 Tuesday Feb 17

Which of the following commands will produce the desired output?


A) format(df,”%d %A %b %y”)
B) format(df,”%D %A %b %y”)
C) format(df,”%D %a %B %Y”)
D) None of above
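A hedged sketch related to the question above: the listed options pass the whole data frame `df` to `format()`, whereas date format codes apply to a Date vector. Assuming the `Dates` column is stored as class `Date`, formatting the column itself gives strings of the desired shape. Weekday and month names depend on the session locale, so the exact text may differ outside an English locale.

```r
df <- data.frame(Dates = as.Date(c("2017-02-28", "2017-02-27", "2017-02-26")))

# %d = day of month, %A = full weekday name, %b = abbreviated month, %y = 2-digit year
formatted <- format(df$Dates, "%d %A %b %y")
print(formatted)
```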
Which of the following commands will help us to rename the second column in D
a dataframe named “table” from alpha to beta?
A) colnames(table)[2]='beta'
B) colnames(table)[which(colnames(table)=='alpha')]='beta'
C) setnames(table,'alpha','beta')
37 D) All of the above
C
A majority of work in R uses the system's internal memory, and with large
datasets, situations may arise when the R workspace cannot hold all the R
objects in memory. So removing the unused objects is one of the solutions.
Which of the following command will remove an R object / variable named
“santa” from the workspace?
A) remove(santa)
B) rm(santa)
C) Both
38 D) None
“dplyr” is one of the most popular packages used in R for manipulating data D
and it contains 5 core functions to handle data. Which of the following is not
one of the core functions of dplyr package?
A) select()
B) filter()
C) arrange()
39 D) summary()
During Feature Selection using the following dataframe (named table), D
“Column1” and “Column2” proved to be non-significant. Hence, we would not
like to take these two features into our predictive model.
Column1 Column2 Column3 Column4 Column5 Column6

Name1 Alpha 12 24 54 0 Alpha


40


Name2 Beta 16 32 51 1 Beta

Name3 Alpha 52 104 32 0 Gamma

Name4 Beta 36 72 84 1 Delta

Name5 Beta 45 90 32 0 Phi

Name6 Alpha 12 24 12 0 Zeta

Name7 Beta 32 64 64 1 Sigma

Name8 Alpha 42 84 54 0 Mu

Name9 Alpha 56 112 31 1 Eta

Table
Which of the following commands will select all the rows from column 3 to
column 6 for the below dataframe named table?
A) dplyr::select(table,Column3:Column6)
B) table[,3:6]
C) subset(table,select=c(‘Column3’,’Column4’,’Column5’,’Column6’))
D) All of the above
C
Column1 Column2 Column3 Column4 Column5 Column6

Name1 Alpha 12 24 54 0 Alpha

Name2 Beta 16 32 51 1 Beta

Name3 Alpha 52 104 32 0 Gamma

Name4 Beta 36 72 84 1 Delta

Name5 Beta 45 90 32 0 Phi

Name6 Alpha 12 24 12 0 Zeta

Name7 Beta 32 64 64 1 Sigma


41


Name8 Alpha 42 84 54 0 Mu

Name9 Alpha 56 112 31 1 Eta

table
Which of the following commands will select the rows having “Alpha” values
in “Column1” and value less than 50 in “Column4”? The dataframe is stored in
a variable named table.
A) dplyr::filter(table,Column1==’Alpha’, Column4<50)
B) dplyr::filter(table,Column1==’Alpha’ & Column4<50)
C) Both of the above
D) None of the above

C
Column1 Column2 Column3 Column4 Column5 Column6

Name1 Alpha 12 24 54 0 Alpha

Name2 Beta 16 32 51 1 Beta

Name3 Alpha 52 104 32 0 Gamma

Name4 Beta 36 72 84 1 Delta

Name5 Beta 45 90 32 0 Phi

Name6 Alpha 12 24 12 0 Zeta

Name7 Beta 32 64 64 1 Sigma

Name8 Alpha 42 84 54 0 Mu

Name9 Alpha 56 112 31 1 Eta

Table
Which of the following code will sort the dataframe based on “Column2” in
ascending order and “Column3” in descending order?
A) dplyr::arrange(table,desc(Column3),Column2)
B) table[order(-Column3,Column2),]
C) Both of the above
42 D) None of the above
What will be the output of the following command? C
grepl(“neeraj”,c(“dheeraj”,”Neeraj”,”neeraj”,”is”,”NEERAJ”))
43 A) [FALSE TRUE TRUE FALSE TRUE]


B) [FALSE TRUE TRUE FALSE FALSE]


C) [FALSE FALSE TRUE FALSE FALSE]
D) None of the above
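A minimal sketch of the `grepl()` call from the question above. Matching is case-sensitive by default, so only the exact lowercase string matches; the `ignore.case = TRUE` variant is added only for contrast.

```r
words <- c("dheeraj", "Neeraj", "neeraj", "is", "NEERAJ")

case_sensitive   <- grepl("neeraj", words)
case_insensitive <- grepl("neeraj", words, ignore.case = TRUE)

print(case_sensitive)    # FALSE FALSE  TRUE FALSE FALSE
print(case_insensitive)  # FALSE  TRUE  TRUE FALSE  TRUE
```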
C
Sometimes as a Data Scientist working on textual data we come across
instances where we find multiple occurrences of a word which is unwanted.
Below is one such string.
A<-c("I can use because thrice in a sentence because because is a special word.")
A) gsub(“because”,”since”,A)
B) sub(“because”,”since”,A)
C) regexec(“because”,”since”,A)
44 D) None of the above
Imagine a dataframe created through the following code. A
Which of the following command will help us remove the duplicate rows
based on both the columns?
A) df[!duplicated(df),]
B) unique(df)
C) dplyr::distinct(df)
45 D) All of the above
D
Grouping is an important activity in Data Analytics and it helps us discover
some interesting trends which may not be visible easily in the raw data.
Suppose you have a dataset created by the following lines of code.
table<-data.table(foo=c("A","B","A","A","B","A"),bar=1:6)
Which of the following command will help us to calculate the mean bar value
grouped by foo variable?
A) aggregate(bar~foo,table,mean)
B) table::df[,mean(bar),by=foo]
C) dplyr::table%>%group_by(foo)%>%summarize(mean=mean(bar))
46 D) All of the above
47 Dealing with strings is an important part of text analytics and splitting a D
string is often one of the common task performed while creating tokens, etc.
What will be the output of following commands?
A<-paste(“alpha”,”beta”,”gamma”,sep=” ”)
B<-paste(“phi”,”theta”,”zeta”,sep=””)
parts<-strsplit(c(A,B),split=” ”)
A) alpha
B) beta
C) gamma
D) phi
E) theta
F) zeta
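A minimal sketch of the three commands in the question above. `strsplit()` returns a list with one element per input string; the second string contains no space, so it stays in one piece.

```r
A <- paste("alpha", "beta", "gamma", sep = " ")  # "alpha beta gamma"
B <- paste("phi", "theta", "zeta", sep = "")     # "phithetazeta"

parts <- strsplit(c(A, B), split = " ")

print(parts[[1]])  # "alpha" "beta"  "gamma"
print(parts[[2]])  # "phithetazeta"
```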
48 If I have two vectors x<- c(1,3, 5) and y<-c(3, 2), what is produced by the A
expression cbind(x, y)?
A) a matrix with 2 columns and 3 rows
B) a matrix with 3 columns and 2 rows
C) a data frame with 2 columns and 3 rows
D) a data frame with 3 columns and 2 rows
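A minimal sketch of `cbind()` on the two vectors from the question above. The shorter vector is recycled to the length of the longer one, with a warning that is suppressed here; the name `m` is illustrative.

```r
x <- c(1, 3, 5)
y <- c(3, 2)

# cbind() binds vectors as columns of a matrix; 'y' is recycled to length 3.
m <- suppressWarnings(cbind(x, y))

print(m)
print(is.matrix(m))  # TRUE
print(dim(m))        # 3 2  (3 rows, 2 columns)
```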
49 Which of the following commands will convert the following dataframe A
named maverick into the one shown at the bottom?


Input Dataframe – “maverick”


Grade Male Female

A 10 15

B 20 15

A 30 35

Output dataframe
Grade Sex Count

A Male 10

A Female 15

B Male 30

B Female 15

A Male 30

A Female 35

A) tidyr::gather(maverick, Sex,Count,-Grade)
B) tidyr::spread(maverick, Sex,Count,-Grade)
C) tidyr::collect(maverick, Sex,Count,-Grade)
D) None of the above
50 Which of the following commands will help us to replace every instance of C
Delhi with Delhi_NCR in the following character vector?
C<-c(“Delhi is”,”a great city.”,”Delhi is also”,”the capital of India.”)
A) gsub(“Delhi”,”Delhi_NCR”,C)
B) sub(“Delhi”,”Delhi_NCR”,C)
C) Both of the above
D) None of the above
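A minimal sketch of why both `sub()` and `gsub()` work for the vector in the question above: each element contains "Delhi" at most once, so replacing the first match and replacing all matches give the same result. The extra string `twice` is a made-up example showing where the two differ.

```r
city_text <- c("Delhi is", "a great city.", "Delhi is also", "the capital of India.")

first_only <- sub("Delhi", "Delhi_NCR", city_text)   # first match per element
all_matches <- gsub("Delhi", "Delhi_NCR", city_text) # every match per element
same_here <- identical(first_only, all_matches)
print(same_here)  # TRUE

twice <- "Delhi to Delhi"
sub_twice  <- sub("Delhi", "Delhi_NCR", twice)   # "Delhi_NCR to Delhi"
gsub_twice <- gsub("Delhi", "Delhi_NCR", twice)  # "Delhi_NCR to Delhi_NCR"
print(sub_twice)
print(gsub_twice)
```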
Unit -4

D
R has how many atomic classes of objects?
a) 1
b) 2
c) 3
1 d) 5
2 Point out the correct statement? D


a) Empty vectors can be created with the vector() function


b) A sequence is represented as a vector but can contain objects of different classes
c) “raw” objects are commonly used directly in data analysis
d) The value NaN represents undefined value
Numbers in R are generally treated as _______ precision real numbers.
a) single
b) double
c) real
3 d) imaginary
If you explicitly want an integer, you need to specify the _____ suffix.
a) D
b) R
c) L
4 d) K
R is an__________programming language? C
a) Closed source
b) GPL
c) Open source
5 d) Definite source
Solve A
varx <- 23
34 -> vary

print(varx+vary)

a. 57

b. 2334

c. 3423

d. 66
6
Find the output B
varx <- 23
34 -> vary
print(varx == vary)
a. True
b. False
c. None of the above
d. Error

7
Below, we have represented six data points on a scale where vertical lines on scale C
represent unit. Which of the following line represents the mean of the given data
points, where the scale is divided into same units?
8 A) A B) B C) C D) D
If a positively skewed distribution has a median of 50, which of the following E
statement is true?
A) Mean is greater than 50
9 B) Mean is less than 50


C) Mode is less than 50


D) Mode is greater than 50
E) Both A and C
F) Both B and D
Which of the following is a possible value for the median of the below distribution? B
A) 32
B) 26
C) 17
10 D) 40
Which of the following statements are true about Bessel's correction while calculating C
a sample standard deviation?
1. Bessel's correction is always done when we perform any operation on a sample data.
2. Bessel's correction is used when we are trying to estimate population standard
deviation from the sample.
3. Bessel's corrected standard deviation is less biased.
A) Only 2
B) Only 3
C) Both 2 and 3
D) Both 1 and 3
11
If the variance of a dataset is correctly computed with the formula using (n – 1) in the A
denominator, which of the following option is true?
A) Dataset is a sample
B) Dataset is a population
C) Dataset could be either a sample or a population
D) Dataset is from a census
12 E) None of the above
What would be the critical values of Z for 98% confidence interval for a two-tailed test A
?
A) +/- 2.33
B) +/- 1.96
C) +/- 1.64
13 D) +/- 2.55
Studies show that listening to music while studying can improve your memory. To D
demonstrate this, a researcher obtains a sample of 36 college students and gives them
a standard memory test while they listen to some background music. Under normal
circumstances (without music), the mean score obtained was 25 and standard
deviation is 6. The mean score for the sample after the experiment (i.e With music) is
28.
What is the null hypothesis in this case?
A) Listening to music while studying will not impact memory.
B) Listening to music while studying may worsen memory.
C) Listening to music while studying may improve memory.
D) Listening to music while studying will not improve memory but can make it worse.
14
Studies show that listening to music while studying can improve your memory. To B
demonstrate this, a researcher obtains a sample of 36 college students and gives them
a standard memory test while they listen to some background music. Under normal
circumstances (without music), the mean score obtained was 25 and standard
15 deviation is 6. The mean score for the sample after the experiment (i.e With music) is


28.

What would be the Type I error?


A) Concluding that listening to music while studying improves memory, and it’s right.
B) Concluding that listening to music while studying improves memory when it
actually doesn’t.
C) Concluding that listening to music while studying does not improve memory but it
does.

Studies show that listening to music while studying can improve your memory. To B
demonstrate this, a researcher obtains a sample of 36 college students and gives them
a standard memory test while they listen to some background music. Under normal
circumstances (without music), the mean score obtained was 25 and standard
deviation is 6. The mean score for the sample after the experiment (i.e With music) is
28.
After performing the Z-test, what can we conclude ____ ?
A) Listening to music does not improve memory.
B) Listening to music significantly improves memory at p
C) The information is insufficient for any conclusion.
16 D) None of the above
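A minimal sketch of the Z-test arithmetic for the music scenario above, using the figures from the question (population mean 25, standard deviation 6, sample size 36, sample mean 28). A one-sided test is assumed here because the claim is that music improves memory; the variable names are illustrative.

```r
mu0   <- 25   # mean score without music
sigma <- 6    # standard deviation
n     <- 36   # sample size
xbar  <- 28   # mean score with music

se <- sigma / sqrt(n)     # standard error of the mean: 6 / 6 = 1
z  <- (xbar - mu0) / se   # (28 - 25) / 1 = 3

# One-sided p-value: probability of a Z at least this large under the null.
p_one_sided <- pnorm(z, lower.tail = FALSE)

print(z)
print(p_one_sided)
```

A Z of 3 lies beyond the usual one-sided critical values (1.64 at the 5% level, 2.33 at the 1% level), which is consistent with option B.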
A researcher concludes from his analysis that a placebo cures AIDS. What type of error D
is he making?
A) Type 1 error
B) Type 2 error
C) None of these. The researcher is not making an error.
D) Cannot be determined
17
What happens to the confidence interval when we introduce some outliers to the data? B
A) Confidence interval is robust to outliers
B) Confidence interval will increase with the introduction of outliers.
C) Confidence interval will decrease with the introduction of outliers.
18 D) We cannot determine the confidence interval in this case
B
A medical doctor wants to reduce the blood sugar level of all his patients by altering their
diet. He finds that the mean sugar level of all patients is 180 with a standard deviation
of 18. Nine of his patients start dieting and the mean of the sample is observed to be 175.
Now, he is considering recommending that all his patients go on a diet.
Note: He calculates 99% confidence interval.
What is the standard error of the mean?
A) 9
B) 6
C) 7.5
19 D) 18
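A minimal sketch of the standard error calculation for the diet scenario above, with the figures from the question. The 99% interval is included only as an illustration of the note in the question; the variable names are assumptions.

```r
sigma <- 18    # population standard deviation
n     <- 9     # number of patients on the diet
xbar  <- 175   # sample mean

se <- sigma / sqrt(n)   # 18 / 3 = 6

# 99% confidence interval around the sample mean, using the normal quantile.
z99 <- qnorm(0.995)
ci  <- xbar + c(-1, 1) * z99 * se

print(se)
print(ci)
```

Under these assumptions the interval contains the population mean of 180.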
______ is the function in R to get the number of observations in a data frame D

a) n( )
b) ncol( )
c) nobs( )
d) nrow( )
20


A key property of vectors in R language is that D


a. A vector cannot have attributes like dimensions
b. Elements of a vector can be of different classes
c. Elements of a vector can only be a character or numeric
21 d. Elements of a vector all must be of the same class
The definition of free software consists of four freedoms (freedoms 0 through 3).
Which of the following is NOT one of the freedoms that are part of the definition?

a. The freedom to study how the program works, and adapt it to your needs.
b. The freedom to improve the program, and release your improvements to
the public, so that the whole community benefits.
c. The freedom to run the program, for any purpose.
d. The freedom to sell the software for any price.
22
Point out the correct statement : C

a) Blocks are evaluated until a new line is entered after the closing brace
b) Single statements are evaluated when a new line is typed at the start of the
syntactically complete statement
c) The if/else statement conditionally evaluates two statements
d) All of the mentioned
23
Which will be the output of following code ? C
x <- 3
switch(6, 2+2, mean(1:10), rnorm(5))

a) 10
b) 1
c) NULL
d) All of the mentioned
24
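A minimal sketch of the `switch()` call from the question above. With a numeric first argument, `switch()` picks the alternative at that position; when the index is larger than the number of alternatives, it returns `NULL` invisibly. The second call is added only for contrast.

```r
# Index 6 is out of range for three alternatives, so the result is NULL.
out_of_range <- switch(6, 2 + 2, mean(1:10), rnorm(5))
print(is.null(out_of_range))  # TRUE

# Index 2 selects the second alternative, mean(1:10).
second <- switch(2, 2 + 2, mean(1:10), rnorm(5))
print(second)                 # 5.5
```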
_______ is used to continue an iteration of a loop. A

A. next

B. skip

C. group

D. All of the mentioned

25
Point out the correct statement : D

a) R has a number of ways to indicate to you that something’s not right


b) Executing any function in R may result in the condition
26 c) “condition” is a generic concept for indicating that something unexpected


has occurred
d) All of the mentioned

Which of the following is the primary tool for debugging?

a) debug()
b) trace()
c) browser()
d) All of the mentioned
27
Point out the correct statement : A

a) Vectorizing the function can be accomplished easily with the Vectorize()


function
b) There are different levels of indication that can be used, ranging from mere
notification to fatal error
c) Vectorizing the function can be accomplished easily with the vector()
function
d) None of the mentioned
28
Functions are defined using the _________ directive and are stored as R objects A

a) function()
b) funct()
c) functions()
d) All of the mentioned
29
The __________ function returns a list of all the formal arguments of a function A
a) formals()
b) funct()
c) formal()
d) All of the mentioned
30
Which of the following is multivariate version of lapply ? D

a) apply()
b) lapply()
c) sapply()
d) mapply()
31
Point out the correct statement : C

a) split() takes elements of the list and passes them as the first argument of the
function you are applying
b) You can use tsplit() to evaluate a function single time each with a same
argument
c) Sequence of operations is sometimes referred to as “map-reduce”
d) None of the mentioned
32

A function, together with an environment, makes up what is called a ______ B
closure.

a) formal
b) function
c) reflective
d) All of the mentioned
33
The _________ function is used to plot negative likelihood. A

a) plot()
b) graph()
c) graph.plot()
d) None of the mentioned
34
Unit-5

_____ is a subset of _____ D

a) Information design, visual modality


b) Information design, data visualization
c) None of the answers are correct.
d) Data visualization, information design
1
Which of the answers is an example of the kinesthetic modality? D

a) A speech
b) A movie
c) A picture
d) The rain on our face
2
What area represents information in a graphical or pictorial form? D

a) Data design
b) None of the answers are correct.
c) Information design
d) Data visualization
3
Which of the following is an example of a temporal data visualization? A

a) A Gantt chart that is used in project management
b) A histogram that represents proportions
c) A matrix representing interconnecting data among various entities
d) A 3D molecular rendering of a protein
4
By definition, Tableau displays measures over time as a ____________ B
a) Bar
b) Line
c) Histogram
5 d) Scatter Plots
How do you identify a continuous field in Tableau? B
a) It is identified by a blue pill in the visualization
b) It is identified by a green pill in a visualization
c) It is preceded by a # symbol in the data window
6 d) When added to the visualization, it produces distinct values
For creating variable size bins we use _____________ B
a) Sets
b) Groups
c) Calculated fields
d) Table Calculations
7
Which of the following is not a Trend Line model C
a) Linear Trend Line
b) Exponential Trend Line
c) Binomial Trend Line
8 d) Logarithmic Trend Line
Data cleaning consists primarily in implementing ………………….strategies before they A
occur
a) error prevention
b) error detection
c) indicating error
9 d) none of the above
data errors will be detected incidentally during activities A
a) When collecting or entering data
b) When transforming/extracting/transferring data
c) When exploring or analysing data
10 d) When submitting the draft report for peer review
Data cleaning involves repeated cycles of E
a) screening,
b) diagnosing,
c) treatment and
d) documentation of this process.
e) All the above
11
After measurement, …………..are the object of a sequence of typical activities: C
a) Data
b) Information ,
c) Record
12 d) None of the above

Under the lattice graphics system, what do the primary plotting functions C
like xyplot() and bwplot() return?
a) nothing; only a plot is made
b) an object of class "lattice"
c) an object of class "trellis"
13 d) an object of class "plot"
What is produced by the following code?
library(nlme)
library(lattice)
xyplot(weight ~ Time | Diet, BodyWeight)
a) A set of 16 panels showing the relationship between weight and time for
each rat.
b) A set of 3 panels showing the relationship between weight and time for
each diet.
c) A set of 11 panels showing the relationship between weight and diet for
each time.
d) A set of 3 panels showing the relationship between weight and time for
14 each rat…
Which of the following functions can be used to annotate the panels in a C
multi-panel lattice plot?
a) axis()
b) text()
c) panel.abline()
d) points()
15 e) lines()
In the lattice system, which of the following functions can be used to finely D
control the appearance of all lattice plots?
a) par()
b) splom()
c) print.trellis()
d) trellis.par.set()
16
What is ggplot2 an implementation of? B
a) a 3D visualization system
b) the Grammar of Graphics developed by Leland Wilkinson
c) the base plotting system in R
d) the S language originally developed by Bell Labs
17
What is a geom in the ggplot2 system? D
a) a method for mapping data to attributes like color and size
b) a method for making conditioning plots
c) a statistical transformation
18 d) a plotting object like point, line, or other shape
The following code creates a scatterplot of 'votes' and 'rating' from the movies D
dataset in the ggplot2 package. After loading the ggplot2 package with the
library() function, I can run
qplot(votes, rating, data = movies)
How can I modify the code above to add a smoother to the scatterplot?
a) qplot(votes, rating, data = movies) + stats_smooth("loess")
b) qplot(votes, rating, data = movies, panel = panel.loess)
c) qplot(votes, rating, data = movies, smooth = "loess")
d) qplot(votes, rating, data = movies) + geom_smooth()
19

When I run the following code I get an error: C


library(ggplot2)
library(ggplot2movies)
g <- ggplot(movies, aes(votes, rating))
print(g)
I was expecting a scatterplot of 'votes' and 'rating' to appear. What's the
problem?
a) The dataset is too large and hence cannot be plotted to the screen.
b) There is a syntax error in the call to ggplot.
c) ggplot does not yet know what type of layer to add to the plot.
d) The object 'g' does not have a print method
20
Sometimes creating a feature which represents whether another variable has C
missing values or not can prove to be very useful for a predictive model.
Below is a dataframe which has missing values in one of its columns.

Feature1  Feature2
B         NA
C         30
D         40
E         50

Which of the following commands will create a column named “missing” with
value 1 where variable “Feature2” has missing values?

Feature1  Feature2  missing
B         NA        1
C         30        0
D         40        0
E         50        0
21

A)
dataframe$missing<-0
dataframe$missing[is.na(dataframe$Feature2)]<-1
B)
dataframe$missing<-0
dataframe$missing[which(is.na(dataframe$Feature2))]<-1
C) Both of the above
D) None of the above

Suppose there are 2 dataframes “A” and “B”. A has 34 rows and B has 46 rows. C
What will be the number of rows in the resultant dataframe after running the
following command?
merge(A,B,all.x=TRUE)
A) 46
B) 12
C) 34
D) 80
22
The very first thing that a Data Scientist generally does after loading dataset is C
find out the number of rows and columns the dataset has. In technical terms, it is
called knowing the dimensions of the dataset. This is done to get an idea about
the scale of data that he is dealing with and subsequently choosing the right
techniques and tools.
Which of the following command will not help us to view the dimensions of our
dataset?
A) dim()
B) str()
C) View()
D) None of the above
23
C
Sometimes, we face a situation where we have two columns of a dataset and we
wish to know which elements of the column are not present in another column.
This is easily achieved in R using the setdiff command.
       Column1  Column2  Column3  Column4  Column5  Column6
Name1  Alpha    12       24       54       0        Zion
Name2  Beta     16       32       51       1        Beta
Name3  Alpha    52       104      32       0        Gamma
Name4  Beta     36       72       84       1        Delta
Name5  Beta     45       90       32       0        Phi
Name6  Alpha    12       24       12       0        Zeta
Name7  Beta     32       64       64       1        Sigma
Name8  Alpha    42       84       54       0        Mu
Name9  Alpha    56       112      31       1        Eta
24

Dataframe
What will be the output of the following command?
setdiff(dataframe$Column1,dataframe$Column6)==setdiff(dataframe$Column6,dataframe$Column1)
A) TRUE
B) FALSE
C) Can’t Say

A
The below dataset is stored in a variable called “frame”.

A      B
alpha  100
beta   120
gamma  80
delta  110

Which of the following commands will create a bar plot for the above dataset?
Use the values from Column B to represent the height of the bar plot.
A) ggplot(frame,aes(A,B))+geom_bar(stat="identity")
B) ggplot(frame,aes(A,B))+geom_bar(stat="bin")
C) ggplot(frame,aes(A,B))+geom_bar()
25 D) None of the above
A
                   mpg   cyl  disp  hp   drat  wt     qsec   vs  am  gear  carb
Mazda RX4          21.0  6    160   110  3.90  2.620  16.46  0   1   4     4
Mazda RX4 Wag      21.0  6    160   110  3.90  2.875  17.02  0   1   4     4
Datsun 710         22.8  4    108   93   3.85  2.320  18.61  1   1   4     1
Hornet 4 Drive     21.4  6    258   110  3.08  3.215  19.44  1   0   3     1
Hornet Sportabout  18.7  8    360   175  3.15  3.440  17.02  0   0   3     2
Valiant            18.1  6    225   105  2.76  3.460  20.22  1   0   3     1
26

We wish to create a stacked bar chart for the cyl variable, with the vs
variable as the stacking criterion. Which of the following commands will help
us perform this action?
A) qplot(factor(cyl),data=mtcars,geom="bar",fill=factor(vs))
B) ggplot(mtcars,aes(factor(cyl),fill=factor(vs)))+geom_bar()
C) All of the above
D) None of the above

What is the output of the command – paste(1:3,c("x","y","z"),sep="") ? C


A) [1 2 3x y z]
B) [1:3x y z]
C) [1x 2y 3z]
D) None of the above
27
A
R has a rich library reserve for drawing some of the very high end graphs and plots
and many a times you want to save the graphs for presenting your findings to someone
else. Saving your plots to a PDF file is one such option. If you want to save a plot to a
PDF file, which of the following is a correct way of doing that?
A) Construct the plot on the screen device and then copy it to a PDF file with
dev.copy2pdf().
B) Construct the plot on the PNG device with png(), then copy it to a PDF with
dev.copy2pdf().
C) Open the PostScript device with postscript(), construct the plot, then close the
device with dev.off().
D) Open the screen device with quartz(), construct the plot, and then close the device
28 with dev.off().


Given $X_1=12,X_2=19,X_3=10,X_4=7$, then $\sum_{i=1}^4 X_i$ equals? B

a) 36
b) 48
c) 29
d) 41
29
The number of accidents in a city during 2010 is A
a) Discrete variable
b) Continuous variable
c) Qualitative variable
30 d) Constant
The mean of a distribution is 23, the median is 24, and the mode is 25.5. It is most likely D
that this distribution is:
a) Positively Skewed
b) Symmetrical
c) Asymptotic
31 d) Negatively Skewed
Data collected by NADRA to issue computerized identity cards (CICs) are C
a) Unofficial data
b) Qualitative data
c) Secondary data
d) Primary data
32 e) None of these
Sum of dots when two dice are rolled is A
a) A discrete variable
b) A continuous variable
c) A constant
33 d) A qualitative variable
A chance variation in an observational process is C
a) Dispersion/ Variability
b) Measurement error
c) Random error
34 d) Instrument error
If a distribution is abnormally tall and peaked, then it can be said that the distribution is: A
a) Leptokurtic
b) Pyrokurtic
c) Platykurtic
35 d) Mesokurtic
The mean of a distribution is 14 and the standard deviation is 5. What is the value of C
the coefficient of variation?

a) 60.4%
b) 48.3%
c) 35.7%
36 d) 27.8%
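
The coefficient of variation is the standard deviation expressed as a percentage of the mean. A minimal Python check of the figures in the question above (the variable names are illustrative only):

```python
# Coefficient of variation: standard deviation as a percentage of the mean.
mean = 14
std_dev = 5

cv_percent = std_dev / mean * 100
print(f"{cv_percent:.1f}%")  # 35.7%
```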
The first hand and unorganized form of data is called C
a) Secondary data
37 b) Organized data

c) Primary data
d) None of these
Questionnaire survey method is used to collect
a) Secondary data
b) Qualitative variable
c) Primary data
38 d) None of these
The data which have already been collected by someone are called C
a) Raw data
b) Array data
c) Secondary data
39 d) Fictitious data
The grouped data is also called C
a) Raw data
b) Primary data
c) Secondary data
40 d) Qualitative data
A constant variable can take values B
a) Zero
b) Fixed
c) Not fixed
41 d) Nothing
A parameter is a measure which is computed from A
a) Population data
b) Sample data
c) Test statistics
42 d) None of these
According to the empirical rule, approximately what percent of the data should lie B
within $\mu \pm \sigma$?
a) 75%
b) 68%
c) 99.7%
d) 90%
43 e) 95%
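
The empirical rule figures can be checked numerically. The sketch below assumes a normal distribution and uses Python's standard-library NormalDist to compute the share of values within 1, 2 and 3 standard deviations of the mean:

```python
from statistics import NormalDist

# Share of a normal distribution lying within k standard deviations of the mean.
standard_normal = NormalDist(mu=0, sigma=1)
shares = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} sigma: {share:.1%}")
# within 1 sigma: 68.3%
# within 2 sigma: 95.4%
# within 3 sigma: 99.7%
```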
Primary data and _____________ data are same C
a) Grouped
b) Secondary data
c) Ungrouped
44 d) None of these
Which one of the following measurement does not divide a set of observations into B
equal parts?
a) Quartiles
b) Standard Deviations
c) Percentiles
d) Deciles
45 e) Median
In descriptive statistics, we study B
a) The description of the decision-making process
46 b) The methods for organizing, displaying and describing data

c) How to describe the probability distribution


d) None of the above
47 Which of the following is not based on all the observations? E
a) Arithmetic Mean
b) Geometric Mean
c) Harmonic Mean
d) Weighted Mean
e) Mode
48 Which one is not a measure of dispersion? B
a) The Range
b) 50th Percentile
c) Inter-Quartile Range
d) Variance
49 When data are collected in a statistical study for only a portion or subset of all A
elements of interest we are using:
a) A sample
b) A Parameter
c) A Population
d) Both b and c
50 In inferential statistics, we study A
a) The methods to make decisions about the population based on sample results
b) How to make decisions about mean, median, or mode
c) How a sample is obtained from a population
d) None of the above



Making a forecast can be done with a regression by using what method?

(A) multiplying coefficients by expected values of Xs


(B) dividing coefficients by expected values of Ys
(C) dividing coefficients by expected values of Xs
(D) multiplying coefficients by expected values of Ys

Question 2 of 9
What type of variable can be used to capture fixed effects?

(A) Random variables


(B) continuous variables
(C) Dummy variables
(D) all of these answers


Question 3 of 9
What type of data should the Y variable be in a binary regression?

(A) Discrete
(B) Random
(C) Continuous
(D) none of these answers

Question 4 of 9
Fixed effects regressions help to deal with what problem?

(A) Statistical Insignificance


(B) Survey Bias
(C) Omitted variables bias
(D) Randomness in data

Question 5 of 9
Which statistic offers bounds on our estimate of the impact of an X
variable on the Y variable?

(A) T-statistic
(B) R-squared
(C) 95% confidence interval
(D) P-value
Question 6 of 9
What is one type of time series forecasting?

(A) Regressions
(B) The Delphi Method
(C) Exponential Smoothing
(D) Surveys


Question 7 of 9
What is the term for the estimate of the impact an X variable has on
the Y variable?

(A) Coefficient
(B) R-squared
(C) Standard Error
(D) P-value

Question 8 of 9
What is one type of causal forecasting?

(A) Multiple Regression


(B) Exponential Smoothing
(C) Surveys
(D) The Delphi Method


Question 9 of 9
What is one good source of free data?

(A) Thomson Reuters


(B) Friends
(C) FRED
(D) WARP

BUSINESS ANALYTICS MCQ QUESTION 1 TO 7

Visualizing cash flow and excel


Question 1 of 7
Which is the internal rate of return?
(A) the interest rate at which all cash flows have a net present value of zero
(B) the interest rate at which all cash flows have a positive net present value
Correct answer
(C) the discount rate at which all cash flows have a net present value of
zero
(D) the discount rate at which all cash flows have a negative net present
value

Question 2 of 7
Which key will create an absolute instead of a relative cell reference?

(A) Ctrl+A
(B) Esc
(C) F4
(D) F1

Question 3 of 7
What is the future value for a fully amortized loan?

(A) zero
(B) the value of the interest only
(C) the principal plus interest
(D) the outstanding principal

Question 4 of 7
Which Excel formula can take up to five arguments and then calculate
the future value of an investment?

(A) RECEIVED
(B) PRICEMAT
(C) FV
(D) FVSCHEDULE

Question 5 of 7
What is the formula for calculating the value of a perpetuity?

(A) payment/(discount rate – growth rate)


(B) (discount rate – growth rate)/payment
(C) payment/interest rate
(D) payment/(growth rate – discount rate)

Question 6 of 7
Since the GEOMEAN formula does not accept negative numbers, how
can you use it despite having some negative growth rates?

(A) Subtract 1 from the growth rate, then add 1 to the result after using the
GEOMEAN formula.
(B) Divide the growth rate by -1, then multiply the result by -1 after using
the GEOMEAN formula.
(C) Add 1 to the growth rate, then subtract 1 to the result after using
the GEOMEAN formula.
(D) Multiply the growth rate by -1 to make it positive, then divide the result
by -1 after using the GEOMEAN formula.
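
The add-one, subtract-one approach described in option (C) can be sketched outside Excel as well. The example below is an illustration in Python, with statistics.geometric_mean standing in for GEOMEAN and made-up growth rates:

```python
from statistics import geometric_mean

# Hypothetical yearly growth rates, one of them negative.
growth_rates = [0.10, -0.05, 0.20]

# Add 1 so every input is a positive growth factor.
growth_factors = [1 + rate for rate in growth_rates]

# Take the geometric mean of the factors, then subtract 1 again.
average_growth = geometric_mean(growth_factors) - 1
print(f"{average_growth:.2%}")
```

Compounding the average rate over three periods reproduces the combined growth of the three original rates, which is what makes the geometric mean the appropriate average here.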


Question 7 of 7
If you invested $10,000 with an annual compound interest rate of 5
percent, how much will it be worth after 10 years?

(A) 16289
(B) 11365
(C) 10761
(D) 15500
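
The future value under annual compounding is principal × (1 + rate)^years. A minimal Python check of the figures in Question 7 (variable names are illustrative):

```python
# Future value with annual compounding: principal * (1 + rate) ** years
principal = 10_000
rate = 0.05
years = 10

future_value = principal * (1 + rate) ** years
print(round(future_value))  # 16289
```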

QUESTION 1 TO 13

Question 1 of 13
James has a small company and is looking to get a loan from the bank.
How will the bank deduce the company’s cash flow?

(A) By using three years of profit and loss reports to create a statement of
cash flows.
(B) By using the income statement alone to create a statement of cash flows.
Attempted correct option
(C) By using the balance sheet and income statement to create a
statement of cash flows.
(D) By using the balance sheet alone to create a statement of cash flows.
Question 2 of 13
You have worked on your balance sheet to figure out how to finance
an expansion plan. Which of the following would be the most realistic
plug figure to use?

(A) loans payable


(B) paid-in capital
(C) liabilities
(D) investments

Question 3 of 13
Which of the following tells you whether all of your forecasts,
assumptions, marketing plans, and operating plans are internally
consistent, practical, and achievable?

(A) assets = liability + equity


(B) costs = assets + equity
(C) equity = liability + inventory
(D) liability = assets + equity

Question 4 of 13
Padere’s preliminary assumption in her income statement is that
there will be an overall increase in expenses next year. What is the next
logical step for Padere before presenting to her COO?

(A) Each of the forecasting assumptions must be researched,


supported, justified, and explained.
(B) No new steps are needed since the expected level of depreciation
expense is tied to the expected level of property, plant, and equipment.
(C) Verify the forecasted increase in gross profit percentage is not impacted
by changes in the competitive environment.
(D) No new steps are needed since the expected level of interest expense is
tied to the expected level of loans.

Question 5 of 13
Company A is significantly above the average value compared to other
companies in the industry. You predict that next year the company will
likely deliver to a number closer to the other companies. This is known
as _.

(A) deviate from the mean


(B) net income percentage
(C) gross profit percentage
(D) reversion to the mean

Question 6 of 13
What should you consider when forecasting the level of property,
plant, and equipment, and the associated amount of depreciation
expense?

(A) short- and long-term assets


(B) current utilization level and future expansion plans
(C) strategic plans and inventory
(D) sales and strategic plans

Question 7 of 13
In a forecasted income statement, the amount of a company’s _ is
determined by how much property, plant, and equipment the company
has.

(A) inventory
(B) depreciation expense
(C) assets
(D) accounts receivable

Question 8 of 13
Derrick is creating a constructed forecasted financial statement. Which
of the following would inhibit a more accurate statement?

(A) The amount of depreciation is driven by sales.


(B) An increased level of activity will create the need for more cash.
(C) An increase in accounts receivable and inventory will create a need for
more cash.

Question 9 of 13
As you put together a forecasting model, it is important to remember
that sales forecasting is _.

(A) an exact science


(B) not faulty or full of skepticism
(C) not an exact science
(D) a true means to the future

Question 10 of 13
Kiko is creating her sales forecast. Which of the following factors is the
starting point for Kiko to view a forecast of future sales?
(A) change in the competitive environment
(B) historical trend
(C) impact of current plans
(D) impact of a marketing plan

Question 11 of 13
The most important and recommended starting point of any financial
model exercise is the _

(A) sales forecast


(B) cost of sales
(C) levels of inventory
(D) depreciation expenses

Question 12 of 13
As an investor, you look to the DCF of a company before deciding to
invest. Which of the following questions would lead you to use a
higher interest rate in your analysis?

(A) How large will the future cash flows be?


(B) How risky are the cash flows?
(C) When will these cash flows occur?

Question 13 of 13
Analyzing the _ performance of a company allows financial statement
users to understand _ performance of a company.

(A) historical; past


(B) historical; future
(C) present; future
(D) present; past

01. In the Python statement x = a + 5 - b:

a and b are ________

a + 5 - b is ________

A. terms, a group
B. operators, a statement
C. operands, an expression
D. operands, an equation

Answer : C
Explanation: The objects that operators act on are called
operands. A combination of operators and operands is
called an expression, so option C is correct.

02. Which is the correct operator for power (x to the power y)?

A. X^y
B. X**y
C. X^^y
D. None of the mentioned

Answer : B
Explanation: In python, power operator is x**y i.e.
2**5=32.

03. What is the output of the following addition (+)


operator

a = [10, 20]

b=a

b += [30, 40]

print(a)
print(b)

A. [10, 20, 30, 40]


[10, 20, 30, 40]
B. [10, 20]
[10, 20, 30, 40]
C. [10, 20, 10, 20]
[10, 20, 30, 40]
D. [10, 20]
[30, 40]

Answer : A
Explanation: Because b and a refer to the same list
object, the augmented assignment operator += extends
that list in place, so the change is visible through both
a and b.
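
The difference between in-place extension and rebinding can be seen by comparing += with a plain + on a second pair of names. This is a minimal sketch; the names c and d are added here purely for illustration:

```python
a = [10, 20]
b = a              # b is a second name for the same list object
b += [30, 40]      # extends the shared list in place
print(a is b, a)   # True [10, 20, 30, 40]

c = [10, 20]
d = c
d = d + [30, 40]   # builds a new list and rebinds d only
print(c, d)        # [10, 20] [10, 20, 30, 40]
```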

04. Which function overloads the >> operator?

A. more()
B. gt()
C. ge()
D. None of the above

Answer : D
Explanation: The __rshift__() method overloads the >> operator.
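
In Python, >> is overloaded by defining the special method __rshift__ on a class; operator.rshift is the function form of the same operator. The Pipe class below is a hypothetical example written for this note, not part of any library:

```python
import operator


class Pipe:
    """Hypothetical wrapper that uses >> to pass its value to a function."""

    def __init__(self, value):
        self.value = value

    def __rshift__(self, func):
        # Called for `pipe >> func`; returns a new Pipe holding func(value).
        return Pipe(func(self.value))


result = Pipe(3) >> (lambda v: v + 1) >> (lambda v: v * 10)
print(result.value)             # 40

# For plain integers, >> is the bitwise right shift.
print(operator.rshift(16, 2))   # 4
```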

05. What is the value of the expression 100 / 25?


A. 4
B. 4.0
C. 0
D. 25

Answer : B
Explanation: The result of standard division is always
float. The value of 100 // 25 (integer division) is 4.

06. Which one of these is floor division?

A. //
B. /
C. %
D. None of the above

Answer : A
Explanation: The // operator performs floor division: it
divides and rounds the result down to the nearest whole
number. For example, 5 / 2 = 2.5 (true division), while
5 // 2 = 2 (floor division). To get 2.5 as the answer,
use the / operator.
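
A short Python 3 comparison of true division (/) and floor division (//); the values are chosen only for illustration:

```python
true_quotient = 5 / 2      # true division always returns a float
floor_quotient = 5 // 2    # floor division rounds down to a whole number

print(true_quotient)       # 2.5
print(floor_quotient)      # 2
print(100 / 25)            # 4.0
print(100 // 25)           # 4
print(5.0 // 2)            # 2.0 (floor division of a float stays a float)
```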

07. What is the output of the following assignment


operator
a = 10

b = a -= 2

print(b)

A. 8
B. 10
C. Syntax Error
D. No error but no output too
View Answer

Answer : C
Explanation: b = a -= 2 expression is Invalid
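
Augmented assignment such as a -= 2 is a statement in Python, not an expression, so it cannot appear on the right-hand side of another assignment. The sketch below uses the built-in compile() to show the syntax error without stopping the script, then gives a valid two-step form:

```python
source = "a = 10\nb = a -= 2\nprint(b)"

try:
    compile(source, "<example>", "exec")
    compiled = True
except SyntaxError:
    compiled = False

print(compiled)  # False

# Valid alternative: perform the augmented assignment first, then copy.
a = 10
a -= 2
b = a
print(b)  # 8
```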

08. Which operator is overloaded by the __or__() method?

A. ||
B. |
C. //
D. /

Answer : B
Explanation: The __or__() method overloads the bitwise OR
operator “|”.

09. Should you use the == operator to determine


whether objects of type float are equal?

A. Nope, not a good idea.


B. Sure! Go for it.

Answer : A
Explanation: Internal representation of float objects is
not precise, so they can’t be relied on to equal exactly
what you think they will:
>>> 1.1 + 2.2 == 3.3
False

You should instead compute whether the numbers are close
enough to one another to satisfy a specified tolerance:
>>> tolerance = 0.00001
>>> abs((1.1 + 2.2) - 3.3) < tolerance
True
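
The standard library also offers math.isclose() for this kind of comparison. A minimal sketch, using the function's default relative tolerance:

```python
import math

total = 1.1 + 2.2

print(total == 3.3)              # False: binary floats are not exact
print(math.isclose(total, 3.3))  # True: equal within the default tolerance

# An explicit absolute tolerance, as in the manual check above.
print(abs(total - 3.3) < 0.00001)  # True
```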

10. What is the order of precedence in python?


i) Parentheses
ii) Exponential
iii) Multiplication
iv) Division
v) Addition
vi) Subtraction

A. ii,i,iii,iv,v,vi
B. ii,i,iv,iii,v,vi
C. i,ii,iii,iv,vi,v
D. i,ii,iii,iv,v,vi
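Answer : D
Explanation: Parentheses are evaluated first, then
exponentiation. Multiplication and division share the
next level, followed by addition and subtraction, which
also share a level (PEMDAS).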

11. What is the output of the following code

x =6

y=2

print(x ** y)

print(x // y)

A. 66
0
B. 36
0
C. 66
3
D. 36
3
Answer : D
Explanation: 6 ** 2 is 36 and 6 // 2 is 3.

12. What is the output of the following program :

i =0

while i < 3:

print i

i++
print i+1

A. 021 324
B. 012 345
C. Error
D. 102 435

Answer : C
Explanation: Python Programming language does not
support ‘++’ operator.

13. Suppose the following statements are executed:

a = 100

b = 200

What is the value of the expression a and b?

A. True
B. 0
C. False
D. 200
E. 100

Answer : D
Explanation: None
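
In Python, and and or return one of their operands rather than a strict True or False: and returns the first falsy operand, or the last operand if all are truthy, while or returns the first truthy operand. A minimal illustration:

```python
a = 100
b = 200

print(a and b)   # 200: a is truthy, so the result is b
print(a or b)    # 100: a is truthy, so or stops there
print(0 and b)   # 0: the first falsy operand is returned
print(0 or b)    # 200: the first truthy operand is returned
```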
14. Operators with the same precedence are evaluated
in which manner?

A. Left to Right
B. Right to Left
C. Can’t say
D. None of the mentioned

Answer : A
Explanation: None
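
Most Python binary operators of equal precedence group from left to right; the exponentiation operator ** is the notable exception and groups from right to left. A short check, with values chosen only for illustration:

```python
left_to_right = 10 - 4 + 2    # (10 - 4) + 2
chained_division = 100 / 10 / 5   # (100 / 10) / 5
right_to_left = 2 ** 3 ** 2   # 2 ** (3 ** 2)

print(left_to_right)      # 8
print(chained_division)   # 2.0
print(right_to_left)      # 512
```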

15. Which of the following operators has the highest


precedence?

A. not
B. &
C. *
D. +

Answer : C
Explanation: None

16. Given a function that does not return any value,


what value is shown when executed at the shell?

A. int
B. bool
C. void
D. None

Answer : D
Explanation: Python explicitly defines the None object
that is returned if no value is specified.

17. The function sqrt() from the math module computes


the square root of a number. Will the highlighted line of
code raise an exception?

x = -100

from math import sqrt

x > 0 and sqrt(x)

A. Yes
B. No
C. void
D. None

Answer : B
Explanation: In the highlighted line, x > 0 is False. The
expression is already known to be falsy at that point.
Due to short-circuit evaluation, sqrt(x) (which would
raise an exception) is not evaluated.
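
The short-circuit behaviour can be confirmed by contrasting the guarded expression with a direct call. A minimal sketch:

```python
from math import sqrt

x = -100

# The left operand is False, so sqrt(x) is never called.
result = x > 0 and sqrt(x)
print(result)  # False

# Calling sqrt directly on a negative number does raise.
try:
    sqrt(x)
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```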

18. Which one of the following has the same precedence


level?

A. Addition and Subtraction


B. Multiplication, Division and Addition
C. Multiplication, Division, Addition and Subtraction
D. Addition and Multiplication

Answer : A
Explanation: “Addition and Subtraction” are at the
same precedence level. Similarly, “Multiplication and
Division” are at the same precedence level. However,
Multiplication and Division operators are at a higher
precedence level than Addition and Subtraction
operators.

19. What is the output of the following code

print(bool(0), bool(3.14159), bool(-3), bool(1.0+1j))

A. True True False True


B. False True True True
C. True True False True
D. False True False True

Answer : B
Explanation: If we pass a zero value to the bool()
constructor, it will treat it as false. Any non-zero value
is true.
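
The same rule covers every numeric type, and empty containers are falsy as well. A quick check:

```python
numeric = (bool(0), bool(3.14159), bool(-3), bool(1.0 + 1j))
print(numeric)   # (False, True, True, True)

# Zero of any numeric type and empty containers are all falsy.
falsy = (bool(0.0), bool(0j), bool(""), bool([]))
print(falsy)     # (False, False, False, False)
```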

20. What is the output of the expression print(-18 // 4)


A. -4
B. -5
C. 4
D. 5

Answer : B
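
Floor division rounds toward negative infinity, not toward zero, which is why -18 // 4 gives -5 rather than -4. A minimal illustration:

```python
import math

quotient = -18 // 4
print(quotient)             # -5: -4.5 rounded down
print(int(-18 / 4))         # -4: int() truncates toward zero instead
print(math.floor(-18 / 4))  # -5: same as floor division
print(divmod(-18, 4))       # (-5, 2): since -5 * 4 + 2 == -18
```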

1.
Which of the following operator takes only integer operands?

A. +

B. *

C. /

D. %

E. None of these


Answer: Option D
Solution:
The modulus operator % requires both of its operands to be of integer type.
2.
In an expression involving || operator, evaluation
I. Will be stopped if one of its components evaluates to false
II. Will be stopped if one of its components evaluates to true
III. Takes place from right to left
IV. Takes place from left to right

A. I and II

B. I and III

C. II and III

D. II and IV

E. III and IV


Answer: Option D
3.
Determine output:
void main()
{
int i=0, j=1, k=2, m;
m = i++ || j++ || k++;
printf("%d %d %d %d", m, i, j, k);}

A. 1123

B. 1122
C. 0122

D. 0123

E. None of these


Answer: Option B
Solution:
In an expression involving || operator, evaluation takes place from left to right and will be stopped
if one of its components evaluates to true(a non zero value).

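So in the given expression m = i++ || j++ || k++, i++ yields 0 (false) and
increments i to 1. Evaluation then stops at j++, which yields the nonzero value 1
and increments j to 2. The || expression as a whole evaluates to 1, and that
result is assigned to m.
Therefore m = 1, i = 1, j = 2 and k = 2 (k++ is never evaluated,
so its value remains 2).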
4.
Determine output:
void main()
{
int c = - -2;
printf("c=%d", c); }

A. 1

B. -2

C. 2
D. Error


Answer: Option C
Solution:
Here unary minus (or negation) operator is used twice. Same maths rules applies, ie. minus *
minus = plus.
Note: However you cannot give like --2. Because -- operator can only be applied to variables as a
decrement operator (eg., i--). 2 is a constant and not a variable.

5.
Determine output:
void main()
{
int i=10;
i = !i>14;
printf("i=%d", i); }

A. 10

B. 14

C. 0

D. 1

E. None of these


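Answer: Option C
Solution:
The ! operator has higher precedence than >, so the expression is evaluated as
(!i) > 14. Since i is 10, !i is 0, and 0 > 14 is false, so i is assigned 0.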
6.
In C programming language, which of the following type of
operators have the highest precedence

A. Relational operators

B. Equality operators

C. Logical operators

D. Arithmetic operators


Answer: Option D
7.
What will be the output of the following program?
void main()
{
int a, b, c, d;
a = 3;
b = 5;
c = a, b;
d = (a, b);
printf("c=%d d=%d", c, d);}

A. c=3 d=3

B. c=3 d=5

C. c=5 d=3

D. c=5 d=5


Answer: Option B
Solution:
The comma operator evaluates both of its operands and produces the value of the second. It also
has lower precedence than assignment. Hence c = a, b is equivalent to c = a, while d = (a, b) is
equivalent to d = b.

8.
Which of the following comments about the ++ operator are
correct?

A. It is a unary operator

B. The operand can come before or after the operator

C. It cannot be applied to an expression


D. It associates from the right

E. All of the above


Answer: Option E
9.
What will be the output of this program on an implementation
where int occupies 2 bytes?
#include <stdio.h>
void main()
{
int i = 3;
int j;
j = sizeof(++i + ++i);
printf("i=%d j=%d", i, j);}

A. i=4 j=2

B. i=3 j=2

C. i=5 j=2

D. the behavior is undefined



Answer: Option B
Solution:
Evaluating ++i + ++i would produce undefined behavior, but the operand of sizeof is not evaluated,
so i remains 3 throughout the program. The type of the expression (int) is deduced at compile time,
and the size of this type (2) is assigned to j.

10.
Which operator has the lowest priority?

A. ++

B. %

C. +

D. ||

E. &&


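Answer: Option D
Solution:
Of the operators listed, logical OR (||) has the lowest precedence. Logical AND (&&) binds more
tightly than ||, and the arithmetic and increment operators bind more tightly still.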
11.
What will be the output?
void main(){
int a=10, b=20;
char x=1, y=0;
if(a,b,x,y){
printf("EXAM");
}
}

A. XAM is printed

B. exam is printed

C. Compiler Error

D. Nothing is printed


Answer: Option D
12.
What value will z have in the sample code given below?
int z, x=5, y= -10, a=4, b=2;
z = x++ - --y*b/a;

A. 5

B. 6

C. 9

D. 10
E. 11

Answer: Option D
Solution:
According to the C operator precedence table, the operators are applied as follows:
1. x++ (postfix increment) yields the current value of x, 5; x becomes 6 afterwards.
2. --y (prefix decrement) makes y equal to -11 and yields -11.
3. * and / have the same priority, so they are applied according to their associativity, i.e. left to
right: * (multiplication) first, then / (division).
4. - (subtraction)

So the complete expression is

5 - (-11)*2/4 = 5 - (-22)/4 = 5 - (-5) = 5 + 5 = 10.

The division -22/4 yields -5 because integer division truncates toward zero.
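The same evaluation can be written out as code. This is a sketch with hypothetical function names: one function evaluates the expression directly, the other spells out the intermediate steps so the two can be compared.

```c
/* Evaluates the quiz expression directly. */
int eval_z_direct(void)
{
    int x = 5, y = -10, a = 4, b = 2;
    return x++ - --y * b / a;
}

/* Evaluates the same expression one step at a time. */
int eval_z_stepwise(void)
{
    int x = 5, y = -10, a = 4, b = 2;
    int left = x;                 /* x++ yields the old value, 5 */
    x = x + 1;                    /* x becomes 6 afterwards */
    (void)x;
    y = y - 1;                    /* --y makes y equal to -11 */
    int product = y * b;          /* -22 */
    int quotient = product / a;   /* -5: truncation toward zero */
    return left - quotient;       /* 5 - (-5) */
}
```

Both functions return 10.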

13.
What is the output of the following statements?
int i = 0;
printf("%d %d", i, i++);

A. 01

B. 10

C. 00

D. 11

E. None of these

Answer: Option B
Solution:
The order in which function arguments are evaluated is unspecified in C, and reading i while i++
modifies it without an intervening sequence point is undefined behavior, so no output is guaranteed.
The expected answer assumes a compiler that evaluates the arguments from right to left: i++ is
evaluated first and yields 0 (post-increment), after which i becomes 1. The first argument i is then
read as 1, so the output is 1 0.

14.
What is the output of the following statements?
int b=15, c=5, d=8, e=8, a;
a = b>c ? c>d ? 12 : d>e ? 13 : 14 : 15;
printf("%d", a);

A. 13

B. 14

C. 15

D. 12

E. Garbage Value

Answer: Option B
Solution:
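The expression
a = b>c ? c>d ? 12 : d>e ? 13 : 14 : 15;
can be rewritten as

if (b > c) {
    if (c > d)
        a = 12;
    else if (d > e)
        a = 13;
    else
        a = 14;
} else {
    a = 15;
}

With b=15, c=5, d=8 and e=8, b>c is true, c>d is false and d>e is false, so a = 14.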

15.
What will be the output of the following code fragment?
void main()
{
    printf("%x", -1<<4);
}

A. fff0

B. fff1

C. fff2
D. fff3

E. fff4

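Answer: Option A
Solution:
On an implementation with 16-bit int and two's complement representation, -1 is stored as 0xffff;
shifting it left by 4 bits gives 0xfff0. Strictly speaking, left-shifting a negative value is
undefined behavior in standard C, so this result is what typical compilers produce rather than a
guarantee.
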
1. Which of the following is not a compound assignment operator?

a) /=
b) +=
c) %=
d) ==

ANSWER: d) ==

2. What will be the output of the following code snippet?

Y = 5;
if (! Y > 10)
X = Y + 3;
else
X = Y + 10;

printf(" X = %d Y = %d", X, Y);

a) The program will print X = 15 Y = 5
b) The program will print X = 15 Y = 0
c) The program will print X = 8 Y = 5
d) The program will print X = 3 Y = 0

ANSWER: a) The program will print X = 15 Y = 5
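The result hinges on the unary ! operator binding more tightly than >. A minimal sketch, with hypothetical function names, contrasts the two readings; compilers commonly warn about the first form because it is rarely what the author meant.

```c
/* `!y > 10` is parsed as `(!y) > 10`. Since !y is 0 or 1, this is always 0. */
int not_then_compare(int y)
{
    return !y > 10;
}

/* The reading a casual reader might expect: negate the comparison. */
int compare_then_not(int y)
{
    return !(y > 10);
}
```

For y = 5 the first function returns 0, so the else branch runs and X becomes 15; the second function would return 1.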

3. Which of the following statements is correct about the code snippet given below?

num = 5;
printf("%d", ++num++);

a) The code will print 5
b) The code will print 6
c) The code will result in an "L-value required" error
d) The code will result in an "R-value required" error

ANSWER: c) The code will result in an "L-value required" error

4. Which of the following statements is correct about the code snippet given below?

#include <stdio.h>

int main()
{
    float z = 12.35, c = 10;
    if (++z % 10 - z)
        c += z;
    else
        c -= z;
    printf("%f %f", z, c);
    return 0;
}

a) The program will result in a compile-time error
b) The program will print 12.35 22.35
c) The program will print 13.35 22.35
d) The program will print 1.35 11.35

ANSWER: a) The program will result in a compile-time error

5. Which of the following statements is correct about the code snippet given below?

#include <stdio.h>

int main()
{
    int n = 12, k;
    printf("%d", (k = sizeof(n + 12.0))++);
    return 0;
}

a) The code will print 17
b) The code will print 5
c) The code will result in a compile-time error
d) The code will print 4

ANSWER: c) The code will result in a compile-time error

6. Which is executed more quickly?

a) ++p
b) p++
c) Both
d) p+1

ANSWER: c) Both

7. What is the value of X in the sample code given below?

double X; X = ( 2 + 3) * 2 + 3;

a) 10
b) 13
c) 25
d) 28

ANSWER: b) 13

8. What value will be stored in z if the following code is executed?

main()
{
    int x = 5, y = -10, z;
    int a = 4, b = 2;
    z = x++ + ++y * b/a;
}

a) -2
b) 0
c) 1
d) 2

ANSWER: c) 1

9. What is the output of the following program?

#include <stdio.h>

int main()
{
    int max = 123, min = 10, *maxptr = &max, *minptr = &min;
    int **nptr = &minptr, **mptr = &maxptr;
    *maxptr = ++**mptr % **nptr;
    max -= (*minptr - **nptr && *maxptr || *minptr);
    printf(" %d %d", ++**mptr, *minptr);
    return 0;
}

a) 4 10
b) 3 11
c) 3 10
d) 4 11

ANSWER: a) 4 10

10. What will be the output of the following program?

#include <stdio.h>

int main()
{
    int num = 0, z = 3;
    if (!(num <= 0) || ++z)
        printf("%d %d ", ++num + z++, ++z);
    else
        printf("%d %d", --num + z--, --z);
    return 0;
}

a) -2 1
b) 6 5
c) 4 5
d) 5 5

ANSWER: b) 6 5

11. Which of the following statements is correct about the code snippet
given below?

#include <stdio.h>

int main()
{
    int a = 10, b = 2, c;
    a = !(c = c == c) && ++b;
    c += (a + b--);
    printf(" %d %d %d", b, c, a);
    return 0;
}

a) The program will print the output 1 3 0
b) The program will print the output 0 1 3
c) The program will result in an expression syntax error
d) The program will print the output 0 3 1

ANSWER: a) The program will print the output 1 3 0

12. Which of the following is the better approach to do the operation i = i * 16?

a) Multiply i by 16 and keep it
b) Shift left by 4 bits
c) Add i 16 times
d) Shift right by 4 bits

ANSWER: b) Shift left by 4 bits
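For unsigned operands the shift and the multiplication give the same result, which the sketch below checks with two hypothetical helpers. Modern optimizing compilers typically emit the same instructions for both forms, so in practice the choice is usually a matter of readability rather than speed.

```c
/* Multiply by 16 using a left shift by 4 bits (16 == 1 << 4). */
unsigned int times16_shift(unsigned int i)
{
    return i << 4;
}

/* Multiply by 16 using ordinary multiplication. */
unsigned int times16_mul(unsigned int i)
{
    return i * 16u;
}
```

For example, both helpers return 48 for an input of 3.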


13. For the following statements, find the final values of p and q.

int p = 0, q = 1;
p = q++;
p = ++q;
p = q--;
p = --q;

The values of p and q are

a) 1 1
b) 0 0
c) 3 2
d) 1 2

ANSWER: a) 1 1
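Tracing the four assignments one at a time makes the final values easy to verify. The sketch below uses a hypothetical struct and function name to return both variables at once.

```c
/* Hypothetical pair type used only to return both values. */
struct pq {
    int p;
    int q;
};

struct pq trace_pq(void)
{
    int p = 0, q = 1;
    p = q++;    /* p = 1, then q = 2 */
    p = ++q;    /* q = 3, then p = 3 */
    p = q--;    /* p = 3, then q = 2 */
    p = --q;    /* q = 1, then p = 1 */
    struct pq result = { p, q };
    return result;
}
```

The returned pair is p = 1, q = 1.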

14. What is the value of the following expression?

i = 1;
i = (i <<= 1 % 2);

a) 2
b) 1
c) 0
d) Syntax error

ANSWER: a) 2

15. What is the correct and fully portable way to obtain the most
significant byte of an unsigned integer x?

a) x & 0xFF00
b) x >> 24
c) x >> (CHAR_BIT * (sizeof(int) - 3))
d) x >> (CHAR_BIT * (sizeof(int) - 1))

ANSWER: d) x >> (CHAR_BIT * (sizeof(int) - 1))
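A minimal sketch of the portable form follows. The function name is hypothetical, and the example assumes unsigned int has no padding bits, which holds on mainstream platforms.

```c
#include <limits.h>

/* Shift right by every byte except the top one, so only the
   most significant byte of x remains. */
unsigned int most_significant_byte(unsigned int x)
{
    return x >> (CHAR_BIT * (sizeof(unsigned int) - 1));
}
```

Because the shift count is derived from CHAR_BIT and sizeof, the same source works whether unsigned int is 2, 4 or 8 bytes wide, unlike the hard-coded x >> 24.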


16. Expression x % y is equivalent to____?

a) (x - (x/y))
b) (x - (x/y) * y)
c) (y - (x/y))
d) (y - (x/y) * y)

ANSWER: b) (x - (x/y) * y)
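The identity can be checked directly for integer operands. The helper name below is hypothetical, and y is assumed to be nonzero.

```c
/* Remainder computed from the identity x % y == x - (x / y) * y.
   Assumes y != 0; integer division truncates toward zero. */
int remainder_by_identity(int x, int y)
{
    return x - (x / y) * y;
}
```

For x = 17 and y = 5 this gives 17 - 3 * 5 = 2, the same as 17 % 5. The identity also holds for negative x, since both / and % follow truncation toward zero.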

17. What is the value of x after executing the following statement?

int x = 011 | 0x10;

a) 13
b) 19
c) 25
d) 27

ANSWER: c) 25
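The answer follows from the two literal prefixes: a leading 0 marks an octal constant and a leading 0x marks a hexadecimal one. A short sketch with a hypothetical function name:

```c
/* 011 is octal (decimal 9); 0x10 is hexadecimal (decimal 16). */
int octal_or_hex(void)
{
    return 011 | 0x10;   /* 001001 | 010000 = 011001 in binary */
}
```

The two constants share no set bits, so the bitwise OR equals their sum, 9 + 16 = 25.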

18. What is the value of the following expression?

i = 1;
i << 1 % 2;

a) 2
b) -2
c) 1
d) 0

ANSWER: a) 2

19. p++ executes faster than p + 1 since

a) p uses registers
b) Single machine instruction required for p++
c) Option a and b
d) None

ANSWER: b) Single machine instruction required for p++

MCQ .1
A process by which we estimate the value of dependent variable on the basis of
one or more independent
variables is called:
(a) Correlation (b) Regression (c) Residual (d) Slope
MCQ .2
The method of least squares dictates that we choose a regression line where the
sum of the squares of
deviations of the points from the line is:
(a) Maximum (b) Minimum (c) Zero (d) Positive
MCQ .3
A relationship where the flow of the data points is best represented by a curve is
called:
(a) Linear relationship (b) Nonlinear relationship (c) Linear positive (d) Linear
negative
MCQ .4
All data points falling along a straight line is called:
(a) Linear relationship (b) Non linear relationship (c) Residual (d) Scatter
diagram
MCQ .5
The value we would predict for the dependent variable when the independent
variables are all equal to zero
is called:
(a) Slope (b) Sum of residual (c) Intercept (d) Difficult to tell
MCQ .6
The predicted rate of response of the dependent variable to changes in the
independent variable is called:
(a) Slope (b) Intercept (c) Error (d) Regression equation
MCQ .7
The slope of the regression line of Y on X is also called the:
(a) Correlation coefficient of X on Y (b) Correlation coefficient of Y on X
(c) Regression coefficient of X on Y (d) Regression coefficient of Y on X
MCQ .8
In simple linear regression, the number of unknown constants is:
(a) One (b) Two (c) Three (d) Four
MCQ .9
In a simple regression equation, the number of variables involved is:
(a) 0 (b) 1 (c) 2 (d) 3
MCQ .10
If the value of any regression coefficient is zero, then two variables are:
(a) Qualitative (b) Correlation (c) Dependent (d) Independent
MCQ .11
The straight line graph of the linear equation Y = a+ bX, slope will be upward if:
(a) b = 0 (b) b < 0 (c) b > 0 (d) b ≠ 0
MCQ .12
The straight line graph of the linear equation Y = a + bX, slope will be downward if:
(a) b > 0 (b) b < 0 (c) b = 0 (d) b ≠ 0
MCQ .13
The straight line graph of the linear equation Y = a + bX, slope is horizontal if:
(a) b = 0 (b) b ≠ 0 (c) b = 1 (d) a = b
MCQ .14
If regression line of = 5, then value of regression coefficient of Y on X is:
(a) 0 (b) 0.5 (c) 1 (d) 5
MCQ .15
If Y = 2 - 0.2X, then the value of Y intercept is equal to:
(a) -0.2 (b) 2 (c) 0.2X (d) All of the above
MCQ .16
If one regression coefficient is greater than one, then the other will be:
(a) More than one (b) Equal to one (c) Less than one (d) Equal to minus one
MCQ .17
To determine the height of a person when his weight is given is:
(a) Correlation problem (b) Association problem (c) Regression problem (d)
Qualitative problem
MCQ .18
The dependent variable is also called:
(a) Regression (b) Regressand (c) Continuous variable (d) Independent
MCQ .19
The dependent variable is also called:
(a) Regressand variable (b) Predictand variable (c) Explained variable (d) All of
these
MCQ .20
The independent variable is also called:
(a) Regressor (b) Regressand (c) Predictand (d) Estimated
MCQ .21
In the regression equation Y = a+bX, the Y is called:
(a) Independent variable (b) Dependent variable (c) Continuous variable (d)
None of the above
MCQ .22
In the regression equation X = a + bY, the X is called:
(a) Independent variable (b) Dependent variable (c) Qualitative variable (d) None
of the above
MCQ .23
In the regression equation Y = a +bX, a is called:
(a) X-intercept (b) Y-intercept (c) Dependent variable (d) None of the above
MCQ .24
The regression equation always passes through:
(a) (X, Y) (b) (a, b) (c) ( , ) (d) ( , Y)
MCQ .25
The independent variable in a regression line is:
(a) Non-random variable (b) Random variable (c) Qualitative variable (d) None of
the above
MCQ .26
The graph showing the paired points of (Xi, Yi) is called:
(a) Scatter diagram (b) Histogram (c) Historigram (d) Pie diagram
MCQ .27
The graph represents the relationship that is:
(a) Linear (b) Non linear (c) Curvilinear (d) No relation
MCQ .28
The graph represents the relationship that is:
(a) Linear positive (b) Linear negative (c) Non-linear (d) Curvilinear
MCQ .29
When regression line passes through the origin, then:
(a) Intercept is zero (b) Regression coefficient is zero (c) Correlation is zero (d)
Association is zero
MCQ .30
When bxy is positive, then byx will be:
(a) Negative (b) Positive (c) Zero (d) One
MCQ .31
The correlation coefficient is the_______of two regression coefficients:
(a) Geometric mean (b) Arithmetic mean (c) Harmonic mean (d) Median
MCQ .32
When two regression coefficients bear same algebraic signs, then correlation
coefficient is:
(a) Positive (b) Negative (c) According to two signs (d) Zero
MCQ .33
It is possible that two regression coefficients have:
(a) Opposite signs (b) Same signs (c) No sign (d) Difficult to tell
MCQ .34
Regression coefficient is independent of:
(a) Units of measurement (b) Scale and origin (c) Both (a) and (b) (d) None of
them
MCQ .35
In the regression line Y = a+ bX:
(a) (b) (c) (d)
MCQ .36
In the regression line Y = a + bX, the following is always true:
(a) (b) (c) (d)
MCQ .37
The purpose of simple linear regression analysis is to:
(a) Predict one variable from another variable
(b) Replace points on a scatter diagram by a straight-line
(c) Measure the degree to which two variables are linearly associated
(d) Obtain the expected value of the independent random variable for a given
value of the dependent variable
MCQ .38
The sum of the difference between the actual values of Y and its values obtained
from the fitted
regression line is always:
(a) Zero (b) Positive (c) Negative (d) Minimum
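Several of the properties asked about above (the slope and intercept of Y = a + bX, and the sum of residuals of the fitted line) can be checked numerically. The sketch below is one straightforward least-squares fit; the struct and function names are hypothetical, and it assumes at least two points with x values that are not all equal.

```c
#include <stddef.h>

/* Fitted line Y = a + bX. */
struct line {
    double a;   /* intercept */
    double b;   /* slope, i.e. regression coefficient of Y on X */
};

/* Least-squares fit: b = Sxy / Sxx, a = mean(y) - b * mean(x). */
struct line fit_least_squares(const double *x, const double *y, size_t n)
{
    double sum_x = 0.0, sum_y = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum_x += x[i];
        sum_y += y[i];
    }
    double mean_x = sum_x / (double)n;
    double mean_y = sum_y / (double)n;

    double sxy = 0.0, sxx = 0.0;
    for (size_t i = 0; i < n; i++) {
        sxy += (x[i] - mean_x) * (y[i] - mean_y);
        sxx += (x[i] - mean_x) * (x[i] - mean_x);
    }

    struct line fit;
    fit.b = sxy / sxx;
    fit.a = mean_y - fit.b * mean_x;
    return fit;
}

/* Sum of (actual Y - fitted Y) over all points. */
double residual_sum(const double *x, const double *y, size_t n, struct line fit)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += y[i] - (fit.a + fit.b * x[i]);
    return sum;
}
```

For the illustrative points (1,2), (2,4), (3,5), (4,4), (5,5), the fit gives b = 0.6 and a = 2.2, and the residuals sum to zero up to rounding error. Because a is defined as mean(y) - b * mean(x), the fitted line passes through the point of means.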
MCQ .39
If all the actual and estimated values of Y are same on the regression line, the
sum of squares of
error will be:
(a) Zero (b) Minimum (c) Maximum (d) Unknown
MCQ .40
(a) Residual (b) Difference between independent and dependent variables
(c) Difference between slope and intercept (d) Sum of residual
MCQ .41
A measure of the strength of the linear relationship that exists between two
variables is called:
(a) Slope (b) Intercept (c) Correlation coefficient (d) Regression equation
MCQ .42
When the ratio of variations in the related variables is constant, it is called:
(a) Linear correlation (b) Nonlinear correlation (c) Positive correlation (d)
Negative correlation
MCQ .43
If both variables X and Y increase or decrease simultaneously, then the coefficient
of correlation will be:
(a) Positive (b) Negative (c) Zero (d) One
MCQ .44
If the points on the scatter diagram indicate that as one variable increases the
other variable tends to
decrease the value of r will be:
(a) Perfect positive (b) Perfect negative (c) Negative (d) Zero
MCQ .45
If the points on the scatter diagram show no tendency either to increase together
or decrease together
the value of r will be close to:
(a) -1 (b) +1 (c) 0.5 (d) 0
MCQ .46
If one item is fixed and unchangeable and the other item varies, the correlation
coefficient will be:
(a) Positive (b) Negative (c) Zero (d) Undecided
MCQ .47
In scatter diagram, if most of the points lie in the first and third quadrants, then
coefficient of
correlation is:
(a) Negative (b) Positive (c) Zero (d) All of the above
MCQ .48
If the two series move in reverse directions and the variations in their values are
always
proportionate, it is said to be:
(a) Negative correlation (b) Positive correlation
(c) Perfect negative correlation (d) Perfect positive correlation
MCQ .49
If both the series move in the same direction and the variations are in a fixed
proportion, correlation
between them is said to be:
(a) Perfect correlation (b) Linear correlation
(c) Nonlinear correlation (d) Perfect positive correlation
MCQ .50
The value of the coefficient of correlation r lies between:
(a) 0 and 1 (b) -1 and 0 (c) -1 and +1 (d) -0.5 and +0.5
MCQ .51
If X is measured in hours and Y is measured in minutes, then correlation
coefficient has the unit:
(a) Hours (b) Minutes (c) Both (a) and (b) (d) No unit
MCQ .52
The range of regression coefficient is:
(a) -1 to +1 (b) 0 to 1 (c) -∞ to +∞ (d) 0 to ∞
MCQ .53
The signs of regression coefficients and correlation coefficient are always:
(a) Different (b) Same (c) Positive (d) Negative
MCQ .54
The arithmetic mean of the two regression coefficients is greater than or equal to:
(a) -1 (b) +1 (c) 0 (d) r
MCQ .55
In simple linear regression model Y = α + βX + ε where α and β are called:
(a) Estimates (b) Parameters (c) Random errors (d) Variables
MCQ .56
Negative regression coefficient indicates that the movements of the variables are
in:
(a) Same direction (b) Opposite direction (c) Both (a) and (b) (d) Difficult to tell
MCQ .57
Positive regression coefficient indicates that the movements of the variables are in:
(a) Same direction (b) Opposite direction (c) Upward direction (d) Downward
direction
MCQ .58
If the value of regression coefficient is zero, then the two variables are called:
(a) Independent (b) Dependent (c) Both (a) and (b) (d) Difficult to tell
MCQ .59
The term regression was used by:
(a) Newton (b) Pearson (c) Spearman (d) Galton
MCQ .60
In the regression equation Y = a + bX, b is called:
(a) Slope (b) Regression coefficient (c) Intercept (d) Both (a) and (b)
MCQ .61
When the two regression lines are parallel to each other, then their slopes are:
(a) Zero (b) Different (c) Same (d) Positive
MCQ .62
The measure of change in dependent variable corresponding to a unit change in
independent
variable is called:
(a) Slope (b) Regression coefficient (c) Both (a) and (b) (d) Neither (a) nor (b)
MCQ .63
In correlation problem both variables are:
(a) Equal (b) Unknown (c) Fixed (d) Random
MCQ .64
In the regression equation Y = a + bX, where a and b are called:
(a) Constants (b) Estimates (c) Parameters (d) Both (a) and (b)
MCQ .65
If byx = bxy = 1 and Sx = Sy, then r will be:
(a) 0 (b) -1 (c) 1 (d) Difficult to calculate
MCQ .66
The correlation coefficient between X and -X is:
(a) 0 (b) 0.5 (c) 1 (d) -1
MCQ .67
If byx = bxy = rxy, then:
(a) Sx ≠ Sy (b) Sx = Sy (c) Sx > Sy (d) Sx < Sy
MCQ .68
If rxy = 0.4, then r(2x, 2y) is equal to:
(a) 0.4 (b) 0.8 (c) 0 (d) 1
MCQ .69
rxy is equal to:
(a) 0 (b) -1 (c) 1 (d) 0.5
MCQ .70
If rxy = 0.75, then correlation coefficient between u = 1.5X and v = 2Y is:
(a) 0 (b) 0.75 (c) -0.75 (d) 1.5
MCQ .71
If byx = -2 and rxy= -1, then bxy is equal to:
(a) -1 (b) -2 (c) 0.5 (d) -0.5
MCQ .72
If byx = 1.6 and bxy = 0.4, then rxy will be:
(a) 0.4 (b) 0.64 (c) 0.8 (d) -0.8
MCQ .73
If byx = -0.8 and bxy = -0.2, then ryx is equal to:
(a) -0.2 (b) -0.4 (c) 0.4 (d) -0.8
MCQ .74
If = 6 – X, then r will be:
(a) 0 (b) 1 (c) -1 (d) Both (b) and (c)
MCQ .75
If = X + 10, then r equal to:
(a) 1 (b) -1 (c) 1/2 (d) Difficult to tell
MCQ .76
If Y = -10X and X = -0.1Y, then r is equal to:
(a) 0.1 (b) 1 (c) -1 (d) 10
MCQ .77
If the figure +1 signifies perfect positive correlation and the figure -1 signifies a
perfect negative
correlation, then the figure 0 signifies:
(a) A perfect correlation (b) Uncorrelated variables
(c) Not significant (d) Weak correlation
MCQ .78
A perfect positive correlation is signified by:
(a) 0 (b) -1 (c) +1 (d) -1 to +1
MCQ .79
If a statistics professor tells his class: "All those who got 100 on the statistics test
got 20 on the mathematics test, and all those that got 100 on the mathematics test
got 20 on the statistics test", he is saying that the correlation between the statistics
test and the mathematics test is:
(a) Negative (b) Positive (c) Zero (d) Difficult to tell
MCQ .80
If is zero, the correlation is:
(a) Weak negative (b) High positive (c) High negative (d) None of the preceding
MCQ .81
If rxy = 1, then:
(a) byx = bxy (b) byx > bxy (c) byx < bxy (d) byx . bxy = 1
MCQ .82
The relation between the regression coefficient byx and correlation coefficient r is:
MCQ .83
The relation between the regression coefficient bxy and correlation coefficient r is:
MCQ .84
If the sum of the product of the deviation of X and Y from their means is zero, the
correlation
coefficient between X and Y is:
(a) Zero (b) Maximum (c) Minimum (d) Undecided
MCQ .85
If the coefficient of correlation between the variables X and Y is r, the coefficient of
correlation
between X2 and Y2 is:
(a) -1 (b) 1 (c) r (d) r2
MCQ .86
If rxy = 0.75, then ryx will be:
(a) 0.25 (b) 0.50 (c) 0.75 (d) -0.75
MCQ .87
If , then byx is equal to:
(a) Positive (b) Negative (c) Zero (d) One
MCQ .88
If , then intercept a is equal to:
(a) 0 (b) 1 (c) -1 to +1 (d) 0 to 1
MCQ .89
:
(a) Less than zero (b) Greater than zero (c) Equal to zero (d) Not equal to zero
MCQ .90
When rxy < 0, then byx and bxy will be:
(a) Zero (b) Not equal to zero (c) Less than zero (d) Greater than zero
MCQ .91
When rxy > 0, then byx and bxy are both:
(a) 0 (b) < 0 (c) > 0 (d) < 1
MCQ .92
If rxy = 0, then:
(a) byx = 0 (b) bxy = 0 (c) Both (a) and (b) (d) byx ≠ bxy
MCQ .93
If bxy = 0.20 and rxy = 0.50, then byx is equal to:
(a) 0.20 (b) 0.25 (c) 0.50 (d) 1.25
MCQ .94
A regression model may be:
(a) Linear (b) Non-linear (c) Both (a) and (b) (d) Neither (a) nor (b)

1) The __________ function returns a list of all the formal arguments of a function.

A. formal()
B. funct()
C. formals()
D. fun()

2) You can check whether an R object is NULL with the _________ function.

A. is.null()
B. null()
C. as.nullobj()
D. is.nullobj()

3) Which of the following code will print NULL?

A. >arg(bin)
B. >arg(paste)
C. >args(pastebin)
D. >args(paste)

4) What will be the output of the following R code snippet? > paste("a", "b", sep = ":")

A. a*b
B. “a+b”
C. “a:b”
D. None of the above

5) _____ programming language is a dialect of S.

A. L
B. N
C. R
D. T

6) In 1991, R was created by Ross Ihaka and Robert Gentleman in the Department of Statistics
at the University of _________

A. Auckland
B. Harvard
C. California
D. None of the above

7) R version 1.0.0 was released to the public in _______.

A. 2004
B. 2000
C. 2006
D. 1998

8) R is technically much closer to the Scheme language than it is to the original _____ language.

A. C
B. C++
C. S
D. K

9) In which year was the R Core group formed?

A. 1991
B. 1995
C. 1997
D. 1999

10) R runs on the ____________ operating system.

A. Linux
B. Ubuntu
C. Windows
D. All of the above

11) Which of the following packages is not contained in the “base” R system?

A. splines, stats4
B. mesh, compiler
C. splines, stats4
D. All of the above

12) Which of the following commands is used to print an object “x” in R?

A. print(x)
B. print{x}
C. printx
D. All of the above

13) R functionality is divided into a number of ________

A. Domains
B. Classes
C. Packages
D. Functions

14) The _________ package contains the most fundamental functions to run R.

A. parent
B. child
C. root
D. base

15) S’s base graphics system allows for very fine control over essentially every aspect of a plot or
graph.

A. True
B. False

16) The ______ operator is used to create integer sequences.

A. ;
B. :
C. @
D. -

17) In R, a vector can only contain objects of the ________

A. Different class
B. Same class
C. Any class
D. None of the above

18) ________ represents an ‘undefined value’ in the R language.

A. NaN
B. Inf
C. Sup
D. None of the above

19) NaN stands for _____ .

A. Not and Number
B. Number and Number
C. Not a Number
D. Numeric a Number

20) _______ represents ‘infinity’ in R.

A. SuP
B. Inf
C. NaN
D. All of the above

21) Matrices can be created by row-binding with the help of the _________ function.

A. rjoin()
B. rbind()
C. rowbind()
D. None of the above

22) The __________ function is used to test whether objects are NaN.

A. as.nan()
B. s.nan()
C. is.nan()
D. None of the above

23) q() is used to quit the R program.

A. True
B. False

24) Which of the following will start the R program?

A. @ R
B. - R
C. / R
D. $ R

25) The help.search() command allows searching for help in various ways.

A. True
B. False

27) What is the length of b? b <- 2:7

A. 4
B. 7
C. 6
D. 9

28) How many atomic vector types does R have?

A. 4
B. 6
C. 7
D. 12

29) ________ is the function to set row names for a data frame.

A. row.names()
B. row.namespace()
C. row.nam()
D. None of the above

30) Is it possible to inspect the source code of R?

A. Yes
B. No
C. May be
D. Can't say

Data and Analysis in the Real World

Week 1 Quiz
Quiz, 12 questions

Question 1 (1 point)
What statement below best describes why we do data analytics in business?

Refer to the following video for a refresher: video 1.

Analytics improve our understanding of how the business works

We must show a return on the investment we make in data & analytical resources

We need specific insights to make business decisions

We have to calculate & report financial results to owners / shareholders

Question 2 (1 point)
What should you consider as you approach an analytical problem and in which order? Identify
correct order for the following ideas / steps.

For example, if you think they are already in the correct order, correct answer would be ABCDEF.

A. Sourcing Data

B. Analysis Outputs

C. Execute Analysis

D. Analysis Methods
E. Define Decision

F. Data Needs

Refer to the following video for a refresher: video 1.

ABCDEF:

A. Sourcing Data

B. Analysis Outputs

C. Execute Analysis

D. Analysis Methods

E. Define Decision

F. Data Needs

EBDAFC

E. Define Decision

B. Analysis Outputs

D. Analysis Methods

A. Sourcing Data

F. Data Needs

C. Execute Analysis

EBDFAC

E. Define Decision

B. Analysis Outputs
D. Analysis Methods

F. Data Needs

A. Sourcing Data

C. Execute Analysis

BDFACE

B. Analysis Outputs

D. Analysis Methods

F. Data Needs

A. Sourcing Data

C. Execute Analysis

E. Define Decision

Question 3 (1 point)
What diagram below best describes the relationship between a mobile wireless carrier account
holder and devices at a point in time?
A - shows oval with account connected to oval with device by straight line

B - shows oval with account connected to oval with device by straight line – forked end on Account
side

C - shows oval with account connected to oval with device by straight line – forked end on Device
side

D - shows oval with account connected to oval with device by straight line – forked end on both sides

Refer to the following video for a refresher: video 2.

Question 4 (1 point)
For the next 5 questions that describe types of metrics, select a source that best describes where
the following data might come from:

The average temperature of a turbine bearing over the last 8 hours

Refer to the following video for a refresher: video 1.

Billing System

Usage Tracking System

Customer Relationship Management System

Machine Data System

Enterprise Resource Planning System

Ticketing / Workflow System

Question 5 (1 point)
Select a source that best describes where the following data might come from:

The number of developers allocated to a company software project

Billing System

Usage Tracking System

Customer Relationship Management System


Machine Data System

Enterprise Resource Planning System

Ticketing / Workflow System

Question 6 (1 point)
Select a source that best describes where the following data might come from:

Household water consumption by month

Billing System

Usage Tracking System

Customer Relationship Management System

Machine Data System

Enterprise Resource Planning System

Ticketing / Workflow System

Question 7 (1 point)
Select a source that best describes where the following data might come from:
The dollar amount of unpaid invoices at the end of a month

Billing System

Usage Tracking System

Customer Relationship Management System

Machine Data System

Enterprise Resource Planning System

Ticketing / Workflow System

Question 8 (1 point)
Select a source that best describes where the following data might come from:

The average age of customers in Madison, Wisconsin

Billing System

Usage Tracking System

Customer Relationship Management System

Machine Data System


Enterprise Resource Planning System

Ticketing / Workflow System

Question 9 (1 point)
Why is it important for data analysts to understand the value-chain (process) associated with
information and the analytical process?

Refer to the following videos for a refresher: videos 3 and 4

The more you understand about the way the business works and how information flows through
business systems, the better prepared you will be to both execute and interpret your analysis. Also,
the more skill you have in finding and accessing data, the more productive and valuable you will be
as an analyst.

Question 10 (1 point)
Identify correct order of steps in the Information-Action Value Chain.

Refer to the following videos for a refresher: videos 3 and 4.

ABCDEFGHI

A. Develop Strategy & Plan

B. Deliver the Pitch

C. Events & Characteristics in the Real World

D. Take Action

E. Data Capture by Source Systems

F. Data Extraction

G. Data Storage

H. Analytical Methods
I. Summarize & Interpret Results

CEFGHIABD

C. Events & Characteristics in the Real World

E. Data Capture by Source Systems

F. Data Extraction

G. Data Storage

H. Analytical Methods

I. Summarize & Interpret Results

A. Develop Strategy & Plan

B. Deliver the Pitch

D. Take Action

CEGFHIABD

C. Events & Characteristics in the Real World

E. Data Capture by Source Systems

G. Data Storage

F. Data Extraction

H. Analytical Methods

I. Summarize & Interpret Results

A. Develop Strategy & Plan

B. Deliver the Pitch


D. Take Action

CEGFHIBAD

C. Events & Characteristics in the Real World

E. Data Capture by Source Systems

G. Data Storage

F. Data Extraction

H. Analytical Methods

I. Summarize & Interpret Results

B. Deliver the Pitch

A. Develop Strategy & Plan

D. Take Action
MCQ TESTING OF HYPOTHESIS

MCQ 13.1
A statement about a population developed for the purpose of testing is called:
(a) Hypothesis (b) Hypothesis testing (c) Level of significance (d) Test-statistic

MCQ 13.2
Any hypothesis which is tested for the purpose of rejection under the assumption that it is true is
called:
(a) Null hypothesis (b) Alternative hypothesis (c) Statistical hypothesis (d) Composite hypothesis

MCQ 13.3
A statement about the value of a population parameter is called:
(a) Null hypothesis (b) Alternative hypothesis (c) Simple hypothesis (d) Composite hypothesis

MCQ 13.4
Any statement whose validity is tested on the basis of a sample is called:
(a) Null hypothesis (b) Alternative hypothesis (c) Statistical hypothesis (d) Simple hypothesis

MCQ 13.5
A quantitative statement about a population is called:
(a) Research hypothesis (b) Composite hypothesis (c) Simple hypothesis (d) Statistical hypothesis

MCQ 13.6
A statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false is
called:
(a) Simple hypothesis (b) Composite hypothesis (c) Statistical hypothesis (d) Alternative hypothesis

MCQ 13.7
The alternative hypothesis is also called:
(a) Null hypothesis (b) Statistical hypothesis (c) Research hypothesis (d) Simple hypothesis

MCQ 13.8
A hypothesis that specifies all the values of parameter is called:
(a) Simple hypothesis (b) Composite hypothesis (c) Statistical hypothesis (d) None of the above

MCQ 13.9
The hypothesis µ ≤ 10 is a:
(a) Simple hypothesis (b) Composite hypothesis (c) Alternative hypothesis (d) Difficult to tell.

MCQ 13.10
A hypothesis that specifies the population distribution is called:
(a) Simple hypothesis (b) Composite hypothesis (c) Alternative hypothesis (d) None of the above

MCQ 13.11
A hypothesis may be classified as:
(a) Simple (b) Composite (c) Null (d) All of the above

MCQ 13.12
The probability of rejecting the null hypothesis when it is true is called:
(a) Level of confidence (b) Level of significance (c) Power of the test (d) Difficult to tell
MCQ 13.13
The dividing point between the region where the null hypothesis is rejected and the region where it is not
rejected is said to be:
(a) Critical region (b) Critical value (c) Acceptance region (d) Significant region

MCQ 13.14
If the critical region is located equally in both sides of the sampling distribution of test-statistic, the test is
called:
(a) One tailed (b) Two tailed (c) Right tailed (d) Left tailed

MCQ 13.15
The choice of one-tailed test and two-tailed test depends upon:
(a) Null hypothesis (b) Alternative hypothesis (c) None of these (d) Composite hypotheses

MCQ 13.16
Test of hypothesis Ho: µ = 50 against H1: µ > 50 leads to:
(a) Left-tailed test (b) Right-tailed test (c) Two-tailed test (d) Difficult to tell

MCQ 13.17
Test of hypothesis Ho: µ = 20 against H1: µ < 20 leads to:
(a) Right one-sided test (b) Left one-sided test (c) Two-sided test (d) All of the above

MCQ 13.18
Testing Ho: µ = 25 against H1: µ ≠ 20 leads to:
(a) Two-tailed test (b) Left-tailed test (c) Right-tailed test (d) None of (a), (b) and (c)

MCQ 13.19
A rule or formula that provides a basis for testing a null hypothesis is called:
(a) Test-statistic (b) Population statistic (c) Both of these (d) None of the above

MCQ 13.20
The range of test statistic-Z is:
(a) 0 to 1 (b) -1 to +1 (c) 0 to ∞ (d) -∞ to +∞

MCQ 13.21
The range of test statistic-t is:
(a) 0 to ∞ (b) 0 to 1 (c) -∞ to +∞ (d) -1 to +1

MCQ 13.22
If Ho is true and we reject it, the error is called:
(a) Type-I error (b) Type-II error (c) Standard error (d) Sampling error

MCQ 13.23
The probability associated with committing type-I error is:
(a) β (b) α (c) 1 – β (d) 1 – α

MCQ 13.24
A failing student being passed by an examiner is an example of:
(a) Type-I error (b) Type-II error (c) Unbiased decision (d) Difficult to tell
MCQ 13.25
A passing student being failed by an examiner is an example of:
(a) Type-I error (b) Type-II error (c) Best decision (d) All of the above

MCQ 13.26
1 – α is also called:
(a) Confidence coefficient (b) Power of the test (c) Size of the test (d) Level of significance

MCQ 13.27
1 – α is the probability associated with:
(a) Type-I error (b) Type-II error (c) Level of confidence (d) Level of significance

MCQ 13.28
Area of the rejection region depends on:
(a) Size of α (b) Size of β (c) Test-statistic (d) Number of values

MCQ 13.29
Size of critical region is known as:
(a) β (b) 1 - β (c) Critical value (d) Size of the test

MCQ 13.30
A null hypothesis is rejected if the value of a test statistic lies in the:
(a) Rejection region (b) Acceptance region (c) Both (a) and (b) (d) Neither (a) nor (b)

MCQ 13.31
The test statistic is equal to:

MCQ 13.32
Level of significance is also called:
(a) Power of the test (b) Size of the test (c) Level of confidence (d) Confidence coefficient

MCQ 13.33
Level of significance α lies between:
(a) -1 and +1 (b) 0 and 1 (c) 0 and n (d) -∞ to +∞

MCQ 13.34
Critical region is also called:
(a) Acceptance region (b) Rejection region (c) Confidence region (d) Statistical region

MCQ 13.35
The probability of rejecting Ho when it is false is called:
(a) Power of the test (b) Size of the test (c) Level of confidence (d) Confidence coefficient

MCQ 13.36
Power of a test is related to:
(a) Type-I error (b) Type-II error (c) Both (a) and (b) (d) Neither (a) nor (b)
MCQ 13.37
In testing hypothesis α + β is always equal to:
(a) One (b) Zero (c) Two (d) Difficult to tell

MCQ 13.38
The significance level is the risk of:
(a) Rejecting Ho when Ho is correct (b) Rejecting Ho when H1 is correct
(c) Rejecting H1 when H1 is correct (d) Accepting Ho when Ho is correct.

MCQ 13.39
An example of a two-sided alternative hypothesis is:
(a) H1: µ < 0 (b) H1: µ > 0 (c) H1: µ ≥ 0 (d) H1: µ ≠ 0

MCQ 13.40
If the magnitude of calculated value of t is less than the tabulated value of t and H1 is two-sided, we
should:
(a) Reject Ho (b) Accept H1 (c) Not reject Ho (d) Difficult to tell

MCQ 13.41
Accepting a null hypothesis Ho:
(a) Proves that Ho is true (b) Proves that Ho is false
(c) Implies that Ho is likely to be true (d) Proves that µ ≤ 0

MCQ 13.42
The chance of rejecting a true hypothesis decreases when sample size is:
(a) Decreased (b) Increased (c) Constant (d) Both (a) and (b)

MCQ 13.43
The equality condition always appears in:
(a) Null hypothesis (b) Simple hypothesis (c) Alternative hypothesis (d) Both (a) and (b)

MCQ 13.44
Which hypothesis is always in an inequality form?
(a) Null hypothesis (b) Alternative hypothesis (c) Simple hypothesis (d) Composite hypothesis

MCQ 13.45
Which of the following is composite hypothesis?
(a) µ ≥ µo (b) µ ≤ µo (c) µ = µo (d) µ ≠ µo

MCQ 13.46
P (Type I error) is equal to:
(a) 1 – α (b) 1 – β (c) α (d) β

MCQ 13.47
P (Type II error) is equal to:
(a) α (b) β (c) 1 – α (d) 1 – β

MCQ 13.48
The power of the test is equal to:
(a) α (b) β (c) 1 – α (d) 1 – β
MCQ 13.49
The degree of confidence is equal to:
(a) α (b) β (c) 1 – α (d) 1 – β
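
Note on MCQ 13.46 to 13.49: the four quantities α, β, 1 – α and 1 – β can be computed side by side for one concrete test. The sketch below is only an illustration; the right-tailed Z-test, the true mean of 52, σ = 5 and n = 25 are assumed values that do not come from the questions.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical right-tailed Z-test: H0: mu = 50 against H1: mu > 50.
# Every number here is an illustrative assumption, not data from the questions.
mu0, mu_true = 50.0, 52.0   # hypothesised mean and an assumed true mean under H1
sigma, n = 5.0, 25          # assumed known population SD and sample size
alpha = 0.05                # level of significance = P(Type I error)

se = sigma / sqrt(n)                                        # standard error of the mean
critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * se  # reject H0 above this value

# beta = P(sample mean falls below the critical value | true mean is mu_true)
beta = NormalDist(mu_true, se).cdf(critical_mean)  # P(Type II error)
power = 1 - beta                                   # P(reject H0 | H0 is false)
confidence = 1 - alpha                             # P(do not reject H0 | H0 is true)

print(f"alpha={alpha:.3f} beta={beta:.3f} power={power:.3f} confidence={confidence:.3f}")
```

In this setting α is fixed by the analyst, while β depends on the assumed true mean, so the two need not add up to any particular value.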

MCQ 13.50
α / 2 is called:
(a) One tailed significance level (b) Two tailed significance level
(c) Left tailed significance level (d) Right tailed significance level

MCQ 13.51
Student’s t-test is applicable only when:
(a) n≤30 and σ is known (b) n>30 and σ is unknown (c) n=30 and σ is known (d) All of the above

MCQ 13.52
Student’s t-statistic is applicable in case of:
(a) Equal number of samples (b) Unequal number of samples (c) Small samples (d) All of the above

MCQ 13.53
Paired t-test is applicable when the observations in the two samples are:
(a) Equal in number (b) Paired (c) Correlation (d) All of the above

MCQ 13.54
The degree of freedom for paired t-test based on n pairs of observations is:
(a) 2n - 1 (b) n - 2 (c) 2(n - 1) (d) n - 1

MCQ 13.55
The test-statistic has d.f = ________:
(a) n (b) n - 1 (c) n - 2 (d) n1 + n2 - 2

MCQ 13.56
In an unpaired samples t-test with sample sizes n1= 11 and n2= 11, the value of tabulated t should be
obtained for:
(a) 10 degrees of freedom (b) 21 degrees of freedom
(c) 22 degrees of freedom (d) 20 degrees of freedom

MCQ 13.57
In analyzing the results of an experiment involving seven paired samples, tabulated t should be
obtained for:
(a) 13 degrees of freedom (b) 6 degrees of freedom
(c) 12 degrees of freedom (d) 14 degrees of freedom

MCQ 13.58
The mean difference between 16 paired observations is 25 and the standard deviation of differences is
10. The value of statistic-t is:
(a) 4 (b) 10 (c) 16 (d) 25
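
Note on MCQ 13.58: the paired t-statistic divides the mean difference by its standard error, s_d / √n, and has n – 1 degrees of freedom. The sketch below assumes the hypothesised mean difference is zero, which the question does not state; the function name is hypothetical.

```python
from math import sqrt

def paired_t_statistic(mean_diff, sd_diff, n_pairs, hypothesised_diff=0.0):
    """Return (t, degrees of freedom) for a paired t-test.

    t = (d_bar - mu_d) / (s_d / sqrt(n)), with n - 1 degrees of freedom.
    The default hypothesised_diff of 0 is an assumption.
    """
    standard_error = sd_diff / sqrt(n_pairs)
    t_value = (mean_diff - hypothesised_diff) / standard_error
    return t_value, n_pairs - 1

# Figures from MCQ 13.58: 16 pairs, mean difference 25, SD of differences 10.
t_value, df = paired_t_statistic(mean_diff=25, sd_diff=10, n_pairs=16)
print(t_value, df)
```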

MCQ 13.59
Statistic-t is defined as deviation of sample mean from population mean µ expressed in terms of:
(a) Standard deviation (b) Standard error
(c) Coefficient of standard deviation (d) Coefficient of variation
MCQ 13.60
Student’s t-distribution has (n-1) d.f. when all the n observations in the sample are:
(a) Dependent (b) Independent (c) Maximum (d) Minimum

MCQ 13.61
The number of independent values in a set of values is called:
(a) Test-statistic (b) Degree of freedom (c) Level of significance (d) Level of confidence
MCQ 13.62
The purpose of statistical inference is:
(a) To collect sample data and use them to formulate hypotheses about a population
(b) To draw conclusion about populations and then collect sample data to support the conclusions
(c) To draw conclusions about populations from sample data
(d) To draw conclusions about the known value of population parameter

MCQ 13.63
Rejecting the null hypothesis when it is true is known as:
(a) A type-I error, and its probability is β
(b) A type-I error, and its probability is α
(c) A type-II error, and its probability is α
(d) A type-II error, and its probability is β

MCQ 13.64
An advertising agency wants to test the hypothesis that the proportion of adults in Pakistan who read a Sunday
Magazine is 25 percent. The null hypothesis is that the proportion reading the Sunday Magazine is:
(a) Different from 25% (b) Equal to 25% (c) Less than 25 % (d) More than 25 %

MCQ 13.65
If the mean of a particular population is µo, is distributed:
(a) As a standard normal variable, if the population is non-normal
(b) As a standard normal variable, if the sample is large
(c) As a standard normal variable, if the population is normal
(d) As the t-distribution with v = n - 1 degrees of freedom

MCQ 13.66
If µ1 and µ2 are means of two populations, is distributed:

(a) As a standard normal variable, if both samples are independent and less than 30
(b) As a standard normal variable, if both populations are normal
(c) As both (a) and (b) state
(d) As the t-distribution with n1 + n2 - 2 degrees of freedom

MCQ 13.67
If the population proportion equals po, then is distributed:
(a) As a standard normal variable, if n > 30
(b) As a Poisson variable
(c) As the t-distribution with v = n - 1 degrees of freedom
(d) As a distribution with v degrees of freedom
MCQ 13.68
When σ is known, the hypothesis about population mean is tested by:
(a) t-test (b) Z-test (c) χ2-test (d) F-test

MCQ 13.69
Given µo = 130, x̄ = 150, σ = 25 and n = 4; which test statistic is appropriate?
(a) t (b) Z (c) χ2 (d) F
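
Note on MCQ 13.69: when σ is known, the statistic for a test about the mean is Z = (x̄ – µo) / (σ / √n). The sketch below assumes that the unlabelled value 150 in the question is the sample mean and, because n is small, that the population is normal; the function name is hypothetical.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sigma, n):
    """Z = (sample mean - hypothesised mean) / (sigma / sqrt(n)), sigma known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

# Values from MCQ 13.69, reading 150 as the sample mean (an assumption).
z = z_statistic(sample_mean=150, mu0=130, sigma=25, n=4)
print(z)
```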

MCQ 13.70
Given Ho: µ = µo, H1: µ ≠ µo, α = 0.05 and we reject Ho; the absolute value of the Z-statistic must have equalled
or exceeded what value?
(a) 1.96 (b) 1.65 (c) 2.58 (d) 2.33
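
Note on MCQ 13.70: the four options are the usual standard normal critical values for α = 0.05 and α = 0.01, two-sided and one-sided. They can be reproduced from the inverse normal CDF, as in this minimal sketch (the helper name `z_critical` is hypothetical).

```python
from statistics import NormalDist

def z_critical(alpha, two_sided=True):
    """Critical value of the standard normal for a given significance level."""
    tail_area = alpha / 2 if two_sided else alpha  # a two-sided test splits alpha over both tails
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01):
    print(alpha, round(z_critical(alpha, True), 3), round(z_critical(alpha, False), 3))
```

The one-sided 5% value is about 1.645, which printed tables often round to 1.65.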

MCQ 13.71
If p1 and p2 are not identical, then standard error of the difference of proportions (p1 – p2) is:

MCQ 13.72
Under the hypothesis Ho: p1 = p2, the formula for the standard error of the difference between
proportions (p1 – p2) is:
MULTIPLE CHOICE QUESTIONS ON QUANTITATIVE TECHNIQUES

1. The techniques which provide the decision maker a systematic and powerful means of
analysis to explore policies for achieving predetermined goals are called..........................
a. Correlation techniques
b. Mathematical techniques
c. Quantitative techniques
d. None of the above
2. Correlation analysis is a ..............................
a. Univariate analysis
b. Bivariate analysis
c. Multivariate analysis
d. Both b and c
3. If a change in one variable results in a corresponding change in the other variable, then
the variables are.........................
a. Correlated
b. Not correlated
c. Any of the above
d. None of the above
4. When the values of two variables move in the same direction, correlation is said to
be ............................
a. Linear
b. Non-linear
c. Positive
d. Negative
5. When the values of two variables move in the opposite directions, correlation is said
to be ............................
a. Linear
b. Non-linear
c. Positive
d. Negative
6. When the amount of change in one variable leads to a constant ratio of change in
the other variable, then correlation is said to be .........................
a. Linear
b. Non-linear
c. Positive
d. Negative
7. ...........................attempts to determine the degree of relationship between
variables.
a. Regression analysis
b. Correlation analysis
c. Inferential analysis
d. None of these
8. Non-linear correlation is also called.....................................
a. Non-curvilinear correlation
b. Curvilinear correlation
c. Zero correlation
d. None of these
9. Scatter diagram is also called ......................
a. Dot chart
b. Correlation graph
c. Both a and b
d. None of these
10. If all the points of a scatter diagram lie on a straight line falling from left upper
corner to the right bottom corner, the correlation is called...................
a. Zero correlation
b. High degree of positive correlation
c. Perfect negative correlation
d. Perfect positive correlation
11. If all the dots of a scatter diagram lie on a straight line falling from left bottom corner
to the right upper corner, the correlation is called..................
a. Zero correlation
b. High degree of positive correlation
c. Perfect negative correlation
d. Perfect positive correlation
12. Numerical measure of correlation is called .....................
a. Coefficient of correlation
b. Coefficient of determination
c. Coefficient of non-determination
d. Coefficient of regression
13. Coefficient of correlation explains:
a. Concentration
b. Relation
c. Dispersion
d. Asymmetry
14. Coefficient of correlation lies between:
a. 0 and +1
b. 0 and –1
c. –1 and +1
d. – 3 and +3
15. A high degree of +ve correlation between availability of rainfall and weight
of people is:
a. A meaningless correlation
b. A spurious correlation
c. A nonsense correlation
d. All of the above
16. If the ratio of change in one variable is equal to the ratio of change in the other
variable, then the correlation is said to be .....................
a. Linear
b. Non-linear
c. Curvilinear
d. None of these
17. Pearsonian correlation coefficient is denoted by the symbol ...............
a. K
b. r
c. R
d. None of these
18. If r= +1, the correlation is said to be ...................
a. High degree of +ve correlation
b. High degree of –ve correlation
c. Perfect +ve correlation
d. Perfect –ve correlation
19. If the dots in a scatter diagram fall on a narrow band, it indicates a .......................
degree of correlation.
a. Zero
b. High
c. Low
d. None of these
20. If all the points of a dot chart lie on a straight line vertical to the X-axis, then
coefficient of correlation is ...................
a. 0
b. +1
c. –1
d. None of these
21. If all the points of a dot chart lie on a straight line parallel to the X-axis, it denotes
.................................of correlation.
a. High degree
b. Low degree
c. Moderate degree
d. Absence
22. If dots are lying on a scatter diagram in a haphazard manner, then r = ......................
a. 0
b. +1
c. –1
d. None of these
23. The unit of Coefficient of correlation is ........................
a. Percentage
b. Ratio
c. Same unit of the data
d. No unit
24. Product moment correlation method is also called ........................
a. Rank correlation
b. Pearsonian correlation
c. Concurrent deviation
d. None of these
25. The –ve sign of correlation coefficient between X and Y indicates.............................
a. X decreasing, Y increasing
b. X increasing, Y decreasing
c. Any of the above
d. There is no change in X and Y
26. Coefficient of correlation explains .........................of the relationship between two
variables.
a. Degree
b. Direction
c. Both of the above
d. None of the above
27. For perfect correlation, the coefficient of correlation should be ..........................
a. ± 1
b. + 1
c. – 1
d. 0
28. Rank correlation coefficient was discovered by....................................
a. Fisher
b. Spearman
c. Karl Pearson
d. Bowley
29. The rank correlation coefficient is always............................
a. + 1
b. – 1
c. 0
d. Between + 1 and – 1
30. Spearman’s Rank Correlation Coefficient is usually denoted by....................
a. k
b. r
c. S
d. R
31. Probable error is used to:
a. Test the reliability of correlation coefficient
b. Measure the error in correlation coefficient
c. Both a and b
d. None of these
32. If coefficient of correlation is more than ................of its P E, correlation is significant.
a. 2 times
b. 5 times
c. 6 times
d. 10 times
33. In correlation analysis, Probable Error = ........................ x 0.6745
a. Standard deviation
b. Standard error
c. Coefficient of correlation
d. None of these
34. Coefficient of concurrent deviation depends on .......................
a. The signs of the deviations
b. The magnitude of the deviations
c. Both a and b
d. None of these
35. Correlation analysis between two sets of data only is called....................
a. Partial correlation
b. Multiple correlation
c. Nonsense correlation
d. Simple correlation
36. Correlation analysis between one dependent variable with one independent variable
by keeping the other independent variables as constant is called......................
a. Partial correlation
b. Multiple correlation
c. Nonsense correlation
d. Simple correlation
37. Study of correlation among three or more variables simultaneously is called.............
a. Partial correlation
b. Multiple correlation
c. Nonsense correlation
d. Simple correlation
38. If r = 0.8, coefficient of determination is.....................................
a. 80%
b. 8%
c. 64%
d. 0.8%
39. If r is the simple correlation coefficient, the quantity r² is known as ...................
a. Coefficient of determination
b. Coefficient of non-determination
c. Coefficient of alienation
d. None of these
40. If r is the simple correlation coefficient, the quantity 1 – r² is known as ...................
a. Coefficient of determination
b. Coefficient of non-determination
c. Coefficient of alienation
d. None of these
41. The term regression was first used by..........................
a. Karl Pearson
b. Spearman
c. R A Fisher
d. Francis Galton
42. ....................refers to analysis of average relationship between two variables to
provide mechanism for prediction.
a. Correlation
b. Regression
c. Standard error
d. None of these
43. If there are two variables, there can be at most............................... number of
regression lines.
a. One
b. Two
c. Three
d. Infinite
44. If the regression line is Y on X, then the variable X is known as..........................
a. Independent variable
b. Explanatory variable
c. Regressor
d. All the above
45. Regression line is also called.................................
a. Estimating equation
b. Prediction equation
c. Line of average relationship
d. All the above
46. If the regression line is X on Y, then the variable X is known as..........................
a. Dependent variable
b. Explained variable
c. Both a and b
d. Regressor
47. If the regression line is X on Y, then the variable X is known as..........................
a. Dependent variable
b. Independent variable
c. Both a and b
d. None of the above
48. If the regression line is Y on X, then the variable X is known as..........................
a. Dependent variable
b. Independent variable
c. Both a and b
d. None of the above
49. The point of intersection of two regression lines is..........................
a. (0,0)
b. (1,1)
c. (x,y)
d. (x̄, ȳ)
50. If r = ± 1, the two regression lines are...............................
a. Coincident
b. Parallel
c. Perpendicular to each other
d. None of these
51. If r = 1, the angle between the two regression lines is.........................
a. Ninety degrees
b. Thirty degrees
c. Zero degrees
d. Sixty degrees
52. If r = 0, the two regression lines are:
a. Coincident
b. Parallel
c. Perpendicular to each other
d. None of these
53. If bxy and byx are two regression coefficients, they have:
a. Same signs
b. Opposite signs
c. Either a or b
d. None of the above.
54. If byx > 1, then bxy is:
a. Greater than one
b. Less than one
c. Equal to one
d. Equal to zero
55. If X and Y are independent, the value of byx is equal to ........................
a. Zero
b. One
c. Infinity
d. Any positive value
56. The property that both the regression coefficients and the correlation coefficient have
the same sign is called................................
a. Fundamental property
b. Magnitude property
c. Signature property
d. None of these
57. The property that byx > 1 implies that bxy < 1 is known as .....................
a. Fundamental property
b. Magnitude property
c. Signature property
d. None of these
58. If X and Y are independent, the property byx = bxy = 0 is called ...................
a. Fundamental property
b. Magnitude property
c. Mean property
d. Independence property
59. The Correlation coefficient between two variables is the ........................... of their
regression coefficients.
a. Arithmetic mean
b. Geometric mean
c. Harmonic mean
d. None of these
60. If the correlation coefficient between two variables, X and Y, is negative, then the
regression coefficient of Y on X is.............................
a. Positive
b. Negative
c. Not certain
d. None of these
61. The G M of two regression coefficients byx and bxy is equal to ..........................
a. r
b. r²
c. 1 – r²
d. None of these
62. If one regression coefficient is negative, the other is ...............................
a. 0
b. – ve
c. +ve
d. Either a or b
63. Arithmetic mean of the two regression coefficients is:
a. Equal to correlation coefficient
b. Greater than correlation coefficient
c. Less than correlation coefficient
d. Equal to or greater than correlation coefficient
64. byx is the regression coefficient of the regression equation.....................
a. Y on X
b. X on Y
c. Either a or b
d. None of these
65. bxy is the regression coefficient of the regression equation.....................
a. Y on X
b. X on Y
c. Either a or b
d. None of these
66. In ..................... regression analysis, only one independent variable is used to explain
the dependent variable.
a. Multiple
b. Non-linear
c. Linear
d. None of these
67. The regression coefficient and correlation coefficient of the two variables will be the
same if their ............................. are the same.
a. Arithmetic mean
b. Standard deviation
c. Geometric mean
d. Mean deviation
68. The idea of testing of hypothesis was first set forth by ..........................
a. R A Fisher
b. J Neyman
c. E L Lehman
d. A Wald
69. By testing of hypothesis, we mean:
a. A significant procedure in Statistics
b. A method of making a significant statement
c. A rule for accepting or rejecting hypothesis
d. A significant estimation of a problem.
70. Testing of hypothesis and ......................are the two branches of statistical inference.
a. Statistical analysis
b. Probability
c. Correlation analysis
d. Estimation
71. ......................... is the original hypothesis
a. Null hypothesis
b. Alternative hypothesis
c. Either a or b
d. None of these
72. A null hypothesis is denoted by...........................
a. H0
b. H1
c. NH
d. None of these
73. An alternative hypothesis is denoted by...........................
a. H0
b. H1
c. AH
d. None of these
74. Whether a test is one sided or two sided, depends on........................
a. Simple hypothesis
b. Composite hypothesis
c. Null hypothesis
d. Alternative hypothesis
75. A wrong decision about null hypothesis leads to:
a. One kind of error
b. Two kinds of errors
c. Three kinds of errors
d. Four kinds of errors
76. Power of a test is related to ........................
a. Type I error
b. Type II error
c. Both a and b
d. None of these
77. Level of significance is the probability of................................
a. Type I error
b. Type II error
c. Both a and b
d. None of these
78. Which type of error is more severe?
a. Type I error
b. Type II error
c. Both a and b
d. None of these
79. Type II error means..............................
a. Accepting a true hypothesis
b. Rejecting a true hypothesis
c. Accepting a wrong hypothesis
d. Rejecting a wrong hypothesis
80. Type I error is denoted by...........................
a. Alpha
b. Beta
c. Gamma
d. None of these
81. Type II error is denoted by....................................
a. Alpha
b. Beta
c. Gamma
d. None of these
82. The level of probability of accepting a true null hypothesis is called........................
a. Degree of freedom
b. Level of significance
c. Level of confidence
d. None of these
83. The probability of rejecting a true null hypothesis is called.......................
a. Degree of freedom
b. Level of significance
c. Level of confidence
d. None of these
84. 1 – Level of confidence =.............................
a. Level of significance
b. Degree of freedom
c. Either a or b
d. None of these
85. While testing a hypothesis, if level of significance is not mentioned, we take
................... level of significance.
a. 1%
b. 2%
c. 5%
d. 10%
86. A sample is treated as large sample, when its size is.............................
a. More than 100
b. More than 75
c. More than 50
d. More than 30
87. ...............refers to the number of independent observations which is obtained by
subtracting the number of constraints from the total number of observations.
a. Sample size
b. Degree of freedom
c. Level of significance
d. Level of confidence
88. Total number of observations – number of constraints =......................
a. Level of significance
b. Degree of freedom
c. Level of confidence
d. Sample size
89. Accepting a null hypothesis when it is false is called................................
a. Type I error
b. Type II error
c. Probable error
d. Standard error
90. Accepting a null hypothesis when it is true is called................................
a. Type I error
b. Type II error
c. Probable error
d. No error
91. When sample is small,....................... test is applied.
a. t-test
b. Z test
c. F test
d. None of these
92. To test a hypothesis about proportions of items in a class, the usual test is..............
a. t-test
b. Z- test
c. F test
d. Sign test
93. Student’s t-test is applicable when:
a. The values of the variate are independent
b. The variable is distributed normally
c. The sample is small
d. All the above
94. Testing of hypotheses Ho : μ = 45 vs. H1 : μ > 45 when the population standard
deviation is known, the appropriate test is:
a. t-test
b. Z test
c. Chi-square test
d. F test
95. Testing of hypotheses Ho : μ = 85 vs. H1 : μ > 85, is a ...................test.
a. One sided left tailed test
b. One sided right tailed test
c. Two tailed test
d. None of these
96. Testing of hypotheses Ho : μ = 65 vs. H1 : μ < 65, is a ...................test.
a. One sided left tailed test
b. One sided right tailed test
c. Two tailed test
d. None of these
97. Testing of hypotheses Ho : μ = 65 vs. H1 : μ ≠ 65, is a ...................test.
a. One sided left tailed test
b. One sided right tailed test
c. Two tailed test
d. None of these
98. Student’s t-test was designed by ............................
a. R A Fisher
b. Wilcoxon
c. Wald and Wolfowitz
d. W S Gosset
99. Z test was designed by ........................................
a. R A Fisher
b. Wilcoxon
c. Wald and Wolfowitz
d. W S Gosset
100. Z test was designed by .......................................
a. R A Fisher
b. Wilcoxon
c. Wald and Wolfowitz
d. W S Gosset
101. The range of F ratio is ........................................
a. – 1 to + 1
b. – ∞ to ∞
c. 0 to ∞
d. 0 to 1
102. While computing F ratio, customarily, the larger variance is taken as .....................
a. Denominator
b. Numerator
c. Either way
d. None of these
103. Chi-square test was first used by ...............................
a. R A Fisher
b. William Gosset
c. James Bernoulli
d. Karl Pearson
104. The Chi-square quantity ranges from ........................ to ...........................
a. – 1 to + 1
b. – ∞ to ∞
c. 0 to ∞
d. 0 to 1
105. Degrees of freedom for Chi-square test in case of contingency table of order (2x2) is:
a. 4
b. 3
c. 2
d. 1
106. Degrees of freedom for Chi-square test in case of contingency table of order (4x3) is:
a. 4
b. 3
c. 6
d. 7
107. Degrees of freedom for Chi-square test in case of contingency table of order (5x5) is:
a. 25
b. 16
c. 10
d. Infinity
108. The magnitude of the difference between observed frequencies and expected
frequencies is called .......................
a. F value
b. Z value
c. t value
d. Chi-square value
109. When the expected frequencies and observed frequencies completely coincide, the
chi-square value will be ..............................
a. + 1
b. – 1
c. 0
d. None of these
110. If the discrepancy between observed and expected frequencies is greater,
......................... will be the chi-square value.
a. Greater
b. Smaller
c. Zero
d. None of these
111. Calculated value of chi-square is always........................
a. Positive
b. Negative
c. Zero
d. None of these
112. While applying chi-square test, the frequency in any cell should not be ......................
a. More than 5
b. Less than 5
c. More than 10
d. Less than 10
113. Analysis of variance utilises..................
a. F test
b. Chi square test
c. Z test
d. t test
114. In one-way ANOVA, the variances are:
a. Within samples
b. Between samples
c. Total
d. All
115. The technique of analysis of variance was developed by .............................
a. Frank Wilcoxon
b. Karl Pearson
c. R A Fisher
d. Kolmogorov
116. Non-parametric test is:
a. Distribution free test
b. Not concerned with parameter
c. Does not depend on the particular form of the distribution
d. None of these
117. ......................... tests follow assumptions about population parameters.
a. Parametric
b. Non-parametric
c. One-tailed
d. Two-tailed
118. ........................ is the simplest and most widely used non-parametric test.
a. Sign test
b. K-S test
c. Chi-square test
d. Wilcoxon matched-pairs test
d. Wilcoxon matched paired test
119. Runs test was designed by .............................
a. Kruskal and Wallis
b. Kolmogorov and Smirnov
c. Wald and Wolfowitz
d. Karl Pearson
120. Which one of the following is a non-parametric test?
a. F test
b. Z test
c. t test
d. Wilcoxon test
121. Control charts are also termed as...............................
a. Shewhart charts
b. Process behaviour charts
c. Both a and b
d. None of these
122. What type of chart will be used to plot the number of defectives in the output of any
process?
a. x̄ chart
b. R chart
c. C chart
d. P chart
123. Process control is carried out:
a. Before production
b. During production
c. After production
d. All of the above
124. The dividing lines between random and non-random deviations from the mean of the
distribution are known as ..........................
a. Upper Control Limit
b. Lower Control Limit
c. Control Limits
d. Two sigma limit
125. The control chart used to monitor variables is...........................
a. Range chart
b. P-chart
c. C-chart
d. All of the above
126. The control chart used to monitor attributes is............................
a. Range chart
b. P-chart
c. C-chart
d. All of the above
127. The control chart used for the fraction of defective items in a sample
is............................
a. Range chart
b. P-chart
c. C-chart
d. Mean chart
128. The control chart used for the number of defects per unit is:
a. Range chart
b. P-chart
c. C-chart
d. Mean chart
129. ........................ is used for testing goodness of fit.
a. Wilcoxon test
b. Sign test
c. K-S Test
d. Chi-square test
130. Which of the following is a non-parametric test?
a. F-test
b. Z-test
c. Wilcoxon test
d. All of the above
131. Regression coefficient is independent of...........................
a. Origin
b. Scale
c. Both a and b
d. Neither origin nor scale
132. The geometric mean of the two regression coefficients, bxy and byx, is equal to:
a. r
b. r²
c. 1
d. None of the above
133. In a correlation analysis, if r = 0, then we may say that there is .................. between
variables.
a. No correlation
b. Linear correlation
c. Perfect correlation
d. None of these
134. If ‘r’ is the correlation coefficient between two variables, then:
a. 0 < r < 1
b. – 1 ≤ r ≤ 1
c. r ≥ 0
d. r ≤ 0
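
Note on questions 38, 59, 61 and 63: the relations between the correlation coefficient r and the two regression coefficients byx and bxy can be checked numerically. The sketch below uses a small made-up data set, so the specific numbers are assumptions chosen only for illustration.

```python
from math import sqrt

# Small made-up data set, used only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # sum of cross-products
sxx = sum((a - mean_x) ** 2 for a in x)                       # sum of squares of x
syy = sum((b - mean_y) ** 2 for b in y)                       # sum of squares of y

byx = sxy / sxx             # regression coefficient of Y on X
bxy = sxy / syy             # regression coefficient of X on Y
r = sxy / sqrt(sxx * syy)   # Pearson correlation coefficient

geometric_mean = sqrt(byx * bxy)    # equals |r|; r takes the common sign of byx and bxy
arithmetic_mean = (byx + bxy) / 2   # never smaller than |r|
r_squared = r ** 2                  # coefficient of determination

print(byx, bxy, round(r, 4), round(r_squared, 4))
```

For the value in question 38, the same rule gives r² = 0.8² = 0.64, that is, 64%.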

**********
ANSWERS
1 : c 21 : d 41 : d 61 : a 81 : b 101 : c 121 : c
2 : d 22 : a 42 : b 62 : b 82 : c 102 : b 122 : d
3 : a 23 : d 43 : b 63 : b 83 : b 103 : d 123 : b
4 : c 24 : b 44 : d 64 : a 84 : a 104 : c 124 : c
5 : d 25 : c 45 : d 65 : b 85 : c 105 : d 125 : a
6 : a 26 : c 46 : c 66 : c 86 : d 106 : c 126 : b
7 : b 27 : a 47 : a 67 : b 87 : b 107 : b 127 : b
8 : b 28 : b 48 : b 68 : b 88 : b 108 : d 128 : c
9 : a 29 : d 49 : d 69 : c 89 : b 109 : c 129 : d
10 : c 30 : d 50 : a 70 : d 90 : d 110 : a 130 : c
11 : d 31 : a 51 : c 71 : a 91 : a 111 : a 131 : a
12 : a 32 : c 52 : c 72 : a 92 : b 112 : b 132 : a
13 : b 33 : b 53 : a 73 : b 93 : d 113 : a 133 : a
14 : c 34 : a 54 : b 74 : d 94 : b 114 : d 134 : b
15 : d 35 : d 55 : a 75 : b 95 : b 115 : c
16 : a 36 : a 56 : c 76 : b 96 : a 116 : d
17 : b 37 : b 57 : b 77 : a 97 : c 117 : a
18 : c 38 : c 58 : d 78 : b 98 : d 118 : c
19 : b 39 : a 59 : b 79 : c 99 : a 119 : c
20 : a 40 : b 60 : b 80 : a 100 : a 120 : d
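
Note on questions 105 to 111: for an r x c contingency table the chi-square test has (r – 1)(c – 1) degrees of freedom, and the statistic is the sum of (O – E)² / E over all cells. The sketch below is a minimal illustration; both function names are hypothetical and the frequencies in the usage lines are made up.

```python
def contingency_df(rows, cols):
    """Degrees of freedom of a chi-square test on a rows x cols contingency table."""
    return (rows - 1) * (cols - 1)

def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Table orders taken from questions 105 to 107.
for shape in [(2, 2), (4, 3), (5, 5)]:
    print(shape, contingency_df(*shape))

# Made-up frequencies: identical observed and expected counts give a statistic of 0,
# and any discrepancy makes the statistic positive.
print(chi_square([10, 20, 30], [10, 20, 30]))
print(chi_square([12, 18, 30], [10, 20, 30]))
```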

Prepared by

VINEETHAN T

Assistant Professor

Govt. College Madappally


DNYANSAGAR INSTITUTE OF MANAGEMENT AND RESEARCH

MBA – II SEM-III
304 : Advanced Statistical Methods using R
MULTIPLE CHOICE QUESTIONS

Q No Question Answer
1. Which of the following is an apply function in R?
a) apply()
b) tapply()
c) fapply()
d) rapply()
Answer: B
2. Point out the correct statement.
a) Writing functions is a core activity of an R programmer
b) Functions are often used to encapsulate a sequence of expressions that need to be executed numerous times
c) Functions are also often written when code must be shared with others or the public
d) All of the mentioned
Answer: D

3. Functions are defined using the _________ directive and are stored as R objects.
a) function()
b) funct()
c) functions()
d) fun()
Answer: A
4. What will be the output of the following R code?
f <- function() {
    ## This is an empty function
}
f()
a) 0
b) No result
c) NULL
d) 1
Answer: C

5. Point out the wrong statement.
a) Functions in R are “second class objects”
b) The writing of a function allows a developer to create an interface to the code that is explicitly specified with a set of parameters
c) Functions provide an abstraction of the code to potential users
d) Writing functions is a core activity of an R programmer
Answer: A

Prof. Bhavsar Dhananjay www.dimr.edu.in

6. What will be the output of the following R code?
> f <- function() {
+ ## This is an empty function
+ }
> class(f)
a) “function”
b) “class”
c) “procedure”
d) “system”
Answer: A

7. Which of the following R code snippets will print “Hello, world!”?
a)
> f <- function() {
+ cat("Hello, world!\n")
+ }
> f()
b)
> f <- function() {
+ cat("Hello, World!\n")
+ }
< f()
c)
> f <- function() {
+ cat("Hello world!\n")
+ }
>>= f()
d)
> f <- function() {
- cat("Hello world!\n")
+ }
<= f()
Answer: A

What will be the output of the following R code?

> f <- function(num) {
+ for(i in seq_len(num)) {
+ cat("Hello, world!\n")
+ }
+ }
> f(3)

a)

Hello, world!

Hello, world!

8 b) B

Hello, world!
Hello, world!
Hello, world!

c)

Hello, world!

d)

Hello, world!
Hello, world!
Hello, world!
Hello, world!

What will be the output of the following R code?
9 B
> f <- function(num) {
+ hello <- "Hello, world!\n"
+ for(i in seq_len(num)) {
+ cat(hello)
+ }
+ chars <- nchar(hello) * num
+ chars
+ }
> meaningoflife <- f(3)
> print(meaningoflife)
a) 32
b) 42
c) 52
d) 46
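The arithmetic behind this question can be verified directly. The sketch below reuses the names from the question: the string "Hello, world!\n" has 13 visible characters plus one newline, so nchar() reports 14, and 14 * 3 gives 42.

```r
f <- function(num) {
  hello <- "Hello, world!\n"
  for (i in seq_len(num)) {
    cat(hello)                 # side effect: prints the greeting
  }
  chars <- nchar(hello) * num  # 14 characters per greeting
  chars                        # last expression is the return value
}

meaningoflife <- f(3)
print(nchar("Hello, world!\n"))  # 14
print(meaningoflife)             # 42
```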
R has how many atomic classes of objects?
a) 1
b) 2
10 c) 3 D
d) 5

Point out the correct statement?


a) Empty vectors can be created with the vector() function
b) A sequence is represented as a vector but can contain
11 objects of different classes A
c) “raw” objects are commonly used directly in data analysis
d) The value NaN represents undefined value

Numbers in R are generally treated as _______ precision real
numbers.
a) single
12 b) double B
c) real
d) imaginary

If you explicitly want an integer, you need to specify the _____
suffix.
a) D
13 b) R C
c) L
d) K
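A short sketch of the difference between a plain number and one written with the L suffix; the variable names x and y are only illustrative.

```r
x <- 6    # no suffix: stored as a double-precision number
y <- 6L   # L suffix: stored as an integer

print(class(x))   # "numeric"
print(typeof(x))  # "double"
print(class(y))   # "integer"
```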

Point out the correct statement?


a) The value NaN represents undefined value
b) Number Inf represents infinity in R
14 c) NaN can also be thought of as a missing value B
d) “raw” objects are commonly used directly in data analysis


Attributes of an object (if any) can be accessed using the
______ function.
a) objects()
15 b) attrib() C
c) attributes()
d) obj()

R objects can have attributes, which are like ________ for the
object.
a) metadata
16 b) features A
c) expression
d) dimensions

Which of the following can be considered as object attribute?


a) dimensions
b) class
17 c) length D
d) all of the mentioned

What will be the output of the following R code?

> x <- vector("numeric", length = 10)
> x
18 a) 10 B
b) 0 0 0 0 0 0 0 0 0 0
c) 01
d) 00120

The ________ function can be used to create vectors of
objects by concatenating things together.
a) cp()
19 b) c() B
c) concat()
d) con()
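The two vector-creation functions from the questions above can be compared side by side. This is a minimal sketch; the variable names are only illustrative.

```r
x <- vector("numeric", length = 10)  # ten zeros, the default numeric value
y <- c(0.5, 0.6)                     # numeric vector built by concatenation
z <- c(TRUE, FALSE)                  # logical vector
w <- c(1+0i, 2+4i)                   # complex vector

print(x)
print(class(y))  # "numeric"
print(class(z))  # "logical"
print(class(w))  # "complex"
```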

Which of the following statements is invalid?


20 a) x <- c(1+0i, 2+4i) D
b) x <- c(TRUE, FALSE)


c) x <- c(T, F)
d) None of the mentioned

Point out the correct statement?


a) Use explicit TRUE and FALSE values when indicating
logical values
21 b) rm command is used to remove objects in R D
c) R operates on named data structures
d) All of the mentioned

What will be the output of the following R code?

> x <- 6
> class(x)
22 a) “integer” B
b) “numeric”
c) “real”
d) “imaginary”

What will be the output of the following R code?

> x <- 0:6
> as.logical(x)
23 a) FALSE TRUE TRUE TRUE TRUE TRUE TRUE A
b) “0” “1” “2” “3” “4” “5” “6”
c) 0 1 2 3 4 5 6
d) 6 5 5 3 2 1
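The coercion rule behind this question is that zero becomes FALSE and every non-zero number becomes TRUE. A minimal sketch, reusing x from the question:

```r
x <- 0:6
print(class(x))        # "integer"

flags <- as.logical(x) # 0 -> FALSE, anything else -> TRUE
print(flags)           # FALSE TRUE TRUE TRUE TRUE TRUE TRUE
```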

Point out the correct statement?


a) The usual operator, <-, can be thought of as a syntactic
shortcut to expression operation
b) Assignment can also be made using the function
24 assignment() C
c) Vectors can be used in arithmetic expressions, in which
case the operations are performed element by element
d) seq() is used to delete the numbers


Which of the following is an invalid assignment?


a)

> c(10.4, 5.6, 3.1, 6.4, 21.7) -> x

b)

> assign("x", c(10.4, 5.6, 3.1, 6.4, 21.7))


25 D
c)

> x <- c(10.4, 5.6, 3.1, 6.4, 21.7)

d) None of the mentioned

What will be the output of the following R code?

> sqrt(-17)

a) -4.02
26 b) 4.02 C
c) NaN
d) 3.67
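For a real (non-complex) argument, sqrt() of a negative number returns NaN and raises a warning. The sketch below suppresses the warning so the result can be inspected; converting the argument to complex first is shown only as a contrast.

```r
r <- suppressWarnings(sqrt(-17))  # real input: result is NaN (with a warning)
print(is.nan(r))                  # TRUE

z <- sqrt(as.complex(-17))        # complex input: a purely imaginary root
print(Im(z))                      # equals sqrt(17)
```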

Which of the following code constructs vector of length 11?


a)

> v <- 3*x + y + 1

b)

27 > v <- 3*x + y + 2 C


c)

> v <- 2*x + y + 1

d)

> v <- 2*x + y + 4


_______ function returns a vector of the same size as x with
the elements arranged in increasing order.
a) sort()
28 b) orderasc() A
c) orderby()
d) sequence()

Which of the following is used for generating sequences?


a) seq()
b) sequence()
29 c) order() A
d) orderasc()

Which of the following is used for reading in saved
workspaces?
a) unserialize
30 b) load B
c) get
d) set

Point out the wrong statement?
a) write.table is used for writing tabular data to text files (i.e.
CSV) or connections
b) writeLines is used for writing character data line-by-line
to a file or connection
31 c) dump is used for dumping a textual representation of D
multiple R objects
d) all of the mentioned

________ is used for outputting a textual representation of an
R object.
a) dput
b) dump
32 c) dget A
d) dset

Which of the following arguments denotes whether the file has a
header line?
33 A


a) header
b) sep
c) file
d) footer

Point out the correct statement?


a) unserialize is used for converting an R object into a binary
format for outputting to a connection
b) save is used for saving an arbitrary number of R objects in
34 binary format to a file B
c) The read.data() function is one of the most commonly used
functions for reading data
d) save is not used for saving an arbitrary

Which of the following statement would read file “foo.txt”?


a) data <- read.table("foo.txt")
b) read.data <- read.table("foo.txt")
35 c) data <- read.data("foo.txt") A
d) data <- data("foo.txt")

Which of the following functions is identical to read.table?


a) read.csv
b) read.data
36 c) read.tab A
d) read.del

Which of the following code would read 100 rows?


a) initial <- read.table("datatable.txt", nrows = 100)
b) tabAll <- read.table("datatable.txt", colClasses = classes)
37 c) initial <- read.table("datatable.txt", nrows = 99) A
d) initial <- read.table("datatable.txt", nrows = 101)

Individual R objects can be saved to a file using the _____
function.
38 A
a) save
b) put


c) save_image
d) get

Point out the correct statement?


a) The complement to the textual format is the binary format
b) If you have a lot of objects that you want to save to a file,
you can save all objects in your workspace using the
save.image() function
39 c) The serialize() function is used to convert individual R D
objects into a binary format that can be communicated across
an arbitrary connection
d) All of the mentioned

Which of the following R statements will save the objects created by the
following R code to a file?
> a <- data.frame(x = rnorm(100), y = runif(100))
> b <- c(3, 4.4, 1 / 3)
40 a) save(a, b, file = "mydata.rda") A
b) save_image(a, b, file = "mydata.rda")
c) keep(a, b, file = "mydata.rda")
d) keep_image(a, b, file = "mydata.rda")

Which of the following statements will load the objects from the file
named "mydata.RData"?
a) save("mydata.RData")
b) load("mydata.RData")
41 c) loadAll("mydata.RData") B
d) put("mydata.RData")
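The save() and load() pair from the two questions above can be tried as a round trip. This is a minimal sketch: it writes to a temporary file (an assumption made here so that nothing is left behind) rather than to "mydata.rda".

```r
a <- data.frame(x = rnorm(100), y = runif(100))
b <- c(3, 4.4, 1 / 3)

path <- tempfile(fileext = ".rda")  # temporary file instead of "mydata.rda"
save(a, b, file = path)             # write both objects in binary format

rm(a, b)                            # remove them from the workspace
loaded <- load(path)                # restores the objects, returns their names
print(loaded)                       # "a" "b"
unlink(path)                        # clean up the temporary file
```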

Point out the wrong statement?


a) When you call unserialize() on an R object, the output will be
a raw vector coded in hexadecimal format
b) serialize() function is the only way to perfectly represent an
42 R object in an exportable format A
c) .rda extension is used when save() function is incorporated
d) The complement to the textual format is the binary format


________ opens a connection to a file compressed with gzip.


a) url
b) gzfile
43 c) bzfile B
d) file

Connections to text files can be created with the ________
function.
a) url
b) gzfile
44 c) bzfile D
d) file

Which of the following R code creates a connection to
‘foo.txt’?
a) con <- file("foo.txt")
b) open(con, "r")
45 c) opencon(con, "r") A
d) ocon(con, "r")

Which of the following code opens a connection to the file
foo.txt, reads from it, and closes the connection when it is done?
a) data <- read.csv("foo.txt")
46 b) data <- read.csvo("foo.txt") A
c) data <- readonly.csv("foo.txt")
d) data <- getonly.csv("foo.txt")

Which of the following opens a connection to a gz-compressed text
file?
a) con <- gzfiles("words.gz")
b) con <- gzfile("words.gz")
47 c) con <- gzfile2("words.gz") B
d) con <- gzfiles2("words.gz")
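A connection opened with gzfile() can be used for both writing and reading compressed text. The sketch below assumes a temporary file in place of "words.gz", and the three words written to it are made up for illustration.

```r
path <- tempfile(fileext = ".gz")   # stands in for "words.gz"

con <- gzfile(path, "w")            # open a gzip connection for writing
writeLines(c("alpha", "beta", "gamma"), con)
close(con)

con <- gzfile(path, "r")            # reopen the same file for reading
words <- readLines(con)
close(con)
unlink(path)                        # clean up the temporary file

print(words)                        # "alpha" "beta" "gamma"
```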

Which of the following is an example of a vectorized operation as far
as subtraction is concerned?
48 B
> x <- 1:4
> y <- 6:9
a) x+y
b) x-y
c) x/y
d) x–y

Point out the wrong statement?


a) Very few operations in R are vectorized
b) Vectorization allows you to write code that is efficient,
concise, and easier to read than in non-vectorized languages
49 c) vectorized means that operations occur in parallel in certain A
R objects
d) Matrix operations are also vectorized

What will be the output of the following R code?

> x <- 1:4
> y <- 6:9
> z <- x + y
> z
50 a) 7 9 11 13 A
b) 7 9 11 13 14
c) 9 7 11 13
d) NULL

What will be the output of the following R code?

> x <- 1:4
> x > 2
51 a) 1 2 3 4 B
b) FALSE FALSE TRUE TRUE
c) 1 2 3 4 5
d) 5 4 3 1 2 1
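The vectorized operations used in the questions above all work element by element, with no explicit loop. A minimal sketch with the same x and y:

```r
x <- 1:4
y <- 6:9

print(x + y)   # 7 9 11 13
print(x - y)   # -5 -5 -5 -5
print(x / y)   # element-wise division, four results
print(x > 2)   # FALSE FALSE TRUE TRUE
```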

Point out the wrong statement?


a) Dates are represented by the Date class
52 C
b) Times are represented by the POSIXct or the POSIXlt class
c) Dates are represented by the DateTime class


d) Times can be coerced from a character string

What will be the output of the following R code?

> x <- 1:4
> y <- 6:9
> x/y
53 a) 0.1666667 0.2857143 0.4444444 B
b) 0.1666667 0.2857143 0.3750000 0.4444444
c) 0.2857143 0.3750000 0.4444444
d) Error

Which of the following code shows the internal representation
of a Date object?
a) class(as.Date("1970-01-02"))
b) unclass(as.Date("1970-01-02"))
54 c) unclassint(as.Date("1970-01-02")) B
d) classint(as.Date("1970-02-02"))

What will be the output of the following R code?

> x <- as.Date("1970-01-01")
> x
55 a) “1970-01-01” A
b) “1970-01-02”
c) “1970-02-01”
d) “1970-02-02”
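Dates are stored internally as the number of days since 1970-01-01, which unclass() makes visible. A minimal sketch using the dates from the questions above:

```r
x <- as.Date("1970-01-01")
print(class(x))                         # "Date"
print(format(x))                        # "1970-01-01"

print(unclass(x))                       # 0: the origin itself
print(unclass(as.Date("1970-01-02")))   # 1: one day after the origin
```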

R is an __________ programming language?
a) Closed source
b) GPL
56 C
c) Open source
d) Definite source


Who developed R?
a) Dennis Ritchie

57 b) John Chambers B
c) Bjarne Stroustrup

R was named partly after the first names of ____ R authors?


a) One
b) Two
58 B
c) Three
d) Four

Packages are useful in collecting sets into a _____ unit?


a) Single
59 b) Multiple A

Many quantitative analysts use R as their ____ tool?


a) Leading tool
60 b) Programming tool C

c) Both the above

Predictive analysis is the branch of __________ analysis?


a) Advanced
61 b) Core A

c) Both the above


___________ is used to make predictions about unknown
future events?
62 a) Descriptive analysis B
b) Predictive analysis
c) Both the above
How many steps does the predictive analysis process
contain?
a) 5
63 b) 6 C
c) 7
d) 8

Descriptive analysis tells about ________?


a) Past
64 b) Present A

c) Future

How many types of R objects are present in R data type?


a) 4
b) 5
65 C
c) 6
d) 7

How many types of data types are present in R?


a) 4
b) 5
66 C
c) 6
d) 7

67 Which of the following is a primary tool for debugging? A


a) debug()
b) trace()
c) browser()
d) None of the above

Which function is used to create the vector with more than one
element?
a) Library()
68 b) plot() C
c) c()
d) par()

In R every operation has a ______call?


a) System
69 b) Function B

c) None of the above

The ____________ in R is a vector.


a) Basic data structure
70 b) Basic datatypes A

c) Both

Vectors come in two parts_____ and _____.


a) Atomic vectors and matrix
71 b) Atomic vectors and array C

c) Atomic vectors and list

How many types of atomic vectors are present?


72 B
a) 3


b) 4
c) 5
d) 6

How many types of vertices functions are present?


a) 1
b) 2
73 B
c) 3
d) 4

_________and_________ are types of matrices functions?


a) Apply and sapply
74 b) Apply and lapply C

c) Both

R is an interpreted language, so it can be accessed
through _____________?
a) Disk operating system
75 b) User interface operating system D
c) Operating system
d) Command line interpreter



1. The branch of statistics which deals with development of particular statistical methods is classified
as
1. industry statistics
2. economic statistics
3. applied statistics
4. applied statistics
Answer: applied statistics

2. Which of the following is true about regression analysis?


1. answering yes/no questions about the data
2. estimating numerical characteristics of the data
3. modeling relationships within the data
4. describing associations within the data
Answer: modeling relationships within the data
3. Text Analytics is also referred to as Text Mining.
Answer: True

4. What is a hypothesis?
1. A statement that the researcher wants to test through the data collected in a study.
2. A research question the results will answer.
3. A theory that underpins the study.
4. A statistical method for calculating the extent to which the results could have happened by
chance.
Answer: A statement that the researcher wants to test through the data collected in a study.
5. What is the cyclical process of collecting and analysing data during a single research study called?
Answer: Interim analysis

6. The process of quantifying data is referred to as
Answer: Enumeration

7. An advantage of using computer programs for qualitative data is that they ___
1. Can reduce time required to analyse data (i.e., after the data are transcribed)
2. Help in storing and organising data
3. Make many procedures available that are rarely done by hand due to time constraints
4. All of the above
Answer: All of the Above

8. Boolean operators are words that are used to create logical combinations. 1. True 2. False
Answer: True
9. ______ are the basic building blocks of qualitative data.
Answer: Categories

10. This is the process of transforming qualitative research data from written interviews or field
notes into typed text.
1. Segmenting
2. Coding
3. Transcription
4. Mnemoning
Answer: Transcription
11. A challenge of qualitative data analysis is that it often includes data that are unwieldy and
complex; it is a major challenge to make sense of the large pool of data.
1. True
2. False
Answer: True
12. Hypothesis testing and estimation are both types of descriptive statistics.
1. True 2. False
Answer: False

14. A graph that uses vertical bars to represent data is called a ___
1. Line graph
2. Bar graph
3. Scatterplot
4. Vertical graph
Answer: Bar graph
15. ____ are used when you want to visually examine the relationship between two quantitative
variables.
1. Bar graph
2. pie graph
3. line graph
4. Scatterplot
Answer: Scatterplot

16. The denominator (bottom) of the z-score formula is
1. The standard deviation
2. The difference between a score and the mean
3. The range
4. The mean
Answer: The standard deviation
17. Which of these distributions is used for testing a hypothesis?
1. Normal Distribution
2. Chi-Squared Distribution
3. Gamma Distribution
4. Poisson Distribution
Answer: Chi-Squared Distribution
18. A statement made about a population for testing purpose is called?
1. Statistic
2. Hypothesis
3. Level of Significance
4. Test-Statistic
Answer: Hypothesis

19. A hypothesis that is assumed to be true and is tested for possible rejection is called?
1. Null Hypothesis
2. Statistical Hypothesis
3. Simple Hypothesis
4. Composite Hypothesis
Answer: Null Hypothesis
20. If the null hypothesis is false then which of the following is accepted?
1. Null Hypothesis
2. Positive Hypothesis
3. Negative Hypothesis
4. Alternative Hypothesis.
Answer: Alternative Hypothesis.
21. Alternative Hypothesis is also called as?
1. Composite hypothesis
2. Research Hypothesis
3. Simple Hypothesis
4. Null Hypothesis
Answer: Research Hypothesis

1. What is the minimum number of variables/features required to perform clustering?
1. 0
2. 1
3. 2
4. 3
Answer: 1
2. For two runs of K-Means clustering, is it expected to get the same clustering results? 1. Yes 2. No
Answer: No
3. Which of the following algorithms is most sensitive to outliers?
Answer: K-means clustering
4. The discrete variables and continuous variables are two types of
1. Open end classification
2. Time series classification
3. Qualitative classification
4. Quantitative classification
Answer: Quantitative classification
5. A Bayesian classifier is
1. A class of learning algorithm that tries to find an optimum classification of a set of examples using
the probabilistic theory.
2. Any mechanism employed by a learning system to constrain the search space of a hypothesis
3. An approach to the design of learning algorithms that is inspired by the fact that when people
encounter new situations, they often explain them by reference to familiar experiences, adapting the
explanations to fit the new si

Answer: A class of learning algorithm that tries to find an optimum classification of a set of examples
using the probabilistic theory.
6. Classification accuracy is
1. A subdivision of a set of examples into a number of classes
2. Measure of the accuracy, of the classification of a concept that is given by a certain theory
3. The task of assigning a classification to a set of examples
4. None of these
Answer: Measure of the accuracy, of the classification of a concept that is given by a certain theory
7. Euclidean distance measure is
1. A stage of the KDD process in which new data is added to the existing selection.
2. The process of finding a solution for a problem simply by enumerating all possible solutions
according to some pre-defined order and then testing them
3. The distance between two points as calculated using the Pythagoras theorem
4. none of above
Answer: The distance between two points as calculated using the Pythagoras theorem
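The Pythagoras-based definition can be written in one line of R. In the sketch below, the helper name euclid and the two sample points are hypothetical, chosen so that the result is the familiar 3-4-5 triangle.

```r
# Hypothetical helper: Euclidean distance between two numeric vectors
euclid <- function(p, q) {
  sqrt(sum((p - q)^2))   # square root of the summed squared differences
}

d <- euclid(c(0, 0), c(3, 4))
print(d)   # 5
```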
18. Point out the wrong statement.
1. k-nearest neighbor is same as k-means
2. k-means clustering is a method of vector quantization
3. k-means clustering aims to partition n observations into k clusters
4. none of the mentioned
Answer: k-nearest neighbor is same as k-means
19. Consider the following example “How we can divide set of articles such that those articles have
the same theme (we do not know the theme of the articles
Answer: Clustering
20. Can we use K Mean Clustering to identify the objects in video?
1. Yes 2. No
Answer: Yes
21. Clustering techniques are ____ in the sense that the data scientist does not determine, in advance, the
labels to apply to the clusters.
1. Unsupervised
2. Supervised
3. Reinforcement
4. Neural network
Answer: Unsupervised
22. The ____ metric is examined to determine a reasonably optimal value of k.
1. Mean Square Error
2. Within Sum of Squares (WSS)
3. Speed
Answer: Within Sum of Squares (WSS)
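One common way to examine WSS is to run k-means for several values of k and look for the point where the total within-cluster sum of squares stops dropping sharply. The sketch below uses made-up data with two well-separated groups; the data, the seed, and the range 1:5 are all assumptions for illustration.

```r
set.seed(1)  # k-means starts from random centers, so fix the seed

# Made-up data: two groups of 20 points around (0, 0) and (5, 5)
pts <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 5), ncol = 2))

# Total within-cluster sum of squares for k = 1..5
wss <- sapply(1:5, function(k) {
  kmeans(pts, centers = k, nstart = 5)$tot.withinss
})
print(wss)   # large drop from k = 1 to k = 2, small changes afterwards
```

The random starting centers are also why two runs of k-means are not guaranteed to give the same clustering unless the seed is fixed.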
23. If an itemset is considered frequent, then any subset of the frequent itemset must also be
frequent.
1. Apriori Property
2. Downward Closure Property
3. Either 1 or 2
4. Both 1 and 2
Answer: Both 1 and 2
24. If {bread,eggs,milk} has a support of 0.15 and {bread,eggs} also has a support of 0.15, the
confidence of the rule {bread,eggs} => {milk} is
1. 0
2. 1
3. 2
4. 3
Answer: 1
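The confidence of a rule X => Y is support(X and Y together) divided by support(X). A minimal sketch with the numbers from the question; the variable names are only illustrative.

```r
support_xy <- 0.15   # support of {bread, eggs, milk}
support_x  <- 0.15   # support of {bread, eggs}

confidence <- support_xy / support_x
print(confidence)    # 1
```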
25. Confidence is a measure of how X and Y are really related rather than coincidentally
happening together.
Answer: False
26. ____ recommend items based on similarity measures between users and/or items.
1. Content Based Systems
2. Hybrid System
3. Collaborative Filtering Systems
4. None of these
Answer: Collaborative Filtering Systems
27. There are ____ major classifications of Collaborative Filtering mechanisms.
1. 1
2. 2
3. 3
4. none of above
Answer: 2
28. Movie Recommendation to people is an example of
1. User Based Recommendation
2. Item Based Recommendation
3. Knowledge Based Recommendation
Answer: Item Based Recommendation

29. ____ recommenders rely on an explicitly defined set of recommendation rules.
1. Constraint Based
2. Case Based
3. Content Based
4. User Based
Answer: Constraint Based
30. Parallelized hybrid recommender systems operate dependently of one another and produce
separate recommendation lists. 1. True 2. False
Answer: False

The maximum value for entropy depends on the number of classes, so if we have 8 classes, what will be
the max entropy?
1. Max Entropy is 1
2. Max Entropy is 2
3. Max Entropy is 3
4. Max Entropy is 4
Answer: Max Entropy is 3
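Entropy is largest when all classes are equally likely, in which case it equals log2 of the number of classes. A minimal sketch for 8 classes; the variable names are only illustrative.

```r
k <- 8
p <- rep(1 / k, k)                # uniform distribution over 8 classes

max_entropy <- -sum(p * log2(p))  # Shannon entropy in bits
print(max_entropy)                # 3
print(log2(k))                    # 3, the same value
```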

5. High entropy means that the partitions in classification are
1. Pure
2. Not Pure
3. Useful
4. Useless
Answer: Not Pure
16. Which of the following statements about Naive Bayes is incorrect?
1. Attributes are equally important.
2. Attributes are statistically dependent of one another given the class value.
3. Attributes are statistically independent of one another given the class value.
Answer: Attributes are statistically dependent of one another given the class value.

Answer: The process of extracting implicit, previously unknown and potentially useful information
from data
11. Hidden knowledge refers to
1. A set of databases from different vendors, possibly using different database paradigms
2. An approach to a problem that is not guaranteed to work but performs well in most cases
3. Information that is hidden in a database and that cannot be recovered by a simple SQL query.
4. None of these
Answer: Information that is hidden in a database and that cannot be recovered by a simple SQL query.
12. Decision trees cannot handle categorical attributes with many distinct values, such as country
codes for telephone numbers.
1. True 2. False
Answer: False
15. Enrichment is
1. A stage of the KDD process in which new data is added to the existing selection
2. The process of finding a solution for a problem simply by enumerating all possible
solutions according to some pre-defined order and then testing them
3. The distance between two points as calculated using the Pythagoras theorem
4. None of these
Answer: A stage of the KDD process in which new data is added to the existing selection
