1. Decision support
2. Data mining
3. OLAP
4. All of the mentioned
Show Answer
All of the mentioned
1. Revenue
2. CRM
3. Sales
4. All of the mentioned
Show Answer
CRM (Customer relationship management)
1. Balanced Scorecard
2. Data Cube
3. Dashboard
4. All of the mentioned
Show Answer
Balanced Scorecard
1. Data staging
2. Data integration
3. ETL
4. None of the mentioned
Show Answer
Data staging
1. Data warehouse
2. MIS
3. EIS
4. All of the mentioned
Show Answer
EIS (Enterprise Information System)
8. Which of the following does not form part of BI Stack in SQL Server?
1. SSRS
2. SSIS
3. SSAS
4. OBIEE
Show Answer
OBIEE
1. Web services
2. customer-facing
3. client/server
4. personalization
Show Answer
personalization
12. This is the processing of data about customers and their relationship with the
enterprise in order to improve the enterprise’s future sales and service and lower
cost.
1. clickstream analysis
2. database marketing
3. customer relationship management
4. CRM analytics
Show Answer
CRM analytics
1. best practice
2. data mart
3. business information warehouse
4. business intelligence
Show Answer
business intelligence
1. database marketing
2. marketing encyclopedia
3. application integration
4. service oriented integration
Show Answer
database marketing
1. spend management
2. supplier relationship management
3. hosted CRM
4. Customer Information Control System
Show Answer
hosted CRM
1. BizTalk
2. BPML
3. e-biz
4. ebXML
Show Answer
BPML
17. This is a central point in an enterprise from which all customer contacts are
managed.
1. contact center
2. help system
3. multichannel marketing
4. call center
Show Answer
contact center
18. This is the practice of dividing a customer base into groups of individuals
that are similar in specific ways relevant to marketing, such as age, gender,
interests, spending habits, and so on.
1. customer service chat
2. customer managed relationship
3. customer life cycle
4. customer segmentation
Show Answer
customer segmentation
19. In data mining, this is a technique used to predict future behavior and
anticipate the consequences of change.
1. predictive technology
2. disaster recovery
3. phase change
4. predictive modeling
Show Answer
predictive modeling
1. Open source
2. Real-time
3. Java-based
4. Distributed computing approach
Show Answer
Real-time
1. Apple
2. Datamatics
3. Facebook
4. None of the mentioned
Show Answer
Facebook
1. Volume
2. velocity
3. Variety
4. All of the above
Show Answer
All of the above
24. ____ hides the limitations of Java behind a powerful and concise Clojure API
for Cascading.
1. Scalding
2. Cascalog
3. Hcatalog
4. Hcalding
Show Answer
Cascalog
1. MapReduce
2. HDFS
3. YARN
4. All of these
Show Answer
All of these
1. Open-Source
2. Scalability
3. Data Recovery
4. All the above
Show Answer
All the above
27. Define the Port Numbers for NameNode, Task Tracker and Job Tracker
1. NameNode
2. Task Tracker
3. Job Tracker
4. All of the above
Show Answer
All of the above
1. Project Prism
2. Prism
3. ProjectData
4. ProjectBid
Show Answer
Project Prism
1. Record
2. Event
3. Row
4. Log
Show Answer
Event
30. A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of
students from a college. Which of the following statement is true in the
following case
1. PCA
2. K-Means
3. None of the above
4. all of the above
Show Answer
PCA
32. What is the entropy of the target variable?
1. Yes
2. No
3. Can’t say
4. None of these
Show Answer
Yes
6. In which of the following cases will K-Means clustering fail to give good
results?
7. Which of the following is/are valid iterative strategy for treating missing
values before clustering analysis?
1. Elbow method
2. Manhattan method
3. Euclidean method
4. All of the above
Show Answer
Elbow method
13. Which of the following is/are not true about Centroid based K-Means
clustering algorithm and Distribution based expectation-maximization
clustering algorithm:
14. Which of the following is/are not true about DBSCAN clustering algorithm:
15. Which of the following are the high and low bounds for the existence of
F-Score?
1. [0,1]
2. (0,1)
3. [-1,1]
4. None of the above
Show Answer
[0,1]
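The bounds can be checked with a minimal sketch, assuming the usual F1 definition as the harmonic mean of precision and recall (the function name f1_score is illustrative and not taken from the quiz). Because precision and recall both lie in [0, 1], the score does too, and both endpoints are reachable.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; taken as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Lower bound: no true positives at all.
print(f1_score(0.0, 0.0))  # 0.0
# Upper bound: perfect precision and perfect recall.
print(f1_score(1.0, 1.0))  # 1.0
# An intermediate case stays strictly inside the interval.
print(f1_score(0.5, 1.0))
```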
16. All of the following increase the width of a confidence interval except:
1. The probability of failing to reject the null hypothesis, given the observed
results
2. The probability that the null hypothesis is true, given the observed results
3. The probability that the observed results are statistically significant, given that
the null hypothesis is true
4. The probability of observing results as extreme or more extreme than currently
observed, given that the null hypothesis is true
Show Answer
The probability of observing results as extreme or more extreme than currently
observed, given that the null hypothesis is true
18. Assume that the difference between the observed, paired sample values is
defined in the same manner and that the specified significance level is the same
for both hypothesis tests. Using the same data, the statement that “a
paired/dependent two sample t-test is equivalent to a one sample t-test on the
paired differences, resulting in the same test statistic, same p-value, and same
conclusion” is: Please select the best answer of those provided below.
1. Always True
2. Never True
3. Sometimes True
4. Not Enough Information
Show Answer
Always True
19. Green sea turtles have normally distributed weights, measured in kilograms,
with a mean of 134.5 and a variance of 49.0. A particular green sea turtle’s
weight has a z-score of -2.4. What is the weight of this green sea turtle? Round
to the nearest whole number.
1. 17 kg
2. 151 kg
3. 118 kg
4. 252 kg
Show Answer
118 kg
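The answer can be reproduced with a short sketch, assuming the variance in the question reads 49.0, so that the standard deviation is 7. It inverts the z-score formula z = (x - mean) / sd to recover the weight x.

```python
import math

mean_kg = 134.5
variance = 49.0   # assumed reading of the variance given in the question
z_score = -2.4

std_dev = math.sqrt(variance)          # standard deviation is the square root of the variance
weight = mean_kg + z_score * std_dev   # invert z = (x - mean) / sd

print(round(weight))  # 118
```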
1. 49%
2. 50%
3. 51%
4. Cannot Be Determined
Show Answer
Cannot Be Determined
21. The proportion of variation in 5k race times that can be explained by the
variation in the age of competitive male runners was approximately 0.663. What
is the value of the sample linear correlation coefficient? Round to 3 decimal
places.
1. 0.663
2. 0.814
3. -0.814
4. 0.440
Show Answer
-0.814
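A quick check of the magnitude, as a sketch: the stated proportion of explained variation is the coefficient of determination r squared, so |r| is its square root. The sign cannot be recovered from r squared alone. It follows the slope of the fitted regression line, which is not reproduced in this question, so the negative sign in the listed answer is taken from the source as given.

```python
import math

r_squared = 0.663  # proportion of variation explained, from the question

# Magnitude of the correlation coefficient, rounded to 3 decimal places.
magnitude = round(math.sqrt(r_squared), 3)

print(magnitude)  # 0.814
# The sign of r matches the sign of the regression slope (not shown in the question).
```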
22. Using all of the results provided, is it reasonable to predict the 5k race time
(minutes) of a competitive male runner 73 years of age?
1. Yes; linear correlation between age and 5k race times is statistically significant
2. Yes; both the sample linear regression equation and an age in years is
provided
3. No; linear correlation between age and 5k race times is not statistically
significant
4. No; the age provided is beyond the scope of our available sample data
Show Answer
No; linear correlation between age and 5k race times is not statistically significant
23. If an itemset is considered frequent, then any subset of the frequent itemset
must also be frequent.
1. Apriori Property
2. Downward Closure Property
3. Either 1 or 2
4. Both 1 and 2
Show Answer
Both 1 and 2
24. Algorithm is
25. Bias is
26. Classification is
1. This takes only two values. In general, these values will be 0 and 1, and they
can be coded as one bit
2. The natural environment of a certain species
3. Systems that can be used without knowledge of internal operations
4. None of these
Show Answer
This takes only two values. In general, these values will be 0 and 1, and they can be
coded as one bit
28. Cluster is
1. Complete
2. Consistent
3. Constant
4. None of these
Show Answer
Complete
1. Complete
2. Consistent
3. Constant
4. None of these
Show Answer
Consistent
Q10) What is valid for a parameter and a statistic associated with repeated
random samples of the same size?
1. x Values of a parameter will vary from sample to sample but values of a
statistic will not.
2. x Values of both a parameter and a statistic may vary from sample to sample.
3. x Values of a parameter will vary according to the sampling distribution for
that parameter.
4. ✓ Values of a statistic will vary according to the sampling distribution for that
statistic.
Q11) What is to be used when faced with the decision of how to arrange
furniture in a room?
1. x Mathematical model
2. x Mental model
3. x Physical model
4. ✓ Visual model
Q12) What are the frequencies of all specific values of x and y variables with
total calculated frequencies classified as?
1. x variate frequencies
2. x unconditional frequencies
3. x conditional frequencies
4. ✓ marginal frequencies
Q13) What will be the mean of data for a random variable x having
probabilities of (1+r)/3, (1+2r)/3 and (0.2+3r)/3 for values of 1,2
1. x 0.8
2. x 1.3
3. x 1.5
4. ✓ 1.8
Q14) What is the method in which a sample statistic is used to estimate the value
of population parameters classified as?
1. ✓ estimation
2. x valuation
3. x probability calculation
4. x limited theorem estimation
Q15) Type of cumulative frequency distribution in which class intervals are
added in top to bottom order is classified as
1. x variation distribution
less than type distribution
Q19) Which type of analytics uses statistical and machine learning techniques
1. x Decisive
2. x Descriptive
3. ✓ Predictive
4. x Prescriptive
Q20) Three-dimensional diagrams are so named because they consider
both
1. x length and breadth
2. x breadth and depth
3. ✓ depth, length and breadth
4. x depth and length
Q21) Which type of analytics supports human decisions with visual analytics
the user models to
1. ✓ Decisive
2. x Descriptive
3. x Predictive
4. x Prescriptive
Q22) What is the value of any sample statistic which is used to estimate
parameter
1. ✓ point estimate
2. x population estimate
3. x sample estimate
4. x parameter estimate
Q23) An MS-Excel function inside another function is called a ________
function.
1. ✓ Nested
2. x Round
3. x Sum
4. x Text
Q24) The essence of decision analysis is
1. x breaking down complex situations into manageable elements.
2. ✓ choosing the best course of action among alternatives.
3. x finding the root cause of why something has gone wrong.
Q25) Which type of analytics gains insight from historical data with reporting,
sca
1. x Decisive
2. ✓ Descriptive
3. x Predictive
4. x Prescriptive
Q26) Discrete variables and continuous variables are two types of
1. x open end classification
2. x time series classification
3. x qualitative classification
4. ✓ quantitative classification
Q27) Number of observations are 30 and value of arithmetic mean is 15 then
what is the
1. x 15
2. ✓ 450
3. x 200
Q28) What is P(A/B) if the probability of event A is 0.2 and the probability of
event B is 0.4
1. ✓ P(A) = 0.2
2. x P(A)/P(B) = 0.2/0.4 = 1/2
3. x P(A) * P(B) = (0.2)(0.4) = 0.08
4. x None of the above.
Q29) 3-D reference in a formula of MS-Excel
1. Can not be modified
2. Only appears on summary worksheets
3. Limits the formatting options
4. Spans worksheets
Q30) Considering types of diagrams, what are squares, circles and rectangles
classified as?
1. x cumulative diagram
2. x dispersion diagrams
3. x one dimension diagrams
4. ✓ two dimension diagram
Q34) What is the variance for the following discrete data [2 6 8 3 7 9 1 4]?
1. x 40.0
2. x 5.0
3. x 2.74
4. ✓ 7.5
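The arithmetic can be verified with a small sketch using Python's standard statistics module. It assumes the question intends the population variance (dividing by n), since that is the only definition that produces one of the listed options. The sample variance (dividing by n - 1) is shown for contrast.

```python
import statistics

data = [2, 6, 8, 3, 7, 9, 1, 4]

mean = statistics.mean(data)                   # 40 / 8 = 5
population_var = statistics.pvariance(data)    # sum of squared deviations / n
sample_var = statistics.variance(data)         # sum of squared deviations / (n - 1)
population_sd = statistics.pstdev(data)        # square root of the population variance

print(mean)                     # 5
print(population_var)           # 7.5
print(round(sample_var, 2))     # 8.57
print(round(population_sd, 2))  # 2.74
```

The value 2.74 that appears among the options is the population standard deviation, not the variance.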
Q35) Criteria of selecting point estimator must include information of
1. x consistency
2. x unbiasedness
3. x efficiency
4. ✓ all of above
Q36) What are upper and lower boundaries of interval of confidence classified
as?
1. x error biased limits
2. x marginal limits
3. x estimate limits
4. ✓ confidence limits
Q37) What will be the standard deviation for the process having an exponential
distributi
1. x 0.4
2. x 5.0
3. x 12.5
4. ✓ 25.0
Q38) Which type of analytics recommends decisions using optimization,
simulation etc.
1. x Decisive
2. x Descriptive
3. x Predictive
4. ✓ Prescriptive
Q39) What is the median for the following data: [2 4 3 6 1 8 9 2 5 7]
1. x 2.0
2. x 4.7
3. ✓ 4.5
4. x 10.0
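A sketch of the check, assuming the garbled data list reads 2 4 3 6 1 8 9 2 5 7 (ten values). That reading is an inference: it is the one under which three of the listed options come out as the mode, the mean and the median respectively.

```python
import statistics

# Assumed reconstruction of the data list in the question.
data = [2, 4, 3, 6, 1, 8, 9, 2, 5, 7]

# With an even number of values, the median is the average of the two middle
# values of the sorted list: (4 + 5) / 2.
median = statistics.median(data)
mean = statistics.mean(data)
mode = statistics.mode(data)

print(sorted(data))  # [1, 2, 2, 3, 4, 5, 6, 7, 8, 9]
print(median)        # 4.5
print(mean)          # 4.7
print(mode)          # 2
```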
Q40) If vertical lines are drawn at every point of straight line in frequency
1. x width diagram
2. x length diagram
3. ✓ histogram
4. x dimensional bar charts
Q41) Which statement is valid if a fair coin is flipped 10 times
1. x The number of heads will equal the number of tails.
2. x The probability of all heads is greater than the probability of all tails.
3. ✓ The probability of HHHHHHHHHH = the probability of HTHTHTHTHT.
4. x The probability of HHHHHHHHHH < the probability of HTHTHTHTHT.
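Why any two specific sequences are equally likely can be shown with a minimal sketch, assuming independent flips of a fair coin. The helper sequence_probability is a hypothetical name introduced for illustration. It multiplies the per-flip probabilities, so every fixed sequence of 10 flips has probability (1/2)^10.

```python
from fractions import Fraction


def sequence_probability(sequence: str, p_heads: Fraction = Fraction(1, 2)) -> Fraction:
    """Probability of one exact sequence of independent flips ('H' or 'T')."""
    probability = Fraction(1)
    for flip in sequence:
        probability *= p_heads if flip == "H" else 1 - p_heads
    return probability


all_heads = sequence_probability("HHHHHHHHHH")
alternating = sequence_probability("HTHTHTHTHT")
all_tails = sequence_probability("TTTTTTTTTT")

print(all_heads)                 # 1/1024
print(all_heads == alternating)  # True
print(all_heads == all_tails)    # True
```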
Q42) Types of histograms include
1. x deviation bar charts
2. x paired bar charts
3. x grouped charts
4. ✓ all of above
View Answer
Ans : C
View Answer
Ans : D
View Answer
Ans : A
Ans : D
View Answer
Ans : C
View Answer
Ans : B
View Answer
Ans : D
View Answer
Ans : A
View Answer
Ans : A
View Answer
Ans : B
B. Data Warehousing
C. Web Mining
D. Text Mining
Discussion
B. Data Warehousing
2. The data Warehouse is __________.
A. Read only
B. Write only
D. None
Discussion
A. Read only
3. Expansion for DSS in DW is _________.
A. Decision Support system
Discussion
B. time-variant
C. integrate
B. 3-4years
C. 5-6 years
D. 5-10 years
Discussion
D. 5-10 years
6. The data is stored, retrieved & updated
in___________.
A. OLAP
B. OLTP
C. SMTP
D. FTP
Discussion
B. OLTP
7. __________describes the data contained in the
data warehouse.
A. Relational data
B. Operational data
C. Metadata
D. Informational data
Discussion
C. Metadata
8. __________predicts future trends & behaviors,
allowing business managers to make proactive,
knowledge-driven decisions.
A. Data warehouse
B. Data mining
C. Data marts
D. Metadata
Discussion
B. Data mining
Discussion
B. DBZ
C. Informix
D. Redbrick
Discussion
D. Redbrick
11. ___________ defines the structure of the data
held in operational databases and used
by operational applications.
A. User-level metadata
C. Operational metadata
Discussion
C. Operational metadata
12. _________is held in the catalog of the
warehouse database system.
A. Application level metadata
Discussion
Discussion
A. Application level metadata
14. ____________ consists of formal definitions,
such as a COBOL layout or a database schema.
A. Classical metadata
B. Transformation metadata
C. Historical metadata
D. Structural metadata
Discussion
A. Classical metadata
15. ________consists of information in the
enterprise that is not in classical form.
A. Mushy metadata
B. Differential metadata
C. Data warehouse
D. Data mining
Discussion
A. Mushy metadata
16. ________ Databases are owned by particular
departments or business groups.
A. Informational
B. Operational
D. Flat
Discussion
B. Operational
17. The star schema is composed of __________
fact table.
A. one
B. Two
C. Three
D. four
Discussion
A. one
18. The time horizon in operational environment
is ________.
A. 30-60 days
B. 60-90 days
C. 90-120 days
D. 120-150 days
Discussion
B. 60-90 days
19. The key used in operational environment may
not have an element of _________.
A. Time
B. Cost
C. Frequency
D. Quality
Discussion
A. Time
20. Data can be updated in ______ environment.
A. Data warehouse
B. Data mining
C. Operational
D. Informational
Discussion
C. Operational
B. Files
C. RDBMS
D. data warehouse
Discussion
D. data warehouse
22. The source of all data warehouse data is
the__________.
A. Operational environment
B. Informal environment
C. Formal environment
D. Technology environment
Discussion
A. Operational environment
B. Informational
C. Summary
D. Denormalized
Discussion
C. Summary
24. The modern CASE tools belong
to_________category.
A. Analysis
B. Development
C. Coding
D. Delivery
Discussion
A. Analysis
B. 20 percent
C. 30 percent
D. 40 percent
Discussion
D. 40 percent
C. Atomic data
D. Multiatomic data
Discussion
C. Atomic data
B. MICRO
C. MACRO
D. ACID
Discussion
D. ACID
28. ______is a good alternative to the star
schema.
A. Star schema
B. Snowflake schema
C. Fact constellation
D. Star-snowflake schema
Discussion
C. Fact constellation
29. The biggest drawback of the level indicator
in the classic star-schema is
that it limits _________.
A. Quantify
B. Qualify
C. Flexibility
D. Ability
Discussion
C. Flexibility
Discussion
B. Used to run the business in real time and is based on current data
Discussion
B. Used to run the business in real time and is based on current data
32. The generic two-level data warehouse
architecture includes________.
A. At least one data mart
Discussion
Discussion
D. Data that has been selected and formatted for end-user support
applications
Discussion
C. Data that are never altered or deleted once they have been adde
D. Data that are never deleted once they have been added
Discussion
Discussion
B. A process to load the data in the data warehouse and to create the
necessary indexes
B. A process to load the data in the data warehouse and to create the
necessary indexes
Discussion
B. A process to load the data in the data warehouse and to create the
necessary indexes
39. Data transformation includes___________.
A. A process to change data from a detailed level to a summary level
Discussion
Discussion
B. One-to-one
C. One-to-many
D. Many-to-one
Discussion
C. One-to-many
B. Partially denormalized
C. Completely normalized
D. Partially normalized
Discussion
C. Completely normalized
43. ____________is the goal of data mining.
A. To explain some observed event or condition
B. Data Mining
Discussion
B. Query optimization
C. Security management
Discussion
Discussion
A. Query able change data
47. ___________are responsible for running
queries and reports against data warehouse
tables.
A. Hardware
B. Software
C. End users
D. Middle ware
Discussion
C. End users
B. Information delivery
C. Information exchange
D. Communication
Discussion
A. Data acquisition
49. Classification rules are extracted
from____________.
A. Root node
B. Decision tree
C. Siblings
D. Branches
Discussion
B. Decision tree
50. Dimensionality reduction reduces the data
set size by removing_____________.
A. Relevant attributes
B. Irrelevant attributes
C. Derived attributes
D. Composite attributes
Discussion
B. Irrelevant attributes
B. OLAP
C. COBWEB
D. STING
Discussion
C. COBWEB
52. Effect of one attribute value on a given class
is independent of values of other attribute
is called ____________.
A. Value independence
C. Conditional independence
D. Unconditional independence
Discussion
A. Value independence
53. The main organizational justification for
implementing a data warehouse is
to provide __________.
A. Cheaper ways of handling transportation
B. Decision support
D. Access to data
Discussion
B. DBMS
C. EXTENDED RDBMS
D. EXTENDED DBMS
Discussion
B. DBMS
B. RDBMS
C. Sybase
D. SQL Server
Discussion
B. RDBMS
56. Source data from the warehouse comes
from__________.
A. ODS
B. TDS
C. MDDB
D. ORDBMS
Discussion
A. ODS
57. ___________is a data transformation process.
A. Comparison
B. Projection
C. Selection
D. Filtering
Discussion
D. Filtering
B. Generalization
C. Personalization
D. Summarization
Discussion
C. Personalization
59. SMP stands for________.
A. Symmetric Multiprocessor
B. Symmetric Multiprogramming
C. Symmetric Metaprogramming
D. Symmetric Microprogramming
Discussion
A. Symmetric Multiprocessor
B. Relational database
C. Multidimensional database
D. Data repository
Discussion
C. Multidimensional database
61. __________ are designed to overcome any
limitations placed on the warehouse by the
nature of the relational data model.
A. Operational database
B. Relational database
C. Multidimensional database
D. Data repository
Discussion
C. Multidimensional database
62. MDDB stands for _______________.
A. Multiple data doubling
B. Multidimensional databases
D. Multi-dimension doubling
Discussion
B. Multidimensional databases
B. Microdata
C. Minidata
D. Multidata
Discussion
A. Metadata
64. _____________is an important functional
component of the metadata.
A. Digital directory
B. Repository
C. Information directory
D. Data dictionary
Discussion
C. Information directory
Discussion
B. ODS data
C. Statistical data
D. Historical data
Discussion
A. MRI scan
B. Oracle
C. Sybase
D. SQL Server
Discussion
A. Visual Basic
68. The term that is not associated with data
cleaning process is____________.
A. Domain consistency
B. Deduplication
C. Disambiguation
D. Segmentation
Discussion
D. Segmentation
C. HOLAP
D. MOLAP
Discussion
A. Metacube, Informix
70. Capability of data mining is to build
__________models.
A. Retrospective
B. Interrogative
C. Predictive
D. Imperative
Discussion
C. Predictive
71. ________is a process of determining the
preference of customer's majority.
A. Association
B. Preferencing
C. Segmentation
D. Classification
Discussion
B. Preferencing
72. Strategic value of data mining is_______.
A. Cost-sensitive
B. Work-sensitive
C. Time-sensitive
D. Technical-sensitive
Discussion
C. Time-sensitive
73. ___________proposed the approach for data
integration issues.
A. Ralph Campbell
B. Ralph Kimball
C. John Raphlin
D. James Gosling
Discussion
B. Ralph Kimball
74. The terms equality and roll up are associated
with__________.
A. OLAP
B. Visualization
C. Data mart
D. Decision tree
Discussion
C. Data mart
75. Exceptional reporting in data warehousing is
otherwise called as__________.
A. Exception
B. Alerts
C. Errors
D. Bugs
Discussion
B. Alerts
B. CORBA
C. STUNT
D. COBWEB
Discussion
B. Study
C. Design
D. Information collection
Discussion
D. Information collection
78. The full form of KDD is__________.
A. Knowledge database
Discussion
B. 1997
C. 1995
D. 1994
Discussion
C. 1995
B. Data cleaning
C. Data cleansing
D. Data pruning
Discussion
B. Data cleaning
81. __________ contains information that gives
users an easy-to-understand perspective of
the information stored in the data warehouse.
A. Business metadata
B. Technical metadata
C. Operational metadata
D. Financial metadata
Discussion
A. Business metadata
82. _________helps to integrate, maintain and
view the contents of the data warehousing system.
A. Business directory
B. Information directory
C. Data dictionary
D. Database
Discussion
B. Information directory
B. Visualization
C. Correction
D. Association
Discussion
D. Association
84. Data marts that incorporate data mining tools
to extract sets of data are called_________.
A. Independent data mart
Discussion
C. Self-learning system
D. Productivity system
Discussion
D. Productivity system
B. Speed
C. Accuracy
D. Simplicity
Discussion
C. Accuracy
87. Building the informational database is done
with the help of ___________.
A. Transformation or propagation tools
D. Extraction tools
Discussion
B. Three
C. Four
D. Five
Discussion
D. Five
89. Which of the following is not a component of
a data warehouse?
A. Metadata
D. Component Key
Discussion
D. Component Key
90. ___________ is data that is distilled from the
low level of detail found at the current
detailed level.
A. Highly summarized data
C. Metadata
D. Compact
Discussion
C. Metadata
Discussion
C. Metadata
93. Metadata contains at least__________.
A. The structure of the data
Discussion
D. All of the above
94. Which of the following is not an old detail
storage medium?
A. Photo optical storage
B. RAID
C. Microfiche
D. Pen drive
Discussion
D. Pen drive
95. The data from the operational environment
enter_____________of data warehouse.
A. Current detail data
Discussion
B. Summarization
C. Archive
Discussion
B. Facts
C. Keys
D. Units of measures
Discussion
B. Facts
98. The granularity of the fact is the
___________ of detail at which it is recorded.
A. Transformation
B. Summarization
C. Level
D. Tr
Discussion
C. Level
99. Which of the following is not a primary grain
in analytical modeling.
A. Transaction
B. Periodic snapshot
C. Accumulating snapshot
Discussion
B. Periodic snapshot
100. Granularity is determined by___________.
A. Number of parts to a key
C. Both A and B
Discussion
C. Both A and B
B. Granularity
C. Functional dependency
D. Dimensionality
Discussion
C. Functional dependency
102. A fact is said to be fully additive
if_________.
A. It is additive over every dimension of its dimensionality
Discussion
Discussion
Discussion
B. Non-additive measures
C. Partially additive
Discussion
A. Additive measures
106. A fact representing cumulative sales units
over a day at a store for a product is a_________.
A. Additive fact
D. Non-additive fact
Discussion
C. Deductive learning
Discussion
B. Regression
C. Summarization
D. Association rules
Discussion
B. Regression
109. Which of the following is a descriptive
model?
A. Classification
B. Regression
C. Sequence discovery
D. Association rules
Discussion
C. Sequence discovery
B. Predictive
C. Regression
Discussion
A. Descriptive
111. A predictive model makes use of______.
A. Current data.
B. Historical data.
D. Assumptions
Discussion
B. Historical data.
C. Prediction
D. Classification
Discussion
D. Classification
C. Prediction
D. Classification
Discussion
C. Sequence discovery
D. Prediction
Discussion
B. Summarization
C. Clustering
D. Prediction
Discussion
C. Clustering
116. Link Analysis is otherwise called as ____.
A. Affinity analysis
B. Association rules
C. Both A & B
D. Prediction
Discussion
C. Both A & B
117. ______ is the input to KDD.
A. Data
B. Information
C. Query
D. Process
Discussion
A. Data
B. Information
C. Query
D. Useful information
Discussion
D. Useful information
B. Four
C. Five
D. Six
Discussion
C. Five
120. Treating incorrect or missing data is called
as________.
A. Selection
B. Preprocessing
C. Transformation
D. Interpretation
Discussion
B. Preprocessing
B. Preprocessing
C. Transformation
D. Interpretation
Discussion
C. Transformation
122. Various visualization techniques are used
in_________step of KDD.
A. Selection
B. Transformation
C. Data mining
D. Interpretation
Discussion
D. Interpretation
B. Rare values
C. Dimensionality reduction
Discussion
A. Outliers
124. Box plot and scatter diagram techniques
are_________.
A. Graphical
B. Geometric
C. Icon-based
D. Pixel-based
Discussion
B. Geometric
125. _____ is used to proceed from very specific
knowledge to more general information.
A. Induction
B. Compression
C. Approximation
D. Substitution
Discussion
A. Induction
B. Compression
C. Approximation
D. Summarization
Discussion
B. Compression
127. ______ helps to uncover hidden information
about the data.
A. Induction
B. Compression
C. Approximation
D. Summarization
Discussion
C. Approximation
128. ______ are needed to identify training data
and desired results.
A. Programmers
B. Designers
C. Users
D. Administrators
Discussion
C. Users
Discussion
Discussion
B. Noisy data
C. Outliers
D. Missing data
Discussion
B. Noisy data
B. Return on Information
C. Repetition of Information
D. Runtime of Instruction
Discussion
A. Return on Investment
133. The ______ of data could result in the
disclosure of information that is deemed to
be confidential.
A. Authorized use
B. Unauthorized use
C. Authenticated use
D. Unauthenticated use
Discussion
B. Unauthorized use
134. _________data are noisy and have many
missing attribute values.
A. Preprocessed
B. Cleaned
C. Real-world
D. Tr
Discussion
D. D Tr
135. The rise of DBMS occurred in early _______.
A. 1950's
B. 1960's
C. 1970's
D. 1980's
Discussion
C. 1970's
Discussion
B. Time complexity
C. ROI
Discussion
B. Dimensionality reduction
C. Cleaning
D. Over fitting
Discussion
B. Dimensionality reduction
139. Data that are not of interest to the data
mining task is called as _____.
A. Missing data
B. Changing data
C. Irrelevant data
D. Noisy data
Discussion
C. Irrelevant data
140. _________are effective tools to attack the
scalability problem.
A. Sampling
B. Parallelization
C. Both A & B
Discussion
C. Both A & B
141. Market-basket problem was formulated
by____________.
A. Agrawal et al
B. Steve et al
C. Toda et al
D. Simon et al
Discussion
A. Agrawal et al
142. Data mining helps in________.
A. Inventory management
C. Marketing strategies
Discussion
B. Support
C. Support count
Discussion
B. Support
144. The absolute number of transactions
supporting X in T is called _______.
A. Confidence
B. Support
C. Support count
Discussion
C. Support count
B. Support
C. Support count
Discussion
A. Confidence
146. If T consists of 500000 transactions, 20000
transactions contain bread, 30000 transactions
contain jam, and 10000 transactions contain both
bread and jam, then the support of bread and jam
is _________.
A. 2%
B. 20%
C. 3%
D. 30%
Discussion
A. 2%
147. If T consists of 500000 transactions, 20000
transactions contain bread, 30000 transactions
contain jam, and 10000 transactions contain both
bread and jam, then the confidence of buying
bread with jam is ____________.
A. 33.33%
B. 66.66%
C. 45%
D. 50%
Discussion
D. 50%
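The two answers above can be reproduced with a short sketch, assuming the standard definitions: support is the fraction of all transactions containing the itemset, and the confidence of a rule X -> Y is support(X and Y) divided by support(X). The variable names are illustrative. The listed 50% corresponds to the rule bread -> jam; the reverse rule is computed for comparison.

```python
total_transactions = 500_000
bread_count = 20_000
jam_count = 30_000
both_count = 10_000

# Support of {bread, jam}: share of all transactions containing both items.
support_both = both_count / total_transactions

# Confidence of bread -> jam: among bread transactions, the share that also contain jam.
confidence_bread_to_jam = both_count / bread_count

# Confidence of jam -> bread: among jam transactions, the share that also contain bread.
confidence_jam_to_bread = both_count / jam_count

print(f"{support_both:.0%}")             # 2%
print(f"{confidence_bread_to_jam:.0%}")  # 50%
print(f"{confidence_jam_to_bread:.2%}")  # 33.33%
```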
148. The left hand side of an association rule
is called________.
A. Consequent
B. Onset
C. Antecedent
D. Precedent
Discussion
C. Antecedent
149. The right hand side of an association rule
is called__________.
A. Consequent
B. Onset
C. Antecedent
D. Precedent
Discussion
A. Consequent
C. To be efficient in computing
Discussion
B. Frequent set
D. Lattice
Discussion
B. Frequent set
B. Border set
C. Lattice
D. Infrequent sets
Discussion
D. Border set
Discussion
B. Border set
Discussion
B. Border set
Discussion
B. Border set
156. A priori algorithm is otherwise called
as_________
A. Width-wise algorithm
B. Level-wise algorithm
C. Pincer-search algorithm
D. FP growth algorithm
Discussion
B. Level-wise algorithm
157. The A Priori algorithm is a____________
A. Top-down search
D. Bottom-up search
Discussion
D. Bottom-up search
B. Itemset generation
C. Pruning
D. Partitioning
Discussion
A. Candidate generation
159. The second phase of A Priori algorithm
is____________
A. Candidate generation
B. Itemset generation
C. Pruning
D. Partitioning
Discussion
C. Pruning
160. The ________ step eliminates the extensions of
(k-1)-itemsets which are not found to be frequent,
from being considered for counting support.
A. Candidate generation
B. Pruning
C. Partitioning
D. Itemset eliminations
Discussion
B. Pruning
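The candidate-generation and pruning phases named in questions 158 to 160 can be sketched as follows. This is a simplified illustration, not a full Apriori implementation: the function name apriori_gen and the toy itemsets are assumptions for the example, and support counting against the database is left out. The join step builds k-itemsets from pairs of frequent (k-1)-itemsets. The prune step discards any candidate that has an infrequent (k-1)-subset, which is the downward closure property at work.

```python
from itertools import combinations


def apriori_gen(frequent_prev: set, k: int) -> set:
    """Return candidate k-itemsets built from the frequent (k-1)-itemsets."""
    candidates = set()
    for a in frequent_prev:
        for b in frequent_prev:
            union = a | b
            if len(union) != k:
                continue  # join step: keep only unions that are exactly k items
            # Prune step: every (k-1)-subset of the candidate must itself be frequent.
            if all(frozenset(sub) in frequent_prev for sub in combinations(union, k - 1)):
                candidates.add(union)
    return candidates


# Toy frequent 2-itemsets (hypothetical data).
frequent_2 = {frozenset("ab"), frozenset("ac"), frozenset("bc"), frozenset("bd")}

candidates_3 = apriori_gen(frequent_2, 3)
# {a, b, d} is pruned because {a, d} is not frequent;
# {b, c, d} is pruned because {c, d} is not frequent.
print(sorted(sorted(c) for c in candidates_3))  # [['a', 'b', 'c']]
```

Only the surviving candidates would then have their support counted in the next database scan.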
161. The a priori frequent itemset discovery
algorithm moves in the lattice
A. Upward
B. Downward
C. Breadthwise
Discussion
A. Upward
162. After the pruning of a priori
algorithm,__________will remain
A. Only candidate set
B. No candidate set
D. No border set
Discussion
B. No candidate set
163. The number of iterations in a priori
A. Increases with the size of the maximum frequent set
Discussion
Discussion
C. Toda et al
D. Simon et al
Discussion
A. Bin et al
B. Circle
C. Box
D. Solid
Discussion
A. Dashed
167. The itemsets in the_________category
structures are not subjected to any counting
A. Dashes
B. Box
C. Solid
D. Circle
Discussion
C. Solid
168. Certain itemsets in the dashed circle whose
support count reaches the support value during an
iteration move into the ______________
A. Dashed box
B. Solid circle
C. Solid box
Discussion
A. Dashed box
B. Solid circle
C. Solid box
D. Dashed circle
Discussion
D. Dashed circle
B. Solid circle
C. Solid box
Discussion
B. Solid circle
171. The FP-growth algorithm has ________ phases
A. One
B. Two
C. Three
D. Four
Discussion
B. Two
B. A frequent-item-header table
C. A frequent-item-node
D. Both A & B
Discussion
D. Both A & B
173. The non-root node of item-prefix-tree
consists of fields
A. Two
B. Three
C. Four
D. Five
Discussion
B. Three
174. The frequent-item-header-table consists of
fields
A. Only one.
B. Two.
C. Three.
D. Four
Discussion
B. Two.
175. The paths from root node to the nodes
labelled 'a' are called_________
A. Transformed prefix path
B. Suffix subpath
D. Prefix subpath
Discussion
D. Prefix subpath
B. FP-tree
D. Prefix path
Discussion
B. Classification
C. Clustering
D. Genetic Algorithm
Discussion
C. Clustering
178. Which of the following is a clustering
algorithm?
A. A priori
B. CLARA
C. Pincer-Search
D. FP-growth
Discussion
B. CLARA
B. Divisive
C. Partition
D. Numeric
Discussion
A. Agglomerative
B. Divisive.
C. Partition.
D. Numeric
Discussion
B. Divisive.
181. Which of the following is a data set in the
popular UCI machine-learning repository?
A. CLARA.
B. CACTUS.
C. STIRR.
D. MUSHROOM
Discussion
D. MUSHROOM
B. K-means
C. Stirr
D. Rock
Discussion
B. K-means
183. In ________, each cluster is represented by one of the
objects of the cluster located near the center
A. K-medoid
B. K-means
C. Stirr
D. Rock
Discussion
A. K-medoid
184. Pick out a k-medoid algorithm
A. DBSCAN
B. BIRCH
C. PAM
D. CURE
Discussion
C. PAM
185. Pick out a hierarchical clustering
algorithm
A. DBSCAN
B. BIRCH
C. PAM
D. CURE
Discussion
A. DBSCAN
Discussion
B. Hierarchical algorithm
C. Hierarchical-agglomerative algorithm
D. Divisive
Discussion
C. Hierarchical-agglomerative algorithm
188. The cluster features of different
subclusters are maintained in a tree
called _________
A. CF tree
B. FP tree
C. FP growth tree
D. B tree
Discussion
A. CF tree
189. The _______ algorithm is based on the
observation that the frequent sets are normally
very few in number compared to the set of all
itemsets
A. A priori
B. Clustering
C. Association rule
D. Partition
Discussion
D. Partition
190. The partition algorithm uses ___ scans of the
database to discover all frequent sets
A. Two
B. Four
C. Six
D. Eight
Discussion
A. Two
191. The basic idea of the Apriori algorithm is
to generate _____ item sets of a particular size
and scan the database
A. Candidate
B. Primary
C. Secondary
D. Superkey
Discussion
A. Candidate
B. Partition algorithm
C. Distributed algorithm
D. Pincer-search algorithm
Discussion
A. Apriori algorithm
193. An algorithm called ________ is used to
generate the candidate item sets for each pass
after the first
A. Apriori
B. Apriori-gen
C. Sampling
D. Partition
Discussion
B. Apriori-gen
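As a rough sketch of what candidate generation involves, the code below builds candidate k-itemsets from the frequent (k-1)-itemsets with a join step followed by a prune step. The join here is a simplified union-based variant rather than the textbook prefix join, and the function name and sample itemsets are hypothetical.

```python
from itertools import combinations

def apriori_gen(frequent_prev):
    """Build candidate k-itemsets from frequent (k-1)-itemsets (join, then prune)."""
    prev = {frozenset(s) for s in frequent_prev}
    if not prev:
        return set()
    k = len(next(iter(prev))) + 1
    # Join step: union pairs of (k-1)-itemsets that differ in exactly one item.
    joined = {a | b for a in prev for b in prev if len(a | b) == k}
    # Prune step: drop any candidate that has an infrequent (k-1)-subset.
    return {
        c for c in joined
        if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
    }

# Hypothetical frequent 2-itemsets.
frequent_2 = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"b", "d"}]
candidates_3 = apriori_gen(frequent_2)
print(candidates_3)
```

Only {a, b, c} survives: {a, b, d} and {b, c, d} are pruned because {a, d} and {c, d} are not frequent, so their support never has to be counted.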
B. Two
C. Three
D. Four
Discussion
B. Two
195. ___ and prediction may be viewed as types of
classification
A. Decision.
B. Verification.
C. Estimation.
D. Illustration
Discussion
C. Estimation.
B. Prediction.
C. Identification.
D. Clarification
Discussion
B. Prediction.
197. Prediction can be viewed as forecasting a ___
value
A. Non-continuous.
B. Constant.
C. Continuous.
D. variable
Discussion
C. Continuous.
B. Measuring.
C. Non-training.
D. Training
Discussion
B. Measuring.
B. While
C. Do while
D. Switch
Discussion
A. If-then
200. ___ are a different paradigm for computing, which
draws its inspiration from neuroscience
A. Computer networks
B. Neural networks
C. Mobile networks
D. Artificial networks
Discussion
B. Neural networks
Q1) Which type of analytics gains insight from historical data with reporting,
scorecards, clustering
1. Decisive
2. Descriptive
3. x Predictive
4. Prescriptive
Q2) What is to be used when faced with the decision of how to arrange
furniture in a room
1. x Mathematical model
2. x Mental model
3. * Physical model
4. Visual model
Q3) Which type of analytics supports human decisions with visual analytics
the user models to reflect re
1. ► Decisive
2. x Descriptive
Q4) What is the characteristic of best models
1. accurately reflect relevant characteristics of the real-world object or decision.
2. x are mathematical models.
3. x replicate all aspects of the real-world object or decision.
4. x replicate the characteristics of a component in isolation from the rest of the
system.
Q5) Which type of analytics uses statistical and machine learning techniques
1. x Decisive
2. x Descriptive
3. Predictive
4. Prescriptive
Q6) The essence of decision analysis is
1. x breaking down complex situations into manageable elements.
2. choosing the best course of action among alternatives,
4. ___ and ___ are the key to emerging Business Intelligence technologies.
Ans: Data warehouse and data mining
5. Data mining is also called ___.
Ans: Knowledge discovery
13. ___ system is customer-oriented and is used for transaction and query
processing by clerks, clients, and information technology professionals.
Ans: OLTP
15. In ___ schema some dimension tables are normalized, thereby further
splitting the data into additional tables.
Ans: Snowflake
16. The ___ data model is commonly used in the design of relational
databases.
Ans: Entity-relationship
17. Data warehouses and OLAP tools are based on ___ data model.
Ans: Multidimensional
18. The ___ exposes the information being captured, stored, and managed by
operational systems.
Ans: Data source view
19. ___ are the intermediate servers that stand in between a relational back-end
server and client front-end tools.
Ans: Relational OLAP (ROLAP) servers
21. The ___ software gives the user the opportunity to look at the data from a
variety of different dimensions.
Ans: Multidimensional Analysis
23. Based on the overall requirements of business intelligence, the ___ layer
is required to extract, cleanse and transform data into load files for the
information warehouse.
Ans: Data integration
28. ___ is used to refer to systems and technologies that provide the business
with the means for decision-makers to extract personalized meaningful
information about their business and industry.
Ans: Business Intelligence
29. In ___ each value in a bin is replaced by the mean value of the bin.
Ans: Smoothing by bin means
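Smoothing by bin means can be illustrated with a short sketch: sort the values, split them into equal-frequency bins, and replace every value with the mean of its bin. The function name, bin size, and sample data are assumptions chosen for illustration.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort values, split into equal-frequency bins, replace each value by its bin mean."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        mean = sum(bin_values) / len(bin_values)
        smoothed.extend([mean] * len(bin_values))
    return smoothed

# Hypothetical sorted price data split into three bins of three values each.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
result = smooth_by_bin_means(prices, bin_size=3)
print(result)
```

The bins (4, 8, 15), (21, 21, 24), and (25, 28, 34) have means 9, 22, and 29, so each value is replaced by one of those three numbers.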
30. ___ regression involves finding the “best” line to fit two variables so that
one variable can be used to predict the other.
Ans: Linear
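Finding the "best" line is usually done by least squares. The sketch below computes the slope and intercept in closed form for two variables; the function name and the toy experience/salary numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
years = [1, 2, 3, 4]
salary = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(years, salary)
predicted = slope * 5 + intercept  # use one variable to predict the other
print(slope, intercept, predicted)
```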
31. ___ works to remove the noise from the data that includes techniques like
binning, clustering, and regression.
Ans: Smoothing
33. The ___ technique uses encoding mechanisms to reduce the data set size.
Ans: Data compression
35. ___ hierarchies can be used to reduce the data by collecting and replacing
low-level concepts by higher-level concepts.
Ans: Concept
36. The ___ rule can be used to segment numeric data into relatively uniform,
“natural” intervals.
Ans: 3-4-5
48. The ___ step eliminates the extensions of (k-1)-itemsets, which are not
found to be frequent, from being considered for counting support.
Ans: Pruning
49. In the first phase of the Partition algorithm, the algorithm logically divides
the database into a number of ___.
Ans: Non-overlapping partitions.
51. ___ algorithm works like a train running over the data, with stops at
intervals M between transactions. When the train reaches the end of the
transaction file it completes one path.
Ans: DIC Algorithm
52. FP–Tree Growth Algorithm can be implemented in ___ Phases.
Ans: Two
54. Data mining systems should provide capabilities to mine association rules
at multiple levels of abstraction and traverse easily among different
abstraction spaces (True/False).
Ans: True
55. Which one of the following is an alternative search strategy for mining
multiple-level associations with reduced support?
a) Level-by-level independent
b) Level-cross filtering by single item
c) Level-cross filtering by k-itemset
d) All the above
Ans: d) All the above
57. Association rules that involve two or more dimension or predicates can be
referred to as ___.
Ans: Multidimensional association rules.
61. The process of grouping a set of physical or abstract objects into classes of
similar objects is called ___.
Ans: Clustering
66. Weight and height of an individual fall into ___ kind of variables.
Ans: Continuous
70. ___ software provides a set of partitioned clustering algorithms that treat
the clustering problem as an optimization process.
Ans: CLUTO
72. ___ can be viewed as the construction and use of a model to assess the
class of an unlabeled sample, or to assess the value or value ranges of an
attribute that a given sample is likely to have.
Ans: Prediction
73. ___ of data removes or reduces noise (by applying smoothing techniques)
and the treatment of missing values.
Ans: Pre-processing
74. ___ method refers to the ability to construct the model efficiently given a
large amount of data.
Ans: Scalability
77. The ___ measure is used to select the test attribute at each node in the
tree.
Ans: information gain
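Information gain is the drop in entropy of the class label after splitting on an attribute; the attribute with the highest gain is chosen as the test at a node. A minimal sketch follows, in which the helper names and the four-row toy data set are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical training samples.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]
gain_outlook = information_gain(rows, "outlook", "play")
gain_windy = information_gain(rows, "windy", "play")
print(gain_outlook, gain_windy)
```

Here "outlook" separates the classes perfectly while "windy" tells us nothing, so "outlook" would be selected as the test attribute.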
79. ___ are simple text files that are automatically generated every time
someone accesses a website.
Ans: Log files
81. ___ is used to examine the structure of a particular website and collate
and analyze related data.
Ans: Structural mining
82. Which of the following techniques is concerned with how users navigate and
access a website?
a. Web structural mining
b. Web usage mining
c. Web content mining
d. Web data definition mining
Ans: b. Web usage mining
85. The ___ approaches to Web mining have generally focused on techniques
for integrating and organizing the heterogeneous and semi-structured data on
the Web into more structured and high-level collections of resources.
Ans: database
86. Association rules involving multimedia objects can be mined in ___ and
___ databases.
Ans: Image and video
87. In ___ approach, the signature of an image includes color histograms
based on the color composition of an image regardless of its scale or
orientation.
Ans: Color histogram-based signature
88. Which of the following are the measures of the text retrieval documents?
a. Precision
b. Recall
c. F-score
d. a,b,c
Ans: d. a,b,c
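The three retrieval measures can be computed from the set of retrieved documents and the set of relevant documents. The sketch below assumes the balanced F-score (harmonic mean of precision and recall); the function name and document IDs are hypothetical.

```python
def retrieval_scores(retrieved, relevant):
    """Return (precision, recall, f_score) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f_score = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f_score

# Hypothetical query result: 4 documents retrieved, 3 documents truly relevant.
precision, recall, f_score = retrieval_scores(
    ["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"]
)
print(precision, recall, f_score)
```

Two of the four retrieved documents are relevant (precision 0.5), and two of the three relevant documents were found (recall 2/3).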
90. Which of the following is the first step in text retrieval systems?
a. Stemming
b. Term words finding
c. Tokenization
d. Replacing the null data with keywords
Ans: c. Tokenization
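Tokenization splits raw text into individual terms before later steps such as stemming. A minimal sketch, assuming lowercase alphanumeric word tokens are what is wanted; the function name and the regular expression are illustrative choices, not a standard.

```python
import re

def tokenize(text):
    """Lowercase the text and return its alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

tokens = tokenize("Data mining, also called knowledge discovery!")
print(tokens)
```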
93. Insurance and direct mail are two industries that rely on ___ to make
profitable business decisions.
Ans: data analysis
95. A ___ profile is a model that predicts the future purchasing behaviour of
an individual customer, given historical transaction data for both the
individual and for the larger population of all of a particular company’s
customers.
Ans: predictive
96. Data mining can be used to help predict future patient behaviour and to
improve treatment programs (True/False).
Ans: True
98. Data mining in the telecommunication industry helps to understand the
business involved, identify telecommunication patterns (True/False).
Ans: True
101. IDS are based on ___ that are developed by the manual encoding of
expert knowledge.
Ans: Handcrafted signatures
103. To improve accuracy, data mining programs are used to analyze audit
data and extract features that can distinguish normal activities from
intrusions. (True/False)
Ans: True
105. ___ is a new class of intrusion detection algorithms that do not rely on
labelled data.
Ans: Unsupervised anomaly detection
106. ___ algorithm uses the frequency distribution of each feature’s values to
proportionally generate a sufficient amount of anomalies.
Ans: Distribution Based Artificial Anomaly
108. Patient Rule Induction Method (PRIM) and Weighted Item Sets (WIS) are
types of ___ technique.
Ans: Association rule
109. ___ tools cannot discover high average regions or find new patterns in
data.
Ans: OLAP
1. In the research process, the management question has the following critical activity in
sequence.
2. The chapter that details the way in which the research was conducted is the _________
chapter
Introduction
Literature review
Research methodology
Data analysis
Conclusion and recommendations
3. Business research has an inherent value to the extent that it helps management make
better decisions. Interesting information about consumers, employees, or competitors
might be pleasant to have, but its value is limited if the information cannot be applied to a
critical decision.
True
False
4. The researcher should never report flaws in procedural design and estimate their effect
on the findings.
True
False
5. Adequate analysis of the data is the least difficult phase of research for the novice.
True
False
True
False
7. Researchers are tempted to rely too heavily on data collected in a prior study and use it
in the interpretation of a new study
True
False
True
False
10. A complete disclosure of methods and procedures used in the research study is
required. Such openness to scrutiny has a positive effect on the quality of research.
However, competitive advantage often mitigates against methodology disclosure in
business research.
True
False
11. Research is any organized inquiry carried out to provide information for solving
problems.
True
False
12. In deduction, the conclusion must necessarily follow from the reasons given. In
inductive argument there is no such strength of relationship between reasons and
conclusions.
True
False
13. Conclusions must necessarily follow from the premises. Identify the type of arguments
that follows the above condition.
Induction
Combination of Induction and Deduction
Deduction Variables
14. Eminent scientists who claim there is no such thing as the scientific method, or if exists,
it is not revealed by what they write, caution researchers about using template like
approaches
True
False
15. One of the terms given below is defined as a bundle of meanings or characteristics
associated with certain events, objects, conditions, situations, and the like
Construct
Definition
Concept
Variable
16. This is an idea or image specifically invented for a given research and/or theory
building purpose
Concept
Construct
Definition
Variables
17. The following are the synonyms for independent variable except
Stimulus
Manipulated
Consequence
Presumed Cause
18. The following are the synonyms for dependent variable except
Presumed effect
Measured Outcome
Response
Predicted from…
19. In the research process, a management dilemma triggers the need for a decision.
True
False
20. Every research proposal, regardless of length should include two basic sections. They
are:
Work plan
Prospectus
Outline
Draft plan
All of the above
23. Non response error occurs when you cannot locate the person or could not encourage
the respondent to participate in answering.
True
False
24. Secondary data can almost always be obtained more quickly and at a lower cost than
__________data.
Tertiary
Collective
Research
Primary
Marketing
Causal
Exploratory
Descriptive
Answers
1. a. 2. c. 3. a. 4. b. 5. b. 6. b. 7. a. 8. d. 9. a. 10. a. 11. a. 12. a. 13. C. 14. a. 15. c. 16. b.
17. c. 18. d. 19. a. 20. a. 21. a. 22. e. 23. a. 24. d. 25. c.
View Answer
Ans : C
View Answer
Ans : A
Explanation: R allows integration with the procedures written in the C,
C++, .Net, Python or FORTRAN languages for efficiency.
View Answer
Ans : B
View Answer
Ans : C
View Answer
Ans : D
View Answer
Ans : B
View Answer
Ans : A
View Answer
Ans : A
View Answer
Ans : D
Ans : C
Answer: a
Explanation: CRAN stands for Comprehensive R Archive Network.
Answer: b
Explanation: R runs on almost any standard computing platform and
operating system.
Answer: a
Explanation: CRAN also hosts many add-on packages that can be used to
extend the functionality of R.
Answer: c
Explanation: base package in R contains the most fundamental functions.
5. Point out the wrong statement?
a) One nice feature that R shares with many popular open source projects is
frequent releases
b) R has sophisticated graphics capabilities
c) S’s base graphics system allows for very fine control over essentially every
aspect of a plot or graph
d) All of the mentioned
View Answer
Answer: c
Explanation: R has maintained the original S philosophy, which is that it
provides a language that is both useful for interactive work, but contains
a powerful programming language for developing new tools.
6. Which of the following is a base package for R language?
a) util
b) lang
c) tools
d) spatial
View Answer
Answer: c
Explanation: The other packages contained in the “base” system include
utils, stats, datasets, graphics, grDevices, grid, methods, parallel, compiler,
splines, tcltk, stats4.
Answer: d
Explanation: “Recommended” packages also include boot, class, cluster,
codetools, foreign, KernSmooth, lattice, mgcv, nlme, rpart, survival,
MASS, nnet, Matrix.
Answer: b
Explanation: There are base packages (which come with R automatically),
and contributed packages. The base packages are maintained by a select
group of volunteers called R Core. In addition to the base packages, there
are over ten thousand additional contributed packages written by
individuals all over the world.
Answer: a
Explanation: For computationally-intensive tasks, C, C++ and Fortran
code can be linked and called at run time.
Answer: a
Explanation: R
r<-0:10
r[2]
a) 0
b) 1
c) 2
d) 3
View Answer
Answer: b
Explanation: 1 is the output of the above code. Indexing in R starts from
1, so r[1] is 0 and r[2] is 1. The output can be viewed in the R console.
RStudio has both an R terminal and an R console. Each output format is
implemented as a function in R. You can customize the output by passing
arguments to the function as sub-values of the output field.
2. Which of the following operators is used to create integer sequences?
a) :
b) ;
c) –
d) ~
View Answer
Answer: a
Explanation: The “:” operator is used to create an integer sequence. The other
operators are used for other purposes. The [ operator can be used to extract
multiple elements of a vector by passing the operator an integer sequence.
y<-0:5
vector(y)
y[3]
Answer: a
Explanation: y is already vector; second line is an invalid argument. The
third line will give us the output. When an R vector is printed you will
notice that an index for the vector is printed in square brackets
[] on the side.
Answer: a
Explanation: A vector can only contain objects of the same class; it
cannot contain objects of different classes. The most basic type of R
object is a vector. Empty vectors can be created with the vector() function.
Answer: b
Explanation: A list can contain objects of different classes, but a vector can
only contain objects of the same class.
6. How can we define ‘undefined value’ in R language?
a) Inf
b) Sup
c) Und
d) NaN
View Answer
Answer: d
Explanation: NaN represents an “undefined” value in the R language.
Missing values are denoted by NA, and NaN is used for undefined
mathematical operations. A NaN value is also NA but the converse is not true.
Answer: a
Explanation: NaN is called Not a Number. It is the full form of NaN. Full
forms can be viewed in R studio by typing help. A NaN value is also NA but
the converse is not true. The value NaN represents an undefined value.
Answer: a
Explanation: Inf is used to define “Infinity” in R. It is somewhat different
from other programming languages. There is also a special number of Inf
which represents infinity.
y <- c(TRUE, 2)
Answer: d
Explanation: Here TRUE is taken as 1. Then it will give output as 1 and 2.
FALSE can be taken as 0. T and F are short-hand ways to specify TRUE
and FALSE.
y <- c(2, "t")
a) Character
b) Numeric
c) Logical
d) Integer
View Answer
Answer: a
Explanation: Here 2 is coerced to character. y is an atomic vector, so all of
its elements must share one class, and character is the most general. Combining a numer
y<-c(FALSE,2)
a) Character
b) Numeric
c) Logical
d) Integer
View Answer
Answer: b
Explanation: FALSE is coerced to 0, so the vector is numeric. The console
will report the class as numeric.
A vector can only contain objects of the same class. A list is represented as
a vector but can contain objects of different classes.
2. Which one of the following is not a basic datatype?
a) Numeric
b) Character
c) Data frame
d) Integer
View Answer
Answer: c
Explanation: Data frame is not the basic data type of R. Numeric,
character, integer are the basic types of R. The basic data types are used
many times. Data frames are used to store tabular data in R. They are an
important type of object in R and are used in a variety of statistical
modelling applications.
as.numeric(x)
a) [1] 1 2
b) [1] TRUE TRUE
c) [1] NA NA (Warning message: NAs introduced by coercion)
d) [1] NAN
View Answer
Answer: c
Explanation: Characters cannot be expressed as numeric. Therefore NA’s
are printed as output. NA will specify the missing elements in the list.
When nonsensical coercion takes place, you will usually get a warning from
R.
Answer: b
Explanation: It is itself an integer vector of length 2. The dimension
attribute in R is an integer vector. Real values larger in modulus than the
largest integer are coerced to NA. Matrices are vectors with a dimension
attribute. The dimension attribute is itself an integer vector of length 2
(number of rows, number of columns).
a) row-wise
b) column-wise
c) any manner
d) data insufficient
View Answer
Answer: b
Explanation: If nothing is specified, a matrix is filled column-wise. To fill
it row-wise, pass byrow = TRUE to matrix(). The filter( ) function is used to extract
subsets of rows from a data frame. This function is similar to the existing
subset( ) function.
Answer: b
Explanation: rbind() is used to create a matrix by row-binding.
Matrices can be created by column-binding or row-binding with the
cbind() and rbind() functions.
8. What is the function used to test objects (returns a logical value) if they
are NA?
a) is.na()
b) is.nan()
c) as.na()
d) as.nan()
View Answer
Answer: a
Explanation: is.na() is the function used to test whether objects are NA. We can
check for NAs at any stage of the code. Generally, we remove the NAs
before operations in R such as mean.
9. What is the function used to test objects (returns a logical value) if they
are NaN?
a) as.nan()
b) is.na()
c) as.na()
d) is.nan()
View Answer
Answer: d
Explanation: is.nan() is used to test whether objects are NaN. We can check for
NaNs at any stage of the code. We will remove the NAs for the operations in R.
Answer: b
Explanation: colnames() is the function to set column names for a matrix.
rownames() is the function to set row names for a mat
1. The most convenient way to use R is at a graphics workstation running
a ________ system.
a) windowing
b) running
c) interfacing
d) matrix
View Answer
Answer: a
Explanation: Most classical statistics and much of the latest methodology
is available for use with R.
Answer: b
Explanation: help command is used for knowing details of particular
command in R.
Answer: a
Explanation: When you use the R program it issues a prompt when it
expects input commands.
4. Which of the following will start the R program?
a) $ R
b) > R
c) * R
d) @ R
View Answer
Answer: a
Explanation: At this point R commands may be issued.
5. Point out the wrong statement?
a) Windows versions of R have other optional help system also
b) The help.search command (alternatively ??) allows searching for help in
various ways
c) R is case insensitive as are most UNIX based packages, so A and a are
different symbols and would refer to different variables
d) $ R is used to start the R program
View Answer
Answer: c
Explanation: R is an expression language with a very simple syntax.
?solve
a) help(solve)
b) print(solve)
c) bind(solve)
d) matrix(solve)
View Answer
Answer: a
Explanation: help is used to get more information on any specific named
function.
Answer: c
Explanation: If an expression is given as a command, it is evaluated,
printed (unless specifically made invisible), and the value is lost.
Answer: c
Explanation: Comments can be put almost anywhere, starting with a
hashmark (‘#’), everything to the end of the line is a comment.
9. Command lines entered at the console are limited to about ________ bytes.
a) 3000
b) 4095
c) 5000
d) 6000
View Answer
Answer: b
Explanation: Elementary commands can be grouped together into one
compound expression by braces (‘{’ and ‘}’).
10. _____ text editor provides more general support mechanisms via ESS for
working interactively with R.
a) EAC
b) Emacs
c) Shell
d) ECAP
View Answer
Answer: b
Explanation: The recall and editing capabili
Answer: b
Explanation: There are base packages (which come with R automatically),
and contributed packages. The base packages are maintained by a select
group of volunteers, called R Core. In addition to the base packages, there
are over ten thousand additional contributed packages written by
individuals all over the world.
x + y
a) Symbol
b) Missing Data
c) 5
d) 15.5
View Answer
Answer: b
Explanation: Missing data are a persistent and prevalent problem in many
statistical analyses, especially those associated with the social sciences. R
reserves the special symbol NA to represent missing data. Ordinary
arithmetic with NA value gives NA’s (addition, subtraction, etc.) and
applying a function to a vector that has a NA in it will usually give a NA.
3. R language is a dialect of which of the following languages?
a) S
b) C
c) MATLAB
d) SAS
View Answer
Answer: a
Explanation: The R language is a dialect of S, which was initiated at Bell
Labs in 1976. Since the early 90’s the life of the S language has gone down a
rather winding path. The scoping rules for R are the main feature that
makes it different from the original S language.
Answer: a
Explanation: The language syntax has a superficial similarity with C, but
the semantics are of the FPL (functional programming language) variety
with stronger affinities with Lisp and APL. Many constructs in R
closely resemble C syntax.
a) Numeric
b) Character
c) Integer
d) Logical
View Answer
Answer: b
Explanation: All three elements can be expressed as a character. Both
paste() and cat() will printout text to the console by combining multiple
character vectors together. The original data are formatted as character
strings so we convert them to R’s Date format for easier manipulation.
b <- 2:7
a) 4
b) 5
c) 6
d) 0
View Answer
Answer: c
Explanation: Length of b [1] 2 3 4 5 6 7 is 6. We can also create an empty
list of a prespecified length with the vector() function. Data frames are
represented as a special type of list where every element of the list has to
have the same length.
a) Numeric
b) Character
c) Integer
d) Logical
View Answer
Answer: a
Explanation: All the elements in ‘b’ can be expressed in numeric. Both
paste() and cat() will printout text to the console by combining multiple
character vectors together. The original data are formatted as character
strings so we convert them to R’s Date format for easier manipulation.
x<-1:3
a) Numeric, Integer
b) Integer, Numeric
c) Integer, Integer
d) Numeric, Numeric
View Answer
Answer: b
Explanation: Here typeof() tells about the data type. They are an
important type of object in R and are used in a variety of statistical
modelling applications. You can determine an object’s type with the typeof
function.
9. How many atomic vector types does R have?
a) 5
b) 6
c) 8
d) 10
View Answer
Answer: b
Explanation: R language has 6 atomic data types. They are logical, integer,
real, complex, string (or character) and raw. There is also a class for “raw”
objects, but they are not commonly used directly in data analysis.
10. What is the function to set row names for a data frame?
a) row.names()
b) colnames()
c) col.names()
d) column name cannot be set for a data frame
View Answer
Answer: a
Explanation: row.names() is the function to set row names for a data
frame. Data frames have a special attribute called row.name
Answer: a
Explanation: Anybody is free to download and install these packages and
even inspect the source code. The instructions for obtaining R largely
depend on the user’s hardware and operating system.
2. How do you install a package named foo and all of the other packages on
which foo depends?
a) install.packages (foo, dependencies = TRUE)
b) R.install.packages ("foo", dependencies = TRUE)
c) install.packages ("foo", dependencies = TRUE)
d) install ("foo", dependencies = FALSE)
View Answer
Answer: c
Explanation: To install a package named foo, open up R and type
install.packages("foo"). To install foo and additionally install all of the
other packages on which foo depends, instead type install.packages("foo",
dependencies = TRUE).
Answer: d
Explanation: Type library() at the command prompt to see a list of all
available packages in the library. For total information about the
installation of R and add-on packages, see the R Installation and
Administration manual.
4. The longer programs are called ____________
a) Files
b) Structures
c) Scripts
d) Data
View Answer
Answer: c
Explanation: Longer programs are called scripts; there is too much code to
write all at once at the command prompt. Furthermore, for longer scripts,
it is convenient to be able to only modify a certain piece of the script and
run it again in R.
Answer: a
Explanation: Script editors are designed to aid the communication and
code writing process. They have all sorts of features including R syntax
highlighting, automatic code completion, delimiter matching, and
dynamic help on the R functions.
Answer: d
Explanation: “Recommended” packages also include boot, class, cluster,
codetools, foreign, KernSmooth, lattice, mgcv, nlme, rpart, survival,
MASS, nnet, Matrix. There are about ten thousand packages in R now.
7. Full Form of GUI is ___________________
a) Guided User Interface
b) Graphical User Interface
c) Guided Used Interface
d) Graphical User Interval
View Answer
Answer: b
Explanation: GUI elements are usually accessed through a device. All
programs running a GUI use a consistent set of graphical elements so that
once the user learns a particular interface.
Answer: a
Explanation: R Commander provides a point-and-click interface to
statistical problems. It is called the “Commander” because every time one
makes a selection, the code corresponding to the task is listed in the output
window.
options(digits = 16)
20/6
a) 3.33
b) 3.333
c) 3.3333333
d) 3.3333333333333333
View Answer
Answer: d
Explanation: 20/6 is a repeating decimal. We can change
the number of digits displayed with options(digits = ...). This extends the
number of significant digits printed to the requested amount.
Answer: a
Explanation: RStudio is an IDE tailored to the needs of interactive data
analysis and statistical programming. In RStudio we can directly
interact with R through the inbuilt functions and packages. We can also
download new packages.
Answer: a
Explanation: Compared to other programming languages, the R
community tends to be more focussed on results instead of processes.
Knowledge of software engineering best practice.
Answer: d
Explanation: You are confronted with over 20 years of evolution every
time you use R. Learning R can be hard because there are many special
cases in R to remember. R is the best user of memory.
Answer: d
Explanation: Statistics for relatively advanced users. R has thousands of
packages, designed, maintained, and widely used by statisticians. We can
code ourselves if a command is not present.
Answer: a
Explanation: RStudio is very popular with a nice interface and well
thought out, especially for more advanced usage. It can be a bit buggy, so
make sure you update it regularly. Available on all platforms.
Answer: d
Explanation: The expression a <- 16 creates a variable called a and gives
it the value 16; this is called assignment. The value on the right is assigned
to the variable on the left. The left side should name only a single variable.
Answer: c
Explanation: In R, there is an extension of the numeric or character
vectors. They are not a separate type of object but simply an atomic vector
with dimensi
a) 1
b) 2
c) 3
d) 5
View Answer
Answer: a
Explanation: When a complete expression is entered at the prompt, it is
evaluated and the result of the evaluated expression is returned.
2. Point out the wrong statement?
a) The grammar of the language determines whether an expression is
complete or not
b) The <- symbol is the assignment operator in R
c) The ## character indicates a comment
d) R does not support multi-line comments or comment blocks
View Answer
Answer: c
Explanation: Unlike some other languages, R does not support multi-line
comments or comment blocks.
3. Which of the following R code is an example of explicit printing?
a)
b)
c)
d)
View Answer
Answer: b
Explanation: Print command is used for outputting the value.
4. Files containing R scripts end with the extension ____________
a) .S
b) .R
c) .Rp
d) .SP
View Answer
Answer: b
Explanation: Under many versions of UNIX and on Windows, R provides a
mechanism for recalling and re-executing previous commands.
Answer: b
Explanation: They are merely part of the printed output.
Answer: a
Explanation: For Windows, Source is also available on the File menu.
7. _______ will divert all subsequent output from the console to an external file.
a) sink
b) div
c) exp
d) exc
View Answer
Answer: a
Explanation: sink() restores it to the console once again.
Answer: a
Explanation: These may be variables, arrays of numbers, character strings,
functions, or more general structures built from such components.
9. Which of the following can be used to display the names of (most of) the
objects which are currently stored within R?
a) object()
b) objects()
c) list()
d) class()
Answer: b
Explanation: During an R session, objects are created and stored by name.
Answer: b
Explanation: All objects created during an R session can be stored
permanently in a file for use in future R sessions.
Ans : C
Ans : B
Explanation: The three types of logistic regression are binary logistic
regression, multinomial logistic regression, and ordinal logistic regression.
Ans : A
Ans : D
Ans : A
Ans : D
Ans : B
Explanation: Binary logistic regression: the target variable has only two
possible outcomes, for example 0 and 1, pass and fail, or true and false.
Ans : A
Ans : C
Ans : A
Explanati
Ans: B
a. slope, intercept
b. intercept, slope
c. R², p-value
d. p-value, R²
Ans: A
Ans: C
a. an F-test.
b. an R² test.
c. a correlation coefficient.
d. a t-test.
Ans: D
Ans: A
Salary = a * Experience
Salary = a * Experience + b
Salary = a * Experience + b * Age
3. What is the class used in Python to create a simple linear regressor?
SimpleLinear
LinearRegression
LinReg
SimpleLinearRegression
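To make the idea of a simple linear regressor concrete, here is a minimal sketch in plain Python of the least-squares fit that such a model performs. In scikit-learn the corresponding class is `LinearRegression`; the function name `fit_simple_linear` and the experience and salary figures below are hypothetical and chosen only for illustration.

```python
def fit_simple_linear(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: years of experience and salary (in thousands).
experience = [1, 2, 3, 4]
salary = [30, 35, 40, 45]

slope, intercept = fit_simple_linear(experience, salary)
print("Salary = %.1f * Experience + %.1f" % (slope, intercept))
```

The fitted line has the form Salary = a * Experience + b, which is the single-predictor model the questions above refer to.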
4. What is the function used in R to create a simple linear regressor?
lr
slr
lm
slm
5. What is the correct way of writing a simple linear regression equation in
the formula parameter in R?
Salary = YearsExperience
Salary ~ YearsExperience
Salary == YearsExperience
Salary = a * YearsExperience + b
MultipleLinearRegression
MultipleRegression
LinearRegression
LinearModel
8. In Python, the code to create a multiple linear regressor is exactly the
same as the one to create a simple linear regressor.
True
False
9. In R, which multiple linear regression equation can we input in the
formula parameter?
Salary ~ *
Salary = *
Salary ~ .
Salary = Experience + Age
10. We should use Multiple Linear Regression to predict a dependent
variable that is growing exponentially with time.
Yes
No
Multiple Choice Questions on
Logistic Regression
11. Logistic Regression is a linear classifier
True
False
12. Logistic Regression returns probabilities
True
False
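As a rough illustration of why logistic regression is described as both a linear classifier and a model that returns probabilities, the sketch below passes a linear score through the sigmoid function and then thresholds the result at 0.5. The function name `predict_probability`, the coefficients, and the feature values are all hypothetical.

```python
import math

def predict_probability(weights, bias, features):
    """Logistic regression prediction: sigmoid of a linear score."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

weights = [0.8, -0.5]   # hypothetical fitted coefficients
bias = 0.1              # hypothetical intercept

p = predict_probability(weights, bias, [1.0, 2.0])
label = 1 if p >= 0.5 else 0   # p >= 0.5 exactly when the linear score is >= 0
print(round(p, 3), label)
```

Because the class boundary is the set of points where the linear score equals zero, the decision surface is linear even though the output is a probability.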
13. In Python, what is the class used to create a logistic regression
classifier?
GLM
LogisticRegression
Logit
LogReg
14. In R, what is the function used to create a logistic regression classifier?
lr
lm
glm
glr
15. In R, what value do we need to input for the family parameter?
family = linear
family = logistic
family = binomial
family = response
Solution: (E)
A) Naive approach
B) Exponential smoothing
C) Moving Average
D) None of the above
Solution: (D)
A) Seasonality
B) Trend
C) Cyclical
D) Noise
E) None of the above
Solution: (E)
Solution: (A)
A) TRUE
B) FALSE
Solution: (B)
A) TRUE
B) FALSE
Solution: (B)
A) TRUE
B) FALSE
Solution: (A)
A) <1
B) 1
C) >1
D) None of the above
Solution: (B)
A) 63.8
B) 65
C) 62
D) 66
Solution: (D)
Y_{t-1} = 70
S_{t-1} = 60
Alpha = 0.4
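The three values above can be plugged into the simple exponential smoothing update. The question text is not shown here, so the sketch below rests on an assumption: 70 is read as the previous forecast and 60 as the previous observed value. Under that reading the next forecast is about 66, which corresponds to option D in the list above if those options belong to this question. The function name is hypothetical.

```python
def exponential_smoothing_forecast(previous_forecast, previous_actual, alpha):
    """Next forecast = alpha * last actual + (1 - alpha) * last forecast."""
    return alpha * previous_actual + (1 - alpha) * previous_forecast

# Assumption: 70 is the previous forecast, 60 the previous observed value.
forecast = exponential_smoothing_forecast(previous_forecast=70,
                                          previous_actual=60,
                                          alpha=0.4)
print(round(forecast, 2))
```

With the roles swapped (70 as the observation, 60 as the forecast) the same formula would give 64 instead, so the interpretation of the two symbols matters.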
Solution: (D)
Solution: (D)
Solution: (C)
A) 300
B) 350
C) 400
D) Need more information
Solution: (A)
A) AR
B) MA
C) Can’t Say
Solution: (A)
A) TRUE
B) FALSE
C) Can’t Say
Solution: (A)
1. Multiple box
2. Autocorrelation
A) Only 1
B) Only 2
C) 1 and 2
D) None of these
Solution: (C)
A) TRUE
B) FALSE
Solution: (A)
18) Suppose you are given a time series dataset which has only
4 columns (id, Time, X, Target).
Solution: (B)
Solution: (A)
Solution: (A)
Solution: (B)
73.17-35 = 38.17
A) 0.26
B) 0.52
C) 0.13
D) 0.07
Solution: (C)
$$\hat{\rho}(1) = \frac{(23.32-\bar{x})(32.33-\bar{x}) + (32.33-\bar{x})(32.88-\bar{x}) + \cdots}{\sum_{t=1}^{T}(x_t-\bar{x})^2} = 0.130394786$$
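The calculation above is a lag-one sample autocorrelation: products of consecutive deviations from the mean, divided by the total sum of squared deviations. A minimal sketch of that calculation follows. The full data set for the question is not reproduced here, so the series `example` is hypothetical and the function name is an assumption.

```python
def lag_autocorrelation(series, lag=1):
    """Sample autocorrelation at the given lag."""
    n = len(series)
    mean = sum(series) / n
    # Products of deviations that are `lag` steps apart.
    numerator = sum((series[t] - mean) * (series[t + lag] - mean)
                    for t in range(n - lag))
    # Total sum of squared deviations over the whole series.
    denominator = sum((x - mean) ** 2 for x in series)
    return numerator / denominator

example = [1, 2, 3, 4, 5]   # hypothetical series, not the quiz data
r1 = lag_autocorrelation(example, lag=1)
print(round(r1, 3))
```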
Solution: (A)
A) Separation of x_s and x_t
B) h = | s – t |
C) Location of point at a particular time
Solution: (C)
A) Only A
B) Both A and B
Solution: (D)
Solution: (C)
Solution: (A)
Note that for an MA(1) model, ρ(h) is the same for θ and 1/θ.
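This property can be checked numerically. The sketch below assumes the usual MA(1) form x_t = w_t + θ·w_{t-1}, for which the lag-one autocorrelation is θ / (1 + θ²) and all higher lags are zero; the function name is hypothetical.

```python
def ma1_lag1_autocorrelation(theta):
    """Theoretical lag-1 autocorrelation of x_t = w_t + theta * w_{t-1}."""
    return theta / (1 + theta ** 2)

rho_theta = ma1_lag1_autocorrelation(0.5)            # theta = 0.5
rho_inverse = ma1_lag1_autocorrelation(1 / 0.5)      # 1 / theta = 2
print(round(rho_theta, 4), round(rho_inverse, 4))    # the two values agree
```

Since the autocorrelation function cannot distinguish θ from 1/θ, the two parameter values describe the same correlation structure.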
A) AR(1) MA(0)
B) AR(0) MA(1)
C) AR(2) MA(1)
D) AR(1) MA(2)
E) Can’t Say
Solution: (B)
A strong negative correlation at lag 1 suggests an MA process, and there is
only one significant lag.
A) Mean =0
B) Zero autocovariances
C) Zero autocovariances except at lag zero
D) Quadratic Variance
Solution: (C)
A) ACF = 0 at lag 3
B) ACF =0 at lag 5
C) ACF =1 at lag 1
D) ACF =0 at lag 2
E) ACF = 0 at lag 3 and at lag 5
Solution: (B)
Recall that an MA(q) process only has memory of length q. This
means that all of the autocorrelation coefficients will have a
value of zero beyond lag q. This can be seen by examining the
MA equation, and seeing that only the past q disturbance
terms enter into the equation, so that if we iterate this
equation forward through time by more than q periods, the
current value of the disturbance term will no longer affect y.
Finally, since the autocorrelation function at lag zero is the
correlation of y at time t with y at time t (i.e. the correlation
of y_t with itself), it must be one by definition.
A) 1.5
B) 1.04
C) 0.5
D) 2
Solution: (B)
Solution: (B)
Solution: (A)
Solution: (D)
Time series is ordered data, so the validation data must be ordered
too. Forward chaining ensures this by always training on earlier
observations and validating on later ones.
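Forward chaining can be sketched as an expanding training window followed by the next observation as the validation point. The function name and the choice of one validation point per fold are assumptions made for illustration.

```python
def forward_chaining_splits(n_observations, min_train_size=1):
    """Return (train_indices, test_indices) pairs with an expanding training window."""
    splits = []
    for end in range(min_train_size, n_observations):
        train_indices = list(range(end))   # everything before the test point
        test_indices = [end]               # the next observation in time
        splits.append((train_indices, test_indices))
    return splits

# Five ordered observations give four folds.
for train, test in forward_chaining_splits(5):
    print("train:", train, "test:", test)
```

Every validation index is later than every training index in its fold, so no future information leaks into training.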
35) BIC penalizes complex models more strongly than the AIC.
A) TRUE
B) FALSE
Solution: (A)
where:
N = number of observations
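For reference, the standard definitions are AIC = 2k − 2 ln(L) and BIC = k ln(N) − 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. The sketch below compares the two penalties; the log-likelihood value and parameter count are hypothetical.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: penalty of 2 per parameter."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: penalty of ln(N) per parameter."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

log_likelihood = -120.0   # hypothetical fitted log-likelihood
n_params = 3              # hypothetical number of parameters

print(aic(log_likelihood, n_params))
print(bic(log_likelihood, n_params, n_obs=100))
```

The BIC penalty per parameter, ln(N), exceeds the AIC penalty of 2 whenever N is at least 8, which is why BIC favors simpler models on all but very small samples.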
Solution: (B)
A) 0.2,0.32,0.6
B) 0.33, 0.33,0.33
C) 0.27,0.27,0.27
D) 0.4,0.3,0.37
Solution: (B)
The predicted value from the exponential smooth is the same
for all 3 years, so all we need is the value for next year. The
expression for the smooth is
Hence, for the next point, the next value of the smooth (the
prediction for the next observation) is
= 0.3297
A) 0.3297 2 * 0.1125
B) 0.3297 2 * 0.121
C) 0.3297 2 * 0.129
D) 0.3297 2 * 0.22
Solution: (B)
A) Only 1
B) Both 1 and 2
C) Only 2
D) All of the statements
Solution: (C)
(a) X-intercept
(b) Slope
(c) Y-intercept
(d) None of them
(a) Y-intercept
(b) Slope
(c) X-intercept
(d) Trend
(a) Scale
(b) Origin
(c) Both (a) and (b)
(d) Neither (a) nor (b)
20. The best-fitted trend is the one for which the sum of
squares of residuals is:
(a) Maximum
(b) Minimum
(c) Positive
(d) Negative
(a) Histogram
(b) Analysis of time series
(c) Histogram
(d) Detrending
(a) Trend
(b) Seasonal
(c) Cyclical
(d) Irregular
27. The best-fitting trend is the one for which the sum of squares of
residuals is:
(a) Negative
(b) Least
(c) Zero
(d) Maximum
(a) Scale
(b) Origin
(c) Both origin and scale
(d) None of them
31. The rise and fall of a time series over periods longer than
one year is called:
(a) Two
(b) Three
(c) Four
(d) Five
(a) Y = T + S + C + I
(b) Y = TSCI
(c) Y = a + bX
(d) Y = a + bX + cX²
(a) Y = T + S + C + I
(b) Y = TSCI
(c) Y = a + bX
(d) Y = a + bX + cX²
35. The difference between the actual value of the time series
and the forecasted value is called:
(a) Residual
(b) Sum of variation
(c) Sum of squares of residual
(d) All of the above
(a) Boom
(b) Recovery
(c) Recession
(d) Depression
(a) Prosperity
(b) Recession
(c) Recovery
(d) Depression
(a) Recession
(b) Recovery
(c) Boom
(d) Depression
(a) Recession
(b) Recovery
(c) Prosperity
(d) Depression
42. For even number of years when origin is in the center and
the unit of X being one year, then X can be coded as:
43. For even number of years when origin is in the center and
the unit of X being half year, then X can be coded as:
(a) wars
(b) earthquakes
(c) floods
(d) none
(a) Predictable
(b) Trend
(c) cyclical
(d) Irregular
(a) seasonal
(b) Trend
(c) cyclical
(d) Irregular
56. The best-fitted trend line is the one for which the sum of squares
of residuals or errors is
(a) positive
(b) negative
(c) zero
(d) minimum
57. The 3-yearly moving average for the year 2005 is given by
_______.
(a) 3
(b) 9
(c) 6
(d) 0
58. For the given data semi averages for the first half is given
by _______.
(a) 13
(b) 15
(c) 16
(d) 14
(a) 23
(b) 36
(c) 20
(d) 68
60. For the given data short term fluctuations for the year
2013 by using additive model is given by _______.
(a) 4
(b) -4
(c) 0.95
(d) -0.95
61. For the given data short term fluctuations for the year
2014 by using Multiplicative model is given by _______.
(a) 1.046
(b) -1.046
(c) -4
(d) 1
(a) positive
(b) Maximum
(c) Zero
(d) Minimum
(a) Seasonal
(b) Irregular
(c) Secular
(d) Cyclic
(a) Seasonality
(b) Operational variations
(c) Trend
(d) Cycle
(a) Trend
(b) Seasonal
(c) Irregular
(d) Cyclical
(a) Zero
(b) Two
(c) Less than two
(d) More than two.
(a) Additive
(b) Multiplicative
(c) Mixed
(d) Regression
(a) dependent
(b) inter connected
(c) independent
(d) None
(a) Additive
(b) Multiplicative
(c) Mixed
(d) regression
(a) 6,50,000
(b) 16,5000
(c) 33,650,000
(d) 7,32,500
83. y = 85.6 + 2.4x; origin 2000; x unit = 1 year; y = annual
production of sugar (in ’000 quintals). ___________ slope of line.
(a) increasing
(b) Decreasing
(c) constant
(d) non linear
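The trend equation in question 83 can be evaluated directly. The sketch below codes x as the number of years since the origin (2000) and returns production in thousands of quintals; the function name and the chosen years are hypothetical.

```python
def sugar_trend(year, origin=2000, intercept=85.6, slope=2.4):
    """Trend value y = 85.6 + 2.4x, with x = years since the origin."""
    x = year - origin
    return intercept + slope * x

value_2005 = sugar_trend(2005)   # x = 5
value_2006 = sugar_trend(2006)   # x = 6
print(round(value_2005, 1), round(value_2006, 1))
print("yearly change:", round(value_2006 - value_2005, 1))
```

The yearly change equals the slope coefficient, so a positive coefficient on x means the trend rises by that amount each year.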
86. For the additive model in time series analysis, for annual
data the difference Y – T represents ___________ fluctuations.
1 The despatch department of a mail order book company have run out of small
padded envelopes that they use to send books to their customers. Unfortunately, there is a
shortage of these envelopes at their supplier and it will be 5 days before they will be
available. As a result of this, there is a backlog of 3 days of picked orders that are temporarily
stored in the despatch department which cannot be sent to customers until the envelopes
arrive.
Which of the following types of Lean waste are there in this scenario?
A DSDM.
B XP.
C Lean Software Development.
D SAFe.
A Product owner.
B End user.
C Project sponsor.
D Subject matter expert.
[Use case diagram not reproduced; visible actor labels: Student, Teacher]
In reviewing the use case diagram, the business analyst has suggested that the following
elements in the diagram could be incorrect:
i. The ‘Assessment Administration’ use case
ii. The ‘View Results’ use case
iii. The ‘School Administrator’ actor
iv. The ‘Teacher’ actor
v. The System name (within the boundary)
A i and iv.
B ii and iii.
C i and iii.
D iii only.
A M.
B S.
C C.
D W.
A ii or iv.
B iv only.
C ii only.
D i or iii.
8 As part of a university assessment system project, the project team are reviewing the
following user story.
As a student I want to be able to share my exam certificate on social media so that my friends
and potential employers can see my achievements.
In the review, it was discussed that there are multiple social media channels that could be
used and that no channels had been specified. Which of the INVEST rules for writing a
quality user story are not being met?
9 The following burn down chart has been produced showing the situation 7 days
through the latest iteration in the project.
[Burn down chart not reproduced; the plotted values are listed below]
Iteration timeline (days)   0   1   2   3   4   5   6   7   8   9  10
Ideal burndown             50  45  40  35  30  25  20  15  10   5   0
Story points remaining     50  50  49  45  39  39  30  30
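One way to read the burn down data is to compare the story points remaining with the ideal line day by day. The sketch below uses the figures from the chart above; the variable names are hypothetical.

```python
# Values taken from the burn down chart (days 0 to 10 and days 0 to 7).
ideal = [50, 45, 40, 35, 30, 25, 20, 15, 10, 5, 0]
remaining = [50, 50, 49, 45, 39, 39, 30, 30]

# Positive gap = more work remaining than the ideal line allows.
gaps = [actual - ideal[day] for day, actual in enumerate(remaining)]

for day, gap in enumerate(gaps):
    print("day", day, "remaining", remaining[day], "ideal", ideal[day], "gap", gap)

completed = remaining[0] - remaining[-1]
print("points completed after day 7:", completed)
```

After day 7 the team has 30 points remaining against an ideal of 15, so it is 15 points behind the ideal line with 3 days left in the iteration.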
A i and iv only.
B ii and iii only.
C i, iii and iv only.
D ii and v only.
1. Research is
(A) Searching again and again
(B) Finding solution to any problem
(C) Working in a scientific way to search for truth of any problem
(D) None of the above
2. Which of the following is the first step in starting the research process?
(A) Searching sources of information to locate problem.
(B) Survey of related literature
(C) Identification of problem
(D) Searching for solutions to the problem
5. A reasoning where we start with certain particular statements and conclude with a universal
statement is called
(A) Deductive Reasoning
(B) Inductive Reasoning
(C) Abnormal Reasoning
(D) Transcendental Reasoning
12. The method that consists of collection of data through observation and experimentation,
formulation and testing of hypothesis is called
(A) Empirical method
(B) Scientific method
(C) Scientific information
(D) Practical knowledge
13. Information acquired by experience or experimentation is called as
(A) Empirical
(B) Scientific
(C) Facts
(D) Scientific evidences
14. “All living things are made up of cells. Blue whale is a living being. Therefore blue whale is
made up of cells”. The reasoning used here is
(A) Inductive
(B) Deductive
(C) Both A and B
(D) Hypothetic-Deductive
15. The reasoning that uses general principle to predict specific result is called
(A) Inductive
(B) Deductive
(C) Both A and B
(D) Hypothetic-Deductive
18. Which of the following is not an appropriate source for academic research?
(A) An online encyclopedia
(B) A government-based research organization database
(C) A peer reviewed journal article
(D) A text book
21. Research is
(A) A purposeful, systematic activity
(B) Primarily conducted for purely academic purposes
(C) Primarily conducted to answer questions about practical issues
(D) A random, unplanned process of discovery
22. When conducting a review of literature on a particular subject, the researcher should
(A) Read all available material on the subject
(B) Read the whole journal article and then decide whether or not it is useful
(C) Read strategically and critically
(D) Read fully only those texts that appear to agree with his/her point of view
27. Of all the steps in the research process, the one that typically takes the most time is
(A) Data collection
(B) Formulating the problem
(C) Selecting a research method
(D) Developing a hypothesis
30. Which of the following was not identified as a major research design?
(A) secondary research
(B) Surveys
(C) Field Research
(D) ethnography
31. When a number of researchers use the same operational definition to measure a variable and
achieve the same results, the measure is said to be
(A) Instrumental
(B) Reliable
(C) Valid
(D) Factual
32. There are various types of research designed to obtain different types of information. What
type of research is used to define problems and suggest hypotheses?
(A) Descriptive Research
(B) Primary research
(C) Secondary research
(D) Causal research
35. The Internet is a powerful mechanism for conducting research. However it does have its
drawbacks. Which of the following signify these drawbacks?
(A) The possible inclusion of individuals not being targeted, that could skew the results
(B) Lack of information about the population responding to the questionnaire.
(C) Eye contact and body language, (two useful research indicators) are excluded from the
analysis
(D) All of the above
37. Which is the best type of research approach for gathering causal information?
(A) Observational
(B) Informative
(C) Experimental
(D) Survey
41. Ms. Casillas has been coordinating the Halloween Festival at her school for the last several
years. She wants to be sure the students and parents enjoy the festival again this year. On which
source is she LEAST likely to rely when making decisions about what to do?
(A) Tradition
(B) Research
(C) Personal experience
(D) Expert opinion
42. The scientific method is preferred over other ways of knowing because it is more
(A) Reliable
(B) Systematic
(C) Accurate
(D) All of these
43. Which of the following steps of the scientific method is exemplified by the researcher
reviewing the literature and focusing on a specific problem that has yet to be resolved?
(A) Describe the procedures to collect information
(B) Identify a topic.
(C) Analyze the collected information
(D) State the results of the data analysis
44. Which of the following is the LEAST legitimate research problem? The purpose of this study
is to
(A) understand what it means to be a part of a baseball team at a high school known for its
championship teams.
(B) study whether physical education should be taught in elementary parochial schools.
(C) examine the relationship between the number of hours spent studying and students' test scores
(D) examine the effect of using advanced organizers on fifth-grade students' reading
comprehension
48. The statement 'To identify the relationship between the time the patient spends on the
operating table and the development of pressure ulcers' is best described as a research:
(A) Objective
(B) Aim
(C) Question
(D) Hypothesis
50. A statement of the expected relationship between two or more variables is known as the:
(A) Concept definition
(B) Hypothesis
(C) Problem statement
(D) Research question
51. 'There is no difference in the incidence of phlebitis around intravenous cannulae changed
every 72 hours and those changed at 96 hours' is an example of a:
(A) Null hypothesis
(B) Directional hypothesis
(C) Non-directional hypothesis
(D) Simple hypothesis
52. Which of the following statements meets the criteria for a researchable question?
(A) Is the use of normal saline to cleanse wounds harmful to patients?
(B) Do generalist registered nurses meet the mental health needs of general patients?
(C) Do palliative care patients have spiritual needs?
(D) What are the patients' perceptions of the effectiveness of pre-operative education for
total hip replacement?
53. The researcher needs to clearly identify the aim of the study; the question to be answered; the
population of interest; information to be collected, and feasibility in order to decide on the
research
(A) Design and method
(B) Design and assumptions
(C) Purpose and data analysis
(D) Purpose and assumptions
54. A variable that changes due to the action of another variable is known as the
(A) Independent variable
(B) Extraneous variable
(C) Complex variable
(D) Dependent Variable
60. Because of the number of things that can go wrong in research there is a need for:
(A) Flexibility and Perseverance
(B) Sympathetic supervisors
(C) An emergency source of finance
(D) Wisdom to know the right time to quit
61. __________ research seeks to investigate an area that has been under researched with
preliminary data that helps shape the direction for future research.
(A) Descriptive
(B) Exploratory
(C) Explanatory
(D) Positivist
62. Research questions in qualitative studies typically begin with which of the following words?
(A) Why
(B) How
(C) What
(D) All of the above
65. Which of the following data collecting methods is not normally used in qualitative research?
(A) Participant observation
(B) Focus groups
(C) Questionnaire
(D) Semi-structured interview
67. The scientific method is preferred over other ways of knowing because it is more
(A) Reliable
(B) Systematic
(C) Accurate
(D) All of the above
71. The facts that should be collected to measure a variable, depend upon the
(A) Conceptual understanding
(B) Dictionary meaning
(C) Operational definition
(D) All of the above
73. Which of the following is the best hypothesis statement to address the research question?
“What impact will the new advertising campaign have on use of brand B”?
(A) The new advertising campaign will impact brand B image
(B) The new advertising campaign will impact brand B image trial
(C) The new advertising campaign will impact brand B usage at the expense of brand C
(D) The new advertising campaign will impact brand B’s market penetration
74. Qualitative and quantitative research are the classifications of research on the basis of
(A) Use of the research
(B) Time dimension
(C) Techniques used
(D) Purpose of the research
77. The application of the scientific method to the study of business problems is called
(A) Inductive reasoning
(B) Deductive reasoning
(C) Business research
(D) Grounded Theory
79. According to empiricism, which of the following is the ultimate source of all our concepts
and knowledge?
(A) Perceptions
(B) Theory
(C) Sensory experiences
(D) Logics and arguments
81. Which of the following is not a function of clearly identified research questions?
(A) They guide your literature search
(B) They keep you focused throughout the data collection period
(C) They make the scope of your research as wide as possible
(D) They are linked together to help you construct a coherent argument
89. After identifying the important variables and establishing the logical reasoning in
Theoretical framework, the next step in the research process is
A. To conduct surveys
B. To generate the hypothesis
C. To focus group discussions
D. To use experiments in an investigation
101. All of the following are true statements about action research, EXCEPT;
A. Data are systematically analyzed
B. Data are collected systematically
C. Results are generalizable
D. Results are used to improve practice
A. Discontinuous variable
B. Continuous variable
C. Dependent variable
D. Independent variable
106. Which of the following is not the source for getting information for exploratory
research?
A. Content analysis
B. Survey
C. Case study
D. Pilot study
108. A variable that is presumed to cause a change in another variable is known as:
A. Discontinuous variable
B. Dependent variable
C. Independent variable
D. Intervening variable
110. In ___________, the researcher attempts to control and/or manipulate the variables in
the study.
a. Experiment
b. Hypothesis
c. Theoretical framework
d. Research design
111. In an experimental research study, the primary goal is to isolate and identify the effect
produced by the ____.
a. Dependent variable
b. Extraneous variable
c. Independent variable
d. Confounding variable
113. ______ is the evidence that the instrument, technique, or process used to measure
a concept does indeed measure the intended concept.
a. Reliability
b. Replicability
c. Scaling
d. Validity
114. Experimental design is the only appropriate design where a _________ relationship can
be established.
a. Strong
b. Linear
c. Weak
d. Cause and Effect
115. In which one of the following stages does the researcher consult the literature?
a. Operation test
b. Response analysis survey
c. Document design analysis
d. Pretest interviews
119. A scientific explanation that remains tentative until it has been adequately tested is called
a(n)
a. theory.
b. law.
c. hypothesis.
d. experiment.
122. A psychologist watches the rapid eye movements of sleeping subjects and wakes them to
find they report that they were dreaming. She concludes that dreams are linked to rapid eye
movements. This conclusion is based on
a. pure speculation.
b. direct observation.
c. deduction from direct observation.
d. prior prediction.
123. We wish to test the hypothesis that music improves learning. We compare test scores of
students who study to music with those who study in silence. Which of the following is an
extraneous variable in this experiment?
a. the presence or absence of music
b. the students' test scores
c. the amount of time allowed for the studying
d. silence
124. An experiment is performed to see if background music improves learning. Two groups
study the same material, one while listening to music and another without music. The
independent variable is
a. learning.
b. the size of the group.
c. the material studied.
d. music.
127. In the traditional learning experiment, the effect of practice on performance is investigated.
Performance is the __________ variable.
a. independent
b. extraneous
c. dependent
d. control
128. Collection of observable evidence, precise definition, and replication of results all form the
basis for
a. scientific observation.
b. the scientific method.
c. defining a scientific problem.
d. hypothesis generation.
134. Which of the following similarities is found in qualitative research and survey research?
a. Examine topics primarily from the participant’s perspectives
b. They are guided by predetermined variables to study
c. They are descriptive research methods
d. Have large sample sizes
168. ------------ are questions the researcher must answer to satisfactorily arrive at a conclusion
about the research question.
a) Investigative questions
b) Research question
c) Measurement question
d) Fine-tuning the research question
174. A condition that exists when an instrument measures what it is supposed to measure is
called
a) validity
b) accuracy
c) reliability
d) none of the above
175. The major disadvantage with in-depth interviews is that, because of their time-consuming
nature, it is usually only possible to carry out a relatively small number of such interviews, and
as such the results are likely to be highly ____________
a) subjective
b) objective
c) questionable
d) objectionable
176. A critical review of the information, pertaining to the research study, already available in
various sources is called
a) Research review
b) Research design
c) Data review
d) Literature review
177. ____________________ presents a problem, discusses related research efforts, outlines the
data needed for solving the problem and shows the design used to gather and analyze the data.
a.) Marketing
b.) Causal
c.) Exploratory
d.) Descriptive
179. A systematic, controlled, empirical, and critical investigation of natural phenomena guided
by theory and hypothesis is called _____________
180. __________________ is the determination of the plan for conducting the research and as
such it involves the specification of approaches and procedures.
a.) Strategy
b.) Research Design
c.) Hypothesis
d.) Deductive
181. A proposal is also known as a:
a) Work plan
b) Prospectus
c) Outline
d) Draft plan
e) All of the above
182. Every research proposal, regardless of length should include two basic sections. They are:
a) Stimulus
b) Manipulated
c) Consequence
d) Presumed Cause
184. The following are the synonyms for dependent variable except
a) Presumed effect
b) Measured Outcome
c) Response
d) Predicted from…
185. Which of the following is not a characteristic of research?
187. What would NOT be a consideration during the research design stage?
a. The availability of literature
b. The availability of participants
c. The type of methods that would be used
d. The type of analysis that would take place
195. Which of the following are not normally a requirement for experimental research design?
a. Demonstrating co variation
b. Demonstrating time order
c. Demonstrating repeated measures
d. Demonstrating non spuriousness
DNYANSAGAR INSTITUTE OF MANAGEMENT AND RESEARCH
MCQ
Sr No   Question   Answer
1 Which of these measures are used to analyse the central tendency of data? B
a. Mean and Normal Distribution
b. Mean, Median and Mode
c. Mode, Alpha & Range
d. Standard Deviation, Range and Mean
e. Median, Range and Normal Distribution
2 Five numbers are given: (5, 10, 15, 5, 15). Now, what would be the sum of
deviations of individual data points from their mean? D
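The sum of deviations from the mean can be checked with a few lines of arithmetic. This sketch uses the five numbers from the question; the variable names are hypothetical.

```python
values = [5, 10, 15, 5, 15]

mean = sum(values) / len(values)            # (5 + 10 + 15 + 5 + 15) / 5 = 10
deviations = [v - mean for v in values]     # -5, 0, 5, -5, 5
total_deviation = sum(deviations)

print(mean, deviations, total_deviation)
```

The deviations cancel out, and the same holds for any data set: the sum of deviations of the observations from their arithmetic mean is always zero.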
includes _____________
a) Decision support b) Data mining c) OLAP
d) All of the mentioned
11
4. A good question is ---------------. It focuses on recall of only the material covered
in your lesson and aligns well with the overall learning objectives. B
5. a) relevant. b) clear c) concise d) purpose
12
6. A good question is framed in a -----------, easily understandable language,
without any vagueness. Students should understand what is wanted from the
question even when they don’t know the answer to it. A
7. a) clear b) relevant c) concise d) purpose
16 In 1991, R was created by Ross Ihaka and Robert Gentleman in the Department
of Statistics at the University of _________ D
a) John Hopkins
b) California
c) Harvard
d) Auckland
a) “a+b”
b) “a=b”
c) “a b :”
d) none of the mentioned
22 The __________ function returns a list of all the formal arguments of a function. A
a) formals()
b) funct()
c) formal()
d) fun()
+ for(i in seq_len(num)) {
+ cat(hello)
+ }
+ chars
+}
> f()
a)
Hello, world!
[1] 14
b) Hello, world!
[1] 15
c) Hello, world!
[1] 16
d) Error
25 You can check to see whether an R object is NULL with the _________ function. A
a) is.null()
b) is.nullobj()
c) null()
d) as.nullobj()
c) > args(pastebin)
d) > arg(bin)
a) “a+b”
b) “a=b”
c) “a:b”
d) a*b
27
What will be the output of the following R code snippet? A
+ print(a)
+ print(b)
+}
> f(45)
a) 32
b) 42
c) 52
28 d) 45
What will be the output of the following R code snippet? A
+ a^2
+}
> f(2)
a) 4
b) 3
c) 2
29 d) 5
Which of the following is a base package for R language? C
a) util
b) lang
c) tools
d) All of the above
30
R comes with a ________ to help you optimize your code and improve its performance. C
a) Debugger
b) Monitor
c) Profiler
d) None of the above
31
debug() flags a function for ______ mode in R. A
a) debug
b) run
c) compile
d) None of the above
32
______ suspends the execution of a function wherever it is called and puts the function
in debug mode. C
a) recover()
b) browser()
c) Both of the above
33
A matrix is a ___-dimensional rectangular data set. D
a) 5
b) 4
c) 3
d) 2
34
The _____ function takes a vector or other objects and splits it into groups determined
by a factor or list of factors. B
a) apply()
b) split()
c) isplit()
d) mapply()
35
lapply function takes___ arguments in R language C
a) 1
b) 3
c) 4
d) 5
36
____is used to apply a function over subsets of a vector d
37 a) apply()
b) lapply()
c) mapply()
d) tapply()
a)
a) apply()
b) lapply()
c) tapply()
d) mapply()
38
____function is same as lapply() in R C
a) apply()
b) lapply()
c) sapply()
d) tapply()
39
_______ loops over a list and evaluates a function on each element. B
a) apply()
b) lapply()
c) sapply()
d) tapply()
40
__________ is a proprietary tool for predictive analytics. B
a) R
b) SAS
c) SSAS
d) SPSS
41
Data frames can be converted to a matrix by calling data._______ C
a) matr()
b) mat()
c) matrix()
d) None of the above
42
Which of the following methods makes a vector of repeated values? A
a) rep()
43 b) data()
c) view()
d) None of the above
R objects can have attributes, which are like ________ for the object A
a) metadata
b) features
c) expressions
44
Attributes of an object (if any) can be accessed using the ______ function. C
a) objects()
b) attrib()
c) attributes()
45
_________ involves predicting a response with meaningful magnitude, such as quantity
sold, stock price, or return on investment. A
a) Regression
b) Clustering
c) Summarization
46
________ provides needed string operators in R C
a) str
b) forcast
c) stringr
47
______ splits a data frame and results in an array (hence the da). Hopefully, you’re
getting the idea here. B
a) apply
b) daply
c) stats
48
The system.time() function returns an object of class _______ which contains two useful bits
of information. C
a) debug_time
b) procedure_time
c) proc_time
49
Which of the following will start the R program? a
a) $ R
50 b) & R
c) Rb
Unit 2
The third step in decision making process is C
a linear predictions
b dependent predictions
c making predictions
1 d independent predictions
The decision making step, which consists of organization goals, predicting
alternatives and communicating goals is called C
a organization
b alternation
c planning
2 d valuing
The fourth step in decision making process is B
a linear correlation
b making decisions
c implement decisions
d evaluate performance
3
The costs that behave as irrelevant costs in the process of decision making are
classified as A
a past costs
b future costs
c expected costs
4 d sunk costs
Which of these is not a topic covered in a typical Business Analyst Aptitude
Test? D
a. Analytical Thinking c. Data Interpretation
5 b. Listening Skills d. Risk Management
If the test should be 30 minutes, Analytical Thinking is taken in how many
minutes? C
a. 5 c. 10
b. 7 d. 15
6
Primary objective of a business analyst is to help businesses implement B
a. Business systems
b. Business solutions
c. Technology systems
d. Technology solutions
7
Which business professional performs cost-benefit analyses of existing and
potential customers? C
a) Marketer
8 b) Financial Analyst
c) Business Analyst
d) Sales Representative
A Use Case is a set of steps, typically defining interactions between a role,
True or False? A
a. True
b. False
9
Any fact that the solution can assume to be true when the use case begins is
what? D
a. A win
b. A Failure
c. A success
d. A Precondition
10
A State Diagram is used for what? D
a. Which Events cause a transition between states
b. Which events cause a success between states
c. Allowable behaviour
11 d, All
A Solution Requirement comprises two types of requirements. What are
they? A
a, Functional
b. Hard
c. Existing
12 d. Non-Functional
Which of the following is used for Statistical analysis in R language? B
a) Studio
b) RStudio
c) Heck
13
R functionality is divided into a number of ________ A
a) Packages
b) Functions
c) Domains
14
Which of the following is an example of vectorized operation as far as subtraction is b
concerned?
a) x+y
15 b) x-y
c) x/y
d) x*y
> z <- x + y
>z
a) 7 9 11 13
b) 7 9 11 13 14
c) 9 11 13
d) Null
16
What would be the output of the following code? A
>x>2
17
What would be the value of the following expression? A
log(-1)
+ a <- 3
19 + x+a+y
c
+ ## 'y' is a free variable
+}
> g(2)
a) 8
b) 9
c) 42
d) Error
function(p) {
params[!fixed] <- p
mu <- params[1]
a <- -0.5*length(data)*log(2*pi*sigma^2)
-(a + b)
> ls(environment(nLL))
R is an__________programming language? C
a) Closed source
b) GPL
c) Open source
d) Definite source
22
Who developed R? A
a) Dennis Ritchie
b) John Chambers
c) Bjarne Stroustrup
23
R was named partly after the first names of____R authors? B
a) One
b) Two
c) Three
d) Four
24
Packages are useful in collecting sets into a_____unit ? C
a) Single
b) Multiple
25
Many quantitative analysts use R as their____tool? D
a) Leading tool
b) Programming tool
c) Both the above
26
Predictive analysis is the branch of __________analysis? A
a) Advanced
b) Core
c) Both the above
27
___________ is used to make predictions about unknown future events? B
a) Descriptive analysis
b) Predictive analysis
c) Both the above
28
How many steps does the predictive analysis process contained? d
29 a) 5
b) 6
c) 7
d) 8
a) Past
b) Present
c) Future
30
How many types of R objects are present in R data type? C
a) 4
b) 5
c) 6
d) 7
31
How many types of data types are present in R? A
a) 4
b) 5
c) 6
d) 7
32
Which of the following is a primary tool for debugging? B
a) debug()
b) trace()
c) browser()
d) None of the above
33
Which function is used to create the vector with more than one element? C
a) Library()
b) plot()
c) c()
d) par()
34
In R every operation has a ______ call? B
a) System
b) Function
c) None of the above
35
The ____________ in R is a vector. b
c) Both
a) 3
b) 4
c) 5
d) 6
39
How many types of vertices functions are present? B
a) 1
b) 2
c) 3
d) 4
40
_________and_________ are types of matrices functions? C
41
How many control statements are present in R? A
a) 6
b) 7
c) 8
d) 9
42
Which of the following finds the maximum value in the vector x, excluding missing c
values
a) rm(x)
43 b) all(x)
c) max(x, na.rm=TRUE)
d) x%in%y
a) a.x[rev(order(x$B)),]
b) b.x[ordersort(x$B),
c) c.x[order(x$B),]
44
_________ initiates an infinite loop right from the start. B
a) Never
b) Repeat
c) Break
d) Set
45
_______ is used to skip an iteration of a loop. A
a) Next
b) Skip
c) Group
46
47 _____ programming language is a dialect of S. A
a) B
b) C
c) D
d) S
48 In 1991, R was created by Ross Ihaka and Robert Gentleman in the Department of A
Statistics at the University of _________.
a) Auckland
b) Harvard
c) California
d) Johns Hopkins
a) 2000
b) 2005
c) 2010
d) 2012
50 R is technically much closer to the Scheme language than it is to the original _____ b
language.
a) B
b) S
c) C
d) C++
Unit-3
3
Which package contains the most fundamental functions to run R? C
a) root
b) child
c) base
4 d) parent
Which language is best for the statistical environment? B
a) C
b) R
c) Java
5 d) Python
In order to use the R-related functionality in Dundas BI, you must have D
access to an existing _________
a) Console
b) Terminal
c) Packages
6 d) R serve
The open source _________ software is available for Unix, Linux, and Windows A
7 platforms.
a) Rserve
b) BServe
c) CServe
d) Dserve
Modification in Dundas BI is done ______________ A
a) Directly
b) Indirectly
c) Need access to Server
8 d) Not known
Is It possible to inspect the source code of R? A
a) Yes
b) No
c) Can’t say
9 d) Some times
__________ function is used to watch for all available packages in library. D
a) lib()
b) fun.lib()
c) libr()
10 d) library()
The longer programs are called ____________ C
a) Files
b) Structures
c) Scripts
11 d) Data
Scripts will run on ___________________ A
a) Script Editors
b) Console
c) Terminal
12 d) GCC Compiler
What will be the output of the following R function? A
d) x is not a vector
What will be the output of the following R function? B
print( sqrt(2) )
a) 1.414314
b) 1.414214
c) Error
15 d) 14.1414
What will be the output of the following R function? C
d <- date()
a) Prints todays date
b) Prints some date
c) Prints exact present time and date
16 d) Error
Which of the following commands will correctly read the above csv file with B
5 rows in a dataframe?
A) csv(‘Dataframe.csv’)
B) csv(‘Dataframe.csv’,header=TRUE)
C) dataframe(‘Dataframe.csv’)
D) csv2(‘Dataframe.csv’,header=FALSE,sep=’,’)
17
R functionality is divided into a number of ________ A
a) Packages
b) Functions
c) Domains
18
Consider the following function. A
f <- function(x) {
  g <- function(y) {
    y + z
  }
  z <- 4
  x + g(x)
}
z <- 10
f(4)
19
A) 12
B) 7
C) 4
D) 16
The iris dataset has different species of flowers such as Setosa, Versicolor and B
Virginica with their sepal length. Now, we want to understand the
distribution of sepal length across all the species of flowers. One way to do
this is to visualise this relation through the graph shown below.
A) xyplot()
B) stripplot()
C) barchart()
D) bwplot()
20
The plot above is of type strip whereas the options a, c and d will produce a scatter, D
bar and box whisker plot respectively. Therefore, option B is the correct solution.
Alpha 125.5 0
Beta 235.6 1
Beta 212.03 0
Beta 211.30 0
Alpha 265.46 1
Beta 235.6 1
Beta 212.03 0
Beta 211.30 0
Alpha 265.46 1
C
A 10 Sam
B 20 Peter
C 30 Harry
D ! ?
E 50 Mark
A
Column 1 Column 2 Column 3
There are two dataframes stored Dataframe1 and Dataframe2 shown above.
26 Which of the following codes will produce the output shown below?
A 1000 25.5
B 2000 35.5
C 3000 45.5
D 4000 55.5
E 5000 65.5
F 6000 75.5
G 7000 85.5
H 8000 95.5
A) merge(dataframe[,1:3],dataframe2)
B) merge(dataframe1,dataframe2)[,1:3]
C) merge(dataframe1,dataframe2,all=TRUE)
D) Both 1 and 2
E) All of the above
e
V1 V2
1 121.5 461
2 516 1351
3 451 6918
4 613 112
5 112.36 230
6 25.23 1456
7 12 457
27
dataframe
A data set has been read in R and stored in a variable “dataframe”. Which of
the below codes will produce a summary (mean, mode, median) of the entire
dataset in a single line of code?
A) summary(dataframe)
B) stats(dataframe)
C) summarize(dataframe)
D) summarise(dataframe)
E) None of the above
D
A dataset has been read in R and stored in a variable “dataframe”. Missing
values have been read as NA.
A 10 Sam
B NA Peter
C 30 Harry
D 40 NA
E 50 Mark
dataframeWhich of the following codes will not give the number of missing
values in each column?
A) colSums(is.na(dataframe))
B) apply(is.na(dataframe),2,sum)
C) sapply(dataframe,function(x) sum(is.na(x)))
D) table(is.na(dataframe))
28
One of the important phase in a Data Analytics pipeline is univariate analysis D
of the features which includes checking for the missing values and the
distribution, etc. Below is a dataset and we wish to plot histogram
for “Value” variable.
Parameter State Value Dependents
Alpha Active 50 2
Beta Active 45 5
Beta Passive 25 0
Alpha Passive 21 0
29
Alpha Passive 26 1
Beta Active 30 2
Beta Passive 18 0
dataframed
Which of the following commands will help us perform that task ?
A) hist(dataframed$Value)
B) ggplot2::qplot(dataframed$Value,geom=”Histogram”)
C)ggplot2::ggplot(data=dataframed,aes(dataframe$Value))+geom_histogram()
D) All of the above
D
Parameter State Value Usage
Alpha Active 50 0
Beta Active 45 1
Beta Passive 25 0
Alpha Passive 21 0
Alpha Passive 26 1
Beta Active 30 1
Beta Passive 18 0
Certain Algorithms like XGBOOST work only with numerical data. In that case,
categorical variables present in dataset are first converted to DUMMY
variables which represent the presence or absence of a level of a categorical
variable in the dataset. For example After creating the Dummy Variable for
the feature “Parameter”, the dataset looks like below.
Parameter_Alph
Parameter_Beta State Value Usage
a
1 0 Active 50 0
0 1 Active 45 1
30
0 1 Passive 25 0
1 0 Passive 21 0
1 0 Passive 26 1
0 1 Active 30 1d
0 1 Passive 18 0d
d
Column1 Column2 Column3 Column4 Column5 Column6
Name8 Alpha 42 84 54 0 Mu
Dataframe
We wish to calculate the correlation between “Column2” and “Column3” of a
“dataframe”. Which of the below codes will achieve the purpose?
A) corr(dataframe$column2,dataframe$column3)
B)
(cov(dataframe$column2,dataframe$column3))/(var(dataframe$column2)*sd(dat
aframe$column3))
C)
(sum(dataframe$Column2*dataframe$Column3)-
(sum(dataframe$Column2)*sum(dataframe$Column3)/nrow(dataframe))
)/(sqrt((sum(dataframe$Column2*dataframe$Column2)-
(sum(dataframe$Column2)^3)/nrow(dataframe))* (sum(dataframe$Column3*d
ataframe$Column3)-(sum(dataframe$Column3)^2)/nrow(dataframe))))
D) None of the Above
D
Parameter State Value Dependents
Alpha Active 50 2
Beta Active 45 5
Beta Passive 25 0
Alpha Passive 21 0
Alpha Passive 26 1
Beta Active 30 2
Beta Passive 18 0
Dataframe
The above dataset has been loaded for you in R in a variable
named “dataframe” with first row representing the column name. Which of
the following code will select only the rows for which parameter is Alpha?
A) subset(dataframe, Parameter=’Alpha’)
B) subset(dataframe, Parameter==’Alpha’)
C) filter(dataframe,Parameter==’Alpha’)
32 D) Both 2 and 3
E) All of the above
15) Which of the following function is used to view the dataset in spreadsheet B
like format?
A) disp()
B) View()
C) seq()
33 D) All of the Above
B
The below dataframe is stored in a variable named data.
A B
1 Right
2 Wrong
3 Wrong
4 Right
5 Right
6 Wrong
7 Wrong
8 Right
Data
Suppose B is a categorical variable and we wish to draw a boxplot for every
level of the categorical level. Which of the below commands will help us
achieve that?
A) boxplot(A,B,data=data)
B) boxplot(A~B,data=data)
C) boxplot(A|B,data=data)
D) None of the above
34
Which of the following commands will split the plotting window into 4 X 3 B
windows and where the plots enter the window column wise.
A) par(split=c(4,3))
B) par(mfcol=c(4,3))
C) par(mfrow=c(4,3))
D) par(col=c(4,3))
35
A Dataframe “df” has the following data: D
Dates
2017-02-28
2017-02-27
2017-02-26
2017-02-25
2017-02-24
36 2017-02-23
2017-02-22
2017-02-21
After reading above data, we want the following output:
Dates
28 Tuesday Feb 17
27 Monday Feb 17
26 Sunday Feb 17
25 Saturday Feb 17
24 Friday Feb 17
23 Thursday Feb 17
22 Wednesday Feb 17
21 Tuesday Feb 17
Name8 Alpha 42 84 54 0 Mu
Table
Which of the following commands will select all the rows from column 3 to
column 6 for the below dataframe named table?
A) dplyr::select(table,Column3:Column6)
B) table[,3:6]
C) subset(table,select=c(‘Column3’,’Column4’,’Column5’,’Column6’))
D) All of the above
C
Column1 Column2 Column3 Column4 Column5 Column6
Name8 Alpha 42 84 54 0 Mu
table
Which of the following commands will select the rows having “Alpha” values
in “Column1” and value less than 50 in “Column4”? The dataframe is stored in
a variable named table.
A) dplyr::filter(table,Column1==’Alpha’, Column4<50)
B) dplyr::filter(table,Column1==’Alpha’ & Column4<50)
C) Both of the above
D) None of the above
C
Column1 Column2 Column3 Column4 Column5 Column6
Name8 Alpha 42 84 54 0 Mu
Table
Which of the following code will sort the dataframe based on “Column2” in
ascending order and “Column3” in descending order?
A) dplyr::arrange(table,desc(Column3),Column2)
B) table[order(-Column3,Column2),]
C) Both of the above
42 D) None of the above
What will be the output of the following command B
grepl(“neeraj”,c(“dheeraj”,”Neeraj”,”neeraj”,”is”,”NEERAJ”))
43 A) [FALSE TRUE TRUE FALSE TRUE]
A 10 15
B 20 15
A 30 35
Output dataframe
Grade Sex Count
A Male 10
A Female 15
B Male 30
B Female 15
A Male 30
A Female 35
A) tidyr::Gather(maverick, Sex,Count,-Grade)
B) tidyr::spread(maverick, Sex,Count,-Grade
C) tidyr::collect(maverick, Sex,Count,-Grade)
D) None of the above
50 Which of the following command will help us to replace every instance of C
Delhi with Delhi_NCR in the following character vector?
C<-c(“Delhi is”,”a great city.”,”Delhi is also”,”the capital of India.”)
A) gsub(“Delhi”,”Delhi_NCR”,C)
B) sub(“Delhi”,”Delhi_NCR”,C)
C) Both of the above
D) None of the above
Unit -4
C
1. R has how many atomic classes of objects?
a) 1
b) 2
c) 3
1 d) 5
2 Point out the correct statement? D
print(varx+vary)
a. 57
b. 2334
c. 3423
d. 66
6
find the output B
varx<-23, 34->vary
print(varx == vary)
a. True
b. False
c. None of the above
d. Error
7
Below, we have represented six data points on a scale where vertical lines on scale C
represent unit. Which of the following line represents the mean of the given data
points, where the scale is divided into same units?
8 A) A B) B C) C D) D
If a positively skewed distribution has a median of 50, which of the following E
statement is true?
A) Mean is greater than 50
9 B) Mean is less than 50
Studies show that listening to music while studying can improve your memory. To B
demonstrate this, a researcher obtains a sample of 36 college students and gives them
a standard memory test while they listen to some background music. Under normal
circumstances (without music), the mean score obtained was 25 and standard
deviation is 6. The mean score for the sample after the experiment (i.e. with music) is 28.
After performing the Z-test, what can we conclude?
A) Listening to music does not improve memory.
B)Listening to music significantly improves memory at p
C) The information is insufficient for any conclusion.
16 D) None of the above
A researcher concludes from his analysis that a placebo cures AIDS. What type of error D
is he making?
A) Type 1 error
B) Type 2 error
C) None of these. The researcher is not making an error.
D) Cannot be determined
17
What happens to the confidence interval when we introduce some outliers to the data? B
A) Confidence interval is robust to outliers
B) Confidence interval will increase with the introduction of outliers.
C) Confidence interval will decrease with the introduction of outliers.
18 D) We cannot determine the confidence interval in this case
B
A medical doctor wants to reduce blood sugar level of all his patients by altering their
diet. He finds that the mean sugar level of all patients is 180 with a standard deviation
of 18. Nine of his patients start dieting and the mean of the sample is observed to 175.
Now, he is considering to recommend all his patients to go on a diet.
Note: He calculates 99% confidence interval.
What is the standard error of the mean?
A) 9
B) 6
C) 7.5
19 D) 18
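The standard error in the question above follows directly from the stated figures: a population standard deviation of 18 and a sample of 9 patients. A minimal Python sketch of that arithmetic (the variable names are illustrative and not part of the source):

```python
import math

# Figures taken from the question above
population_sd = 18.0   # standard deviation of all patients
sample_size = 9        # patients who started dieting

# Standard error of the mean: sigma / sqrt(n)
standard_error = population_sd / math.sqrt(sample_size)
print(standard_error)  # 6.0
```

The sample mean of 175 plays no part in the standard error; it would only be needed to centre the confidence interval.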
--------------is function in R to get number of observation in a data frame D
a) n( )
b) ncol( )
c) nobs( )
d) nrow( )
20
a. The freedom to study how the program works, and adapt it to your needs.
b. The freedom to improve the program, and release your improvements to
the public, so that the whole community benefits.
c. The freedom to run the program, for any purpose.
d. The freedom to sell the software for any price.
22
Point out the correct statement : C
a) Blocks are evaluated until a new line is entered after the closing brace
b) Single statements are evaluated when a new line is typed at the start of the
syntactically complete statement
c) The if/else statement conditionally evaluates two statements
d) All of the mentioned
23
Which will be the output of following code ? C
x <- 3
switch(6, 2+2, mean(1:10), rnorm(5))
a) 10
b) 1
c) NULL
d) All of the mentioned
24
_______ is used to continue an iteration of a loop. A
A. next
B. skip
C. group
25
Point out the correct statement : D
has occurred
d) All of the mentioned
a) debug()
b) trace()
c) browser()
d) All of the mentioned
27
Point out the correct statement : A
a) function()
b) funct()
c) functions()
d) All of the mentioned
29
The __________ function returns a list of all the formal arguments of a function A
a) formals()
b) funct()
c) formal()
d) All of the mentioned
30
Which of the following is multivariate version of lapply ? D
a) apply()
b) lapply()
c) sapply()
d) mapply()
31
Point out the correct statement : C
a) split() takes elements of the list and passes them as the first argument of the
function you are applying
b) You can use tsplit() to evaluate a function single time each with a same
32 argument
a) formal
b) function
c) reflective
d) All of the mentioned
33
The _________ function is used to plot negative likelihood. A
a) plot()
b) graph()
c) graph.plot()
d) None of the mentioned
34
Unit-5
a) A speech
b) A movie
c) A picture
d) The rain on our face
2
What area represents information in a graphical or pictorial form? C
a) Data design
b) None of the answers are correct.
c) Information design
d) Data visualization
3
Which of the following is an example of a temporal data visualization? d
B NA
C 30
D 40
E 50
Which of the following commands will create a column named “missing” with
value 1 where variable “Feature2” has missing values?
Feature1 Feature2 Missing
B NA 1
C 30 0
D 40 0
21
E 50 0
A)
dataframe$missing<-0
dataframe$Missing[is.na(dataframe$Feature2)]<-1
B)
dataframe$missing<-0
dataframe$Missing[which(is.na(dataframe$Feature2))]<-1
C) Both of the above
D) None of the above
Suppose there are 2 dataframes “A” and “B”. A has 34 rows and B has 46 rows. C
What will be the number of rows in the resultant dataframe after running the
following command?
merge(A,B,all.x=TRUE)
A) 46
B) 12
C) 34
D) 80
22
The very first thing that a Data Scientist generally does after loading dataset is C
find out the number of rows and columns the dataset has. In technical terms, it is
called knowing the dimensions of the dataset. This is done to get an idea about
the scale of data that he is dealing with and subsequently choosing the right
techniques and tools.
Which of the following command will not help us to view the dimensions of our
dataset?
A) dim()
B) str()
C) View()
D) None of the above
23
C
Sometimes, we face a situation where we have two columns of a dataset and we
wish to know which elements of the column are not present in another column.
This is easily achieved in R using the setdiff command.
Column1 Column2 Column3 Column4 Column5 Column6
Name8 Alpha 42 84 54 0 Mu
Dataframe
What will be the output of the following command?
setdiff(dataframe$Column1,dataframe$Column6)==setdiff(dataframe$Column6,datafr
ame$Column1)
A) TRUE
B)FALSE
C) Can’t Say
B
The below dataset is stored in a variable called “frame”.
A B
alpha 100
beta 120
gamma 80
delta 110
Which of the following commands will create a bar plot for the above dataset.
Use the values from Column B to represent the height of the bar plot.
A) ggplot(frame,aes(A,B))+geom_bar(stat=”identity”)
B) ggplot(frame,aes(A,B))+geom_bar(stat=”bin”)
C) ggplot(frame,aes(A,B))+geom_bar()
25 D) None of the above
A
mp dis dra qse gea car
A cyl hp wt vs am
g p t c r b
RX4 0 0 0 6
Hornet
18. 3.1 3.44 17.0
Sportabo 8 360 175 0 0 3 2
7 5 0 2
ut
We wish to create a stacked bar chart for cyl variable with stacking criteria
Being vs Variable. Which of the following commands will help us perform this
action?
A)qplot(factor(cyl),data=mtcars,geom=”bar”,fill=factor(vs)
B) ggplot(mtcars,aes(factor(cyl),fill=factor(vs)))+geom_bar()
C) All of the above
D) None of the above
a) 36
b) 48
c) 29
d) 41
29
The number of accidents in a city during 2010 is A
a) Discrete variable
b) Continuous variable
c) Qualitative variable
30 d) Constant
The mean of a distribution is 23, the median is 24, and the mode is 25.5. It is most likely D
that this distribution is:
a) Positively Skewed
b) Symmetrical
c) Asymptotic
31 d) Negatively Skewed
Data collected by NADRA to issue computerized identity cards (CICs) are C
a) Unofficial data
b) Qualitative data
c) Secondary data
d) Primary data
32 e) None of these
Sum of dots when two dice are rolled is A
a) A discrete variable
b) A continuous variable
c) A constant
33 d) A qualitative variable
A chance variation in an observational process is C
a) Dispersion/ Variability
b) Measurement error
c) Random error
34 d) Instrument error
If a distribution is abnormally tall and peaked, then it can be said that the distribution is: A
a) Leptokurtic
b) Pyrokurtic
c) Platykurtic
35 d) Mesokurtic
The mean of a distribution is 14 and the standard deviation is 5. What is the value of C
the coefficient of variation?
a) 60.4%
b) 48.3%
c) 35.7%
36 d) 27.8%
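The coefficient of variation is the standard deviation expressed as a percentage of the mean. A short Python check of the figures in the question above (variable names are illustrative):

```python
# Figures taken from the question above
mean = 14.0
standard_deviation = 5.0

# Coefficient of variation = (standard deviation / mean) * 100
cv_percent = standard_deviation / mean * 100
print(round(cv_percent, 1))  # 35.7
```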
The first hand and unorganized form of data is called C
a) Secondary data
37 b) Organized data
c) Primary data
d) None of these
Questionnaire survey method is used to collect
a) Secondary data
b) Qualitative variable
c) Primary data
38 d) None of these
The data which have already been collected by someone are called C
a) Raw data
b) Array data
c) Secondary data
39 d) Fictitious data
The grouped data is also called C
a) Raw data
b) Primary data
c) Secondary data
40 d) Qualitative data
A constant variable can take values B
a) Zero
b) Fixed
c) Not fixed
41 d) Nothing
A parameter is a measure which is computed from A
a) Population data
b) Sample data
c) Test statistics
42 d) None of these
According to the empirical rule, approximately what percent of the data should lie B
within $\mu \pm \sigma$?
a) 75%
b) 68%
c) 99.7%
d) 90%
43 e) 95%
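The empirical rule describes a normal distribution, so its percentages can be reproduced from the standard normal CDF. The sketch below assumes normally distributed data and uses only the standard library; the helper name prob_within is hypothetical.

```python
import math

def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

# Share of the data within 1, 2 and 3 standard deviations of the mean
for k in (1, 2, 3):
    print(k, round(prob_within(k) * 100, 1))
```

For k = 1 this gives roughly 68%, for k = 2 roughly 95%, and for k = 3 roughly 99.7%.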
Primary data and _____________ data are same C
a) Grouped
b) Secondary data
c) Ungrouped
44 d) None of these
Which one of the following measurement does not divide a set of observations into B
equal parts?
a) Quartiles
b) Standard Deviations
c) Percentiles
d) Deciles
45 e) Median
In descriptive statistics, we study B
a) The description of the decision-making process
46 b) The methods for organizing, displaying and describing data
Question 2 of 9
What type of variable can be used to capture fixed effects?
Question 3 of 9
What type of data should the Y variable be in a binary regression?
(A) Discrete
(B) Random
(C) Continuous
(D) none of these answers
Question 4 of 9
Fixed effects regressions help to deal with what problem?
Question 5 of 9
Which statistic offers a bounds on our estimate of the impact of an X
variable on the Y variable?
(A) T-statistic
(B) R-squared
(C) 95% confidence interval
(D) P-value
Question 6 of 9
What is one type of time series forecasting?
(A) Regressions
(B) The Delphi Method
(C) Exponential Smoothing
(D) Surveys
Question 7 of 9
What is the term for the estimate of the impact an X variable have on
the Y variable?
(A) Coefficient
(B) R-squared
(C) Standard Error
(D) P-value
Question 8 of 9
What is one type of causal forecasting?
Question 9 of 9
What is one good source of free data?
Question 2 of 7
Which key will create an absolute instead of a relative cell reference?
(A) Ctrl+A
(B) Esc
(C) F4
(D) F1
Question 3 of 7
What is the future value for a fully amortized loan?
(A) zero
(B) the value of the interest only
(C) the principal plus interest
(D) the outstanding principal
Question 4 of 7
Which Excel formula can take up to five arguments and then calculate
the future value of an investment?
(A) RECEIVED
(B) PRICEMAT
(C) FV
(D) FVSCHEDULE
Question 5 of 7
What is the formula for calculating the value of a perpetuity?
Question 6 of 7
Since the GEOMEAN formula does not accept negative numbers, how
can you use it despite having some negative growth rates?
(A) Subtract 1 from the growth rate, then add 1 to the result after using the
GEOMEAN formula.
(B) Divide the growth rate by -1, then multiply the result by -1 after using
the GEOMEAN formula.
(C) Add 1 to the growth rate, then subtract 1 to the result after using
the GEOMEAN formula.
(D) Multiply the growth rate by -1 to make it positive, then divide the result
by -1 after using the GEOMEAN formula.
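The add-1-then-subtract-1 approach works because a geometric mean is only defined for positive numbers: each growth rate is first turned into a growth factor. The Python sketch below mirrors that idea with statistics.geometric_mean (Python 3.8+) standing in for Excel's GEOMEAN; the growth rates are made-up illustrative values.

```python
from statistics import geometric_mean

# Hypothetical yearly growth rates, one of them negative
growth_rates = [0.10, -0.05, 0.20]

# Add 1 so every value becomes a positive growth factor
growth_factors = [1 + rate for rate in growth_rates]

# Average the factors geometrically, then subtract 1 to get back to a rate
average_growth = geometric_mean(growth_factors) - 1
print(f"{average_growth:.2%}")
```

Compounding the resulting average rate over three periods gives the same total growth as the three original rates.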
Question 7 of 7
If you invested $10,000 with an annual compound interest rate of 5
percent, how much will it be worth after 10 years?
(A) 16289
(B) 11365
(C) 10761
(D) 15500
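The figure can be checked with the compound-interest formula FV = P(1 + r)^n. A minimal Python sketch using the numbers from the question and assuming interest is compounded once per year:

```python
# Figures taken from the question above
principal = 10_000      # initial investment in dollars
annual_rate = 0.05      # 5 percent, compounded once per year (assumption)
years = 10

# Future value with annual compounding: P * (1 + r) ** n
future_value = principal * (1 + annual_rate) ** years
print(round(future_value))  # 16289
```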
QUESTION 1 TO 13
Question 1 of 13
James has a small company and is looking to get a loan from the bank.
How will the bank deduce the company’s cash flow?
(A) By using three years of profit and loss reports to create a statement of
cash flows.
(B) By using the income statement alone to create a statement of cash flows.
Attempted correct option
(C) By using the balance sheet and income statement to create a
statement of cash flows.
(D) By using the balance sheet alone to create a statement of cash flows.
Question 2 of 13
You have worked on your balance sheet to figure out how to finance
an expansion plan. Which of the following would be the most realistic
plug figure to use?
Question 3 of 13
Which of the following tells you whether all of your forecasts,
assumptions, marketing plans, and operating plans are internally
consistent, practical, and achievable?
Question 4 of 13
Padere’s preliminary assumptions in her income statement is that
there will be an overall increase in expenses next year. What is the next
logical step for Padere before presenting to her COO?
Question 5 of 13
Company A is significantly above the average value compared to other
companies in the industry. You predict that next year the company will
likely deliver to a number closer to the other companies. This is known
as _.
Question 6 of 13
What should you consider when forecasting the level of property,
plant, and equipment, and the associated amount of depreciation
expense?
Question 7 of 13
In a forecasted income statement, the amount of a company’s _ is
determined by how much property, plant, and equipment the company
has.
(A) inventory
(B) depreciation expense
(C) assets
(D) accounts receivable
Question 8 of 13
Derrick is creating a constructed forecasted financial statement. Which
of the following would inhibit a more accurate statement?
Question 9 of 13
As you put together a forecasting model, it is important to remember
that sales forecasting is _.
Question 10 of 13
Kiko is creating her sales forecast. Which of the following factors is the
starting point for Kiko to view a forecast of future sales?
(A) change in the competitive environment
(B) historical trend
(C) impact of current plans
(D) impact of a marketing plan
Question 11 of 13
The most important and recommended starting point of any financial
model exercise is the _
Question 12 of 13
As an investor, you look to the DCF of a company before deciding to
invest. Which of the following questions would lead you to use a
higher interest rate in your analysis?
Question 13 of 13
Analyzing the _ performance of a company allows financial statement
users to understand _ performance of a company.
a + 5 - b is ________
A. terms, a group
B. operators, a statement
C. operands, an expression
D. operands, an equation
Answer : C
Explanation: The objects that operators act on are called
operands. A combination of operators and operands is
called an expression. So, option C is correct.
A. X^y
B. X**y
C. X^^y
D. None of the mentioned
Answer : B
Explanation: In python, power operator is x**y i.e.
2**5=32.
a = [10, 20]
b=a
b += [30, 40]
print(a)
print(b)
Answer : A
Explanation: Because b and a refer to the same list
object, the augmented assignment += on b extends that
list in place, so the change is visible through both a and b.
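The aliasing behaviour can be seen by comparing += with plain + on a list. This is a small illustrative sketch; the names c and d are additions and not part of the original question.

```python
a = [10, 20]
b = a              # b is a second name for the same list object
b += [30, 40]      # extends that list in place
print(a)           # [10, 20, 30, 40]
print(a is b)      # True

c = [10, 20]
d = c
d = d + [30, 40]   # builds a new list and rebinds d; c is untouched
print(c)           # [10, 20]
print(d)           # [10, 20, 30, 40]
```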
A. more()
B. gt()
C. ge()
D. None of the above
Answer : D
Explanation: The __rshift__() method overloads the >> operator.
Answer : B
Explanation: The result of standard division is always
float. The value of 100 // 25 (integer division) is 4.
A. //
B. /
C. %
D. None of the above
Answer : A
Explanation: The // operator performs floor division: it
discards the fractional part of the quotient, so 5 // 2
is 2. In Python 3 the / operator always performs true
division and returns a float, so 5 / 2 is 2.5, even when
both operands are integers. (In Python 2, / applied to
two integers also floored the result.)
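The difference between the division operators can be checked directly. The sketch below assumes Python 3; the variable names are illustrative.

```python
true_quotient = 5 / 2       # true division always returns a float
floor_quotient = 5 // 2     # floor division rounds down to an integer
negative_floor = -5 // 2    # rounds toward negative infinity, not toward zero
remainder = 5 % 2           # modulo gives the remainder of floor division

print(true_quotient)    # 2.5
print(floor_quotient)   # 2
print(negative_floor)   # -3
print(remainder)        # 1
print(100 / 25)         # 4.0
print(100 // 25)        # 4
```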
b = a -= 2
print(b)
A. 8
B. 10
C. Syntax Error
D. No error but no output too
Answer : C
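Explanation: b = a -= 2 is invalid. An augmented
assignment such as a -= 2 is a statement, not an
expression, so it cannot appear on the right-hand side
of another assignment.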
A. ||
B. |
C. //
D. /
Answer : B
Explanation: The __or__() method overloads the bitwise OR
operator “|”.
Answer : A
Explanation: Internal representation of float objects is
not precise, so they can’t be relied on to equal exactly
what you think they will:
>>> 1.1 + 2.2 == 3.3
False
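When two floats need to be compared, a tolerance-based comparison is the usual workaround. A minimal sketch using math.isclose from the standard library (the variable name total is illustrative):

```python
import math

total = 1.1 + 2.2

print(total == 3.3)              # False: binary floats cannot store these values exactly
print(math.isclose(total, 3.3))  # True: equal within a small relative tolerance
```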
A. ii,i,iii,iv,v,vi
B. ii,i,iv,iii,v,vi
C. i,ii,iii,iv,vi,v
D. i,ii,iii,iv,v,vi
Answer : A
Explanation: For
x =6
y=2
print(x ** y)
print(x // y)
A. 66
0
B. 36
0
C. 66
3
D. 36
3
i =0
while i < 3:
print i
i++
print i+1
A. 021 324
B. 012 345
C. Error
D. 102 435
Answer : C
Explanation: Python does not support the ‘++’ increment
operator, so i++ is a syntax error; use i += 1 instead.
a = 100
b = 200
A. True
B. 0
C. False
D. 200
E. 100
Answer : D
Explanation: None
14. Operators with the same precedence are evaluated
in which manner?
A. Left to Right
B. Right to Left
C. Can’t say
D. None of the mentioned
Answer : A
Explanation: None
A. not
B. &
C. *
D. +
Answer : C
Explanation: None
A. int
B. bool
C. void
D. None
Answer : D
Explanation: Python explicitly defines the None object
that is returned if no value is specified.
x = -100
A. Yes
B. No
C. void
D. None
Answer : B
Explanation: In the highlighted line, x > 0 is False. The
expression is already known to be falsy at that point.
Due to short-circuit evaluation, sqrt(x) (which would
raise an exception) is not evaluated.
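The short-circuit behaviour described above can be sketched as follows. Only x = -100 survives from the original snippet, so the condition used here (x > 0 and math.sqrt(x) > 5) is an assumed reconstruction for illustration.

```python
import math

x = -100

# Assumed condition: the right-hand side is skipped because x > 0 is already False
if x > 0 and math.sqrt(x) > 5:
    result = "Yes"
else:
    result = "No"
print(result)  # No

# Evaluating the square root on its own does raise an exception
try:
    math.sqrt(x)
except ValueError as error:
    print("math.sqrt raised:", error)
```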
Answer : A
Explanation: “Addition and Subtraction” are at the
same precedence level. Similarly, “Multiplication and
Division” are at the same precedence level. However,
Multiplication and Division operators are at a higher
precedence level than Addition and Subtraction
operators.
Answer : B
Explanation: If we pass a zero value to the bool()
constructor, it will treat it as false. Any non-zero value
is true.
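A quick check of how bool() treats numeric values (the sample values are illustrative):

```python
values = [0, 0.0, 1, -3, 2.5]

# bool() maps zero to False and every non-zero number to True
results = [bool(value) for value in values]

for value, truth in zip(values, results):
    print(repr(value), truth)
```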
Answer : B
1.
Which of the following operator takes only integer operands?
A. +
B. *
C. /
D. %
E. None of these
Answer: Option D
Solution:
The % (modulus) operator requires both operands to be of integer type; +, * and / also accept
floating-point operands.
2.
In an expression involving || operator, evaluation
I. Will be stopped if one of its components evaluates to false
II. Will be stopped if one of its components evaluates to true
III. Takes place from right to left
IV. Takes place from left to right
A. I and II
B. I and III
C. II and III
D. II and IV
E. III and IV
Answer: Option D
3.
Determine output:
void main()
{
    int i=0, j=1, k=2, m;
    m = i++ || j++ || k++;
    printf("%d %d %d %d", m, i, j, k);
}
A. 1123
B. 1122
C. 0122
D. 0123
E. None of these
Answer: Option B
Solution:
In an expression involving || operator, evaluation takes place from left to right and will be stopped
if one of its components evaluates to true(a non zero value).
A. 1
B. -2
C. 2
D. Error
Answer: Option C
Solution:
Here the unary minus (negation) operator is used twice. The same rule as in mathematics applies,
i.e. minus * minus = plus.
Note: you cannot write --2, because the -- operator can be applied only to variables, as a
decrement operator (e.g. i--); 2 is a constant, not a variable.
5.
Determine output:
void main()
{
    int i=10;
    i = !i>14;
    printf("i=%d", i);
}
A. 10
B. 14
C. 0
D. 1
E. None of these
Answer: Option C
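Solution:
The ! operator has higher precedence than >, so the expression is evaluated as (!i) > 14.
Since i is 10, !i is 0, and 0 > 14 is false, so i becomes 0.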
6.
In C programming language, which of the following type of
operators have the highest precedence
A. Relational operators
B. Equality operators
C. Logical operators
D. Arithmetic operators
Answer: Option D
7.
What will be the output of the following program?
void main()
{
    int a, b, c, d;
    a = 3;
    b = 5;
    c = a, b;
    d = (a, b);
    printf("c=%d d=%d", c, d);
}
A. c=3 d=3
B. c=3 d=5
C. c=5 d=3
D. c=5 d=5
Answer: Option B
Solution:
The comma operator evaluates both of its operands and produces the value of the second. It also
has lower precedence than assignment. Hence c = a, b is equivalent to c = a, while d = (a, b) is
equivalent to d = b.
8.
Which of the following comments about the ++ operator are
correct?
A. It is a unary operator
Answer: Option E
9.
What will be the output of this program on an implementation
where int occupies 2 bytes?
#include <stdio.h>
void main()
{
int i = 3;
int j;
j = sizeof(++i + ++i);
printf("i=%d j=%d", i, j);}
A. i=4 j=2
B. i=3 j=2
C. i=5 j=2
10.
Which operator has the lowest priority?
A. ++
B. %
C. +
D. ||
E. &&
Answer: Option D
Solution:
Of the operators listed, || has the lowest precedence; it ranks just below &&.
11.
What will be the output?
void main(){
int a=10, b=20;
char x=1, y=0;
if(a,b,x,y){
printf("EXAM");
}
}
A. XAM is printed
B. exam is printed
C. Compiler Error
D. Nothing is printed
Answer: Option D
12.
What number will z in the sample code given below?
int z, x=5, y= -10, a=4, b=2;
z = x++ - --y*b/a;
A. 5
B. 6
C. 9
D. 10
E. 11
Answer: Option D
Solution:
C Operator Precedence Table
According to the precedence table, the operators are applied as follows:
1. x++ (postfix operator): the expression yields 5, and x becomes 6 afterwards.
2. --y (prefix operator): y becomes -11, and the expression yields -11.
3. * and / have the same priority, so they are applied according to their associativity, i.e. left to
right: the multiplication -11 * 2 = -22 comes first, then the division -22 / 4 = -5 (integer division truncates toward zero).
4. - (subtraction): 5 - (-5) = 10.
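The evaluation can be reproduced with a short sketch. The helper name `precedence_demo` and its output parameters are assumptions introduced here for illustration, and the result relies on C99 or later, where integer division truncates toward zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: evaluates z = x++ - --y * b / a for the
   question's starting values and reports x and y afterwards. */
int precedence_demo(int *x_after, int *y_after)
{
    assert(x_after != NULL && y_after != NULL);
    int z, x = 5, y = -10, a = 4, b = 2;

    /* Grouped as x++ - (((--y) * b) / a):
       x++ yields 5, --y yields -11, -11 * 2 = -22,
       -22 / 4 = -5 (truncation toward zero),
       and 5 - (-5) = 10. */
    z = x++ - --y * b / a;

    *x_after = x;  /* 6 */
    *y_after = y;  /* -11 */
    return z;
}
```

The expression is well defined because x and y are distinct objects and each is modified only once.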
13.
What is the output of the following statements?
int i = 0;
printf("%d %d", i, i++);
A. 01
B. 10
C. 00
D. 11
E. None of these
Answer: Option B
Solution:
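The C standard does not specify the order in which function arguments are evaluated, and reading i
while i++ modifies it in the same call is undefined behavior, so the output is not guaranteed.
The expected answer assumes a compiler that evaluates the arguments from right to left:
i++ is evaluated first and yields 0 (post-increment), after which i becomes 1.
The first argument i is then evaluated as 1, so the program prints 1 0.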
14.
What is the output of the following statements?
int b=15, c=5, d=8, e=8, a;
a = b>c ? c>d ? 12 : d>e ? 13 : 14 : 15;
printf("%d", a);
A. 13
B. 14
C. 15
D. 12
E. Garbage Value
Answer: Option B
Solution:
Expression
a = b>c ? c>d ? 12 : d>e ? 13 : 14 : 15;
can be rewritten as
if (b > c) {
    if (c > d)
        a = 12;
    else if (d > e)
        a = 13;
    else
        a = 14;
} else {
    a = 15;
}
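The equivalence of the conditional expression and its if/else form can be checked with a small sketch. The function names `nested_ternary`, `nested_if` and `check_question_values` are assumptions introduced here for illustration.

```c
#include <assert.h>

/* Hypothetical helper: the conditional expression from the question. */
int nested_ternary(int b, int c, int d, int e)
{
    return b > c ? c > d ? 12 : d > e ? 13 : 14 : 15;
}

/* Hypothetical helper: the same logic written with if/else. */
int nested_if(int b, int c, int d, int e)
{
    int a;
    if (b > c) {
        if (c > d)
            a = 12;
        else if (d > e)
            a = 13;
        else
            a = 14;
    } else {
        a = 15;
    }
    return a;
}

/* Checks that both forms agree for the question's values. */
void check_question_values(void)
{
    /* b=15, c=5, d=8, e=8: b>c holds, c>d fails, d>e fails, so a = 14. */
    assert(nested_ternary(15, 5, 8, 8) == nested_if(15, 5, 8, 8));
}
```

The ?: operator associates right to left, which is why the inner conditionals group under the first branch without parentheses.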
15.
What will be the output of the following code fragment?
void main()
{
printf("%x",-1<<4);}
A. fff0
B. fff1
C. fff2
D. fff3
E. fff4
Answer: Option A
1. Which of the following is not a compound assignment operator?
a) /=
b) +=
c) %=
d) ==
ANSWER: d) ==
Y = 5;
if (! Y > 10)
X = Y + 3;
else
X = Y + 10;
num = 5;
printf("%d", ++num++);
a) ++p
b) P++
c) Both
d) P+1
ANSWER: c) Both
double X; X = ( 2 + 3) * 2 + 3;
a) 10
b) 13
c) 25
d) 28
ANSWER: b) 13
main()
{
int x = 5, y = -10, z;
int a = 4, b = 2;
z = x+++++y * b/a;
}
a) -2
b) 0
c) 1
d) 2
ANSWER: c) 1
a) 4 10
b) 3 11
c) 3 10
d) 4 11
ANSWER: a) 4 10
a) – 2 1
b) 6 5
c) 4 5
d) 5 5
ANSWER: b) 6 5
11. Which of the following statement is correct about the code snippet
given below?
int main()
{
int a = 10, b = 2, c;
a = !( c = c == c) && ++b;
c += (a + b--);
printf(" %d %d %d", b, c, a);
return 0;
}
int p = 0, q = 1;
p = q++;
p = ++q;
p = q--;
p = --q;
a) 1 1
b) 0 0
c) 3 2
d) 1 2
i = 1;
i = (i <<= 1 % 2);
a) 2
b) 1
c) 0
d) Syntax error
ANSWER: a) 2
15. What is the correct and fully portable way to obtain the most
significant byte of an unsigned integer x?
a) x & 0xFF00
b) x >> 24
c) x >> (CHAR_BIT * (sizeof(int) - 3))
d) x >> (CHAR_BIT * (sizeof(int) - 1))
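The expression in option (d) can be tried out with a minimal sketch. The helper names `most_significant_byte` and `check_top_byte` are assumptions introduced here for illustration, and the sketch assumes that unsigned int has no padding bits.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: shifts x right until only its top byte remains.
   sizeof x is the size of unsigned int in bytes; CHAR_BIT is bits per byte. */
unsigned int most_significant_byte(unsigned int x)
{
    return x >> (CHAR_BIT * (sizeof x - 1));
}

/* Sanity check: a value with only the top byte set to 0x12 yields 0x12. */
void check_top_byte(void)
{
    unsigned int x = 0x12u << (CHAR_BIT * (sizeof x - 1));
    assert(most_significant_byte(x) == 0x12u);
}
```

Unlike a hard-coded shift of 24, the shift count here follows the platform's actual byte width and integer size.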
a) (x – (x/y))
b) (x – (x/y) * y)
c) (y – (x/y))
d) (y – (x/y) * y)
ANSWER: b) (x – (x/y) * y)
a) 13
b) 19
c) 25
d) 27
ANSWER: c) 25
i = 1;
i << 1 % 2;
a) 2
b) -2
c) 1
d) 0
ANSWER: a) 2
a) P uses registers
b) Single machine instruction required for p++
c) Option a and b
d) None
MCQ .1
A process by which we estimate the value of dependent variable on the basis of
one or more independent
variables is called:
(a) Correlation (b) Regression (c) Residual (d) Slope
MCQ .2
The method of least squares dictates that we choose a regression line where the
sum of the squares of the deviations of the points from the line is:
(a) Maximum (b) Minimum (c) Zero (d) Positive
MCQ .3
A relationship where the flow of the data points is best represented by a curve is
called:
(a) Linear relationship (b) Nonlinear relationship (c) Linear positive (d) Linear
negative
MCQ .4
All data points falling along a straight line is called:
(a) Linear relationship (b) Non linear relationship (c) Residual (d) Scatter
diagram
MCQ .5
The value we would predict for the dependent variable when the independent
variables are all equal to zero
is called:
(a) Slope (b) Sum of residual (c) Intercept (d) Difficult to tell
MCQ .6
The predicted rate of response of the dependent variable to changes in the
independent variable is called:
(a) Slope (b) Intercept (c) Error (d) Regression equation
MCQ .7
The slope of the regression line of Y on X is also called the:
(a) Correlation coefficient of X on Y (b) Correlation coefficient of Y on X
(c) Regression coefficient of X on Y (d) Regression coefficient of Y on X
MCQ .8
In simple linear regression, the numbers of unknown constants are:
(a) One (b) Two (c) Three (d) Four
MCQ .9
In simple regression equation, the numbers of variables involved are:
(a) 0 (b) 1 (c) 2 (d) 3
MCQ .10
If the value of any regression coefficient is zero, then two variables are:
(a) Qualitative (b) Correlation (c) Dependent (d) Independent
MCQ .11
The straight line graph of the linear equation Y = a+ bX, slope will be upward if:
(a) b = 0 (b) b < 0 (c) b > 0 (d) b ≠ 0
MCQ .12
The straight line graph of the linear equation Y = a + bX, slope will be downward If:
(a) b > 0 (b) b < 0 (c) b = 0 (d) b ≠ 0
MCQ .13
The straight line graph of the linear equation Y = a + bX, slope is horizontal if:
(a) b = 0 (b) b ≠ 0 (c) b = 1 (d) a = b
MCQ .14
If regression line of = 5, then value of regression coefficient of Y on X is:
(a) 0 (b) 0.5 (c) 1 (d) 5
MCQ .15
If Y = 2 - 0.2X, then the value of Y intercept is equal to:
(a) -0.2 (b) 2 (c) 0.2X (d) All of the above
MCQ .16
If one regression coefficient is greater than one, then the other will be:
(a) More than one (b) Equal to one (c) Less than one (d) Equal to minus one
MCQ .17
To determine the height of a person when his weight is given is:
(a) Correlation problem (b) Association problem (c) Regression problem (d)
Qualitative problem
MCQ .18
The dependent variable is also called:
(a) Regression (b) Regressand (c) Continuous variable (d) Independent
MCQ .19
The dependent variable is also called:
(a) Regressand variable (b) Predictand variable (c) Explained variable (d) All of
these
MCQ .20
The independent variable is also called:
(a) Regressor (b) Regressand (c) Predictand (d) Estimated
MCQ .21
In the regression equation Y = a+bX, the Y is called:
(a) Independent variable (b) Dependent variable (c) Continuous variable (d)
None of the above
MCQ .22
In the regression equation X = a + bY, the X is called:
(a) Independent variable (b) Dependent variable (c) Qualitative variable (d) None
of the above
MCQ .23
In the regression equation Y = a +bX, a is called:
(a) X-intercept (b) Y-intercept (c) Dependent variable (d) None of the above
MCQ .24
The regression equation always passes through:
(a) (X, Y) (b) (a, b) (c) (X̄, Ȳ) (d) (X̄, Y)
MCQ .25
The independent variable in a regression line is:
(a) Non-random variable (b) Random variable (c) Qualitative variable (d) None of
the above
MCQ .26
The graph showing the paired points of (Xi, Yi) is called:
(a) Scatter diagram (b) Histogram (c) Historigram (d) Pie diagram
MCQ .27
The graph represents the relationship that is:
(a) Linear (b) Non linear (c) Curvilinear (d) No relation
MCQ .28
The graph represents the relationship that is:
(a) Linear positive (b) Linear negative (c) Non-linear (d) Curvilinear
MCQ .29
When regression line passes through the origin, then:
(a) Intercept is zero (b) Regression coefficient is zero (c) Correlation is zero (d)
Association is zero
MCQ .30
When bxy is positive, then byx will be:
(a) Negative (b) Positive (c) Zero (d) One
MCQ .31
The correlation coefficient is the_______of two regression coefficients:
(a) Geometric mean (b) Arithmetic mean (c) Harmonic mean (d) Median
MCQ .32
When two regression coefficients bear same algebraic signs, then correlation
coefficient is:
(a) Positive (b) Negative (c) According to two signs (d) Zero
MCQ .33
It is possible that two regression coefficients have:
(a) Opposite signs (b) Same signs (c) No sign (d) Difficult to tell
MCQ .34
Regression coefficient is independent of:
(a) Units of measurement (b) Scale and origin (c) Both (a) and (b) (d) None of
them
MCQ .35
In the regression line Y = a+ bX:
(a) (b) (c) (d)
MCQ .36
In the regression line Y = a + bX, the following is always true:
(a) (b) (c) (d)
MCQ .37
The purpose of simple linear regression analysis is to:
(a) Predict one variable from another variable
(b) Replace points on a scatter diagram by a straight-line
(c) Measure the degree to which two variables are linearly associated
(d) Obtain the expected value of the independent random variable for a given
value of the dependent variable
MCQ .38
The sum of the difference between the actual values of Y and its values obtained
from the fitted
regression line is always:
(a) Zero (b) Positive (c) Negative (d) Minimum
MCQ .39
If all the actual and estimated values of Y are same on the regression line, the
sum of squares of
error will be:
(a) Zero (b) Minimum (c) Maximum (d) Unknown
MCQ .40
(a) Residual (b) Difference between independent and dependent variables
(c) Difference between slope and intercept (d) Sum of residual
MCQ .41
A measure of the strength of the linear relationship that exists between two
variables is called:
(a) Slope (b) Intercept (c) Correlation coefficient (d) Regression equation
MCQ .42
When the ratio of variations in the related variables is constant, it is called:
(a) Linear correlation (b) Nonlinear correlation (c) Positive correlation (d)
Negative correlation
MCQ .43
If both variables X and Y increase or decrease simultaneously, then the coefficient
of correlation will be:
(a) Positive (b) Negative (c) Zero (d) One
MCQ .44
If the points on the scatter diagram indicate that as one variable increases the
other variable tends to
decrease the value of r will be:
(a) Perfect positive (b) Perfect negative (c) Negative (d) Zero
MCQ .45
If the points on the scatter diagram show no tendency either to increase together
or decrease together
the value of r will be close to:
(a) -1 (b) +1 (c) 0.5 (d) 0
MCQ .46
If one item is fixed and unchangeable and the other item varies, the correlation
coefficient will be:
(a) Positive (b) Negative (c) Zero (d) Undecided
MCQ .47
In scatter diagram, if most of the points lie in the first and third quadrants, then
coefficient of
correlation is:
(a) Negative (b) Positive (c) Zero (d) All of the above
MCQ .48
If the two series move in reverse directions and the variations in their values are
always
proportionate, it is said to be:
(a) Negative correlation (b) Positive correlation
(c) Perfect negative correlation (d) Perfect positive correlation
MCQ .49
If both the series move in the same direction and the variations are in a fixed
proportion, correlation
between them is said to be:
(a) Perfect correlation (b) Linear correlation
(c) Nonlinear correlation (d) Perfect positive correlation
MCQ .50
The value of the coefficient of correlation r lies between:
(a) 0 and 1 (b) -1 and 0 (c) -1 and +1 (d) -0.5 and +0.5
MCQ .51
If X is measured in hours and Y is measured in minutes, then correlation
coefficient has the unit:
(a) Hours (b) Minutes (c) Both (a) and (b) (d) No unit
MCQ .52
The range of regression coefficient is:
(a) -1 to +1 (b) 0 to 1 (c) -∞ to +∞ (d) 0 to ∞
MCQ .53
The signs of regression coefficients and correlation coefficient are always:
(a) Different (b) Same (c) Positive (d) Negative
MCQ .54
The arithmetic mean of the two regression coefficients is greater than or equal to:
(a) -1 (b) +1 (c) 0 (d) r
MCQ .55
In simple linear regression model Y = α + βX + ε where α and β are called:
(a) Estimates (b) Parameters (c) Random errors (d) Variables
MCQ .56
Negative regression coefficient indicates that the movement of the variables are
in:
(a) Same direction (b) Opposite direction (c) Both (a) and (b) (d) Difficult to tell
MCQ .57
Positive regression coefficient indicates that the movement of the variables are in:
(a) Same direction (b) Opposite direction (c) Upward direction (d) Downward
direction
MCQ .58
If the value of regression coefficient is zero, then the two variables are called:
(a) Independent (b) Dependent (c) Both (a) and (b) (d) Difficult to tell
MCQ .59
The term regression was used by:
(a) Newton (b) Pearson (c) Spearman (d) Galton
MCQ .60
In the regression equation Y = a + bX, b is called:
(a) Slope (b) Regression coefficient (c) Intercept (d) Both (a) and (b)
MCQ .61
When the two regression lines are parallel to each other, then their slopes are:
(a) Zero (b) Different (c) Same (d) Positive
MCQ .62
The measure of change in dependent variable corresponding to a unit change in
independent
variable is called:
(a) Slope (b) Regression coefficient (c) Both (a) and (b) (d) Neither (a) nor (b)
MCQ .63
In correlation problem both variables are:
(a) Equal (b) Unknown (c) Fixed (d) Random
MCQ .64
In the regression equation Y = a + bX, where a and b are called:
(a) Constants (b) Estimates (c) Parameters (d) Both (a) and (b)
MCQ .65
If byx = bxy = 1 and Sx = Sy, then r will be:
(a) 0 (b) -1 (c) 1 (d) Difficult to calculate
MCQ .66
The correlation coefficient between X and -X is:
(a) 0 (b) 0.5 (c) 1 (d) -1
MCQ .67
If byx = bxy = rxy, then:
(a) Sx ≠ Sy (b) Sx = Sy (c) Sx > Sy (d) Sx < Sy
MCQ .68
If rxy = 0.4, then r(2x, 2y) is equal to:
(a) 0.4 (b) 0.8 (c) 0 (d) 1
MCQ .69
rxy is equal to:
(a) 0 (b) -1 (c) 1 (d) 0.5
MCQ .70
If rxy = 0.75, then correlation coefficient between u = 1.5X and v = 2Y is:
(a) 0 (b) 0.75 (c) -0.75 (d) 1.5
MCQ .71
If byx = -2 and rxy= -1, then bxy is equal to:
(a) -1 (b) -2 (c) 0.5 (d) -0.5
MCQ .72
If byx = 1.6 and bxy = 0.4, then rxy will be:
(a) 0.4 (b) 0.64 (c) 0.8 (d) -0.8
MCQ .73
If byx = -0.8 and bxy = -0.2, then ryx is equal to:
(a) -0.2 (b) -0.4 (c) 0.4 (d) -0.8
MCQ .74
If = 6 – X, then r will be:
(a) 0 (b) 1 (c) -1 (d) Both (b) and (c)
MCQ .75
If = X + 10, then r equal to:
(a) 1 (b) -1 (c) 1/2 (d) Difficult to tell
MCQ .76
If Y = -10X and X = -0.1Y, then r is equal to:
(a) 0.1 (b) 1 (c) -1 (d) 10
MCQ .77
If the figure +1 signifies perfect positive correlation and the figure -1 signifies a
perfect negative
correlation, then the figure 0 signifies:
(a) A perfect correlation (b) Uncorrelated variables
(c) Not significant (d) Weak correlation
MCQ .78
A perfect positive correlation is signified by:
(a) 0 (b) -1 (c) +1 (d) -1 to +1
MCQ .79
If a statistics professor tells his class: "All those who got 100 on the statistics test
got 20 on the mathematics test, and all those that got 100 on the mathematics test
got 20 on the statistics test", he is saying that the correlation between the statistics
test and the mathematics test is:
(a) Negative (b) Positive (c) Zero (d) Difficult to tell
MCQ .80
If is zero, the correlation is:
(a) Weak negative (b) High positive (c) High negative (d) None of the preceding
MCQ .81
If rxy = 1, then:
(a) byx = bxy (b) byx > bxy (c) byx < bxy (d) byx . bxy = 1
MCQ .82
The relation between the regression coefficient byx and correlation coefficient r is:
MCQ .83
The relation between the regression coefficient bxy and correlation coefficient r is:
MCQ .84
If the sum of the product of the deviation of X and Y from their means is zero, the
correlation
coefficient between X and Y is:
(a) Zero (b) Maximum (c) Minimum (d) Undecided
MCQ .85
If the coefficient of correlation between the variables X and Y is r, the coefficient of
correlation
between X2 and Y2 is:
(a) -1 (b) 1 (c) r (d) r2
MCQ .86
If rxy = 0.75, then ryx will be:
(a) 0.25 (b) 0.50 (c) 0.75 (d) -0.75
MCQ .87
If , then byx is equal to:
(a) Positive (b) Negative (c) Zero (d) One
MCQ .88
If , then intercept a is equal to:
(a) 0 (b) 1 (c) -1 to +1 (d) 0 to 1
MCQ .89
:
(a) Less than zero (b) Greater than zero (c) Equal to zero (d) Not equal to zero
MCQ .90
When rxy < 0, then byx and bxy will be:
(a) Zero (b) Not equal to zero (c) Less than zero (d) Greater than zero
MCQ .91
When rxy > 0, then byx and bxy are both:
(a) 0 (b) < 0 (c) > 0 (d) < 1
MCQ .92
If rxy = 0, then:
(a) byx = 0 (b) bxy = 0 (c) Both (a) and (b) (d) byx ≠ bxy
MCQ .93
If bxy = 0.20 and rxy = 0.50, then byx is equal to:
(a) 0.20 (b) 0.25 (c) 0.50 (d) 1.25
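Questions of this kind rest on the identity r² = byx · bxy, so a known correlation coefficient and one regression coefficient determine the other. A minimal sketch of that arithmetic follows; the helper name `other_regression_coefficient` is an assumption introduced here for illustration.

```c
#include <assert.h>

/* Hypothetical helper: given r and one regression coefficient,
   returns the other one using r * r = byx * bxy. */
double other_regression_coefficient(double r, double known_b)
{
    /* A zero coefficient makes the ratio undefined. */
    assert(known_b != 0.0);
    return (r * r) / known_b;
}
```

With r = 0.50 and bxy = 0.20 the helper returns 0.25 / 0.20, i.e. about 1.25; with r = -1 and byx = -2 it returns about -0.5, matching the rule that both regression coefficients carry the sign of r.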
MCQ .94
A regression model may be:
(a) Linear (b) Non-linear (c) Both (a) and (b) (d) Neither (a) nor (b)
A. formal()
B. funct()
C. formals()
D. fun()
B. null()
C. as.nullobj()
D. is.nullobj()
A. >arg(bin)
B. >arg(paste)
C. >args(pastebin)
D. >args(paste)
A. a*b
B. “a+b”
C. “a:b”
A. L
B. N
C. R
D. T
6) In 1991, R was created by Ross Ihaka and
Robert Gentleman in the Department of Statistics
at the University of _________
A. Auckland
B. Harvard
C. California
A. 2004
B. 2000
C. 2006
D. 1998
A. C
B. C++
C. S
D. K
A. 1991
B. 1995
C. 1997
D. 1999
A. Linux
B. Ubuntu
C. Windows
A. splines, stats4
B. mesh, compiler
C. splines, stats4
A. print(x)
B. print{x}
C. printx
B. Classes
C. Packages
D. Functions
A. parent
B. child
C. root
D. base
A. True
B. False
A. ;
B. :
C. @
D. -
17) In R , a vector is defined that it can only
contain objects of the ________
A. Different class
B. Same class
C. Any class
A. NaN
B. Inf
C. Sup
C. Not a Number
D. Numeric a Number
A. SuP
B. Inf
C. NaN
A. rjoin()
B. rbind()
C. rowbind()
A. as.nan()
B. s.nan()
C. is.nan()
A. True
B. False
A. @ R
B. - R
C. / R
D. $ R
A. True
B. False
A. True
B. False
A. 4
B. 7
C. 6
D. 9
A. 4
B. 6
C. 7
D. 12
29) ________ is the function to set row names for a
data frame.
A. row.names()
B. row.namespace()
C. row.nam()
A. Yes
B. No
C. May be
D. Can't say
Data and Analysis in the Real World
Week 1 Quiz
Quiz, 12 questions
Question 1
What statement below best describes why we do data analytics in business?
We must show a return on the investment we make in data & analytical resources
Question 2
What should you consider as you approach an analytical problem and in which order? Identify
correct order for the following ideas / steps.
For example, if you think they are already in the correct order, correct answer would be ABCDEF.
A. Sourcing Data
B. Analysis Outputs
C. Execute Analysis
D. Analysis Methods
E. Define Decision
F. Data Needs
ABCDEF:
A. Sourcing Data
B. Analysis Outputs
C. Execute Analysis
D. Analysis Methods
E. Define Decision
F. Data Needs
EBDAFC
E. Define Decision
B. Analysis Outputs
D. Analysis Methods
A. Sourcing Data
F. Data Needs
C. Execute Analysis
EBDFAC
E. Define Decision
B. Analysis Outputs
D. Analysis Methods
F. Data Needs
A. Sourcing Data
C. Execute Analysis
BDFACE
B. Analysis Outputs
D. Analysis Methods
F. Data Needs
A. Sourcing Data
C. Execute Analysis
E. Define Decision
Question 3
What diagram below best describes the relationship between a mobile wireless carrier account
holder and devices at a point in time?
A - shows oval with account connected to oval with device by straight line
B - shows oval with account connected to oval with device by straight line – forked end on Account
side
C - shows oval with account connected to oval with device by straight line – forked end on Device
side
D - shows oval with account connected to oval with device by straight line – forked end on both sides
Question 4
For the next 5 questions that describe types of metrics, select a source that best describes where
the following data might come from:
Billing System
Question 5
Select a source that best describes where the following data might come from:
Billing System
Question 6
Select a source that best describes where the following data might come from:
Billing System
Question 7
Select a source that best describes where the following data might come from:
The dollar amount of unpaid invoices at the end of a month
Billing System
Question 8
Select a source that best describes where the following data might come from:
Billing System
Question 9
Why is it important for data analysts to understand the value-chain (process) associated with
information and the analytical process?
the more you understand about the way the business works and how information flows through
business systems, the better prepared you will be to both execute and interpret your analysis. Also,
the more skill you have in finding and accessing data, the more productive and valuable you will be
as an analyst!
Question 10
Identify correct order of steps in the Information-Action Value Chain.
ABCDEFGHI
D. Take Action
F. Data Extraction
G. Data Storage
H. Analytical Methods
I. Summarize & Interpret Results
CEFGHIABD
F. Data Extraction
G. Data Storage
H. Analytical Methods
D. Take Action
CEGFHIABD
G. Data Storage
F. Data Extraction
H. Analytical Methods
CEGFHIBAD
G. Data Storage
F. Data Extraction
H. Analytical Methods
D. Take Action
MCQ TESTING OF HYPOTHESIS
MCQ 13.1
A statement about a population developed for the purpose of testing is called:
(a) Hypothesis (b) Hypothesis testing (c) Level of significance (d) Test-statistic
MCQ 13.2
Any hypothesis which is tested for the purpose of rejection under the assumption that it is true is
called:
(a) Null hypothesis (b) Alternative hypothesis (c) Statistical hypothesis (d) Composite hypothesis
MCQ 13.3
A statement about the value of a population parameter is called:
(a) Null hypothesis (b) Alternative hypothesis (c) Simple hypothesis (d) Composite hypothesis
MCQ 13.4
Any statement whose validity is tested on the basis of a sample is called:
(a) Null hypothesis (b) Alternative hypothesis (c) Statistical hypothesis (d) Simple hypothesis
MCQ 13.5
A quantitative statement about a population is called:
(a) Research hypothesis (b) Composite hypothesis (c) Simple hypothesis (d) Statistical hypothesis
MCQ 13.6
A statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false is
called:
(a) Simple hypothesis (b) Composite hypothesis (c) Statistical hypothesis (d) Alternative hypothesis
MCQ 13.7
The alternative hypothesis is also called:
(a) Null hypothesis (b) Statistical hypothesis (c) Research hypothesis (d) Simple hypothesis
MCQ 13.8
A hypothesis that specifies all the values of parameter is called:
(a) Simple hypothesis (b) Composite hypothesis (c) Statistical hypothesis (d) None of the above
MCQ 13.9
The hypothesis µ ≤ 10 is a:
(a) Simple hypothesis (b) Composite hypothesis (c) Alternative hypothesis (d) Difficult to tell.
MCQ 13.10
A hypothesis that specifies the population distribution is called:
(a) Simple hypothesis (b) Composite hypothesis (c) Alternative hypothesis (d) None of the above
MCQ 13.11
A hypothesis may be classified as:
(a) Simple (b) Composite (c) Null (d) All of the above
MCQ 13.12
The probability of rejecting the null hypothesis when it is true is called:
(a) Level of confidence (b) Level of significance (c) Power of the test (d) Difficult to tell
MCQ 13.13
The dividing point between the region where the null hypothesis is rejected and the region where it is not
rejected is said to be:
(a) Critical region (b) Critical value (c) Acceptance region (d) Significant region
MCQ 13.14
If the critical region is located equally in both sides of the sampling distribution of test-statistic, the test is
called:
(a) One tailed (b) Two tailed (c) Right tailed (d) Left tailed
MCQ 13.15
The choice of one-tailed test and two-tailed test depends upon:
(a) Null hypothesis (b) Alternative hypothesis (c) None of these (d) Composite hypotheses
MCQ 13.16
Test of hypothesis Ho: µ = 50 against H1: µ > 50 leads to:
(a) Left-tailed test (b) Right-tailed test (c) Two-tailed test (d) Difficult to tell
MCQ 13.17
Test of hypothesis Ho: µ = 20 against H1: µ < 20 leads to:
(a) Right one-sided test (b) Left one-sided test (c) Two-sided test (d) All of the above
MCQ 13.18
Testing Ho: µ = 25 against H1: µ ≠ 20 leads to:
(a) Two-tailed test (b) Left-tailed test (c) Right-tailed test (d) None of (a), (b) and (c)
MCQ 13.19
A rule or formula that provides a basis for testing a null hypothesis is called:
(a) Test-statistic (b) Population statistic (c) Both of these (d) None of the above
MCQ 13.20
The range of test statistic-Z is:
(a) 0 to 1 (b) -1 to +1 (c) 0 to ∞ (d) -∞ to +∞
MCQ 13.21
The range of test statistic-t is:
(a) 0 to ∞ (b) 0 to 1 (c) -∞ to +∞ (d) -1 to +1
MCQ 13.22
If Ho is true and we reject it, this is called:
(a) Type-I error (b) Type-II error (c) Standard error (d) Sampling error
MCQ 13.23
The probability associated with committing type-I error is:
(a) β (b) α (c) 1 – β (d) 1 – α
MCQ 13.24
A failing student is passed by an examiner, it is an example of:
(a) Type-I error (b) Type-II error (c) Unbiased decision (d) Difficult to tell
MCQ 13.25
A passing student is failed by an examiner, it is an example of:
(a) Type-I error (b) Type-II error (c) Best decision (d) All of the above
MCQ 13.26
1 – α is also called:
(a) Confidence coefficient (b) Power of the test (c) Size of the test (d) Level of significance
MCQ 13.27
1 – α is the probability associated with:
(a) Type-I error (b) Type-II error (c) Level of confidence (d) Level of significance
MCQ 13.28
Area of the rejection region depends on:
(a) Size of α (b) Size of β (c) Test-statistic (d) Number of values
MCQ 13.29
Size of critical region is known as:
(a) β (b) 1 - β (c) Critical value (d) Size of the test
MCQ 13.30
A null hypothesis is rejected if the value of a test statistic lies in the:
(a) Rejection region (b) Acceptance region (c) Both (a) and (b) (d) Neither (a) nor (b)
MCQ 13.31
The test statistic is equal to:
MCQ 13.32
Level of significance is also called:
(a) Power of the test (b) Size of the test (c) Level of confidence (d) Confidence coefficient
MCQ 13.33
Level of significance α lies between:
(a) -1 and +1 (b) 0 and 1 (c) 0 and n (d) -∞ to +∞
MCQ 13.34
Critical region is also called:
(a) Acceptance region (b) Rejection region (c) Confidence region (d) Statistical region
MCQ 13.35
The probability of rejecting Ho when it is false is called:
(a) Power of the test (b) Size of the test (c) Level of confidence (d) Confidence coefficient
MCQ 13.36
Power of a test is related to:
(a) Type-I error (b) Type-II error (c) Both (a) and (b) (d) Neither (a) nor (b)
MCQ 13.37
In testing hypothesis α + β is always equal to:
(a) One (b) Zero (c) Two (d) Difficult to tell
MCQ 13.38
The significance level is the risk of:
(a) Rejecting Ho when Ho is correct (b) Rejecting Ho when H1 is correct
(c) Rejecting H1 when H1 is correct (d) Accepting Ho when Ho is correct.
MCQ 13.39
An example in a two-sided alternative hypothesis is:
(a) H1: µ < 0 (b) H1: µ > 0 (c) H1: µ ≥ 0 (d) H1: µ ≠ 0
MCQ 13.40
If the magnitude of calculated value of t is less than the tabulated value of t and H1 is two-sided, we
should:
(a) Reject Ho (b) Accept H1 (c) Not reject Ho (d) Difficult to tell
MCQ 13.41
Accepting a null hypothesis Ho:
(a) Proves that Ho is true (b) Proves that Ho is false
(c) Implies that Ho is likely to be true (d) Proves that µ ≤ 0
MCQ 13.42
The chance of rejecting a true hypothesis decreases when sample size is:
(a) Decreased (b) Increased (c) Constant (d) Both (a) and (b)
MCQ 13.43
The equality condition always appears in:
(a) Null hypothesis (b) Simple hypothesis (c) Alternative hypothesis (d) Both (a) and (b)
MCQ 13.44
Which hypothesis is always in an inequality form?
(a) Null hypothesis (b) Alternative hypothesis (c) Simple hypothesis (d) Composite hypothesis
MCQ 13.45
Which of the following is composite hypothesis?
(a) µ ≥ µo (b) µ ≤ µo (c) µ = µo (d) µ ≠ µo
MCQ 13.46
P (Type I error) is equal to:
(a) 1 – α (b) 1 – β (c) α (d) β
MCQ 13.47
P (Type II error) is equal to:
(a) α (b) β (c) 1 – α (d) 1 – β
MCQ 13.48
The power of the test is equal to:
(a) α (b) β (c) 1 – α (d) 1 – β
MCQ 13.49
The degree of confidence is equal to:
(a) α (b) β (c) 1 – α (d) 1 – β
MCQ 13.50
α / 2 is called:
(a) One tailed significance level (b) Two tailed significance level
(c) Left tailed significance level (d) Right tailed significance level
MCQ 13.51
Student’s t-test is applicable only when:
(a) n≤30 and σ is known (b) n>30 and σ is unknown (c) n=30 and σ is known (d) All of the above
MCQ 13.52
Student’s t-statistic is applicable in case of:
(a) Equal number of samples (b) Unequal number of samples (c) Small samples (d) All of the above
MCQ 13.53
Paired t-test is applicable when the observations in the two samples are:
(a) Equal in number (b) Paired (c) Correlation (d) All of the above
MCQ 13.54
The degree of freedom for paired t-test based on n pairs of observations is:
(a) 2n - 1 (b) n - 2 (c) 2(n - 1) (d) n - 1
MCQ 13.55
The test-statistic has d.f = ________:
(a) n (b) n - 1 (c) n - 2 (d) n1 + n2 - 2
MCQ 13.56
In an unpaired samples t-test with sample sizes n1= 11 and n2= 11, the value of tabulated t should be
obtained for:
(a) 10 degrees of freedom (b) 21 degrees of freedom
(c) 22 degrees of freedom (d) 20 degrees of freedom
MCQ 13.57
In analyzing the results of an experiment involving seven paired samples, tabulated t should be
obtained for:
(a) 13 degrees of freedom (b) 6 degrees of freedom
(c) 12 degrees of freedom (d) 14 degrees of freedom
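The degrees-of-freedom rules used in the t-test questions above can be written out as a short sketch. The helper names `df_paired` and `df_unpaired` are assumptions introduced here for illustration, and the unpaired case assumes the usual pooled-variance two-sample t-test.

```c
#include <assert.h>

/* Hypothetical helper: d.f. for a paired t-test on n pairs is n - 1. */
int df_paired(int n_pairs)
{
    assert(n_pairs >= 2);
    return n_pairs - 1;
}

/* Hypothetical helper: d.f. for an unpaired (pooled-variance) t-test
   with sample sizes n1 and n2 is n1 + n2 - 2. */
int df_unpaired(int n1, int n2)
{
    assert(n1 >= 1 && n2 >= 1 && n1 + n2 > 2);
    return n1 + n2 - 2;
}
```

In the paired case the test works on the n differences, so only one mean is estimated; in the unpaired case two sample means are estimated, which removes two degrees of freedom.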
MCQ 13.58
The mean difference between 16 paired observations is 25 and the standard deviation of differences is
10. The value of statistic-t is:
(a) 4 (b) 10 (c) 16 (d) 25
MCQ 13.59
Statistic-t is defined as deviation of sample mean from population mean µ expressed in terms of:
(a) Standard deviation (b) Standard error
(c) Coefficient of standard deviation (d) Coefficient of variation
MCQ 13.60
Student’s t-distribution has (n-1) d.f. when all the n observations in the sample are:
(a) Dependent (b) Independent (c) Maximum (d) Minimum
MCQ 13.61
The number of independent values in a set of values is called:
(a) Test-statistic (b) Degree of freedom (c) Level of significance (d) Level of confidence
MCQ 13.62
The purpose of statistical inference is:
(a) To collect sample data and use them to formulate hypotheses about a population
(b) To draw conclusion about populations and then collect sample data to support the conclusions
(c) To draw conclusions about populations from sample data
(d) To draw conclusions about the known value of population parameter
MCQ 13.63
Suppose that the null hypothesis is true and it is rejected. This is known as:
(a) A type-I error, and its probability is β
(b) A type-I error, and its probability is α
(c) A type-II error, and its probability is α
(d) A type-II error, and its probability is β
MCQ 13.64
An advertising agency wants to test the hypothesis that the proportion of adults in Pakistan who read a Sunday
Magazine is 25 percent. The null hypothesis is that the proportion reading the Sunday Magazine is:
(a) Different from 25% (b) Equal to 25% (c) Less than 25 % (d) More than 25 %
MCQ 13.65
If the mean of a particular population is µo, is distributed:
(a) As a standard normal variable, if the population is non-normal
(b) As a standard normal variable, if the sample is large
(c) As a standard normal variable, if the population is normal
(d) As the t-distribution with v = n - 1 degrees of freedom
MCQ 13.66
If µ1 and µ2 are means of two populations, is distributed:
(a) As a standard normal variable, if both samples are independent and less than 30
(b) As a standard normal variable, if both populations are normal
(c) As both (a) and (b) state
(d) As the t-distribution with n1 + n2 - 2 degrees of freedom
MCQ 13.67
If the population proportion equals po, then is distributed:
MCQ 13.69
Given µo = 130, x̄ = 150, σ = 25 and n = 4; what test statistic is appropriate?
(a) t (b) Z (c) χ2 (d) F
MCQ 13.70
Given Ho: µ = µo, H1: µ ≠ µo, α = 0.05 and we reject Ho; the absolute value of the Z-statistic must have equalled
or been beyond what value?
(a) 1.96 (b) 1.65 (c) 2.58 (d) 2.33
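The critical value asked about in MCQ 13.70 can be checked numerically. The following is a minimal Python sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the function name `two_tailed_critical_z` is illustrative and does not come from the source.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha: float) -> float:
    """Return the |Z| cut-off for a two-tailed test at significance level alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # A two-tailed test puts alpha/2 in each tail of the standard normal curve.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: reject Ho when |Z| >= {two_tailed_critical_z(alpha):.2f}")
```

Splitting alpha across both tails is what distinguishes this cut-off from the one-tailed value at the same significance level.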
MCQ 13.71
If p1 and p2 are not identical, then standard error of the difference of proportions (p1 – p2) is:
MCQ 13.72
Under the hypothesis Ho: p1 = p2, the formula for the standard error of the difference between
proportions (p1 – p2) is:
MULTIPLE CHOICE QUESTIONS ON QUANTITATIVE TECHNIQUES
1. The techniques which provide the decision maker a systematic and powerful means of
analysis to explore policies for achieving predetermined goals are called..........................
a. Correlation techniques
b. Mathematical techniques
c. Quantitative techniques
d. None of the above
2. Correlation analysis is a ..............................
a. Univariate analysis
b. Bivariate analysis
c. Multivariate analysis
d. Both b and c
3. If a change in one variable results in a corresponding change in the other variable, then
the variables are.........................
a. Correlated
b. Not correlated
c. Any of the above
d. None of the above
4. When the values of two variables move in the same direction, correlation is said to
be ............................
a. Linear
b. Non-linear
c. Positive
d. Negative
5. When the values of two variables move in the opposite directions, correlation is said
to be ............................
a. Linear
b. Non-linear
c. Positive
d. Negative
6. When the amount of change in one variable leads to a constant ratio of change in
the other variable, then correlation is said to be .........................
a. Linear
b. Non-linear
c. Positive
d. Negative
7. ...........................attempts to determine the degree of relationship between
variables.
a. Regression analysis
b. Correlation analysis
c. Inferential analysis
d. None of these
8. Non-linear correlation is also called.....................................
a. Non-curvilinear correlation
b. Curvilinear correlation
c. Zero correlation
d. None of these
9. Scatter diagram is also called ......................
a. Dot chart
b. Correlation graph
c. Both a and b
d. None of these
10. If all the points of a scatter diagram lie on a straight line falling from left upper
corner to the right bottom corner, the correlation is called...................
a. Zero correlation
b. High degree of positive correlation
c. Perfect negative correlation
d. Perfect positive correlation
11. If all the dots of a scatter diagram lie on a straight line falling from left bottom corner
to the right upper corner, the correlation is called..................
a. Zero correlation
b. High degree of positive correlation
c. Perfect negative correlation
d. Perfect positive correlation
12. Numerical measure of correlation is called .....................
a. Coefficient of correlation
b. Coefficient of determination
c. Coefficient of non-determination
d. Coefficient of regression
13. Coefficient of correlation explains:
a. Concentration
b. Relation
c. Dispersion
d. Asymmetry
14. Coefficient of correlation lies between:
a. 0 and +1
b. 0 and –1
c. –1 and +1
d. – 3 and +3
15. A high degree of +ve correlation between availability of rainfall and the weight
of people is:
a. A meaningless correlation
b. A spurious correlation
c. A nonsense correlation
d. All of the above
16. If the ratio of change in one variable is equal to the ratio of change in the other
variable, then the correlation is said to be .....................
a. Linear
b. Non-linear
c. Curvilinear
d. None of these
17. Pearsonian correlation coefficient is denoted by the symbol ...............
a. K
b. r
c. R
d. None of these
18. If r= +1, the correlation is said to be ...................
a. High degree of +ve correlation
b. High degree of –ve correlation
c. Perfect +ve correlation
d. Perfect –ve correlation
19. If the dots in a scatter diagram fall on a narrow band, it indicates a .......................
degree of correlation.
a. Zero
b. High
c. Low
d. None of these
20. If all the points of a dot chart lie on a straight line vertical to the X-axis, then
coefficient of correlation is ...................
a. 0
b. +1
c. –1
d. None of these
21. If all the points of a dot chart lie on a straight line parallel to the X-axis, it denotes
.................................of correlation.
a. High degree
b. Low degree
c. Moderate degree
d. Absence
22. If dots are lying on a scatter diagram in a haphazard manner, then r = ......................
a. 0
b. +1
c. –1
d. None of these
23. The unit of Coefficient of correlation is ........................
a. Percentage
b. Ratio
c. Same unit of the data
d. No unit
24. Product moment correlation method is also called ........................
a. Rank correlation
b. Pearsonian correlation
c. Concurrent deviation
d. None of these
25. The –ve sign of correlation coefficient between X and Y indicates.............................
a. X decreasing, Y increasing
b. X increasing, Y decreasing
c. Any of the above
d. There is no change in X and Y
26. Coefficient of correlation explains .........................of the relationship between two
variables.
a. Degree
b. Direction
c. Both of the above
d. None of the above
27. For perfect correlation, the coefficient of correlation should be ..........................
a. ± 1
b. + 1
c. – 1
d. 0
28. Rank correlation coefficient was discovered by....................................
a. Fisher
b. Spearman
c. Karl Pearson
d. Bowley
29. The rank correlation coefficient is always............................
a. + 1
b. – 1
c. 0
d. Between + 1 and – 1
30. Spearman’s Rank Correlation Coefficient is usually denoted by....................
a. k
b. r
c. S
d. R
31. Probable error is used to:
a. Test the reliability of correlation coefficient
b. Measure the error in correlation coefficient
c. Both a and b
d. None of these
32. If coefficient of correlation is more than ................of its P E, correlation is significant.
a. 2 times
b. 5 times
c. 6 times
d. 10 times
33. In correlation analysis, Probable Error = ........................ x 0.6745
a. Standard deviation
b. Standard error
c. Coefficient of correlation
d. None of these
34. Coefficient of concurrent deviation depends on .......................
a. The signs of the deviations
b. The magnitude of the deviations
c. Both a and b
d. None of these
35. Correlation analysis between two sets of data only is called....................
a. Partial correlation
b. Multiple correlation
c. Nonsense correlation
d. Simple correlation
36. Correlation analysis between one dependent variable with one independent variable
by keeping the other independent variables as constant is called......................
a. Partial correlation
b. Multiple correlation
c. Nonsense correlation
d. Simple correlation
37. Study of correlation among three or more variables simultaneously is called.............
a. Partial correlation
b. Multiple correlation
c. Nonsense correlation
d. Simple correlation
38. If r = 0.8, coefficient of determination is.....................................
a. 80%
b. 8%
c. 64%
d. 0.8%
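As a quick check on question 38, the coefficient of determination is the square of the correlation coefficient. A minimal Python sketch follows; the function name is illustrative and not taken from the source.

```python
def coefficient_of_determination(r: float) -> float:
    """Return r squared, the share of variation explained by the linear relationship."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


r = 0.8
print(f"r = {r}, coefficient of determination = {coefficient_of_determination(r):.0%}")
```

The remainder, 1 minus r squared, is the share of variation left unexplained.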
39. If r is the simple correlation coefficient, the quantity r² is known as ...................
a. Coefficient of determination
b. Coefficient of non-determination
c. Coefficient of alienation
d. None of these
40. If r is the simple correlation coefficient, the quantity 1 – r² is known as ...................
a. Coefficient of determination
b. Coefficient of non-determination
c. Coefficient of alienation
d. None of these
41. The term regression was first used by..........................
a. Karl Pearson
b. Spearman
c. R A Fisher
d. Francis Galton
42. ....................refers to analysis of average relationship between two variables to
provide mechanism for prediction.
a. Correlation
b. Regression
c. Standard error
d. None of these
43. If there are two variables, there can be at most............................... number of
regression lines.
a. One
b. Two
c. Three
d. Infinite
44. If the regression line is Y on X, then the variable X is known as..........................
a. Independent variable
b. Explanatory variable
c. Regressor
d. All the above
45. Regression line is also called.................................
a. Estimating equation
b. Prediction equation
c. Line of average relationship
d. All the above
46. If the regression line is X on Y, then the variable X is known as..........................
a. Dependent variable
b. Explained variable
c. Both a and b
d. Regressor
47. If the regression line is X on Y, then the variable X is known as..........................
a. Dependent variable
b. Independent variable
c. Both a and b
d. None of the above
48. If the regression line is Y on X, then the variable X is known as..........................
a. Dependent variable
b. Independent variable
c. Both a and b
d. None of the above
49. The point of intersection of two regression lines is..........................
a. (0,0)
b. (1,1)
c. (x,y)
d. (x̄, ȳ)
50. If r = ± 1, the two regression lines are...............................
a. Coincident
b. Parallel
c. Perpendicular to each other
d. None of these
51. If r = 1, the angle between the two regression lines is.........................
a. Ninety degree
b. Thirty degree
c. Zero degree
d. Sixty degree
52. If r = 0, the two regression lines are:
a. Coincident
b. Parallel
c. Perpendicular to each other
d. None of these
53. If bxy and byx are two regression coefficients, they have:
a. Same signs
b. Opposite signs
c. Either a or b
d. None of the above.
54. If byx > 1, then bxy is:
a. Greater than one
b. Less than one
c. Equal to one
d. Equal to zero
55. If X and Y are independent, the value of byx is equal to ........................
a. Zero
b. One
c. Infinity
d. Any positive value
56. The property that both the regression coefficients and correlation coefficient have
same signs is called................................
a. Fundamental property
b. Magnitude property
c. Signature property
d. None of these
57. The property that byx > 1 implies that bxy < 1 is known as .....................
a. Fundamental property
b. Magnitude property
c. Signature property
d. None of these
58. If X and Y are independent, the property byx = bxy = 0 is called ...................
a. Fundamental property
b. Magnitude property
c. Mean property
d. Independence property
59. The Correlation coefficient between two variables is the ........................... of their
regression coefficients.
a. Arithmetic mean
b. Geometric mean
c. Harmonic mean
d. None of these
60. If the correlation coefficient between two variables, X and Y, is negative, then the
regression coefficient of Y on X is.............................
a. Positive
b. Negative
c. Not certain
d. None of these
61. The G M of two regression coefficients byx and bxy is equal to ..........................
a. r
b. r²
c. 1 – r²
d. None of these
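The relationship tested in questions 59 and 61 can be verified on a small data set. The sketch below is a hedged illustration in Python: the sample values and the function name are assumptions made for the example, and the geometric mean is given the common sign of the two regression coefficients.

```python
import math


def regression_and_correlation(x, y):
    """Return (byx, bxy, r) for paired observations x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    byx = sxy / sxx  # regression coefficient of Y on X
    bxy = sxy / syy  # regression coefficient of X on Y
    r = sxy / math.sqrt(sxx * syy)
    return byx, bxy, r


# Hypothetical sample data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
byx, bxy, r = regression_and_correlation(x, y)

# Geometric mean of the two coefficients, carrying their shared sign.
gm = math.copysign(math.sqrt(byx * bxy), byx)
print(f"byx = {byx:.4f}, bxy = {bxy:.4f}, r = {r:.4f}, GM = {gm:.4f}")
```

Because byx and bxy share the sign of the covariance term, their product is never negative, so the square root is always defined.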
62. If one regression coefficient is negative, the other is ...............................
a. 0
b. – ve
c. +ve
d. Either a or b
63. Arithmetic mean of the two regression coefficients is:
a. Equal to correlation coefficient
b. Greater than correlation coefficient
c. Less than correlation coefficient
d. Equal to or greater than correlation coefficient
64. byx is the regression coefficient of the regression equation.....................
a. Y on X
b. X on Y
c. Either a or b
d. None of these
65. bxy is the regression coefficient of the regression equation.....................
a. Y on X
b. X on Y
c. Either a or b
d. None of these
66. In ..................... regression analysis, only one independent variable is used to explain
the dependent variable.
a. Multiple
b. Non-linear
c. Linear
d. None of these
67. The regression coefficient and correlation coefficient of the two variables will be the
same if their .............................are same.
a. Arithmetic mean
b. Standard deviation
c. Geometric mean
d. Mean deviation
68. The idea of testing of hypothesis was first set forth by ..........................
a. R A Fisher
b. J Neyman
c. E L Lehmann
d. A Wald
69. By testing of hypothesis, we mean:
a. A significant procedure in Statistics
b. A method of making a significant statement
c. A rule for accepting or rejecting hypothesis
d. A significant estimation of a problem.
70. Testing of hypothesis and ......................are the two branches of statistical inference.
a. Statistical analysis
b. Probability
c. Correlation analysis
d. Estimation
71. ......................... is the original hypothesis
a. Null hypothesis
b. Alternative hypothesis
c. Either a or b
d. None of these
72. A null hypothesis is denoted by...........................
a. H0
b. H1
c. NH
d. None of these
73. An alternative hypothesis is denoted by...........................
a. H0
b. H1
c. AH
d. None of these
74. Whether a test is one sided or two sided, depends on........................
a. Simple hypothesis
b. Composite hypothesis
c. Null hypothesis
d. Alternative hypothesis
75. A wrong decision about null hypothesis leads to:
a. One kind of error
b. Two kinds of errors
c. Three kinds of errors
d. Four kinds of errors
76. Power of a test is related to ........................
a. Type I error
b. Type II error
c. Both a and b
d. None of these
77. Level of significance is the probability of................................
a. Type I error
b. Type II error
c. Both a and b
d. None of these
78. Which type of error is more severe error:
a. Type I error
b. Type II error
c. Both a and b
d. None of these
79. Type II error means..............................
a. Accepting a true hypothesis
b. Rejecting a true hypothesis
c. Accepting a wrong hypothesis
d. Rejecting a wrong hypothesis
80. Type I error is denoted by...........................
a. Alpha
b. Beta
c. Gamma
d. None of these
81. Type II error is denoted by....................................
a. Alpha
b. Beta
c. Gamma
d. None of these
82. The level of probability of accepting a true null hypothesis is called........................
a. Degree of freedom
b. Level of significance
c. Level of confidence
d. None of these
83. The probability of rejecting a true null hypothesis is called.......................
a. Degree of freedom
b. Level of significance
c. Level of confidence
d. None of these
84. 1 – Level of confidence =.............................
a. Level of significance
b. Degree of freedom
c. Either a or b
d. None of these
85. While testing a hypothesis, if level of significance is not mentioned, we take
................... level of significance.
a. 1%
b. 2%
c. 5%
d. 10%
86. A sample is treated as large sample, when its size is.............................
a. More than 100
b. More than 75
c. More than 50
d. More than 30
87. ...............refers to the number of independent observations which is obtained by
subtracting the number of constraints from the total number of observations.
a. Sample size
b. Degree of freedom
c. Level of significance
d. Level of confidence
88. Total number of observations – number of constraints =......................
a. Level of significance
b. Degree of freedom
c. Level of confidence
d. Sample size
89. Accepting a null hypothesis when it is false is called................................
a. Type I error
b. Type II error
c. Probable error
d. Standard error
90. Accepting a null hypothesis when it is true is called................................
a. Type I error
b. Type II error
c. Probable error
d. No error
91. When sample is small,....................... test is applied.
a. t-test
b. Z test
c. F test
d. None of these
92. To test a hypothesis about proportions of items in a class, the usual test is..............
a. t-test
b. Z- test
c. F test
d. Sign test
93. Student’s t-test is applicable when:
a. The values of the variate are independent
b. The variable is distributed normally
c. The sample is small
d. All the above
94. Testing of hypotheses Ho : μ = 45 vs. H1 : μ > 45 when the population standard
deviation is known, the appropriate test is:
a. t-test
b. Z test
c. Chi-square test
d. F test
95. Testing of hypotheses Ho : μ = 85 vs. H1 : μ > 85, is a ...................test.
a. One sided left tailed test
b. One sided right tailed test
c. Two tailed test
d. None of these
96. Testing of hypotheses Ho : μ = 65 vs. H1 : μ < 65, is a ...................test.
a. One sided left tailed test
b. One sided right tailed test
c. Two tailed test
d. None of these
97. Testing of hypotheses Ho : μ = 65 vs. H1 : μ ≠ 65, is a ...................test.
a. One sided left tailed test
b. One sided right tailed test
c. Two tailed test
d. None of these
98. Student’s t-test was designed by ............................
a. R A Fisher
b. Wilcoxon
c. Wald and Wolfowitz
d. W S Gosset
99. Z test was designed by ........................................
a. R A Fisher
b. Wilcoxon
c. Wald and Wolfowitz
d. W S Gosset
100. Z test was designed by .......................................
a. R A Fisher
b. Wilcoxon
c. Wald and Wolfowitz
d. W S Gosset
101. The range of the F ratio is ........................................
a. – 1 to + 1
b. – ∞ to ∞
c. 0 to ∞
d. 0 to 1
102. While computing F ratio, customarily, the larger variance is taken as .....................
a. Denominator
b. Numerator
c. Either way
d. None of these
103. Chi-square test was first used by ...............................
a. R A Fisher
b. William Gosset
c. James Bernoulli
d. Karl Pearson
104. The Chi-square quantity ranges from ........................ to ...........................
a. – 1 to + 1
b. – ∞ to ∞
c. 0 to ∞
d. 0 to 1
105. Degrees of freedom for the Chi-square test in case of a contingency table of order (2x2) is:
a. 4
b. 3
c. 2
d. 1
106. Degrees of freedom for the Chi-square test in case of a contingency table of order (4x3) is:
a. 4
b. 3
c. 6
d. 7
107. Degrees of freedom for the Chi-square test in case of a contingency table of order (5x5) is:
a. 25
b. 16
c. 10
d. Infinity
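Questions 105 to 107 all rest on the same rule: for a contingency table with r rows and c columns, the chi-square test has (r - 1)(c - 1) degrees of freedom. A minimal Python sketch, with an illustrative function name that is not from the source:

```python
def chi_square_df(rows: int, cols: int) -> int:
    """Degrees of freedom for a chi-square test on a rows x cols contingency table."""
    if rows < 2 or cols < 2:
        raise ValueError("a contingency table needs at least 2 rows and 2 columns")
    return (rows - 1) * (cols - 1)


for shape in [(2, 2), (4, 3), (5, 5)]:
    print(f"{shape[0]}x{shape[1]} table: d.f. = {chi_square_df(*shape)}")
```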
108. The magnitude of the difference between observed frequencies and expected
frequencies is called .......................
a. F value
b. Z value
c. t value
d. Chi-square value
109. When the expected frequencies and observed frequencies completely coincide, the
chi-square value will be ..............................
a. + 1
b. – 1
c. 0
d. None of these
110. If the discrepancy between observed and expected frequencies is greater,
......................... will be the chi-square value.
a. Greater
b. Smaller
c. Zero
d. None of these
111. Calculated value of chi-square is always........................
a. Positive
b. Negative
c. Zero
d. None of these
112. While applying the chi-square test, the frequency in any cell should not be ......................
a. More than 5
b. Less than 5
c. More than 10
d. Less than 10
113. Analysis of variance utilises..................
a. F test
b. Chi square test
c. Z test
d. t test
114. In one way ANOVA, the variances are:
a. Within samples
b. Between samples
c. Total
d. All
115. The technique of analysis of variance was developed by .............................
a. Frank Wilcoxon
b. Karl Pearson
c. R A Fisher
d. Kolmogorov
116. A non-parametric test is:
a. Distribution free test
b. Not concerned with parameter
c. Does not depend on the particular form of the distribution
d. None of these
117. ......................... tests follow assumptions about population parameters.
a. Parametric
b. Non-parametric
c. One-tailed
d. Two-tailed
118. ........................ is the simplest and most widely used non-parametric test
a. Sign test
b. K-S test
c. Chi-square test
d. Wilcoxon matched paired test
119. Runs test was designed by .............................
a. Kruskal and Wallis
b. Kolmogorov and Smirnov
c. Wald and Wolfowitz
d. Karl Pearson
120. Which one of the following is a non-parametric test?
a. F test
b. Z test
c. t test
d. Wilcoxon test
121. Control charts are also termed as...............................
a. Shewhart charts
b. Process behaviour chart
c. Both a and b
d. None of these
122. What type of chart will be used to plot the number of defectives in the output of any
process?
a. x̄ chart
b. R chart
c. C chart
d. P chart
123. Process control is carried out:
a. Before production
b. During production
c. After production
d. All of the above
124. The dividing lines between random and non-random deviations from mean of the
distribution are known as ..........................
a. Upper Control Limit
b. Lower Control Limit
c. Control Limits
d. Two sigma limit
125. The control chart used to monitor variables is...........................
a. Range chart
b. P-chart
c. C-chart
d. All of the above
126. The control chart used to monitor attributes is............................
a. Range chart
b. P-chart
c. C-chart
d. All of the above
127. The control chart used for the fraction of defective items in a sample
is............................
a. Range chart
b. P-chart
c. C-chart
d. Mean chart
128. The control chart used for the number of defects per unit is:
a. Range chart
b. P-chart
c. C-chart
d. Mean chart
129. ........................ is used for testing goodness of fit.
a. Wilcoxon test
b. Sign test
c. K-S Test
d. Chi-square test
130. Which of the following is a non-parametric test?
a. F-test
b. Z-test
c. Wilcoxon test
d. All of the above
131. Regression coefficient is independent of...........................
a. Origin
b. Scale
c. Both a and b
d. Neither origin nor scale
132. The geometric mean of the two regression coefficients, bxy and byx, is equal to:
a. r
b. r²
c. 1
d. None of the above
133. In a correlation analysis, if r = 0, then we may say that there is .................. between
variables.
a. No correlation
b. Linear correlation
c. Perfect correlation
d. None of these
134. If ‘r’ is the correlation coefficient between two variables, then:
a. 0 < r < 1
b. – 1 ≤ r ≤ 1
c. r ≥ 0
d. r ≤ 0
**********
ANSWERS
1 : c 21 : d 41 : d 61 : a 81 : b 101 : c 121 : c
2 : d 22 : a 42 : b 62 : b 82 : c 102 : b 122 : d
3 : a 23 : d 43 : b 63 : b 83 : b 103 : d 123 : b
4 : c 24 : b 44 : d 64 : a 84 : a 104 : c 124 : c
5 : d 25 : c 45 : d 65 : b 85 : c 105 : d 125 : a
6 : a 26 : c 46 : c 66 : c 86 : d 106 : c 126 : b
7 : b 27 : a 47 : a 67 : b 87 : b 107 : b 127 : b
8 : b 28 : b 48 : b 68 : b 88 : b 108 : d 128 : c
9 : a 29 : d 49 : d 69 : c 89 : b 109 : c 129 : d
10 : c 30 : d 50 : a 70 : d 90 : d 110 : a 130 : c
11 : d 31 : a 51 : c 71 : a 91 : a 111 : a 131 : a
12 : a 32 : c 52 : c 72 : a 92 : b 112 : b 132 : a
13 : b 33 : b 53 : a 73 : b 93 : d 113 : a 133 : a
14 : c 34 : a 54 : b 74 : d 94 : b 114 : d 134 : b
15 : d 35 : d 55 : a 75 : b 95 : b 115 : c
16 : a 36 : a 56 : c 76 : b 96 : a 116 : d
17 : b 37 : b 57 : b 77 : a 97 : c 117 : a
18 : c 38 : c 58 : d 78 : b 98 : d 118 : c
19 : b 39 : a 59 : b 79 : c 99 : a 119 : c
20 : a 40 : b 60 : b 80 : a 100 : a 120 : d
Prepared by
VINEETHAN T
Assistant Professor
MBA – II SEM-III
304 : Advanced Statistical Methods using R
MULTIPLE CHOICE QUESTIONS
Q No Question Answer
1. Which of the following is apply function in R?
a) apply()
b) tapply()
c) fapply()
d) rapply()
Answer: B
2. Point out the correct statement.
a) Writing functions is a core activity of an R programmer
b) Functions are often used to encapsulate a sequence of expressions that need to be executed numerous times
c) Functions are also often written when code must be shared with others or the public
d) All of the mentioned
Answer: D
parameters
c) Functions provide an abstraction of the code to potential users
d) Writing functions is a core activity of an R programmer
b)
d)
<= f()
8.
a)
Hello, world!
Hello, world!
b)
Hello, world!
Hello, world!
Hello, world!
c)
Hello, world!
d)
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Answer: B
> print(meaningoflife)
a) 32
b) 42
c) 52
d) 46
10. R has how many atomic classes of objects?
a) 1
b) 2
c) 3
d) 5
Answer: D
16. R objects can have attributes, which are like ________ for the object.
a) metadata
b) features
c) expression
d) dimensions
Answer: A
c) x <- c(T, F)
d) None of the mentioned
22. What will be the output of the following R code?
> x <- 6
> class(x)
a) "integer"
b) "numeric"
c) "real"
d) "imaginary"
Answer: B
b)
26. What will be the output of the following R code?
> sqrt(-17)
a) -4.02
b) 4.02
c) NaN
d) 3.67
Answer: C
b)
d)
a) header
b) sep
c) file
d) footer
c) save_image
d) get
41. Which of the following statements will load the objects from the file named "mydata.RData"?
a) save("mydata.RData")
b) load("mydata.RData")
c) loadAll("mydata.RData")
d) put("mydata.RData")
Answer: B
56. R is an __________ programming language.
a) Closed source
b) GPL
c) Open source
d) Definite source
Answer: C
57. Who developed R?
a) Dennis Ritchie
b) John Chambers
c) Bjarne Stroustrup
Answer: B
c) Future
a) debug()
b) trace()
c) browser()
d) None of the above
68. Which function is used to create the vector with more than one element?
a) Library()
b) plot()
c) c()
d) par()
Answer: C
c) Both
b) 4
c) 5
d) 6
c) Both
4. What is a hypothesis?
1. A statement that the researcher wants to test through the data collected in a study.
2. A research question the results will answer.
3. A theory that underpins the study.
4. A statistical method for calculating the extent to which the results could have happened by
chance.
Answer: A statement that the researcher wants to test through the data collected in a study.
5. What is the cyclical process of collecting and analysing data during a single research study called?
Answer: Interim analysis
7. An advantage of using computer programs for qualitative data is that they ______
1. Can reduce time required to analyse data (i.e., after the data are transcribed)
2. Help in storing and organising data
3. Make many procedures available that are rarely done by hand due to time constraints
4. All of the above
Answer: All of the Above
8. Boolean operators are words that are used to create logical combinations. 1. True 2. False
Answer: True
9. ______ are the basic building blocks of qualitative data.
Answer: Categories
10. This is the process of transforming qualitative research data from written interviews or field
notes into typed text.
1. Segmenting
2. Coding
3. Transcription
4. Mnemoning
Answer: Transcription
11. A challenge of qualitative data analysis is that it often includes data that are unwieldy and
complex; it is a major challenge to make sense of the large pool of data.
1. True
2. False
Answer: True
12. Hypothesis testing and estimation are both types of descriptive statistics.
1. True 2. False
Answer: False
14. A graph that uses vertical bars to represent data is called a ___
1. Line graph
2. Bar graph
3. Scatterplot
4. Vertical graph
Answer: Bar graph
15. ____ are used when you want to visually examine the relationship between two quantitative
variables.
1. Bar graph
2. pie graph
3. line graph
4. Scatterplot
Answer: Scatterplot
19. A hypothesis that is assumed to be true and then tested for possible rejection is called the:
1. Null Hypothesis
2. Statistical Hypothesis
3. Simple Hypothesis
4. Composite Hypothesis
Answer: Null Hypothesis
20. If the null hypothesis is false then which of the following is accepted?
1. Null Hypothesis
2. Positive Hypothesis
3. Negative Hypothesis
4. Alternative Hypothesis.
Answer: Alternative Hypothesis.
21. The alternative hypothesis is also called the:
1. Composite Hypothesis
2. Research Hypothesis
3. Simple Hypothesis
4. Null Hypothesis
Answer: Research Hypothesis
1. What is the minimum number of variables/features required to perform clustering?
1. 0
2. 1
3. 2
4. 3
Answer: 1
2. For two runs of K-Means clustering, is it expected to get the same clustering results? 1. Yes 2. No
Answer: No
3. Which of the following algorithms is most sensitive to outliers?
Answer: K-means clustering
4. The discrete variables and continuous variables are two types of
1. Open end classification
2. Time series classification
3. Qualitative classification
4. Quantitative classification
Answer: Quantitative classification
5. A Bayesian classifier is
1. A class of learning algorithm that tries to find an optimum classification of a set of examples using
the probabilistic theory.
2. Any mechanism employed by a learning system to constrain the search space of a hypothesis
3. An approach to the design of learning algorithms that is inspired by the fact that when people
encounter new situations, they often explain them by reference to familiar experiences, adapting the
explanations to fit the new situation
Answer: A class of learning algorithm that tries to find an optimum classification of a set of examples
using the probabilistic theory.
6. Classification accuracy is
1. A subdivision of a set of examples into a number of classes
2. Measure of the accuracy, of the classification of a concept that is given by a certain theory
3. The task of assigning a classification to a set of examples
4. None of these
Answer: Measure of the accuracy, of the classification of a concept that is given by a certain theory
7. Euclidean distance measure is
1. A stage of the KDD process in which new data is added to the existing selection.
2. The process of finding a solution for a problem simply by enumerating all possible solutions
according to some pre-defined order and then testing them
3. The distance between two points as calculated using the Pythagoras theorem
4. none of above
Answer: The distance between two points as calculated using the Pythagoras theorem
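The Euclidean distance described in question 7 follows directly from the Pythagoras theorem: square the coordinate differences, sum them, and take the square root. A minimal Python sketch, with an illustrative function name and sample points that are not from the source:

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# A 3-4-5 right triangle: the distance is the hypotenuse.
print(euclidean_distance((0, 0), (3, 4)))
```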
18. Point out the wrong statement.
1. k-nearest neighbor is same as k-means
2. k-means clustering is a method of vector quantization
3. k-means clustering aims to partition n observations into k clusters
4. none of the mentioned
Answer: k-nearest neighbor is same as k-means
19. Consider the following example: “How can we divide a set of articles such that those articles have
the same theme (we do not know the theme of the articles
Answer: Clustering
20. Can we use K Mean Clustering to identify the objects in video?
1. Yes 2. No
Answer: Yes
21. Clustering techniques are ______ in the sense that the data scientist does not determine, in advance, the
labels to apply to the clusters.
1. Unsupervised
2. supervised
3. Reinforcement
4. Neural network
Answer: Unsupervised
22. The ______ metric is examined to determine a reasonably optimal value of k.
1. Mean Square Error
2. Within Sum of Squares (WSS)
3. Speed
Answer: Within Sum of Squares (WSS)
23. If an itemset is considered frequent, then any subset of the frequent itemset must also be
frequent.
1. Apriori Property
2. Downward Closure Property
3. Either 1 or 2
4. Both 1 and 2
Answer: Both 1 and 2
24. If {bread, eggs, milk} has a support of 0.15 and {bread, eggs} also has a support of 0.15, the
confidence of the rule {bread, eggs} → {milk} is
1. 0
2. 1
3. 2
4. 3
Answer: 1
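The confidence asked for in question 24 is the support of the full itemset divided by the support of the rule's left-hand side. A minimal Python sketch using the support values from the question; the function and parameter names are illustrative assumptions.

```python
def confidence(support_both: float, support_antecedent: float) -> float:
    """Confidence of a rule X -> Y: support(X and Y together) / support(X)."""
    if support_antecedent <= 0:
        raise ValueError("support of the antecedent must be positive")
    return support_both / support_antecedent


# support({bread, eggs, milk}) = 0.15 and support({bread, eggs}) = 0.15
print(confidence(0.15, 0.15))
```

Equal supports mean that every transaction containing bread and eggs also contains milk, so the confidence reaches its maximum.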
25. Confidence is a measure of how X and Y are really related rather than coincidentally
happening together.
Answer: False
26. ______ recommend items based on similarity measures between users and/or items.
1. Content Based Systems
2. Hybrid System
3. Collaborative Filtering Systems
4. None of these
Answer: Collaborative Filtering Systems
27. There are ______ major classifications of collaborative filtering mechanisms.
1. 1
2. 2
3. 3
4. None of the above
Answer: 2
28. Movie Recommendation to people is an example of
1. User Based Recommendation
2. Item Based Recommendation
3. Knowledge Based Recommendation
Answer: Item Based Recommendation
The maximum value of entropy depends on the number of classes. If we have 8 classes, what will be
the maximum entropy?
1. Max Entropy is 1
2. Max Entropy is 2
3. Max Entropy is 3
4. Max Entropy is 4
Answer: Max Entropy is 3
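The answer to the entropy question above can be confirmed numerically: with k equally likely classes, Shannon entropy in bits reaches its maximum of log2(k). A minimal Python sketch, in which the function names are illustrative and not from the source:

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def max_entropy(num_classes: int) -> float:
    """Largest possible entropy for num_classes classes (uniform distribution)."""
    if num_classes < 1:
        raise ValueError("at least one class is required")
    return math.log2(num_classes)


uniform = [1 / 8] * 8
print(max_entropy(8), entropy(uniform))
```

Any non-uniform distribution over the same 8 classes gives a value below this maximum.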
5. High entropy means that the partitions in classification are
1. Pure
2. Not pure
3. Useful
4. Useless
Answer: Not pure
16. Which of the following statements about Naive Bayes is incorrect?
1. Attributes are equally important.
2. Attributes are statistically dependent on one another given the class value.
3. Attributes are statistically independent of one another given the class value.
Answer: Attributes are statistically dependent on one another given the class value.
Answer: The process of extracting implicit, previously unknown and potentially useful information
from data
11. Hidden knowledge refers to
1. A set of databases from different vendors, possibly using different database paradigms
2. An approach to a problem that is not guaranteed to work but performs well in most cases
3. Information that is hidden in a database and that cannot be recovered by a simple SQL query.
4. None of these
Answer: Information that is hidden in a database and that cannot be recovered by a simple SQL query.
12. Decision trees cannot handle categorical attributes with many distinct values, such as country
codes for telephone numbers.
1. True 2. False
Answer: False
15. Enrichment is
1. A stage of the KDD process in which new data is added to the existing selection
2. The process of finding a solution for a problem simply by enumerating all possible solutions according to some pre-defined order and then testing them
3. The distance between two points as calculated using the Pythagoras theorem
4. None of these
Answer: A stage of the KDD process in which new data is added to the existing selection