DWMMCQ

Question Bank : BECT406T: Data Warehousing & Mining (MCQ)
Unit 1: Introduction
1) __________ is a subject-oriented, integrated, time-variant, nonvolatile collection of data in
support of management decisions.
A) Data Mining.
B) Data Warehousing.
C) Web Mining.
D) Text Mining.
2) __________ is the heart of the warehouse.
A) Data mining database servers.
B) Data warehouse database servers.
C) Data mart database servers.
D) Relational database servers.
3) Data can be updated in _____ environment.
A) data warehouse.
B) data mining.
C) operational.
D) informational
4) The star schema is composed of __________ fact table.
A) one
B) two.
C) three.
D) four.
5) The source of all data warehouse data is the ____________.
A) Operational environment.
B) Informal environment.
C) Formal environment.
D) Technology environment
6) Data warehouse contains _____________ data that is never found in the operational
environment.
A) normalized.
B) informational.
C) summary.
D) denormalized.
7) An operational system
A) runs the business in real time and is based on historical data.
B) runs the business in real time and is based on current data.
C) is used to support decision making and is based on current data.
D) is used to support decision making and is based on historical data.
8) Data cleaning is
A) a collection of large data mostly stored in a computer system.
B) the removal of noise, errors, and incorrect input from a database.
C) a systematic description of the syntactic structure of a specific database. It describes the structure of
the attributes, the tables, and foreign key relationships.
D) All the above
9) Data warehouses support _________________
A) OLTP
B) OLAP and OLTP
C) OLAP
D) Operational databases
10) __________ describes the data contained in the data warehouse.
A) Relational data.
B) Operational data.
C) Metadata.
D) Informational data.
UNIT II Fundamentals of data mining
1. ...................... is an essential process where intelligent methods are applied to extract data patterns.
A) Data warehousing
B) Data mining
C) Text mining
D) Data selection
2. Data mining can also be applied to other forms of data such as ................
i) Data streams
ii) Sequence data
iii) Networked data
iv) Text data
v) Spatial data
A) i, ii, iii and v only
B) ii, iii, iv and v only
C) i, iii, iv and v only
D) All i, ii, iii, iv and v
3. Which of the following is not a data mining functionality?
A) Characterization and Discrimination
B) Classification and regression
C) Selection and interpretation
D) Clustering and Analysis
4. ............................. is a summarization of the general characteristics or features of a target class of
data.
A) Data Characterization
B) Data Classification
C) Data discrimination
D) Data selection
5. ............................. is a comparison of the general features of the target class data objects against the
general features of objects from one or multiple contrasting classes.
A) Data Characterization
B) Data Classification
C) Data discrimination
D) Data selection
6. Strategic value of data mining is ......................
A) cost-sensitive
B) work-sensitive
C) time-sensitive
D) technical-sensitive
7. ............................. is the process of finding a model that describes and distinguishes data classes or
concepts.
A) Data Characterization
B) Data Classification
C) Data discrimination
D) Data selection
8. The various aspects of data mining methodologies is/are ...................
i) Mining various and new kinds of knowledge
ii) Mining knowledge in multidimensional space
iii) Pattern evaluation and pattern- or constraint-guided mining.
iv) Handling uncertainty, noise, or incompleteness of data
A) i, ii and iv only
B) ii, iii and iv only
C) i, ii and iii only
D) All i, ii, iii and iv
9. The full form of KDD is ..................
A) Knowledge Database
B) Knowledge Discovery Database
C) Knowledge Data House
D) Knowledge Data Definition
10. The output of KDD is .............
A) Data
B) Information
C) Query
D) Useful information
Unit 3: Classification & Clustering
11) ____________ maps data into predefined groups.
A) Regression
B) Time series analysis
C) Prediction
D) Classification
12) A frequent pattern tree is a tree structure consisting of ________
A) an item-prefix-tree.
B) a frequent-item-header table.
C) a frequent-item-node.
D) both A & B
13) The non-root node of an item-prefix-tree consists of ________ fields.
A) two.
B) three.
C) four.
D) five.
14) The paths from the root node to the nodes labelled 'a' are called __________.
A) transformed prefix path.
B) suffix subpath.
C) transformed suffix path.
D) prefix subpath.
15) The transformed prefix paths of a node 'a' form a truncated database of patterns which
co-occur with 'a'; this is called the _______.
A) suffix path.
B) FP-tree.
C) conditional pattern base.
D) prefix path.
16) BIRCH is a(n) ________
A) agglomerative clustering algorithm.
B) hierarchical algorithm.
C) hierarchical-agglomerative algorithm.
D) divisive.
17) Which of the following is a clustering algorithm?
A) Apriori.
B) CLARA.
C) Pincer-Search.
D) FP-growth.
18) In the ________ algorithm each cluster is represented by the center of gravity of the cluster.
A) k-medoid.
B) k-means.
C) STIRR.
D) ROCK.
19) In ___________ each cluster is represented by one of the objects of the cluster located near
the center.
A) k-medoid.
B) k-means.
C) STIRR.
D) ROCK.
20) Pick out a hierarchical clustering algorithm.
A) DBSCAN.
B) BIRCH.
C) PAM.
D) CURE.
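Questions 18 and 19 above contrast two ways of representing a cluster. The following is only a minimal sketch with made-up sample points (the function names `centroid` and `medoid` are illustrative, not taken from any library): the center of gravity is the coordinate-wise mean and need not be one of the objects, while a medoid is an actual object of the cluster, chosen here as the one with the smallest total distance to all the others.

```python
import math

def centroid(points):
    """Center of gravity: the coordinate-wise mean of the points."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))

def medoid(points):
    """The actual point whose summed distance to all points is smallest."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

# Hypothetical cluster with one far-away object.
cluster = [(1.0, 1.0), (2.0, 1.0), (1.5, 2.0), (9.0, 9.0)]

center = centroid(cluster)  # a computed location, not a member of the cluster
rep = medoid(cluster)       # always one of the cluster's own objects
```

In this sample the far-away object pulls the mean away from the dense group, whereas the medoid stays on a real object inside it.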
UNIT IV Mining frequent patterns and Association Rules:
1. The number of iterations in Apriori ___________
a. increases with the size of the data
b. decreases with the increase in size of the data
c. increases with the size of the maximum frequent set
d. decreases with increase in size of the maximum frequent set
2. Which of the following are interestingness measures for association rules?
a. recall
b. lift
c. accuracy
d. compactness
3. The set of frequent item sets is a
a. Superset of only closed frequent item sets
b. Superset of only maximal frequent item sets
c. Subset of maximal frequent item sets
d. Superset of both closed frequent item sets and maximal frequent item sets
4. In the Apriori algorithm, if there are 100 frequent 1-itemsets, then the number of candidate 2-itemsets is
a. 100
b. 4950
c. 200
d. 5000
5. A significant bottleneck in the Apriori algorithm is
a. Finding frequent itemsets
b. Pruning
c. Candidate generation
d. Number of iterations
6. Which association rule would you prefer?
a. High support and medium confidence
b. High support and low confidence
c. Low support and high confidence
d. Low support and low confidence
7. The FP-growth algorithm has ________ phases.
A. one.
B. two.
C. three.
D. four.
8. Which of the following is a predictive model?
A. Clustering.
B. Regression.
C. Summarization.
D. Association rules.
9. The basic idea of the Apriori algorithm is to generate ________ item sets of a particular size and then scan the
database.
A. candidate.
B. primary.
C. secondary.
D. Superkey.
10. If an item set ‘XYZ’ is a frequent item set, then all subsets of that frequent item set are
a. Undefined
b. Not frequent
c. Frequent
d. Cannot say
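The counting in question 4 and the subset property in question 10 above can be checked with a short sketch. This is an illustration under stated assumptions, not a full Apriori implementation: the item names, the four sample transactions, and the helper names `candidate_pairs` and `support` are all hypothetical.

```python
from itertools import combinations
from math import comb

def candidate_pairs(frequent_items):
    """Join step for size 2: every unordered pair of frequent 1-itemsets."""
    return list(combinations(sorted(frequent_items), 2))

def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    wanted = set(itemset)
    return sum(wanted <= t for t in transactions) / len(transactions)

# 100 hypothetical frequent 1-itemsets give C(100, 2) candidate pairs.
items = [f"i{k}" for k in range(100)]
n_candidates = len(candidate_pairs(items))

# Made-up transactions: a subset can never be rarer than its superset.
transactions = [{"X", "Y", "Z"}, {"X", "Y", "Z"}, {"X", "Y"}, {"Z"}]
xyz_support = support(("X", "Y", "Z"), transactions)
subset_supports = [
    support(subset, transactions)
    for size in (1, 2)
    for subset in combinations("XYZ", size)
]
```

Every transaction containing an itemset also contains each of its subsets, so each value in `subset_supports` is at least `xyz_support`.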
Unit 5: Web Data Mining
21) Web content mining describes the discovery of useful information from the _______ contents.
A) text.
B) web.
C) page.
D) level
22) _______ mining is concerned with discovering the model underlying the link structures of
the web.
A) Data structure.
B) Web structure
C) Text structure
D) Image structure.
23) The ________ proposes a measure of the standing of a node based on path counting.
A) open web.
B) close web.
C) link web.
D) hidden web.
24) In web mining, _______ is used to find natural groupings of users, pages, etc.
A) clustering.
B) associations.
C) sequential analysis.
D) classification.
25) In web mining, _________ is used to know the order in which URLs tend to be accessed.
A) clustering.
B) associations.
C) sequential analysis.
D) classification.
26) In web mining, _________ is used to know which URLs tend to be requested together.
A) clustering.
B) associations.
C) sequential analysis.
D) classification.
27) __________ describes the discovery of useful information from the web contents.
A) Web content mining.
B) Web structure mining.
C) Web usage mining.
D) All of the above.
28) _______ is concerned with discovering the model underlying the link structures of the web.
A) Web content mining.
B) Web structure mining
C) Web usage mining.
D) All of the above
29) A link is said to be _________ link if it is between pages with different domain names.
A) intrinsic.
B) transverse.
C) direct.
D) contrast.
30) A link is said to be _______ link if it is between pages with the same domain name.
A) intrinsic.
B) transverse.
C) direct.
D) contrast.
UNIT VI Big data Analytics
1. Hadoop is a framework that works with a variety of related tools. Common cohorts include
____________
a) MapReduce, Hive and HBase
b) MapReduce, MySQL and Google Apps
c) MapReduce, Hummer and Iguana
d) MapReduce, Heron and Trumpet
2. What was Hadoop named after?
a) Creator Doug Cutting’s favorite circus act
b) Cutting’s high school rock band
c) The toy elephant of Cutting’s son
d) A sound Cutting’s laptop made during Hadoop development
3. __________ can best be described as a programming model used to develop Hadoop-based applications
that can process massive amounts of data.
a) MapReduce
b) Mahout
c) Oozie
d) All of the mentioned
4. Facebook Tackles Big Data With _______ based on Hadoop.
a) ‘Project Prism’
b) ‘Prism’
c) ‘Project Big’
d) ‘Project Data’
5. What are the five V’s of Big Data?
a) Volume
b) Velocity
c) Variety
d) All the above
6. Above the file systems comes the ________ engine, which consists of one Job Tracker, to which client
applications submit MapReduce jobs.
A. MapReduce
B. Google
C. Functional Programming
D. Facebook
7. ________ is a platform for constructing data flows for extract, transform, and load (ETL) processing
and analysis of large datasets.
A. Pig Latin
B. Oozie
C. Pig
D. Hive
8. According to analysts, for what can traditional IT systems provide a foundation when they’re
integrated with big data technologies like Hadoop?
a) Big data management and data mining
b) Data warehousing and business intelligence
c) Management of Hadoop clusters
d) Collecting and storing unstructured data
9. What are the different features of Big Data Analytics?
A. Open Source
B. Data Recovery
C. Scalability
D. all of the above
10. What are the main components of Big Data?
A. MapReduce
B. HDFS
C. YARN
D. all the above
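Questions 3 and 6 above refer to a programming model built from a map step and a reduce step. The following is a rough single-process imitation of that idea in plain Python, assuming a word-count task; it does not use Hadoop or its API, and the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single total."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Made-up input lines.
counts = word_count(["big data big clusters", "data mining"])
```

In a real cluster the map and reduce calls would run on many machines in parallel; here they run one after another only to show the data flow.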
