Data Mining

UNIT-3
1. ___________ contains a subset of corporate-wide data that is of value to a specific group of users.
2. ________ is the estimate of the strength of the implication of the rule.
3. ________ is the estimate of the strength of the implication of the rule.
4. The efficiency of Apriori can be improved by the ________ technique. [ ]

a) hash based b) transaction reduction c) both d) none
5. Attribute oriented induction method is [ ]

a) Offline data analysis technique b) Online data analysis technique
c) Both offline and online d) None
6. DM systems typically have a default generalized relation threshold value ranging from
__________________ to __________________
7. Multidimensional association rules with no repeated predicates are called ______ association rules.
8. The Apriori algorithm employs a level-wise search, where k-itemsets use ________ itemsets.
a) k b) (k-1)
c) (k+1) d) (k+2)
9. _____ is used to improve the efficiency of the Apriori algorithm.
a) Berg queries b) Iceberg queries
c) Ice Burg queries d) Ice Cube queries
10. "All nonempty subsets of a frequent itemset must also be frequent." This is the _____ property. [ b ]
a) candidate b) Apriori c) FP d) Support
11. A _________ contains a subset of corporate-wide data that is of value to a specific group of
users. [ b ]
a) Enterprise warehouse b) Data mart c) Virtual warehouse d) Data warehouse

12. The attribute removal method is used to [ ]
a) Generalize the data b) Specialize the data c) Focus the data d) Study the data
13. Multidimensional association rules with no repeated predicates are called ______ rules.
14. The Apriori algorithm employs a level-wise search, where k-itemsets use ________ itemsets.
a) k b) (k-1)
c) (k+1) d) (k+2)
15. ________ is used to improve the efficiency of the Apriori algorithm.

a) Berg queries b) Iceberg queries
c) Ice Burg queries d) Ice Cube queries
16. Anti-monotone, monotone, succinct, convertible and inconvertible are five different
categories of ________.
a) Knowledge type constraints b) Data constraints
c) Interestingness constraints d) Rule constraints
17. The rule "IF NOT A1 AND A2 THEN NOT C1" is encoded as ________.
a) 101 b) 010
c) 001 d) 110
18. If a rule concerns associations between the presence or absence of items, it is a ________
rule.
a) Boolean association b) Quantitative association
c) Frequent association d) Transaction association
19. A set of items is referred to as an ________.
20. Multidimensional association rules with repeated predicates, which contain multiple
occurrences of some predicates, are called ________ rules.
21. The rule "IF NOT A1 AND A2 THEN NOT C1" is encoded as ________.
a) 101 b) 010
c) 001 d) 110
22. If a rule concerns associations between the presence or absence of items, it is a ________ rule.
23. If a rule concerns associations between the presence or absence of items, it is a ____ rule. [ ]
24. _ _ _ _ _ _ _ _ _ _ uses concept hierarchies to generalize the data by replacing lower-level data with
higher-level concepts. [ ]
a) Analysis oriented induction b) Algorithm oriented induction
c) Attribute oriented induction d) Approach oriented induction
25. ___________ specify the type of knowledge to be mined, such as association. [ ]

a) Knowledge type constraints b) Data constraints c) Rule constraints d) None
26. _ _ _ _ _ _ _ _ is an optimization method for spatial association analysis. [ ]

a) Progressive regression b) progressive refinement
c) Progressive coverage d) refinement property
27. percent(A, "70, 71, ..., 80") => placement(A, "Infosys"). The above rule clearly refers to a
_ _ _ _ _ _ _ _ _ _ _ _ rule. [ ]
c) Single dimensional association d) Multi dimensional association
28. Multidimensional association rules with repeated predicates, which contain multiple
occurrences of some predicates, are called _ _ _ _ _ _ _ _ _ rules.
29. If a rule describes associations between quantitative attributes, it is a _ _ _ _ _ _ _ _ _ _ rule.

30. A data matrix is often called a [ ]
a) One-mode matrix b) Two-mode matrix c) Three-mode matrix d) None
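The Apriori questions in this unit rest on three ideas: support, confidence (the strength of the implication of a rule), and a level-wise search in which frequent (k-1)-itemsets are joined into candidate k-itemsets and pruned with the Apriori property. A minimal Python sketch of these ideas follows. The toy transactions, the function names, and the minimum support count of 2 are assumptions made for illustration, not a reference implementation.

```python
from itertools import combinations

# Toy transactions (assumed data, for illustration only).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter", "jam"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) for the rule antecedent => consequent."""
    return (support_count(antecedent | consequent, transactions)
            / support_count(antecedent, transactions))

def apriori(transactions, min_count):
    """Level-wise search: frequent (k-1)-itemsets generate candidate k-itemsets."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if support_count(frozenset([i]), transactions) >= min_count]
    frequent = list(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = [c for c in candidates
                 if support_count(c, transactions) >= min_count]
        frequent.extend(level)
        k += 1
    return frequent

FREQUENT = apriori(TRANSACTIONS, min_count=2)
```

On this toy data, "jam" occurs in only one transaction, so no itemset containing it is ever generated as a candidate beyond the first level.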
UNIT-4
1. Rule constraints can be classified into ________ categories. [ ]

a) 1 b) 2 c) 3 d) 5
2. In the ________ algorithm, each cluster is represented by the mean value of the objects in the
cluster. [ ]
a) k-medoids b) k-means c) CURE d) BIRCH
3. Support vector machines are a method for the classification of _______ data. [ ]
a) linear b) nonlinear c) both a & b d) none
4. ________ analysis can be used to model the relationship between one or more independent or
predictor variables and a dependent or response variable.
5. The _________ of a classifier on a given test set is the percentage of test set tuples that
are correctly classified by the classifier.
6. In backpropagation, the weights and biases are updated after all of the tuples in the training set have
been presented. This strategy is called [ b ]
a) Terminating updating b) Epoch updating c) Case updating d) Sample updating
7. Generalization is performed by either ________________ or attribute generalization.
8. Bayesian classification is based on ___________________ theorem.
9. Bayes' theorem provides a way of calculating the ________ probability.

10. Bayes' theorem provides a way of calculating which probability?
11. Decision trees can easily be converted to _ _ _ _ _ _ _ rules. [ ]

a) IF b) Nested IF c) IF-THEN d) GROUP BY
12. In _______, duplicate subtrees exist within the tree. [ ]

a) replication b) fragmentation c) repetition d) none
13. _________ occurs when an attribute is repeatedly tested along a given branch of a tree. [ ]
a) repetition b) replication c) fragmentation d) none
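Two of the questions in this unit, on classifier accuracy and on Bayes' theorem, reduce to short formulas. A minimal Python sketch under assumed inputs follows. The function names, the label lists, and the probabilities are made up for illustration.

```python
def accuracy(true_labels, predicted_labels):
    """Percentage of test tuples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)

def posterior(prior_h, likelihood_x_given_h, evidence_x):
    """Bayes' theorem: P(H|X) = P(X|H) * P(H) / P(X)."""
    return likelihood_x_given_h * prior_h / evidence_x

# Assumed example: 4 of 5 test tuples are classified correctly.
ACC = accuracy(["yes", "no", "yes", "yes", "no"],
               ["yes", "no", "no", "yes", "no"])

# Assumed example: P(H) = 0.25, P(X|H) = 0.5, P(X) = 0.5.
POST = posterior(0.25, 0.5, 0.5)
```

With these assumed numbers, the accuracy is 80 percent and the probability of H given X is 0.25.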
UNIT-5
1. ________ is the task of discovering interesting patterns from large amounts of data.
2. ________ hierarchy may formally express existing semantic relationships between attributes.
a) schema b) set-grouping
c) operation-derived d) none
3. Pattern evaluation is an issue related to [ ]

a) mining methodology and user interaction issues b) performance issues
c) issues relating to measurement d) none
4. ________ hierarchy may formally express existing semantic relationships between attributes.
a) schema b) set-grouping
c) operation-derived d) none
5. The usefulness of a pattern can be measured by [ ]

a) Simplicity b) Confidence c) Support d) Novelty
6. A ________________ model consists of radial lines emanating from a central point, where each line
represents a concept hierarchy for a dimension.
7. Concept characterization follows the ________ approach.

a) Data cube OLAP b) AOI
c) both d) none
8. In data characterization, tuple typicality is measured by __________.
9. _____ hierarchy may formally express existing semantic relationships between attributes. [ ]
a) schema b) set-grouping c) operation-derived d) none
10. DBSCAN is a ________-based clustering algorithm. [ ]

a) partitioning b) hierarchical c) density d) grid
11. _ _ _ _ _ _ is a density-based method that computes an augmented cluster ordering.
12. In the _ _ _ _ _ _ algorithm, each cluster is represented by one of the objects located
near the center of the cluster.
13. Clustering large applications can be shortened as ________ .

a) CLA b) CLAPP
c) CLARA d) CLULA
14. Clustering large applications can be shortened as ________ .
a) CLA b) CLAPP
c) CLARA d) CLULA
15. The absolute closeness between two clusters, normalized with respect to the internal closeness of the two
clusters, is [ ]
a) Relative distance b) Relative interconnectivity
c) Relative density d) Relative closeness
16. Which method overcomes the problem of favoring clusters with spherical shapes and
similar sizes? ______________
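The clustering questions in Units 4 and 5 contrast two kinds of cluster representative: the mean value of the objects in the cluster, and an actual object located near the center of the cluster. A minimal Python sketch of the difference follows. The function names and the sample points, including the deliberate outlier, are assumptions made for illustration.

```python
def mean_point(points):
    """Mean-based representative: the coordinate-wise mean of the objects."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def medoid(points):
    """Object-based representative: the actual object with the smallest
    total Euclidean distance to all objects in the cluster."""
    def total_distance(c):
        return sum(sum((a - b) ** 2 for a, b in zip(c, p)) ** 0.5 for p in points)
    return min(points, key=total_distance)

# Assumed cluster: four nearby objects and one outlier at (14, 14).
CLUSTER = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0), (14.0, 14.0)]

MEAN = mean_point(CLUSTER)
MEDOID = medoid(CLUSTER)
```

On this sample the outlier pulls the mean to (4.0, 4.0), which is not one of the objects, while the medoid stays at the object (2.0, 2.0).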
1. ______________ describes the discovery of useful information from web contents.
a. Web content mining
2. _______________ is concerned with discovering the model underlying the link structure of the
web.
a. Web structure mining
3. Web structure mining is the process of discovering ______ information from the web. [ c ]
a. semistructured b. unstructured c. structured d. none
4. PageRank is a measure for __________ documents based on their quality.
a. ranking hypertext
5. A web server log includes ___________
a. IP address, access time, page reference
6. The main purpose of structure mining is to extract previously unknown relationships
between ______
a. web pages
7. k-means is an example of _____________
a. clustering
8. Web usage mining is the process of identifying or discovering interesting usage patterns from large
data sets.
9. Structure mining basically shows the structured summary of a particular website.
10. Web mining is the application of data mining techniques to automatically discover and extract
information from Web documents and services.
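The PageRank question above concerns ranking hypertext documents through the link structure of the web. A minimal power-iteration sketch in Python follows. The three-page link graph, the damping factor of 0.85, the iteration count, and the function name are assumptions for illustration. The sketch also assumes that every page has at least one outgoing link.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict mapping page -> list of outgoing links.
    Assumes every page has at least one outgoing link."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same base share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p, outgoing in links.items():
            # A page passes its damped rank on in equal shares to its targets.
            share = damping * rank[p] / len(outgoing)
            for q in outgoing:
                new_rank[q] += share
        rank = new_rank
    return rank

# Assumed toy web: A and B both link to C, and C links back to A.
LINKS = {"A": ["C"], "B": ["C"], "C": ["A"]}
RANKS = pagerank(LINKS)
```

In this toy graph, C receives links from both other pages and ends up ranked highest, while B, which no page links to, is ranked lowest.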
SET NO 1
1. Compare and contrast classification methods.

2. Specify the key issues in hierarchical clustering.
3. Explain the different methods used for text databases.
4. Define clustering and describe the categorization of major clustering methods.
SET NO 2
1. Discuss the PAM algorithm and the issues in k-means.
2. Discuss web mining.
3. Explain the different data types used in cluster analysis.
4. Discuss k-nearest neighbor classifiers and case-based reasoning.
SET NO 3
1. What is meant by outlier analysis? Differentiate between agglomerative and divisive hierarchical
clustering.
2. Discuss web content mining, web structure mining and web usage mining.
3. Discuss Bayesian classification.
4. Discuss text mining.
