Department Vision:
To produce technically sound and ethically responsible Computer Engineers for society.
Department Mission:
1. To provide a healthy learning environment based on current and future industrial demands.
3. To build technically and ethically strong minds with real-life problem-solving capabilities.
1 Explain why data warehouses are needed for developing business solutions from today's perspective. Discuss the role of data marts. (NOV 2016) 07
2 Define the following terms and differentiate them: Data Mart, Enterprise Warehouse and Virtual Warehouse. (MAY 2017) (DEC-2018) 07/03
3 Explain the role of Business Intelligence in any one of the following domains: fraud detection, market segmentation, retail industry, telecommunications industry. Explain how data mining can be helpful in any of these cases. (MAY 2017) 07
4 Define the following terms: Business Intelligence, Data Mart, Closed frequent itemset, Outlier Analysis. (NOV 2017) 04
5 Give a feature-wise comparison of BI and DW. (MAY-2018) 04
6 Explain various features of a Data Warehouse. (DEC-2018) 04
7 Differentiate between Operational Database System and Data Warehouse (DEC-2018) 07
8 Discuss the application of data warehousing and data mining (DEC-2018) 04
9 "A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data." Justify. (MAY-2019) 04
10 Can BI be used for DM, or vice versa? Justify. (DEC-2019) 03
11 Explain why data warehouses are needed for developing business solutions from today's perspective. Discuss the role of data marts. (DEC-2019) 03
2. The Architecture of BI and DW [CO-1]
13 Discuss possible ways of integrating a Data Mining system with a Database or Data Warehouse system. (MAY-2019) 04
14 Compare data mart and data warehouse. (MAY-2019) 03
15 Discuss star schema and fact constellation schema with a diagram. (MAY-2019) 04
1 Define the term "data mining". Discuss the major issues in data mining. (NOV 2016) 07/03
2 What is Data Mining? Why is it called data mining rather than knowledge mining? Explain the KDD process. (MAY 2017) (DEC-2018) 07
3 Explain mining in the following databases with an example: (1) Temporal Databases, (2) Sequence Databases, (3) Spatial Databases, (4) Spatiotemporal Databases. (MAY 2017) (DEC-2018) 07
4 What is the importance of visualization of discovered patterns? Explain the role of presentation in pattern visualization. Discuss various visualization techniques in KDD. (MAY 2017) 07
5 What is Data Mining? Write a short note on the KDD process. (NOV 2017) 07
12 What is an apex cuboid? Discuss the drill-down and roll-up operations with a diagram. (MAY-2019) 04
13 Define data mining and list its features. (DEC-2019) 03
14 Describe the steps involved in data mining when viewed as a process of knowledge discovery. (DEC-2019) 07
15 Define outlier analysis. Why is outlier mining important? Briefly describe the different approaches: statistical-based outlier detection, distance-based outlier detection and deviation-based outlier detection. (DEC-2019) 07
1 In real-world data, tuples with missing values for some attributes are a common occurrence. Describe various methods for handling this problem. (NOV 2016) 07
2 Explain the need for data smoothing during pre-processing and discuss data smoothing by binning. (NOV 2016) 07
3 What is a Concept Hierarchy? List and explain the types of Concept Hierarchy. (MAY 2017) (DEC-2018) 07/04
4 List and describe methods for handling missing values in data cleaning. (MAY 2017) (DEC-2018) 03
5 What is noise? Explain data smoothing methods as a noise removal technique: divide the given data into bins of size 3 by equal-frequency partitioning, then smooth by bin means, by bin medians and by bin boundaries. Consider the data: 10, 2, 19, 18, 20, 18, 25, 28, 22. (MAY 2017) (DEC-2018) 07
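For the binning question above, a minimal Python sketch of equal-frequency partitioning and the three smoothing variants may help when checking a hand-worked answer. The function names are illustrative, not from any library. The tie rule in smoothing by boundaries is an assumption: a value equally distant from both boundaries is moved to the lower one. With this data set every middle value is such a tie, so a different convention would give a different result.

```python
from statistics import mean, median

def equal_frequency_bins(values, size):
    """Sort the values and split them into consecutive bins of `size` items."""
    ordered = sorted(values)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

def smooth_by_means(bins):
    """Replace every value in a bin with the bin mean."""
    return [[mean(b)] * len(b) for b in bins]

def smooth_by_medians(bins):
    """Replace every value in a bin with the bin median."""
    return [[median(b)] * len(b) for b in bins]

def smooth_by_boundaries(bins):
    """Replace every value with the closer of the bin minimum and maximum.

    Assumption: a value equidistant from both boundaries goes to the lower one.
    """
    smoothed = []
    for b in bins:
        lo, hi = b[0], b[-1]  # bins are sorted, so these are min and max
        smoothed.append([lo if v - lo <= hi - v else hi for v in b])
    return smoothed

data = [10, 2, 19, 18, 20, 18, 25, 28, 22]
bins = equal_frequency_bins(data, 3)
print("bins:      ", bins)
print("means:     ", smooth_by_means(bins))
print("medians:   ", smooth_by_medians(bins))
print("boundaries:", smooth_by_boundaries(bins))
```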
6 The minimum salary is Rs 20,000 and the maximum salary is Rs 1,70,000. Map the salary Rs 1,00,000 to the new range (Rs 60,000, Rs 2,60,000) using the min-max normalization method. (MAY 2017) 03
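The min-max mapping asked for above follows v' = (v - min) / (max - min) * (new_max - new_min) + new_min. A short sketch, with a hypothetical helper name `min_max`, that can be used to verify the arithmetic:

```python
def min_max(value, old_min, old_max, new_min, new_max):
    """Linearly map `value` from [old_min, old_max] to [new_min, new_max]."""
    return (value - old_min) / (old_max - old_min) * (new_max - new_min) + new_min

# Salary figures from the question, written without Indian digit grouping.
mapped = min_max(100_000, 20_000, 170_000, 60_000, 260_000)
print(round(mapped, 2))  # 166666.67
```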
7 If the mean salary is Rs 54,000 and the standard deviation is Rs 16,000, find the z-score of a salary of Rs 73,600. (MAY 2017) 07
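The z-score in the question above is z = (v - mean) / standard deviation. A brief check in Python (the name `z_score` is illustrative):

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations by which `value` lies above the mean."""
    return (value - mean) / std_dev

z = z_score(73_600, 54_000, 16_000)
print(z)  # 1.225
```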
8 Explain Mean, Median, Mode, Variance, Standard Deviation and the five-number summary with a suitable database example. (MAY 2017) (DEC-2018) 07
9 Explain the following data normalization techniques: (i) min-max normalization and (ii) decimal scaling. (NOV 2016) 07
10 Use the min-max normalization method to normalize the following group of data by setting min = 0 and max = 1: 200, 300, 400, 600, 1000. (NOV 2017) 04
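When the target range is [0, 1] and the old minimum and maximum are taken from the data itself, the formula reduces to (v - min) / (max - min). A small sketch for the group of values above (the helper name `min_max_unit` is an assumption):

```python
def min_max_unit(values):
    """Scale a list of numbers to [0, 1] using the list's own min and max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

normalized = min_max_unit([200, 300, 400, 600, 1000])
print(normalized)  # [0.0, 0.125, 0.25, 0.5, 1.0]
```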
11 Describe various methods for handling missing data values. (NOV 2017) 07
12 Suppose a group of sales price records has been sorted as follows: 6, 9, 12, 13, 15, 25, 50, 70, 72, 92, 204, 232. Partition them into three bins by the equal-frequency (equi-depth) partitioning method. Perform data smoothing by bin mean. (NOV 2017) 03
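Here the number of bins is fixed rather than the bin size. A sketch under the assumption that the record count divides evenly by the number of bins, as it does for the twelve prices above (`equal_depth_bins` is a hypothetical helper):

```python
from statistics import mean

def equal_depth_bins(sorted_values, n_bins):
    """Split already-sorted values into n_bins bins holding the same count."""
    size = len(sorted_values) // n_bins  # assumes an exact division
    return [sorted_values[i:i + size] for i in range(0, len(sorted_values), size)]

prices = [6, 9, 12, 13, 15, 25, 50, 70, 72, 92, 204, 232]
price_bins = equal_depth_bins(prices, 3)
# Smoothing by bin mean: every price in a bin is replaced by that bin's mean.
smoothed = [[mean(b)] * len(b) for b in price_bins]
print(price_bins)
print(smoothed)
```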
15 List the preprocessing steps with examples. Explain the procedure of any one preprocessing technique. (MAY-2018) 07
16 Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order): 13, 15, 16, 16, 19, 20, 23, 29, 35, 41, 44, 53, 62, 69, 72. Use min-max normalization to transform the value 45 for age onto the range [0.0, 1.0]. (DEC-2018) 04
17 What is noise? Explain binning methods for data smoothing. (MAY-2019) 04
21 Explain Mean, Median, Mode, Variance, Standard Deviation and the five-number summary with a suitable database example. (DEC-2019) 07
22 Is graphical visualization better than text data? Justify your answer and explain different data visualization techniques. (DEC-2019) 07
23 Why do we need data smoothing in data pre-processing? Discuss data smoothing by binning. (DEC-2019) 07
24 Describe Concept Hierarchy. List and briefly explain the types of Concept Hierarchy. (DEC-2019) 04
25 In real-world data, tuples with missing values for some attributes are a common occurrence. Describe various methods for handling this problem. (DEC-2019) 07
1 What are the limitations of the Apriori approach for mining? Briefly describe the techniques to improve the efficiency of the Apriori algorithm. (NOV 2016) 07
2 What is market basket analysis? Explain the two measures of rule interestingness, support and confidence, with a suitable example. (NOV 2016) (MAY 2017) (DEC-2018) 07/04
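As an illustration of the two measures named above: support(X) is the fraction of transactions that contain X, and confidence(A => B) = support(A and B together) / support(A). The four baskets in this sketch are hypothetical, invented only for the example, and the function names are not from any library.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) for the rule antecedent => consequent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

# Hypothetical market baskets, not taken from any exam data set.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
sup = support({"bread", "milk"}, baskets)        # 2 of 4 baskets
conf = confidence({"bread"}, {"milk"}, baskets)  # 2 of the 3 bread baskets
print(sup, round(conf, 4))
```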
3 State the Apriori property. Generate candidate itemsets, frequent itemsets and association rules using the Apriori algorithm on the following data set with a minimum support count of 2. (MAY 2017) 07
TID List of items_IDs
1 T100 I1,I2,I5
5 T2 {A, B, C, D, E} 07
T3 {A, B, C, E}
T4 {B, D}
T5 {A, C}
T6 {B, D}
6 Compare association and classification. Briefly explain associative classification with a suitable example. (NOV 2017) 07
7 What is an attribute selection measure? Explain different attribute selection measures with examples. (NOV 2017) 07
10 07
13 Consider the following database of ten transactions. Let min_sup = 30% and min_confidence = 60%. (MAY-2019) 07
A) Find all frequent itemsets using the Apriori algorithm.
B) Generate strong association rules.
14 Write and discuss the algorithm used to generate frequent itemsets with an iterative level-wise approach based on candidate generation. (MAY-2019) 07
15 Discuss the hash-based technique to improve the efficiency of the Apriori algorithm. (MAY-2019) 04
16 Consider the following dataset, find the frequent itemsets and generate association rules for them using the Apriori algorithm. (DEC-2019) 07
17 What is market basket analysis? Explain the two measures of rule interestingness: support and confidence. (DEC-2019)
1 With the help of a neat diagram, explain the topology of a multilayer feed-forward Neural Network. Also explain the terms "activation function" and "epoch". (NOV 2016) 07
2 Briefly explain linear and non-linear regression. (NOV 2016) 07
3 Explain the steps of the ID3 algorithm for generating decision trees. (NOV 2016) 07
5 What is a Decision Tree? Explain how classification is done using decision tree induction. (MAY 2017) 07
6 Explain prepruning and postpruning with an example. (NOV 2017) 03
12 Define linear and nonlinear regression using figures. Calculate the value of Y for X = 100 using the linear regression prediction method. (MAY-2018) 07
13 Calculate the weights using the single-layer perceptron neural network model. The three inputs are x0, x1 and x2; the bias and weights are as follows: 04
w1(0) = 30, w2(0) = 300
b(0) = 50, η = 0.01, x0 = +1
Activation function:
sgn(x) = +1, if x >= 0
sgn(x) = -1, if x < 0
(a) Calculate x2 for x1 = 100 and 200.
1 Briefly explain the life-cycle of Data Analytics and discuss the role of data scientists.(NOV 07
4 How is data mining useful for Business Intelligence applications, viz. Balanced Scorecard, Fraud Detection, Clickstream Mining, Market Segmentation, retail industry, telecommunications industry, banking & finance and CRM? (MAY-2018) 07
5 Discuss fraud detection and click-stream analysis using data mining. (MAY-2019) 07
4 What is Big Data? What is big data analytics? Explain the big data distributed file system. (MAY 2017) (DEC-2018) 07
5 What is outlier analysis? Why is outlier mining important? Briefly describe the different approaches: statistical-based outlier detection, distance-based outlier detection and deviation-based outlier detection. (MAY 2017) 07
6 Define Big Data. Discuss various applications of Big Data. (NOV 2017) 04
11 Explain big data and big data analytics. Explain the key roles and their responsibilities in a successful analytics project. (MAY-2018) 04
12 Compute 2 clusters using the k-means clustering algorithm. Use Euclidean distance as the distance measure. (MAY-2018) 07
16 What is a web log? Explain web structure mining and web usage mining in detail. (DEC-2018) 07
20 Briefly explain the life-cycle of Data Analytics and discuss the role of data scientists. (DEC-2019) 07
21 Discuss the main features of Hadoop Distributed File System. (DEC-2019) 07