Unit-VI: Mining Streams, Multimedia Mining, Web.

Unit-VII: Applications and Trends in Data Mining: Data Mining Applications, Additional Themes on Data Mining and Social Impacts of Data Mining, Data Mining System Products and Research Prototypes.

2010-2011 academic year
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY KAKINADA
IV Year B.Tech. Computer Science Engineering. I-Sem.
DATA WAREHOUSING AND DATA MINING

Unit-I: Introduction to Data Mining: What is data mining, motivating challenges, origins of data mining, data mining tasks, Types of Data: attributes and measurements, types of data sets, Data Quality (Tan)

Unit-II: Data preprocessing, Measures of Similarity and Dissimilarity: Basics, similarity and dissimilarity between simple attributes, dissimilarities between data objects, similarities between data objects, examples of proximity measures: similarity measures for binary data, Jaccard coefficient, Cosine similarity, Extended Jaccard coefficient, Correlation, Exploring Data: Data Set, Summary Statistics (Tan)

Unit-III: Data Warehouse: basic concepts, Data Warehousing Modeling: Data Cube and OLAP, Data Warehouse implementation: efficient data cube computation, partial materialization, indexing OLAP data, efficient processing of OLAP queries. (H & C)

Unit-IV: Classification: Basic Concepts, General approach to solving a classification problem, Decision Tree induction: working of decision tree, building a decision tree, methods for expressing attribute test conditions, measures for selecting the best split, Algorithm for decision tree induction. Model overfitting: due to presence of noise, due to lack of representative samples, evaluating the performance of classifier: holdout method, random subsampling, cross-validation, bootstrap. (Tan)

Unit-V: Classification-Alternative techniques: Bayesian Classifier: Bayes theorem, using Bayes theorem for classification, Naive Bayes classifier, Bayes error rate, Bayesian Belief Networks: Model representation, model building (Tan)

Unit-VI: Association Analysis: Problem Definition, Frequent Item-set generation: The Apriori principle, Frequent Item set generation in the Apriori algorithm, candidate generation and pruning, support counting (including support counting using a Hash tree), Rule generation, compact representation of frequent item sets, FP-Growth Algorithm. (Tan)

Unit-VII: Overview, types of clustering, Basic K-means, K-means: additional issues, Bisecting k-means, k-means and different types of clusters, strengths and weaknesses, k-means as an optimization problem.

Unit-VIII: Agglomerative Hierarchical clustering, basic agglomerative hierarchical clustering algorithm, specific techniques, DBSCAN: Traditional density: center-based approach, strengths and weaknesses (Tan)