
COMPUTER SCIENCE & ENGINEERING

JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
IV Year B.Tech. CSE - I Sem        L   T/P/D   C
                                   4   1/-/-   4

(57048) DATA WAREHOUSING AND DATA MINING
UNIT I

Introduction: Fundamentals of Data Mining, Data Mining Functionalities, Classification of Data Mining Systems, Data Mining Task Primitives, Integration of a Data Mining System with a Database or a Data Warehouse System, Major Issues in Data Mining. Data Preprocessing: Need for Preprocessing the Data, Data Cleaning, Data Integration and Transformation, Data Reduction, Discretization and Concept Hierarchy Generation.
UNIT II

Data Warehouse and OLAP Technology for Data Mining: Data Warehouse, Multidimensional Data Model, Data Warehouse Architecture, Data Warehouse Implementation, Further Development of Data Cube Technology, From Data Warehousing to Data Mining. Data Cube Computation and Data Generalization: Efficient Methods for Data Cube Computation, Further Development of Data Cube and OLAP Technology, Attribute-Oriented Induction.

UNIT III

Mining Frequent Patterns, Associations and Correlations: Basic Concepts, Efficient and Scalable Frequent Itemset Mining Methods, Mining Various Kinds of Association Rules, From Association Mining to Correlation Analysis, Constraint-Based Association Mining.
UNIT IV

Classification and Prediction: Issues Regarding Classification and Prediction, Classification by Decision Tree Induction, Bayesian Classification, Rule-Based Classification, Classification by Backpropagation, Support Vector Machines, Associative Classification, Lazy Learners, Other Classification Methods, Prediction, Accuracy and Error Measures, Evaluating the Accuracy of a Classifier or a Predictor, Ensemble Methods.

UNIT V

Cluster Analysis Introduction: Types of Data in Cluster Analysis, A Categorization of Major Clustering Methods, Partitioning Methods, Hierarchical Methods, Density-Based Methods, Grid-Based Methods, Model-Based Clustering Methods, Clustering High-Dimensional Data, Constraint-Based Cluster Analysis, Outlier Analysis.

UNIT VI

Mining Streams, Time Series and Sequence Data: Mining Data Streams, Mining Time-Series Data, Mining Sequence Patterns in Transactional Databases, Mining Sequence Patterns in Biological Data, Graph Mining, Social Network Analysis and Multirelational Data Mining.

UNIT VII

Mining Object, Spatial, Multimedia, Text and Web Data: Multidimensional Analysis and Descriptive Mining of Complex Data Objects, Spatial Data Mining, Multimedia Data Mining, Text Mining, Mining the World Wide Web.

UNIT VIII

Applications and Trends in Data Mining: Data Mining Applications, Data Mining System Products and Research Prototypes, Additional Themes on Data Mining and Social Impacts of Data Mining.

TEXT BOOKS:
1. Data Mining: Concepts and Techniques, Jiawei Han & Micheline Kamber, Morgan Kaufmann Publishers, Elsevier, 2nd Edition, 2006.
2. Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Pearson Education.

REFERENCE BOOKS:
1. Data Mining Techniques, Arun K Pujari, 2nd edition, Universities Press.
2. Data Warehousing in the Real World, Sam Anahory & Dennis Murray, Pearson Edn Asia.
3. Insight into Data Mining, K.P. Soman, S. Diwakar, V. Ajay, PHI, 2008.
4. Data Warehousing Fundamentals, Paulraj Ponnaiah, Wiley Student Edition.
5. The Data Warehouse Life Cycle Tool Kit, Ralph Kimball, Wiley Student Edition.
6. Building the Data Warehouse, William H Inmon, John Wiley & Sons Inc, 2005.
7. Data Mining: Introductory and Advanced Topics, Margaret H Dunham, Pearson Education.
8. Data Mining, V. Pudi and P. Radha Krishna, Oxford University Press.
9. Data Mining: Methods and Techniques, A.B.M Shawkat Ali and S.A. Wasimi, Cengage Learning.
10. Data Warehouse 2.0: The Architecture for the Next Generation of Data Warehousing, W.H. Inmon, D. Strauss, G. Neushloss, Elsevier, Distributed by SPD.

JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY KAKINADA
IV Year B.Tech. Computer Science Engineering. I-Sem. (w.e.f. 2010-2011 academic year)
DATA WAREHOUSING AND DATA MINING

Unit-I: Introduction to Data Mining: What is data mining, motivating challenges, origins of data mining, data mining tasks, Types of Data: attributes and measurements, types of data sets, Data Quality (Tan)

Unit-II: Data preprocessing, Measures of Similarity and Dissimilarity: Basics, similarity and dissimilarity between simple attributes, dissimilarities between data objects, similarities between data objects, examples of proximity measures: similarity measures for binary data, Jaccard coefficient, Cosine similarity, Extended Jaccard coefficient, Correlation, Exploring Data: Data Set, Summary Statistics (Tan)

Unit-III: Data Warehouse: basic concepts, Data Warehousing Modeling: Data Cube and OLAP, Data Warehouse implementation: efficient data cube computation, partial materialization, indexing OLAP data, efficient processing of OLAP queries. (H & C)

Unit-IV: Classification: Basic Concepts, General approach to solving a classification problem, Decision Tree induction: working of decision tree, building a decision tree, methods for expressing attribute test conditions, measures for selecting the best split, Algorithm for decision tree induction. Model overfitting: due to presence of noise, due to lack of representative samples, evaluating the performance of classifier: holdout method, random subsampling, cross-validation, bootstrap. (Tan)

Unit-V: Classification-Alternative techniques: Bayesian Classifier: Bayes theorem, using Bayes theorem for classification, Naive Bayes classifier, Bayes error rate, Bayesian Belief Networks: Model representation, model building (Tan)

Unit-VI: Association Analysis: Problem Definition, Frequent Item-set generation, The Apriori principle, Frequent Item set generation in the Apriori algorithm, candidate generation and pruning, support counting (eluding support counting using a Hash tree), Rule generation, compact representation of frequent item sets, FP-Growth Algorithm. (Tan)

Unit-VII: Overview: types of clustering, Basic K-means, K-means additional issues, Bisecting k-means, k-means and different types of clusters, strengths and weaknesses, k-means as an optimization problem.

Unit-VIII: Agglomerative Hierarchical clustering, basic agglomerative hierarchical clustering algorithm, specific techniques, DBSCAN: Traditional density: center-based approach, strengths and weaknesses (Tan)

TEXT BOOKS:
1. Introduction to Data Mining: Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson.
2. Data Mining: Concepts and Techniques, 3/e, Jiawei Han, Micheline Kamber, Elsevier.

REFERENCE BOOKS:
1. Introduction to Data Mining with Case Studies, 2nd ed: GK Gupta, PHI.
2. Data Mining: Introductory and Advanced Topics: Dunham, Sridhar, Pearson.
3. Data Warehousing, Data Mining & OLAP, Alex Berson, Stephen J Smith, TMH.
4. Data Mining Theory and Practice, Soman, Diwakar, Ajay, PHI, 2006.