KDD Process in Data Mining - GeeksforGeeks
https://www.geeksforgeeks.org/kdd-process-in-data-mining/
Last Updated: 20-08-2019

Data Mining - Knowledge Discovery in Databases (KDD).

Why do we need Data Mining?
The volume of information we have to handle, from business transactions, scientific data, sensor data, pictures, videos, etc., is increasing every day. So we need a system that can extract the essence of the available information and automatically generate reports, views, or summaries of data for better decision-making.

Why is Data Mining used in Business?
Data mining is used in business to make better managerial decisions by:
* Automatic summarization of data.
* Extracting the essence of stored information.
* Discovering patterns in raw data.

Data mining, also known as Knowledge Discovery in Databases, refers to the nontrivial extraction of implicit, previously unknown, and potentially useful information from data stored in databases.

Steps Involved in KDD Process:

[Figure: KDD process. Stages shown: Databases, Data Cleaning, Data Integration, Data Warehouse, Data Selection and Transformation, Task-Relevant Data, Data Mining, Pattern, Pattern Evaluation.]

1. Data Cleaning: Data cleaning is defined as the removal of noisy and irrelevant data from the collection.
* Cleaning in case of missing values.
* Cleaning noisy data, where noise is a random error or variance in a measured variable.
* Cleaning with data discrepancy detection and data transformation tools.

2. Data Integration: Data integration is defined as combining heterogeneous data from multiple sources into a common source (data warehouse).
* Data integration using data migration tools.
* Data integration using data synchronization tools.
* Data integration using the ETL (Extract, Transform, Load) process.

3. Data Selection: Data selection is defined as the process where data relevant to the analysis is decided on and retrieved from the data collection.
* Data selection using neural networks.
* Data selection using decision trees.
* Data selection using Naive Bayes.
* Data selection using clustering, regression, etc.

4. Data Transformation: Data transformation is defined as the process of transforming data into the appropriate form required by the mining procedure. Data transformation is a two-step process:
* Data mapping: Assigning elements from the source base to the destination to capture transformations.
* Code generation: Creation of the actual transformation program.

5. Data Mining: Data mining is defined as the application of intelligent techniques to extract potentially useful patterns.
* Transforms task-relevant data into patterns.
* Decides the purpose of the model using classification or characterization.

6. Pattern Evaluation: Pattern evaluation is defined as identifying the truly interesting patterns representing knowledge, based on given interestingness measures.
* Finds the interestingness score of each pattern.
* Uses summarization and visualization to make the data understandable by the user.

7. Knowledge Representation: Knowledge representation is defined as a technique that uses visualization tools to represent data mining results.
* Generate reports.
* Generate tables.
* Generate discriminant rules, classification rules, characterization rules, etc.
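
The seven steps above can be sketched end to end on a toy data set. This is only a minimal illustration under stated assumptions, not the method of any particular tool: the two in-memory sources, the record fields (`id`, `age`, `city`, `bought`), the age bands, the 0.8 confidence threshold, and every function name are hypothetical. The "mining" step here is a plain co-occurrence count scored by confidence, chosen only because it is short enough to read in full.

```python
from collections import Counter

# Two hypothetical sources with overlapping customers (assumed data).
crm_rows = [
    {"id": 1, "age": 25, "city": "Pune"},
    {"id": 2, "age": None, "city": "Delhi"},   # missing value
    {"id": 3, "age": 41, "city": "Pune"},
    {"id": 4, "age": 230, "city": "Delhi"},    # noisy value
    {"id": 5, "age": 38, "city": "Delhi"},
]
sales_rows = [
    {"id": 1, "bought": "yes"},
    {"id": 3, "bought": "yes"},
    {"id": 4, "bought": "no"},
    {"id": 5, "bought": "no"},
]


def clean(rows):
    """Step 1: drop records with missing or implausible ages."""
    return [r for r in rows if r["age"] is not None and 0 < r["age"] < 120]


def integrate(left, right):
    """Step 2: join two sources on the shared key `id`."""
    by_id = {r["id"]: r for r in right}
    return [{**row, **by_id[row["id"]]} for row in left if row["id"] in by_id]


def select(rows, fields):
    """Step 3: keep only the attributes relevant to the task."""
    return [{f: r[f] for f in fields} for r in rows]


def transform(rows):
    """Step 4: map numeric age onto a coarse categorical band."""
    return [
        {"age_band": "under_30" if r["age"] < 30 else "30_plus",
         "city": r["city"], "bought": r["bought"]}
        for r in rows
    ]


def mine(rows, target="bought"):
    """Step 5: count (attribute=value, target=value) co-occurrences."""
    pair_counts, value_counts = Counter(), Counter()
    for r in rows:
        for attr, value in r.items():
            if attr == target:
                continue
            value_counts[(attr, value)] += 1
            pair_counts[(attr, value, r[target])] += 1
    # confidence = count(antecedent and outcome) / count(antecedent)
    return {
        key: count / value_counts[key[:2]]
        for key, count in pair_counts.items()
    }


def evaluate(patterns, min_confidence=0.8):
    """Step 6: keep patterns whose confidence meets the threshold."""
    return {k: c for k, c in patterns.items() if c >= min_confidence}


def represent(patterns):
    """Step 7: render the surviving patterns as readable rules."""
    return sorted(
        f"IF {attr} = {value} THEN bought = {outcome} (confidence {conf:.2f})"
        for (attr, value, outcome), conf in patterns.items()
    )


def run_kdd():
    data = clean(crm_rows)
    data = integrate(data, sales_rows)
    data = select(data, ["age", "city", "bought"])
    data = transform(data)
    return represent(evaluate(mine(data)))


if __name__ == "__main__":
    for rule in run_kdd():
        print(rule)
```

Each function corresponds to one numbered step, so a stage can be swapped out (for example, a different interestingness measure in `evaluate`) without touching the others. With so few records the resulting rules carry no statistical weight; a realistic evaluation step would also require a minimum support.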

* KDD is an iterative process: evaluation measures can be enhanced, mining can be refined, and new data can be integrated and transformed in order to get different and more appropriate results.
* Preprocessing of databases consists of data cleaning and data integration.

References: Data Mining: Concepts and Techniques
