Professional Documents
Culture Documents
5 Data Mining
1
What is Data Mining
2
What is Data Mining
3
Real-life Examples in Data Mining
▪ Market basket analysis
▪ Stock Market Analysis
▪ Weather forecasting
▪ Fraud Detection
▪ Intrusion Detection (Network Systems)
▪ Financial Banking
▪ Surveillance
▪ Online Shopping
▪ Criminal Investigation
▪ Bioinformatics
▪ Scientific research
▪ Health Care and Insurance
▪ Geological exploration and mineral exploration
4
Data Mining Techniques
▪ Classification
▪ Clustering
▪ Anomaly or Outlier Detection
▪ Association Rule Learning
▪ Regression Analysis
▪ Prediction/Induction Rule
▪ Summarization
▪ Sequential Patterns
▪ Decision Tree Learning
▪ Tracking Patterns
▪ Statistical Techniques
▪ Neural Networks
5
Steps of Data Mining
6
Steps of Data Mining
7
Steps of Data Mining
8
Steps of Data Mining
▪ Preparing and pre-processing
▪ The data preparation process consumes about 90% of the time of the
project.
▪ The data from different sources should be selected, cleaned, transformed,
formatted, anonymized, and Reconstructed (if required).
▪ Data cleaning is a process to "clean" the data by smoothing noisy data
and filling in missing values.
▪ For example, for a customer demographics profile, age data is missing.
The data is incomplete and should be filled. In some cases, there could
be data outliers. For instance, age has a value 300. Data could be
inconsistent. For instance, name of the customer is different in different
tables.
▪ Data transformation operations change the data to make it useful in data
mining.
9
Steps of Data Mining
▪ Modeling
▪ Based on the business objectives, suitable modeling
techniques should be selected for the prepared dataset.
▪ Create a scenario to test check the quality and validity of
the model.
▪ Run the model on the prepared dataset.
▪ Results should be assessed by all stakeholders to make
sure that model can meet data mining objectives.
10
Steps of Data Mining
11
Steps of Data Mining
12
Data Mining Challenges
13
Benefits of Data Mining
15
Popular Data Mining Tools
16
Popular Data Mining Tools
▪ RapidMiner and R are at the top of their games in terms of popularity and usage.
RapidMiner tends to be the preferred choice for startups and next-gen “smart
plant” manufacturers. Mobile apps and chatbots tend to depend on this software
platform for machine learning, rapid prototyping, app development, text
mining and predictive analytics for customer experience.
▪ Python is most often compared to R for ease of use. Unlike R, Python’s learning
curve tends to be so short it’s become legendary. Many users find that they can
start building data sets and doing extremely complex affinity analysis in minutes,
making this an extremely effective and efficient data mining tool. The most
common business-use case-data visualizations are straightforward as long as
you are comfortable with basic programming concepts like variables, data types,
functions, conditionals and loops.
17
Bibliography
▪ Patrizio, A. (2019). Top 15 Data Mining Techniques for Business Success. [online]
Datamation.com. Available at: https://www.datamation.com/big-data/data-mining-
techniques.html /> [Last Accessed 2 May 2020].
▪ Alton, L. (2019). The 7 Most Important Data Mining Techniques. [online]
Datasciencecentral.com. Available at:
https://www.datasciencecentral.com/profiles/blogs/the-7-most-important-data-
mining-techniques /> [Last Accessed 2 May 2020].
▪ LEARNTEK. (2019). Data Mining Examples and Data Mining Techniques |
Learntek. [online] Available at: https://www.learntek.org/blog/data-mining-
examples-and-techniques/ /> [Last Accessed 2 May 2020].
▪ Marr, B. (2015) Big Data: Using Smart Big Data, Analytics and Metrics to Make
Better Decisions and Improve Performance. 1st Ed. John Wiley & Sons, Ltd.
▪ Kaufmann,M. (2011)Data Mining Concepts and Techniques 3rd Ed. Elsevier
18
Next Week
19