
4.2. Data Mining Concepts and Applications


Predictive Analytics I: Data Mining Process, Methods, and
Algorithms
“Data mining will become much more important and companies will
throw away nothing about their customers because it will be so valuable.
If you are not doing this, you are out of business.”
Dr. Arno Penzias (1999)

• Companies such as Amazon use analytics to better understand their
customers so they can maximize their ROI.
• “Understanding the customer” can be done by analyzing the vast
amount of data that a company collects → Data Mining

Data Mining = discovering knowledge


Characteristics
• Data is often buried deep within very large databases.
• The data mining environment is usually a client/server architecture or a
Web-based information system architecture.
• The miner is often an end user with little or no programming skill.
• Users should think creatively throughout the process of data mining,
including the interpretation of the findings.
• Data mining tools are readily combined with spreadsheets or other
tools, so the mined data can be analyzed and deployed quickly and
easily.
• Sometimes it is necessary to use parallel processing for data mining.
Types of Patterns
Predictions
They tell the nature of future occurrences based on what has happened
in the past.
Example: predicting the winner of the Super Bowl or forecasting the
weather.
Algorithms: Classification and Regression Trees, ANN, SVM, Genetic Algorithms
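
To make "predicting future occurrences from the past" concrete, here is a minimal sketch that fits a straight line to past observations and extrapolates one step ahead. It uses ordinary least squares, which is a simple stand-in and not one of the algorithms listed above. The temperature readings and the names `fit_line`, `days`, and `temps` are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past observations: day number and noon temperature.
days = [1, 2, 3, 4, 5]
temps = [10.0, 13.0, 13.0, 16.0, 18.0]

slope, intercept = fit_line(days, temps)
forecast_day_6 = slope * 6 + intercept  # extrapolate the trend to the next day
print(f"forecast for day 6: {forecast_day_6:.1f}")
```

With these invented readings the fitted trend rises by about 1.9 degrees per day, so the forecast for day 6 is about 19.7.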

Classification
Classification is a data mining function that assigns items in a
collection to target categories or classes. The goal of classification
is to accurately predict the target class for each case in the data.
Example: identify loan applicants as low, medium, or high credit risk.

Algorithms: Decision Trees, ANN/MLP, SVM, Rough Sets, Genetic Algorithms
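
The idea of assigning each case to a target class can be sketched with a one-level decision tree (a "decision stump"), the simplest member of the decision tree family listed above. This is an illustrative simplification: it uses two classes instead of three, a single feature, and invented credit scores. The names `TRAIN`, `fit_stump`, and `predict` are hypothetical.

```python
# Hypothetical training data: (credit_score, risk_class). Values are invented.
TRAIN = [
    (520, "high"), (560, "high"), (600, "high"),
    (640, "low"), (690, "low"), (720, "low"), (780, "low"),
]


def predict(threshold, score):
    """Apply the learned rule: scores below the threshold are high risk."""
    return "high" if score < threshold else "low"


def fit_stump(rows):
    """Pick the threshold that makes the fewest errors on the training rows."""
    scores = sorted(score for score, _ in rows)
    # Candidate split points: midpoints between consecutive sorted scores.
    candidates = [(a + b) / 2 for a, b in zip(scores, scores[1:])]

    def errors(threshold):
        return sum(predict(threshold, s) != label for s, label in rows)

    return min(candidates, key=errors)


threshold = fit_stump(TRAIN)
print(threshold, predict(threshold, 580), predict(threshold, 700))
```

On this toy data the stump splits at 620.0, so an applicant scoring 580 is labelled "high" and one scoring 700 is labelled "low". A full decision tree would repeat this split search recursively on each branch and across several features.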


Clustering
Clustering identifies natural groupings of things based on their known
characteristics.
Example: assigning customers to different segments based on their
demographics and past purchase behaviours.

Algorithms: K-means, Expectation Maximization (EM)
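
As a rough illustration of K-means, the sketch below alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The customer data (age, purchases per year), the fixed starting centroids, and the function names are assumptions made for the example. A practical implementation would also scale the features and choose starting centroids more carefully.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, iterations=10):
    """Plain K-means: assign points to nearest centroid, then recompute means."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: squared_distance(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers: (age, purchases per year). Values are invented.
customers = [(22, 5), (25, 7), (24, 6), (55, 20), (60, 22), (58, 19)]
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])
print(clusters)
```

Here the six customers separate into a younger, low-purchase segment and an older, high-purchase segment.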

Associations
It finds the commonly co-occurring groupings of things.
Example: beer and diapers going together in market-basket analysis.
Algorithms: Apriori, OneR, ZeroR, Eclat
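
The market-basket idea can be sketched by counting support (how often items appear together) and confidence (how often the second item appears given the first). The code below covers only the first two levels of an Apriori-style search: it keeps frequent single items, then tests pairs built from them. The baskets and the names `BASKETS`, `frequent_pairs`, and `confidence` are invented for illustration.

```python
from itertools import combinations

# Hypothetical shopping baskets. Contents are invented.
BASKETS = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"beer", "diapers", "milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def frequent_pairs(baskets, min_support):
    """Apriori-style pruning: only pairs of frequent items are candidates."""
    items = sorted({item for basket in baskets for item in basket})
    frequent_items = [i for i in items if support({i}, baskets) >= min_support]
    pairs = {}
    for a, b in combinations(frequent_items, 2):
        s = support({a, b}, baskets)
        if s >= min_support:
            pairs[(a, b)] = s
    return pairs


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing antecedent, the share also containing consequent."""
    both = sum((antecedent | consequent) <= basket for basket in baskets)
    given = sum(antecedent <= basket for basket in baskets)
    return both / given


pairs = frequent_pairs(BASKETS, min_support=0.4)
beer_to_diapers = confidence({"beer"}, {"diapers"}, BASKETS)
print(pairs, beer_to_diapers)
```

In this toy data, beer and diapers appear together in 3 of 5 baskets (support 0.6), and 3 of the 4 baskets with beer also contain diapers (confidence 0.75).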

Sequential Relationships
It discovers time-ordered events.
Example: predicting that a customer in a bank will open a savings and
an investment account within a year.
Algorithms: Apriori, FP-Growth technique
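
A minimal sketch of the time-ordered idea: given each customer's account openings in chronological order, count how often one product is followed later by another. This is plain sequence counting, not an implementation of Apriori or FP-Growth. The histories and the names `HISTORIES`, `follows`, and `sequence_support` are invented for illustration.

```python
# Hypothetical account-opening histories, oldest event first. Values are invented.
HISTORIES = {
    "c1": ["checking", "savings", "investment"],
    "c2": ["checking", "savings"],
    "c3": ["savings", "checking", "investment"],
    "c4": ["checking", "investment"],
}


def follows(history, first, second):
    """True if `second` occurs after the first occurrence of `first`."""
    if first not in history:
        return False
    return second in history[history.index(first) + 1:]


def sequence_support(histories, first, second):
    """Fraction of customers whose history contains first, then later second."""
    hits = sum(follows(h, first, second) for h in histories.values())
    return hits / len(histories)


savings_then_investment = sequence_support(HISTORIES, "savings", "investment")
checking_then_investment = sequence_support(HISTORIES, "checking", "investment")
print(savings_then_investment, checking_then_investment)
```

Unlike the association example, order matters here: a customer who opened an investment account before a savings account would not count toward "savings, then investment".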
