
• A visual design environment for rapidly building complete
predictive analytics workflows


• Provides a deep library and an integrated environment of
machine learning algorithms, data preparation and
exploration functions, and validation tools to support all data
science projects and use cases.
• Widely used for business and industrial applications,
research, education, training, rapid prototyping and
application development
27/01/2023 By: Dr Azlin Ahmad (FSKM, UiTM) 1
Follow tutorials
Create a new process
Open an existing process

GUI (graphical user interface) Intro

Tips : When creating a new Repository, always make 2 files (data and process)

Repository : Storage in RapidMiner Studio for data and processes
Process panel : Design any process (ETL – Extract, transform, load)
Parameters : Modify behavior for the process
Operators : Known as building blocks, used to create RapidMiner processes
Help : Explanation for any process

(At middle top) Views Selector
• Design : design the process (Project Area)
• Results : the results produced by the designed process
• Turbo Prep : designed to make data preparation easier
• Auto Model/Hadoop Model/etc : extensions in RapidMiner
• Deployment : makes our model available so that others can access it

GENERAL PROCESS
Import/Loading the data → Data preparation → Modeling → Compare Model (Performance)

Data preparation (depends on the dataset):
1. Filter
2. Map
3. Replace Missing Value
4. Data Type Conversions
5. Generate Attributes
6. Set Roles
7. Select Attributes

Modeling (depends on the problem/task):
- Classification
- Prediction
- Clustering

Models:
1. Generalized Linear Model
2. Gradient Boosted Model
3. Decision Tree
4. Random Forest
5. K-Means
6. Artificial Neural Networks
7. Support Vector Machine
8. Logistic Regression
9. Naïve Bayes
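
The last stage of the general process compares models by their performance. RapidMiner does this inside the Studio GUI; the sketch below only illustrates the idea in plain Python, using accuracy as one possible performance measure. The model names, labels, and predictions are made-up assumptions, not results from the slides.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical true labels and predictions from two hypothetical models.
actual = ["Yes", "No", "No", "Yes"]
predictions = {
    "model_a": ["Yes", "No", "Yes", "Yes"],
    "model_b": ["No", "No", "Yes", "Yes"],
}

# Score every model, then pick the one with the highest accuracy.
scores = {name: accuracy(actual, pred) for name, pred in predictions.items()}
best_model = max(scores, key=scores.get)
print(scores, best_model)
```

Accuracy is only one choice; which measure is appropriate depends on the task (classification, prediction, or clustering).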
Case Study: Customer Churn Prediction
Customer churn occurs when customers abandon a brand and stop being
paying clients of a particular business. It is a major problem and
one of the main concerns for large companies. Because churn directly affects
company revenue, especially in the telecom field, companies seek to
develop means of predicting potential customer churn. Finding the
factors that increase customer churn is therefore important for taking the
actions needed to reduce it.

Task: Develop a churn prediction model using machine learning models
to help the telco company predict which customers are most likely
to churn.
LET’S START!
1. Creating Workspace of the First Project

2. Create sub-folders for repository

3. Importing Data (using data folder)

Importing Data (using Process workspace)

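
The slides import the data through the RapidMiner Studio GUI. For readers who prefer to see the idea in code, here is a minimal plain-Python sketch (not RapidMiner) that loads CSV text into one record per row. The column names and the inline sample are assumptions made for illustration; they are not the case-study dataset.

```python
import csv
import io

# Hypothetical sample standing in for a churn CSV file (not from the slides).
SAMPLE_CSV = """customer_id,tenure,monthly_charges,churn
C001,12,70.5,Yes
C002,48,,No
C003,3,29.9,Yes
"""

def load_rows(text):
    """Parse CSV text into a list of dicts, one per row. All values are strings."""
    return list(csv.DictReader(io.StringIO(text)))

rows = load_rows(SAMPLE_CSV)
print(len(rows), rows[0]["customer_id"])
```

Note that every value arrives as a string and that the empty field for C002 is read as an empty string, which is why the later preparation steps are needed.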
ETL PROCESS
EXTRACT, TRANSFORM, LOAD (ETL)
• The ETL process prepares the data for machine learning through various operators.
• The required operators depend entirely on the kind and state of the
imported/loaded data.
• Includes:
  • Filter
  • Map
  • Replace Missing Value
  • Data Type Conversion
  • Generate Attributes
  • Set Roles
  • Select Attributes

1. FILTER
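
The Filter step keeps only the rows that satisfy a condition. In RapidMiner this is configured in the GUI; the plain-Python sketch below only mirrors the idea. The records, field names, and the condition itself are assumptions for illustration.

```python
# Hypothetical churn records; the field names are illustrative only.
customers = [
    {"customer_id": "C001", "tenure": 12, "churn": "Yes"},
    {"customer_id": "C002", "tenure": 0, "churn": "No"},
    {"customer_id": "C003", "tenure": 3, "churn": "Yes"},
]

def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]

# Example condition (an assumption): keep customers with at least 1 month of tenure.
active = filter_rows(customers, lambda row: row["tenure"] >= 1)
print([row["customer_id"] for row in active])
```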
2. MAP

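
The Map step replaces chosen values of a column with other values, for example to unify inconsistent spellings. The following is a plain-Python sketch of that idea rather than RapidMiner itself; the column name and the mapping are assumptions.

```python
def map_values(rows, column, mapping):
    """Return new rows with values of `column` replaced according to `mapping`.

    Values that have no entry in `mapping` are kept unchanged.
    """
    return [{**row, column: mapping.get(row[column], row[column])} for row in rows]

# Hypothetical records with inconsistent spellings of the churn flag.
records = [
    {"customer_id": "C001", "churn": "Y"},
    {"customer_id": "C002", "churn": "N"},
    {"customer_id": "C003", "churn": "Yes"},
]

mapped = map_values(records, "churn", {"Y": "Yes", "N": "No"})
print([row["churn"] for row in mapped])
```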
3. REPLACE MISSING VALUE
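
The Replace Missing Value step fills gaps in a column with a substitute value. The plain-Python sketch below uses the column average as the substitute, which is one common choice and an assumption here; the slides do not state which replacement is used. The data is made up.

```python
from statistics import mean

def replace_missing(rows, column):
    """Fill None entries of `column` with the mean of the non-missing values."""
    present = [row[column] for row in rows if row[column] is not None]
    if not present:
        # Nothing to average: return copies of the rows unchanged.
        return [dict(row) for row in rows]
    fill = mean(present)
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]

# Hypothetical billing records with one missing value.
bills = [
    {"customer_id": "C001", "monthly_charges": 70.0},
    {"customer_id": "C002", "monthly_charges": None},
    {"customer_id": "C003", "monthly_charges": 30.0},
]

filled = replace_missing(bills, "monthly_charges")
print(filled[1]["monthly_charges"])
```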
4. DATA TYPE CONVERSION
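
Data Type Conversion changes how a column is stored, for example turning text into numbers so that a model can use it. A minimal plain-Python sketch of the idea, assuming a hypothetical `tenure` column that was loaded as text:

```python
def convert_column(rows, column, caster):
    """Return new rows with `column` converted by `caster` (for example int or float)."""
    return [{**row, column: caster(row[column])} for row in rows]

# Hypothetical records where tenure was read in as text.
raw = [
    {"customer_id": "C001", "tenure": "12"},
    {"customer_id": "C002", "tenure": "48"},
]

typed = convert_column(raw, "tenure", int)
print(typed[0]["tenure"] + typed[1]["tenure"])
```

The conversion raises a `ValueError` if a value cannot be parsed, so missing values should be handled first.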
5. GENERATE ATTRIBUTES
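
Generate Attributes creates a new column from existing ones. The sketch below shows the idea in plain Python; the derived attribute `total_charges` and its formula are hypothetical examples, not taken from the slides.

```python
def generate_attribute(rows, name, formula):
    """Return new rows with an extra attribute `name` computed by formula(row)."""
    return [{**row, name: formula(row)} for row in rows]

# Hypothetical records.
accounts = [
    {"customer_id": "C001", "tenure": 12, "monthly_charges": 70.5},
    {"customer_id": "C003", "tenure": 3, "monthly_charges": 30.0},
]

# Assumed formula: total charges = tenure (months) x monthly charges.
with_total = generate_attribute(
    accounts, "total_charges", lambda row: row["tenure"] * row["monthly_charges"]
)
print([row["total_charges"] for row in with_total])
```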
6. SET ROLE
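
Setting roles tells the tool which attribute is the target to predict (the label) and which is only an identifier, so that neither is used as an ordinary input. A rough plain-Python analogue, assuming hypothetical column names and using a simple role dictionary rather than any RapidMiner API:

```python
# Assumed role assignment; any column not listed is a regular attribute.
ROLES = {"customer_id": "id", "churn": "label"}

def split_by_role(rows, roles):
    """Split rows into (features, labels) based on the role assignments."""
    label_col = next(col for col, role in roles.items() if role == "label")
    features = [
        {col: value for col, value in row.items() if col not in roles}
        for row in rows
    ]
    labels = [row[label_col] for row in rows]
    return features, labels

# Hypothetical records.
data = [
    {"customer_id": "C001", "tenure": 12, "churn": "Yes"},
    {"customer_id": "C002", "tenure": 48, "churn": "No"},
]

features, labels = split_by_role(data, ROLES)
print(features, labels)
```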
7. SELECT ATTRIBUTES
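
Select Attributes keeps only the columns that are needed for modeling and drops the rest. A minimal plain-Python sketch of the idea; the column names are assumptions.

```python
def select_attributes(rows, keep):
    """Return new rows containing only the columns listed in `keep`, in that order."""
    return [{col: row[col] for col in keep} for row in rows]

# Hypothetical records with a column that is not useful for modeling.
table = [
    {"customer_id": "C001", "phone": "012-000", "tenure": 12, "churn": "Yes"},
    {"customer_id": "C002", "phone": "013-111", "tenure": 48, "churn": "No"},
]

selected = select_attributes(table, ["customer_id", "tenure", "churn"])
print(list(selected[0].keys()))
```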
SAVE OUR ETL PROCESS
