
Prediction Model on Watson Studio Using SPSS Modeler

Overview:
Data is being generated and consumed today at a rate that is hard
to measure. Over 2.5 quintillion bytes of data are created every
single day. What if we could make this data useful by extracting,
analyzing, visualizing, and managing it to create insights? These
insights help people and businesses make powerful, data-driven
decisions. Distributed computing and IoT have led to the
generation of massive amounts of data, and social networking has
encouraged most of that data to be unstructured. There is so much
data that human experts cannot keep up with all the changes and
advancements in their fields of study. With Watson services, we
can put the information that experts need at their fingertips, and
back that information up with evidence and prediction.

Learning Objectives:
After completing this tutorial, you’ll understand how to:
• Clean and understand the data.
• Get familiar with SPSS Modeler.
• Create a predictive model in SPSS Modeler using different
parameters.
• Deploy the model and test it.

Prerequisites:
IBM Cloud account: An IBM Cloud account is required to work
through this tutorial.

Watson Studio: The Watson Studio service must be activated.


Estimated time:
This tutorial takes approximately 45 minutes, depending on your
network connection.

Steps
Step 1. Create an IBM Cloud account
If you don’t have an IBM Cloud account, create an account here.
• You will receive free credits to use IBM Cloud services.

Step 2. Create a Watson Studio account

• Visit this link to create the Watson Studio service.

• Leave the region at its default and enter the same email address that you
used to create your IBM Cloud account. Click Next.
• Check your email for the verification link.
Step 3. The IBM Cloud and Watson Studio accounts are now activated

• You should now be able to see your name on the Watson Studio dashboard.


• Next, visit your IBM Cloud dashboard.
• Open the menu on the left (three dots) and click “Resource list”.
There you will see three services running.

Step 4. Let’s create a project in Watson Studio

• On your Watson Studio dashboard, click Get Started.

• Select the “Create a project” option. You will be redirected to the page
below.
• Select “Create an empty project”, as it contains all the tools we
require to develop our model. “Create a project from a sample” is for
cases where we already have a file that we need to continue
developing.
• Once you have completed the tasks above, you will see the same
page as above. Follow the next few steps carefully:
1. Give your project the name “Clean, Build & Deploy a Prediction
Model on Watson Studio” and provide a description.
2. Make sure a cloud storage instance exists, or add a new IBM
Cloud Object Storage instance by clicking the Add button.

3. You will be taken to the Object Storage service page. Scroll to
the bottom of the page and click Create to create the service. A
pop-up opens; leave everything at its default. Once the service
has been created, you will be redirected back to the project
page.
4. Click Refresh. Your newly created service will be listed.

Step 4. Now we need to upload our dataset


• This tutorial uses the Titanic dataset, which is available here. Save
the CSV file so that you can apply the following steps.
• Select “Assets”.
• Upload titanic_train.csv using “Browse” in the panel on the right.
• After you upload the file, it will be shown in the Assets tab.

Step 5. Open Data Refinery

• In Assets, you will find your uploaded titanic_train.csv file.


• Click the three-dot menu next to the file and choose Refine to open it in Data Refinery.

Step 6. Convert column type


• By default, all columns have the String data type. We need to convert the Age
column to Integer.
• To the right of the Age column name, you will find the three-dot menu.
• Click Convert column type > Integer.
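
For readers who want to see what this conversion amounts to outside the Data Refinery interface, here is a minimal Python sketch. It assumes the column arrives as strings, that missing entries are either empty or marked with "?", and that fractional ages are simply truncated. The function name to_int_or_none and the sample values are hypothetical and are not part of Watson Studio.

```python
from typing import Optional


def to_int_or_none(raw: str) -> Optional[int]:
    """Convert one string cell to an integer, or None when it is missing."""
    text = raw.strip()
    # Assumption: empty cells and "?" both mean "missing" in this dataset.
    if text in ("", "?"):
        return None
    # Go through float first so that values such as "28.5" do not raise;
    # int() then truncates toward zero (28.5 becomes 28).
    return int(float(text))


# Hypothetical sample of the Age column as it looks before conversion.
age_as_text = ["22", "38", "", "28.5", "?"]
age_as_int = [to_int_or_none(cell) for cell in age_as_text]
print(age_as_int)
```

Keeping missing entries as None, rather than forcing them to 0, leaves them visible for the missing-value step that follows.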

Step 6. Fill missing values


The columns that have missing values in the Titanic dataset are Age and Sex.
The method used to fill in the missing values differs for each attribute,
depending on the type and purpose of the attribute.

1) Age
a) Calculate the mean
1. Before we can fill in the missing values, we need to know the mean of the Age
column.
2. Open the three-dot menu to the right of the Age column. Click Operation > Aggregate
and select the Age column.
3. Choose Mean from the drop-down list.
4. Enter the new column name “Mean”.
5. Click Apply.
6. Copy the mean value, then click Undo, as seen in the image below.

b) Replace with the mean


• On the Age column, select Operation.
• Choose Replace missing values.
• Click Value and enter the mean value you copied.
• Click Apply. The missing values are now filled in.
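
The two Age steps (compute the mean, then substitute it for the missing entries) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values are made up, missing entries are represented as None, and the mean is rounded because the column was converted to integers.

```python
from statistics import mean

# Hypothetical Age values after type conversion; None marks a missing entry.
ages = [22, 38, None, 35, None, 54]

# Step a) Compute the mean over the known values only.
known_ages = [age for age in ages if age is not None]
mean_age = round(mean(known_ages))  # rounded to keep the column integer

# Step b) Replace every missing entry with that mean.
filled_ages = [mean_age if age is None else age for age in ages]
print(mean_age, filled_ages)
```

Computing the mean before filling matters: if the gaps were filled with a placeholder such as 0 first, the mean itself would be pulled down.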
2) Sex (categorical data): substring replacement
1. Select Profile and check the frequency of each value.
2. From here you can find the missing values.
3. The symbol “?” indicates a missing value.
4. Click Operation > Replace Substring and, under Value, type “?”.
5. Enter “Female” as the replacement string, then click Apply.
6. Click Profile again; there will be no missing values anymore.
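
As a rough equivalent of the frequency check and the replacement, the following Python sketch counts the values in a categorical column and swaps the "?" marker for a fixed value. The sample list is hypothetical, and "Female" is used only because it is the value chosen in the steps above.

```python
from collections import Counter

# Hypothetical Sex values; "?" marks a missing entry, as in the steps above.
sex = ["Male", "Female", "?", "Female", "Male", "?"]

# Rough equivalent of the frequency view in Profile.
print(Counter(sex))

# Rough equivalent of Replace Substring: swap the "?" marker for a fixed value.
cleaned = ["Female" if value == "?" else value for value in sex]

# After the replacement, no "?" markers should remain.
print(Counter(cleaned))
```

The replacement should use exactly the spelling and capitalization already present in the column; otherwise the replaced rows form a new, separate category.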

Step 7. Data Normalization


Normalization is generally required when we are dealing with multiple attributes whose
values are on different scales. Left as they are, such attributes may lead to poor models
when performing data mining operations, so they are normalized to bring all the
attributes onto the same scale.
• Scale from 0 to 1:
1. We will apply normalization to the “Age” attribute.
2. Go to Profile > Age and note the maximum value (in our case it is 80).
3. Then select Operation > Calculate Column > Next and choose the operator “Division”.
4. Enter your maximum value, then choose to create a new column for the results.
5. Enter the new column name, then click Apply.
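
The division above is simple enough to check by hand. Here is a minimal Python sketch of it, with made-up sample ages and the maximum of 80 mentioned in the steps. It assumes the missing values have already been filled in.

```python
# Hypothetical Age values with no missing entries left.
ages = [22, 38, 4, 80, 54]

# Assumption: the smallest age is close to 0, so dividing by the maximum
# maps the column approximately onto the range 0 to 1.
max_age = max(ages)  # 80 in this sample, matching the tutorial's data
age_scaled = [age / max_age for age in ages]
print(age_scaled)
```

Dividing by the maximum works here because ages start near zero. For a column whose minimum is far from zero, the more general min-max form, (value - min) / (max - min), is the usual way to reach the full 0 to 1 range.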

• Data Privacy (Optional):


Sometimes you want to obscure data that may contain private
information. Follow these steps:
Select Operation, go to Substitute, choose the column you want to change,
then click Apply.
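
To give an idea of what substituting values can look like, here is a small Python sketch that replaces each distinct value in a column with a random token. It is an illustration only: it is not the algorithm Data Refinery uses, it is masking rather than encryption (the original values cannot be recovered from the tokens), and the function name substitute_column and the sample names are hypothetical.

```python
import secrets


def substitute_column(values):
    """Replace each distinct value with a random token, consistently."""
    mapping = {}   # original value -> token
    used = set()   # tokens already handed out
    masked = []
    for value in values:
        if value not in mapping:
            # token_hex(4) returns 8 random hexadecimal characters.
            token = secrets.token_hex(4)
            while token in used:  # extremely unlikely, but keep tokens unique
                token = secrets.token_hex(4)
            mapping[value] = token
            used.add(token)
        masked.append(mapping[value])
    return masked


# Hypothetical values of a column that holds private information.
names = ["Passenger A", "Passenger B", "Passenger A"]
masked_names = substitute_column(names)
print(masked_names)
```

Because the same original value always receives the same token, the masked column can still be used for grouping and counting.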

Step 8. Monitor Your Steps


Data Refinery records every operation you apply in the “Steps” tab. The best thing about
this is that if you find out something went wrong, you can simply delete that step from
the “Steps” tab.
Step 9. Exploratory Data Analysis (EDA)
Exploratory Data Analysis refers to the critical process of performing analytics on data
in order to discover patterns and check assumptions with the help of graphical
representations.

Click Visualizations, then enter one or more column names. Watson Studio will
automatically suggest visualization types based on the selected columns. Try out the
different visualizations.

Step 10. Now it’s time to convert your Data Refinery flow to a data asset:
• Note: This is a very important step.
• On the main dashboard, in the Data Refinery section, click “Create a
job”.

• You will then be redirected to the page below. Once you select
“Create and Run”, please wait for a while.

• You will now be able to see that a new data asset has been created.

Step 11. Let’s create an SPSS Modeler flow


With SPSS Modeler flows in Watson Studio, you can quickly develop predictive models
using business expertise and deploy them into business operations to improve decision
making.
SPSS Modeler uses the flows interface in Watson Studio, which supports the entire data
mining process, from data to better business results. Watson Studio offers a variety of
modeling methods taken from machine learning, artificial intelligence, and statistics.

• Go to your main Watson Studio dashboard.


• From there, click Add to project (in the top menu).
• A window will pop up; select “Modeler flow”.

• Once you select “Modeler flow”, the image below will be displayed.

• Click Create; the image below will be shown.


All the nodes available in the left menu can be placed on the canvas.
First, we need to load the data onto the canvas:
1. Select Import, then double-click Data Asset.
2. The node will then be visible on the canvas.
3. Double-click the node; you will see “Change Data Asset”. Select it.

4. In the left menu, you will find “Data Assets”.


5. Select your titanic_train.csv_shaped file (the one you refined) and click
Save. Please wait while your data is loaded.
6. Go to “Field Operations”:
• Double-click “Partition”.
• Double-click “Type”.
7. The nodes will be connected automatically on the canvas, as shown in the image below.
8. After all the steps above are completed successfully, configure both nodes.
• First, click Partition (this lets you choose how the data is split
between training and testing).

• Then select “Type” (this lets you set the data types and the target).
• Our main goal is to predict who will survive, so our target
variable will be “survived”. (Note: Change its type to Categorical.)

Note: Change the role of the “survived” variable to Target.
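
To picture what the partition does conceptually, here is a minimal Python sketch of a random train/test split. The 70/30 ratio, the seed, the function name partition, and the sample rows are all assumptions made for illustration; they are not the Partition node's defaults or its implementation.

```python
import random


def partition(rows, train_fraction=0.7, seed=42):
    """Shuffle the rows reproducibly and split them into train and test."""
    shuffled = list(rows)                  # copy so the input stays untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a repeatable split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical records with the "survived" target column.
rows = [{"passenger": i, "survived": i % 2} for i in range(10)]
train_rows, test_rows = partition(rows)
print(len(train_rows), len(test_rows))
```

The model is fitted on the training rows only, and the held-out test rows are used to judge how well it predicts the target on data it has not seen.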


Step 12. Apply the Machine Learning Algorithm
1. Now we will add a modeling node from “Modeling” in the left-side menu.
There are 40+ algorithms available here; if you are unsure which algorithm to
select, choose Auto Classifier.

2. Double-click “Auto Classifier”; it will automatically connect to the Type node.

3. We already set the input and target columns in “Type”, so we don’t have to configure
anything in the “survived” node. Click the play button at the top and wait a minute.

Step 13. Verify Predicted Values

• After the run completes, your canvas will look like the image below.
• Once the newly generated node appears, click the three dots at the top right corner of
the node and select “View Model”.

The image above shows the different estimators (the algorithms tried by the Auto
Classifier) with their best accuracy. For more details, click C5.0.
Go back, hover over the node again, and choose “Preview”. The image below will be shown.

In the image above, you can now see the prediction made for each record in the
“$XF-Survived” column, along with the confidence score of each prediction in the
“$XFC-Survived” column.
Step 14. Apply analytics to get a graphical representation

Now navigate to the left-side menu again and, from “Outputs”, choose the “Analysis” node
and double-click it.

Run the SPSS flow again.

Open “Analysis of [Survived]” and you will be taken to the confusion matrix of the
model.
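
For readers who have not met a confusion matrix before, this short Python sketch shows how its four cells and the accuracy are derived from actual and predicted values. The two lists are made-up examples, not output from the tutorial's model.

```python
from collections import Counter

# Hypothetical actual and predicted values of the "survived" target.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# Count each (actual, predicted) pair; these counts are the matrix cells.
cells = Counter(zip(actual, predicted))
true_pos, false_neg = cells[(1, 1)], cells[(1, 0)]
false_pos, true_neg = cells[(0, 1)], cells[(0, 0)]

# Accuracy is the share of records on the diagonal of the matrix.
accuracy = (true_pos + true_neg) / len(actual)

print("actual 1:", true_pos, "predicted as 1,", false_neg, "predicted as 0")
print("actual 0:", false_pos, "predicted as 1,", true_neg, "predicted as 0")
print("accuracy:", accuracy)
```

The off-diagonal cells are worth reading separately, because a model can reach a decent accuracy while still missing most of one class.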

Go back, hover over the Analysis node again, click the three dots, and select Save branch as a model.
• You will be asked to create a “Watson Machine Learning” service.

• Click on it and create a new ML service.


• Once the service is created, you will be redirected to the page below.
Step 15. Now we need to deploy our model
• On your Watson Studio dashboard, go to the Models section.
• You can see that your model was created successfully. Click the three dots
on the right and select “Deploy”.

• Enter a name and, optionally, a description. Then click “Create”.


• Under the Deployments section, there is now a newly created
deployment.

Step 16. Let’s test our model

• Select your deployed model.


• Select the Test section and enter the values as shown below, or enter
random values.
• The Implementation section shows how you can use your model in your
application; multiple languages are supported.
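
As a rough idea of what calling the deployed model from an application involves, here is a Python sketch that only builds the HTTP request and does not send it. Everything specific in it is an assumption: the URL and token are placeholders, the field names are hypothetical, and the "input_data" layout with "fields" and "values" should be checked against the code snippets shown for your own deployment.

```python
import json
import urllib.request

# Placeholders: replace with the values shown for your own deployment.
SCORING_URL = "https://example.invalid/deployments/DEPLOYMENT_ID/predictions"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"


def build_scoring_request(fields, rows):
    """Build (but do not send) a JSON scoring request for the deployment."""
    # Assumed payload layout: one entry with column names and rows of values.
    payload = {"input_data": [{"fields": fields, "values": rows}]}
    return urllib.request.Request(
        SCORING_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + ACCESS_TOKEN,
        },
        method="POST",
    )


# Hypothetical input columns; use the fields your model was trained on.
scoring_request = build_scoring_request(
    fields=["Pclass", "Sex", "Age"],
    rows=[[3, "male", 22]],
)
print(scoring_request.get_method(), json.loads(scoring_request.data))
```

With real values in place, the request could be sent with urllib.request.urlopen(scoring_request) and the JSON response parsed for the predicted value.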
Conclusion
In this tutorial, we learned how to understand the data and the
relationships within it, how to clean the data with Data Refinery,
and how to use SPSS Modeler to apply ready-made predictive machine
learning algorithms. In this case, we used the Auto Classifier to find
a suitable algorithm for our model, and then deployed the model in
Watson Studio so that we can use it in our applications.
