You are on page 1of 3

Types of machine learning

Supervised learning
Uses data observation to predict the target variable. Given X, can we predict Y?

Classification
When Y is categorical. Ex: Yes / No, Fish/Dog/Cat.
We attempt to predict a categorical variable or a class.

Regression
Built to predict customer purchases in the next month.
When Y is continuous. Ex: Purchases, Clicks, Time spent on the website.

Unsupervised learning
Given X, can we detect patterns and clusters that are homogenous?
There is no target variable, and the models use different data points about observations to
group them into similar clusters.
A popular use case is customer segmentation based on their product purchase history.

Reinforcement learning
Given a current state and a number of potential actions, which path maximizes the reward?
The models in this space have agents that act on their own to maximize rewards defined by the
environment. They are used in robotics and other advanced fields.

Supervised learning data parts and


steps
Supervised learning models require two key data elements.
The first one is the target variable which is what we want to predict.
Define the target (dependent variable or Y) - what do we want to predict?
● Example 1 - which customers will churn? [CLASSIFICATION]
● Example 2 - which customers will buy again. [CLASSIFICATION]]
● Example 3 - how much will customers spend in the next 30 days? [REGRESSION]
The second data element is the features that will be used to predict the target variable.
Collect features (independent variables or X) which could have predictive power:
● Example 1 - Purchase patterns prior to churning.
● Example 2 - Number of missed loan payments prior to defaulting on a loan.

Unsupervised learning
Collect usage or purchase data and run a model to identify homogenous groups i.e. clusters or
segments of data:
● Example 1 - customer segmentation by their product purchases.
● Example 2 - product segmentation for bundling.

Supervised learning steps


1. Split data to training and testing
2. Initialize the model
3. Fit the model on the training data (Training the model)
4. Predict values on the testing data
5. Measure model performance on testing data

Code:
from sklearn import tree
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Split
train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.25)

# Initialize
mytree = tree.DecisionTreeClassifier()

# Fit
treemodel = mytree.fit(train_X, train_Y)

# Predict
pre_Y = treemodel.predict(test_X)

# Measure
accuracy_score(test_Y, pred_Y)
Unsupervised learning steps
1. Initialize the model
2. Fit the model
3. Assign cluster values
4. Explore results
Code:
from sklearn.cluster import KMeans
Import pandas as pd

# Initialize the model


kmeans = Kmeans(n_clusters=3)

# Fit the model


kmeans.fit(data)

# Assign cluster values


data.assign(Cluster=kmeans.labels_)

# Explore results
data.groupby('Cluster').mean()

You might also like