Data Science Algorithms
1. Linear Regression
Linear Regression is a method of measuring the relationship between two continuous variables: an independent variable x and a dependent variable y. The relationship is modeled by the straight line
y = mx + c
where m is the slope and c is the intercept. Based on the difference between the predicted output and the actual output, we calculate the error (typically the mean squared error) and adjust m and c to minimize it.
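The fit can be sketched with the closed-form least-squares formulas. The data points and the `fit_line` helper below are illustrative, not part of the original text:

```python
def fit_line(xs, ys):
    """Return slope m and intercept c minimizing squared error for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # m = covariance(x, y) / variance(x); c follows from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    m = cov / var
    c = mean_y - m * mean_x
    return m, c

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]          # exactly y = 2x + 1
m, c = fit_line(xs, ys)
print(m, c)                # -> 2.0 1.0
```

Here the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the two means.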
2. Logistic Regression
Logistic regression estimates the probability of a binary outcome. We generate this probability with the help of the logistic function –
1 / (1 + e^-x)
Here, e is the base of the natural logarithm, and the function produces an S-shaped curve with values between 0 and 1. The equation for logistic regression is therefore written as:
p = 1 / (1 + e^-(mx + c))
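As a minimal sketch, the logistic function and the resulting prediction can be written directly from the formulas above (the `predict_prob` helper and the sample values are hypothetical):

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + e^-x): maps any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_prob(x, m, c):
    """Logistic regression: squash the linear score m*x + c through the sigmoid."""
    return sigmoid(m * x + c)

print(sigmoid(0))              # midpoint of the S-curve: 0.5
p = predict_prob(2.0, 2.0, 1.0)
print(p)                       # a probability close to 1 for a large positive score
```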
3. K-Means Clustering
The objective function for K-means is:
J = Σ (j = 1..k) Σ (i = 1..n) ||x_i^(j) - c_j||^2
where J is the objective function, k is the number of clusters, n is the number of cases, c_j is the centroid of cluster j, and x_i^(j) is a given data point whose Euclidean distance to that centroid we need to determine. Let us have a look at the algorithm for K-means clustering –
1. First, we randomly initialize k points. These k points are the initial means (centroids).
2. We use the Euclidean distance to assign each data point to the nearest cluster centre.
3. We then calculate the mean of all the points in each cluster, which gives the new centroid.
4. We repeat steps 2 and 3 until the cluster assignments no longer change.
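The steps above can be sketched in plain Python. The sample points are made up, and the initial centroids are taken as the first k points rather than at random, purely so the run is reproducible:

```python
import math

def kmeans(points, k, iters=20):
    """Plain K-means on 2-D points. Initial centroids are the first k points
    (the text picks them randomly; fixed here for reproducibility)."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iters):
        # Step 2: assign each point to its nearest centroid (Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Step 3: move each centroid to the mean of its cluster.
        for j, cluster in enumerate(clusters):
            if cluster:
                centroids[j] = (sum(p[0] for p in cluster) / len(cluster),
                                sum(p[1] for p in cluster) / len(cluster))
    return centroids

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(sorted(kmeans(pts, 2)))   # two centroids, one per visible cluster
```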
4. Principal Component Analysis
One of the most important aspects of data science is dimensionality. A dataset can have many dimensions (features), and the number of dimensions is usually denoted as n.
For example, suppose that as a data scientist working in a financial company, you have to deal
with customer data that involves their credit-score, personal details, salary and hundreds of
other parameters. To understand which features contribute significantly to our model, we use dimensionality reduction. PCA is one such reduction algorithm.
With the help of PCA, we can reduce the number of dimensions while keeping the important ones in our model. There are as many principal components as there are dimensions, and each one is perpendicular (orthogonal) to the others: the dot product of any two distinct principal components is 0.
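A minimal PCA sketch using NumPy's eigendecomposition of the covariance matrix, on a small made-up dataset; it also checks the orthogonality claim (dot product of distinct components is numerically zero):

```python
import numpy as np

# Toy data: 2-D points that vary mostly along the line y = x,
# so the first principal component should point roughly along (1, 1)/sqrt(2).
X = np.array([[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.0]])

Xc = X - X.mean(axis=0)                  # centre the data
cov = np.cov(Xc, rowvar=False)           # covariance matrix of the features
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: ascending order for symmetric matrices
order = np.argsort(eigvals)[::-1]        # largest variance first
components = eigvecs[:, order]           # columns are the principal components

# Orthogonality check: the dot product of distinct components is ~0.
print(components[:, 0] @ components[:, 1])
```

Keeping only the first few columns of `components` projects the data onto the directions of highest variance, which is the dimensionality reduction the text describes.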
5. Support Vector Machines
Support Vector Machines are powerful classifiers for binary classification. They are also used in facial recognition and genetic classification. SVMs have a built-in regularization model that automatically minimizes the classification error. This, in turn, helps to maximize the geometric margin, which is an essential part of an SVM classifier.
Support Vector Machines map the input vectors to an n-dimensional space and build a maximum-separation hyperplane there. SVMs are based on structural risk minimization. There are also two other hyperplanes, one on either side of the initially constructed hyperplane; the distance from the central hyperplane to these two hyperplanes is the margin that the SVM maximizes.
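A linear SVM can be sketched with sub-gradient descent on the hinge loss plus a regularization term. The toy data, learning rate and `train_svm` helper below are illustrative assumptions, not a production implementation:

```python
def train_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Learn w, b for the hyperplane w.x + b = 0; labels must be +1 / -1."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1:   # point inside the margin or misclassified: push w
                w[0] += lr * (y * x1 - lam * w[0])
                w[1] += lr * (y * x2 - lam * w[1])
                b += lr * y
            else:            # only the regularization term shrinks w
                w[0] -= lr * lam * w[0]
                w[1] -= lr * lam * w[1]
    return w, b

pts = [(2, 2), (3, 3), (-2, -2), (-3, -3)]
ys = [1, 1, -1, -1]
w, b = train_svm(pts, ys)
margin_width = 2 / (w[0] ** 2 + w[1] ** 2) ** 0.5
print(w, b, margin_width)
```

The quantity 2 / ||w|| is the distance between the two outer hyperplanes, i.e. the geometric margin that training tries to maximize.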
6. Artificial Neural Networks
Neural Networks are modeled after the neurons in the human brain. They comprise many layers of neurons structured to transmit information from the input layer to the output layer. Between the input and the output layer there are hidden layers, which can be many or just one. A simple feed-forward network with a single hidden layer is known as a multilayer perceptron.
In a simple neural network, an input layer takes the input in the form of a vector. This input is passed to the hidden layer, which comprises various mathematical functions that perform computations on it. For example, given images of cats and dogs, the hidden layers compute the probability that the image belongs to each class, and the input is assigned to the class with the maximum probability. This is an example of binary classification, where each input is assigned one of two classes: dog or cat.
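A forward pass through a single hidden layer can be sketched as follows. The weights are hand-set hypothetical values, not trained ones:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W1, b1, W2, b2):
    """One hidden layer: input vector -> hidden activations -> output probability."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)

# Hypothetical weights for a 2-input, 2-hidden-unit, 1-output network.
W1 = [[1.0, -1.0], [-1.0, 1.0]]
b1 = [0.0, 0.0]
W2 = [2.0, 2.0]
b2 = -1.0

p = forward([0.5, 0.25], W1, b1, W2, b2)
print(p)   # a probability in (0, 1): e.g. "cat" if p > 0.5, "dog" otherwise
```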
7. Decision Trees
With the help of decision trees, you can perform both prediction and classification. We use decision trees to make decisions from a given set of inputs. Understand decision trees with the help of the following example:
Suppose you are deciding whether to go to the market to buy a product. First, you assess whether you really need it: you will go to the market only if you do not already have the product. If you do need it, you then check the weather. You will go to the market only if the sky is clear; otherwise, you will not go. We can represent this as a decision tree.
Using the same principle, we build a hierarchical tree that reaches a result through a process of decisions. There are two steps to building a tree: induction and pruning. Induction is the process in which we build the tree, whereas in pruning we simplify the tree by removing branches that add unnecessary complexity.
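The market example can be written as a tiny hand-built tree; the node names (`already_have_product`, `raining`) are hypothetical labels for the two decisions in the text:

```python
# Each internal node asks a question; leaves hold the final decision.
tree = {
    "question": "already_have_product",
    True:  "stay home",
    False: {
        "question": "raining",
        True:  "stay home",
        False: "go to market",
    },
}

def decide(node, facts):
    """Walk the tree, following the branch that matches each answer."""
    while isinstance(node, dict):
        node = node[facts[node["question"]]]
    return node

print(decide(tree, {"already_have_product": False, "raining": False}))  # go to market
print(decide(tree, {"already_have_product": True, "raining": False}))   # stay home
```

A learned decision tree has the same shape; induction chooses the questions automatically, and pruning removes branches like these when they do not improve accuracy.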
8. Recurrent Neural Networks
Recurrent Neural Networks are used for learning sequential information. Sequential problems contain cycles over the underlying time-steps. Standard feed-forward ANNs have no memory of earlier inputs, so RNNs add a hidden state (a memory cell) that stores information from the previous step. Because the data is represented as a series of time-steps, RNNs are an ideal algorithm for solving problems related to text processing.
In the context of text processing, RNNs are useful for predicting future sequences of words. RNNs stacked on top of one another are referred to as Deep Recurrent Neural Networks. RNNs are used for generating text, composing music and forecasting time series. Chatbots, recommendation systems and speech recognition systems use varying architectures of Recurrent Neural Networks.
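The core idea, a hidden state carried across time-steps, can be sketched with a single scalar recurrent unit (a real RNN uses weight matrices; the weights here are made up):

```python
import math

def rnn_step(x, h, w_x, w_h, b):
    """One recurrent step: the new hidden state mixes the current input x
    with the previous hidden state h (the network's 'memory')."""
    return math.tanh(w_x * x + w_h * h + b)

# Hypothetical scalar weights.
w_x, w_h, b = 0.5, 0.8, 0.0

h = 0.0                      # initial hidden state: no memory yet
for x in [1.0, 0.0, 0.0]:    # the first input keeps echoing through h
    h = rnn_step(x, h, w_x, w_h, b)
    print(h)
```

Even though the later inputs are zero, h stays positive: the state remembers the first input, which is exactly what feed-forward networks cannot do.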
9. Apriori
In 1994, R. Agrawal and R. Srikant developed the Apriori algorithm. It is used for finding frequently occurring itemsets using Boolean association rules, and is called Apriori because it makes use of 'prior' knowledge of the properties of an itemset. The algorithm applies an iterative, level-wise search in which the frequent k-itemsets are used to find the frequent (k+1)-itemsets.
Apriori relies on the following three measures –
Support
Confidence
Lift
Support is a measure of the default popularity (that is, the frequency) of an item X. It is calculated by dividing the number of transactions in which X appears by the total number of transactions.
We define the confidence of a rule X → Y as the number of transactions involving both X and Y divided by the number of transactions involving X.
Lift for a rule X → Y is the increase in the ratio of the sale of Y when X is sold. It measures the likelihood of Y being purchased when X is already purchased, while taking into account the popularity of Y, and is calculated by dividing the confidence of the rule by the support of Y.
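The three measures can be computed directly from their definitions. The transaction list below is a made-up toy example:

```python
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(items):
    """Fraction of transactions containing every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(x, y):
    """support(X and Y) / support(X): how often Y appears given X."""
    return support(x | y) / support(x)

def lift(x, y):
    """Confidence corrected for Y's own popularity."""
    return confidence(x, y) / support(y)

print(support({"bread"}))               # 3 of 5 transactions -> 0.6
print(confidence({"bread"}, {"milk"}))  # 2 of the 3 bread baskets -> 2/3
print(lift({"bread"}, {"milk"}))        # (2/3) / 0.8
```

A lift above 1 means X and Y appear together more often than their individual popularities would suggest; in this toy data the lift is below 1.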
Session III
1) Explain the Linear Regression algorithm.
2) Explain the Logistic Regression algorithm.
3) Explain the K-means Clustering algorithm.
4) Explain the Principal Component Analysis algorithm.
5) Explain the Support Vector Machines algorithm.
6) Explain the Artificial Neural Networks algorithm.
7) Explain the Decision Trees algorithm.
8) Explain the Recurrent Neural Networks algorithm.
9) Explain the Apriori algorithm.