
Algorithm 1: Decision Tree

1. For the input, make the following assumptions: Training data T consisting of feature vectors and
corresponding class labels, max_depth denotes the maximum depth of the tree, min_samples_split
denotes the minimum number of samples required to split an internal node, and min_samples_leaf
denotes the minimum number of samples required to be at a leaf node.
2. Create a Decision Tree using the training data T with the specified hyperparameters (max_depth,
min_samples_split, and min_samples_leaf).
3. Generate a Decision Tree model D.
4. Train the Decision Tree model D:
a. Start by considering the entire training data T at the root node.
b. At each internal node of the tree, evaluate potential splits for each feature by computing a splitting
criterion, e.g., Gini impurity or information gain.
c. Select the feature and split point that optimizes the splitting criterion (e.g., minimizes Gini impurity or maximizes information gain).
d. Create child nodes for the selected feature and split point.
e. Recursively repeat steps b to d for each child node until one of the stopping conditions is met (the node reaches max_depth, contains fewer than min_samples_split samples, a split would leave a child with fewer than min_samples_leaf samples, or all samples in the node belong to one class).
f. Assign the class label to each leaf node based on the majority class of the samples in that leaf node.
5. Return the trained Decision Tree model D.
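
As an illustration of steps 1 to 5, the sketch below fits a decision tree with scikit-learn; the Iris data, the train/test split, and the specific hyperparameter values (max_depth=3, min_samples_split=4, min_samples_leaf=2) are illustrative assumptions, not part of the algorithm.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy data standing in for the training data T (step 1)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Steps 2-4: build and train the tree D with the stated hyperparameters
D = DecisionTreeClassifier(max_depth=3,
                           min_samples_split=4,
                           min_samples_leaf=2,
                           criterion="gini")   # splitting criterion of step 4b
D.fit(X_train, y_train)

# Step 5: the trained model D can now classify unseen samples
print("test accuracy:", D.score(X_test, y_test))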

Algorithm 2: Gradient Boosting


1. Initialize the ensemble model F(x) to a constant value.
2. For t = 1 to T_max:
a. Compute the negative gradient of the loss function with respect to the current ensemble
model:
i. r_i^(t) = −∂L(y_i, F(x_i)) / ∂F(x_i), evaluated for all data points (x_i, y_i) in T.
b. Train a base learner (e.g., a decision tree) using the training data T and the negative gradients r_i^(t)
as target values. This learner models the residual errors.
c. Update the ensemble model:
i. F(x) = F(x) + η·h(x)
ii. where h(x) is the prediction of the trained base learner and η is the learning rate.
d. Repeat steps a to c for T_max iterations.
3. Return the trained gradient boosting ensemble model F(x).
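
A minimal from-scratch sketch of steps 1 to 3 for a regression setting with squared-error loss, where the negative gradient r_i^(t) reduces to the residual y_i − F(x_i); the choice of loss, the depth-3 regression trees, and the default values of T_max and η are assumptions made for illustration.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gradient_boosting(X, y, T_max=100, eta=0.1):
    F0 = y.mean()                     # step 1: initialise F(x) to a constant
    F = np.full(len(y), F0)
    learners = []
    for t in range(T_max):            # step 2
        r = y - F                     # 2a: negative gradient of squared loss
        h = DecisionTreeRegressor(max_depth=3).fit(X, r)   # 2b: fit base learner to residuals
        F = F + eta * h.predict(X)    # 2c: F(x) = F(x) + eta * h(x)
        learners.append(h)
    return F0, learners               # step 3: trained ensemble

def predict_gradient_boosting(X, F0, learners, eta=0.1):
    F = np.full(X.shape[0], F0)
    for h in learners:
        F += eta * h.predict(X)
    return F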

Algorithm 3: k-Nearest Neighbour


1. Make the following assumptions: Training data T consisting of feature vectors and corresponding class
labels, a positive integer value k representing the number of neighbours to consider, and an input query
point Q for which we want to determine the class label.
2. Create a k-NN classifier using the training data T.
3. For each query point Q in the dataset or a new input point:
a. Calculate the distances between Q and all data points in T using a distance metric (e.g., Euclidean distance).
b. Select the k-nearest data points in T based on the calculated distances.
c. Determine the class labels of the k-nearest neighbours.
d. Assign the class label to Q by using a majority voting scheme, where the class label with the
most occurrences among the k-nearest neighbours is selected as the predicted class label. In
case of a tie, you can use various tie-breaking strategies (e.g., selecting the class of the nearest
neighbour or random selection).
4. Return the predicted class label for the input query point Q.
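
A short sketch of steps 3 and 4 written directly with NumPy; Euclidean distance and k = 3 are assumed here, and ties are broken by whichever class the Counter encounters first among the neighbours.

import numpy as np
from collections import Counter

def knn_predict(T_X, T_y, Q, k=3):
    # 3a: distances between the query Q and every point in T (Euclidean metric)
    distances = np.linalg.norm(T_X - Q, axis=1)
    # 3b: indices of the k nearest training points
    nearest = np.argsort(distances)[:k]
    # 3c-3d: majority vote over the neighbours' class labels
    votes = Counter(T_y[i] for i in nearest)
    return votes.most_common(1)[0][0]   # step 4: predicted label for Q
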
Algorithm 4: Naïve Bayes

1. Make the following assumptions: Training data T consisting of feature vectors and corresponding class
labels, and a set of class labels C.
2. Calculate the prior probabilities P(C_i) for each class C_i in C as follows:
3. Count the number of training examples in T belonging to each class C_i.
4. Divide the count by the total number of training examples to compute P(C_i).
5. For each feature f_j in the feature vector:
a. Calculate the class-conditional probability P(f_j | C_i) for each class C_i:
b. Count the number of training examples in class C_i where the feature f_j occurs.
c. Divide the count by the total number of training examples in class C_i to compute
P(f_j | C_i).
6. Store the calculated prior probabilities P(C_i) and class-conditional probabilities P(f_j | C_i).
7. For a given input feature vector x:
a. For each class C_i in C:
b. Calculate the posterior probability P(C_i | x) using Bayes' theorem:
c. P(C_i | x) = (P(C_i) * ∏_j[P(f_j | C_i)]) / P(x), where the evidence P(x) is the same for every class and can be omitted when comparing classes.
8. Store the posterior probability for each class.
9. Assign the class label C_i with the highest posterior probability as the predicted class for the input
feature vector x.
10. Return the trained Naive Bayes classifier, which includes the prior probabilities P(C_i) and class-
conditional probabilities P(f_j | C_i) for each class C_i in C.
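
The sketch below implements steps 2 to 9 for discrete feature values; add-one (Laplace) smoothing and the use of log-probabilities are assumptions added to keep the arithmetic stable, and feature values seen at prediction time are assumed to occur somewhere in the training data.

import numpy as np

def train_naive_bayes(X, y):
    """X: 2-D array of discrete feature values, y: 1-D array of class labels."""
    classes = np.unique(y)
    priors = {c: np.mean(y == c) for c in classes}          # steps 2-4: P(C_i)
    cond = {}                                               # step 5: P(f_j = v | C_i)
    for c in classes:
        Xc = X[y == c]
        cond[c] = []
        for j in range(X.shape[1]):
            values = np.unique(X[:, j])
            # add-one smoothing so no conditional probability is exactly zero
            cond[c].append({v: (np.sum(Xc[:, j] == v) + 1) / (len(Xc) + len(values))
                            for v in values})
    return priors, cond                                     # step 6: stored tables

def predict_naive_bayes(x, priors, cond):
    # steps 7-9: argmax over classes of log P(C_i) + sum_j log P(f_j | C_i);
    # the evidence P(x) is identical for every class and is therefore dropped
    scores = {c: np.log(priors[c]) +
                 sum(np.log(cond[c][j][v]) for j, v in enumerate(x))
              for c in priors}
    return max(scores, key=scores.get)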

Algorithm 5: Artificial Neural Networks

1. For the input, make the following assumptions: Training data T (consisting of input features and
corresponding target labels), number of hidden layers L, number of neurons per hidden layer N,
learning rate α, and number of epochs E.
2. Initialize the weights and biases for each layer randomly.
3. For epoch = 1 to E, and for each data point (x, y) in T:
a. Set the input layer values to the features of data point x.
b. For l = 1 to L, compute the weighted sum and apply the activation function for layer l, passing the activations on to the next layer (forward pass).
c. Compute the error between the predicted output and the actual target y.
d. Update the weights and biases using gradient descent (backward pass): for l = L down to 1,
i. compute the gradient of the loss with respect to the weights and biases of layer l;
ii. update the weights and biases of layer l using the gradient and the learning rate α.
4. Return the trained neural network model.
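
A compact sketch using scikit-learn's MLPClassifier; the synthetic data set and the particular choices of L = 2 hidden layers with N = 16 neurons each, α = 0.01, and E = 200 epochs are illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Toy data standing in for the training data T (step 1)
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Steps 2-3: random initialisation, then E epochs of gradient-descent updates
model = MLPClassifier(hidden_layer_sizes=(16, 16),   # L = 2 layers, N = 16 neurons
                      learning_rate_init=0.01,       # learning rate alpha
                      max_iter=200,                  # number of epochs E
                      solver="sgd",
                      random_state=0)
model.fit(X_train, y_train)

# Step 4: the trained network
print("test accuracy:", model.score(X_test, y_test))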

Algorithm 6: Random Forest classifier


1. Make the following assumptions: Training data T consisting of feature vectors and corresponding class
labels, the number of trees in the forest as N_trees, the number of features to consider for each split as
max_features, the minimum number of samples required to split a node as min_samples_split, and the
maximum depth of each tree as max_depth.
2. Create an empty list, `forest`, to hold the ensemble of decision trees.
3. For i = 1 to N_trees:
a. Generate a random subset of the training data T by sampling with replacement. This subset is
referred to as T_i.
b. Create a decision tree D_i using T_i:
i. Initialize the root node of D_i.
ii. Recursively grow the tree as follows:
A. If the current node has fewer samples than min_samples_split or reaches max_depth, make it a
leaf node and assign the majority class label of the samples in the node.
B. Otherwise, randomly select max_features features from the dataset.
C. Find the best feature and split point among the selected features based on a criterion like Gini
impurity or entropy.
D. Create two child nodes for the current node, one for samples that satisfy the split condition and
another for samples that do not.
E. Recursively repeat steps A to D for the child nodes.

c. Add the decision tree D_i to the `forest` list.

4. Return the trained Random Forest classifier, which is an ensemble of the decision trees in the `forest` list.
When making predictions, each tree in the forest votes on the class label, and the majority class label is assigned
as the final prediction.
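
A brief scikit-learn sketch of steps 2 to 4; the digits data and the hyperparameter values (N_trees = 100, max_features = "sqrt", min_samples_split = 2, max_depth = 10) are assumptions chosen only to make the example run.

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy data standing in for the training data T (step 1)
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Steps 2-3: bootstrap sampling and tree growing are handled internally
forest = RandomForestClassifier(n_estimators=100,      # N_trees
                                max_features="sqrt",   # features considered per split
                                min_samples_split=2,
                                max_depth=10,
                                random_state=0)
forest.fit(X_train, y_train)

# Step 4: majority vote of the trees gives the final prediction
print("test accuracy:", forest.score(X_test, y_test))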

Algorithm 7: Support Vector Machine


1. Given a dataset with input features x and corresponding labels y, choose a kernel function K and set
hyperparameters such as the regularization parameter C.
2. Compute the kernel matrix (Gram matrix) based on the chosen kernel function.
3. Define the optimization problem to maximize the margin between classes subject to constraints.
4. Maximize: W(α) = ∑_{i=1}^{N} α_i − (1/2) ∑_{i,j} α_i α_j y_i y_j K(x_i, x_j)

Subject to: 0 ≤ α_i ≤ C and ∑_i α_i y_i = 0, where the α_i are the Lagrange multipliers.
5. Compute the weight vector w and bias term b from the Lagrange multipliers α_i and the support vectors:

w = ∑_i α_i y_i x_i

b = (1/|S|) ∑_{i∈S} (y_i − w·x_i), where S is the set of support vectors.
6. Given a new input sample x, predict the class label:

y_pred = sign(w^T x + b)
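
To illustrate steps 1 to 6, the sketch below trains scikit-learn's SVC, which solves the dual problem of step 4 internally; the synthetic data, the RBF kernel, and C = 1.0 are assumptions made for the example (with a non-linear kernel, w is never formed explicitly and predictions use the kernel expansion instead).

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Toy data standing in for the labelled dataset of step 1
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Step 1: choose a kernel and the regularization parameter C
clf = SVC(kernel="rbf", C=1.0)

# Steps 2-5: the Gram matrix, the dual optimisation over alpha_i, and the
# bias term b are computed inside fit()
clf.fit(X, y)

print("support vectors per class:", clf.n_support_)
print("bias term b:", clf.intercept_)

# Step 6: predict labels for new samples (sign of the decision function)
print("predictions:", clf.predict(X[:5]))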
