
K-MEANS

K-means aims to minimize the variance (sum of squared distances) within each cluster. It's worth noting that the
algorithm is sensitive to the initial random centroid selection, and multiple runs with different initializations may
yield different results. Therefore, it's common practice to run k-means multiple times and select the best result
based on some criterion, such as minimizing the within-cluster variance.
K-means is a simple and efficient algorithm, but it has limitations: it is sensitive to initialization, and it assumes
that clusters are spherical, equally sized, and have similar variance. For datasets with more complex structures,
alternative clustering methods such as hierarchical clustering, DBSCAN, or spectral clustering may be more appropriate.

Formula: The k-means algorithm involves two main formulas: one for assigning data points to clusters and another
for updating the cluster centroids. Here are the formulas for each step:

1. **Assign Data Points to Clusters (Assignment Step):**


In the assignment step, you calculate the distance between each data point and each cluster centroid and then
assign each data point to the nearest cluster. Typically, the Euclidean distance is used as the distance metric.

Let's denote:
- `x` as a data point
- `c_i` as the centroid of cluster `i`
- `d(x, c_i)` as the distance between data point `x` and centroid `c_i`.

The formula for assigning a data point `x` to the nearest cluster is:

```
Cluster(x) = argmin_i {d(x, c_i)}
```

In other words, you assign the data point `x` to the cluster whose centroid is closest to it based on the Euclidean
distance.

2. **Update Cluster Centroids (Update Step):**


In the update step, you calculate the new centroid for each cluster. The centroid of a cluster is simply the mean of
all the data points assigned to that cluster.

Let's denote:
- `N_i` as the number of data points assigned to cluster `i`
- `x_j` as the `j`-th data point assigned to cluster `i`.

The formula for updating the centroid `c_i` of cluster `i` is:

```
c_i = (1 / N_i) * Σ x_j
```

Here, you sum up all the data points `x_j` that belong to cluster `i` and divide by the number of data points in that
cluster (`N_i`) to compute the new centroid `c_i`.

These two formulas constitute the core of the k-means algorithm. The algorithm iteratively applies these steps until
convergence, and the final centroids represent the center of each cluster.
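
To make the two formulas concrete, here is a minimal NumPy sketch of a single assignment/update iteration; the toy data, k = 2, and the initial centroids are illustrative assumptions:

```python
import numpy as np

# Toy data and an arbitrary choice of k = 2 initial centroids
X = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]])
centroids = np.array([[1.0, 1.0], [9.0, 9.0]])

# Assignment step: Cluster(x) = argmin_i d(x, c_i), using Euclidean distance
distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
labels = np.argmin(distances, axis=1)

# Update step: c_i = (1 / N_i) * sum of the points assigned to cluster i
centroids = np.array([X[labels == i].mean(axis=0) for i in range(len(centroids))])

print(labels)     # cluster index for each point
print(centroids)  # updated cluster centers
```

The full algorithm simply repeats these two steps until the assignments stop changing.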

Implementation: We import the necessary libraries, generate some sample data, and specify the number of clusters
(k) to create.
We create a KMeans instance with the desired number of clusters, fit the model to the data, and obtain the cluster
assignments for each data point.

We also get the cluster centroids, which represent the center of each cluster.

Finally, we demonstrate how to predict the cluster for new data points using the trained k-means model.
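
A minimal scikit-learn sketch of this workflow, assuming synthetic blob data and k = 3 (in practice you would substitute your own dataset and cluster count):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Generate some sample data (replace with your own dataset in practice)
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Create a KMeans instance, fit it, and obtain the cluster assignments
k = 3
kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)  # n_init reruns with different initializations
labels = kmeans.fit_predict(X)

# Cluster centroids: the center of each cluster
print(kmeans.cluster_centers_)

# Predict the cluster for new data points
new_points = np.array([[0.0, 0.0], [5.0, 5.0]])
print(kmeans.predict(new_points))
```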

Keep in mind that in practice, you would replace the sample data with your own dataset and adjust the parameters
as needed. Additionally, you may need to perform data preprocessing and choose an appropriate number of clusters
based on your specific problem.

Related study:

Pengfei Shan (2018): Images are an important way for people to understand the world, and giving computers the ability
to recognize images is the goal of image recognition research. Within image recognition, image segmentation is one of
the important research directions. This paper uses the gray-gradient maximum entropy method to extract features from
the image, uses the K-means method to classify the images, and uses the average precision (AP) and intersection over
union (IU) evaluation methods to evaluate the results. The results show that the K-means method can achieve image
segmentation very well.

KNN

The k-Nearest Neighbors (KNN) algorithm is a supervised machine learning algorithm used for classification and
regression tasks. It is a simple and intuitive algorithm that makes predictions based on the similarity of a data point
to its k-nearest neighbors in the training dataset. KNN can be used for both classification and regression tasks:

1. KNN Classification:

In KNN classification, the algorithm assigns a class label to a data point based on the majority class among its k-
nearest neighbors.
The "k" in KNN refers to the number of nearest neighbors to consider when making a prediction.
Common distance metrics to measure similarity between data points include Euclidean distance, Manhattan
distance, and cosine similarity.
The algorithm doesn't learn a model from the training data but rather stores the entire dataset to make predictions.
It is often referred to as an instance-based or lazy learning algorithm.
2. KNN Regression:

In KNN regression, the algorithm predicts a numerical value for a data point based on the average or weighted
average of the values of its k-nearest neighbors.
For regression tasks, the predicted value is often the mean or median of the target values of the k-nearest neighbors.
Key characteristics of the KNN algorithm include:

- It's a non-parametric algorithm, meaning it doesn't make any assumptions about the underlying data distribution.
- KNN can be sensitive to the choice of the value of "k." A smaller "k" may lead to a noisy model, while a larger "k" may result in oversmoothed predictions.
- It is suitable for both binary and multiclass classification tasks, as well as numerical prediction tasks.
- KNN is computationally expensive, especially for large datasets, because it requires calculating distances between the test point and all data points in the training set.
- Scaling or normalizing the features is often necessary to avoid bias in the distance calculations.
- It is important to choose an appropriate distance metric and value of "k" based on the specific problem and data characteristics.
The KNN algorithm is straightforward to implement and is often used as a baseline model for classification and
regression tasks. However, its performance can vary depending on the choice of "k" and the quality of the features,
and it may not be suitable for high-dimensional data.
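
In practice, several of these points (feature scaling and the choice of "k" in particular) are handled with library tools. Below is a minimal scikit-learn sketch that combines a StandardScaler with a KNeighborsClassifier; the synthetic dataset and k = 5 are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

# Synthetic classification data (replace with your own dataset)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Scale the features so no single feature dominates the distance calculations
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)

print("Test accuracy:", knn.score(X_test, y_test))
```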

Formula: KNN Regression Formula

1. Given a new data point `x` for which you want to predict a numerical value, find the k training data points in
your dataset that are closest to `x`, according to a chosen distance metric.

2. For regression, the predicted value for `x` is often the mean or weighted mean of the target values of the k-
nearest neighbors. The formula for predicting the value `y` for the new data point `x` is:

```
y = (1/k) * Σ y_i
```

where `y_i` represents the target value of the `i`-th nearest neighbor among the k neighbors.
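
As an illustration of this averaging step, here is a small NumPy sketch of KNN regression; the toy training data, query point, and k = 3 are made-up values:

```python
import numpy as np

# Toy one-feature training data with numerical targets
X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([1.2, 1.9, 3.1, 3.9, 5.2])

x_new = np.array([2.4])   # query point
k = 3

# Find the k nearest neighbors by Euclidean distance
distances = np.linalg.norm(X_train - x_new, axis=1)
nearest = np.argsort(distances)[:k]

# Predicted value: mean of the neighbors' targets, y = (1/k) * Σ y_i
y_pred = y_train[nearest].mean()
print(y_pred)
```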

The choice of distance metric and the value of "k" are critical parameters in KNN and may significantly affect the
algorithm's performance. Common distance metrics include Euclidean distance, Manhattan distance, Minkowski
distance, and cosine similarity, among others.

It's important to note that KNN doesn't learn a model from the data but rather stores the entire training dataset. The
algorithm makes predictions by comparing the new data point to the training data points and finding the k-nearest
neighbors based on the chosen distance metric.

Implementation:

```python
import numpy as np

# Sample dataset
X = np.array([[1, 2], [2, 3], [2, 5], [3, 4], [3, 6], [4, 4], [4, 7], [5, 6]])
y = np.array([0, 0, 1, 1, 0, 1, 1, 0])

# New data point to classify
new_data = np.array([3, 5])

# Define the value of k (number of neighbors to consider)
k = 3

# Calculate Euclidean distances between the new data point and all data points
distances = np.sqrt(np.sum((X - new_data)**2, axis=1))

# Find the indices of the k-nearest neighbors
nearest_indices = np.argsort(distances)[:k]

# Get the class labels of the k-nearest neighbors
nearest_labels = y[nearest_indices]

# Make a prediction by selecting the majority class among the k-nearest neighbors
prediction = np.bincount(nearest_labels).argmax()

print(f"The predicted class for the new data point is: {prediction}")
```

In this code:

1. We create a sample dataset `X` and corresponding class labels `y`. This is a simplified dataset with two features
and binary class labels.

2. We define a new data point `new_data` that we want to classify.


3. We specify the value of `k` (the number of neighbors to consider in the classification). In this example, `k = 3`.

4. We calculate the Euclidean distances between the new data point and all data points in the dataset.

5. We find the indices of the `k` nearest neighbors by sorting the distances and selecting the first `k` indices.

6. We extract the class labels of the `k` nearest neighbors.

7. We make a prediction by selecting the majority class among the `k` nearest neighbors using `np.bincount` and
`argmax`.

Related study:

Shichao Zhang (2017): The kNN (k Nearest Neighbors) algorithm is a non-parametric, instance-based (or lazy) method and
has been regarded as one of the simplest methods in data mining and machine learning [27], [37], [38]. The principle of
the kNN algorithm is that the most similar samples have a high probability of belonging to the same class. Generally,
the kNN algorithm first finds the k nearest neighbors of a query in the training dataset, and then predicts the query
with the majority class among those k nearest neighbors. Consequently, it has recently been selected as one of the top
10 algorithms in data mining [32].

ARTIFICIAL NEURAL NETWORKS

Artificial Neural Networks (ANNs) are computational models inspired by the structure and function of biological
neural networks in the human brain. ANNs are a fundamental component of deep learning and are widely used for
solving complex tasks such as pattern recognition, classification, regression, and optimization problems. They consist
of interconnected nodes, also known as neurons or units, organized in layers that process and transform input data
to produce the desired output. Here is an overview of the key components of artificial neural networks:

1. Neurons or Nodes: Neurons are the basic computational units that process and transmit information in an artificial
neural network. They receive input signals, apply a transformation using an activation function, and pass the output
to the next layer.

2. Layers: ANNs are composed of multiple layers, including an input layer, one or more hidden layers, and an output
layer. The input layer receives the initial data, the hidden layers perform intermediate computations, and the output
layer produces the final result.

3. Connections and Weights: Neurons are connected by weighted connections that transmit signals between layers.
These connections have associated weights that determine the strength of the signal. Learning in neural networks
involves adjusting these weights during the training process.

4. Activation Function: Each neuron applies an activation function to the weighted sum of its inputs to introduce
non-linearities and allow the network to model complex relationships in the data. Common activation functions
include the sigmoid, tanh, ReLU (Rectified Linear Unit), and softmax functions.

5. Forward Propagation: During forward propagation, the input data flows through the network from the input layer
to the output layer, with each layer performing computations and passing the results to the next layer.

6. Backpropagation: Backpropagation is an algorithm used to train neural networks. It involves calculating the
gradient of the loss function with respect to the network's weights, which is then used to update the weights in the
opposite direction of the gradient, aiming to minimize the loss.

7. Loss Function: The loss function measures the error between the predicted output of the network and the
actual target values. The goal during training is to minimize this error by adjusting the network's weights.

8. Optimization Algorithm: Optimization algorithms like gradient descent or its variants are used to update the
weights of the network during training, aiming to find the optimal set of weights that minimize the loss function.

Formula:
The artificial neural network (ANN) consists of interconnected nodes organized in layers, with each node performing
specific computations. While the operations in a neural network involve complex mathematics, the basic formulas
can be simplified to understand the underlying principles. Here are the fundamental components and simplified
formulas of an artificial neural network:

1. **Input Layer**:
- There is no specific formula for the input layer, as it only serves to pass the input data to the first hidden layer.

2. **Hidden Layer and Output Layer**:


- Each node in the hidden layer and the output layer applies the following formula to produce an output:

```
z = b + w1 * x1 + w2 * x2 + ... + wn * xn
```

where:
- `z` is the weighted sum of the inputs and biases.
- `b` is the bias term.
- `w1, w2, ..., wn` are the weights associated with each input `x1, x2, ..., xn`.

3. **Activation Function**:
- After calculating the weighted sum, the result is passed through an activation function. Common activation
functions include the sigmoid function, ReLU (Rectified Linear Unit), tanh function, and softmax function.
- For instance, the sigmoid activation function is given by:

```
activation(z) = 1 / (1 + e^(-z))
```

- The ReLU activation function is:

```
activation(z) = max(0, z)
```

- The tanh activation function is:

```
activation(z) = (e^z - e^(-z)) / (e^z + e^(-z))
```

- The softmax activation function (typically used in the output layer for multiclass classification) is:

```
softmax(z_i) = e^(z_i) / Σ_j e^(z_j)
```

4. **Loss Function**:
- The loss function measures the error between the predicted output and the actual target values. It quantifies how
well the model is performing.
- Common loss functions include mean squared error (MSE) for regression tasks and categorical cross-entropy for
classification tasks.
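
As a small illustration of formulas 2 and 3, here is a NumPy sketch of one neuron's weighted sum followed by a sigmoid activation; the input values, weights, and bias are made-up numbers:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Made-up inputs, weights, and bias for a single neuron
x = np.array([0.5, -1.2, 3.0])   # x1, x2, x3
w = np.array([0.4, 0.1, -0.6])   # w1, w2, w3
b = 0.2

z = b + np.dot(w, x)             # weighted sum: z = b + w1*x1 + w2*x2 + w3*x3
a = sigmoid(z)                   # activation output

print(f"z = {z:.3f}, activation = {a:.3f}")
```
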
Implementation:
Implementing an artificial neural network (ANN) from scratch can be a complex task. However, you can utilize deep
learning libraries such as TensorFlow or PyTorch, which provide high-level APIs for building and training neural
networks. Here's a simple example of implementing a feedforward neural network for binary classification using the
TensorFlow library in Python:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Generate some synthetic data
np.random.seed(0)
X = np.random.rand(100, 2)        # 100 sample data points with 2 features
y = np.random.randint(0, 2, 100)  # Binary labels

# Build the neural network model
model = keras.Sequential([
    layers.Dense(5, activation='relu', input_shape=(2,)),
    layers.Dense(1, activation='sigmoid')
])

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X, y, epochs=50, batch_size=8)

# Make predictions
predictions = model.predict(X)

# Print some predictions
print("Some predictions:")
for i in range(5):
    print(f"Data point: {X[i]}, Prediction: {predictions[i]}")
```

In this code:

- We use TensorFlow and Keras to build a simple feedforward neural network with one hidden layer.
- The model is compiled with the Adam optimizer and binary cross-entropy loss for binary classification.
- We train the model on synthetic data (`X`) and corresponding binary labels (`y`).
- After training, we use the model to make predictions on the same data.

This is a basic example, and in practical applications, you will work with more complex data and architectures.
TensorFlow and PyTorch provide a wide range of tools and functionalities for building and training sophisticated
neural network models, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and
more. Using these libraries can greatly simplify the process of implementing and working with neural networks.

Related study:
According to Nebojsa Bacanin (2021), artificial neural networks are one of the most commonly used methods in
machine learning, and the performance of a network depends heavily on the learning method. Traditional learning
algorithms are prone to being trapped in local optima and converge slowly. On the other hand, nature-inspired
optimization algorithms have proven very efficient at solving complex optimization problems because they do not
require derivatives. To address the issues of traditional learning algorithms, this study proposes an enhanced version
of the artificial bee colony nature-inspired metaheuristic to optimize the connection weights and hidden units of
artificial neural networks. The proposed improved method incorporates quasi-reflection-based learning and a guided
best solution bounded mechanism into the original approach and manages to overcome its deficiencies. First, the method
is tested on the recent and challenging CEC 2017 benchmark function set, then applied to training an artificial neural
network on five well-known medical benchmark datasets. Further, the devised algorithm is compared to other
metaheuristics-based methods. Efficiency is measured by five metrics: accuracy, specificity, sensitivity, geometric
mean, and area under the curve. Simulation results show that the proposed algorithm outperforms the other
metaheuristics in terms of accuracy and convergence speed, with accuracy improvements over the other methods on
different datasets of between 0.03% and 12.94%. The quasi-reflection-based learning mechanism significantly improves
the convergence speed of the original artificial bee colony algorithm, and together with the guided best solution
bounded mechanism, the exploitation capability is enhanced, which results in significantly better accuracy.

DEEP LEARNING

Deep learning is a subfield of machine learning that focuses on artificial neural networks with multiple layers, also
known as deep neural networks. Deep learning algorithms aim to model and solve complex tasks by learning
patterns and representations directly from data. The defining characteristic of deep learning is the use of deep
neural networks, which are composed of multiple layers of interconnected nodes (neurons) that process and
transform the data.

Formula: Input Data: Deep learning starts with input data, which could be anything from images and text to
numerical values. This data is the information you want the deep learning model to learn from.

Neural Network: A neural network is like a virtual brain composed of interconnected layers. It takes the input data
and processes it through multiple layers, each containing many artificial neurons. These neurons perform
computations and transform the data.

Weights and Connections: In the neural network, each connection between neurons has a weight. These weights are
like adjustable knobs that the model tunes during training. The model learns how to set these weights to make
predictions or perform specific tasks.

Training: To train a deep learning model, you provide it with a large dataset that includes both input data and the
correct answers or labels. The model makes predictions based on its current weights and compares them to the
actual answers.

Loss Function: A loss function quantifies how wrong the model's predictions are compared to the correct answers.
The goal during training is to minimize this loss.

Backpropagation: Backpropagation is an algorithm that adjusts the weights in the neural network to reduce the loss.
It works by calculating how each weight change would affect the loss and then updating the weights accordingly.

Iterations: The training process involves many iterations. The model repeatedly goes through the dataset, making
predictions, calculating the loss, and updating the weights to get better and better at the task.

Testing and Inference: Once the model is trained, you can use it to make predictions on new, unseen data. This is
called inference, and the model can perform tasks it was trained for, such as recognizing images, translating
languages, or generating text.
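
To make the training loop concrete, here is a minimal NumPy sketch of gradient descent on a single linear neuron with a mean squared error loss; the toy data, learning rate, and number of iterations are illustrative assumptions:

```python
import numpy as np

# Toy data: y is roughly 2*x + 1 plus noise
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2 * x + 1 + rng.normal(0, 0.1, size=100)

w, b = 0.0, 0.0          # the adjustable "knobs" (weights) start at arbitrary values
learning_rate = 0.1

for epoch in range(200):                 # iterations over the dataset
    y_pred = w * x + b                   # prediction (forward pass)
    loss = np.mean((y_pred - y) ** 2)    # loss function (mean squared error)
    # Backpropagation for this one-neuron model: gradients of the loss
    grad_w = np.mean(2 * (y_pred - y) * x)
    grad_b = np.mean(2 * (y_pred - y))
    # Update the weights in the opposite direction of the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"Learned w={w:.2f}, b={b:.2f}, final loss={loss:.4f}")
```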

Implementation: Data Preparation:

Collect and preprocess your data. This may include tasks such as data cleaning, normalization, feature scaling, and
data splitting into training, validation, and test sets.
Encode labels or target variables if you're working on a classification task.
Choose a Framework:

Select a deep learning framework like TensorFlow, PyTorch, Keras, or others. These frameworks provide high-level
APIs for building and training deep neural networks, making the implementation process more manageable.
Model Architecture:

Decide on the architecture of your deep neural network. This includes the type and number of layers, the number of
neurons in each layer, and the activation functions.
You can use pre-built models (e.g., Convolutional Neural Networks for image tasks) or create custom architectures.
Loss Function:

Choose an appropriate loss function for your specific task. For example, use mean squared error for regression or
cross-entropy for classification.
The loss function measures the error between the model's predictions and the actual target values.
Optimization Algorithm:

Select an optimization algorithm (optimizer) like stochastic gradient descent (SGD), Adam, RMSprop, or others.
The optimizer adjusts the model's weights during training to minimize the loss.
Training:

Feed the training data through the model, compute the loss, and use backpropagation to update the model's
weights.
Monitor the model's performance on a validation set to prevent overfitting and select the best model based on
validation performance.
Hyperparameter Tuning:

Experiment with different hyperparameters, such as learning rate, batch size, number of layers, and neurons, to
optimize the model's performance.
Regularization:

Apply regularization techniques like dropout, L1/L2 regularization, or batch normalization to prevent overfitting.
Testing:

Once the model is trained and hyperparameters are optimized, evaluate its performance on a test dataset to ensure
it generalizes well to new, unseen data.
Deployment:

If you want to use your deep learning model in real-world applications, you can deploy it in production
environments. This may involve integrating the model into a software application or web service.
Continual Learning and Fine-Tuning:

Keep in mind that deep learning models may require periodic updates, especially when dealing with evolving data.
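
Putting several of these steps together, here is a minimal Keras sketch of the overall workflow (data splitting, model architecture, loss and optimizer, training with validation, and testing); the synthetic data and all hyperparameters are illustrative assumptions:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split

# Synthetic data standing in for a real, preprocessed dataset
np.random.seed(0)
X = np.random.rand(500, 10)
y = np.random.randint(0, 2, 500)

# Split into training, validation, and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)

# Model architecture: two hidden layers, with dropout for regularization
model = keras.Sequential([
    layers.Dense(32, activation='relu', input_shape=(10,)),
    layers.Dropout(0.2),
    layers.Dense(16, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])

# Loss function and optimizer
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Training, with validation monitoring to watch for overfitting
model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=20, batch_size=32)

# Testing: evaluate generalization on held-out data
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_acc:.3f}")
```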

Related study: According to Samira Pouyanfar, the field of machine learning is witnessing its golden era as deep learning slowly
becomes the leader in this domain. Deep learning uses multiple layers to represent the abstractions of data to build
computational models. Some key enabler deep learning algorithms such as generative adversarial networks,
convolutional neural networks, and model transfers have completely changed our perception of information
processing. However, there exists an aperture of understanding behind this tremendously fast-paced domain,
because it was never previously represented from a multiscope perspective. The lack of core understanding renders
these powerful methods as black-box machines that inhibit development at a fundamental level. Moreover, deep
learning has repeatedly been perceived as a silver bullet to all stumbling blocks in machine learning, which is far
from the truth. This article presents a comprehensive review of historical and recent state-of-the-art approaches in
visual, audio, and text processing; social network analysis; and natural language processing, followed by the in-depth
analysis on pivoting and groundbreaking advances in deep learning applications. It was also undertaken to review
the issues faced in deep learning such as unsupervised learning, black-box models, and online learning and to
illustrate how these challenges can be transformed into prolific future research avenues.
