
Topics in Module-3-ML & Cloud Computing for IoT

• Supervised and Unsupervised ML Algorithms
• IoT Data Analytics
• Cloud Computing for IoT
• Cloud Based Platforms
• ML for Cloud IoT Analytics
• Challenges

Analytics = Machine learning (Learn + Predict + Improve)
1
Evolution of ML

• Artificial Intelligence: engineering of machines that mimic cognitive functions.
• Machine Learning: the ability to perform tasks without explicit instructions, relying on patterns instead.
• Deep Learning: machine learning based on artificial neural networks.
2
What is Machine Learning?

A canonical definition by Tom Mitchell in 1997: “An agent is said to learn from experience (E) with respect to some class of tasks (T) and a performance measure (P), if the learner's performance at T, as measured by P, improves with E.” One has to be very careful about defining the set of tasks T and the performance measure P: with experience E, the performance P has to improve.
3
What is Machine Learning?

Machine learning can be defined as a subset of Artificial Intelligence (AI) that allows a system/computer to learn from available data. The data can either be labelled (with a number, tag, or type) or unlabelled.
4
Advantages and Disadvantages of Machine Learning
Advantages:
1. Easy to identify patterns
2. No human intervention
3. Wide range of applications
4. Scope for continuous improvement
5. Handles multi-variety data

Disadvantages:
1. Chances of error
2. Data acquiring and preprocessing
3. Time and resource dependent
4. Human expertise needed for result interpretation

5
Applications of Machine Learning

6
Real world Applications of Machine Learning

7
Types of Machine Learning based on Learning

8
What is Supervised Machine Learning?

• Learning an input and output map.
• It deals with labelled data.

• If the output happens to be a categorical one, then the supervised learning paradigm is called `classification'.
• If the output is a continuous value, then the learning paradigm is called `regression'.
9
What is Supervised Machine Learning?
• In supervised learning, the machine is trained on a set of labeled data, which means that the input
data is paired with the desired output. The machine then learns to predict the output for new
input data. Supervised learning is often used for tasks such as classification, regression, and
object detection.
• The machine is provided with a new set of examples (data) so that the supervised learning
algorithm analyses the training data and produces a correct outcome from labeled data.
• For example, a labeled dataset of images of Elephant, Camel, and Cow would have each image tagged with either “Elephant”, “Camel”, or “Cow”.

10
Supervised Learning: Supervised learning
is a category of machine learning that
uses labeled datasets to train algorithms
to predict outcomes and recognize
patterns.
Training set/Validation Set/Test Set
• The Training Set
• It is the set of data that is used to train and make the model learn
the hidden features/patterns in the data.
• The training set should have a diversified set of inputs so that the
model is trained in all scenarios and can predict any unseen data
sample that may appear in the future.
• The Validation Set
• The validation set is a set of data, separate from the training set,
that is used to validate our model performance during training.
• This validation process gives information that helps us tune the
model’s hyperparameters and configurations accordingly.
• The Test Set
• The test set is a separate set of data used to test the model after
completing the training.
• It provides an unbiased final model performance metric in terms
of accuracy, precision, etc.
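As a concrete illustration of the three sets, a common approach is two successive random splits; the sketch below (a minimal example, assuming scikit-learn is available and using placeholder arrays) produces a 70/15/15 partition.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 samples with 4 features each, binary labels.
X, y = np.random.rand(100, 4), np.random.randint(0, 2, 100)

# First split off the training set (70%), then divide the remaining
# 30% equally into validation and test sets (15% each).
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.30, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.50, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 70 15 15
```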
How does Supervised Machine Learning Work?

14
K Nearest Neighbour
• A supervised learning technique in which k ∈ [1, n] is the number of nearest samples with respect to a test sample. Note that n is the total number of samples.
• Example: Perform kNN classification on the following raw dataset of a smart home as shown in Table 1. Determine the class for Temperature = 4 and Humidity = 8 with k = 3.

15
16
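Since Table 1 is reproduced only as an image, the sketch below demonstrates the kNN mechanics (Euclidean distance, majority vote among the k nearest samples) on a made-up smart-home dataset; the (temperature, humidity) pairs and class labels are placeholders, not the slide's actual values.

```python
import math
from collections import Counter

# Hypothetical stand-in for Table 1: (temperature, humidity) -> class.
data = [
    ((2, 4), "Cold"), ((3, 7), "Cold"), ((5, 9), "Warm"),
    ((6, 5), "Warm"), ((7, 8), "Warm"), ((1, 6), "Cold"),
]

def knn_classify(query, samples, k):
    """Return the majority class among the k nearest samples."""
    by_distance = sorted(samples, key=lambda s: math.dist(query, s[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Test sample from the example: Temperature = 4, Humidity = 8, k = 3.
print(knn_classify((4, 8), data, k=3))
```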
Naive Bayes classification
• We know from probability theory that the probability of A when B is true is given by Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)

17
• Example: Consider the given raw dataset as shown in the table. Perform the Naive Bayes classification algorithm and determine the posterior probability if the weather is windy.

18
What is Unsupervised Machine Learning?

• Discovering patterns in the data.
• It deals with unlabelled data.

• The process of finding cohesive groups in the input data is called `clustering'.
• The process of finding the frequent co-occurrence of items in the data is called `association rule mining'.
19
Difference between Supervised and Unsupervised

20
Supervised and Unsupervised Algorithms

SVD (Singular Value Decomposition), PCA (Principal Component Analysis), FP-Growth (Frequent Pattern Growth), K-Nearest Neighbors (KNN), Support Vector Machine (SVM).
21
How does Unsupervised Machine Learning Work?

23
Supervised and Unsupervised Algorithms

24
Classification Vs. Regression

• Regression algorithms predict a discrete or a continuous value. In some cases, the predicted value can be used to identify the linear relationship between the attributes.
• Classification algorithms predict the target class (Yes/No). If the trained model predicts one of two target classes, it is known as binary classification.
25
An Example of Bayes Theorem
• Given:
• A doctor knows that meningitis causes stiff neck 50% of the time
• Prior probability of any patient having meningitis is 1/50,000
• Prior probability of any patient having stiff neck is 1/20

• If a patient has stiff neck, what’s the probability he/she has meningitis?

P(M | S) = P(S | M) P(M) / P(S) = (0.5 × 1/50000) / (1/20) = 0.0002

26
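The same computation as a short check in Python:

```python
# Bayes' theorem for the meningitis example: P(M|S) = P(S|M) P(M) / P(S)
p_s_given_m = 0.5      # stiff neck given meningitis
p_m = 1 / 50_000       # prior probability of meningitis
p_s = 1 / 20           # prior probability of stiff neck

print(p_s_given_m * p_m / p_s)  # 0.0002
```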
Naïve Bayes Classification model
•Naïve Bayes Classifier is one of the simplest and most effective classification algorithms; it helps in building fast machine learning models that can make quick predictions.
•It is a probabilistic classifier, which means it predicts on the basis of the probability of an object.
•Naïve: It is called Naïve because it assumes that the occurrence of a certain feature is independent of the occurrence of other features. For example, if a fruit is identified on the basis of color, shape, and taste, then a red, spherical, and sweet fruit is recognized as an apple. Hence each feature individually contributes to identifying it as an apple, without depending on the others.
•Bayes: It is called Bayes because it depends on the principle of Bayes' Theorem.
•Bayes' theorem is also known as Bayes' Rule or Bayes' law, and is used to determine the probability of a hypothesis with prior knowledge. It depends on conditional probability.
•The formula for Bayes' theorem is: P(A|B) = P(B|A) P(A) / P(B)
•Where,
•P(A) is the Prior Probability: probability of the hypothesis before observing the evidence.
•P(B) is the Marginal Probability: probability of the evidence.
•P(A|B) is the Posterior Probability: probability of hypothesis A given the observed event B.
•P(B|A) is the Likelihood: probability of the evidence given that the hypothesis is true.
27
Naïve Bayes Classification model Solved Example#1
If the weather is sunny, then the Player should play or not?

28
Naïve Bayes Classification model Solved Example#1
Step-1 Frequency table for the Weather Conditions:

Step-2 Likelihood table weather condition:

29
Naïve Bayes Classification model Solved Example#1
Step-3 Applying Bayes Theorem

30
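The slide's frequency and likelihood tables are images, so the sketch below uses the counts from the widely circulated version of this weather example (14 days, 10 "Yes" and 4 "No"; 5 sunny days, of which 3 are "Yes"); treat these numbers as assumptions rather than the slide's exact table.

```python
# Counts assumed for the sunny-weather example (see note above).
total, yes, no, sunny = 14, 10, 4, 5
sunny_yes, sunny_no = 3, 2

# Bayes' theorem: P(Yes|Sunny) = P(Sunny|Yes) P(Yes) / P(Sunny)
p_yes = (sunny_yes / yes) * (yes / total) / (sunny / total)
p_no = (sunny_no / no) * (no / total) / (sunny / total)

# 0.6 > 0.4, so the player should play on a sunny day.
print(round(p_yes, 2), round(p_no, 2))
```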
K-Nearest Neighbor(KNN) Algorithm
• Choosing the value of k:
• If k is too small, sensitive to noise
points
• If k is too large, neighborhood may
include points from other classes
• Higher values of k provide smoothing
that reduces the risk of overfitting
due to noise in the training data
• Value of k can be chosen based on
error rate measures
• We should also avoid over-smoothing
by choosing k=n, where n is the total
number of tuples in the training data
set
31
K-Nearest Neighbor(KNN) Algorithm-Solved Example

• Let us consider the data given in the table above, consisting of 10 entries.

32
K-Nearest Neighbor(KNN) Algorithm-Solved Example
The distance between the new point and each training point is
calculated.

33
K-Nearest Neighbor(KNN) Algorithm-Solved Example
The distance between the new point and each training point is calculated using either of
the forms

34
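The two forms referred to are typically the Euclidean and Manhattan distances; assuming those, a minimal sketch:

```python
import math

def euclidean(p, q):
    """Square root of the sum of squared coordinate differences (straight-line)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of absolute coordinate differences (city-block)."""
    return sum(abs(a - b) for a, b in zip(p, q))

print(euclidean((4, 8), (3, 7)), manhattan((4, 8), (3, 7)))
```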
K-Nearest Neighbor(KNN) Algorithm-Solved Example
The closest k data points are selected (based on the distance). In this
example, points 1, 5, 6 will be selected if the value of k is 3.

35
K-Nearest Neighbor(KNN) Algorithm-Solved Example
• Select the k value. This determines the
number of neighbors we look at when
we assign a value to any new
observation.
• In our example, for a value k = 3, the
closest points are ID1, ID5 and ID6.

36
K-Nearest Neighbor(KNN) Algorithm-Solved Example
• In our example, for a value k = 5, the
closest points are ID1, ID4, ID5, ID6
and ID10.

37
K-Nearest Neighbor(KNN) Algorithm
Advantages:
• It is simple to implement.
• It is robust to noisy training data.
• It can be more effective if the training data is large.
Disadvantages:
• The value of K always needs to be determined, which may be complex at times.
• The computation cost is high because the distance between the test point and all training samples must be calculated.

38
What is Regression?

Regression analysis is a way of mathematically sorting out which variables do indeed have an impact.

• Regression analysis is a statistical method to model the relationship/correlation between a dependent (target) variable and one or more independent (predictor) variables.
• Regression analysis helps us to understand how the value of the dependent variable changes with an independent variable when the other independent variables are held fixed.
39
Solved Example-Finding the best fit line
• The data consist of the samples:

Works Assigned (X): 1, 2, 3, 4
Total hours (Y): 3, 4, 5, 7

X | Y | XY | X²
1 | 3 | 3 | 1
2 | 4 | 8 | 4
3 | 5 | 15 | 9
4 | 7 | 28 | 16
Σx = 10 | Σy = 19 | Σxy = 54 | Σx² = 30

Line of best fit: y = a + bx, with

a = (Σy · Σx² − Σx · Σxy) / (n Σx² − (Σx)²) = (19·30 − 10·54) / (4·30 − 10²) = 1.5

b = (n Σxy − Σx Σy) / (n Σx² − (Σx)²) = (4·54 − 10·19) / (4·30 − 10²) = 1.3

Line of best fit: y = 1.5 + 1.3x
40
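The same arithmetic in a few lines of Python, as a check on the fit:

```python
# Recompute the best-fit line y = a + b*x from the table sums.
xs, ys = [1, 2, 3, 4], [3, 4, 5, 7]
n = len(xs)
sx, sy = sum(xs), sum(ys)                 # 10, 19
sxy = sum(x * y for x, y in zip(xs, ys))  # 54
sx2 = sum(x * x for x in xs)              # 30

a = (sy * sx2 - sx * sxy) / (n * sx2 - sx ** 2)  # intercept: 1.5
b = (n * sxy - sx * sy) / (n * sx2 - sx ** 2)    # slope: 1.3
print(f"y = {a} + {b}x")
```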
Logistic Regression
• Logistic regression estimates the probability of a certain event occurring using the odds ratio, by calculating the logarithm of the odds.
• It uses maximum likelihood estimation (MLE) to transform the probability of an event occurring into its odds, yielding a nonlinear model.

• The odds ratio is the probability of occurrence of a particular event over the probability of non-occurrence, providing an estimate of the magnitude of the relationship between binary variables, i.e., the probability of success divided by the probability of failure. 41
Example #1

Example #2

42
What Logistic Regression predicts?
• Probability of Y occurring given known values for X(s).
• In logistic regression, the dependent variable is transformed into the natural log of the odds. This is called the logit (short for logistic probability unit).

• Probabilities, which range between 0.0 and 1.0, are transformed into odds that range between 0 and infinity; the model applies a sigmoid function to a linear combination of the input features to map the result back into the range 0 to 1.
• If the probability of membership in the modeled category is above some cut point (the default is 0.50), the subject is predicted to be a member of the modeled group. Example: defaults on their payment.
• If the probability is below the cut point, the subject is predicted to be a member of the other group. Example: does not default on their payment.
• For any given case, logistic regression computes the probability that a case with a particular set of values for the independent variables is a member of the modeled category.
43
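A minimal sketch of the sigmoid and the 0.50 cut-point rule described above (the function and group names are illustrative):

```python
import math

def sigmoid(z):
    """Map a linear combination of inputs to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))

def predict(z, cut_point=0.50):
    """Assign the modeled group when the probability exceeds the cut point."""
    return "modeled group" if sigmoid(z) >= cut_point else "other group"

print(sigmoid(0.0))   # 0.5
print(predict(1.2))   # modeled group
print(predict(-1.2))  # other group
```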
Logistic Regression-Solved Example#1
A dataset consists of women and men Instagram users, with a sample size of 1069. Let the probability of men and women using Instagram be P_men and P_women respectively. The sample proportion of women who are Instagram users is given as 61.08%, and the sample proportion for men is 43.98%. The difference is 0.170951, and the 95% confidence interval is (0.111429, 0.2292). Establish a logistic regression model that specifies the relationship between p and x.

Odds = P₀ / (1 − P₀) = success / failure

Solution
Logistic regression equation for women: log(P_women / (1 − P_women)) = β₀ + β₁
Logistic regression equation for men: log(P_men / (1 − P_men)) = β₀

44
Logistic Regression-Solved Example#1 (Contd.)
Odds for women = P_women / (1 − P_women) = 0.6108 / (1 − 0.6108) = 1.5694
Odds for men = P_men / (1 − P_men) = 0.4398 / (1 − 0.4398) = 0.7851

Log of odds for women = log(1.5694) = 0.4507 = β₀ + β₁
Log of odds for men = log(0.7851) = −0.2419 = β₀

β₀ = −0.2419
Slope β₁ = log(odds for women) − log(odds for men) = 0.4507 − (−0.2419) = 0.6926

Best-fit regression equation: y = β₀ + β₁x = −0.2419 + 0.6926x

45
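The same calculation in Python, using natural logarithms; the printed values match the slide's β₀ = −0.2419 and β₁ = 0.6926 up to rounding:

```python
import math

p_women, p_men = 0.6108, 0.4398

odds_women = p_women / (1 - p_women)  # ~1.5694
odds_men = p_men / (1 - p_men)        # ~0.7851

beta0 = math.log(odds_men)            # ~ -0.242
beta1 = math.log(odds_women) - beta0  # ~ 0.693

print(round(beta0, 4), round(beta1, 4))
```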
Model Estimation and Evaluation

46
Principal Component Analysis (PCA)#Solved Example-1
Consider the two-dimensional patterns (2, 1), (3, 5), (4, 3), (5, 6), (6, 7), (7, 8). Compute the principal component using the PCA algorithm.
Principal Component Analysis is a popular unsupervised learning technique for reducing the dimensionality of large data sets.
Step 1: Get the data.

47
Principal Component Analysis (PCA)#Solved Example-1

48
Principal Component Analysis (PCA)#Solved Example-1

49
Principal Component Analysis (PCA)#Solved Example-1

50
Principal Component Analysis (PCA)#Solved Example-1

51
Principal Component Analysis (PCA)#Solved Example-1

52
Principal Component Analysis (PCA)#Solved Example-1

53
Principal Component Analysis (PCA)#Solved Example-1

54
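A short numpy sketch of the same PCA steps on the six patterns: center the data, form the covariance matrix, and take the eigenvector with the largest eigenvalue as the first principal component.

```python
import numpy as np

X = np.array([[2, 1], [3, 5], [4, 3], [5, 6], [6, 7], [7, 8]], dtype=float)

centered = X - X.mean(axis=0)            # subtract the mean (4.5, 5.0)
cov = np.cov(centered, rowvar=False)     # 2x2 covariance matrix

eigvals, eigvecs = np.linalg.eigh(cov)   # eigen-decomposition
pc1 = eigvecs[:, np.argmax(eigvals)]     # first principal component
print(pc1)
```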
IoT Data Analytics
• IoT data analytics refers to the procedure of gathering, examining, and interpreting data produced by IoT devices to gain knowledge and make wise decisions.
• Data analytics combines hardware, software, and data science techniques to extract accurate information from the massive data created by IoT devices.

55
IoT Data Analytics-Components
•Data Collection − IoT devices are embedded with various sensors that collect data on
different parameters such as temperature, humidity, pressure, and motion. This data is
transmitted to a central server or cloud platform for further processing.
•Data Storage − The data generated by IoT devices is massive and needs to be stored
efficiently.
•Data Processing − IoT data analytics involves processing data to extract valuable insights.
To make sure the data is correct, consistent, and prepared for analysis, data processing
procedures including data cleansing, data transformation & data normalization are utilized.
•Data analysis − To find patterns and trends in the data, statistical & machine learning
algorithms are employed.
•Data Visualization − IoT data analytics involves the use of data visualization tools to
present insights and findings in a user-friendly and understandable format. Visualization
tools like dashboards, charts & graphs help to understand the data quickly and then make
decisions in a very logical and practical way. So, they can give an informed decision based
on the insights derived from IoT data analysis.
56
IoT Data Analytics-Challenges
•Data Security − IoT devices generate sensitive data that can be vulnerable to cyberattacks. Every organization must make sure that IoT data is stored securely and that only authorized people can access it.
•Data Privacy − IoT devices collect personal data such as location, health, and behaviour. Organizations should ensure that all these data are collected and used in compliance with privacy regulations.
•Data Quality − IoT data can be noisy and inconsistent. Organizations need to ensure that IoT data is accurate, consistent, and reliable for analysis.
•Scalability − IoT data is generated at a massive scale. Organizations need to ensure that their IoT data analytics infrastructure can scale to handle large volumes of data.
•Interoperability − IoT devices come from different manufacturers and have different protocols & standards. All this makes it difficult to integrate & analyze data from different sources. Interoperability challenges can lead to fragmented data storage, reduced efficiency, and increased costs. Organizations need to ensure that their IoT data analytics infrastructure can integrate data from different sources and platforms seamlessly.
57
IoT Data Analytics-Applications
• Predictive Maintenance − IoT data analytics is used to predict when equipment is likely to fail. By analyzing
the data generated by sensors embedded in machines, organizations can identify patterns that indicate potential
equipment failure. It enables organizations to schedule maintenance before a failure occurs, reducing downtime
and increasing efficiency.
• Energy Management − IoT data analytics is used to monitor and optimize energy consumption in buildings.
By analyzing data on energy usage, temperature, and occupancy, organizations can identify areas where energy
usage can be reduced. It helps organizations save money on energy costs and reduce their carbon footprint.
• Supply Chain Optimization − IoT data analytics is used to optimize supply chain operations. By analyzing
data on inventory levels, transportation routes & delivery times, organizations can identify areas where supply
chain processes can be improved. It helps organizations reduce costs and improve customer satisfaction.
• Smart Cities − IoT data analytics is used to make cities more efficient and sustainable by analyzing traffic patterns, air quality, and energy usage. With this, cities can identify the areas that need improvement.
• Healthcare − IoT data analytics is used to monitor patients remotely, collect vital signs data & provide
personalized healthcare. By analyzing patient data, healthcare providers can identify patterns that indicate
potential health issues, enabling them to intervene early and provide more effective treatment. IoT data analytics
can also help healthcare providers improve operational efficiency by optimizing resource allocation and
reducing wait times.
58
Cloud Computing for IoT
• Cloud Internet of Things (IoT) uses cloud
computing services to collect and process
data from IoT devices, and to manage the
devices remotely.
• The scalability of cloud IoT platforms
enables the processing of large amounts of
data, as well as artificial intelligence (AI) and
analytics capabilities.
• Cloud IoT is a technology architecture that
connects IoT devices to servers housed in
cloud data centers. This enables real-time
data analytics, allowing better, information-
driven decision making, optimization, and
risk mitigation. Cloud IoT also simplifies
management of connected devices at scale.
59
Cloud Computing for IoT
Cloud IoT is different from traditional, or non-cloud-based IoT in a
few key ways:
• Data Storage: the cloud collects IoT data generated by thousands or millions of IoT sensors, with the data being stored and processed in a central location, while in other types of IoT architectures data may be stored and processed on-premises.
• Scalability: cloud IoT is highly scalable, as cloud infrastructure (compute,
storage, and networking resources) can easily handle thousands of devices and
process their data across large systems.
• Flexibility: cloud IoT provides a high level of flexibility, as it allows devices to
be added or removed as-needed, without having to reconfigure the entire system.
• Maintenance: in cloud IoT, the maintenance of servers and networking
equipment is handled by the cloud service provider (CSP). While in other types
of IoT architectures, maintenance may be the responsibility of the end user.
• Cost: cloud IoT can be more cost-effective over the long-term, as users only pay
for the resources they actually consume, and users do not have to invest upfront
in their own expensive compute, storage, and networking infrastructure. 60
Cloud Computing for IoT

• Cloud IoT connects IoT devices – which collect and transmit data – to cloud-based
servers via communication protocols such as MQTT and HTTP and over wired and
wireless networks. These IoT devices can be managed and controlled remotely and
integrated with other cloud services.
• A cloud IoT system typically includes the following elements:
• IoT Devices: physical devices, such as sensors and actuators, that generate and
transmit data to the cloud
• Connectivity: communication protocols and standards used to connect the IoT
devices to the cloud. Examples of protocols include MQTT and HTTP, while
examples of standards are Wi-Fi, 4G/LTE, 5G, Zigbee, and LoRa (long range). 61
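As an illustration of device-to-cloud connectivity, the sketch below publishes one sensor reading over MQTT with the paho-mqtt library (1.x-style API); the broker address, device ID, and topic name are placeholders.

```python
import json
import paho.mqtt.client as mqtt

client = mqtt.Client()
client.connect("broker.example.com", 1883)  # hypothetical broker, default port

# One sensor reading sent as JSON on a hypothetical telemetry topic.
reading = {"device_id": "sensor-01", "temperature": 22.5}
client.publish("devices/sensor-01/telemetry", json.dumps(reading))
client.disconnect()
```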
Cloud Computing for IoT

• Cloud Platforms: cloud service providers (CSPs) that offer infrastructure and services
to connect to the IoT devices. Examples include AWS IoT and Azure IoT
• Data Storage: cloud-based storage for data generated by the IoT devices, which can be
housed in repositories such as a database, data warehouse, or data lake
• Application Layer or API: cloud IoT platforms typically provide a native application –
for analytics, machine learning (ML), and visualization – or application programming
interface (API) – for data processing. Usually, applications offer the ability to manage
and monitor the IoT devices for provisioning, software updates, and troubleshooting
• Security: measures put in place to secure the data and IoT devices, such as encryption,
authentication, and access control 62
Cloud Based platforms
• A Cloud platform hosts the server hardware and
operating system in a web-based data center.
• The platform enables the coexistence of hardware and
software, and it offers remote connectivity to compute
services at scale.
• Businesses leveraging a cloud platform can remotely
access a variety of pay-per-use computing services,
including databases, servers, analytics, storage,
intelligence, networking, and software.
• Organizations do not have to build and own computing
infrastructure or data centres. They only pay for the
services they use.
• The cloud platforms permit enterprises to create and test
applications, as well as store, retrieve, analyze, and back
up data. Companies can also embed business intelligence
into operations, stream videos or audios, and deliver on-
demand software on a worldwide scale.
63
Cloud Based platforms
• Public Cloud Platforms: A public cloud platform is a third-
party cloud service provider that delivers scalable computing Top Cloud Platforms
resources via the Internet. Typical examples of public cloud 1.Amazon Web Services
platforms include IBM Bluemix, Microsoft Azure, Google (AWS) IoT Platform.
Cloud Platform, and AWS (Amazon Web Services). 2.Microsoft Azure IoT.
• Private Cloud: Private cloud platforms are managed by an 3.Google IoT.
organization's internal IT department. They use existing
4.IBM Watson IoT.
infrastructure and resources that already exist at a company's
on-premises data center. Private cloud platforms offer the 5.Cisco IoT Cloud Connect.
highest level of cybersecurity. 6.ThingsBoard Open-Source
• Hybrid Cloud Platforms: Hybrid clouds, which combine IoT Platform.
private and public cloud platforms, provide both scalability 7.Oracle IoT Intelligent
and security. It allows enterprises to seamlessly move Applications.
applications and data between private and public cloud
platforms, offering increased flexibility and superior
optimization of infrastructure, compliance, and security.

64
Cloud Based platforms
• Platform-as-a-Service (PaaS): Platform-as-a-Service (PaaS) emerges as a dynamic cloud
computing solution, providing users with a comprehensive suite of hardware, software, and
resources to seamlessly develop, deploy, and manage applications without additional hardware
or software investments.
• PaaS proves invaluable for developers and individuals tasked with creating custom applications or seamlessly integrating existing ones into the cloud environment.
• With PaaS, users can develop and deploy applications without the complexities of infrastructure provisioning.
Use Cases for PaaS
1.Web Application Hosting: PaaS can host applications requiring frequent updates without managing the
underlying infrastructure. This makes it easier to deploy and scale applications.
2.Mobile App Development: PaaS can be used to develop and deploy mobile applications more quickly, as it
provides access to ready-made components and services.
3.Big Data Analytics: PaaS can process and analyze large amounts of data quickly and cost-effectively, as it
provides access to powerful computing resources.
4.IoT Solutions: PaaS can be used to develop and manage connected devices and applications, as it provides
access to scalability and secure communication infrastructure.
5.DevOps Automation: PaaS can be used to automate development and operations processes, such as
deployment, testing, and monitoring, which helps to ensure faster and more reliable software releases.
65
Cloud Based platforms
• Infrastructure-as-a-Service (IaaS)
• Infrastructure-as-a-Service (IaaS) is a cloud computing solution that furnishes users with
virtualized computing components such as servers, storage, networks, and operating systems.
• It is optimal for those seeking more control over their infrastructure while avoiding physical
hardware costs.
Use Cases for IaaS
1.Web Hosting: IaaS can host web-based applications and websites, providing users access to the underlying
infrastructure and computing resources.
2.Application Development and Testing: IaaS can be used to develop and test software applications, as it
provides users with access to the underlying infrastructure and computing resources.
3.Database Hosting: IaaS can host databases as it provides users access to the underlying infrastructure and
computing resources.
4.Disaster Recovery: IaaS can be used for disaster recovery, as it allows users to quickly provision additional
resources from the cloud to restore their data and systems.
5.Big Data Analytics: IaaS can store, process and analyze large amounts of data, providing users with access to
the underlying infrastructure and computing resources.
6.IoT Deployment: IaaS can deploy and manage large-scale Internet of Things (IoT) solutions, as it provides
users with access to the underlying infrastructure and computing resources.
66
Cloud Based platforms
• Software-as-a-Service (SaaS): As a cloud computing solution, it provides users
seamless access to software applications via the internet. These web-based programs
can be utilized from any device with an internet connection, eliminating the need for
local installations. SaaS caters to individuals and organizations seeking efficient
access to specific software programs, enabling enhanced collaboration, scalability, and
flexibility without the burden of software management.
Use Cases for SaaS
1.Email and Collaboration: Email and collaboration tools such as Google Apps and Office 365 are popular SaaS
applications for communication and productivity.
2.CRM: Customer relationship management (CRM) tools such as Salesforce and Zendesk provide businesses with
a platform to manage customer data, automate sales and marketing operations, and track customer engagement.
3.E-commerce: E-commerce platforms such as Shopify, BigCommerce, and Magento provide businesses with a
complete solution to create and manage their online stores.
4.Project Management: Project management and task management tools such as Asana, Trello, and Basecamp
are popular SaaS applications used to manage projects, tasks, and timelines.
5.Accounting: Accounting and bookkeeping tools such as QuickBooks Online and Xero provide businesses with
an easy way to track financials and keep their books in order.
6.Human Resources: Human resource management (HRM) tools such as BambooHR and Zenefits provide
businesses with a platform to manage employee data and automate HR processes.
67
ML for Cloud IoT Analytics
• IoT and ML deliver insights otherwise hidden in data, enabling rapid, automated responses and improved decision making. ML for IoT can be used to project future trends, detect anomalies, and augment intelligence by ingesting image, video, and audio data.
• ML can help demystify the hidden patterns in IoT data by analyzing massive volumes of data using sophisticated algorithms.

Cumulocity is a platform that enables the management, monitoring, and analysis of Internet of Things (IoT) devices and data.
68
ML for Cloud IoT Analytics
• ML inference can
supplement or replace
manual processes with
automated systems using
statistically derived actions
in critical processes.
• With ML for IoT,
• Ingest and transform data
into a consistent format
• Build a machine learning
model
• Deploy this machine
learning model on cloud,
edge and device

69
Benefits of ML for Cloud IoT Analytics
• Simplify ML model training: Cumulocity IoT ML is designed to help you
quickly build new ML models in an easy manner. Auto ML support allows the
right ML model to be chosen for you based on your data, whether that be
operational device data captured on the Cumulocity IoT platform or historical
data stored in big data archives.
• Flexibility to use your data science library of choice: There are a wide variety
of data science libraries available (e.g., Tensorflow®, Keras, Scikit-learn) for
developing ML models. Cumulocity IoT ML allows models to be developed in
data science frameworks of your choice. These models can be transformed into
industry-standard formats using open source tools and made available for scoring
within Cumulocity IoT.
• Rapid model deployment to operationalize ML quickly: Cumulocity IoT ML allows easy deployment of models, whether created within the platform or imported from other frameworks. With just one click, models can be deployed in cloud or edge environments. Operationalized models are easily monitored and updated as needed.
70
Benefits of ML for Cloud IoT Analytics
• Prebuilt connectors for operational & historical datastores: Cumulocity IoT ML
offers seamless access to operational and historical datastores for model training. It
retrieves data periodically, processes it through automated pipelines, and trains ML
models. Data can be stored on Amazon® S3, Microsoft® Azure® Data Lake
Storage, or locally, and accessed using Cumulocity IoT DataHub connectors.
• Integration with Cumulocity IoT Streaming Analytics:
Cumulocity IoT ML helps to quickly analyze real-time IoT data in Cumulocity IoT
Streaming Analytics. With a user-friendly interface, it lets you use ML models to
analyze data without needing to write any code.
• Notebook integration: Jupyter Notebook, a standard in data science, provides an interactive environment across programming languages. Notebooks can be used to prepare and process data, and to train, deploy, and validate machine learning models. This open-source web application is integrated with Cumulocity IoT ML.

71
ML for Cloud IoT Analytics-Challenges
ML-based cloud computing also poses some challenges for IoT applications, such as latency, bandwidth, reliability, and interoperability.
• Latency means the delay between sending and receiving data, which can affect the performance and responsiveness of IoT devices and applications.
• Bandwidth means the amount of data that can be transferred over a network, which can limit the data volume and quality of IoT devices and applications.
• Reliability means the availability and consistency of cloud services, which can be affected by network failures, outages, or disruptions.
• Interoperability means the ability of different cloud services and IoT devices to communicate and work together, which can be hindered by incompatible standards, protocols, or formats.

72
Topics in Module-4-IoT-Cloud Convergence
• Opportunities and
Challenges
• Architectures for
convergence
• Data offloading and
computation
• Dynamic Resource
Provisioning
• Security Aspects

1
IoT Cloud Convergence
• IoT-Cloud convergence refers to the integration and collaboration between Internet of Things (IoT) devices and cloud computing infrastructure.
• It involves leveraging the capabilities of cloud platforms to enhance the efficiency, scalability, and functionality of IoT applications.
• The convergence of IoT and cloud computing brings several benefits and enables the development of more powerful and sophisticated IoT solutions.

From the accompanying diagram:
• Cloud: represents the cloud infrastructure providing storage, data processing, and various services.
• IoT-Cloud Gateway: manages communication between IoT devices and the cloud; integrates data from IoT devices into the cloud platform.
• Edge Computing: localized processing and analytics near the IoT devices; reduces latency by handling data processing at the edge.
• Fog Computing: intermediate layer between edge and cloud; performs additional processing and analysis before data is sent to the cloud.
• IoT Devices: sensors and actuators generating and receiving data; connected to the cloud through the IoT-Cloud gateway.
2
IoT Cloud Convergence architecture

3
IoT Cloud Convergence
1. Data Storage and Management:
• Cloud platforms provide scalable and reliable storage solutions for the vast amount of data
generated by IoT devices.
• Historical data can be stored in the cloud for long-term analysis, compliance, and auditing
purposes.
2. Data Processing and Analytics:
• Cloud computing enables powerful data processing and analytics, allowing for real-time and
batch processing of IoT data.
• Advanced analytics, machine learning, and artificial intelligence can be applied to gain
actionable insights from IoT data.
3. Scalability:
• Cloud infrastructure provides on-demand scalability, allowing IoT solutions to handle varying
workloads and accommodate a growing number of devices.
• Auto-scaling features ensure that computational resources can be dynamically adjusted based
on demand.
4
IoT Cloud Convergence
• 4. Device Management:
• Cloud platforms offer centralized device management capabilities, allowing for remote
monitoring, configuration, and firmware updates of IoT devices.
• Device health and status information can be easily tracked and managed through cloud-
based services.
• 5. Security and Authentication:
• Cloud services provide robust security features, including encryption, access control, and
identity management.
• Centralized security measures help protect IoT devices and data from potential threats.
• 6. Real-time Communication:
• Cloud-based messaging and communication services facilitate real-time interaction between
IoT devices and applications.
• Push notifications, alerts, and commands can be delivered efficiently through cloud
infrastructure.
5
IoT Cloud Convergence
7. Cost Optimization:
• Cloud services allow organizations to pay for the resources they consume, promoting cost
efficiency.
• It eliminates the need for significant upfront investments in infrastructure and provides
flexibility in resource utilization.
• 8. Edge and Fog Computing Integration:
• IoT-Cloud convergence often involves integrating edge and fog computing models to
distribute processing closer to the data source.
• Edge and fog computing reduce latency and bandwidth usage by performing initial
processing near IoT devices.
• 9. Standardization and Interoperability:
• Cloud platforms often adhere to industry standards, promoting interoperability among diverse
IoT devices.
• Standardized protocols and APIs facilitate seamless communication and integration within
the IoT ecosystem. 6
Role of cloud computing in IoT
• IoT works on a diverse
selection of devices that
fit into the requirements
of different industries.
• Cloud computing
involves storing and
accessing data,
applications, or services
over the internet — that
is, in the cloud —
instead of in physical
servers or mainframes

7
Advantages of IoT-cloud convergence
• Remote operation and compatibility: IoT devices on their own lack compatibility and resources, but cloud integration enables remote tasks like asset data gathering and maintenance on deployed devices.
• Unlimited data storage: The IoT and cloud technologies efficiently handle
unstructured data from multiple sensors, providing more space for data aggregation
and analysis when combined.
• Unlimited processing capabilities: IoT devices and apps have limited processing
capabilities, but cloud integration allows for unlimited virtual processing using AI
and ML for data-driven decision-making and improvements.
• Added security measures: Cloud technology enhances IoT security by improving
authentication mechanisms and device identity verification.

8
IoT-cloud convergence Challenges and solutions

The IT team focuses on network infrastructure and cloud services, while the OT team is responsible for industrial automation and control systems.
9
IoT-cloud convergence Challenges in secure IoT-cloud environments
• Risk arising from centralizing the entry into critical infrastructure. The integration of
IoT and cloud reduces attack surface by restricting traffic through API gateways. However,
a high-end firewall is needed to protect data flow, as a single entry could lead to potential
attackers entering the infrastructure.
• Unsecure communication and data flow between the edge and the cloud: If the
endpoints or the cloud has inadequate security features, such as authentication and
encryption, this could put access controls and the integrity of the data sent between these
two points at risk.
• Privacy and authorization issues: Enterprises must carefully consider IoT devices' handling of sensitive data, particularly in cloud-based ecosystems, and ensure that data location is discussed with cloud service providers.
• Poor implementation of the IoT: Inadequate security measures in an IoT ecosystem (unchanged default passwords, missing network segmentation, weak physical device security, or access without a timeout) can lead to vulnerabilities even with cloud integration.
• Cloud misconfiguration and other vulnerabilities: Misconfiguration in cloud computing
allows malicious actors to conduct attacks, potentially causing severe consequences for the
IoT ecosystem it's part of. 10
Architectures for IoT convergence
• The convergence of IoT and cloud
computing involves the integration of
IoT devices with cloud services to
enable efficient data processing, storage,
and analysis.
• Some common architectures for IoT
cloud convergence:
• Fog Computing Architecture
• Edge Computing Architecture
• Hierarchical Architecture
• Client-Server Model
• Microservices Architecture
• Serverless Architecture
• Containerization
11
Architectures for IoT convergence-Fog Computing
Architecture
• Fog computing brings computational
resources closer to the edge of the
network, reducing latency and
bandwidth usage.
• IoT devices communicate with
nearby fog nodes that can process
data locally before sending relevant
information to the cloud.
• Fog nodes act as intermediaries,
providing real-time processing and
decision-making capabilities.
12
Architectures for IoT convergence –Edge Architecture

• Similar to fog computing, edge


computing places computational
resources closer to the IoT
devices at the network's edge.
• Edge devices perform initial data
processing and filtering, sending
only relevant information to the
cloud for further analysis.
• Reduces latency and bandwidth
usage by processing data at the
source.
13
Architectures for IoT convergence- hierarchical
architecture
• In a hierarchical architecture, IoT devices
are organized into layers, each responsible
for specific tasks.
• The lower layers handle data acquisition
(collecting raw data from sensors) and
preliminary processing (basic filtering,
formatting), while higher layers manage
aggregation, analysis, and communication
with the cloud.
• The lower layers are primarily concerned with capturing data at the source and preparing it for further processing. The higher layers handle more sophisticated tasks such as data aggregation, analysis, and communication with external systems.
• Enables a scalable and organized approach to handling large-scale IoT deployments.
14
Architectures for IoT convergence -Client-
Server Model
• IoT devices act as clients that
collect and transmit data to
cloud servers.
• Cloud servers host
applications, databases, and
analytics engines for
processing the received data.
• Simple and straightforward,
suitable for scenarios where
latency is not a critical
concern.

15
Architectures for IoT convergence-Microservices
Architecture
• Decomposes the overall system
into small, independent, and
modular services.
• Each service, or microservice,
performs a specific function,
allowing for scalability and
flexibility.
• Enables the development and
deployment of services for
handling various aspects of IoT,
such as data storage, analytics, and
device management.
16
Architectures for IoT convergence- Serverless
Architecture
• In a serverless
architecture, developers
focus on writing code
without managing the
underlying infrastructure.
• Cloud providers
automatically handle the
scaling and execution of
functions in response to
IoT events.
• Offers cost-effectiveness and scalability, as
resources are allocated on-demand.
17
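As an illustration, a serverless function in the AWS Lambda handler style might look like the sketch below; the event fields and threshold are illustrative, not a fixed schema.

```python
import json

def handler(event, context):
    """Invoked per IoT event; no server management is required."""
    temperature = event.get("temperature")
    alert = temperature is not None and temperature > 30.0
    return {"statusCode": 200, "body": json.dumps({"alert": alert})}

# Local smoke test:
print(handler({"temperature": 31.2}, None))  # alert: true
```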
Architectures for IoT convergence: Containerization
• Containerization technologies,
such as Docker, can be used to
package IoT applications and
their dependencies into portable
containers.
• Containers provide consistency
across development, testing, and
deployment environments.
• Kubernetes can be employed for container orchestration (used to manage clusters of containers), managing the deployment and scaling of containerized applications.

In this architecture, applications are packaged along with their dependencies and libraries into lightweight, portable containers.
18
Data offloading and computation
• Data offloading refers to the
transfer of data from a mobile
device to a more powerful
computing resource, such as a
cloud server or edge computing
node, for processing and analysis.
This is done to alleviate the
computational burden on the
mobile device, which may have
limited processing capabilities,
memory, or battery life.
• IoT raises challenges for devising
efficient strategies that offload
applications to the fog or the cloud
layer while ensuring the optimal
response time for a service.

19
Data offloading and computation
• Data offloading and computation in IoT convergence involve the strategic distribution of
data processing tasks across different layers of the architecture, including edge devices,
fog nodes, and cloud servers.
• The goal is to optimize the use of resources, minimize latency, and enhance overall
system efficiency.
• Computation offloading policies assume the response time is only dominated by the
execution time.
• For the computation offloading problem, the majority of existing literature presents
efficient solutions considering a limited number of parameters (e.g., computation capacity
and network bandwidth) neglecting the effect of the application characteristics and
dataflow configuration.
• Offloading computation is based on the assumption that
(i) the response time is mostly determined by computation time
(ii) shifting the computation toward upper layers can reduce the total response time.
• However, this assumption may not always hold, as the migration is often a
communication–computation co-optimization problem. 20
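A toy model of that communication–computation trade-off: offloading only pays off when the transfer time plus remote execution time beats local execution. All parameter values below are illustrative.

```python
def local_time(cycles, local_hz):
    """Response time when executing entirely on the device."""
    return cycles / local_hz

def offload_time(data_bits, bandwidth_bps, cycles, remote_hz):
    """Transfer time plus remote execution time."""
    return data_bits / bandwidth_bps + cycles / remote_hz

# Illustrative numbers: 1e9 CPU cycles of work, 8 Mb of input data.
t_local = local_time(1e9, local_hz=1e8)               # 10.0 s on-device
t_cloud = offload_time(8e6, 2e6, 1e9, remote_hz=5e9)  # 4.0 s + 0.2 s
print("offload" if t_cloud < t_local else "run locally")
```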
Data offloading
• Edge Level:
• Description: Initial data processing occurs at the edge, close to the IoT devices where data is generated.
• Benefits: Reduces the need to send raw data to the cloud, minimizing latency and conserving bandwidth.
• Use Cases: Real-time processing of sensor data, immediate response to local events.
• Fog Level:
• Description: Intermediate fog nodes between edge and cloud that perform additional processing and filtering
of data.
• Benefits: Enables distributed computing, allowing for more complex analysis than at the edge alone.
• Use Cases: Aggregation of data from multiple edge devices, localized analytics.
• Cloud Level:
• Description: Centralized cloud servers handle resource-intensive tasks, large-scale analytics, and long-term
storage.
• Benefits: Provides scalability, global accessibility, and the ability to run advanced algorithms.
• Use Cases: Historical data analysis, machine learning model training, global insights.
• Hierarchical Offloading:
• Description: Divides the processing tasks into hierarchical levels, with initial processing at lower levels and
more extensive analysis at higher levels.
• Benefits: Optimizes resource usage by performing appropriate processing tasks at each level.
• Use Cases: Multi-tiered analysis, where different layers handle specific aspects of data processing.
21
Data Computation
• Edge Computing:
  • Description: Localized processing at the edge of the network, near the IoT devices.
  • Benefits: Reduces latency, enhances real-time decision-making, and conserves bandwidth.
  • Use Cases: Quick response to local events, filtering of irrelevant data at the source.
• Fog Computing:
  • Description: Fog nodes perform distributed computation, providing intermediate processing between edge and cloud.
  • Benefits: Improves response time, supports more complex analytics compared to edge-only processing.
  • Use Cases: Aggregated data analysis, real-time decision-making at an intermediate level.

From the accompanying diagram:
• Cloud: centralized cloud servers for resource-intensive tasks, large-scale analytics, and long-term storage.
• Fog: intermediate fog nodes performing additional processing and filtering before data is sent to the cloud.
• Edge: localized processing near IoT devices for initial data processing and filtering.
• Devices: sensors and devices generating data.
22
Data Computation
• Cloud Computing:
• Description: Centralized processing in the cloud for resource-intensive tasks, extensive analytics, and
storage.
• Benefits: Offers scalability, accessibility, and the ability to run complex algorithms.
• Use Cases: Machine learning model training, global analytics, storage of large datasets.
• Dynamic Computation Allocation:
• Description: Dynamically allocates processing tasks based on changing conditions, such as network
bandwidth and device capabilities.
• Benefits: Adapts to varying demands, optimizing resource utilization in real-time.
• Use Cases: Fluctuating workloads, adaptive resource allocation based on network conditions.
• Mobile Edge Computing (MEC):
• Description: Extends edge computing to mobile networks, allowing offloading of data processing tasks to
edge servers.
• Benefits: Lowers latency in mobile environments, supports real-time applications.
• Use Cases: Mobile IoT devices offloading computation to edge servers in a mobile network.
• Hybrid Approaches:
• Description: Combines elements of edge, fog, and cloud computing for a flexible and adaptable system.
• Benefits: Balances local and centralized processing based on specific application requirements.
• Use Cases: Hybrid architectures for diverse IoT applications.
23
Data offloading and computation systems-Response Time model
• Dataflow 1: The actuators
are located at the sensor layer,
so the action response is
issued at the processing unit
and is performed at the sensor
layer.
• Based on this scenario, the
data analysis can be
performed at the three layers:
the sensor node processing
unit, the gateway device
processing unit, and the
cloud processing unit.

24
Data offloading and computation systems-Response Time model

• Dataflow 2: Actuators
can be physically distant
from sensor nodes,
collecting data for
analysis and notification
on the application layer.
• Tele-monitoring IoT
systems typically follow
this dataflow, assuming
end-user connectivity.

25
Data offloading and computation systems-Response Time model
• The response time is defined as the time
difference between the moment when data is
collected for decision making and the moment
when the result is delivered to the consumer.
• Different data flows from sensors to consumers
may cause changes in the response time.
• The response time is a function of many factors
including contextual parameters and
application characteristics that can change over
time.
• Finding the optimal computation offloading policy for an unknown and dynamic system is critical, since the dynamics of the environment (e.g., network conditions, workload arrival at computing nodes, user traffic, and application characteristics) change over time.

26
Data offloading and computation systems-Response Time model

• (i) Where to offload: the


scheduler should determine where
the computation is offloaded,
depending on the variety of
parameters such as objectives,
availability of resources, and
required computation capacity for
performing computations.
• Optimum Solution: optimally
offloading workloads to more
capable computing resources.

27
Data offloading and computation systems-Response Time model
• (ii) When to offload: the
scheduler should determine
when the computation is
offloaded to upper layers to
achieve the required QoS due
to many uncertainties in the
system and environment such
as network congestion,
overloaded workload, and
device battery.
• Optimum Solution: Optimal
time scheduling of offloading
computations.

28
Data offloading and computation systems-Response Time model
• (iii) What to offload: the
computation scheduler should
determine what portion of
workloads is offloaded to upper
layers.
• According to this fact, offloading
solutions can be classified into
two classes:
• (i) full offloading where the whole
workloads are offloaded to
external resources such as fog or
cloud layers and
• (ii) partial offloading, where workloads are partitioned into parts to be executed locally or externally.
29
Data offloading and computation systems-Response Time model
• Resource allocation is classified into three main categories as resource placement,
resource scheduling, and computation offloading. Resource placement is about where
and how resources are placed in IoT systems.
• It aims to find optimal set resources in IoT systems to execute tasks or applications
while satisfying QoS (Quality of Service) requirements by optimizing specific
objective function (minimizing latency, minimizing energy consumption, etc.).
• Resource scheduling or scheduling in resource allocation is to determine when and
how many resources to allocate in IoT systems.
• The resource scheduling determines optimal scheduling of tasks, services, or
applications to be executed on resources in order to meet QoS requirements.
• Computation offloading is to determine where and how many resources can be moved to
execute tasks or applications. The technique in IoT context is the transfer of resource-
intensive computational tasks to a separate external device in the network.
• The technique of offloading computation over a network can provide computing power and overcome the limitations of an IoT-based device such as computational power, storage, and energy.
30
Dynamic Resource Provisioning
• Dynamic Resource Provisioning in IoT cloud convergence involves the automatic allocation and de-allocation of computational resources based on the changing demands of the system.
• This adaptive approach ensures efficient utilization of resources, scalability, and responsiveness to varying workloads.

From the accompanying diagram:
• A centralized system is responsible for managing and orchestrating resources across the entire IoT cloud convergence architecture.
• A core component handles auto-scaling, load balancing, edge/fog node provisioning, and other dynamic resource allocation strategies.
• Edge nodes provide localized processing near IoT devices; fog nodes intermediate processing between edge and cloud; cloud nodes centralized processing and storage.

31
Dynamic Resource Provisioning
1. Monitoring and Analysis:
• Continuous monitoring of IoT device status, network conditions, and application performance.
• Analysis of incoming data and workload patterns to identify resource requirements.
2. Resource Scaling:
• Vertical Scaling:
• Increase or decrease the computing power (CPU, memory) of existing virtual machines or
containers based on workload.
• Horizontal Scaling:
• Add or remove instances of virtual machines to balance the load and handle increased
demand.
3. Auto-scaling Policies:
• Define policies that trigger resource scaling based on predefined thresholds or performance
metrics.
• Parameters may include CPU utilization, memory usage, network traffic, or specific
application-level metrics.
32
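A minimal sketch of such a threshold-based policy (step 3 above); the utilization thresholds and instance bounds are illustrative.

```python
def scale_decision(cpu_util, instances, low=0.30, high=0.75,
                   min_instances=1, max_instances=10):
    """Add an instance above `high` utilization, remove one below `low`."""
    if cpu_util > high and instances < max_instances:
        return instances + 1
    if cpu_util < low and instances > min_instances:
        return instances - 1
    return instances

print(scale_decision(0.82, 3))  # 4 -> scale out
print(scale_decision(0.12, 3))  # 2 -> scale in
```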
Dynamic Resource Provisioning
4. Load Balancing:
• Distribute incoming requests or data streams evenly across multiple computing resources to
prevent overloading specific nodes.
• Adjust load balancing strategies dynamically based on current conditions.
5. Edge and Fog Node Provisioning:
• Dynamically allocate computing resources at the edge and fog layers to handle localized
processing.
• Adjust the number of edge and fog nodes based on the proximity to IoT devices and the intensity
of processing required.
6. Cloud Bursting:
• Offload excess processing tasks to the cloud during peak workloads.
• Automatically provision additional cloud resources as needed and release them when demand
decreases.
7. Predictive Analytics:
• Use historical data and machine learning models to predict future resource demands.
• Proactively allocate resources before a surge in demand occurs. 33
Dynamic Resource Provisioning
8. Policy-based Allocation:
• Define policies that govern how resources should be allocated based on specific application
requirements, quality of service (QoS), and cost considerations.
9. Real-time Adaptation:
• Adapt resource provisioning in real-time based on the evolving nature of IoT data and
application requirements.
• Respond quickly to sudden changes in workload or network conditions.
10. Automated Configuration Management:
• Use configuration management tools to automate the setup and configuration of new
resources.
• Ensure consistency and reliability across dynamically provisioned resources.
11. Feedback Loops:
• Implement feedback loops to continuously assess the effectiveness of resource provisioning.
• Adjust provisioning strategies based on the feedback received from monitoring and analytics.
34
Security aspects of IoT convergence

• Security is a critical aspect in IoT-Cloud convergence, as the integration of Internet of Things (IoT) devices with cloud computing introduces new challenges and vulnerabilities.
• Ensuring the security of data, devices, and communications is essential for the successful and safe deployment of IoT solutions.

Key security domains:
• Cloud Security
• Network Security
• IoT Device Security
• Access Control and Identity Management
• Security Monitoring
• Update and Patch Management
• Privacy Protection and Physical Security
35
Security aspects of IoT convergence
• Device Security:
• Authentication: Implement strong authentication mechanisms to ensure that only
authorized devices can access the network and cloud services.
• Secure Boot: Ensure that devices boot securely, verifying the integrity of their
firmware and software components.
• Device Identity: Assign unique identities to each device and manage these identities
securely.
• Data Encryption:
• End-to-End Encryption: Encrypt data both in transit and at rest to protect it from
unauthorized access.
• Data Integrity: Implement measures to ensure the integrity of data during
transmission and storage.
• Network Security:
• Secure Protocols: Use secure communication protocols (e.g., TLS/SSL) for data
transmission between devices, edge, and cloud.
• Firewalls and Intrusion Detection Systems (IDS): Deploy firewalls and IDS to monitor and protect the network from malicious activities.
36
Security aspects of IoT convergence
• Access Control:
• Role-Based Access Control (RBAC): Implement RBAC to restrict access to
resources based on the roles and responsibilities of users and devices.
• Least Privilege Principle: Grant the minimum level of access necessary for devices
and users to perform their functions.
• Cloud Security:
• Multi-Factor Authentication (MFA): Enforce MFA for accessing cloud services to
add an extra layer of security.
• Data Center Security: Ensure that the cloud provider maintains robust physical and
environmental security measures.
• Security Monitoring and Logging:
• Security Information and Event Management (SIEM): Implement SIEM solutions
to monitor and analyze security events across the IoT-Cloud infrastructure.
• Logging: Generate and analyze logs to detect and respond to security incidents.
37
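A minimal sketch of role-based access control with least privilege as the default: roles map to explicit permission sets, and anything not granted is denied. The role and permission names are illustrative assumptions, not a particular product's API.

# Minimal RBAC sketch: roles map to permission sets, and every request
# is checked against the caller's role (deny by default).
ROLE_PERMISSIONS = {
    "device":   {"telemetry:write"},
    "operator": {"telemetry:read", "device:configure"},
    "admin":    {"telemetry:read", "telemetry:write",
                 "device:configure", "user:manage"},
}

def is_allowed(role: str, permission: str) -> bool:
    # Unknown roles get the empty permission set (least privilege).
    return permission in ROLE_PERMISSIONS.get(role, set())

assert is_allowed("device", "telemetry:write")
assert not is_allowed("device", "user:manage")   # denied by default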
Security aspects of IoT convergence
• Update and Patch Management:
• Firmware Updates: Regularly update device firmware to patch vulnerabilities and
improve security.
• Patch Management: Apply security patches to cloud infrastructure and services to
address known vulnerabilities.
• Privacy Protection:
• Data Minimization: Collect and store only the data necessary for the intended
purpose to minimize privacy risks.
• Privacy Policies: Clearly communicate privacy policies to users and ensure
compliance with applicable data protection regulations.
• Physical Security:
• Physical Access Controls: Secure physical access to IoT devices, edge infrastructure,
and cloud data centers.
• Tamper Detection: Implement mechanisms to detect tampering or unauthorized
physical access to devices.
38
Security aspects of IoT convergence
• Incident Response and Recovery:
• Incident Response Plan: Develop and regularly test an incident response plan to
efficiently address and mitigate security incidents.
• Backup and Recovery: Implement backup and recovery procedures to minimize data
loss in the event of a security incident.
• Regulatory Compliance:
• Compliance Assessments: Ensure compliance with relevant regulations and
standards, such as GDPR (General Data Protection Regulation), or industry-specific
requirements.
• Secure Development Practices:
• Secure Coding Standards: Adhere to secure coding practices during the
development of IoT devices and cloud applications.
• Security Testing: Conduct regular security assessments, including penetration testing
and code reviews.
39
Top 10 web application security risks according to the Open Web Application Security Project (OWASP)
[Figure: OWASP Top 10 web application security risks]
40
Securing IoT-cloud convergence
• Monitor and secure the flow of data early in the process. Enterprises must secure IoT data flow by implementing edge monitoring and filtering tools to detect suspicious activity and anomalies, and to identify connected devices early in processing.
• Use cloud-based solutions to bring security
closer to the edge. Enterprises must protect
edge devices from physical and cyber threats,
using cloud-based solutions like fog computing
to enhance security and processing capabilities.
• Perform vulnerability checks regularly. Regular vulnerability testing can detect weaknesses in an ecosystem; enterprises can scope each test to specific components or to the entire ecosystem, provided the testing is carried out on a recurring schedule.
41
Securing IoT-cloud convergence
• Ensure continuous updates and patches. Enterprises can effectively and securely
distribute software updates using the cloud, which is crucial in preventing IoT
vulnerabilities from being exploited.
• Use secure passwords for both IoT devices and linked cloud services. Weak passwords
contribute to successful data breaches, necessitating stringent password policies for
enterprises, as IoT devices and cloud services are susceptible to intrusions due to guessable
credentials.
• Define a clear, effective, and detailed access control plan. Enterprises should develop a
comprehensive access control plan, identifying users, groups, and roles for detailed
authentication and authorization policies in the IoT-cloud ecosystem, applying the principle
of least privilege.
• Employ code or application security best practices. IoT-cloud ecosystems benefit
organizations by simplifying control and enabling remote usability through code or
applications. Best practices include static and dynamic application analysis.
• Adopt the shared responsibility model. Enterprises should consider the shared
responsibility model used by cloud service providers and employ cloud-specific security
solutions to ensure data protection in cloud-native systems. 42
Use Cases
• IoT-Cloud convergence offers a wide range of use cases across various industries, leveraging the combined
power of IoT devices and cloud computing.
• Smart Home Automation:
• IoT Devices: Smart thermostats, security cameras, doorbell cameras, lights, and appliances.
• Cloud Integration: Centralized control and monitoring of devices through a cloud platform. Users can
remotely access and manage their smart home devices, receive real-time notifications, and analyze
historical data for energy efficiency.
• Industrial IoT (IIoT) for Manufacturing:
• IoT Devices: Sensors on machinery, RFID tags on products, and monitoring devices for quality control.
• Cloud Integration: Real-time monitoring of production lines, predictive maintenance using cloud
analytics, and centralized control of manufacturing processes. The cloud enables data storage, analysis,
and seamless integration with other enterprise systems.
• Healthcare Monitoring:
• IoT Devices: Wearable health trackers, smart medical devices, and sensors for patient monitoring.
• Cloud Integration: Continuous monitoring of patient health, real-time data transmission to cloud
platforms for analysis, and storage of patient records. Cloud-based applications can provide healthcare
professionals with insights for personalized patient care.
43
Use Cases
• Environmental Monitoring:
• IoT Devices: Sensors for air quality, water quality, and climate conditions.
• Cloud Integration: Continuous monitoring of environmental parameters, early detection of pollution events,
and data analysis for environmental research. Cloud platforms facilitate collaboration between researchers and
government agencies.
• Smart Agriculture:
• IoT Devices: Soil sensors, weather stations, drones, and GPS-equipped tractors.
• Cloud Integration: Precision farming through real-time data analytics, weather predictions, and crop health
monitoring. Farmers can optimize irrigation, plan planting schedules, and receive insights for efficient resource
utilization.
• Connected Cars and Transportation:
• IoT Devices: Sensors in vehicles, GPS systems, and connectivity for in-car applications.
• Cloud Integration: Real-time tracking, monitoring vehicle health, and predictive maintenance. Cloud services
can provide traffic updates, navigation assistance, and facilitate over-the-air updates for vehicle software.
• Supply Chain Visibility:
• IoT Devices: RFID tags, GPS trackers, and temperature sensors for goods in transit.
• Cloud Integration: Real-time tracking of shipments, monitoring of environmental conditions during
transportation, and data-driven insights for supply chain optimization. Cloud platforms enhance transparency
and collaboration across the entire supply chain.
44
Use Cases
• Smart Cities:
• IoT Devices: Smart streetlights, environmental sensors, waste management sensors, and public
transportation systems.
• Cloud Integration: City-wide data aggregation for monitoring air quality, traffic flow, energy
consumption, and waste management. Cloud platforms enable efficient city planning and resource
allocation based on real-time and historical data.
• Retail and Inventory Management:
• IoT Devices: RFID tags, smart shelves, and inventory tracking sensors.
• Cloud Integration: Real-time monitoring of inventory levels, demand forecasting, and supply chain
optimization. Cloud-based analytics help retailers make data-driven decisions for stock
replenishment and improve overall supply chain efficiency.
• Energy Management:
• IoT Devices: Smart meters, sensors in power grids, and energy consumption monitoring
devices.
• Cloud Integration: Real-time monitoring of energy consumption, predictive maintenance for
power infrastructure, and optimization of energy distribution. Cloud analytics help utility
companies balance supply and demand efficiently.
45
Summary-IoT-Cloud Convergence
• In order to facilitate effective data processing, storage, and analysis, cloud computing and
the Internet of Things (IoT) are merging. This involves integrating IoT devices with cloud
services.
• Some common architectures for IoT cloud convergence: Fog Computing Architecture,
Edge Computing Architecture, Hierarchical Architecture, Client-Server Model,
Microservices Architecture, Serverless Architecture and Containerization.
• The convergence of the IoT and the cloud redefines the IoT’s scalability for future
expansions and gives the cloud an avenue for creating new services and boosting
capabilities.
• In IoT convergence, data offloading and computation refer to the deliberate allocation of
data processing responsibilities among various architectural levels, such as edge devices,
fog nodes, and cloud servers.
• The integration of IoT devices with cloud computing presents new challenges and
vulnerabilities, necessitating the security of data, devices, and communications for the
successful and safe deployment of IoT solutions.
46
Topics in Module-5-Smart Computing over IoT-Cloud
• Cognitive Computing Capabilities
• Underlying Technologies
• Empowering Analytics
• Deep Learning Approaches – Algorithms, Methods and Techniques
1
SMART COMPUTING OVER IOT–CLOUD
• Smart computing in general refers to the mechanism of empowering the devices and utilities that we use in our day-to-day lives with computing capabilities. On similar lines, smart computing over the IoT–Cloud refers to the convergence of hardware, software, and network technologies that empowers the IoT–Cloud application with real-time awareness of the environment and enhanced analytics that can assist humans in better decision making and optimization, thereby driving the business to success.
2
BIG DATA ANALYTICS AND COGNITIVE COMPUTING
1. Big Data Analytics:
• With the numerous devices that are connected to each other and to the Cloud through the Internet, the amount of data these devices generate is immeasurably huge.
• This huge volume of data in different formats is referred to as big data, and the analysis of this data in order to generate suggestions and solutions is called big data analytics. However, analysis of such data in IoT applications is a huge challenge due to its size and heterogeneity.
2. Cognitive computing:
• Cognitive computing is a mechanism used in solving problems that are complex and may have a certain degree of uncertainty in arriving at suitable answers. It is a self-learning system that mimics the human brain/thinking with the help of computerized models. It is a confluence of several underlying technologies such as natural language processing (NLP), pattern recognition, data mining, sentiment analysis, machine learning, neural networks, and deep learning.
3
Cognitive Computing Capabilities
• Cognitive computing:
• Cloud platforms offer centralized device management capabilities, allowing for remote monitoring, configuration, and firmware updates of IoT devices. On top of this, cognitive computing can offer improved data analysis.
• Example: The health care industry integrates data from various sources such as journals, medical records, diagnostic tools, and other documents. All these data provide evidence and help make informed decisions and recommendations related to the treatment that can be provided to patients. Here is where cognitive computing comes in handy, performing quick and reliable analysis of the data and presenting it to the physicians, surgeons, or medical professionals.
• Cognitive computing can lead to improved customer satisfaction levels. For instance, the Hilton group, a hospitality and travel business, has employed a robot, Connie (Watson enabled), that provides customers with precise, relevant, and accurate information on various topics related to travel and accommodation. It also provides information on fine dining, amenities offered at hotels, and places to visit, thus giving customers a smart, easy, and enjoyable travel experience.
4
Cognitive Computing Capabilities
• Cognitive computing can simplify complex processes into simpler and more efficient ones. In the case of Swiss Re, an insurance company, the application of cognitive computing has made the process of identifying patterns simpler and more efficient, thereby enabling real-time problem solving for more efficient responses. It has employed the IBM Watson technology to perform analysis of huge volumes of structured and unstructured data pertaining to the risk of exposure of sensitive information. Based on the analysis, measures were adopted to put efficient risk management tools in place and improve the productivity of the business.
• Cognitive computing can be employed for identifying safety concerns in a product earlier in the lifecycle, thereby helping to reduce costs that might be incurred in a recall after completion. It also helps in upholding the reputations of big organizations by identifying shortcomings at an earlier stage. In addition, the delays in time-to-market that might occur if a product fails are mitigated by early detection.
• Cognitive computing over IoT can enable products to make independent and instantaneous decisions in businesses without human interference. Fact-based solutions can be provided proactively to drive the entire business process, right from engaging in relevant and meaningful conversations with customers to the manufacturing and maintenance of tools and equipment.
5
Cognitive Computing Capabilities
• Cognitive computing must possess the following features in order to realize the previously mentioned capabilities:
• Adaptive: Cognitive computing must be able to keep up with dynamically changing data, goals, and requirements by learning and updating constantly.
• Interactive: Cognitive computing should provide flexibility and ease by allowing users to communicate just the way they would in a real-world human-to-human interaction using voice, gestures, and natural languages.
• Iterative and stateful: It should be capable of collecting relevant information by asking the user suitable questions when the available information and requirements are not enough to describe the problem in question.
• Contextual: Cognitive computing should discover and extract relevant information like location, time, and user details pertaining to the problem based on sensory inputs such as gestures, speech, and vision. Cognitive computing should analyze and process real-time and near-real-time data.
• Cognitive computing is capable of minimizing the amount of traffic to and from the Cloud in an IoT–Cloud system by imparting intelligence to the edge devices. Devices can be equipped with capabilities that reduce energy consumption and improve performance and privacy.
6
Underlying Technologies
• Natural language processing :
• Natural language processing (NLP) is a field of study that helps computers translate and interpret human language.
• These computers work upon natural human language by analyzing and understanding it, thereby being able to extract meaningful information in a smart way.
• A piece of software written by developers using the underlying NLP algorithms can help understand human language (speech and text) better and use it for analysis. Some of the applications made possible by NLP's ability to extract meaning from language, based on the analysis of the hierarchical structure of language, are grammar correction, speech-to-text conversion, and automatic language translation. NLP algorithms are machine-learning-based algorithms that learn rules of execution by studying/analyzing a predefined set of examples, such as books or a set of sentences, leading to a statistically generated inference. Here, the programmer is relieved from the burden of having to write/code the set of rules for analysis.
7
Underlying Technologies
• Natural language processing :
• NLP has a set of open standard libraries that assist in real- time application development.
• Algorithmia is an ML-based platform that supports deployment and management of applications without the need to spend effort in setting up servers and infrastructure. It helps a great deal in automating the ML operations of an organization with simple API endpoints to the algorithms, some of which are discussed below.
• Apache OpenNLP is a toolkit employed to process text written in natural language and helps in the development of services that support proficient text-processing actions. Common NLP tasks like language recognition, segmentation, parsing, chunking, and tokenization are supported by this open-source, machine-learning-based library.
• Natural Language Toolkit (NLTK) is a collection of efficient libraries for processing text in natural language (English). It is a platform that supports symbolic and statistical NLP with programs written in Python. NLTK has been found more suitable for teaching and research. It is also suitable for empirical linguistics in Python, machine learning, artificial intelligence, and retrieval of meaningful information (a small tokenization sketch follows this slide).
• Stanford NLP is a package of NLP software developed and managed by the Stanford NLP group. The tools offered by the group can be integrated into applications that have human language processing, analysis, and interpretation requirements. It has been used extensively in academia and in industrial and governmental organizations.
8
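A small sketch of typical NLTK usage, assuming the nltk package and its downloadable "punkt" and POS-tagger models: sentence segmentation, word tokenization, and part-of-speech tagging on a sample text.

# pip install nltk
import nltk

nltk.download("punkt", quiet=True)                        # tokenizer model
nltk.download("averaged_perceptron_tagger", quiet=True)   # POS tagger model

from nltk.tokenize import sent_tokenize, word_tokenize

text = "NLTK is a platform for building Python programs. It supports many NLP tasks."
sentences = sent_tokenize(text)        # sentence segmentation
tokens = word_tokenize(sentences[0])   # word tokenization
tags = nltk.pos_tag(tokens)            # part-of-speech tagging
print(tags)                            # e.g. [('NLTK', 'NNP'), ('is', 'VBZ'), ...]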
Underlying Technologies
• Data mining:
• Data mining is the process of excavating huge volumes of data in order to draw inferences, patterns, knowledge, and information that can be used to make improved business decisions, devise effective cost-cutting strategies, and improve revenue. It involves the application of certain mechanisms that help in finding anomalies and correlations in larger data sets, thereby enabling detection of outcomes. It is one of the phases in the process of "knowledge discovery in databases".
• Data mining has six classes of tasks that are performed during the process, as listed below.
• Anomaly detection is a mechanism applied in order to examine the data and detect any change, outlier, or deviation that can be used for further analysis and investigation.
• Association rule learning, also known as market basket analysis, is a method of identifying relationships, patterns in the data, and associated variables. For example, identification of customer buying habits can help a business understand frequently bought items and items bought together. This can help in developing efficient marketing strategies.
9
Underlying Technologies
• Data mining:
• Clustering is the action of identifying similarities in data, thus leading to the detection of groups and structures in the data. This grouping is done without any predefined/known labels or structures in the data (see the clustering sketch after this list).
• Classification is similar to clustering in that both methods group data based on identified features. However, classification is a supervised learning technique wherein the data is categorized based on previously available labels or structures, for example, classification of emails as spam or valid.
• Regression is a predictive analysis mechanism wherein modeling of data is done in order to assess the
strength and relationship between variables or data sets.
• Summarization is a representation of data sets in a compressed and compacted way with the help of
reports and visualizations (graphs and charts).
10
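A minimal clustering sketch using scikit-learn's KMeans (assumed installed via pip): unlabeled sensor readings are grouped into two clusters without any predefined labels, exactly as described above. The readings are invented for illustration.

# pip install scikit-learn
import numpy as np
from sklearn.cluster import KMeans

# Unlabelled readings (e.g., [temperature, humidity]) — clustering finds
# groups without any predefined labels.
readings = np.array([[21.0, 40], [22.5, 42], [21.8, 39],
                     [35.0, 80], [36.2, 78], [34.7, 82]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(readings)
print(kmeans.labels_)            # e.g. [0 0 0 1 1 1]
print(kmeans.cluster_centers_)   # one centroid per discovered group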
Underlying Technologies
• Machine learning
• Supervised learning
• Unsupervised learning
• Reinforcement learning
• Neural networks are systems that take their root from the human brain. They are similar to machine learning models in that they learn from experience without the need for task-specific programming. However, neural networks are different in that they are able to make intelligent decisions on their own, unlike plain machine learning, where decisions are made based only on what has been learned. Neural networks have several layers that are involved in the refinement of the output at each level. The layers have nodes that are responsible for carrying out the computations. A node mimics a neuron in the human brain: it collects and unites the input from the data and assigns suitable weights, which are responsible for either magnifying or diminishing the input value pertaining to the task. An activation function assigned to a node decides how far the signal should progress through the various layers to affect the outcome, based upon the sum of the products of the input–weight pairs. A neuron in a layer is activated when a signal reaches or propagates through it.
11
Underlying Technologies
• Sentiment analysis finds its application in many well-known platforms and applications. It is basically a text analysis mechanism used to detect the polarity (positive, negative, or neutral opinion) in a given text, be it a long document, a paragraph, or a few sentences. It basically helps in understanding human emotions from text. The various types of sentiment analysis are described below, followed by a small polarity-scoring sketch.
• Fine-grained sentiment analysis is used when the precision of sentiment/polarity is very important for the business. For example, in a five-star rating for a product or a review for a session, the polarity is recorded in five degrees (excellent – 5, good – 4, fair – 3, meets expectations – 2, below expectations – 1).
• Emotion detection is a method applied to identify human emotions such as happy, sad, angry, or
frustrated from a text using a variety of machine learning algorithms.
• Aspect-based sentiment analysis is a mechanism that helps identify emotions expressed pertaining to specific features or characteristics of a product or artifact. For example, when customers review a phone, they may describe certain features of the phone as being outstanding or abysmal.
• Multilingual sentiment analysis is a method applied on texts written in a variety of languages. This
involves a lot of effort in terms of pre-processing and resource requirements. Many tools are also
available online that can be used effectively, but for more accurate and specific results, algorithms and
techniques must be developed indigenously.
12
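A minimal polarity-scoring sketch using NLTK's bundled VADER analyzer (assuming the nltk package and its downloadable vader_lexicon): each text receives a compound score in [-1, 1] that maps to a positive, negative, or neutral opinion.

# pip install nltk
import nltk
nltk.download("vader_lexicon", quiet=True)

from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()
for review in ["The camera is outstanding!",
               "Battery life is abysmal.",
               "It is a phone."]:
    scores = sia.polarity_scores(review)
    # 'compound' in [-1, 1]: > 0.05 positive, < -0.05 negative, else neutral
    print(review, "->", scores["compound"])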
Underlying Technologies
• Pattern recognition is a mechanism applied to analyze and process data in order to generate meaningful patterns or regularities. Pattern recognition has applications in various areas such as image analysis, bioinformatics, computer graphics, signal processing, image processing, and many more. Pattern recognition techniques are of three types, as follows.
• In statistical pattern recognition, patterns are described using features and measurements. Each feature is represented as a point in an n-dimensional space. Statistical pattern recognition then picks features and measurements that permit the pattern vectors to fit in various groups or categories in the n-dimensional space.
• Syntactic pattern recognition uses basic subpatterns, simply referred to as primitives, that are utilized in making descriptions of the patterns by organizing them into words and sentences.
• Neural pattern recognition works on a system containing numerous interconnected processors that enable parallel computing. The systems are trained on sample sets, enabling them to learn from the given input–output data sets and thereby adapt themselves to changing data.
13
Empowering Analytics
• The huge volumes of big data pose huge problems too. Industries and businesses today are overwhelmed with the amount of information that is accumulated for analysis. However, the talent required to handle such data and retrieve meaningful information for businesses is scarcely available.
• The number of data scientists and analysts is not enough to keep up with the ever-increasing data volumes. Experienced specialists are required in order to handle the available platforms and put them to effective use.
• A solution to this problem would be to increase the supply of specialists by increasing the training programs offered to interested people. An even more effective solution would be to utilize the existing technologies and train machines/computers, rather than human beings, to manage the tools. This is made possible with advancements in cognitive computing. The confluence of cognitive computing, artificial intelligence, and machine learning can aid both experienced and inexperienced staff in handling complex analytic operations using the available tools and platforms. It also helps in improving the accuracy and quality of the results. This accelerates the analysis of real-time and near-real-time data, thereby enabling businesses to make real-time decisions.
14
Empowering Analytics
• The capabilities that cognitive computing can empower big data with are countless and very promising. The lack of sufficient talent to handle big data platforms, argued above, can be overcome with advancements in NLP. With NLP in the picture, employees who are not proficient in the data/information processing and data languages required for analytics activities can simply work on the platforms and tools with normal interactions, just as we do with other human beings. The platforms can be equipped with the capability to transform normal language into data queries and requests and to respond with solutions or answers in the same natural language, enabling easy understanding. This brings much more flexibility into big data analytics.
• Big data analytics empowered by cognitive computing has accelerated the decision-making process, accuracy, and productivity of many businesses with its tools and platforms.
15
Deep Learning Approaches
• Deep learning is a subset of machine learning in which the age-old traditional algorithms used to instruct computers on the tasks to be performed are equipped with capabilities to modify their own instructions to improve the functionality of the computer.
• Deep learning is a mechanism that enables computers to process huge volumes of data and learn from experience similarly to humans. The algorithms and mechanisms in deep learning perform tasks in a repetitive manner, each time adjusting the parameters in order to achieve the desired outcome.
16
Artificial Neural Networks (ANN)
• ANN is a deep learning approach that imparts artificial intelligence
into machines by mimicking the human brain and the nervous system.
• Imagine that you have just hurt your index finger. The sensory nerve in the hand immediately sends out signals (a chain reaction) that ultimately reach your brain and tell you that you are experiencing pain.
• This is the basic idea behind the functionality of ANNs. An ANN has a sequence of branching nodes which function in a similar way to the neurons in the human body. Inputs are fed into the input nodes, which then propagate the information through a series of internal nodes that process the information until the desired output is generated.
17
Artificial Neural Networks (ANN)
• In an ANN, a node depicts the neuron: it receives information and transforms it by performing a quantitative function, and the result is carried over to the next neuron.
• In transit, the connectivity lines (synapses) in turn apply their own transformation function on the information and modify it by adding a constant value. This modification process happens through the application of weights.
18
Artificial Neural Networks (ANN)
• The inputs from multiple synapses or connectivity lines are collected, summed up, and then sent to the next node.
• This node in turn adjusts the data by applying a constant, thereby modifying the data. This application of constants at the nodes is called node bias.
• Applying weights and bias to the input data is important, as this ascertains that the data is propagated properly through the network.
19
Artificial Neural Networks (ANN)
• For a node to be able to propagate/pass the data, it must be activated. Activation happens when the output a node produces meets the threshold value set by the programmer, after which the data is passed on to the next node; otherwise the node remains dormant.
• However, a single pass of information to the final nodes in many cases might not lead to the desired output.
20
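A numeric sketch of the single-node behaviour just described: weighted inputs plus a node bias, passed through a simple threshold activation, so the node either fires or stays dormant. The numbers and the step activation are illustrative assumptions; real networks typically use smooth activation functions.

import numpy as np

def neuron(inputs, weights, bias, threshold=0.0):
    """One node: weighted sum of inputs plus node bias, passed through a
    step activation — the node 'fires' only at or above the threshold."""
    z = np.dot(inputs, weights) + bias   # sum of input-weight products + bias
    return z if z >= threshold else 0.0  # a dormant node propagates nothing

x = np.array([0.5, 0.8, 0.2])      # signals from the previous layer
w = np.array([0.9, -0.4, 0.3])     # synapse weights (magnify/diminish)
print(neuron(x, w, bias=0.1))      # 0.45 - 0.32 + 0.06 + 0.1 = 0.29 -> fires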
Artificial Neural Networks (ANN)
• For example, the network might accidentally identify a cat as a dog, which is not acceptable. To counter this, an algorithm called the backpropagation algorithm is applied to the network, which uses feedback to enable the adjustment of weights and biases and fine-tune the synapses until the result is agreeable or almost correct.
21
Convolution Neural Network (CNN)
• CNN is mostly applied to image processing and natural language processing problems. Traditional neural networks make no assumptions about the inputs and weights used to train the models, which makes them unsuitable for images and natural-language-based problems.
• CNN treats data as being spatial. Instead of neurons being connected to every neuron in the preceding layer, they are linked only to neurons close to them, and all of them share the same weights. This simplification makes it possible to maintain the spatial property of the data set. The simplification of an image made possible by the CNN facilitates better processing and understanding of images.
22
Convolution Neural Network (CNN)
• The CNN architecture, as shown in the figure, consists of the multiple layers that exist in a usual neural network. The CNN has additional layers such as the convolution layer, pooling layer, ReLU (rectified linear unit) layer, and a fully connected layer. The ReLU layer functions as an activation layer that ensures nonlinearity as the data moves through each layer of the network; its absence could cause the loss of dimensionality that needs to be maintained in the data fed into each layer. The fully connected layer performs classification on the data set. The convolution layer performs the most important function in the network: it places a filter over an array of image pixels, leading to the formation of a "convolved feature map".
23
Convolution Neural Network (CNN)
This enables focus on specific features of the image, which might otherwise be missed. The pooling layer reduces the number of samples of a feature map, causing a reduction in the number of parameters to process and thereby enabling faster processing. This leads to a pooled feature map, obtained by performing either max pooling (selecting the maximum input of a particular convolved feature) or average pooling (calculating the average), as in the sketch below. Ultimately, the model builds up an image of its own based on its own mathematical rules. If unlabeled data are used, then unsupervised learning methods can be used to train the CNN. Autoencoders enable us to compress and place the data into a low-dimensional space.
24
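A small NumPy sketch of the pooling step described above: a 4x4 convolved feature map is reduced to a 2x2 pooled feature map by max or average pooling over non-overlapping 2x2 windows. The feature-map values are invented for illustration.

import numpy as np

def pool2x2(feature_map, mode="max"):
    """Downsample a feature map with non-overlapping 2x2 windows,
    as the pooling layer does to shrink the convolved feature map."""
    h, w = feature_map.shape
    out = np.empty((h // 2, w // 2))
    for i in range(0, h - 1, 2):
        for j in range(0, w - 1, 2):
            window = feature_map[i:i + 2, j:j + 2]
            out[i // 2, j // 2] = window.max() if mode == "max" else window.mean()
    return out

fmap = np.array([[1, 3, 2, 0],
                 [5, 6, 1, 2],
                 [7, 2, 9, 4],
                 [1, 0, 3, 8]])
print(pool2x2(fmap, "max"))   # [[6. 2.] [7. 9.]]
print(pool2x2(fmap, "avg"))   # [[3.75 1.25] [2.5  6.  ]]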
Recurrent Neural Networks (RNN)
• RNNs are used in many applications such as speech recognition, language translation, and prediction of stocks. RNNs are used to model sequential data.
• In order to understand this, consider a still snapshot taken of a ball that is in motion over a period.
• Now, from this picture we would want to predict the direction of motion that the ball is moving in.
Such a prediction with a single standstill picture of the ball will be very difficult. Knowledge of
where the ball had been before the current picture is required for an accurate and real- time
prediction. If we have recorded many snapshots of the ball’s position in succession (sequential
information over time), we will have enough information to make better predictions.
• Similarly, audio is sequential information that can be broken up into smaller pieces of information
and fed into the RNN for processing. Textual data is another type of sequential information that can
be a sequence of alphabets or words.
25
Recurrent Neural Networks (RNN)
• RNNs work upon these kinds of sequential information to provide predictions based upon a concept called "sequential memory."
• Sequential memory is a mechanism that enables the human brain to recognize patterns that occur in a sequence. This is implemented in RNNs with the help of a looping mechanism that passes earlier information forward for processing, as in the sketch below. This intermediary information is represented as the hidden state, which depicts how previous inputs affect the later states.
• The RNN, however, suffers from short-term memory and the vanishing gradient problem, a side effect of the backpropagation methodology used in RNNs.
26
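A minimal NumPy sketch of the looping mechanism: one recurrent step combines the current input with the hidden state carried over from earlier steps, so past inputs influence later outputs. The weight matrices here are random placeholders; a trained RNN would learn them.

import numpy as np

# One recurrent step: the hidden state carries "sequential memory"
# forward — each output depends on the current input AND the past.
rng = np.random.default_rng(0)
W_x = rng.normal(size=(4, 3))   # input -> hidden weights
W_h = rng.normal(size=(4, 4))   # hidden -> hidden weights (the loop)

def rnn_step(x_t, h_prev):
    return np.tanh(W_x @ x_t + W_h @ h_prev)

h = np.zeros(4)                       # empty memory at the start
for x_t in rng.normal(size=(5, 3)):   # a sequence of 5 input vectors
    h = rnn_step(x_t, h)              # hidden state threads through time
print(h)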
Recurrent Neural Networks (RNN)
• RNN differs from CNN in that CNN is a feed-forward network used to filter spatial data, while the recurrent neural network (RNN) feeds data back into itself, making RNNs the best candidates for sequential data. A CNN can recognize patterns across space, whereas an RNN can recognize patterns over time.
27
Algorithms, Methods, And Techniques
• Backpropagation is a method of training neural networks that works on the principle of the supervised learning methodology.
• Backpropagation helps in fine-tuning the weights by reassigning them approximate values based on the difference inferred between the actual and desired output. The iterations are repeated until a suitable weight is achieved for the model with a minimal error value (a toy sketch follows).
28
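A toy sketch of the weight-adjustment loop: a one-weight "network" repeatedly compares its actual output to the desired one and moves the weight against the error gradient until the error is minimal. The numbers are invented for illustration.

import numpy as np

# One-weight "network": y_hat = w * x. Backpropagation computes the
# gradient of the error and nudges w toward the desired output.
x, y_true = 2.0, 8.0      # training pair; the ideal weight is 4
w, lr = 0.5, 0.05         # initial weight and learning rate

for step in range(30):
    y_hat = w * x                # forward pass
    error = y_hat - y_true       # difference from the desired output
    grad = 2 * error * x         # d(error^2)/dw, propagated backward
    w -= lr * grad               # reassign the weight against the gradient
print(round(w, 3))               # ~4.0 after a few iterations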
Algorithms, Methods, And Techniques
• Stochastic gradient descent is a method that helps in handling huge volumes of data by randomly selecting and sampling data points, thereby reducing the amount of computation required for each update. It is an optimization mechanism widely used when training machine learning models on large datasets.
• Transfer learning is a method adopted to reuse trained models layer by layer, as in convolutional networks. The starting layers are more generic, pertaining to simple patterns, while the last few layers tend to be more specific to the data fed as input, so the generic early layers can be reused for a new task. For example, in a model provided with a training dataset, the early layers might be looking for eyes, ears, and mouths while the later layers may be looking for dogs, humans, and cars.
29
Algorithms, Methods, And Techniques
• Logistic regression is a classification algorithm working on the principle of supervised learning. It forecasts the probability of a dependent or target variable, which has only two outcomes, coded as 1 (success/yes) and 0 (failure/no). It is the simplest mechanism used in classification problems such as illness prediction (cancer and diabetes) and spam email identification.
• Naive Bayes is another classification method based on the supervised learning mechanism; the central idea of the naive Bayes classifier is the Bayes theorem. The classification has two phases, namely the learning phase (the model is trained on a given dataset) and the evaluation phase (performance testing). A small sketch of both classifiers follows.
30
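A small scikit-learn sketch of both classifiers on a toy illness-prediction task (the feature values are invented): each model goes through a learning phase (fit) and an evaluation phase (predict).

# pip install scikit-learn
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

# Toy binary task: predict illness (1) vs healthy (0) from two measurements.
X = np.array([[1.0, 20], [1.2, 22], [0.9, 19],    # class 0
              [3.1, 40], [2.8, 42], [3.3, 39]])   # class 1
y = np.array([0, 0, 0, 1, 1, 1])

logreg = LogisticRegression().fit(X, y)   # learning phase
nb = GaussianNB().fit(X, y)

new = np.array([[3.0, 41]])
print(logreg.predict_proba(new))   # probability of each outcome (0 vs 1)
print(nb.predict(new))             # evaluation phase -> [1]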
Algorithms, Methods, And Techniques
• Support vector machines (SVM) are machine learning algorithms that follow the supervised learning mechanism. They are widely used in classification problems where the goal is to find a hyperplane that best segregates all the data points into two different categories, as shown in the figure.
• Support vectors are the data points that lie closest to the hyperplane and whose removal can cause a great change or shift in the position of the hyperplane. The success of an SVM is determined by the distance between the hyperplane and the nearest data points, called the margin (a greater distance means more effective classification). When a clear hyperplane cannot be identified, mapping the data points into a higher-dimensional space (e.g., a 3D view of 2D data) can help in obtaining one; this is done through a mechanism called kernelling (see the sketch below). SVM finds application in many areas, such as cancer and neurological disease diagnosis and much other health care research.
31
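A minimal scikit-learn sketch: a linear SVM is fitted to two toy categories, after which the support vectors (the points closest to the hyperplane) can be inspected; switching the kernel (e.g., to "rbf") is how kernelling is applied when no linear hyperplane exists. The data points are invented for illustration.

# pip install scikit-learn
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 2], [2, 3], [2, 1],      # category 0
              [6, 5], [7, 7], [6, 8]])     # category 1
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear").fit(X, y)       # find the separating hyperplane
print(clf.support_vectors_)                # points closest to the hyperplane
print(clf.predict([[3, 3], [7, 6]]))       # -> [0 1]

# When no linear hyperplane exists, kernelling lifts the data into a
# higher-dimensional space, e.g. SVC(kernel="rbf").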