Machine Learning PPT 10-3-2022 Final

Joint All India Council for Technical Education (AICTE) – Gujarat Technological University
Sponsored One Week e-Faculty Development Programme on “A Future of Artificial Intelligence in Healthcare System”, March 07-12, 2022
Machine Learning Driven Pharmaceutical Product Development
Presented by:
Dr. Vaishali T. Thakkar
Professor and Research Coordinator
(Pharmaceutics Department)
Anand Pharmacy College
Disclaimer:
The information given in this presentation has been compiled from various sources believed to be reliable.
All cited examples should serve as a guideline.
The contents have been picked up from public resources.
It is used for educational purposes only.
Prior permission shall be obtained if the contents of this presentation are to be used by anyone.
BUZZWORD OF PHARMA INDUSTRY 4.0
The domain of healthcare has always been flooded with a huge amount of complex data, coming in at a very fast pace.
Disciplines of data science and their relationships
Sources of big data
International Data Corporation (IDC): 153 exabytes of data, rising to 2,314 exabytes in 2020, roughly a 15-fold increase.
Data mining is an integral part of Big Data analytics.
Data Mining and Stages of Knowledge Discovery in Databases (KDD)
• Data mining is the process of extracting information (non-trivial, implicit, previously unknown, and potentially useful) from a large database.
• It is also known by other names such as Knowledge Extraction, Data-pattern Analysis, Information Harvesting and Knowledge Discovery (KD).
Selection: gathering data from heterogeneous sources.
Pre-processing: eliminating noisy data, handling missing data, and detecting or removing outliers.
Transformation: aggregation, smoothing, normalization, generalization, and discretization.
Data mining: choosing the data mining algorithm that extracts beneficial information.
Evaluation: the outcomes are assessed with statistical justification and significance testing.
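The pre-processing and transformation stages above can be sketched in a few lines of Python. This is only an illustrative sketch: the assay values are made up, and the choices of median imputation, a cut-off of three median absolute deviations for outliers, and min-max normalization are assumptions, not methods prescribed by the slides.

```python
from statistics import median

def preprocess(values, mad_cutoff=3.0):
    """Fill missing values, drop outliers, then min-max normalise."""
    # Handling missing data: replace None with the median of the observed values.
    observed = [v for v in values if v is not None]
    centre = median(observed)
    filled = [centre if v is None else v for v in values]

    # Outlier removal (assumed rule): keep points within mad_cutoff
    # median absolute deviations of the median.
    centre = median(filled)
    mad = median(abs(v - centre) for v in filled)
    kept = [v for v in filled if abs(v - centre) <= mad_cutoff * mad]

    # Transformation: min-max normalisation onto the range [0, 1].
    low, high = min(kept), max(kept)
    return [(v - low) / (high - low) for v in kept]

# Hypothetical assay results (%); None marks a missing measurement.
raw_assay = [98.0, 101.0, None, 99.0, 250.0, 102.0, 100.0]
cleaned = preprocess(raw_assay)
print(cleaned)
```

Here the missing value is filled in, the reading of 250.0 is dropped as an outlier, and the remaining six values are rescaled so that the smallest becomes 0 and the largest becomes 1.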
MOVING FROM DESCRIPTIVE TO PREDICTIVE AND PRESCRIPTIVE ANALYTICS
Aspect     | Descriptive                                                    | Predictive
Definition | Process of finding useful information by analysis of huge data | Process of forecasting useful information by analysis of huge data
Process    | Data aggregation and data mining                               | Statistics-based forecasting methods
Accuracy   | Provides accurate results, since it is based on past data      | Does not guarantee accurate results, since it is a forecast
Approach   | Reactive                                                       | Proactive
Breakdowns of industrial development and the great changes in related categories
Relation between different methodologies in Data Mining
Data mining is an integral part of Big Data analytics.
Regression Analysis – Multiple Linear Regression (MLR)
Multivariate mathematical and statistical methods are used to handle large data sets.
Data is converted into information.
MLR can be used:
1. When the factors are few in number (rows are more than columns, i.e. there are more observations than factors).
2. When the factors are not significantly correlated (collinear), as checked with the correlation matrix (“r-matrix”).
3. When the factors have a well-understood relationship to the responses (Y).
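When these three conditions hold, an MLR model can be fitted by ordinary least squares. The sketch below solves the normal equations in plain Python; the two factors and the response are hypothetical numbers generated from Y = 2 + 3·X1 − X2, so the fit should simply recover those coefficients. The helper names `solve` and `fit_mlr` are illustrative, not part of any library.

```python
def solve(a, b):
    """Solve the linear system a·x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_mlr(x_rows, y):
    """Return [intercept, b1, b2, ...] from the normal equations (X'X)b = X'y."""
    design = [[1.0] + list(row) for row in x_rows]  # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical data: 5 observations (rows), 2 factors (columns).
x_rows = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [3, 7, 7, 11, 11]  # generated from 2 + 3*x1 - 1*x2

coefficients = fit_mlr(x_rows, y)
print(coefficients)  # intercept, then one slope per factor
```

Solving the normal equations directly is fine for a small, well-conditioned example like this one; it becomes unstable exactly when the factors are collinear, which is the situation the second condition warns about.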
Multivariate analysis
When multicollinearity exists in the data, there is a chance of misinterpreting the results.
Multicollinearity is a statistical phenomenon in which there exists a strong, or in the extreme case perfect or exact, linear relationship between the independent variables (IVs).
Residuals are NOT randomly distributed.
PCA and PLS are useful in such cases.
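One simple way to screen for multicollinearity is to compute the correlation coefficient r between two factors and the corresponding variance inflation factor (VIF). The sketch below assumes exactly two factors, in which case VIF = 1 / (1 − r²); the factor names and values are hypothetical.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(ss_a * ss_b)

# Hypothetical formulation factors; the second is almost twice the first.
polymer_conc = [1.0, 2.0, 3.0, 4.0, 5.0]
binder_conc = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(polymer_conc, binder_conc)
# With only two factors, regressing one on the other gives R^2 = r^2.
vif = 1.0 / (1.0 - r ** 2)
print(f"r = {r:.4f}, VIF = {vif:.1f}")
```

A commonly quoted rule of thumb treats a VIF above about 10 as a sign of problematic collinearity. With more than two factors, each VIF would come from regressing one factor on all the others, which this sketch does not do.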
Residual Plots
Normal probability plot: what is checked?
Ideal: the residuals are normally distributed, with all the points on or near the line.
The assumptions of MLR are violated: the residuals are NOT randomly distributed.
When the data do not meet the basic criteria for performing MLR, PCA and PLS are useful.
Principal Component Analysis (PCA) and Partial Least Squares (PLS)
There are two main types of multivariate analysis (MVA):
Principal Component Analysis (PCA) → X’s only
Partial Least Squares (PLS) → X’s and Y
Partial least squares (PLS) is a method for constructing predictive models when the factors are many and highly collinear.
In PLS, the emphasis is on predicting the responses and not necessarily on trying to understand the underlying relationship between the variables.
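To illustrate the “X’s only” case, the sketch below extracts the first principal component of a small X matrix by power iteration on its covariance matrix. It covers PCA only, not PLS. The data are hypothetical and deliberately collinear (the second column is exactly twice the first), so a single component should capture essentially all of the variance.

```python
import math

def pca_first_component(rows, iterations=100):
    """Return (direction, explained variance ratio) of the first principal component."""
    n, p = len(rows), len(rows[0])
    # Centre each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the X's.
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p))
    total_variance = sum(cov[a][a] for a in range(p))
    return v, eigenvalue / total_variance

# Hypothetical X matrix: two perfectly collinear factors.
x_rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, explained = pca_first_component(x_rows)
print(direction, explained)
```

Because the two factors carry the same information, the first component points along the (1, 2) direction and explains all of the variance. This is the sense in which PCA replaces many collinear factors with a few uncorrelated ones.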
Artificial intelligence (AI) is a branch of computer science.
Its key objective is the development of machines whose cognitive functions (perception, learning, problem-solving and decision making) mimic or exceed those of humans.
The goal is a machine that can perform tasks requiring human-like intelligence.
Thinking humanly → Siri, Alexa and other smart assistants
Thinking rationally → Self-driving cars
Acting humanly → Robo-advisors
Acting rationally → Netflix's recommendations
AI systems are powered by machine learning; some of them are powered by deep learning.
Types of Artificial Intelligence
Type 1: AI Based on Functionality
Reactive AI (no memory power)
• Basic and oldest type of AI.
• Replicates a human’s ability to react to different kinds of stimuli.
• No memory power.
• Example: IBM’s chess-playing supercomputer.
Limited Memory (reactive ability and memory power)
• Has reactive ability and memory.
• Past information is used for better future decisions.
• Example: self-driving cars.
Theory of Mind (work in progress stage)
• At research lab level.
• Deep understanding of human minds: needs, likes, emotions, thought processes.
• Example: a bellhop robot for hotels that understands the demands of the customer.
Self-awareness (hypothetical stage)
• Hypothetical stage.
• Seen in science fiction movies.
• Has internal “thoughts”.
Type 2: AI Based on Capability
Narrow AI (weak AI)
• Performs goal-oriented tasks.
• Examples: face recognition, driving a car, speech recognition/voice assistants, or browsing the Internet.
General AI (strong AI/deep AI)
• Machine general intelligence (mimics human behavior), capable of learning and applying this knowledge to solve any problem.
Super AI
• Better than humans in areas like math, sports, science, medicine, art, hobbies, emotional relationships, or simply everything.
• Performs all tasks precisely.
• Supercomputer.
Machine learning (ML)
Machine learning (ML) is a branch of AI in which, based on the training dataset that is first provided, the computer develops its own logic for answering future questions.
• Machine learning does not depend on explicitly programmed rules; it depends on data.
• The goal of machine learning is to produce accurate predictions on new, unseen data after training on a dataset.
• “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance on tasks in T, as measured by P, improves with experience E.” (Tom M. Mitchell)
Process of Machine Learning
1. Gathering data from various sources
2. Cleaning data to have homogeneity
3. Selecting the right ML algorithm and building the model
4. Gaining insights from the model’s results
5. Transforming results into visual graphs
Types of Learning: Supervised Learning
• Precise mapping between input and output data.
• The data set is labeled.
• The algorithm identifies the relationship between the two variables and can then predict a new outcome.
• Addresses classification and regression problems.
• Algorithms: Naive Bayes, KNN, SVM, Logistic Regression, Decision Tree, Linear Regression, Random Forest, etc.
Regression: a regression problem is when the output variable is a real value or numerical data, e.g. hardness or dissolution data.
Examples: risk assessment and score prediction.
Classification: a classification problem is when the output variable is categorical, such as “disease” and “no disease”.
Examples: diagnosis, image analysis and fraud detection.
Illustration of Supervised Learning:
1. An algorithm is trained on Hb levels and the corresponding output of either anaemic or non-anaemic, based on labelled data.
2. Input data for patients with their Hb levels is fed into the algorithm.
3. The algorithm analyses the patient’s data against the Step 1 inputs.
4. When new data is entered, the machine recognizes the Hb level and generates a report stating whether or not the patient is suffering from anaemia.
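The Hb illustration can be mimicked with a very small supervised classifier. The sketch below learns a single Hb cut-off from labelled examples, one of the simplest possible models and not necessarily the algorithm the slide has in mind. The Hb values and labels are made up for illustration and are not clinical reference ranges.

```python
def fit_threshold(hb_levels, labels):
    """Pick the Hb cut-off that misclassifies the fewest training examples."""
    candidates = sorted(set(hb_levels))
    best_cut, best_errors = None, len(labels) + 1
    # Try the midpoint between each pair of neighbouring Hb values.
    for low, high in zip(candidates, candidates[1:]):
        cut = (low + high) / 2
        errors = sum((hb < cut) != is_anaemic
                     for hb, is_anaemic in zip(hb_levels, labels))
        if errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut

def predict(hb, cut):
    """Report for a new patient, based on the learned cut-off."""
    return "anaemic" if hb < cut else "non-anaemic"

# Hypothetical labelled training data: Hb in g/dL, True = anaemic.
hb_levels = [8.5, 9.2, 10.1, 11.0, 12.5, 13.4, 14.2, 15.0]
labels = [True, True, True, True, False, False, False, False]

cut = fit_threshold(hb_levels, labels)
print(cut, predict(9.0, cut), predict(14.0, cut))
```

Training corresponds to `fit_threshold` (the labelled data), and the report for a new patient corresponds to `predict`. A real model would use more inputs than Hb alone and would be validated on data it has not seen.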
Types of Learning: Unsupervised Learning
• An ML technique to find patterns in data in an exploratory manner.
• The data is not labeled, which means only the input variables (X) are given, with no corresponding output variables.
• Evaluation is qualitative, and the method does not predict anything.
• Addresses clustering and dimensionality reduction problems.
Clustering: grouping of similar data into groups or clusters. Examples: K-Means, K-Means++, K-Medoids, etc.
Dimensionality reduction: compression of the data to reduce its complexity without altering its structure.
Example: Principal Component Analysis (PCA). PLS also reduces dimensionality, but it uses the responses (Y) as well as the X’s.
Illustration of Unsupervised Learning: Spread of Zika Virus
01. Input data is entered for patients suffering from Zika virus from various locations of India.
02. The machine algorithm analyses the data and clusters it into coastal-region patients and inland-region patients.
03. Based on the clustering density, we can identify where the Zika virus has spread the most, and awareness campaigns can be launched in the concerned regions.
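The clustering step in this illustration can be sketched with a minimal k-means on a single feature. The feature used here (each patient’s distance from the coast in km) and its values are hypothetical, and the deterministic initialisation is a simplification chosen so that the example is reproducible; it assumes k is at least 2.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Minimal k-means on one feature; returns (centroids, labels)."""
    # Deterministic start: spread initial centroids evenly between min and max.
    low, high = min(values), max(values)
    centroids = [low + (high - low) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, labels

# Hypothetical distances from the coast (km) for six patients.
distance_km = [2, 5, 8, 120, 150, 135]
centroids, labels = kmeans_1d(distance_km, k=2)
print(centroids, labels)
```

No labels are supplied: the algorithm finds the coastal and inland groups on its own, and the size of each cluster indicates where cases are concentrated.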
Types of Learning: Reinforcement Learning
• The algorithm learns from its mistakes (reward-based learning).
• The solution to the problem is found by a process of trial and error.
• The most important task is to provide the simulated environment.
• Does not require pre-existing knowledge or data.
• Requires a lot of training time and a huge number of iterations to learn tasks.
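Trial-and-error learning from rewards can be sketched with an epsilon-greedy agent choosing between two actions. The environment here is a deliberately trivial, hypothetical one (action 0 always pays 0, action 1 always pays 1), and the function name and parameters are illustrative.

```python
import random

def run_bandit(rewards, steps=500, epsilon=0.2, seed=0):
    """Epsilon-greedy agent: returns (estimated values, times each action was tried)."""
    rng = random.Random(seed)
    q = [0.0] * len(rewards)     # estimated value of each action
    counts = [0] * len(rewards)  # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(rewards))              # explore
        else:
            action = max(range(len(rewards)), key=lambda a: q[a])  # exploit
        reward = rewards[action]  # feedback from the simulated environment
        counts[action] += 1
        # Incremental average: nudge the estimate toward the observed reward.
        q[action] += (reward - q[action]) / counts[action]
    return q, counts

# Hypothetical simulated environment with fixed rewards per action.
q, counts = run_bandit(rewards=[0.0, 1.0])
print(q, counts)
```

The agent starts with no knowledge or data, discovers the better action only by trying it, and then chooses it most of the time. Realistic tasks have many states and noisy rewards, which is why training takes so many iterations.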
An illustration of reinforcement learning from the healthcare sector:
1. We use training data labelled with the correct diagnosis (Disease/Normal), and the machine learning algorithm is built on this data.
2. We load the new x-ray image (data) on this system and, based on past learning, the model predicts the condition of the patient.
3. Simultaneously, the doctor also diagnoses the patient’s condition by looking at the same x-ray and gives feedback: “Correctly diagnosed by ML” or “Incorrectly diagnosed by ML”.
4. The feedback (or rewards) from the doctor makes the algorithm better for future diagnosis, to a point where doctor intervention would be minimal.
Comparison of Different Machine Learning Methods Commonly Used in Pharmaceutical Research
DEEP LEARNING (DL), also called hierarchical learning or deep structured learning
• A sophisticated approach to machine learning.
• DL includes larger numbers of hidden layers (usually more than three), and each layer comprises many more nodes. Therefore, DL uses multiple levels of representation that can ultimately learn very complex functions.
• Deep learning requires minimal human intervention.
• DL requires more time to set up, but once set up it can generate results almost instantaneously.
• DL employs neural networks and is built to accommodate large volumes of unstructured data.
• Neural Networks (NN), or more precisely Artificial Neural Networks (ANN), are a class of machine learning algorithms that have recently received a lot of attention (again!) due to the availability of Big Data and fast computing facilities.
• A biologically inspired computational model.
• An ANN is capable of simulating the neurological processing ability of the human brain.
Neurons are responsible for decision making.
ANNs consist of artificial neurons or processing elements (PEs) that are connected via coefficients (weights).
The first layer of an artificial neural network is the input layer, which corresponds to the dendrites of the biological neuron and transfers information to the next layer.
The hidden layer connects the input layer and the output layer through certain coefficients (weights).
The number of hidden neurons is chosen as the one that gives the highest correlation coefficient (r) and the lowest error.
Adjusting the weights is called learning or training, and it is done with the error back-propagation method.
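Error back-propagation can be sketched on a tiny network with two inputs, two hidden neurons and one output. This is a minimal illustration, not a production implementation: the starting weights, the learning rate, the number of epochs and the training task (the logical AND of two inputs) are all arbitrary assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> hidden layer -> output neuron."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output

def mean_squared_error(data, params):
    return sum((forward(x, *params)[1] - y) ** 2 for x, y in data) / len(data)

def train(data, params, lr=0.5, epochs=2000):
    """Update the weights sample by sample with error back-propagation."""
    w_hidden, b_hidden, w_out, b_out = params
    for _ in range(epochs):
        for x, y in data:
            hidden, out = forward(x, w_hidden, b_hidden, w_out, b_out)
            # Error signal at the output neuron (squared error, sigmoid unit).
            delta_out = (out - y) * out * (1 - out)
            # Propagate the error back to each hidden neuron.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1 - hidden[j])
                            for j in range(len(hidden))]
            for j in range(len(hidden)):
                w_out[j] -= lr * delta_out * hidden[j]
                b_hidden[j] -= lr * delta_hidden[j]
                for i in range(len(x)):
                    w_hidden[j][i] -= lr * delta_hidden[j] * x[i]
            b_out -= lr * delta_out
    return w_hidden, b_hidden, w_out, b_out

# Toy training set (assumed for illustration): logical AND of two inputs.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
params = ([[0.5, -0.4], [0.3, 0.8]], [0.0, 0.0], [0.7, -0.6], 0.0)

initial_loss = mean_squared_error(data, params)
params = train(data, params)
final_loss = mean_squared_error(data, params)
print(initial_loss, final_loss)
```

Comparing the error before and after training shows the effect of learning. In practice the same comparison would be repeated for different numbers of hidden neurons to find the one with the lowest error.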
Key differences: Machine Learning vs Deep Learning
DEEP LEARNING
• Needs Big Data
• Requires a machine with a GPU
• Need to understand the basic functionality of data
• Data training and processing time is long
• Very few algorithms
• Difficult to interpret the data
MACHINE LEARNING
• Performs well on small and medium datasets
• Works well on a low-end machine
• Training and processing time is short
• Many algorithms
• Easy to interpret the data
APPLICATIONS OF MACHINE LEARNING IN PHARMACEUTICAL SCIENCES
Machine learning in drug design and discovery
Pharmacology: PK/PD modeling, IVIVC, clinical pharmacokinetics
Pharmaceutical product development: preformulation/formulation and process development
Diagnosis, clinical practice and research: early detection of disease and its complications, personalization of patient care
Pharma marketing: forecasting market sales and insurance
Machine Learning in Drug Design and Discovery
Physico-chemical properties of a compound are correlated with the corresponding chemical or biological activities.
The high-throughput screened data were subjected to filtration based on drug-likeness, ADMET analysis, and toxicity.
AI / QSAR / ML
Molecular docking and molecular dynamics simulation studies.
The final predicted compounds were visualized for binding energy calculations and active site identification.
The final compound was identified and underwent in vitro and in vivo experimental studies for validation.
Predicting disease-related complications
National University Hospital, Singapore: electronic health records of the diabetic population
Drs. Kee-Yuan Ngiam and C. N. Lee
Deep machine learning
AI/ ML in Clinical Trial
Machine Learning in Pharmaceutical Preformulation
Molecular descriptors of compounds and experimental conditions were employed as inputs, and complexation free energy as the output.
Machine Learning in Pharmaceutical Formulations
“We are now solving problems with machine learning and AI that were… in the realm of science fiction for the last several decades.”
“Generalized AI is worth thinking about because it stretches our imaginations and it gets us to think about our core values and issues of choice.”
AI/ML is the main tool behind new-age innovations and discoveries like driverless cars or disease-detecting algorithms.
Big pharma’s AI initiatives
