
Joint All India Council for Technical Education (AICTE) – Gujarat Technological University

Sponsored One Week e-Faculty Development Programme on “A Future of Artificial Intelligence in Healthcare System”, March 07-12, 2022

Machine Learning Driven Pharmaceutical Product Development

Presented by:
Dr. Vaishali T. Thakkar
Professor and Research Coordinator
(Pharmaceutics Department)
Anand Pharmacy College
Disclaimer:

The information given in this presentation has been compiled from various sources believed to be reliable.
All cited examples should serve as a guideline.
The contents have been drawn from public resources.
They are used for educational purposes only.
Prior permission shall be obtained if the contents of this presentation are to be used by anyone.

BUZZ WORD OF PHARMA INDUSTRY 4.0

The domain of healthcare has always been flooded with a huge amount of complex data, coming in at a very fast pace.
Disciplines of Data Science and Their Relationships

Sources of big data

According to the International Data Corporation (IDC), the volume of healthcare data grew from 153 exabytes to 2,314 exabytes in 2020, roughly a 15-fold increase (about 1,400%).

Data mining is an integral part of Big Data analytics.


Data Mining and Stages of Knowledge Discovery in Databases (KDD)
• Data mining is the process of extracting information (non-trivial, implicit, previously unknown, and potentially useful) from a large database.
• It is also known by other names such as knowledge extraction, data-pattern analysis, information harvesting, and knowledge discovery (KD).
Stages:
1. Selection: gathering data from heterogeneous sources.
2. Pre-processing: eliminating noisy data, handling missing data, and detecting or removing outliers.
3. Transformation: aggregation, smoothing, normalization, generalization, and discretization.
4. Data mining: choosing the data mining algorithm that extracts beneficial information.
5. Evaluation: the outcomes are assessed with statistical justification and significance testing.
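The pre-processing and transformation stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the values are made up, missing entries are filled with the median, outliers are dropped with the common 1.5 x IQR rule, and min-max scaling stands in for normalization. None of these specific choices comes from the presentation.

```python
from statistics import median, quantiles

def fill_missing(values):
    """Pre-processing: replace missing entries (None) with the median of the known ones."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]

def drop_outliers(values, k=1.5):
    """Pre-processing: keep points within k * IQR of the quartiles (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if low <= v <= high]

def min_max_normalize(values):
    """Transformation: rescale values to the range 0..1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical assay readings with one missing value and one obvious outlier.
raw = [98.0, 101.0, None, 99.5, 250.0, 100.5]

filled = fill_missing(raw)
cleaned = drop_outliers(filled)
scaled = min_max_normalize(cleaned)

print(cleaned)
print([round(v, 2) for v in scaled])
```

Keeping each stage as its own function mirrors the KDD sequence, so a different imputation or scaling rule can be swapped in without touching the other steps.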
Descriptive vs. Predictive Analytics

Definition
  Descriptive: process of finding useful information by analysis of huge data.
  Predictive: process of forecasting useful information by analysis of huge data.
Process
  Descriptive: data aggregation and data mining.
  Predictive: statistics-based forecasting methods.
Accuracy
  Descriptive: provides accurate results, because it is based on past data.
  Predictive: does not guarantee accurate results, because forecasts are estimates.
Approach
  Descriptive: reactive.
  Predictive: proactive.

MOVING FROM DESCRIPTIVE TO PREDICTIVE AND PRESCRIPTIVE ANALYTICS

Breakdowns of industrial development and the great changes in
related categories

Relation between different methodologies in Data Mining

Data mining is an integral part of Big Data analytics.

Regression Analysis: Multiple Linear Regression (MLR)

Multivariate mathematical and statistical methods are used to handle large data sets, so that data is converted into information.

MLR works well under the following conditions:

1. When the factors are few in number (the data set has more rows than columns).

2. When the factors are not significantly correlated (collinear), as checked with the correlation matrix (“r-matrix”).

3. When the factors have a well-understood relationship to the responses (Y).
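For reference, the MLR model behind these three conditions can be written in standard textbook notation. The symbols below (n observations, p factors, design matrix X) are assumptions introduced for illustration and do not appear in the presentation.

```latex
y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i,
\qquad i = 1, \dots, n

\hat{\boldsymbol{\beta}} = \left( X^{\top} X \right)^{-1} X^{\top} \mathbf{y}
```

The least-squares estimate exists only when X^T X can be inverted. That requires more rows than columns (condition 1) and no exactly collinear factors (condition 2). Near-collinear factors make the inverse unstable, which is why the correlation matrix is checked first.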

Multivariate analysis
When multicollinearity exists in the data, there is a chance that the results will be misinterpreted.

Multicollinearity is a statistical phenomenon in which there is an exact or near-exact linear relationship between the independent variables (IVs).

Residuals are NOT randomly distributed.

In such cases, PCA and PLS are useful.
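One simple way to screen for multicollinearity is to compute the pairwise correlation matrix of the factors before fitting MLR. The sketch below assumes made-up factor names and values, and uses |r| > 0.9 as the flagging threshold. Both are illustrative assumptions rather than anything prescribed in the presentation.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r_matrix(columns):
    """Correlation of every factor with every other factor, keyed by name pair."""
    names = list(columns)
    return {(a, b): pearson_r(columns[a], columns[b]) for a in names for b in names}

# Hypothetical formulation factors; binder_mg is just binder_pct rescaled,
# so the two columns carry the same information (perfectly collinear).
factors = {
    "binder_pct": [1.0, 2.0, 3.0, 4.0, 5.0],
    "binder_mg": [10.0, 20.0, 30.0, 40.0, 50.0],
    "mixing_min": [5.0, 3.0, 6.0, 2.0, 4.0],
}

r = r_matrix(factors)

# Flag each distinct pair once (a < b) when the correlation is very strong.
THRESHOLD = 0.9  # assumed cut-off
flagged = [(a, b) for (a, b), value in r.items() if a < b and abs(value) > THRESHOLD]
print(flagged)
```

A flagged pair suggests dropping one of the two factors, or switching to PCA or PLS, before interpreting MLR coefficients.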

Residual Plots
Normal probability plot

What is checked? Whether the residuals are normally distributed.

Ideal: the residuals are normally distributed, with all the points on or near the line.

Assumptions of MLR are violated: the residuals are NOT randomly distributed.

When the data do not meet the basic criteria for performing MLR, PCA and PLS are useful.


Principal Component Analysis (PCA) and Partial Least Squares (PLS)

There are two main types of multivariate analysis (MVA):
• Principal Component Analysis (PCA): uses the X's only.
• Partial Least Squares (PLS): uses the X's and Y.

Partial least squares (PLS) is a method for constructing predictive models when the factors are many and highly collinear.

In PLS, the emphasis is on predicting the responses and not necessarily on trying to understand the underlying relationship between the variables.
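To make "X's only" concrete, here is a minimal sketch of PCA on two made-up, highly collinear factors. It finds the first principal component of the covariance matrix by power iteration, which is one simple technique chosen here for brevity. The data and function names are assumptions, and PLS (which also uses Y) is not shown.

```python
from math import sqrt

def center(rows):
    """Subtract the column mean from every column."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iterations=200):
    """Power iteration: repeatedly multiply by cov and renormalize."""
    p = len(cov)
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue

# Hypothetical data: the second factor is roughly twice the first.
X = [[2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8], [6.0, 12.0]]

Xc = center(X)
cov = covariance(Xc)
pc1, var1 = first_component(cov)
explained = var1 / (cov[0][0] + cov[1][1])          # share of total variance
scores = [sum(a * b for a, b in zip(row, pc1)) for row in Xc]

print([round(x, 3) for x in pc1], round(explained, 4))
```

Because the two factors are nearly collinear, a single component carries almost all of the variance, and the one-column `scores` can replace both original columns.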

Artificial intelligence (AI) is a branch of computer science.
Its key objective is the development of machines whose cognitive functions (perception, learning, problem-solving, and decision making) mimic or exceed those of humans.
The aim is for machines to perform tasks the way human intelligence does.

Thinking humanly: Siri, Alexa and other smart assistants
Thinking rationally: self-driving cars
Acting humanly: robo-advisors
Acting rationally: Netflix's recommendations

AI systems are powered by machine learning; some of them are powered by deep learning.
Types of Artificial Intelligence

Type 1: AI Based on Functionality

Reactive AI (no memory power)
• Basic and oldest type of AI.
• Replicates a human's ability to react to different kinds of stimuli.
• No memory power.
• Example: IBM's chess-playing supercomputer.

Limited Memory (reactive ability and memory power)
• Has reactive ability and memory.
• Past information is used for better future decisions.
• Example: self-driving cars.

Theory of Mind (work-in-progress stage)
• Research lab level.
• Deep understanding of human minds: needs, likes, emotions, thought process.
• Example: a bellhop robot for hotels that understands the demands of the customer.

Self-awareness (hypothetical stage)
• Hypothetical stage.
• Seen in science fiction movies.
• Has internal “thoughts”.

Type 2: AI Based on Capability

Narrow (weak) AI
• Performs goal-oriented tasks.
• Examples: face recognition, driving a car, speech recognition/voice assistants, or browsing the Internet.
• Performs all tasks precisely.

General AI
• Strong AI/Deep AI.
• Machine general intelligence (mimics human behavior), capable of learning and applying this knowledge to solve any problem.

Super AI
• Better than humans in areas like math, sports, science, medicine, art, hobbies, emotional relationships, or simply everything.
• Super computer.

Machine learning (ML)
Machine learning (ML) is a branch of AI in which, based on a training dataset that is first provided, the computer develops its own logic for answering future questions.

• Machine learning does not depend on explicitly programmed rules; it depends on data.

• The goal of machine learning is to produce accurate predictions on new, unseen data after training on a dataset.

• “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.” (Tom M. Mitchell)

Process of Machine learning

1. Gathering data from various sources
2. Cleaning the data to make it homogeneous
3. Selecting the right ML algorithm and building the model
4. Gaining insights from the model's results
5. Transforming the results into visual graphs

Types of Learning

Supervised Learning
• Precise mapping between input and output data.
• The data set is labeled.
• The algorithm identifies the relationship between the two variables and can then predict a new outcome.
• Solves classification and regression problems.
• Algorithms: Naive Bayes, KNN, SVM, Logistic Regression, Decision Tree, Linear Regression, Random Forest, etc.

Regression: a regression problem is one in which the output variable is a real value or numerical data, such as hardness or dissolution data.
Examples: risk assessment and score prediction.

Classification: a classification problem is one in which the output variable is categorical, such as “disease” and “no disease”.
Examples: diagnosis, image analysis, and fraud detection.
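The regression case can be illustrated with a one-factor least-squares fit. This is a sketch under stated assumptions: the compression force and tablet hardness values are invented for the example, and the variable names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for a single input: returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return slope, intercept

# Labeled training data (made up): compression force in kN -> tablet hardness in N.
force_kn = [5.0, 10.0, 15.0, 20.0, 25.0]
hardness_n = [42.0, 61.0, 79.0, 101.0, 118.0]

slope, intercept = fit_line(force_kn, hardness_n)

# Predict a numerical outcome for an unseen input.
new_force = 12.0
predicted_hardness = intercept + slope * new_force
print(round(slope, 2), round(intercept, 2), round(predicted_hardness, 2))
```

The output is a real number, which is what makes this a regression problem. A classification model would return a category instead.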
Illustration of Supervised Learning:

Step 1: An algorithm is trained on Hb levels and the corresponding output, either anaemic or non-anaemic, using labelled data.

Step 2: Input data for patients with their Hb levels is fed into the algorithm. The algorithm analyses the patient's data against the Step 1 inputs.

Step 3: When new data is entered, the machine recognizes the Hb level and generates a report stating whether or not the patient is suffering from anaemia.
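The Hb illustration can be sketched as a one-feature classifier that learns a cut-off from labelled examples. The Hb values, the labels, and the learned cut-off below are made up purely for illustration and are not clinical criteria. The threshold search is just one simple learning rule, assumed here for brevity.

```python
def train_threshold(hb_levels, labels):
    """Pick the Hb cut-off that misclassifies the fewest training samples.

    Learned rule: Hb below the cut-off -> "anaemic".
    """
    points = sorted(set(hb_levels))
    # Candidate cut-offs are the midpoints between neighbouring Hb values.
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]

    def errors(cut):
        return sum((hb < cut) != (label == "anaemic")
                   for hb, label in zip(hb_levels, labels))

    return min(candidates, key=errors)

def predict(cutoff, hb):
    """Report for a new patient based on the learned cut-off."""
    return "anaemic" if hb < cutoff else "non-anaemic"

# Step 1: labelled training data (hypothetical Hb values in g/dL).
train_hb = [8.5, 9.8, 10.9, 11.4, 12.6, 13.2, 14.1, 15.0]
train_labels = ["anaemic"] * 4 + ["non-anaemic"] * 4

# Step 2: the algorithm analyses the labelled data.
cutoff = train_threshold(train_hb, train_labels)

# Step 3: new, unseen patients.
print(cutoff, predict(cutoff, 9.0), predict(cutoff, 14.0))
```

The cut-off is never typed in by hand; it is derived from the labelled data, which is the defining feature of supervised learning.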

Types of Learning

Unsupervised Learning
• An ML technique for finding patterns in data in an exploratory manner.
• The data is not labeled, which means only the input variables (X) are given, with no corresponding output variables.
• Evaluation is qualitative, and the method does not predict anything.
• Solves clustering and dimensionality reduction problems.

Clustering: grouping of similar data into groups or clusters.
Examples: K-Means, K-Means++, K-Medoid, etc.

Dimensionality reduction: compression of the data to reduce its complexity without altering its structure.
Examples: Principal Component Analysis, PLS

Illustration of Unsupervised Learning: Spread of Zika Virus

01. Input data is entered for patients suffering from Zika virus from various locations of India.
02. The machine algorithm analyses the data and clusters it into coastal-region patients and inland-region patients.
03. Based on the clustering density, we can identify where the Zika virus has spread the most, and awareness campaigns can be launched in the concerned regions.
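A minimal k-means sketch of this clustering idea follows. It assumes each patient record is reduced to a single made-up number, the distance of the patient's location from the coast in km. It also assumes the initial cluster centres are placed at the smallest and largest values. Both simplifications are illustrative and not part of the presentation.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Plain k-means on one-dimensional data (requires k >= 2)."""
    lo, hi = min(values), max(values)
    # Deterministic start (assumed): spread the centres evenly from min to max.
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centres[c]))
            clusters[nearest].append(v)
        # Update step: each centre moves to the mean of its members.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres, clusters

# Hypothetical, unlabeled input: distance from the coast (km) for nine patients.
distance_km = [2.0, 5.0, 8.0, 12.0, 15.0, 180.0, 210.0, 240.0, 260.0]

centres, clusters = kmeans_1d(distance_km, k=2)
for centre, members in zip(centres, clusters):
    print(round(centre, 1), len(members))
```

No labels such as "coastal" or "inland" are supplied; the two groups emerge from the data, and the cluster sizes give the density used to prioritise regions.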
Types of Learning

Reinforcement Learning
• The algorithm learns from its mistakes (reward-based learning).
• The solution to the problem is found through a process of trial and error.
• The most important task is to provide the simulated environment.
• It does not require pre-existing knowledge or data.
• It requires a lot of training time and a huge number of iterations to learn tasks.
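The points above can be sketched with tabular Q-learning in a toy simulated environment. Everything here is an assumption made for illustration: a five-cell corridor with a reward only at the last cell, purely random exploration, and the chosen learning rate and discount factor.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # index 0 = step left, index 1 = step right

def step(state, action):
    """Simulated environment: move, stay inside the corridor, reward at the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # no prior knowledge: all zeros
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(2)                 # trial and error: act at random
            nxt, reward = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best future value.
            target = reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q

q = q_learning()
# Greedy policy read off the learned table for the non-goal cells.
policy = ["left" if row[0] > row[1] else "right" for row in q[:-1]]
print(policy)
```

The agent starts with no data at all and needs hundreds of episodes even for this tiny corridor, which reflects the training-time cost noted in the list.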

An illustration from the healthcare sector:

1. We use training data labelled with the correct diagnosis (Disease/Normal), and the machine learning algorithm is built on this data.
2. We load the new x-ray image (data) into this system and, based on past learning, the model predicts the condition of the patient.
3. Simultaneously, the doctor also diagnoses the patient's condition by looking at the same x-ray and gives feedback: “Correctly diagnosed by ML” or “Incorrectly diagnosed by ML”.
4. The feedback (or rewards) from the doctor makes the algorithm better at future diagnoses, to the point where doctor intervention would be minimal.
Comparison of Different Machine Learning Methods Commonly Used in Pharmaceutical Research

DEEP LEARNING (DL), also called hierarchical learning or deep structured learning

• A sophisticated approach to machine learning.

• DL includes larger numbers of hidden layers (usually more than three), and each layer comprises many more nodes. Therefore, DL uses multiple levels of representation that can ultimately learn very complex functions.

• Deep learning requires minimal human intervention.

• DL requires more time to set up, but once trained it generates results almost instantaneously.

• DL employs neural networks and is built to accommodate large volumes of unstructured data.

• Neural networks (NN), or more precisely artificial neural networks (ANN), are a class of machine learning algorithms that have recently received a lot of attention (again!) due to the availability of Big Data and fast computing facilities.
• An ANN is a biologically inspired computational model.
• An ANN is capable of simulating the neurological processing ability of the human brain.

Neurons are responsible for decision making.

ANNs consist of artificial neurons, or processing elements (PEs), that are connected via coefficients (weights).

The first layer of an artificial neural network is the input layer, which corresponds to the dendrites of the biological neuron and transfers information to the next layer.

The hidden layer connects the input and output layers through certain coefficients (weights).

The number of hidden neurons is chosen so that the network gives the highest correlation coefficient (r) and the lowest error.

This process is called learning, or training, and is carried out with the error back-propagation method.
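A minimal sketch of error back-propagation follows, assuming a tiny network with two inputs, two hidden neurons, and one output neuron, all using sigmoid activations. The starting weights, learning rate, and the logical-OR training data are arbitrary choices made for illustration.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def forward(x, w_h, b_h, w_o, b_o):
    """Input layer -> hidden layer -> output neuron."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(w_h[j], x)) + b_h[j])
              for j in range(len(w_h))]
    out = sigmoid(sum(w * h for w, h in zip(w_o, hidden)) + b_o)
    return hidden, out

def mse(data, w_h, b_h, w_o, b_o):
    """Mean squared error of the network over the data set."""
    return sum((forward(x, w_h, b_h, w_o, b_o)[1] - y) ** 2 for x, y in data) / len(data)

def train(data, w_h, b_h, w_o, b_o, epochs=2000, lr=0.5):
    """Adjust the weights sample by sample using back-propagated errors."""
    for _ in range(epochs):
        for x, y in data:
            hidden, out = forward(x, w_h, b_h, w_o, b_o)
            # Error signal at the output, then propagated back to each hidden neuron.
            delta_o = (out - y) * out * (1 - out)
            delta_h = [delta_o * w_o[j] * hidden[j] * (1 - hidden[j])
                       for j in range(len(hidden))]
            for j in range(len(hidden)):
                w_o[j] -= lr * delta_o * hidden[j]
                for i in range(len(x)):
                    w_h[j][i] -= lr * delta_h[j] * x[i]
                b_h[j] -= lr * delta_h[j]
            b_o -= lr * delta_o
    return w_h, b_h, w_o, b_o

# Toy training set (assumed): the logical OR of two inputs.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

# Arbitrary fixed starting weights so the run is repeatable.
params = ([[0.5, -0.4], [0.3, 0.8]], [0.0, 0.0], [0.2, -0.6], 0.0)

loss_before = mse(data, *params)
params = train(data, *params)
loss_after = mse(data, *params)
print(round(loss_before, 4), round(loss_after, 4))
```

Comparing `loss_before` with `loss_after` shows the "lowest error" criterion in action. Repeating the run with different numbers of hidden neurons is one way to choose the network size.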

Key Differences: Machine Learning vs Deep Learning

DEEP LEARNING
• Needs big data.
• Requires a machine with a GPU.
• Needs an understanding of the basic functionality of the data.
• Data training and processing time is long.
• The number of algorithms is very small.
• Difficult to interpret the data.

MACHINE LEARNING
• Performs well on small and medium datasets.
• Works well on a low-end machine.
• Training and processing time is short.
• The number of algorithms is large.
• Easy to interpret the data.
APPLICATIONS OF MACHINE LEARNING IN PHARMACEUTICAL SCIENCES

• Machine learning in drug design and discovery
• Pharmacology: PK/PD modeling, IVIVC, clinical pharmacokinetics
• Pharmaceutical product development: preformulation/formulation and process development
• Diagnosis, clinical practice and research: early detection of disease and its complications, personalization of patient care
• Pharma marketing: forecasting market sales and insurance
Machine Learning in Drug Design and Discovery
Physico-chemical properties of a compound are correlated with the corresponding chemical or biological activities.

• The high-throughput screening data were filtered based on drug-likeness, ADMET analysis, and toxicity.
• AI/QSAR/ML
• Molecular docking and molecular dynamics simulation studies.
• The final predicted compounds were visualized for binding energy calculations and active site identification.
• The final compound was identified and underwent in vitro and in vivo experimental studies for validation.
Predicting Disease-Related Complications
Drs. Kee-Yuan Ngiam and C. N. Lee, National University Hospital, Singapore: deep machine learning on electronic health records of the diabetic population.


AI/ML in Clinical Trials

Machine Learning in Pharmaceutical Preformulation

Molecular descriptors of compounds and experimental conditions were employed as inputs, while complexation free energy was the output.
Machine Learning in Pharmaceutical Formulations

AI/ML is the main tool behind new-age innovations and discoveries like driverless cars or disease-detecting algorithms.

We are now solving problems with machine learning and AI that were… in the realm of science fiction for the last several decades.

Generalized AI is worth thinking about because it stretches our imaginations and it gets us to think about our core values and issues of choice.
Big pharma’s AI initiatives
