Professional Documents
Culture Documents
Predicting Oral Desintegrating
Predicting Oral Desintegrating
999078, China
a r t i c l e i n f o a b s t r a c t
Article history: Oral disintegrating tablets (ODTs) are a novel dosage form that can be dissolved on the
Received 25 October 2017 tongue within 3 min or less especially for geriatric and pediatric patients. Current ODT for-
Accepted 15 January 2018 mulation studies usually rely on the personal experience of pharmaceutical experts and
Available online 2 February 2018 trial-and-error in the laboratory, which is inefficient and time-consuming. The aim of cur-
rent research was to establish the prediction model of ODT formulations with direct com-
Keywords: pression process by artificial neural network (ANN) and deep neural network (DNN) tech-
Oral disintegrating tablets niques. 145 formulation data were extracted from Web of Science. All datasets were divided
Formulation prediction into three parts: training set (105 data), validation set (20) and testing set (20). ANN and
Artificial neural network DNN were compared for the prediction of the disintegrating time. The accuracy of the ANN
Deep neural network model have reached 85.60%, 80.00% and 75.00% on the training set, validation set and testing
Deep-learning set respectively, whereas that of the DNN model were 85.60%, 85.00% and 80.00%, respec-
tively. Compared with the ANN, DNN showed the better prediction for ODT formulations.
It is the first time that deep neural network with the improved dataset selection algorithm
is applied to formulation prediction on small data. The proposed predictive approach could
evaluate the critical parameters about quality control of formulation, and guide research
and process development. The implementation of this prediction model could effectively
reduce drug product development timeline and material usage, and proactively facilitate
the development of a robust drug product.
✩
Note that Run Han and Yilong Yang made equal contributions to this paper.
∗
Corresponding author. University of Macau, Avenida da universidade, Taipa, Macau 999078, China. Tel.: +853 88224514.
E-mail address: defangouyang@umac.mo (D. Ouyang).
Peer review under responsibility of Shenyang Pharmaceutical University.
https://doi.org/10.1016/j.ajps.2018.01.003
1818-0876/© 2018 Shenyang Pharmaceutical University. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND
license. (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Asian Journal of Pharmaceutical Sciences 13 (2018) 336–342 337
ceutical research, the prediction models were developed for “tablets” with 84, respectively. Among these results, only re-
break force and disintegration of tablet formulation by ANN, search articles were selected for further data extraction. After
genetic algorithm, support vector machine and random for- the manual screening, 145 direct compressed ODT formula-
est approaches [21]. Another ANN example was quantitative tions with the disintegrating time were extracted including
structure activity relationships (QSAR) of antibacterial activ- 23 active pharmacological ingredients (APIs) groups for our
ity study [22,23]. DNN is a type of representation learning with model, as shown in Table S1. All APIs were described as nine
multiple levels of neural networks. Unlike the traditional ANN molecular parameters, including molecular weight, XLogP3,
with manual feature extraction, deep-learning can automati- hydrogen bond donor count, hydrogen bond acceptor count,
cally extract feature and even transform low-level representa- rotatable bond count, topological polar surface area, heavy
tion to more abstract level without any feature extractor [24]. atom count, complexity and logS. According to the function
Moreover, deep-learning is more sensitive to irrelevant and of excipients, all excipients were divided into five categories:
particular minute variations with complicated parameters of filler, binder, disintegrant, lubricant, and solubilizer. Each type
the network, which could reach higher accuracy rather than of excipient was individually coded for further training. The
the conventional machine learning algorithms [19]. In recent formulation data included API molecular descriptors and its
years, DNN has been applied in pharmacy research, such as amount, the type of encoded excipients and its amount, man-
drug design, drug-induced liver injury and virtual screening ufacture parameters (e.g. the hardness, friability, thickness
[25]. In most cases, deep-learning could generate a novel and and tablet diameter) and the disintegrating time of each for-
complex system to represent various objects through molec- mulation.
ular descriptor so that it would be very helpful for drug dis-
covery and prediction [26]. Junshui Ma et al. extracted data 2.2. Dataset classification: Training set, validation set
from internal Merck data and included on-target and absorp- and testing set
tion, distribution, metabolism, excretion (ADME), each molec-
ular was described as serious features. Finally, they used deep To ensure good prediction ability of computational model,
neural nets to evaluate QSAR and the result was better than especially in the small amount of pharmaceutical data, the
random forest commonly used [27]. dataset should be carefully divided into three parts, including
The aim of current research was to establish the quanti- training set, validation set and testing set. The three datasets
tative prediction model of the disintegrating time of ODT for- strategy is an effective way to test the accuracy on new data
mulations with direct compression process by ANN or DNN. out of our datasets. In details, the training set is for training
model and the validation set is used for adjusting the param-
eters and finding the best model, while testing set shows the
2. Methodology prediction accuracy on real unknown data from the datasets,
as shown in Fig. 3 . Therefore, how to select data for three
2.1. Data extraction datasets appropriately is the key step. Compared with random
selection, manual selection and maximum dissimilarity algo-
Formulation data collection was the foundation of building rithm selection, the improved maximum dissimilarity algo-
the prediction model. To ensure the data reliability, the key- rithm (MD-FIS) is the best choice. MD-FIS is based on the max-
word search strategy was used in Web of Science database. imum dissimilarity algorithm considering small group data
The synonym strings of keywords were used, such as “oral” + in the whole dataset, it will avoid selecting data mostly from
“disintegrating” + “tablets” with 461 results, “fast” + “disinte- small group and ensure the representation of validation and
grating” + “tablets” with 407 results, “rapidly” + “disintegrat- test set.
ing” + “tablets” with 266 results, and “oral” + “dispersible” +
Asian Journal of Pharmaceutical Sciences 13 (2018) 336–342 339
Fig. 4 showed the label (true) value and predictive value of dis-
The prediction model for ODTs was trained by ANN and
integrating time on ANN model (A. training set; B. validation
DNN, respectively. In the training process, all data are nor-
set; C. testing set), while indicated the true value and predic-
malized and then divided into three sets with our previous
tive value of disintegrating time on DNN model (D. training set;
proposed MD-FIS selection algorithm in R language. For ANN
E. validation set; F. testing set). As shown in Fig. 4, the training
and DNN networks, Deeplearning4j machine learning frame-
set and validation set of both ANN and DNN showed good re-
work (https://deeplearning4j.org/ ) was used to train predic-
sults. As Table 1 shows, the predictive accuracy of ANN model
tion models. All the source codes can be found on the web-
is 85.60% on training set and 80.00% on validation set, while
site (http://ml.mydreamy.net/pharmaceutics/ODT.html). The
the DNN model is 85.60% and 85.00%, respectively. However,
ANN model in Fig. 1 with termination condition at 15,000
the testing set of ANN with only 75.00% accuracy is lower than
epochs and hidden nodes is 200. The deep-learning process in
that of DDN (80.00%), which indicated that DNN is able to sig-
Fig. 2 uses full-connected deep feedforward networks includ-
nificantly better predict real unknown data than ANN.
ing ten layers with 2000 epochs. This neural network contains
As the result shows, ANN is an efficient network for train-
50 hidden nodes on each layer. All networks choose tanh as
ing prediction model within the adjustment of validation set,
the activation function except the last layer with sigmoid ac-
reaching a high accuracy on training set and validation set.
tivation function. Learning rate is set to 0.01. Batch gradient
However, when predicting real unknown data, the accuracy
descent with the 0.8 momentum is used for training the net-
of testing set dropped significantly, which is called overfitting
works.
in machine learning. DNN performs well in all three datasets
Note that epoch indicates how many times the dataset is
with over 80% accuracy and predicted stably with average
used for training. Feedforward network means that the output
value, which is more capable of establishing a better predic-
of the network is computed layer-by-layer from one-direction
tion model for ODT than ANN.
without any inside loop. Learning rate impacts how fast the
When analyzing the different network structures between
network will be convergent. Batch gradient descent is a train-
ANN and DNN, ANN just includes one hidden layer, while DNN
ing strategy to use all datasets to train the model at each time.
includes ten layers with 2000 epochs and each layer contains
Momentum indicates how much the speed will be kept in each
50 hidden nodes. Thus, DNN could extract the feature of data
training step.
with higher level and give a more accurate predictive result. It
2.4. Pharmaceutical evaluation criterion is unsurprising that DNN, as an innovative and effective tech-
nique for pharmaceutical research, can provide a higher accu-
European Pharmacopeia defined that ODT could disintegrate racy prediction about disintegrating time than ANN. Thus, the
within 3 min in the mouth before being swallowed. In all our desired DNN with the proposed MD-FIS selection algorithm
formulation data, the disintegrating time ranges from 0 s to can be used to achieve good predictive results on pharmaceu-
100 s. Usually, the successful prediction in pharmaceutics is tical formulations with small data.
that absolute error is less than 10%. Thus, a good model is that In order to ensure a satisfied prediction accuracy, two key
the prediction deviation of the disintegrating time is not more factors are to be considered: data and algorithm. The first
than 10 s. The accuracy of prediction disintegrating time is the issue is the reliable data in pharmaceutical research. Deep-
percentage of successful prediction to total predictions: learning attempts to learn these characteristics to make better
representations and create models from reliable data. Thus,
Number f − f ≤ 10 data extraction is a critical step. In current research, reliable
AccuracyPDT = ll Predictions
A formulation datasets were manually extracted and labeled
from the research articles of Web of Science by experienced
where, f is the prediction value, f is the label (real) value. All
pharmaceutical scientists.
predictions are the number of predicted data.
340 Asian Journal of Pharmaceutical Sciences 13 (2018) 336–342
Fig. 4 – The true value and predictive value on dataset: (A) the true value and predictive value of training set and (B)
validation set and (C) testing set on ANN model. (D) the true value and predictive value of training set and (E) validation set
and (F) testing set on DNN model.
Asian Journal of Pharmaceutical Sciences 13 (2018) 336–342 341
MYRG2016-00040-ICMS-QRCM), Macau Science and Technol- [14] Suñé-Negre J, Roig-Carreras M, Fuster-García R, et al. Nueva
ogy Development Fund (FDCT) (Grant No. 103/2015/A3) and metodología de preformulaciÓn galénica para la
the National Natural Science Foundation of China (Grant No. caracterizaclÓn de sustancias en relaciÓn a su viabilidad
oara la comoresiÓn: Diaqrama SeDeM. Cienc Tecnol Pharm
61562011).
2005;15:125–36.
[15] Suñé-Negre JM, Pérez-Lozano P, Miñarro M, et al. Application
of the SeDeM diagram and a new mathematical equation in
Appendix. Supplementary material the design of direct compression tablet formulation. Eur J
Pharm Biopharm 2008;69:1029–39.
Supplementary data to this article can be found online at doi: [16] Aguilar-Díaz JE, García-Montoya E, Suñe-Negre JM,
10.1016/j.ajps.2018.01.003. Pérez-Lozano P, Miñarro M, Ticó JR. Predicting orally
disintegrating tablets formulations of ibuprophen tablets: an
application of the new SeDeM-ODT expert system. Eur J
references Pharm Biopharm 2012;80:638–48.
[17] Aguilar-Díaz JE, García-Montoya E, Pérez-Lozano P,
Suñé-Negre JM, Miñarro M, Ticó JR. SeDeM expert system a
new innovator tool to develop pharmaceutical forms. Drug
[1] Bhowmik D, Chiranjib B, Krishnakanth P, Chandira RM. Fast
Dev Ind Pharm 2014;40:222–36.
dissolving tablet: an overview. J Chem Pharm Res
[18] Hopfield JJ. Neural networks and physical systems with
2009;1:163–77.
emergent collective computational abilities. Spin glass
[2] Bandari S, Mittapalli RK, Gannu R. Orodispersible tablets: an
theory and beyond: an introduction to the replica method
overview. Asian J Pharm 2014;2.
and its applications. World Scientific; 1987. p. 411–15.
[3] Lindgren S, Janzon L. Dysphagia: prevalence of swallowing
[19] Schmidhuber J. Deep learning in neural networks: an
complaints and clinical finding. Med Clin North Am
overview. Neural Netw 2015;61:85–117.
1993;77:3–5.
[20] Rost B, Sander C. Combining evolutionary information and
[4] Dutta S, De PK. Formulation of fast disintegrating tablets. Int
neural networks to predict protein secondary structure.
J Drug Formul Res 2011;201:1.
Proteins 1994;19:55–72.
[5] Fu Y, Yang S, Jeong SH, Kimura S, Park K. Orally fast
[21] Akseli I, Xie J, Schultz L, et al. A practical framework toward
disintegrating tablets: developments, technologies,
prediction of breaking force and disintegration of tablet
taste-masking and clinical studies. Crit Rev Ther Drug
formulations using machine learning tools. J Pharm Sci
Carrier Syst 2004;21.
2017;106:234–47.
[6] Prateek S, Ramdayal G, Kumar SU, Ashwani C, Ashwini G,
[22] Dudek AZ, Arodz T, Gálvez J. Computational methods in
Mansi S. Fast dissolving tablets: a new venture in drug
developing quantitative structure-activity relationships
delivery. Am J PharmTech Res 2012;2:252–79.
(QSAR): a review. Comb Chem High Throughput Screen
[7] Kaur T, Gill B, Kumar S, Gupta G. Mouth dissolving tablets: a
2006;9:213–28.
novel approach to drug delivery. Int J Curr Pharm Res
[23] Murcia-Soler M, Pérez-Giménez F, García-March FJ,
2011;3:1–7.
et al. Artificial neural networks and linear discriminant
[8] Douroumis D. Orally disintegrating dosage forms and
analysis: a valuable combination in the selection of new
taste-masking technologies; 2010. Expert Opin Drug Deliv
antibacterial compounds. J Chem Inf Comp Sci
2011;8:665–75.
2004;44:1031–41.
[9] Shukla D, Chakraborty S, Singh S, Mishra B. Mouth dissolving
[24] LeCun Y, Bengio Y, Hinton G. Deep learning. Nature
tablets I: an overview of formulation technology. Sci Pharm
2015;521:436–44.
2009;77:309–26.
[25] Xu Y, Dai Z, Chen F, Gao S, Pei J, Lai L. Deep learning for
[10] Siddiqui MN, Garg G, Sharma PK. Fast dissolving tablets:
drug-induced liver injury. J Chem Inf Model 2015;55:2085–93.
preparation, characterization and evaluation: an overview.
[26] Baskin II, Winkler D, Tetko IV. A renaissance of neural
Int J Pharm Rev Res 2010;4:87–96.
networks in drug discovery. Expert Opin Drug Discov
[11] Al-Khattawi A, Mohammed AR. Compressed orally
2016;11:785–95.
disintegrating tablets: excipients evolution and formulation
[27] Ma J, Sheridan RP, Liaw A, Dahl GE, Svetnik V. Deep neural
strategies. Expert Opin Drug Deliv 2013;10:651–63.
nets as a method for quantitative structure–activity
[12] Aguilar-Díaz JE, García-Montoya E, Pérez-Lozano P,
relationships. J Chem Inf Model 2015;55:263–74.
Suñe-Negre JM, Miñarro M, Ticó JR. The use of the SeDeM
[28] Hinton G, Deng L, Yu D, et al. Deep neural networks for
Diagram expert system to determine the suitability of
acoustic modeling in speech recognition: the shared views of
diluents–disintegrants for direct compression and their use
four research groups. IEEE Signal Proc Mag 2012;29:82–97.
in formulation of ODT. Eur J Pharm Biopharm 2009;73:414–23.
[29] Bengio Y, Simard P, Frasconi P. Learning long-term
[13] Pérez P, Suñé-Negre JM, Miñarro M, et al. A new expert
dependencies with gradient descent is difficult. IEEE Trans
systems (SeDeM Diagram) for control batch powder
Neural Netw 1994;5:157–66.
formulation and preformulation drug products. Eur J Pharm
Biopharm 2006;64:351–9.