An International Journal
To cite this article: Pantea Koochemeshkian, Nuha Zamzami & Nizar Bouguila (2020): Flexible
Distribution-Based Regression Models for Count Data: Application to Medical Diagnosis,
Cybernetics and Systems, DOI: 10.1080/01969722.2020.1758464
ABSTRACT
Data mining techniques have been successfully utilized in different
applications of significant fields, including medical research. With the
wealth of data available within healthcare systems, there is a lack of
practical analysis tools to discover hidden relationships and trends in
data. The complexity of medical data, which is unfavorable for most models,
is a considerable challenge in prediction. The ability of a model to
perform accurately and efficiently in disease diagnosis is extremely
significant. Thus, the model must be selected to fit the data better, such
that the learning from previous data is most efficient and the diagnosis of
the disease is highly accurate. This work is motivated by the limited number
of regression analysis tools for multivariate counts in the literature. We
propose two regression models for count data based on flexible
distributions, namely, the multinomial Beta-Liouville and the multinomial
scaled Dirichlet, and evaluate the proposed models on the problem of disease
diagnosis. The performance is evaluated based on the accuracy of the
prediction, which depends on the nature and complexity of the dataset. Our
results show the efficiency of the two proposed regression models: the
prediction performance of both models is competitive with other previously
used regression models for count data and with the best results in the
literature.

KEYWORDS
Distribution-based regression; multinomial; Beta-Liouville; scaled
Dirichlet; count data; bio-medical data mining; Maximum Likelihood
Estimation (MLE)
1. Introduction
Data mining techniques have increasingly attracted scholars’ attention
due to their successful implementation in numerous applications in
different fields such as genomics, sports, medicine, image analysis,
epidemiology, marketing, criminology, industrial statistics, and text mining
(Zhang et al. 2017; Nikoloulopoulos and Karlis 2009; Bouguila and Amayri 2009).
In the medical field, for instance, the majority of doctors do not possess
expertise in every sub-specialty. Thus, the automation of disease diagnosis
Zhang et al. (2017) have examined regression models for multivariate count
data with efficient distributions for analyzing complex genomic data. The
authors proposed regression models based on Dirichlet Multinomial and
Generalized Dirichlet Multinomial that overcome some limitations of the
multinomial model (Bouguila 2009). In this work, we further investigate
the problem of analyzing multivariate count responses with other flexible dis-
tributions that overcome both specific mean-variance structure and the nega-
tive correlation requirement of the Dirichlet distribution as a prior to the
Multinomial. More precisely, we propose two regression models based on
flexible distributions for count data, namely, the Multinomial Beta-Liouville
and the Multinomial scaled Dirichlet. First, we introduce the response
distributions, propose the link functions, derive the score and information
matrices for estimating the parameters, and give the complete regression
algorithm. Then, we show the efficiency of the proposed models in analyzing
high-throughput data in genomics. Furthermore, we investigate, with the
proposed models, the diagnosis of three different diseases, namely, heart
attack, breast cancer, and diabetes, in addition to the analysis of a
genomics dataset.
The rest of this article is organized as follows. Section 2 discusses previous
works in distribution-based regression models for count data. In Section 3,
we propose two distribution-based regression models where we first discuss
the properties of the considered distributions, then propose the link func-
tions and provide all the details about the models’ parameters estimation.
Section 4 is devoted to the application of the proposed models to real
genomics and medical data and to the discussion of the results. Section 5
gives concluding remarks.
2. Related Work
Count data frequently appear in instances where incidences of several asso-
ciated occurrences are measured by counting them. If multivariate counts
are accessible, there is often an interest in investigating the dependencies
among them. However, the applications and techniques for analyzing
multivariate count data are comparatively uncommon (Wedel, Böckenholt,
and Kamakura 2003). In this section, we review the related works in count
data regression. In all the models reviewed here, we denote our dataset
by $X = \{W_1, \dots, W_n\}$, which consists of $n$ independent vectors
$W_j = (X_j, Y_j)$, where $X_j = (x_{j1}, x_{j2}, \dots, x_{jd})^T$ is a
$d$-dimensional response vector, and $Y_j = (Y_{j1}, Y_{j2}, \dots, Y_{jp})^T$
is a $p$-dimensional covariate vector.
distribution has the advantage that by varying its parameters (Bouguila and
Ziou 2005b) it permits multiple modes and asymmetries and can thus
approximate a wide variety of shapes (Bouguila, Ziou, and Vaillancourt 2004,
Bouguila and Ziou 2005a). The Dirichlet distribution is commonly used given
its flexibility and its several interesting properties, such as the consistency of
its estimates, and its ease of use as well as the fact that it is conjugate to the
multinomial distribution. Considering the Dirichlet as a prior distribution to
the multinomial results in the Dirichlet Multinomial (DM) Distribution
(Wang and Zhao 2017; Bouguila, Ziou, and Vaillancourt 2003).
The probability of a $d$-dimensional count vector $X = (x_1, \dots, x_d)$, with
$m = \sum_{i=1}^{d} x_i$, that follows a multinomial distribution with parameters
$q = (q_1, \dots, q_d)$, is given by:
$$\mathcal{M}(X \mid q) = \binom{m}{X} \prod_{i=1}^{d} q_i^{x_i} \qquad (1)$$
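As a quick numerical illustration of Eq. (1), the following minimal sketch evaluates the multinomial probability of a count vector. The function name `multinomial_pmf` and the example values are illustrative assumptions, not part of the original work.

```python
import math
from itertools import product


def multinomial_pmf(x, q):
    """Probability of count vector x under a multinomial with probabilities q."""
    m = sum(x)
    # Multinomial coefficient m! / (x_1! ... x_d!), computed with exact integers.
    coef = math.factorial(m)
    for xi in x:
        coef //= math.factorial(xi)
    prob = float(coef)
    for xi, qi in zip(x, q):
        prob *= qi ** xi
    return prob


# Hypothetical example: d = 3 categories, m = 3 trials.
q = [0.5, 0.3, 0.2]
print(multinomial_pmf([2, 1, 0], q))

# Sanity check: the probabilities of all count vectors with m = 3 sum to one.
total = sum(
    multinomial_pmf(list(x), q)
    for x in product(range(4), repeat=3)
    if sum(x) == 3
)
print(total)
```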
The most popular multinomial-logit model uses the joint distribution based
on the multinomial and the Dirichlet, which is called the Dirichlet-Multinomial
(DM) distribution (Madsen, Kauchak, and Elkan 2005). The probability of a vector
$X$ over $m$ possible trials following the DM distribution, with parameters
$\alpha = (\alpha_1, \dots, \alpha_d)$, is given by (Zhang et al. 2017):
$$\mathrm{DM}(X \mid \alpha) = \binom{m}{X} \frac{\Gamma(|\alpha|)}{\Gamma(|\alpha| + m)} \prod_{i=1}^{d} \frac{\Gamma(x_i + \alpha_i)}{\Gamma(\alpha_i)} = \binom{m}{X} \frac{\prod_{i=1}^{d} (\alpha_i)_{x_i}}{(|\alpha|)_{m}} \qquad (2)$$
where $(|\alpha|)_{m} = |\alpha|(|\alpha| + 1) \cdots (|\alpha| + m - 1)$ denotes the
rising factorial, and $|\alpha| = \sum_{i=1}^{d} \alpha_i$.
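The two forms of the DM probability (Gamma functions and rising factorials) can be checked against each other numerically. The sketch below is only an illustration using the Python standard library; the function names and parameter values are assumptions made for this example.

```python
import math


def log_multinomial_coef(x):
    """log of m! / (x_1! ... x_d!)."""
    m = sum(x)
    return math.lgamma(m + 1) - sum(math.lgamma(xi + 1) for xi in x)


def dm_logpmf_gamma(x, alpha):
    """DM log-probability written with Gamma functions."""
    m, a = sum(x), sum(alpha)
    out = log_multinomial_coef(x) + math.lgamma(a) - math.lgamma(a + m)
    out += sum(math.lgamma(xi + ai) - math.lgamma(ai) for xi, ai in zip(x, alpha))
    return out


def log_rising(a, n):
    """log of the rising factorial a (a + 1) ... (a + n - 1)."""
    return sum(math.log(a + k) for k in range(n))


def dm_logpmf_rising(x, alpha):
    """DM log-probability written with rising factorials."""
    m, a = sum(x), sum(alpha)
    out = log_multinomial_coef(x)
    out += sum(log_rising(ai, xi) for xi, ai in zip(x, alpha))
    return out - log_rising(a, m)


# Hypothetical count vector and parameters.
x, alpha = [2, 0, 3], [0.5, 1.2, 2.0]
print(dm_logpmf_gamma(x, alpha), dm_logpmf_rising(x, alpha))
```

The rising-factorial form only needs sums of logarithms, which is convenient when the parameters are later expressed through a link function.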
Even though the DM regression enables the parameterization of the
multi-class correlation coefficient for unit-specific covariates, it may dis-
close additional information that may not be identified by the grouped
conditional logit model (Guimaraes and Lindrooth 2005). The inverse link
function ai ¼ eai Y the parameters
T
Estimating the Dirichlet multinomial regression model does not present any
specific challenge, and its numerical optimization process based on the
CYBERNETICS AND SYSTEMS: AN INTERNATIONAL JOURNAL 5
(5)
6 P. KOOCHEMESHKIAN ET AL.
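$$\mathrm{MBL}(X \mid \theta) = \binom{m}{X} \frac{\Gamma\left(\sum_{i=1}^{d-1} \alpha_i\right) \Gamma(\alpha + \beta)\, \Gamma(\alpha')\, \Gamma(\beta') \prod_{i=1}^{d-1} \Gamma(\alpha_i')}{\Gamma\left(\sum_{i=1}^{d-1} \alpha_i'\right) \Gamma(\alpha' + \beta')\, \Gamma(\alpha)\, \Gamma(\beta) \prod_{i=1}^{d-1} \Gamma(\alpha_i)} = \binom{m}{X} \frac{(\alpha)_{z_{d-1}}\, (\beta)_{x_d} \prod_{i=1}^{d-1} (\alpha_i)_{x_i}}{\left(\sum_{i=1}^{d-1} \alpha_i\right)_{z_{d-1}} (\alpha + \beta)_{m}} \qquad (6)$$
where $z_i = \sum_{k=1}^{i} x_k$, $\alpha_i' = \alpha_i + x_i$, $\alpha' = \alpha + z_{d-1}$, and
$\beta' = \beta + x_d$. Note that by substituting $\alpha = \sum_{i=1}^{d-1} \alpha_i$ and
$\beta = \alpha_d$, the MBL reduces to the Dirichlet Multinomial (Eq. 2). Indeed, MBL
is an attractive distribution to fit count data, given the fact that it has
fewer parameters than the MGD with a comparable performance (Bouguila 2011).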
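The reduction of the MBL to the DM can be verified numerically. The following is a minimal sketch, assuming the posterior-style updates $\alpha_i' = \alpha_i + x_i$, $\alpha' = \alpha + \sum_{i=1}^{d-1} x_i$ and $\beta' = \beta + x_d$, and taking the substitution sum over the first $d-1$ parameters (both are assumptions of this example, following the usual Beta-Liouville construction). All function names and values are illustrative.

```python
import math


def log_multinomial_coef(x):
    m = sum(x)
    return math.lgamma(m + 1) - sum(math.lgamma(xi + 1) for xi in x)


def dm_logpmf(x, alpha):
    """Dirichlet-Multinomial log-probability (Gamma-function form)."""
    m, a = sum(x), sum(alpha)
    out = log_multinomial_coef(x) + math.lgamma(a) - math.lgamma(a + m)
    return out + sum(math.lgamma(xi + ai) - math.lgamma(ai) for xi, ai in zip(x, alpha))


def mbl_logpmf(x, alpha_vec, alpha, beta):
    """Multinomial Beta-Liouville log-probability.

    x has length d; alpha_vec holds the d - 1 parameters alpha_1..alpha_{d-1}.
    """
    head = x[:-1]
    a_post = [ai + xi for ai, xi in zip(alpha_vec, head)]  # alpha_i'
    alpha_post = alpha + sum(head)                         # alpha' (assumed definition)
    beta_post = beta + x[-1]                               # beta'
    out = log_multinomial_coef(x)
    out += math.lgamma(sum(alpha_vec)) - math.lgamma(sum(a_post))
    out += math.lgamma(alpha + beta) - math.lgamma(alpha_post + beta_post)
    out += math.lgamma(alpha_post) - math.lgamma(alpha)
    out += math.lgamma(beta_post) - math.lgamma(beta)
    out += sum(math.lgamma(ap) - math.lgamma(ai) for ap, ai in zip(a_post, alpha_vec))
    return out


# Hypothetical DM parameters; the MBL is evaluated at the matching special case.
x = [2, 1, 3]
dm_alpha = [0.8, 1.5, 2.2]
reduced = mbl_logpmf(x, dm_alpha[:-1], alpha=sum(dm_alpha[:-1]), beta=dm_alpha[-1])
print(reduced, dm_logpmf(x, dm_alpha))
```

With free values of `alpha` and `beta` the MBL has two more degrees of freedom than the DM over the same first $d-1$ parameters, which is where its extra flexibility comes from.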
Considering the parameter set $\theta = (\alpha_1, \dots, \alpha_{d-1}, \alpha, \beta)$ as the
regression coefficients, the complete log-likelihood is given by:
2
X n X d X n xXij 1 T X zij T
mj 4 ai y j
Ln ðXjhÞ ¼ ln þ ln e þ k þ ln ebyj þ k
j¼1
Xj i¼1 j¼1 k¼0 k¼0
3
XXi1 T xX i, m 1 T X
xi þ1
ln eai yj þ k ln eai yj þ k ln eayj þ ebyj þ k 5
T T
þ
k¼0 k¼0 k¼0
(14)
X n X xi Xd T
ai YjT
þ ln xi þ e þ k ln eai Yj þ k (20)
j¼1 k¼0 i¼1
X
n X xi
d X T
ai YjT
þ ln e þ k ln mj eai Yj k
j¼1 i¼1 k¼1
$$\Theta^{(t+1)} = \arg\max_{\Theta} \sum_{j=1}^{n} \log p(X_j \mid \Theta) \qquad (21)$$
For both models, closed-form solutions do not exist. Thus, the process
requires a Newton-Raphson optimization that iterates between scoring steps
based on the present values and an update of the parameters, such that:
$$\Theta^{(t+1)} = \Theta^{(t)} - H_{\Theta}^{-1} G_{\Theta} \qquad (22)$$
where $G$ is the gradient and $H$ is the Hessian matrix, based on the first-
and second-order derivatives of the log-likelihood function, respectively.
The complete derivation needed for estimating the parameters of the two
proposed models is given in Appendix A.
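The Newton-Raphson update can be illustrated on a much simpler problem than the proposed models. The sketch below is not the authors' MATLAB implementation: it is a one-parameter stand-in (a Poisson log-likelihood with a log link, chosen here only because its gradient and Hessian are easy to write and its maximum is known in closed form). The data and all names are assumptions made for the example.

```python
import math


def newton_raphson(grad, hess, theta0, tol=1e-10, max_iter=50):
    """One-dimensional Newton-Raphson: theta <- theta - grad(theta) / hess(theta)."""
    theta = theta0
    for _ in range(max_iter):
        step = grad(theta) / hess(theta)
        theta -= step
        if abs(step) < tol:
            break
    return theta


# Hypothetical counts; the rate is parameterized as exp(theta) (log link).
counts = [2, 3, 5, 4, 6]
n, total = len(counts), sum(counts)


def grad(theta):
    # First derivative of sum_j (x_j * theta - exp(theta)).
    return total - n * math.exp(theta)


def hess(theta):
    # Second derivative; always negative, so the log-likelihood is concave.
    return -n * math.exp(theta)


theta_hat = newton_raphson(grad, hess, theta0=0.0)
# The maximizer is known in closed form here: log of the sample mean.
print(theta_hat, math.log(total / n))
```

In the multivariate case the division by the second derivative becomes a linear solve with the Hessian matrix, and a good starting point matters more because the log-likelihood need not be concave.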
To achieve optimal performance of our proposed models, the initial values
of the parameters were calculated using the method of moments
(Taboada et al. 2011), which depends on the mean and variance of each
distribution. Then, using the maximum likelihood approach, the parameters are
updated to fit the given dataset. Finally, the regression model is applied
to predict the multivariate count response.
The complete learning algorithm is summarized in Algorithm 1.
4. Experimental Results
Our aim in this section is to apply the proposed regression models to real
datasets. We evaluate both the Multinomial Beta-Liouville and Multinomial
scaled Dirichlet regression models to show their effectiveness compared to
the previously proposed distribution-based regression models for count
data. All the models were implemented in MATLAB.
we compare $Y_{predict}$ to the actual data in the test split $Y_{test}$ of a given
dataset. Since we work on multivariate data where each $Y$ is a vector, the
average accuracy was calculated. That is, the average of the differences
between $Y_{predict} = (y'_1, \dots, y'_p)$ and $Y_{test} = (y_1, \dots, y_p)$ should be
calculated. We use the following equation to calculate the accuracy for each model:
$$\mathrm{ACC} = \left(1 - \frac{\mu(Y_{predict} - Y_{test})}{\mu(|Y_{test}|)}\right) \times 100 \qquad (25)$$
where $\mu(\cdot)$ denotes the mean.
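A minimal sketch of this accuracy measure is given below. Whether the differences in the numerator are taken in absolute value is not clear from the equation, so the sketch exposes this as a flag and uses absolute differences by default (an assumption of this example, made so that positive and negative errors do not cancel). The function names and the sample values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def accuracy(y_predict, y_test, absolute=True):
    """ACC = (1 - mean(difference) / mean(|y_test|)) * 100.

    absolute=True uses |y_predict - y_test| in the numerator (assumption);
    absolute=False uses the signed differences.
    """
    diffs = [p - t for p, t in zip(y_predict, y_test)]
    if absolute:
        diffs = [abs(d) for d in diffs]
    return (1.0 - mean(diffs) / mean([abs(t) for t in y_test])) * 100.0


# Hypothetical predicted and observed counts.
y_predict = [9, 21, 30]
y_test = [10, 20, 30]
print(accuracy(y_predict, y_test))
print(accuracy(y_predict, y_test, absolute=False))
```

With signed differences, errors of opposite sign cancel and the score can reach 100 even when individual predictions are off, which is why the absolute form is used as the default here.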
The prediction results in the following subsection are shown by figures,
and in each figure, the X axis shows the observed data points, and Y axis
shows the value of each Y that we predict.
1
https://github.com/Yiwen-Zhang/MGLM/tree/master/MGLM/data.
Figure 1. Comparing the actual test values and the predicted values of Y using the proposed
MBL-based regression model for RNA-seq dataset.
Figure 2. Comparing the actual test values and the predicted values of Y using the proposed
MSD-based regression model for RNA-seq dataset.
2
http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/stressEcho.html.
Figure 3. Comparing the actual test values and the predicted values of Y using the proposed
MBL-based regression model for Stress Echocardiography dataset.
Figure 4. Comparing the actual test values and the predicted values of Y using the proposed
MSD-based regression model for Stress Echocardiography dataset.
3
https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic).
Table 3. Comparing different distribution-based regression models for breast cancer dataset.
Performance metrics
MODEL   Log-likelihood   AIC          BIC          Accuracy
DM      3.1532e+03       6.3383e+03   6.4111e+03   91.00%
GDM     3.5300e+03       5.1000e+03   5.1910e+03   93.00%
MBL     2.4137e+05       4.8414e+05   4.8387e+05   98.00%
MSD     1.7277e+05       3.4637e+05   3.4621e+05   98.00%
Figure 6. Comparing the actual test values and the predicted values of Y using the proposed
MBL-based regression model for Breast Cancer dataset.
Figure 7. Comparing the actual test values and the predicted values of Y using the proposed
MSD-based regression model for Breast Cancer dataset.
4
https://www.kaggle.com/uciml/pima-indians-diabetes-database/downloads/pima-indians-diabetes-database.zip/1.
Table 4. Comparing different regression models for Pima Indians Diabetes dataset.
Performance metrics
MODEL   Log-likelihood   AIC           BIC           Accuracy
DM      622.6466         1.2653e+03    1.3066e+03    92.00%
GDM     31612.78         6.3664e+04    6.3358e+04    94.50%
MBL     1.7917e+06       3.5843e+06    3.5842e+06    99.00%
MSD     2.8160e+04       -5.5400e+04   -5.5425e+04   97.75%
Figure 8. Comparing the actual test values and the predicted values of Y with MBL regression
model for Pima Indians Diabetes dataset.
Figure 9. Comparing the actual test values and the predicted values of Y with MSD regression
model for Pima Indians Diabetes dataset.
nine variables with no missing values reported. The variables in the
considered dataset are based on personal data, such as age and the number of
pregnancies, and the results of medical examinations, e.g., blood pressure,
body mass index, and the result of a glucose tolerance test.
Table 5. We can see from the results in this table that our proposed
approach is competitive with the most successful approaches.
For instance, three different algorithms have been previously implemented
to analyze the RNA-seq dataset, including two with a similar
approach (i.e., the DM- and GDM-based regression models (Zhang et al.
2017)), and CASI (Richard et al. 2010). The SEG dataset has been analyzed
using Classification and Regression Trees (CART) (Krivokapich et al.
1999) and the Hidden Markov Model (HMM) (Chykeyuk, Clifton, and
Noble 2011), which are well-known algorithms; however, our proposed
models give the highest prediction accuracy. Similarly, compared with
the results of previous algorithms such as logistic regression (Schein and
Ungar 2004) and two neural network models (Abbass 2002) applied
to the BCD dataset, our proposed models have the highest accuracy.
Furthermore, while the average accuracy of diabetes diagnosis on DD
dataset ranges between 71 and 80%, obtained using previous methods such
as logistic regression (Schein and Ungar 2004), different neural network
models (Kayaer and Yıldırım 2003; Smith et al. 1988), decision trees (Han,
Rodriguez, and Beheshti 2008) and KNN (Kayaer and Yıldırım 2003), our
proposed approaches achieve a superior performance of 97.75% and 99%
for the proposed regression models based on MSD and MBL, respectively.
5. Conclusion
This article introduced two novel regression models for count data based
on multinomial Beta-Liouville and multinomial scaled Dirichlet distribu-
tions. This work is mainly motivated by the fact that these distributions
offer high flexibility, better fitting, and considerable potential to accurately
describe count data compared to the previously used models. Thus, the
proposed regression models have the benefits of better fitting multivariate
count data compared to the previously proposed ones. To validate the
performance of the proposed models, we considered the analysis of
connections and patterns in medical data. The evaluation is
performed by considering different measures that are usually used to
evaluate regression models, including model selection criteria such as AIC and
BIC, as well as the prediction accuracy. According to the obtained results,
the proposed models achieved superior performance, as shown by
the high accuracy in predicting diseases. It could be claimed that these new
distribution-based regression models yield better results than the state-of-
the-art methods. Future work can be devoted to the application of the pro-
posed regression models in different problems of computer vision and
image processing.
References
Abbass, H. A. 2002. An evolutionary artificial neural networks approach for breast cancer
diagnosis. Artificial Intelligence in Medicine 25 (3):265–81.
Aitchison, J. 1982. The statistical analysis of compositional data. Journal of the Royal
Statistical Society: Series B (Methodological) 44 (2):139–60.
Andrews, D. F. 1974. A robust method for multiple linear regression. Technometrics 16 (4):
523–31.
Ankam, D., and N. Bouguila. 2019. Generalized dirichlet regression and other compos-
itional models with application to market-share data mining of information technology
companies. In Proceedings of the 21st International Conference on Enterprise Information
Systems, ICEIS 2019, Heraklion, Crete, Greece, May 3–5, 2019, vol. 1, 158–166.
doi:10.5220/0007708201580166.
Bayes, C. L., J. L. Bazan, and C. García. 2012. A new robust regression model for propor-
tions. Bayesian Analysis 7 (4):841–66.
Bishop, C. M. 2006. Pattern recognition and machine learning. New York, NY: Springer.
Bouguila, N. 2008. Clustering of count data using generalized dirichlet multinomial distri-
butions. IEEE Transactions on Knowledge and Data Engineering 20 (4):462–74.
Bouguila, N. 2009. A model-based approach for discrete data clustering and feature weight-
ing using map and stochastic complexity. IEEE Transactions on Knowledge and Data
Engineering 21 (12):1649–64.
Bouguila, N. 2011. Count data modeling and classification using finite mixtures of distribu-
tions. IEEE Transactions on Neural Networks 22 (2):186–98.
Bouguila, N., and D. Ziou. 2006. A hybrid sem algorithm for high-dimensional unsuper-
vised learning using a finite generalized dirichlet mixture. IEEE Transactions on Image
Processing 15 (9):2657–68.
Bouguila, N., and O. Amayri. 2009. A discrete mixture-based kernel for svms: application
to spam and image categorization. Information Processing & Management 45 (6):631–42.
Bouguila, N., and D. Ziou. 2005a. Mml-based approach for finite dirichlet mixture estima-
tion and selection. In International Workshop on Machine Learning and Data Mining in
Pattern Recognition, 42–51. Berlin, Heidelberg: Springer.
Bouguila, N., and D. Ziou. 2005b. On fitting finite dirichlet mixture using ecm and mml.
In International Conference on Pattern Recognition and Image Analysis, 172–82. Berlin,
Heidelberg: Springer.
Bouguila, N., D. Ziou, and J. Vaillancourt. 2003. Novel mixtures based on the dirichlet dis-
tribution: application to data and image classification. In International Workshop on
Machine Learning and Data Mining in Pattern Recognition, 172–81. Berlin, Heidelberg:
Springer.
Bouguila, N., D. Ziou, and J. Vaillancourt. 2004. Unsupervised learning of a finite mixture
model based on the dirichlet distribution and its application. IEEE Transactions on
Image Processing 13 (11):1533–43.
Burnham, K. P., and D. R. Anderson. 2001. Kullback-leibler information as a basis for
strong inference in ecological studies. Wildlife Research 28 (2):111–9.
Burnham, K. P., and D. R. Anderson. 2004. Multimodel inference: understanding aic and
bic in model selection. Sociological Methods & Research 33 (2):261–304.
Burnham, K. P., D. R. Anderson, and K. P. Huyvaert. 2011. Aic model selection and multi-
model inference in behavioral ecology: some background, observations, and comparisons.
Behavioral Ecology and Sociobiology 65 (1):23–35.
Chykeyuk, K., D. A. Clifton, and J. A. Noble. 2011. Feature extraction and wall motion
classification of 2D stress echocardiography with relevance vector machines. In 2011
IEEE international symposium on biomedical imaging: From nano to macro, 677–680.
IEEE.
Connor, R. J., and J. E. Mosimann. 1969. Concepts of independence for proportions with a
generalization of the dirichlet distribution. Journal of the American Statistical
Association 64 (325):194–206.
Dempster, A. P., N. M. Laird, and D. B. Rubin. 1977. Maximum likelihood from incom-
plete data via the em algorithm. Journal of the Royal Statistical Society: Series B
(Methodological) 39 (1):1–22.
Ferrari, S., and F. Cribari-Neto. 2004. Beta regression for modelling rates and proportions.
Journal of Applied Statistics 31 (7):799–815.
Gavin III, J. R., K. Alberti, M. B. Davidson, and R. A. DeFronzo. 1997. Report of the expert
committee on the diagnosis and classification of diabetes mellitus. Diabetes Care 20 (7):
1183.
Guimaraes, P., and R. Lindrooth. 2005. Dirichlet-multinomial regression. Economics
Working Paper Archive at WUSTL, Econometrics (0509001).
Han, J., J. C. Rodriguez, and M. Beheshti. 2008. Diabetes data analysis and prediction
model discovery using rapidminer. In 2008 Second International Conference on Future
Generation Communication and Networking, vol. 3, 96–99.
Hankin, R. K. 2010. A generalization of the dirichlet distribution. Journal of Statistical
Software 33 (11):1–18.
Herbrich, R., T. Graepel, and K. Obermayer. 1999. Regression models for ordinal data: A
machine learning approach. Berlin: Technische Universität Berlin.
Howar, F., B. Steffen, B. Jonsson, and S. Cassel. 2012. Inferring canonical register automata.
In International Workshop on Verification, Model Checking, and Abstract Interpretation,
251–66. Berlin, Heidelberg: Springer.
Hsu, C-n, D. Schuschel, and Y-t Yang. 1999. The annigma-wrapper approach to neural nets
feature selection for knowledge discovery and data mining. Taipei, Taiwan: Institute of
Information Science.
Karstoft, K., C. F. Brinkløv, I. K. Thorsen, J. S. Nielsen, and M. Ried-Larsen. 2017. Resting
metabolic rate does not change in response to different types of training in subjects with
type 2 diabetes. Frontiers in Endocrinology 8:132.
Kayaer, K., and T. Yıldırım. 2003. Medical diagnosis on pima indian diabetes using general
regression neural networks. In Proceedings of the International Conference on Artificial
Neural Networks and Neural Information Processing (ICANN/ICONIP), vol. 181, 184.
Krivokapich, J., J. S. Child, D. O. Walter, and A. Garfinkel. 1999. Prognostic value of
dobutamine stress echocardiography in predicting cardiac events in patients with known
or suspected coronary artery disease. Journal of the American College of Cardiology 33
(3):708–16.
Madsen, R. E., D. Kauchak, and C. Elkan. 2005. Modeling word burstiness using the dirich-
let distribution. In Proceedings of the 22nd International Conference on Machine
Learning, 545–552.
Mallick, B. K., and A. E. Gelfand. 1994. Generalized linear models with unknown link func-
tions. Biometrika 81 (2):237–45.
Maronna, R. 2011. Alan Julian Izenman (2008): modern multivariate statistical techniques:
regression, classification and manifold learning. Statistical Papers 52 (3):733–4.
Montgomery, S. B., M. Sammeth, M. Gutierrez-Arcelus, R. P. Lach, C. Ingle, J. Nisbett, R.
Guigo, and E. T. Dermitzakis. 2010. Transcriptome genetics using second generation
sequencing in a caucasian population. Nature 464 (7289):773–7.
Wong, T.-T. 2009. Alternative prior assumptions for improving the performance of naïve
bayesian classifiers. Data Mining and Knowledge Discovery 18 (2):183–213.
Zamzami, N., R. Alsuroji, O. Eromonsele, and N. Bouguila. 2020. Proportional data model-
ing via selection and estimation of a finite mixture of scaled dirichlet distributions.
Computational Intelligence 36 (2):459–85.
Zamzami, N., and N. Bouguila. 2018a. Consumption behavior prediction using hierarchical
Bayesian frameworks. In 2018 First International Conference on Artificial Intelligence for
Industries (AI4I), 31–34. IEEE.
Zamzami, N., and N. Bouguila. 2018b. Text modeling using multinomial scaled dirichlet
distributions. In International Conference on Industrial, Engineering and Other
Applications of Applied Intelligent Systems, 69–80. Cham: Springer.
Zamzami, N., and N. Bouguila. 2019a. An accurate evaluation of msd log-likelihood and its
application in human action recognition. In 7th IEEE Global Conference on Signal and
Information Processing (GlobalSIP). IEEE.
Zamzami, N., and N. Bouguila. 2019b. A novel scaled dirichlet-based statistical framework
for count data modeling: Unsupervised learning and exponential approximation. Pattern
Recognition 95:36–47.
Zhang, Y., H. Zhou, J. Zhou, and W. Sun. 2017. Regression models for multivariate count
data. Journal of Computational and Graphical Statistics 26 (1):1–13.
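$$\frac{\partial L_n(X \mid \theta)}{\partial \alpha} = \sum_{j=1}^{n} g_2'(y_j) \left[ \psi(\alpha + \beta) + \psi(\alpha') - \psi(\alpha' + \beta') - \psi(\alpha) \right] \qquad (A2)$$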
@ Ln ðXjhÞ X n
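$$\frac{\partial^2 L_n(X \mid \theta)}{\partial \alpha^2} = \sum_{j=1}^{n} g_2''(y_j) \left[ \psi'(\alpha + \beta) + \psi'(\alpha') - \psi'(\alpha' + \beta') - \psi'(\alpha) \right] \qquad (A5)$$
$$\frac{\partial^2 L_n(X \mid \theta)}{\partial \beta^2} = \sum_{j=1}^{n} g_3''(y_j) \left[ \psi'(\alpha + \beta) + \psi'(\beta') - \psi'(\alpha' + \beta') - \psi'(\beta) \right] \qquad (A6)$$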
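The derivatives to estimate the MSD-based model parameters
The first derivative of the MSD log-likelihood function with respect to
$\alpha_i$, $i = 1, \dots, d$, and $\beta_i$, $i = 1, \dots, d$, is:
$$\frac{\partial L_n(X \mid \vartheta)}{\partial \alpha_i} = \sum_{j=1}^{n} k_1'(y_j) \left[ \psi(|\alpha|) - \psi(m_i + |\alpha|) + \psi(x_i + \alpha_i) - \psi(\alpha_i) \right] \qquad (A7)$$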
$$\frac{\partial L_n(X \mid \vartheta)}{\partial \beta_i} = \sum_{j=1}^{n} k_2'(y_j) \frac{x_i}{\beta_i} \qquad (A8)$$
(A9)
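$$\frac{\partial^2 L_n(X \mid \vartheta)}{\partial \beta_{i_1} \partial \beta_{i_2}} = \begin{cases} \sum_{j=1}^{n} k_2''(y_j) \dfrac{x_i}{\beta_i^2} & \text{if } i_1 = i_2 = i \\ 0 & \text{otherwise,} \end{cases} \qquad (A10)$$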