You are on page 1of 14

Journal homepage:

www.jcimjournal.com/jim
www.elsevier.com/locate/issn/20954964
Available also online at www.sciencedirect.com.
Copyright © 2017, Journal of Integrative Medicine Editorial Office.
E-edition published by Elsevier (Singapore) Pte Ltd. All rights reserved.

● Methodology
A data-driven method for syndrome type
identification and classification in traditional
Chinese medicine
Nevin Lianwen Zhang1, Chen Fu2, Teng Fei Liu1, Bao-xin Chen2, Kin Man Poon3, Pei Xian Chen1, Yun-
ling Zhang2
1. Department of Computer Science and Engineering, the Hong Kong University of Science and
Technology, Hong Kong, China
2. Department of Neurology, Dongfang Hospital, Beijing University of Chinese Medicine, Beijing 100078, China
3. Department of Mathematics and Information Technology, the Education University of Hong Kong, Hong
Kong, China

ABSTRACT
The efficacy of traditional Chinese medicine (TCM) treatments for Western medicine (WM) diseases
relies heavily on the proper classification of patients into TCM syndrome types. The authors developed
a data-driven method for solving the classification problem, where syndrome types were identified and
quantified based on statistical patterns detected in unlabeled symptom survey data. The new method is
a generalization of latent class analysis (LCA), which has been widely applied in WM research to solve
a similar problem, i.e., to identify subtypes of a patient population in the absence of a gold standard. A
well-known weakness of LCA is that it makes an unrealistically strong independence assumption. The
authors relaxed the assumption by first detecting symptom co-occurrence patterns from survey data and
used those statistical patterns instead of the symptoms as features for LCA. This new method consists
of six steps: data collection, symptom co-occurrence pattern discovery, statistical pattern interpretation,
syndrome identification, syndrome type identification and syndrome type classification. A software
package called Lantern has been developed to support the application of the method. The method was
illustrated using a data set on vascular mild cognitive impairment.
Keywords: medicine, Chinese traditional; syndrome; syndrome classification; latent tree analysis; symptom
co-occurrence patterns; patient clustering; stand syndrome differentiation

Citation: Zhang NL, Fu C, Liu TF, Chen BX, Poon KM, Chen PX, Zhang YL. A data-driven method for
syndrome type identification and classification in traditional Chinese medicine. J Integr Med. 2017; 15(2):
110–123.

1 Introduction to Western medicine (WM).[1] A common practice is to


divide the patients with a WM disease into several TCM
Traditional Chinese medicine (TCM) has been syndrome types based on symptoms and signs (both
increasingly used in healthcare in China and around referred as symptoms henceforth for simplicity), and to
the world as a complementary or alternative method apply different TCM treatments to patients of different

http://dx.doi.org/10.1016/S2095-4964(17)60328-5
Received August 30, 2016; accepted November 7, 2016.
Correspondence: Prof. Nevin Lianwen Zhang; E-mail: lzhang@cse.ust.hk. Prof. Yun-ling Zhang; E-mail: yunlingzhang2004@163.com

Journal of Integrative Medicine 110 March 2017, Vol. 15, No. 2


www.jcimjournal.com/jim

types. The efficacy of TCM treatments depends heavily on throughout the paper.
whether the classification is done properly.[2,3]
The problem of TCM syndrome classification of a 2 Technical background
WM disease consists of four subproblems:[4–7] (1) What
TCM syndrome types exist among the patients with the This paper builds upon two data analysis methods,
disease? (2) What is the prevalence of each syndrome namely latent class analysis (LCA) and latent tree analysis
type? (3) What are the characteristics of each syndrome (LTA). They are based on probabilistic models that
type in terms of symptom occurrence probabilities? describe relationships among categorical variables. Some
(4) How do we determine to the syndrome type(s) of a of the variables are observed, while the others are latent,
patient, based on symptoms?
that is, unobserved. In this section, we explain LCA and
The syndrome classification problem is of fundamental
LTA in layman’s term so that medical researchers without
importance to TCM research and clinical practice.[2,3] As
will be seen in Section 9, this problem has so far not been a strong background in statistics and machine learning can
satisfactorily solved. In this paper we present a data-driven understand the key ideas.
method for its solution. The idea is to: (1) conduct a cross- 2.1 Latent tree models and latent class models
sectional survey of the patients with the WM disease The models that we use are called latent tree models
and collect information about symptoms of interest to (LTMs). An LTM describes the relationship among a set
TCM; (2) perform cluster analysis on the data and divide of variables at two levels. At the qualitative level, it is an
the patients into clusters based on symptom occurrence undirected tree where the observed variables are located
patterns; (3) match the patient’s clusters with TCM at the leaf nodes, whereas the latent variables are located
syndrome types; (4) use the statistical characteristics of at the internal nodes. At the quantitative level, it describes
the patient’s clusters to quantify the TCM syndrome types the relationship between each pair of neighboring
and to establish classification rules. variables using a conditional probability distribution.
We will first explain the data analysis methods that Figure 1(a) shows an example LTM taken from Xu et
this paper relies upon in Section 2. Then we will present al.[9] Qualitatively, it asserts that a student’s math grade
our method for solving TCM syndrome classification (MG) and science grade (SG) are influenced by his
in Sections 3 through 8. Related works and limitations analytical skill (AS); his English grade (EG) and history
will be discussed in Section 9 and conclusions drawn grade (HG) are influenced by his literacy skill (LS) and
in Section 10. A data set on vascular mild cognitive the two skills are correlated. Here, the grades are observed
impairment (VMCI) [8] will be used for illustration variables, while the skills are latent variables.

Figure 1 The concept of latent tree model and latent class model
The subfigure (a) and the tables illustrate the concept of latent tree models using an example that involves two latent variables (the skill variables)
and four observed variables (the grade variables). The tables show some of the probability parameters for the latent tree model. The subfigure (b)
illustrates the concept of latent class models where intelligence is the only latent variable.
AS: analytical skill; LS: literacy skill; MG: math grade.

March 2017, Vol. 15, No. 2 111 Journal of Integrative Medicine


www.jcimjournal.com/jim

For simplicity, a ssume all the variables have two not. Similarly, EG and HG are independent of each other
possible values “low” and “high”. The dependence conditioned on the latent variable LS, but MG and SG are
of MG on AS is quantitatively characterized by the not.
conditional distribution P (MG|AS), which is also Historically, LCMs precede LTMs. LTMs are introduced
included in Figure 1. It says that a student with high AS as a generalization of LCMs in Zhang[11], where they are
tends to have high MG and a student with low AS tends called hierarchical LCMs.
to have low MG. Similarly, the dependences of other 2.2 Latent class analysis
grade variables on the skill variables are quantitatively LCA refers to the analysis of data using LCMs. As
characterized by the distributions P(SG|AS), P(EG|LS) an example, consider a data set comprising the grades
and P(HG|LS), respectively. To save space, they are not obtained in the aforementioned four subjects by students
shown. from one school. To perform LCA on the data, we
To specify the quantitative relationships among the latent assume there is a latent variable Y that is related to the
variables, it is convenient to root the model at one of the grade variables as shown in Figure 2(a). The task is
latent variables and regard it as a directed model—a tree- to: (1) determine the number of possible values for Y,
structured Bayesian network.[10] If we use AS as the root, (2) determine the marginal distribution P( Y ) and the
then we need to provide, as given in Figure 1, the marginal distributions P(MG|Y), P(SG|Y), P(EG|Y) and P(HG|Y)
distribution of the root P(AS) and the distribution of the grade variables conditioned on Y. The first task is
P(LS|AS) of LS conditioned on its parent AS. If LS was known as model selection, while the second as parameter
chosen as the root instead, we would need to provide estimation.
P(LS) and P(AS|LS). The choice of root does not matter In statistics, the concept of likelihood measures how
because different choices give rise to equivalent directed well a model fits data. In LCA, probabilistic parameters
models.[11] are determined using the maximum likelihood estimate
LTMs with a single latent variable are called latent (MLE) principle.[12] The number of possible values for Y
class models (LCMs). Figure 1(b) shows an example is often determined using the Bayes Information Criterion
LCM, where “intelligence” is the sole latent variable. (BIC).[13] The BIC score is likelihood, plus a penalty term
Qualitatively, it asserts that a student’s grades in the four for model complexity. The use of BIC intuitively means
subjects are all influenced by his intelligence level. that we want a model that fits data well, but do not want it
Different models make different independence to be overly complex.
assumptions. In Figure 1(b), the four grade variables Note that there are two equivalent versions of BIC in
are assumed mutually independent, conditioned on the literature that differ by the sign of the score. In the
the latent variable “intelligence”. This is known as the version the BIC score takes positive values, lower values
local independence assumption. Another way to state are preferred. In the other version, where BIC scores take
the assumption is that the correlations among the grade negative values, higher values are preferred.
variables can be properly modeled using a single latent In practice LCA is used as a tool for clustering
variable. In this sense, we also call it the unidimensionality discrete data. [14] Each value of the latent variable Y
assumption. represents a probabilistic cluster of individuals; all the
In Figure 1(a), the four grade variables are not assumed values collectively give a partition of all individuals. To
to be unidimensional. The correlations among the determine the number of possible values for Y amounts
grade variables are modeled using two latent variables to determining the number of clusters, and to determine
AS and LS. MG and SG are independent of each other the probabilistic parameters amounts to determining the
conditioned on the latent variable AS, but EG and HG are statistical characteristics of the clusters.

Figure 2 Illustration of the results of latent class analysis and latent tree analysis
The subfigure (a) shows the model for latent class analysis. There is only one latent variable Y. The task is to determine the number of values for Y and
the probability parameters. The subfigure (b) shows a model that might be obtained from latent tree analysis, where it is necessary to determine the
number of latent variables and connections among them additionally.

Journal of Integrative Medicine 112 March 2017, Vol. 15, No. 2


www.jcimjournal.com/jim

2.3 Latent tree analysis 3 Key ideas of method


LTA refers to the analysis of data using LTMs. For
a given data set, there are many possible LTMs. For LCA has been applied in WM research to identify
example, one possible LTM for the imaginary student subtypes of a patient population in the absence of a gold
grade data is shown in Figure 2(b) and another in Figure standard. The first applications appeared around 1990 and
2(a). There are also other possible models. The task similar studies have increased sharply in recent years.[19–22]
is to determine which model is the best for the data. The studies primarily focus on infectious diseases, but
Specifically, we need to determine: (1) the number of also cover mental or behavioral problems, diseases of the
latent variables, (2) the number of possible values for musculoskeletal system, diseases of the digestive system
each latent variable, (3) the connections among the and neoplasms. Recently, LCA has also been used to
latent and observed variables, and (4) the probability identify TCM syndrome types among psoriatic patients.[23]
parameters. The model selection problem here is more When performing LCA, researchers begin by selecting,
difficult than in the case of LCA. It consists of the first based on domain knowledge, a set of symptom variables
three items. to be included in the analysis. Model parameters are
Several algorithms have been developed for LTA.[15] estimated based on the MLE principle and obtained
Extensive empirical studies have been conducted, where using algorithms such as Expectation-Maximization and
the different algorithms are compared in terms of the BIC Newton-Raphson. The number of clusters is determined
scores of the models they obtain and running time.[15,16] manually, based on the BIC score and/or other goodness-
The experiments indicate that the Extension Adjustment of-fit measures. The output includes the prevalence (i.e.,
Simplification until Termination (EA ST) algorithm[17] size) of each cluster, and a characterization of each cluster,
finds the best models in data sets with dozens to around in terms of the distribution of the symptom variables.
one hundred observed variables, while the Bridged- LCA makes the local independence assumption, which
Islands (BI) algorithm[16] finds the best models on data is often violated in practice. When this happens, the
sets with hundreds to around one thousand observed model obtained by LCA might not fit the data well, and
variables. On data sets with dozens to around one the estimates of subtype prevalence and other parameters
hundred variables, BI is much faster than EAST, while might be biased.[19] It can also lead to spurious clusters.[24]
the models it obtains are sometimes inferior. EAST is We propose to fit an LTM to data instead of an LCM.
unable to deal with data sets with hundreds or more The fitting is automatically carried out by computers
observed variables. through a search procedure guided by the BIC score.
The LTM shown in Figure 2(b) can be viewed as two In comparison with LCMs, LTMs are a much larger
LCMs, with their latent variables connected by an edge. class of models (and yet computationally manageable).
In general, an LTM can be regarded as a collection of Consequently, the model obtained by LTA would fit data
much better than that obtained by LCA.
LCMs, where each LCM is based on a distinct subset of
LTA yields a model with multiple latent variables.
observed variables and the latent variables are connected
One example is shown in Figure 3. The structure of the
to form a tree structure. As pointed out earlier, an LCM model partitions all symptom variables into groups, with
gives one probabilistic partition of data. Consequently, each group being all the variables directly connected to a
an LTM gives multiple partitions of data. Each partition latent variable. The variables in each group are mutually
is based primarily on a distinct subset of observed independent given the latent variable. So, LTA partitions
variables and the different partitions are correlated. all the symptom variables into unidimensional subsets.
For this reason, LTA is a tool for multidimensional Intuitively, eac h of the latent variables represents
clustering.[17] one latent aspect of the data. As will be seen in Section
2.4 Software tools 4, a latent aspect manifests as either probabilistic co-
A software tool called Lantern has been developed occurrence of a group of symptoms, or probabilistic
to facilitate LCA and LTA, and is freely available.[18] mutual exclusion of two subgroups of symptoms, with
The software is designed to run on desktop personal symptoms in each subgroup tending to co-occur.
computers, and can analyze data using various algorithms, Each of the latent variables also gives a partition of
inspect the results, and perform further analyses that data. The patient clusters in such partitions usually do
are described below. Separate implementations of the not correspond to TCM syndrome types. There are two
reasons. First, the definition of a TCM syndrome type
EAST and BI algorithms are also provided from the same
typically requires information from multiple latent aspects
website,[18] and enable users to run data analysis on more
of the data. Second, one latent aspect of the data might
powerful servers. On data sets that involve around 100 be related to multiple syndrome types. Two symptoms
symptom variables and 1 000 samples, EAST typically that tend to be mutually exclusive are usually caused by
takes several days, while BI takes a few hours. We different syndrome types. Two symptoms that tend to co-
recommend that users run EAST on servers rather than on occur might be caused by a single syndrome type or by
desktops. two different syndrome types that co-occur.

March 2017, Vol. 15, No. 2 113 Journal of Integrative Medicine


www.jcimjournal.com/jim

Figure 3 Structure of the latent tree model obtained on the VMCI data
The variables labeled with English phrases are symptom variables, while the Y-variables are latent variables. The integer next to a latent variable is the
number of its possible values. VMCI: vascular mild cognitive impairment.

Journal of Integrative Medicine 114 March 2017, Vol. 15, No. 2


www.jcimjournal.com/jim

To identify the cluster of patients that corresponds to VMCI. The data set is comprised of 803 patients and
a TCM syndrome type, we first select a set of symptom 93 symptoms. Detailed information about the data
variables based on domain knowledge and then perform set and the data collection process are reported in Fu
cluster analysis to group the patients based on those et al. [8]
variables. However, we do not always use the symptom The first step of the method is to perform LTA on the
variables themselves as features for the cluster analysis. VMCI data. This is done using the EAST algorithm. The
If two or more symptom variables are from the same structure of the resultant model is shown in Figure 3.
latent aspect of data, we “combine” them into one latent The variables labeled with English phrases are symptom
feature, and use the latent feature instead of the symptom variables, which are from the data set. The Y-variables are
variables themselves. The objective here is to relax the latent variables, which are introduced during data analysis.
local independence assumption. The integer next to a latent variable is the number of its
The following is an illustration of this approach: possible values. For example, Y01 has 2 possible values,
Figure 4 shows the model that we use to identify the whereas Y15 has 3.
cluster of patients that correspond to the symptom type The widths of the edges indicate strengths of
phlegm-dampness. According to domain knowledge, the correlations between neighboring variables. We see that
symptoms insomnia and dreamfulness are both related to
Y01 is strongly correlated with the symptom variables
phlegm-dampness, and hence are included in the model.
sallow complexion, asthenia of defecation, and dry stool
According to the results of LTA (Figure 3), they are from
the same latent aspect of data, and hence are not used as or constipation, and weakly related to clear profuse
features directly. Instead, they are “combined” into one urination. Similarly, Y08 is strongly correlated with the
latent feature and the latent feature is connected to the latent variables Y09 and Y12, and weakly related to Y04
clustering variable Z. Given Z, the two symptom variables and Y13.
are not mutually independent, and hence the local The latent variables reveal symptom co-occurrence
independence assumption is relaxed. patterns and symptom mutual-exclusion patterns hidden
The model in Figure 4 is called a joint clustering model in the data. Each of those statistical patterns can be
because it jointly considers information from several intuitively understood as the manifestation of a latent
aspects when performing cluster analysis on the patients. aspect of the data. In the rest of this section, we examine a
The BIC score of joint clustering model is –6 326. In few of the patterns.
contrast, the BIC score of the best LCM for the same Each of the latent variables represents a probabilistic
symptom variables is –6 390. So, a better model fit is partition of the patients in the data set. For example,
achieved by relaxing the local independence assumption. Y06 has two possible values and hence partitions
the patients into two clusters, which are denoted as
Y06 = s0 and Y06 = s1 respectively. Table 1 shows
the information about the partition compiled using
the “mod el interpretation” function of the Lant ern
software.

Table 1 Partition of patients given by the latent variable


Y06: the population is partitioned into two mutually exclusive
clusters, consisting of 79% and 21% of the patients respectively
Y06 = s0 Y06 = s1
Symptom Mutual information
(0.79) (0.21)
Thick tongue fur 0.05 0.63 0.16
Greasy tongue fur 0.38 0.79 0.06
The partition is characterized using the symptom variables directly
Figure 4 Joint clustering model for phlegm-dampness connected to Y06. Their occurrence probabilities in the two clusters are
shown in columns two and three. The fourth column shows the mutual
information between Y06 and the symptom variables.
We call our method for solving the TCM syndrome
classification problem the LTA method because of its The symptoms in Table 1 are those that are directly
reliance on LTA. The details of the method are presented connected to Y06 in Figure 3. They are sorted in
in the next five sections. descending order of their mutual information with the
partition, or, equivalently, with the variable Y06. Mutuol
4 Statistical symptom pattern discovery information measures the amount of information that a
symptom variable contains about the partition, or vice
We illustrate the LTA method using a data set on versa. It is closely related to the difference in occurrence

March 2017, Vol. 15, No. 2 115 Journal of Integrative Medicine


www.jcimjournal.com/jim

probabilities of the variables in the two clusters. The Table 4 Partition of patients given by the latent variable Y25
larger the mutual information, the larger the difference
Y25 = s0 Y25 = s1 Mutual
in occurrence probabilities, and the more important the Symptom
(0.64) (0.36) information
variable is for distinguishing between the two clusters.
Among the two symptoms in Table 1, thick tongue fur is Insomnia 0.16 0.78 0.20
more important because its occurrence probabilities in
the two clusters differ more than those of greasy tongue Dreamfulness 0.23 0.83 0.18
fur. Flushed face 0.10 0.03 0.01
We see that the two clusters consist of 79% and 21%
of the patients respectively. The symptoms occur with In general, a latent variable reveals either that a group
higher probabilities in the cluster Y06 = s1, and with of symptoms tend to co-occur, or that two subgroups of
lower probabilities in the cluster Y06 = s0. This indicates symptoms tend to be mutual exclusive, with symptoms
that the two symptoms are positively correlated. When in each group tending to co-occur. An examination of all
one symptom occurs, the other also tends to occur. In the latent variables in Figure 3 has found that Y02, Y05,
other words, the two symptoms tend to co-occur. In this Y09, Y10, Y12, Y14, Y15 and Y25 fall into the second
sense, Y06 reveals the probabilistic co-occurrence of the category, whereas the other latent variables fall into the
two symptoms. This is the statistical meaning of Y06. first category. The details are presented in Fu et al.[8]
Table 2 shows the information about that partition given
by the latent variable Y20. It reveals the probabilistic co- 5 Statistical pattern interpretation
occurrence of the three symptoms fat tongue, tongue with
ecchymosis, and tooth-marked tongue.
The objective is to identify patient clusters, based on
Table 3 shows the information about the partition given the VMCI data and domain knowledge that correspond
by the latent variable Y12. In Figure 3, Y12 is directly to TCM syndrome types, and to quantify the syndrome
connected to the two symptom variables slippery pulse types, using the statistical characteristics of the patient
and thin pulse. In the cluster Y12 = s0, slippery pulse clusters. We need to answer three questions to begin with:
occurs with high probability and thin pulse does not What are the target syndrome types? What information in
occur at all. In the cluster Y12 = s1, on the other hand, the data should we use for each target syndrome type? Is
slippery pulse occurs with low probability and thin pulse the information sufficient?
occurs with high probability. This indicates that the two One way to find the answers for these questions is to
symptoms are negatively correlated. When one symptom examine the symptoms in the data one by one, and, for
occurs, the other tends not to occur. In this sense, Y12 each symptom, list all the syndrome types that can lead
reveals the probabilistic mutual exclusion of slippery to the occurrence of the symptom according to domain
pulse and thin pulse. knowledge. In this way, a list of syndrome types is
Table 4 shows the information about the partition associated with each symptom. All the syndrome types
given by yet another latent variable Y25. It reveals that appear in the lists are the target syndrome types. For
the probabilistic mutual exclusion of two groups of each target syndrome type, all symptoms in the data set
symptoms: (1) insomnia and dreamfulness, and (2) flushed that it can cause to occur constitute the information we
face. Moreover, it indicates that the two symptoms in the should use for the syndrome type. The information is
first group tend to co-occur. sufficient if all key manifestations of the syndrome type
are covered.
Examining the symptoms one by one is tedious. We
Table 2 Partition of patients given by the latent variable Y20
suggest examining them in groups instead. As seen in
Y20 = s0 Y20 = s1 Mutual the previous section, LTA has discovered symptom co-
Symptom
(0.91) (0.09) information occurrence patterns and thereby divided the symptoms
into groups. We examine those statistical patterns one by
Fat tongue 0.05 0.58 0.08
one. For each pattern, we identify the syndrome types
Tongue with ecchymosis 0.02 0.30 0.04 that, according to domain knowledge, can lead to the co-
occurrence of the symptoms in the pattern. This step is
Tooth-marked tongue 0.08 0.46 0.04 therefore called Statistical Pattern Interpretation.
We begin with Y06, which reveals the probabilistic co-
Table 3 Partition of patients given by the latent variable Y12 occurrence of thick tongue fur and greasy tongue fur. To
determine the TCM connotation of the statistical pattern,
Y12 = s0 Y12 = s1 Mutual
Symptom
(0.43) (0.57) information
we ask this question: What TCM syndrome type or types
can bring about the co-occurrence of the two symptoms?
Slippery pulse 0.85 0.16 0.26 According to domain knowledge (e.g., Yang et al.[25]), the
answer is phlegm-dampness. Hence, phlegm-dampness is
Thin pulse 0.00 0.57 0.24
the interpretation for the statistical pattern.

Journal of Integrative Medicine 116 March 2017, Vol. 15, No. 2


www.jcimjournal.com/jim

Y20 reveals the probabilistic co-occurrence of the three the syndrome types and whether the information is
symptoms fat tongue, tongue with ecchymosis, and tooth- sufficient.
marked tongue. According to domain knowledge, the first One of the potential target syndrome types identified in
and the third symptoms can be caused by either phlegm- the previous section is phlegm-dampness. According to
dampness or Qi deficiency, and the second symptom can the discussions, the following symptoms should be used
be caused by blood stasis. Therefore, the co-occurrence of when quantifying the syndrome type: Y06 (thick tongue
the three symptoms can be due to either the co-occurrence fur and greasy tongue fur), Y20 (fat tongue and tooth-
of phlegm-dampness and blood stasis, or the co- marked tongue), Y12 (slippery pulse), and Y25 (insomnia
occurrence of Qi deficiency and blood stasis. We regard and dreamfulness).
phlegm-dampness and Qi deficiency as (two alternative) In the research of Fu et al., [8] all latent variables in
Figure 3 are examined and interpreted. The interpretations
primary interpretations of the statistical pattern, because
suggest that the following information should also be
they explain the leading symptom fat tongue. In contrast,
included when quantifying phlegm-dampness: Y11 (sticky
blood stasis is regarded as a secondary interpretation of or greasy feel in mouth), Y21 (dizziness, head feels as
the pattern. if swathed, distending headache, nausea or vomiting),
If a latent variable reveals the mutual exclusion of Y03 (dizzy headache), Y14 (thirst desire no drinks), Y16
two groups of symptoms, the two groups are interpreted (urinary incontinence), and Y19 (expectoration).
separately. For example, Y12 reveals the probabilistic In total, there are 16 symptoms that concern various
mutual exclusion of the two symptoms slippery pulse manifestations of phlegm-dampness. Together, they cover
and thin pulse. Slippery pulse can be explained by all major aspects of the impacts of phlegm-dampness.
phlegm-dampness, while thin pulse can be explained It is therefore concluded that phlegm-dampness is well
by either Qi deficiency or blood deficiency. Y25 reveals supported by the data and it can be a target syndrome type
the probabilistic mutual exclusion of two groups of for quantification.
symptoms: (1) insomnia and dreamfulness, and (2) flushed
face. The first group can be explained by any of several
7 Syndrome quantification
syndrome types, namely Yin deficiency, fire-heat, blood
deficiency, Qi deficiency, and phlegm-dampness. The
second group can be explained by either fire-heat or Yin To quantify phlegm-dampness, we perform cluster
deficiency. analysis on the patients based on the 16 symptom variables
Note that the interpretation of a co-occurrence pattern and thereby identify a patient cluster that corresponds
is about identifying an appropriate explanation for to the syndrome type. The size of the patient cluster is
the statistical pattern. It is not about determining the then regarded as a measurement of the prevalence of
syndrome type of a patient given that the pattern is phlegm-dampness, and occurrence probabilities of the
present. To interpret the co-occurrence of insomnia and 16 symptoms in the cluster are used as the definition of
dreamfulness, for instance, the correct question to ask phlegm-dampness.
is: What syndrome types can lead to the occurrence of The cluster analysis is carried out using the model
the two symptoms? There might be multiple possible shown in Figure 4. The latent variable Z, at the top,
answers for such a question as indicated above. The represents the patient clusters to be identified. The features
wrong question is: what is the syndrome type of a patient are from various latent aspects of the data. If there is only
if he has the two symptoms? This question cannot be one symptom variable from an aspect, it is connected to Z
answered because it is not possible to determine the directly. If there are multiple symptom variables from an
syndrome type of a patient only based on those two aspect, they are connected to Z via an intermediate latent
symptoms. variable.
Statistical pattern interpretation requires domain The analysis divides the patients into clusters by jointly
knowledge and thus expert judgment. It is possible that considering information from various aspects of data.
different experts interpret the same pattern in different Hence it is called joint clustering, and Z is called the joint
ways. If that happens, the issue can be resolved through clustering variable. The computational task is to determine
discussions among TCM researchers, for example, the number of possible values for Z and the probability
by following the Delphi method.[26] Statistical pattern parameters. This is done using Lantern through a
interpretation is where TCM researchers need to invest the procedure similar to standard LCA.
most effort when using the LTA method. As pointed out in Section 3, the use of the joint
clustering model in Figure 4, instead of the LCM with
6 Target syndrome type identification all the symptom variables as features, relaxes the local
independence assumption. The BIC score of the joint
The statistical pattern interpretation process gives rise to clustering model is –6 326. In contrast, the BIC score of
a list of syndrome types. Each of them is a potential target the best LCM is –6 390. So, the joint clustering model fits
for syndrome quantification. The next step is to determine the data better than the best LCM.
what symptoms in the data set should be used to quantify The result of the joint clustering is a partition with three

March 2017, Vol. 15, No. 2 117 Journal of Integrative Medicine


www.jcimjournal.com/jim

patient clusters, which are denoted as Z = s0, Z = s1 and 8 Syndrome classification


Z = s2 respectively. Information about the partition is
given in Table 5. There are 16 symptom variables in the In the previous two sections, we have discussed how
joint clustering model. Only seven of them are shown in to identity and quantify TCM syndrome types based on
Table 5, for simplicity. The others are pruned by Lantern domain knowledge and symptom co-occurrence patterns
based on cumulative information coverage. Specifically, discovered in data. In this section, we discuss the last step
Lantern sorts the 16 variables in descending order of their of the LTA method, which is to derive classification rules
mutual information with the joint clustering variable Z. for the syndrome types.
The cumulative information coverage of a variable in the 8.1 Model-based classification
ordered list is a ratio, where the numerator is the amount As seen in the previous section, quantitative
of information about the partition contained in the variable characterizations of TCM syndrome types are obtained
and in all those before it, and the denominator is the using joint clustering models such as the one shown in
amount of information about the partition contained in all Figure 4. For the following discussions, we consider a
the variables.[17] It exceeds 95% at the variable dizziness. general joint clustering model. We denote the symptom
Hence, we conclude that the first seven variables are variables in the model as X1, …, Xn and the joint clustering
sufficient for characterizing the differences among the latent variable as Z. Each symptom variable has two
three clusters. possible states 0 and 1, which represent the absence or
In Table 5, we see that the symptoms, especially the first presence of the symptom, respectively. The variable Z has
three, occur with much higher probabilities in the clusters two or more states. Let s be a state of Z that is interpreted
Z = s1 and Z = s2 than in the cluster Z = s0. We thus as a syndrome type and use ~s to denote the other state(s).
interpret Z = s1 and Z = s2 as two subtypes of phlegm- The problem is to determine whether Z = s or Z = ~s based
dampness and Z = s0 as the class of patients without on the values of the symptom variables.
phlegm-dampness. For the purpose of this paper, our goal This problem is simple in theory. All we need to do is
is to quantify the syndrome type phlegm-dampness and to compute the posterior distribution P(Z|X1,…, Xn) of Z
given the values of the symptom variables, and conclude
we are not interested in exploring it’s subtypes. Hence, we
that Z = s if its posterior probability is higher than that of
merge the two clusters Z = s1 and Z = s2 into one cluster
Z = ~s, that is,
Z = s12, and interpret Z = s12 as the class of patients with
P(Z = s|X1,…, Xn ) > P(Z = ~s|X1,…, Xn) (1)
phlegm-dampness. This method is called model-based classification.
If we accept the above interpretations, and if the Although conceptually simple, it lacks operability. It
patients surveyed are a representative sample of the is unrealistic to expect physicians to do probability
general VMCI population, then we have obtained a calculations in the clinic setting. It is of course possible
quantification of phlegm-dampness for VMCI. According to write a software tool for the physicians to use as a
to the quantification results (columns 2 and 5 of Table 5), black box. However, it might be difficult for patients and
58% of the VMCI patients have phlegm-dampness and physicians to trust black boxes.
42% of them do not. In the class of phlegm-dampness 8.2 Score-based classification rules
(Z = s12), greasy tongue fur occurs with a probability of For the sake of operability, we derive a score-
0.80, slippery pulse occurs with a probability of 0.60, and based classification rule to approximate model-based
so on. In the class of non-phlegm-dampness (Z = s0), the classification. Scores are associated with the symptoms.
symptoms occur with much lower probabilities. The total score for a patient is calculated based on the

Table 5 Joint clustering results for phlegm-dampness


Symptom Z = s0 (0.42) Z = s1 (0.44) Z = s2 (0.14) Z = s12 (0.58) MI CIC (%)
Greasy tongue fur 0.03 0.86 0.60 0.80 0.36 56
Sticky or greasy feel in mouth 0.05 0.18 0.62 0.29 0.09 70
Slippery pulse 0.27 0.67 0.39 0.60 0.07 75
Urinary incontinence 0.17 0.13 0.65 0.26 0.07 84
Dizzy headache 0.02 0.00 0.25 0.06 0.06 88
Expectoration 0.26 0.20 0.63 0.30 0.05 92
Dizziness 0.45 0.42 0.80 0.51 0.03 95
The patients are divided into three clusters Z = s0, Z = s1 and Z = s2. The last two clusters are merged into one cluster Z = s12. The sixth column
shows mutual information (MI) and the seventh column shows cumulative information coverage (CIC).

Journal of Integrative Medicine 118 March 2017, Vol. 15, No. 2


www.jcimjournal.com/jim

presence and absence of the symptoms. When the total becomes a classification rule and it is an approximation to
score exceeds a threshold, the patient is classified into the the rule given by inequality (1).
class Z = s. Table 6 shows the classification rule for phlegm-
We start by rewriting inequality (1) into the following dampness derived from the joint clustering model in
equivalent form using Bayes rule: Figure 4. We see that the score for greasy tongue fur is
P(Z = s)P(X1,…, Xn|Z = s) > P(Z = ~s)P(X1,…, Xn|Z = ~s) 7.1, the score for slippery pulse is 2.1, and so on. The
To obtain a score-based classification rule, we assume threshold is 4.2.
that the symptom variables are mutually independent
given Z = s or Z = ~s. Strictly speaking, the assumption is
Table 6 Classification rule for phlegm-dampness (Z = s12)
not true. In Figure 4, for instance, the two variables greasy
tongue fur and thick tongue fur are not independent of Symptom Score Threshold Accuracy
each other given Z. Hence approximations are introduced
Greasy tongue fur 7.10
here and their impacts will be assessed later. The
assumption allows us to further rewrite the inequality as Slippery pulse 2.10
follows:
P(Z = s)P(X1|Z = s)… P(Xn|Z = s) > P(Z = ~s) Sticky or greasy feel in mouth 2.80
P(X1|Z = ~s)… P( Xn|Z = ~s) Thick tongue fur 1.50
By taking the logarithm of both sides and re-arranging
the terms, we get: Dizzy headache 1.80
Tooth-marked tongue 1.00
Fat tongue 1.00
(2) Urinary incontinence 0.60 3.7 0.967
By subtracting a constant from both sides of inequality Dizziness 0.40 3.9 0.965
(2), we get:
Thirst desire no drinks 0.60 4.0 0.966
Head feels as if swathed 0.40 4.1 0.966
Expectoration 0.30 4.2 0.969
Two technical remarks are in order. First, the choice Nausea or vomiting 0.40 4.2 0.969
of base for the logarithm does not matter. We use
Distending headache 0.20 4.2 0.969
base 2. Second, the probability values might be 0
sometimes. We deal with this issue by smoothing. The Insomnia 0.02 4.2 0.969
formula for calculating the conditional distribution
Dreamfulness 0.02 4.2 0.969
. Smoothing means we
The shaded part can be deleted if one wants to simplify the rule.

change the formula to where


To use the classification rule, we need to examine
|Xi| is the number of possible values of Xi and c is the the patient and see whether the symptoms in the table
smoothing parameter. The smoothing parameter is to be are present. The patient gets one score for a symptom
determined by the user. In Lantern, it is set at 0.000 001 by if the symptom is present. If the total score exceeds the
default. threshold of 4.2, the patient is classified into the phlegm-
Note that on the left hand side of inequality (3), there is dampness class (Z = s12). Otherwise, the patient is
one term for each symptom variable. We regard it as the classified into the non-phlegm-dampness class (Z = s0).
score for that symptom. To be more specific, the score for The classification rule is an approximation of the model-
symptom variableis Xi is: based classification. How accurate is the approximation?
To answer this question, we applied both methods on
the patients in the VMCI data set. It turns out that the
Theoretically, there are two scores for each symptom rule classifies 96.9% of the patients the same way as the
variable, one for Xi = 0 (absence of the symptom), and model-based classification. Therefore, the accuracy of the
another for Xi = 1 (presence of the symptom). However, rule is 96.9%, as indicated at the bottom of the “accuracy”
the score for Xi = 0 is always 0. So, there is in effect only column of Table 6.
one score for Xi, i.e., the score for the presence of the 8.3 Understanding the scores
symptom. It is important to realize that the scores in Table 6 are
The term on the right side of (3) is regarded as the calculated from the probability values in Table 5, and they
threshold. Under this interpretation, the inequality have intuitive interpretations. To see this, note that:

March 2017, Vol. 15, No. 2 119 Journal of Integrative Medicine


www.jcimjournal.com/jim

In other words, a patient is classified into the phlegm-


dampness class if the probability of belonging to the class
exceeds 0.8. If the classification is increased by 4 to 8.2,
then the corresponding probability cut-off is 0.94.
8.5 Classification rule simplification
The first fraction on the right hand side is the odds of Several symptoms in Table 6 have low scores. We can
observing Xi = 1 in the class Z = ~ s, while the second consider eliminating such symptoms so as to simplify the
term is the odds of observing Xi = 1 in the class Z = ~s. classification rule.
So, the score is simply a log odds ratio, a standard term in The elimination of symptoms from a classification rule
statistics to quantify how strongly the presence or absence would affect its accuracy. The impact needs to be assessed
of one property (the symptom Xi) is associated with the before the elimination actually takes place. The Lantern
presence or absence of another property (whether the software has a function to facilitate this operation. To
patient is in the class Z = s). The score for a symptom is illustrate the function, we consider eliminating several
large if it occurs with high probability in Z = s and low symptoms at the bottom of Table 6. The accuracies of
probability in Z = ~s . the simplified rules are shown in the “accuracy” column.
As an example, consider how the score for the symptom The number 0.969, at the bottom, is the accuracy for
greasy tongue fur is computed from Table 5. According the rule with no symptoms removed; the second to
to Table 5, the symptom occurs in the phlegm-dampness last number, which happens to be also 0.969, is the
class with a probability of 0.80 and does not occur with accuracy for the rule with the last symptom removed
a probability of 0.20 (1–0.80). So, the odds of observing and so on. We see that, if we keep the top 8 symptoms
symptom in the phlegm-dampness class is 0.80/0.20 = 4.0. and remove the bottom 8 symptoms from the rule, the
On the other hand, the symptom occurs in the non- rule becomes much simpler and its accuracy is still high
phlegm-dampness with a probability of 0.03 and does (0.967). Consequently, we recommend the rule with the
not occur with a probability of 0.97. So, the odds of first 8 symptoms as the final rule. The threshold for the
observing the symptom in the non-phlegm-dampness simplified rule is 3.7.
class is 0.03/0.97 ≈ 0.03. Because the odds of observing
the symptom is higher in the phlegm-dampness class 9 Discussion
than in the non-phlegm-dampness class, it is positive
evidence for phlegm-dampness. Its score is the logarithm 9.1 Related works
of the odds ratio, that is, log2 (4.0/0.03) ≈ 7.1. Previous research on the TCM syndrome classification
The threshold 4.2 is also computed from probability problem has been conducted in three directions. The first
values in Table 5. Intuitively, it is the total strength of line of research is known as TCM Syndrome Essence
the evidence for non-phlegm-dampness when all the Study ( Zheng Shi Zhi Yan Jiu). The objective is to
symptoms are absent. identify biomarkers that can be used as gold standards
Note that the position of a symptom in the classification for TCM syndrome classification. Despite extensive
rule is determined not only by its score, but also by how efforts by many researchers over a long period of
often it occurs. For example, the score for the symptom time, little success has been achieved. When reflecting
sticky or greasy feel in mouth is 2.8, which is higher on the efforts in this direction, Zhao [3] wrote in 1999
than the score for the symptom slippery pulse, which is that “Syndrome essence study has accumulated a lot
2.1. However, the former symptom occurs with lower of experimental data in the past 40 years. Against the
probability in the data than the latter symptom and is original research objectives, all the data have brought us
hence applicable to a smaller fraction of the patients. are confusion and perplexity.”
Consequently, it is placed behind slippery pulse in the In the second line of research, panels of experts are
rule. set up to establish syndrome classification standards
8.4 Classification threshold adjustment for various WM diseases.[4,5] We call such efforts Panel
The symptom scores and thresholds in Table 6 are Standardization. A standard set up this way includes a
derived from equation (1). This means that a patient list of syndrome types that are deemed present among
is classified into the phlegm-dampness class when the the patients with a WM disease. It also provides, for each
probability of belonging to the class exceeds 0.5. If a syndrome type, a list of symptoms that are likely to occur,
higher probability cut-off is desired, one should increase and highlights the key symptoms for classification. There
the threshold of the rule. are typically no quantitative information about syndrome
If the classification threshold is increased by k, then the prevalence and symptom occurrence probabilities, and
probability cut-off is increased to . For example, if no clearly specified classification rules. Moreover, the
standards are based on subjective opinions of experts, and
the classification threshold is increased by 2 (from 4.2 different panels might produce different standards.[27]
to 6.2), then the corresponding probability cut-off is 0.8. The third line of research is based on labeled clinic

Journal of Integrative Medicine 120 March 2017, Vol. 15, No. 2


www.jcimjournal.com/jim

data.[2,6,28] In such a study, the patients of a WM disease are quantify the syndrome types and establish classification
surveyed and information about symptom occurrence is rules. This new approach can also be called Unsupervised
collected. The patients are examined by TCM physicians Quantification. Although domain knowledge is required
and their syndrome types are determined. In other words, when interpreting the statistical patterns discovered, the
the data have syndrome class labels and hence the data approach places TCM syndrome classification on the
are labeled. Statistics are then calculated based on the basis of stronger evidence than Panel Standardization and
labels to determine syndrome prevalence and symptom Supervised Quantification. The evidence might not be as
occurrence probabilities for each syndrome type. strong as what one would get from Syndrome Essence
Classification rules are established using statistical and Study, but the latter has not been successful so far. So, the
machine learning techniques such as regression, neural LTA method represents the best of what can be done for
networks and support vector machines.[7] The conclusions the time being.
One limitation of the LTA method is that it can currently
are summaries of the behaviors of the TCM physicians
deal with only dichotomous data, and hence is not able
who participate in the studies. Hence we call such work
to take the severity of symptoms into consideration.
Supervised Quantification. This is mainly for the sake of simplicity. While it is
Syndrome Essence Study would provide the strongest possible and would be interesting to extend the method to
evidence for TCM syndrome classification, but it has not polytomous data, doing so would substantially complicate
been successful. Panel Standardization and Supervised the methodology, software and the results. The literature
Quantification are based on weak evidence, and the results on LCA has mostly been focused on dichotomous data
depend heavily on the experts who participate in the for the same reason.[19–22] Since the results are meant to be
research work. guidelines for physicians to use as references, rather than
In the literature, cluster analysis is used more rigid standards for physicians to follow meticulously, it is
commonly to group symptom variables than to group desirable to keep them as simple as possible.
patients. [29] The symptom variable clusters obtained
are interpreted as syndrome types. This is problematic 10 Conclusions
because symptom variables being strongly correlated
with each other (and hence grouped together) does not How to properly classify a population of patients
necessarily imply that the symptoms tend to co-occur. into TCM syndrome types is a problem of fundamental
Mutual exclusion also implies strong correlation. In importance to TCM research and clinic practice. The
addition, each symptom is placed in only one cluster problem has not been satisfactorily solved before. A novel
and hence is related to only one syndrome type. In TCM data-driven method for solving the problem is presented
theory, on the other hand, a symptom can be caused by in this paper. It is called the LTA method and consists of
multiple syndromes. six steps:
Factor analysis is another unsupervised learning (1) Data collection: conduct a cross-sectional survey of
method that has been used to quantify syndrome types the population and collect data about the symptoms and
in terms of symptoms.[30] Factor analysis assumes that signs of interest to TCM.
observed variables (symptoms) are linear combinations (2) Statistical pattern discovery: analyze data using
of latent variables (syndrome types). The coefficients LTMs to reveal probabilistic symptom co-occurrence/
are called factor loadings. Researchers typically report mutual-exclusion patterns hidden in the data.
factor loadings for latent variables and interpret the latent (3) Statistical pattern interpretation: interpret the
patterns to determine their TCM connotations.
variables based on observed variables with high factor
(4) Syndrome identification: based on the statistical
loadings. Factor analysis uses continuous latent variables,
patterns and their interpretations, identify a list of
and hence it does not give patient clusters as the LTA syndrome types that are well supported by the data.
method does. In addition, the factors are usually assumed (5) Syndrome quantification: partition the population
to be mutually independent, while TCM syndrome types into patient clusters by performing joint clustering, match
are correlated. the patient clusters with syndrome types, and quantify the
9.2 Strengths and limitations syndrome types using population statistics of the patient
In this paper, we have presented a novel approach to clusters.
TCM syndrome classification, namely the LTA method. (6) Syndrome classification: derive a classification rule
The method starts with data on symptom occurrence. for each syndrome type from the results of syndrome
There are no judgments about syndrome types and hence quantification.
the data are unlabeled. The data are analyzed using Among the six steps, steps 2, 5 and 6 are carried out
LTMs to identify statistical symptom co-occurrence using computers, while steps 1, 3 and 4 are done by TCM
patterns, and the patterns are used to identify patient researchers.
clusters that correspond to syndrome types. The statistical Preliminary ideas behind the LTA method have appeared
characteristics of the patient clusters are then used to in the literature in the past decade. [31–38] Significant

March 2017, Vol. 15, No. 2 121 Journal of Integrative Medicine


www.jcimjournal.com/jim

advances are presented in this paper regarding the general 2014; 2014: 418206.
framework, statistical pattern interpretation, syndrome 7 Liu GP, Li GZ, Wang YL, Wang YQ. Modelling of inquiry
identification, syndrome quantification and syndrome diagnosis for coronary heart disease in traditional Chinese
classification. A software package, called Lantern, has medicine by using multi-label learning. BMC Complement
been developed to facilitate the use of the method. A Altern Med. 2010; 10: 37.
complete case study is reported in an accompanying 8 Fu C, Zhang NL, Chen BX, Chen ZR, Jin XL, Guo RJ,
Chen ZG, Zhang YL. Identification and classification of
paper.[8] Researchers should be able to use the method in
TCM syndrome types among patients with vascular mild
their own work after reading the two papers. cognitive impairment using latent tree analysis. [2016-08-
26]. http://arxiv.org/abs/1601.06923.
11 Funding 9 Xu ZX, Zhang NL, Wang YQ, Liu GP, Xu J, Liu TF,
and Liu AH. Statistical validation of traditional Chinese
Research on this article was supported by Hong Kong medicine syndrome postulates in the context of patients
Research Grants Council under grants No. 16202515 and with cardiovascular disease. J Altern Complement Med.
16212516, Guangzhou HKUST Fok Ying Tung Research 2013; 19(10): 799–804.
Institute, China Ministry of Science and Technology 10 Pearl J. Probabilistic reasoning in intelligent systems:
TCM Special Research Projects Program under grants networks of plausible inference. San Mateo, California:
No. 200807011, No. 201007002 and No. 201407001-8, Morgan Kaufmann Publishers. 1988.
11 Zhang NL. Hierarchical latent class models for cluster
Beijing Science and Technology Program under grant
analysis. J Mach Learn Res. 2004; 5: 697–723.
No. Z111107056811040, Beijing New Medical Discipline
12 Aldrich J. R.A. Fisher and the making of maximum
Development Program under grant No. XK100270569, likelihood 1912–1922. Statistical Sci. 1997; 12(3): 162–
and Beijing University of Chinese Medicine under grant 176.
No. 2011-CXTD-23. 13 Schwarz G. Estimating the dimension of model. Ann Statist.
1978; 6(2): 461–464.
12 Acknowledgements 14 Bartholomew DJ, Knott M. Latent variable models and
factor analysis. 2nd ed. London: Arnold. 1999.
We thank Xuezhong Zhou, Zhaoxia Xu, Hua Zhang, 15 Mourad R, Sinoquet C, Zhang NL, Liu TF, Leray P. A
Ping Liu and Vivian Tam for useful discussions and thank survey on latent tree models and applications. J Artif Intell
Jerry Wing Fai Yeung for valuable comments on earlier Res. 2013; 47: 157–203.
16 Liu TF, Zhang NL, Chen PX, Liu AH, Poon LKM, Wang Y.
versions of this paper.
Greedy learning of latent tree models for multidimensional
clustering. Mach Learn. 2013; 98(1): 301–330.
13 Conflict of interests 17 Chen T, Zhang NL, Liu TF, Wang Y, Poon LKM. Model-
based multidimensional clustering of categorical data. Artif
The authors declare that they have no competing Intell. 2011; 176(1): 2246–2269.
interests, financial or non-financial. 18 Zhang NL, Poon LKM, Liu TF. Lantern software. [2016-
01-18]. http://www.cse.ust.hk/faculty/lzhang/tcm/.
19 van Smeden M, Naaktgeboren CA, Reitsma JB, Moons KG,
REFERENCES de Groot JA. Latent class models in diagnostic studies when
there is no reference standard—a systematic review. Am J
1 Normile, D. The new face of traditional Chinese medicine. Epidemiol. 2014; 179(4): 423–431.
Science. 2003; 299(5604): 188–190. 20 van Loo HM, de Jonge P, Romeijn JW, Kessler RC,
2 Zhu WF. Differentiation of syndrome pattern elements. Schoevers RA. Data-driven subtypes of major depressive
Beijing: People’s Medical Publishing House. 2008: 6–12. disorder: a systematic review. BMC Med. 2012; 10: 156.
Chinese. 21 Wasmus A, Kindel P, Mattussek S, Raspe HH. Activity
3 Zhao GQ. Perplexity and outlet of TCM syndrome essence’s and severity of rheumatoid arthritis in Hannover/FRG and
study. Yi Xue Yu Zhe Xue. 1999; 20(11): 47–50. Chinese. in one regional referral center. Scand J Rheumatol Suppl.
4 China State Administration of Traditional Chinese 1989; 79: 33–44.
Medicine. Diagnostic and therapeutic effect evaluation 22 Sullivan PF, Smith W, Buchwald D. Latent class analysis
criteria of diseases and syndromes in traditional Chinese of symptoms associated with chronic fatigue syndrome and
medicine. Nanjing: Nanjing University Press. 1994. fibromyalgia. Psychol Med. 2002; 32(5): 881–888.
Chinese. 23 Yang X, Chongsuvivatwong V, Lerkiatbundit S, Ye J,
5 China State Bureau of Technical Supervision. National Ouyang X, Yang E, Sriplung H. Identifying the Zheng in
standards on clinic terminology of traditional Chinese psoriatic patients based on latent class analysis of traditional
medical diagnosis and treatment–—Syndromes, GB/T Chinese medicine symptoms and signs. Chin Med. 2014;
16751.2—1997. Beijing: China Standards Press. 1997. 9(1): 1.
Chinese. 24 Vermunt JK, Magidson J. Latent class cluster analysis. In:
6 Wang J, Xiong X, Liu W. Traditional Chinese medicine Hagenaars JA, McCutcheon AL. Advances in latent class
syndromes for essential hypertension: a literature analysis analysis. Cambridge: Cambridge University Press. 2002:
of 13,272 patients. Evid Based Complement Alternat Med. 89–106.

Journal of Integrative Medicine 122 March 2017, Vol. 15, No. 2


www.jcimjournal.com/jim

25 Yang WY, Meng FY, Jiang YN, Hu H, Guo J, Fu YL. Med. 2008; 14(5): 583–587.
Diagnostics of traditional Chinese medicine. Beijing: 33 Zhao Y, Zhang NL, Wang T, Wang Q. Discovering symptom
Xueyuan Press. 1998. Chinese. co-occurrence patterns from 604 cases of depressive patient
26 Linstone HA, Turoff M. The Delphi method: techniques and data using latent tree models. J Altern Complement Med.
applications. Vol. 29. Boston: Addison-Wesley Publishing 2014; 20(4): 265–271.
Company, Inc. 1975. 34 Zhang NL, Yuan SH. Implicit structure model and research
27 Zhang NL, Yuan SH, Wang TF, Zhao Y, Wang Y, Liu on syndrome differentiation of Chinese medicine (I): Basic
TF, Wang QG. Latent structure analysis and syndrome thought and analytic tool of implicit structure. Beijing
differentiation for the integration of traditional Chinese Zhong Yi Yao Da Xue Xue Bao. 2006; 29(6): 365–369.
medicine and Western medicine (I): basic principle. Shi Jie Chinese.
Ke Xue Ji Shu Zhong Yi Yao Xian Dai Hua. 2011; 13(3): 35 Zhang NL, Yuan SH, Chen T, Wang Y. Implicit structure
498–502. Chinese with abstract in English. model and research on syndrome differentiation of Chinese
28 Zhang GZ, Wang P, Wang JS, Jiang CY, Deng BW, Li P, medicine (II): data analysis of kidney deficiency. Beijing
Zhao YM, Liu WL, Zhai X, Chen WW, Zeng L, Zhou DM, Zhong Yi Yao Da Xue Xue Bao. 2008; 31(9): 548–587.
Sun LY, Li YY. Study on the distribution and development Chinese.
rules of TCM syndromes of 2 651 psoriasis vulgaris cases. 36 Zhang L, Yuan SH, Chen T, Wang Y. Implicit structure
Zhong Yi Za Zhi. 2008; 49(10): 894–896. Chinese with model and research on syndrome differentiation of Chinese
abstract in English. medicine (III): syndrome differentiation from model and
29 Zhang NL, Zhou XZ, Chen T, He LY, Liu BY. The syndrome differentiation by experts. Beijing Zhong Yi Yao
interpretation variable clustering results in the context of Da Xue Xue Bao. 2008; 31(10): 659–663. Chinese.
TCM syndrome research. Zhongguo Zhong Yi Yao Xin Xi Za 37 Wang TF, Zhang NL, Zhao Y, Wang Y, Wu XY, Yuan
Zhi. 2007; 14(7): 102–103. Chinese. SH, Wang ZY, Du CF, Yu CG, Chen T, Poon KM, Wang
30 Xue J, Wang Y, Han R. Factor analysis on the distribution QG. Latent structure models and their applications in
of Chinese medicine syndromes in patients with TCM syndrome research. Beijing Zhong Yi Yao Da Xue
hyperlipidemia in Xinjiang region. Zhongguo Zhong Xi Xue Bao. 2009; 32(8): 519–526. Chinese with abstract in
Yi Jie He Za Zhi. 2010; 30(11): 1169–1172. Chinese with English.
abstract in English. 38 Zhang NL, Xu ZX, Wang YQ, Liu TF, Liu GP, Li FF,
31 Zhang NL, Yuan SH, Chen T, Wang Y. Latent tree models Yan HX, Guo R. Latent structure analysis and syndrome
and diagnosis in traditional Chinese medicine. Artif Intell differentiation for the integration of traditional Chinese
Med. 2008; 42: 229–245. medicine and Western medicine (II): joint clustering. Shi
32 Zhang NL, Yuan S, Chen T, Wang Y. Statistical validation of Jie Ke Xue Ji Shu Zhong Yi Yao Xian Dai Hua. 2012, 14(2):
traditional Chinese medicine theories. J Altern Complement 1422–1427. Chinese.

Submission Guide
Journal of Integrative Medicine (JIM) is an international, types include reviews, systematic reviews and meta-analyses,
peer-reviewed, PubMed-indexed journal, publishing papers randomized controlled and pragmatic trials, translational and
on all aspects of integrative medicine, such as acupuncture patient-centered effectiveness outcome studies, case series and
and traditional Chinese medicine, Ayurvedic medicine, herbal reports, clinical trial protocols, preclinical and basic science
medicine, homeopathy, nutrition, chiropractic, mind-body studies, papers on methodology and CAM history or education,
medicine, Taichi, Qigong, meditation, and any other modalities editorials, global views, commentaries, short communications,
of complementary and alternative medicine (CAM). Article book reviews, conference proceedings, and letters to the editor.

● No submission and page charges ● Quick decision and online first publication
For information on manuscript preparation and submission, please visit JIM website. Send your postal address by e-mail to
jcim@163.com, we will send you a complimentary print issue upon receipt.

March 2017, Vol. 15, No. 2 123 Journal of Integrative Medicine

You might also like