To cite this article: Cungui Cheng , Jia Liu , Changjiang Zhang , Miaozhen Cai , Hong Wang & Wei
Xiong (2010) An Overview of Infrared Spectroscopy Based on Continuous Wavelet Transform Combined
with Machine Learning Algorithms: Application to Chinese Medicines, Plant Classification, and Cancer
Diagnosis, Applied Spectroscopy Reviews, 45:2, 148-164, DOI: 10.1080/05704920903435912
Downloaded by [Colorado College] at 00:44 09 December 2014
Applied Spectroscopy Reviews, 45:148–164, 2010
Copyright © Taylor & Francis Group, LLC
ISSN: 0570-4928 print / 1520-569X online
DOI: 10.1080/05704920903435912
Abstract: Infrared spectroscopy is a workhorse technique for materials analysis
and can positively identify many different types of material. In recent years
there have been reports of using wavelet analysis and machine learning algorithms to
extract features from Fourier transform infrared (FTIR) spectra. The machine learning
algorithms include the back-propagation neural network (BPNN), radial basis function
neural network (RBFNN), and support vector machine (SVM). This article reviews
the important advances in FTIR analysis employing a continuous wavelet transform
(CWT) and machine learning algorithms, especially applications of the method
to Chinese medicine identification, plant classification, and cancer diagnosis.
Introduction
Infrared radiation is commonly defined as electromagnetic radiation with wavenumbers from
14,300 to 20 cm−1 (0.7 to 500 µm). A normal molecular motion, such as a vibration,
rotation, rotation/vibration, lattice mode, or a combination, difference, or overtone of
these normal vibrations, may absorb infrared radiation in this region when it changes
the molecule’s dipole moment. The corresponding frequencies and intensities of
the infrared spectrum may be used to identify the material. They can also be used to quantify the
amount of a particular compound in a mixture. Chemical compounds of different classes
contain structural units that absorb infrared radiation at essentially similar frequencies and
intensities within the same class of compound. These bands are called group frequencies.
The infrared spectroscopy scientist uses the so-called group frequency, the similar
IR Spectroscopy with Machine Learning Algorithms 149
When the wavelet transform is used to analyze data, a proper wavelet basis function
and number of decomposition levels should be determined according to the spectral
characteristics of the signal. The suitable wavelet basis and wavelet scale are determined by
how well the signal decomposes at different scales and by the characteristics of the FTIR signal
in a wavelet multiscale decomposition procedure. There is no general criterion for
choosing the optimal wavelet basis function. In general, we choose a proper wavelet
basis function by considering its properties, the features of the
signal to be analyzed, and the actual problem. The part of the signal whose shape is similar
to that of the wavelet basis function will be enlarged, and other parts of the signal will
be suppressed. In addition, the wavelet scale is chosen according to the problem at hand.
A large-scale wavelet basis function should be used to describe the overall, approximate
properties of the signal. A small-scale wavelet basis function should
be used to highlight the details of the signal.
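The effect of scale can be illustrated with a minimal sketch in plain Python. It assumes a real-valued Morlet-type wavelet of the form exp(−t²/2)·cos(5t) and a synthetic test signal made of one broad band plus one small narrow spike. The names `morlet` and `cwt_row`, the scales 2 and 32, and the signal itself are illustrative choices, not values taken from the reviewed studies.

```python
import math

def morlet(t):
    """Real Morlet-type wavelet (assumed form): Gaussian envelope times a cosine."""
    return math.exp(-0.5 * t * t) * math.cos(5.0 * t)

def cwt_row(signal, scale):
    """CWT coefficients of `signal` at one scale, for every shift b."""
    n = len(signal)
    half = int(4 * scale)  # the envelope is negligible beyond 4 scale units
    row = []
    for b in range(n):
        acc = 0.0
        for k in range(max(0, b - half), min(n, b + half + 1)):
            acc += signal[k] * morlet((k - b) / scale)
        row.append(acc / math.sqrt(scale))
    return row

# Synthetic "spectrum": a broad band centered at index 90 plus a small
# narrow spike at index 180 (both hypothetical).
N = 256
signal = [math.exp(-((i - 90) / 40.0) ** 2) for i in range(N)]
signal[180] += 0.05

small = cwt_row(signal, 2.0)    # small scale: reacts to the narrow detail
large = cwt_row(signal, 32.0)   # large scale: follows the broad band

peak_small = max(range(N), key=lambda i: abs(small[i]))
peak_large = max(range(N), key=lambda i: abs(large[i]))
print("largest |coefficient| at scale 2 :", peak_small)
print("largest |coefficient| at scale 32:", peak_large)
```

In this sketch the small-scale coefficients are dominated by the spike, while the large-scale coefficients are dominated by the broad band, which is the behavior described above.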
When using CWT to detect the singularity of the curvature curve, we should choose
as the wavelet basis function a wavelet that has a shape similar to the signal to be analyzed,
a short compact support, and a large number of vanishing moments. Some representative
wavelet basis functions include the Coiflet, Symlet, Daubechies, Morlet, Mexican hat, and Meyer wavelets.
Figures 1a–f show their function curves in the time domain. Compared to the other
wavelets, the Morlet wavelet has the shortest compact support (Figure 1c), so we choose
a Morlet wavelet as the analyzing wavelet.
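The notion of vanishing moments can be checked numerically. The following sketch assumes the unnormalized Mexican hat wavelet (1 − t²)·exp(−t²/2) and estimates its first three moments by the trapezoidal rule; the function names and the integration range are illustrative assumptions.

```python
import math

def mexican_hat(t):
    """Unnormalized Mexican hat wavelet: (1 - t^2) * exp(-t^2 / 2)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def moment(wavelet, order, lo=-10.0, hi=10.0, steps=20000):
    """Trapezoidal estimate of the integral of t**order * wavelet(t)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        t = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * (t ** order) * wavelet(t)
    return total * h

# Moments of order 0, 1, and 2.
moments = [moment(mexican_hat, p) for p in range(3)]
print(moments)
```

The moments of order 0 and 1 vanish, whereas the second moment equals −2·sqrt(2π), so this wavelet has two vanishing moments: it is blind to constant and linear trends in a spectrum and responds only to curvature and sharper features.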
Machine learning uses computers to simulate human learning activities: the system
learns from data to acquire new knowledge and to recognize existing knowledge. A radial basis function
neural network (RBFNN) can extend or preprocess the input vector into a high-dimensional
space. It not only has good generalization ability but also avoids the complex computation
of a back-propagation neural network. An artificial neural network (ANN) is an engineering
system that simulates the structure and intelligent behavior of the brain, based on an
understanding of its organization and operating mechanism. SVM is a machine learning technique that has arisen
in recent years, and its core idea is to nonlinearly map the data to a high-dimensional
Figure 1. Wavelet basis function curves in time domain. (a) Mexican hat wavelet; (b) Meyer wavelet;
(c) Morlet wavelet; (d) Db10 wavelet; (e) Coiflet 5 wavelet; (f) Sym8 wavelet.
feature space. The back-propagation neural network (BPNN) is one of the most common
neural network structures. It is simple and effective. However, the back-propagation
algorithm using the gradient descent method is not optimal. The weights are
modified according to the gradient of the error curve, which points in the direction
of the local minimum nearest the current point. The resilient propagation (Rprop) algorithm
for BPNN provides a solution to this problem. This article mainly reviews the continuous
wavelet transform and machine learning algorithms as applied to infrared
spectra.
152 C. Cheng et al.
Figure 2. FTIR spectra of (a) Stephania tetrandra S. Moore and (b) Stephania cepharantha Hayata.
where Ch is the center of the basis functions and ρ is the width. Nodes compute the Euclidean
distance between the input vector and the center, and the transformation is then carried out
by the transfer function. The third layer is the output layer, where output node j can be
written as:
$$C_h(k) = C_h(k-1), \quad 1 \le h \le H,\ h \ne q$$
$$C_q(k) = C_q(k-1) + \alpha(k)\,[x(k) - C_q(k-1)] \qquad (6)$$
$$\Phi(k) = [\varphi_1(k), \varphi_2(k), \ldots, \varphi_H(k)]^T = [\varphi_1(l_1(k), \rho), \ldots, \varphi_H(l_H(k), \rho)]^T \qquad (10)$$
If the actual output is yj(k), the error is εj(k) = yj(k) − ŷj(k). According to the recursive least
squares method, the network weights are adjusted as follows:
$$P(k) = \frac{1}{\lambda(k)}\left[P(k-1) - \frac{P(k-1)\,\Phi(k)\,\Phi^T(k)\,P(k-1)}{\lambda(k) + \Phi^T(k)\,P(k-1)\,\Phi(k)}\right] \qquad (13)$$
where P is the error variance matrix and λ is a forgetting factor. To make the network
identify samples more accurately, we design a competition layer after the output layer
in the network. The output vector of the competition layer consists of the outputs of all
the neurons in that layer; the outputs of all neurons are zero except that of the
winning one. A radial basis function neural network has two sets of adjustable parameters: the centers
Ch and the weights Wij. The learning process of the network is divided into two stages. The first stage
is the center adjustment; that is, the center Ch of the Gaussian function of each hidden-layer
node is determined from the training samples. The second stage is the network weight
adjustment; that is, once the hidden-layer parameters have been determined, the
output-layer connection weights Wij are obtained by the least squares principle from
the given training samples. In their studies, the feature vector is input to the RBFNN, which is
trained to accurately classify Stephania tetrandra S. Moore and Stephania cepharantha
Hayata. One hundred twenty-eight pairs of FTIR spectra are used to train and test the proposed
method, where 78 pairs of data are used as training samples and 50 pairs of data
are used as testing samples. Experimental results show that the recognition rates
for Stephania tetrandra S. Moore and Stephania cepharantha Hayata are 99.8% and
99.9%, respectively, using the proposed method.
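The two-stage training and the competition layer described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the reviewed implementation: the Gaussian transfer function exp(−l²/(2ρ²)), the standard recursive least squares gain update for the weights, the initial values, and the two-class toy data are all hypothetical. Only the center update of Eq. (6) and the P update of Eq. (13) follow the formulas in the text.

```python
import math

def dist(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def hidden(x, centers, rho):
    """Hidden-layer outputs Phi for input x (Gaussian form is an assumption)."""
    return [math.exp(-dist(x, c) ** 2 / (2.0 * rho ** 2)) for c in centers]

def update_centers(x, centers, alpha):
    """Eq. (6): move only the winning (nearest) center q toward x."""
    q = min(range(len(centers)), key=lambda h: dist(x, centers[h]))
    centers[q] = [c + alpha * (xi - c) for c, xi in zip(centers[q], x)]

def rls_step(P, W, phi, target, lam):
    """One RLS step: Eq. (13) for P, standard gain update for W (assumed)."""
    H = len(phi)
    # P stays symmetric, so P*Phi*Phi^T*P equals the outer product of P*Phi.
    Pphi = [sum(P[r][c] * phi[c] for c in range(H)) for r in range(H)]
    denom = lam + sum(phi[r] * Pphi[r] for r in range(H))
    errors = [t - sum(w[h] * phi[h] for h in range(H)) for t, w in zip(target, W)]
    for j, e in enumerate(errors):
        for h in range(H):
            W[j][h] += Pphi[h] / denom * e
    for r in range(H):
        for c in range(H):
            P[r][c] = (P[r][c] - Pphi[r] * Pphi[c] / denom) / lam

def compete(outputs):
    """Competition layer: the winning neuron outputs 1, all others 0."""
    win = max(range(len(outputs)), key=lambda j: outputs[j])
    return [1 if j == win else 0 for j in range(len(outputs))]

# Hypothetical two-class feature vectors with one-hot targets.
train = [([0.1, 0.2], [1, 0]), ([0.2, 0.1], [1, 0]), ([0.0, 0.0], [1, 0]),
         ([0.9, 1.0], [0, 1]), ([1.0, 0.8], [0, 1]), ([1.1, 1.0], [0, 1])]
centers = [[0.3, 0.3], [0.7, 0.7]]
rho, lam = 0.5, 0.99

# Stage 1: center adjustment.
for _ in range(20):
    for x, _target in train:
        update_centers(x, centers, alpha=0.1)

# Stage 2: output-weight adjustment by recursive least squares.
H = len(centers)
P = [[1000.0 if r == c else 0.0 for c in range(H)] for r in range(H)]
W = [[0.0] * H for _ in range(2)]
for _ in range(20):
    for x, target in train:
        rls_step(P, W, hidden(x, centers, rho), target, lam)

def classify(x):
    phi = hidden(x, centers, rho)
    return compete([sum(w[h] * phi[h] for h in range(H)) for w in W])

print(classify([0.15, 0.1]), classify([1.0, 0.9]))
```

Fixing the centers before fitting the weights keeps the second stage a linear least squares problem, which is why it can be solved recursively without gradient descent.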
Work on a recognition method for distinguishing semen cuscutae from its sibling plant Japanese
dodder seed, based on FTIR-CWT and ANN classification, has been performed
(12). The artificial neural network has many models. It can be divided into feed-forward and
feedback networks according to the network structure. One of the main applications of a feed-forward
network is identification and classification. There are no strict distinctions between the
input and output layers of a feedback network, and the important characteristics
and energy minimization of the data are extracted after learning (13). When all the feed-forward neural network
nodes use the sigmoid function, one hidden layer is sufficient for arbitrary classification.
The network model and learning rule were then chosen, input and output data were presented
(the output data are also known as the target output data), and the network was trained
to obtain the node weights and node thresholds. The network weights and
thresholds were determined by comparing the output data of the artificial
neural network with the target output data until the errors were within the allowable range.
The sigmoid function was used as the activation function. To minimize the least square error
for the corresponding input sample p, we adjusted the thresholds and
weights. The least square error function can be written as
where tpj is the target output value of sample p at the jth node of the output layer, that is, the
type of the plant, and opj is the actual output value of sample p at that node. The
actual output value is calculated from the input layer to the output layer, and the error
and weight adjustments propagate from the output layer to the input layer. The output
value opj of node j (equal to the input value
when node j is an input-layer node) is calculated as
$$o_{pj} = f(\mathrm{net}_{pj}) = \frac{1}{1 + e^{-\sum_i w_{ji} o_{pi} - \theta_j}} \qquad (15)$$
where wji is the weight that connects nodes i and j, and θj is the threshold of node j.
A threshold can be regarded as a weight on a connection from a node whose output is always 1,
so it is adjusted in the same way as wji. The weight wji that connects
hidden-layer node i to output-layer node j is amended
as follows (when j is an output-layer node):
where η is the learning rate, α is the momentum term, and δpj is the error signal of
output-layer node j, which is calculated as follows:
When j is not an output-layer node, we used the same weight amendment for the connection
between hidden-layer node i and hidden-layer node j. But in this case the δpj calculation
becomes
where δpk is the error signal of node k, which receives input from node j, and wkj is the
weight connecting nodes j and k.
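The training procedure described above can be sketched with the textbook generalized delta rule with momentum. The specific update formulas in the code are the standard ones and are assumed here to correspond to the symbols defined in the text (η, α, δpj, wji, θj); the class name `TinyBPNN`, the starting weights, and the logical-OR training data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyBPNN:
    """One hidden layer of sigmoid nodes trained by back-propagation with momentum."""

    def __init__(self, w_hidden, w_out, eta=0.5, alpha=0.5):
        # The last entry of each weight row is the node threshold, treated
        # as a weight on a constant input of 1.
        self.w_hidden = [row[:] for row in w_hidden]
        self.w_out = [row[:] for row in w_out]
        self.eta, self.alpha = eta, alpha
        self.dw_hidden = [[0.0] * len(r) for r in w_hidden]
        self.dw_out = [[0.0] * len(r) for r in w_out]

    def forward(self, x):
        xin = list(x) + [1.0]
        hid = [sigmoid(sum(w * v for w, v in zip(row, xin))) for row in self.w_hidden]
        hin = hid + [1.0]
        out = [sigmoid(sum(w * v for w, v in zip(row, hin))) for row in self.w_out]
        return xin, hin, out

    def train_sample(self, x, target):
        xin, hin, out = self.forward(x)
        # Output-layer error signal: (t - o) * o * (1 - o).
        d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, out)]
        # Hidden-layer error signal: o * (1 - o) * sum_k delta_k * w_kj,
        # computed with the output weights from before this update.
        d_hid = [hin[j] * (1.0 - hin[j]) *
                 sum(d_out[k] * self.w_out[k][j] for k in range(len(d_out)))
                 for j in range(len(self.w_hidden))]
        self._update(self.w_out, self.dw_out, d_out, hin)
        self._update(self.w_hidden, self.dw_hidden, d_hid, xin)

    def _update(self, weights, last_dw, deltas, inputs):
        for j, delta in enumerate(deltas):
            for i, o_i in enumerate(inputs):
                # New change = eta * delta * input + alpha * previous change.
                dw = self.eta * delta * o_i + self.alpha * last_dw[j][i]
                weights[j][i] += dw
                last_dw[j][i] = dw

    def error(self, data):
        """Sum over samples of half the squared output error."""
        return sum(0.5 * sum((t - o) ** 2 for t, o in zip(target, self.forward(x)[2]))
                   for x, target in data)

# Hypothetical training set (logical OR) and fixed starting weights.
data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]
net = TinyBPNN(w_hidden=[[0.2, -0.1, 0.0], [-0.3, 0.4, 0.1]],
               w_out=[[0.3, -0.2, 0.0]])
initial_error = net.error(data)
for _ in range(2000):
    for x, target in data:
        net.train_sample(x, target)
final_error = net.error(data)
print(initial_error, final_error)
```

Storing each threshold as an extra weight on a constant input of 1 means thresholds and weights share a single update routine, as the text notes.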
We used FTIR and HATR techniques to obtain the FTIR spectra of semen cuscutae and its sibling
Japanese dodder seed. The features of the similar spectra of semen cuscutae and Japanese
dodder seed were extracted by continuous wavelet transformation. After comparative
analysis, we chose decomposition levels 7, 10, and 13 to extract the feature
vectors, which were used to train the ANN. The trained neural network was then used to
classify semen cuscutae and Japanese dodder seeds collected from different
places all over China. Based on 32 testing samples, we found that the sibling plants semen
cuscutae and Japanese dodder seed could be effectively identified by FTIR with
continuous wavelet features and an artificial neural network (13).
Jin and Cheng (14) used continuous wavelet transformation with HATR-FTIR spectroscopy
to identify Fructus lycii and its unofficial samples of the same genus. The spectra
of Fructus lycii and its unofficial samples of the same genus were obtained directly by
using HATR-FTIR. Then Jin and Cheng (14) used principal component analysis (PCA) to
determine the information load in all the spectral regions, and the Morlet wavelet was used
as the mother wavelet to analyze the HATR-FTIR spectra of Fructus lycii and its unofficial samples
in the continuous wavelet domain.
The results showed that combining HATR-FTIR with continuous wavelet transformation
has important application value for identifying Fructus lycii and its unofficial samples. This
method was shown to be direct, rapid, and accurate.
We identified semen celosiae and semen celosiae cristatae using continuous
wavelet transformation with FTIR (15). FTIR spectroscopy was used to obtain the infrared
spectra of semen celosiae and semen celosiae cristatae directly, quickly, and accurately.
Then CWT was used to extract the features of the FTIR spectra and succeeded in enlarging
the differences between the FTIR spectra of semen celosiae and semen celosiae cristatae.
The identification accuracy was greatly improved. Because the Morlet wavelet can
effectively detect singularities of the signal, it was selected as the mother wavelet,
and a one-dimensional continuous wavelet transformation was applied to the infrared
spectra of semen celosiae and its confusable varieties. We observed the difference between
semen celosiae and semen celosiae cristatae at all scales in the wavelet domain and
determined an optimal scale at which the difference between the two was most obvious
for identification. The results show that it
is effective to apply continuous wavelet transformation on the basis of FTIR to identify
traditional Chinese medicinal materials that are of the same genus but different species.
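The search for an optimal scale can be sketched as follows, assuming that "most obvious difference" is measured as the largest absolute difference between the wavelet coefficients of the two spectra. The criterion, the real Morlet-type wavelet, the candidate scales, and the two synthetic spectra (identical except for one small narrow band) are illustrative assumptions.

```python
import math

def morlet(t):
    """Real Morlet-type wavelet (assumed form)."""
    return math.exp(-0.5 * t * t) * math.cos(5.0 * t)

def cwt_row(signal, scale):
    """CWT coefficients of `signal` at one scale, for every shift b."""
    n = len(signal)
    half = int(4 * scale)
    row = []
    for b in range(n):
        acc = 0.0
        for k in range(max(0, b - half), min(n, b + half + 1)):
            acc += signal[k] * morlet((k - b) / scale)
        row.append(acc / math.sqrt(scale))
    return row

def best_scale(spec_a, spec_b, scales):
    """Return (scale, gap) for the scale with the largest coefficient difference."""
    best = None
    for a in scales:
        row_a, row_b = cwt_row(spec_a, a), cwt_row(spec_b, a)
        gap = max(abs(u - v) for u, v in zip(row_a, row_b))
        if best is None or gap > best[1]:
            best = (a, gap)
    return best

# Two hypothetical spectra: the same broad band, but spec_b carries an
# extra weak, narrow band centered at index 150.
N = 256
spec_a = [math.exp(-((i - 90) / 40.0) ** 2) for i in range(N)]
spec_b = [v + 0.05 * math.exp(-((i - 150) ** 2) / 8.0) for i, v in enumerate(spec_a)]

scale, gap = best_scale(spec_a, spec_b, [2.0, 4.0, 8.0, 16.0, 32.0, 64.0])
print("optimal scale:", scale)
```

The weak band is barely visible in the raw spectra, but at an intermediate scale matched to its width the coefficient difference is largest, so that scale would be the natural choice for feature extraction in this toy case.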
We did research on the recognition of Equisetum arvense L. and Hippochaete ramosissima
Boerner (16). Based on FTIR spectrometry with chemometrics, we used FTIR and
HATR techniques to obtain the FTIR spectra of Equisetum arvense L. and Hippochaete
ramosissima Boerner. Features of their similar absorption were extracted by CWT. The features
at decomposition levels 8, 9, and 10 were used as input data for the SVM. A discrimination
model was established by FTIR-CWT-SVM. SVM is a machine learning technique
that has arisen in recent years, and its core idea is to nonlinearly map the data to a
high-dimensional feature space. An optimal separating hyperplane with a low
Vapnik–Chervonenkis dimension is constructed in this space, which minimizes the upper bound
of the classification risk. Methods based on the principle of minimum empirical risk are
theoretically guaranteed only when the number of samples tends to infinity;
most properties of the SVM method instead follow from the structural risk minimization principle
proposed by Vapnik (17). It requires not only that the classes be separated without error but
also that the margin between them be maximal, which guarantees minimal risk. The SVM was
first proposed to solve the problem of linear
separability. Because linearly inseparable problems also exist, the SVM method was extended to solve
nonlinear problems. Its main ideas are as follows: Suppose there are n training samples (xi, yi) ∈
Rd × {±1}. Transform the input vector to a high-dimensional feature space by a
nonlinear mapping g := (g1, g2, . . .), and then seek the optimal classification in the new
space. The nonlinear map is implemented by defining an appropriate inner product function.
The optimal hyperplane constructed in the feature space can be expressed as follows:
$$H(x) = \sum_{i=1}^{n} \alpha_i y_i \langle g(x), g(x_i) \rangle + \alpha_0 \qquad (19)$$
The feature space is a Hilbert space. The explicit form of g(x) need not be known,
because only the inner product operation, given by a kernel function, is involved:
Select a kernel function that meets the Mercer conditions, so that samples that are
linearly inseparable in the input space can be separated in the high-dimensional feature space. Equation
(20) can be expressed as:
$$H(x) = \sum_{i=1}^{n} \alpha_i y_i K(x, x_i) + \alpha_0 \qquad (21)$$
where
$$\sum_{i=1}^{n} \alpha_i y_i = 0 \qquad (23)$$
$$0 \le \alpha_i \le C, \quad i = 1, \ldots, n \qquad (24)$$
C is a positive real constant in Eq. (25). It is the control parameter of the slack variables,
which are introduced to allow for samples that cannot be correctly classified, and it controls
the extent of the penalty for misclassification (18). SVM can be carried out as shown in Figure 4.
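The decision function of Eq. (21) and the constraints of Eqs. (23) and (24) can be illustrated with a minimal sketch. It assumes an RBF kernel of the form exp(−γ‖u − v‖²) and uses only two hypothetical training samples, one per class, because in that case the dual problem can be solved by hand; the function names, γ, and C are illustrative assumptions.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """Radial basis function kernel K(u, v) = exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def decision(x, support, alphas, labels, alpha0, kernel):
    """Eq. (21): H(x) = sum_i alpha_i * y_i * K(x, x_i) + alpha_0."""
    return sum(a * y * kernel(x, xi)
               for a, y, xi in zip(alphas, labels, support)) + alpha0

def feasible(alphas, labels, C, tol=1e-9):
    """Check the constraints of Eqs. (23) and (24)."""
    balanced = abs(sum(a * y for a, y in zip(alphas, labels))) <= tol
    boxed = all(-tol <= a <= C + tol for a in alphas)
    return balanced and boxed

# Two hypothetical one-dimensional training samples, one per class.
support = [[0.0], [2.0]]
labels = [-1, +1]
C = 10.0

# With two samples, Eq. (23) forces alpha_1 = alpha_2, and maximizing the
# dual objective gives alpha = 1 / (1 - K(x1, x2)) with alpha_0 = 0.
k12 = rbf_kernel(support[0], support[1])
alpha = 1.0 / (1.0 - k12)
alphas, alpha0 = [alpha, alpha], 0.0

for x in ([0.0], [2.0], [1.9]):
    print(x, decision(x, support, alphas, labels, alpha0, rbf_kernel))
print("constraints satisfied:", feasible(alphas, labels, C))
```

Both training samples lie exactly on the margin, H(x) = ±1, and the sign of H(x) gives the predicted class of a new sample. The mapping g(x) never appears, only the kernel values.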
The discrimination accuracy for the 120 prediction samples was above 90% after
training, and when the radial basis function was used as the kernel function, the
discrimination accuracy reached 100%. The results show that the feature values of
the FTIR spectra of the samples differed after continuous wavelet feature extraction, and that
SVM is an advantageous classifier for the two plants. By using SVM to classify
the feature values extracted by the continuous wavelet transform, the morphologically
similar plants Equisetum arvense L. and Hippochaete ramosissima
Boerner were effectively identified (16).
Figure 5. FTIR spectra of (a) yellow foxtail seed, (b) giant foxtail seed, and (c) green foxtail seed.
a fast, reliable, objective, and effective method of chemical taxonomy. However, FTIR
analysis has limitations, so fast and accurate classification is needed (21).
The application of continuous wavelet transformation to infrared spectra was reported by
Cheng et al. (21). Because the wavelet coefficients at each level differ
for the same feature of the signal, only a few coefficients are needed to reflect
the absorption peaks of the spectra. It is one of the most efficient chemometric analysis
methods. Cheng et al. (21) were successful in effectively identifying the sibling plants
yellow foxtail seed, giant foxtail seed, and green foxtail seed by FTIR with continuous
wavelet features and an ANN classifier. These plants are difficult to distinguish by traditional
phytotaxonomy. FTIR and HATR techniques were used to obtain the FTIR spectra of the
yellow foxtail seed, the giant foxtail seed, and the green foxtail seed, as shown in Figure 5.
Because they are sibling plants, they contain similar chemical components,
including the hydroxy groups of cellulose (seed coat), starch, and the plant hormone β-sitosterol, and
their IR absorption is quite similar. The FTIR spectra of the different plants have similar
absorbance, and it is difficult to distinguish them. CWT was used to extract their features for
further classification. The feature vectors were used to train the artificial neural network.
The trained neural network was used to classify the seeds. It effectively identified the
sibling plants yellow foxtail seed, giant foxtail seed, and green foxtail seed by FTIR with
continuous wavelet feature extraction and ANN classification. We (21) reported that
the wavelet transform is a more effective signal processing method than FTIR and plays
an important role in signal analysis and feature extraction. Ehrentreich pointed out that the
wavelet transform has been established with FTIR as a data processing method in analytical
chemistry. This method makes a contribution to plant classification (21).
sis process is tedious and is often influenced by human factors. Apart from conventional
methods of cancer diagnosis, it is necessary to develop new approaches that are simple,
objective, and noninvasive. In recent years, FTIR has been developed as a diagnostic tool
for various human cancers and other diseases. Wang et al. (23) used FTIR spectroscopy to
analyze normal and cancerous tissues of the esophagus. The application of FTIR to cancer detection
was reported by Sahu (24). Mark et al. (25) reported FTIR microspectroscopy as a
quantitative diagnostic tool for assignment of premalignancy grading in cervical neoplasia.
However, distinguishing normal tissue from early cancer is difficult because their FTIR
spectra are similar. In order to improve the accuracy of diagnosing earlier stages
of cancer with FTIR, a novel method of FTIR feature extraction using CWT analysis
and classification using SVM was reported by Cheng et al. (26). Our research showed that
the wavelet transform can easily detect singularities of the signal. Faint differences
between signals can be greatly amplified by the wavelet transform.
The current authors of this review have performed many studies and succeeded in
efficiently identifying normal gastric tissues, early cancer, and advanced cancer tissues
using continuous wavelets and SVM (6). Distinguishing normal gastric tissues
from early cancer is difficult because their FTIR spectra are similar.
As shown in Figure 6, the locations, intensities, and shapes of the peaks change greatly. The
hydroxy absorption peak from protein, nucleic acid, and lipids appears at about 3400 cm−1, with
no obvious difference in intensity. The carbonyl absorption peak from protein
appears at 1649 cm−1, and its intensity decreases in cancerous tissue. The amide II absorption
peak appears at 1538 cm−1, and its intensity also decreases in
cancerous tissue. Other bands, such as the symmetric and asymmetric
stretching vibration peaks of the phosphodiester groups of nucleic acids, appear at 1084 and
1242 cm−1. The absorption peak of collagen appears at 1342 cm−1. As the degree
of cancer increases, the absorption intensities in the FTIR spectrum gradually decrease.
In order to improve the accuracy of diagnosing early-stage gastric cancer with FTIR, we
developed a novel method of FTIR feature extraction using CWT and classification using
the SVM. From the FTIR spectra of normal gastric tissues, early carcinoma tissues, and advanced
gastric carcinoma tissues, nine feature parameters were extracted with CWT. With SVM,
all spectra were classified into two categories: normal or abnormal, the latter including early
carcinoma and advanced gastric carcinoma. The polynomial (poly) and
radial basis function (RBF) kernels gave the highest accuracy rates among all kernels. The accuracy rate of poly
kernels in gastric normal tissues, early carcinoma tissues, and advanced carcinoma tissues
was 100%, 96%, and 100%, respectively. The accuracy rate of RBF kernels in normal, early
Figure 6. FTIR spectra of (a) gastric normal tissues, (b) early cancer, and (c) advanced cancer
tissues.
carcinoma, and advanced carcinoma was 100, 96, and 100%, respectively. The research
results show the feasibility of establishing the models with FTIR-CWT-SVM to identify
normal tissues and early carcinoma tissues.
We conducted a study on the early detection of colon cancer using wavelet feature
extraction and reported a new method that combines wavelet-based feature extraction
from FTIR spectra with classification
using SVM (27). The FTIR data collected from 36 normal Sprague–Dawley (SD) rats, 60
1,2-DMH-induced SD rats, and 44 second-generation induced rats were first preprocessed.
Then, 12 feature variables were extracted using CWT. The extracted feature variables were
then input into the SVM for classification as normal, dysplasia, early carcinoma, or
advanced carcinoma. Among the kernel functions used in the SVM, the poly and RBF
kernels had the highest accuracy rates. The accuracy of the poly kernels in normal, dysplasia,
early carcinoma, and advanced carcinoma was 100, 97.5, 95, and 100%, respectively.
The accuracy of the RBF kernels in normal, dysplasia, early carcinoma, and advanced
carcinoma was 100, 95, 95, and 100%, respectively. The results indicate that this method
could effectively and easily diagnose colon cancer in its early stages.
We did research on wavelet feature extraction and SVM classification
of FTIR lung cancer data (26). In order to improve the accuracy of early-stage lung
cancer diagnosis with FTIR, we developed a novel method of FTIR
feature extraction using wavelet analysis and classification using SVM. From the FTIR spectra of normal lung,
early carcinoma, and advanced lung cancer tissues, nine feature variables were extracted
with CWT. With SVM, we classified all spectra into two categories, normal and abnormal,
the latter including early lung cancer and advanced lung cancer. We found that the accuracy
rates of the poly and RBF kernels were the highest among all kernels. The accuracy rates of poly kernels
in normal, early lung cancer, and advanced cancer were 100, 95, and 100%, respectively,
and those of RBF kernels in normal, early lung cancer, and advanced cancer were 100, 95,
and 100%, respectively. The results show the feasibility of establishing models with an
FTIR-CWT-SVM method to identify normal lung tissue, early lung cancer, and advanced
lung cancer.
Li et al. (28) reported the classification of FTIR cancer data using wavelets and fuzzy
C-means clustering, together with a wavelet-based feature extraction method for FTIR cancer data
analysis. They used a set of low-frequency wavelet basis functions to represent the FTIR
data, which reduces the data dimension and removes noise. The fuzzy C-means algorithm was used to
classify the data. They compared classification performance using wavelet features and the
original FTIR data provided by the Derby City General Hospital in the UK. The results
show that only 30 wavelet features are needed to represent 901 cm−1 of the FTIR data to
produce good clustering results.
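The clustering step can be sketched with a plain fuzzy C-means routine. This is a generic textbook version, not the implementation of Li et al.; the fuzzifier m = 2, the starting centers, and the six two-dimensional feature vectors are hypothetical.

```python
def fuzzy_c_means(points, centers, m=2.0, iterations=50, eps=1e-12):
    """Plain fuzzy C-means on lists of feature vectors.

    Returns (centers, memberships); memberships[k][i] is the degree to
    which point k belongs to cluster i.
    """
    centers = [c[:] for c in centers]
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Memberships from ratios of point-to-center distances.
        memberships = []
        for p in points:
            d = [max(eps, sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5)
                 for c in centers]
            memberships.append([1.0 / sum((d[i] / d[j]) ** power
                                          for j in range(len(d)))
                                for i in range(len(d))])
        # Each center becomes the membership-weighted mean of all points.
        for i in range(len(centers)):
            w = [u[i] ** m for u in memberships]
            total = sum(w)
            centers[i] = [sum(wk * p[dim] for wk, p in zip(w, points)) / total
                          for dim in range(len(points[0]))]
    return centers, memberships

# Hypothetical wavelet feature vectors forming two groups.
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
          [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, memberships = fuzzy_c_means(points, [[1.0, 1.0], [4.0, 4.0]])
cluster_labels = [max(range(2), key=lambda i: u[i]) for u in memberships]
print(centers)
print(cluster_labels)
```

Unlike hard clustering, every spectrum receives a graded membership in each cluster, so borderline samples can be flagged rather than forced into one class.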
We studied the classification of FTIR cancer data using wavelets and
BPNN (29). We presented a wavelet-based feature extraction method for HATR-FTIR
cancer data analysis, with classification by an artificial neural network trained with a
back-propagation algorithm. Extraction using the CWT from 168 spectra of fresh normal
and abnormal lung tissue samples gave 12 features. Further classification based on BPNN
resulted in two categories: normal or abnormal. The BPNN is one of the most common
neural network structures because of its simplicity and efficiency. However, the
back-propagation algorithm using the gradient descent method is not optimal. The weights
are modified according to the gradient of the error curve, which points in the
direction of the local minimum nearest the current point. The Rprop algorithm for BPNN provides
a solution to this problem. Four parameters are used in the Rprop algorithm: the
initial weight adjustment step s0, the maximum weight adjustment step smax, the increasing factor
for the weight adjustment step w++, and the decreasing factor w−. The preferred values for
the increasing and decreasing factors are w++ = 1.2 and w− = 0.5. The
initial weight adjustment step s0 can be set arbitrarily. Usually, s0 = 0.1 and smax = 50.0. The
number of hidden layer neurons can be estimated by the following equation:
$$H_n = \begin{cases} I_n + 0.168 \times (I_n - O_n), & I_n > O_n \\ O_n - 0.168 \times (O_n - I_n), & I_n < O_n \end{cases} \qquad (26)$$
where Hn is the number of neurons in the hidden layer, In is the number of
neurons in the input layer, and On is the number of neurons in the output layer.
The number of neurons in the hidden layer is 10 according to the initial computation. Starting
from this value, the neural network is trained and tested while the number
of neurons in the hidden layer is increased or decreased. The optimal number of neurons in the
hidden layer is the one that gives the highest prediction accuracy. The accuracy of identifying normal,
early carcinoma, and advanced carcinoma was 100, 90, and 100%, respectively. The results
indicate that FTIR with CWT and BPNN can effectively and easily diagnose lung cancer
in its early stages (29).
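The Rprop step-size rule with the parameter values quoted above (s0 = 0.1, smax = 50.0, w++ = 1.2, w− = 0.5) can be sketched as follows. The variant shown, without weight backtracking, the lower step limit `smin`, and the two-weight quadratic error surface are assumptions made for illustration; they are not taken from the reviewed study.

```python
def sign(v):
    return (v > 0) - (v < 0)

def rprop_minimize(grad, w, s0=0.1, smax=50.0, inc=1.2, dec=0.5,
                   smin=1e-9, iterations=200):
    """Rprop without weight backtracking; only the gradient signs are used."""
    w = list(w)
    step = [s0] * len(w)     # one adjustment step per weight
    prev = [0.0] * len(w)    # gradient from the previous iteration
    for _ in range(iterations):
        g = grad(w)
        for i in range(len(w)):
            if g[i] * prev[i] > 0:      # same sign: enlarge the step
                step[i] = min(step[i] * inc, smax)
            elif g[i] * prev[i] < 0:    # sign change: a minimum was jumped over
                step[i] = max(step[i] * dec, smin)
                g[i] = 0.0              # skip the move for this weight once
            w[i] -= sign(g[i]) * step[i]
            prev[i] = g[i]
    return w

# Hypothetical error surface E(w) = (w0 - 3)^2 + 10 * (w1 + 1)^2.
def grad(w):
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]

w_final = rprop_minimize(grad, [0.0, 0.0])
print(w_final)
```

Because only the sign of each partial derivative is used, the two weights converge at similar speeds even though their gradient magnitudes differ by a factor of ten, which is the main practical advantage over plain gradient descent.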
Acknowledgment
This research was supported by the National Natural Science Foundation of China under
project number 30800705.
References
1. Mckelvy, M., Britt, T.R., Davis, B.L., and Gillie, J.K. (1998) Infrared spectroscopy. Anal. Chem.,
70(12): 119–177.
2. Movasaghi, Z., Rehman, S., and Rehman, I.ur. (2008) Fourier transform infrared spectroscopy
of biological tissues. Appl. Spectros. Rev., 43: 134–179.
3. Tian, G.Y., Yuan, H.F., Liu, H.Y., and Lu, W.Z. (2003) The application of wavelet transform in
19. Lü, H.F., Cheng, C.G., Tang, X., and Hu, Z.H. (2004) FTIR spectrum of hypericum and tradeum
with reference to their identification. Acta Botanica Sinica, 46: 401–406.
20. Huang, H., Sun, S.Q., Xu, J.W., and Wang, Z. (2003) Novel application of FTIR in medical herb
chemotaxonomy. Spectroscopy and Spectral Analysis, 23: 253–257.
21. Cheng, C.G., Xiong, W., Tian, Y.M., and Zhang, C.J. (2009) A novel recognition of three kinds of
sibling plants using FTIR with continuous wavelet feature extraction combined with an artificial
neural network. Spectroscopy, 24 (2): 58–67.
22. Cheng, C.G., Sun, G.L., and Zhang, C.J. (2007) Early detection of gastric cancer using wavelet
feature extraction and neural network classification of FT-IR. Spectroscopy, 22 (11): 38–42.
23. Wang, J.S. and Shi, J.S. (2003) FT-IR spectroscopic analysis of normal and cancerous tissues of
esophagus. World J. Gastroenterol., 9: 1897–1899.
24. Sahu, R.K. and Mordechai, S. (2005) Fourier transform infrared spectroscopy in cancer detection.
Future Oncol., 1: 635–647.
25. Mark, S., Sahu, R.K., and Kantarovich, K. (2004) Fourier transform infrared microspectroscopy
as a quantitative diagnostic tool for assignment of premalignancy grading in cervical neoplasia.