
Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 255 (2021) 119722

Contents lists available at ScienceDirect

Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy
journal homepage: www.elsevier.com/locate/saa

SERS-based viral load quantification of hepatitis B virus from PCR products
Fatima Batool a, Haq Nawaz a,⇑, Muhammad Irfan Majeed a,⇑, Nosheen Rashid b, Saba Bashir a, Saba Akbar a,
Muhammad Abubakar a, Shamsheer Ahmad a, Muhammad Naeem Ashraf a, Saqib Ali a, Muhammad Kashif a,
Imran Amin c
a Department of Chemistry, University of Agriculture, Faisalabad 38040, Pakistan
b Department of Chemistry, University of Central Punjab, Lahore, Faisalabad Campus, Pakistan
c PCR Laboratory, PINUM Hospital, Faisalabad, Pakistan

Highlights

• SERS analysis is conducted on PCR products of viral DNA extracted from various hepatitis B patients.
• SERS features that increase or decrease in intensity with changing viral load are used for diagnostic purposes.
• Data are analyzed using multivariate analysis techniques including PCA, PLS-DA and PLSR.
• PCA differentiates between healthy and diseased samples with 89% sensitivity and 98% specificity.
• The PLSR model gives a goodness value of 0.9031 with an RMSE of 0.2923 and proves valid for prediction of a blind sample.

Article info

Article history:
Received 22 January 2021
Received in revised form 5 March 2021
Accepted 15 March 2021
Available online 19 March 2021

Keywords:
SERS
Hepatitis B, Viral DNA
Viral load
Silver nanoparticles
Principal component analysis
Partial least square regression analysis

Abstract

Hepatitis B is a contagious liver disorder caused by hepatitis B virus; if not treated at an early stage, it becomes chronic and results in liver cirrhosis and hepatocellular carcinoma, which can even lead to death. In the present study, surface-enhanced Raman spectroscopy (SERS) is employed for the analysis of polymerase chain reaction (PCR) products of DNA extracted from hepatitis B virus (HBV) infected patients in comparison with healthy individuals. SERS spectral features are identified which are solely present in the HBV positive samples and consistently increase in intensity with increasing viral load; these can be considered SERS spectral markers for HBV infection. For the sake of understanding, the levels of viral load in this study are classified as low (1–1000 IU), medium (1000–10,000 IU), high (above 10,000 IU) and negative control (<1). In order to explore the efficiency of SERS for discriminating the spectral datasets of samples with varying viral loads and of healthy individuals, principal component analysis (PCA) is applied. PCA is used to compare the low, medium and high viral load classes with each other and with the healthy class. Moreover, partial least square discriminant analysis (PLS-DA) and partial least square regression (PLSR) analysis are employed for the classification of different levels of viral load in the HBV positive samples and for the prediction of viral loads in unknown samples, respectively. PLS-DA is applied to check the validity of the classification, and its sensitivity and specificity were found to be 89% and 98%, respectively. A PLSR model was constructed for prediction of viral loads on the basis of the SERS spectral markers of HBV infection, with a goodness value of 0.9031 and a root mean square error (RMSE) of 0.2923. The PLSR model also proved valid for prediction of a blind sample.
© 2021 Elsevier B.V. All rights reserved.

⇑ Corresponding authors.
E-mail addresses: haqchemist@yahoo.com (H. Nawaz), irfan.majeed@uaf.edu.pk (M.I. Majeed).

https://doi.org/10.1016/j.saa.2021.119722
1386-1425/© 2021 Elsevier B.V. All rights reserved.

1. Introduction

Hepatitis B virus infection is a contagious but remediable liver disease which is now becoming one of the major causes of death worldwide. HBV belongs to the Hepadnaviridae family and causes transient and chronic liver disorders. Transient infections persist for months, while chronic infections are long lasting and can often result in liver failure with cirrhosis and hepatocellular carcinoma [1,2]. Replication of HBV DNA begins in hepatocytes: the relaxed circular HBV DNA (RC-DNA) enters the hepatocyte nucleus, where it is converted into covalently closed circular DNA (cccDNA), from which various genomic RNAs are transcribed by RNA polymerase 2. The pregenomic RNA is packed into capsids and further reverse transcribed by P proteins to form new RC-DNA. This matured RC-DNA is further used for cccDNA amplification, and the capsids are enveloped and liberated from the cell as virions [3]. For better treatment of this disease, an efficient, easy and reliable diagnostic tool is required. It is important to monitor every patient's virologic profile during antiviral therapy in order to measure disease progression. Viral load is an important parameter for examining disease progression and the patient's immune response to the given therapy.

As previously reported, PCR assays are commonly used for qualitative purposes and real-time PCR is used for quantitation of nucleic acids. Using PCR, the quantity of a target nucleic acid can be determined in two ways: relative quantitation and absolute quantitation. The former reveals variation in the concentration of the target sequence as compared to its level in a related matrix, while absolute quantitation gives the exact quantity of the nucleic acid sequence in a specific unit. Sufficient information can be obtained from relative quantitation while monitoring the progress of infection [4]. PCR is a reliable technique for quantitation of cccDNA in infected liver cells, which provides an estimate of the viral load of each individual patient. The PCR process is repeated many times to make the desired number of copies.

The PCR products are usually analyzed using a gel documentation system (GDS), which is used for imaging and separation of nucleic acids [5]. The strength of agarose gels allows easy handling of low-percentage gels for the separation of a large number of DNA fragments. These gels are commonly stained using ethidium bromide or other stains such as GelGreen. For visualization, the gel doc includes a UV transilluminator [6]. The gel doc method has some limitations: continuous irradiation with UV light weakens the fluorescence of the bands, and samples of low molecular weight or low amount further limit its use. Moreover, gel doc requires expensive instrumentation and materials [7]. In order to overcome these limitations, new methods are required to analyze PCR products more accurately for qualitative and quantitative purposes.

Recently, surface-enhanced Raman spectroscopy (SERS) has been employed for various applications including the diagnosis of diseases such as tuberculosis [8], hepatitis C [9], dengue [10] and almost all types of cancer, such as breast [11], prostate [12] and cervical cancer [13]. As SERS provides clear differentiation, it is also used for discrimination among stages of diseases. SERS is an inexpensive, reliable technique that gives straightforward identification of a sample with no interference from water and requires no sample preparation [14].

In the present study, a SERS-based method is developed to analyze the PCR products of HBV DNA extracted from HBV infected/suspected patients with different levels of viral load. These HBV diseased samples are analyzed by comparing them with negative control samples and true healthy samples. No study has yet been published which combines the two important techniques of PCR and SERS to demonstrate a method with great potential for the quantitative characterization and diagnosis of HBV.

2. Materials and methods

2.1. Preparation of silver nanoparticles (AgNPs)

AgNPs are prepared by the chemical reduction method. Briefly, a one molar solution of silver nitrate was prepared by adding 33.72 ml of silver nitrate to 200 ml of deionized water and heating until it starts to boil; it is then combined with 8 ml of 1% trisodium citrate (Na3C6H5O7) with stirring at 600 rpm for about one hour. The solution is left to cool down to room temperature. The volume of 200 ml was maintained by adding deionized water. A freshly prepared 2 ml suspension of AgNPs is centrifuged at 600 rpm for 7 min to remove the supernatant [15]. The size and morphology of the nanoparticles were determined by electron microscopy, for which the methodology has been described in our previously published work [9]. The AgNPs were characterized with TEM and were found to be spherical in shape with an average diameter of 53 nm, as shown in the supplementary material (Fig. S1).

2.2. Sample collection

Blood samples of clinically diagnosed HBV infected individuals were collected from the Punjab Institute of Nuclear Medicine (PINUM), Faisalabad. Sample preparation and DNA purification were performed using a Cobalt X 480 Hamilton instrument. DNA extraction and DNA amplification (PCR products) were carried out using a Natch S biotech sansure instrument in the PCR Lab of PINUM, Faisalabad, Pakistan. As shown in Table 1, a total of 53 samples were collected and their viral loads were predetermined. Of these, 37 samples are HBV positive with different viral loads (VL) and are further classified as low VL (1–1000 IU), medium VL (1000–10,000 IU) and high VL (above 10,000 IU); 13 are negative control samples (pseudo healthy) with a viral load of less than one (<1); and three are true healthy samples with zero viral load.

2.3. Polymerase chain reaction (PCR) protocol

Reagents used for PCR amplification include primers, template DNA, buffer, Taq polymerase enzyme and nucleotides. For quantification of HBV DNA using real-time PCR, amplification was carried out using a set of primers and probes designed according to the surface antigen (s gene) of HBV. The primers were HBV_S_F: 5′-CCTCTTCATCCTGCTGCT-3′ and HBV_S_R: 5′-AACTGAAAGCCAAACAGTG-3′. The TaqMan probe used for PCR was 5′-TET-TCCCATCCCATCATCCCTGGGCTTT-TAMRA-3′. For PCR, 900 nM of both primers and 250 nM of the probe were used in a 20 µl volume, and the initial

Table 1
Sample details of HBV positive, low VL, medium VL, high VL, negative control (pseudo healthy) and true healthy samples.

Sr. No. Sample Name Viral Load (IU) Log VL Class


1 S17 17.5 1.243038 Low
2 S28 27.9 1.445604 Low
3 S15 40.76 1.610234 Low
4 S18 56.18 1.749582 Low
5 S8 63.14 1.800305 Low
6 S21 82.81 1.918083 Low
7 S27 89.03 1.949536 Low
8 S13 118.4 2.073352 Low
9 S26 151.4 2.180126 Low
10 S34 166.23 2.220709 Low
11 S19 241.7 2.383277 Low
12 S29 299.5 2.476397 Low
13 S16 572.9 2.758079 Low
14 S12 581.8 2.764774 Low
15 S3 899 2.953760 Low
16 S6 1157 3.063333 Medium
17 S11 1158 3.063709 Medium
18 S10 1511 3.179264 Medium
19 S2 1662 3.220631 Medium
20 S30 2243 3.350829 Medium
21 S7 2355 3.371991 Medium
22 S20 2371 3.374932 Medium
23 S9 2390 3.378398 Medium
24 S37 2419 3.383636 Medium
25 S14 3266 3.514016 Medium
26 S1 3615 3.558108 Medium
27 S36 4483 3.651569 Medium
28 S31 5981 3.776774 Medium
29 S33 17,860 4.251881 High
30 S5 29,310 4.467016 High
31 S4 35,730 4.553033 High
32 S24 42,820 4.631647 High
33 S23 584,000 5.766413 High
34 S22 4,747,000 6.676419 High
35 S32 133,500,000 8.125481 High
36 S25 335,400,000 8.525563 High
37 S35 389,400,000 8.590396 High
38 S38 Negative control <1 Healthy
39 S39 Negative control <1 Healthy
40 S40 Negative control <1 Healthy
41 S41 Negative control <1 Healthy
42 S42 Negative control <1 Healthy
43 S43 Negative control <1 Healthy
44 S44 Negative control <1 Healthy
45 S45 Negative control <1 Healthy
46 S46 Negative control <1 Healthy
47 S47 Negative control <1 Healthy
48 S48 Negative control <1 Healthy
49 S49 Negative control <1 Healthy
50 S50 Negative control <1 Healthy
51 S51 True healthy 0 Healthy
52 S52 True healthy 0 Healthy
53 S53 True healthy 0 Healthy
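
The Log VL and Class columns of Table 1 follow directly from the viral load. The sketch below shows one way to reproduce them, assuming base-10 logarithms and the class ranges quoted in the text. The function names are illustrative, and the assignment of a viral load of exactly 1000 or 10,000 IU is an assumption, since the stated ranges share those boundaries.

```python
import math


def viral_load_class(vl_iu: float) -> str:
    """Return the Table 1 class label for a viral load given in IU."""
    if vl_iu < 1:
        return "Healthy"   # negative control (<1) or true healthy (0)
    if vl_iu <= 1000:      # assumption: exactly 1000 IU counts as low
        return "Low"
    if vl_iu <= 10_000:    # assumption: exactly 10,000 IU counts as medium
        return "Medium"
    return "High"


def log_viral_load(vl_iu: float) -> float:
    """Base-10 logarithm of the viral load (the Log VL column)."""
    return math.log10(vl_iu)


# Three rows taken from Table 1: sample name and viral load in IU.
for name, vl in [("S17", 17.5), ("S6", 1157), ("S33", 17860)]:
    print(name, round(log_viral_load(vl), 6), viral_load_class(vl))
```

Working on the logarithmic scale keeps the high VL samples, which span several orders of magnitude, from dominating any later regression on viral load.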

denaturing step of the PCR cycle was carried out at 95 °C for ten minutes, followed by 45 PCR cycles of 15 s at 95 °C [16]. The PCR products were further analyzed by surface-enhanced Raman spectroscopy.

2.4. SERS data acquisition

SERS data acquisition was carried out using a Raman microscope spectrometer (ATR8300BS, Optosky, China) equipped with a 785 nm diode laser and a 40X/0.6 lens, leading to a spot size of 2 µm. For this purpose, 30 µl of each sample was mixed with 30 µl of silver nanoparticles and left for 30 min as incubation time to develop good interaction between sample and nanoparticles. From this incubation mixture, 40 µl of each sample was put on the aluminum slide groove and fifteen spectra were acquired from each sample using 100 mW laser power and 2 s integration time.

2.5. Data preprocessing and analysis

Preprocessing of the raw SERS spectral data is performed in MATLAB software version 7.8.0.347. Data are preprocessed using algorithms for smoothing, baseline correction, vector normalization and substrate removal, with all data held in a single matrix. The Savitzky-Golay algorithm is used for smoothing, while rubber band and polynomial methods are used for baseline correction of the spectral data. All samples were given exactly the same preprocessing treatment in one matrix. The detailed peak assignment of the SERS spectral data is given in Table 2. Data were further analyzed by multivariate data analysis techniques. Principal component analysis (PCA), a statistical tool, was applied to the SERS spectral data to determine the variability and differentiation among the data sets and the reason for this differentiation. PCA operates by reducing the dimensionality of the data while retaining its variability [17]. For selection of discriminative features for prediction of classification and for checking the authenticity of classification, partial least square discriminant analysis (PLS-DA) is a swift and easy tool. PLS-DA is a versatile algorithm, applied to SERS spectral data for predictive and descriptive modelling and for classification of data into various sets in order to classify different levels/stages of the disease [18].

Partial least square regression (PLSR) is a statistical tool that reduces the predictors to a small set of uncorrelated components and performs least squares regression on these components instead of on the original dataset [19]. Here a PLSR model is constructed for predicting HBV viral loads, which are the dependent variables for the SERS spectral data. The SERS dataset is regressed to explore the difference between measured and predicted values, which indicates the efficiency of the PLS model. In PLSR, the root mean square error of cross validation (RMSECV) is used to assess the significance or goodness of the technique. Leave-K-out cross validation (LKOCV) is used to construct a model with the minimum number of latent variables in order to avoid overfitting of the dataset. The data are randomly divided into two parts: 60% for calibration and 40% as the test dataset [20].

Table 2
DNA-based peak assignment for SERS spectral data obtained from literature. Citations are given in the last column of the table.

Assignments Raman Bands (cm⁻¹) References
Adenine 733(vs) 816(w) 1337(m) 1419(vs) – [29,22,30,23,24]
Guanine 655(vs) 852(m) 1032(vs) 1566(m) 1605(w) [24,23,31]
Thymine 816(w) 1214(m) 1373(w) 1655(w) 1702(w) [29,22,25–27]
Cytosine 617(w) 776(m) 1221(m) 1511(w) – [32,33,30,24,25]
Phosphate 1098(w) – – – – [34]
Deoxyribose 945(w) 957(s) 1010(m) 1455(s) – [24,35,36]
Aromatic ring 1111(w) – – – – [37]

Abbreviations: w, weak; m, medium; s, strong; vs, very strong.

3. Results and discussion

Characterization of the silver nanoparticles was carried out using transmission electron microscopy (TEM), which revealed that the Ag-NPs possess oval shaped geometry with an average size of 65 × 45 nm, as displayed in Fig. S1. Studies show that particles in this size range have excellent SERS activity [21].

For comparison of samples on the basis of their viral loads, all 53 samples are divided into five classes: high VL, medium VL, low VL, negative control and true healthy. Each class contains different samples within the same range of viral load, while the negative control class has a viral load below the detection limit of the PCR technique and the true healthy class has zero viral load.

3.1. Mean SERS spectra

Fig. 1 shows the mean SERS spectra of PCR products of true healthy, negative control and HBV positive samples, the latter including low VL, medium VL and high VL. The observed spectral peaks are categorized into four classes on the basis of their intensity: very strong (vs), strong (s), medium (m) and weak (w). The significant SERS spectral features are labeled using solid, dotted and dashed lines. The SERS bands which increase in intensity with increasing viral load are indicated by solid lines and include 733(vs), 957(m), 1032(vs), 1111(w), 1221(s), 1419(vs), 1566(m), 1655(w) and 1788(m) cm⁻¹. SERS bands that decrease with increasing viral load are indicated by dotted lines and are observed at 506(w), 617(w), 776(m), 816(w), 945(m), 1010(m), 1098(w), 1214(m), 1373(w), 1511(w) and 1702(w) cm⁻¹. Moreover, the peaks which are observed with a shift in the HBV positive samples as compared to true healthy and negative control ones are shown by dashed lines and labelled in a red box; they include 655(vs), 852(s), 1337(m), 1455(s) and 1605(w) cm⁻¹. In Fig. 1, the SERS bands of guanine at 655(vs) and 852(s) cm⁻¹ and the bands of adenine at 1337(m) and 1605(w) cm⁻¹ appear prominently in the diseased classes only, while they have very low intensity in the healthy class, which depicts structural alterations of the DNA of HBV. Similarly, the deoxyribose band at 1455 cm⁻¹ appears in the diseased class and is absent in the healthy class, which also depicts changes that occur in the structure of DNA after HBV infection.

The prominent spectral features of adenine are observed at 733(vs), 816(w), 1337(m), 1373(w), 1419(vs) and 1605(w) cm⁻¹. The most dominant peak of adenine is observed at 733 cm⁻¹; its intensity increases with increasing viral load and becomes very dominant in the high viral load class [22]. The weak band at 816 cm⁻¹ is due to the polynucleotide duplex poly[d(A)]·poly[d(T)]; this band is characteristic of the backbone conformation of DNA. In the mean spectra of the diseased classes this band keeps decreasing as compared to the healthy ones and eventually disappears in the high VL class, as shown in Fig. 1. The SERS band at 1337 cm⁻¹ is due to the C–N stretching vibration of the pyrimidine ring. At 1419 cm⁻¹ a Raman peak of guanine is observed. Because of the deformation vibration of NH2 in adenine, a weak band appears at 1605 cm⁻¹ [23,24].

The SERS bands of guanine appear at 655(vs), 852(m), 1419(s), 1605(w), 1032(vs) and 1566(m) cm⁻¹. The intensities of these guanine peaks are enhanced to different extents with increasing viral load and appear much diminished in the negative control and true healthy classes. The SERS spectral features at 816(w), 1214(m), 1373(w), 1655(w) and 1702(w) cm⁻¹ are assigned to thymine. The SERS band at 1214 cm⁻¹, denoted by a dotted line, is prominent in healthy samples but decreases in intensity as the viral load increases. It clearly shows the presence of thymidine [25]. The band at 1373 cm⁻¹ corresponds to the ring breathing vibration modes of DNA bases (thymine) and shows a significant reduction in intensity as the viral load increases [26]. Another SERS band of thymine, appearing at 1655 cm⁻¹, is due to stretching of the C2=O–C2N3 bond. It appears slightly in the diseased classes and eventually disappears in the negative control and healthy classes; it is denoted by dashed lines [25]. A weak band of thymidine appears at 1702 cm⁻¹, which decreases as the viral load increases but is absent in the true healthy class [27].

The bands of cytosine appear at 617(w), 776(m), 1221(m), 1511(w) and 1605(w) cm⁻¹. The bending vibration of cytosine gives a weak band at 617 cm⁻¹ [25], and the peak at 776 cm⁻¹ is due to ring breathing in cytosine, which is found to decrease in intensity with increasing viral load. A weak band appearing at 1098 cm⁻¹ is associated with the phosphate group that links the DNA bases into a long nucleotide polymer [28]. A medium intensity peak at 945 cm⁻¹, found with enhanced intensity, shows the presence of the deoxyribose complex. Further deoxyribose bands appear at 947(w), 957(s), 1010(m) and 1455(s) cm⁻¹. The peaks labelled in red boxes are 655(vs), 852(s), 1337(m), 1455(s) and 1605(w) cm⁻¹. These peaks are characteristic peaks as they


Fig. 1. Mean plot of the SERS dataset. Black is true healthy, mustard yellow is negative control, magenta is low VL, blue is medium VL and red is high VL. Solid lines mark peaks that increase with increasing viral load, dotted lines mark peaks that decrease with increasing viral load, and dashed lines indicate peaks that are present in diseased samples but absent in true healthy and negative control samples (characteristic peaks).
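
Class-mean spectra such as those in Fig. 1 are obtained by preprocessing every spectrum in the same way and averaging within each class. The following is a minimal sketch of that idea, not the authors' MATLAB pipeline. The 5-point quadratic Savitzky-Golay window is an assumption (the window length and polynomial order are not stated), baseline correction and substrate removal are omitted, and all function names are illustrative.

```python
from statistics import fmean

# Standard 5-point quadratic Savitzky-Golay smoothing coefficients.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol5(y):
    """Smooth a spectrum with a 5-point quadratic Savitzky-Golay filter.

    The two points at each edge are left unsmoothed for simplicity.
    """
    out = list(y)
    for i in range(2, len(y) - 2):
        out[i] = sum(c * y[i + k - 2] for k, c in enumerate(SG5))
    return out


def vector_normalize(y):
    """Scale a spectrum to unit Euclidean length."""
    norm = sum(v * v for v in y) ** 0.5
    return [v / norm for v in y] if norm else list(y)


def class_means(spectra, labels):
    """Mean of the smoothed, vector-normalized spectra for each class label."""
    grouped = {}
    for spec, lab in zip(spectra, labels):
        grouped.setdefault(lab, []).append(vector_normalize(savgol5(spec)))
    return {lab: [fmean(col) for col in zip(*rows)] for lab, rows in grouped.items()}


# Toy intensities (NOT measured spectra): two spectra of one class.
spectra = [[0, 1, 4, 9, 16, 25, 36], [0, 2, 8, 18, 32, 50, 72]]
labels = ["low VL", "low VL"]
means = class_means(spectra, labels)
print(len(means["low VL"]))
```

Vector normalization removes overall intensity differences between acquisitions, so the class means compare band shapes and relative intensities rather than absolute counts.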

Fig. 2. PCA scatter plot of SERS data of diseased (red), negative control (green) and true healthy (magenta) samples indicating different classes.

appear in the diseased class and are absent in the healthy class, which shows the changes that occur in the structure of DNA after HBV infection.

3.2. Principal component analysis (PCA)

In order to interpret the SERS spectral datasets of the different viral load groups along with the true healthy and negative controls, principal component analysis (PCA) is applied to reduce the dimensionality of the SERS spectral data with minimal loss of information. PCA was carried out using all SERS spectral data, to identify and confirm the SERS features associated with HBV and to compare the different classes of samples having different levels of viral load. Fig. 2 shows the PCA scatter plot of the different classes of samples having different levels of viral load along with healthy samples. The SERS spectral data of the different groups are clustered and separated by PC-1 on the X-axis and indicated with different colors.

In Fig. 2, 39% of the variability of the data is explained by PC-1 and 17% by PC-2. It is evident from the PCA scatter plot that the classes are separated on the basis of principal component 1 (PC-1), indicating that the SERS spectral features differ with the level of viral load, which causes the differentiation of these data sets. The SERS spectral data of HBV diseased samples (red) lie on the negative side of PC-1, while the true healthy (magenta) and negative control (green) samples are clustered on the positive side of PC-1. Fig. 3 shows PCA scatter plots of different samples from the same class, showing differentiation of the spectral data on the basis of their viral loads. The SERS spectral data of the different samples are shown in the form of clusters: (a) PCA scatter plot of spectral data of 15 different samples of "class low VL", each having fifteen spectra; (b) PCA scatter plot of spectral data of 13 different samples of "class medium VL"; (c) PCA scatter plot of spectral data of 9 different samples of "class high VL".

To further elaborate the differentiation of the different levels/classes of viral load, pairwise PCA was performed to compare the viral load classes with one another. These classes/groups are low (1–1000 IU), medium (1000–10,000 IU), high (above 10,000 IU) and negative control (<1), as mentioned in Table 1.

Fig. 4 shows the pairwise comparison, as PCA scatter plots, of the SERS spectral data of samples from the low VL, medium VL and high VL classes. Fig. 4(a-i) shows the PCA scatter plot of SERS spectra of class low VL (15 samples) versus high VL (9 samples), indicated by green and red dots, respectively. The low VL and high VL classes are differentiated on the basis of PC-1; the explained variance of PC-1 is 36% and of PC-2 is 18%. Class low VL is clustered on the negative side of PC-1 and class high VL on the positive side of PC-1, and no spectra (dots) show mixing, which indicates that low VL and high VL samples are well differentiated due to their different viral loads. Notably, as mentioned in Table 1, "class low" has samples with viral load less than 1000 IU/ml and "class high" has samples with viral load greater than 5000 IU/ml. Fig. 4(a-ii) shows the loadings of class low VL vs high VL. Raman bands labeled by solid lines increase in intensity with increasing viral load, while dotted lines mark the features that decrease in intensity with increasing viral load. Dashed lines indicate those features which are present in the diseased classes of HBV and show shifts in their wavenumbers as compared to negative control and true healthy samples. Positive loadings are observed at 733(vs), 957(m), 1032(vs), 1111(w), 1221(s), 1419(vs), 1566(m), 1655(w) and 1788(m) cm⁻¹. These are the features that increase with increasing viral load; these characteristic features are SERS biomarkers of the HBV samples clustered on the positive side of PC-1 in the scatter plot. It is also observed in the mean plot that these samples possess a high content of virus. The negative loadings can be associated with the features that decrease with increasing viral load and with the samples clustered on the negative side of PC-1 in the scatter plot. These are observed at 506(w), 617(w), 776(m), 816(w), 945(m), 1010(m), 1098(w), 1214(m), 1373(w), 1511(w) and 1705(w) cm⁻¹.

Fig. 4(b-i) shows the PCA scatter plot of SERS spectra of class low VL (15 samples) versus medium VL (13 samples), which shows separation of the two spectral data sets on the basis of PC-2, with a variance of 18%. Blue dots on the positive side of the zero line indicate the medium VL class, while green dots on the negative side indicate low VL. Notably, the "class low VL" includes samples

Fig. 3. (a) PCA scatter plot of class Low VL containing 15 samples (b) PCA scatter plot of class Medium VL containing 13 samples (c) PCA scatter plot of class High VL
containing 9 samples.


Fig. 4. Pairwise PCA comparison plots of all diseased viral load classes and their loadings. a(i) PCA comparison plot of class low VL vs high VL; light green dots denote class low VL and red dots denote high VL. a(ii) Loadings of low VL vs high VL. b(i) PCA comparison plot of class low VL (green color) vs medium VL (blue color). b(ii) Loadings of low VL vs medium VL. c(i) PCA comparison plot of medium VL (blue color) vs high VL (red color). c(ii) Loadings of medium VL vs high VL.

with viral load of less than 1000 IU/ml, and class medium has viral loads above 1000 and below 10,000 IU/ml. A few samples are intermixed, which is due to the very small difference in viral load between the two classes. Fig. 4(b-ii) shows the PCA loadings of class low VL versus class medium VL. The SERS spectral features that increase are considered positive loadings and lie on the positive side of PC-1. These features are associated with high virus concentration, as observed in the mean plot, since their intensity increases with viral load. They are noted at 733(vs), 957(m), 1032(vs), 1111(w), 1221(s), 1419(vs), 1566(m), 1655(w) and 1788(m) cm⁻¹. The negative loadings lie on the negative side of principal component 1 and correspond to features that decrease gradually as the viral load increases; they are weaker in the class with the higher virus concentration. These features are observed at 506(w), 617(w), 776(m), 816(w), 945(m), 1010(m), 1098(w), 1214(m), 1373(w), 1511(w) and 1705(w) cm⁻¹, as observed in the mean plot.

Fig. 4(c-i) shows the PCA scatter plot of SERS spectral data of medium VL (13 samples) versus high VL (9 samples). These two classes are separated on the basis of PC-2, with a percentage variance of 17%; class high VL (red dots) lies on the positive side of the line intersecting the y axis and class medium VL (blue dots) on the negative side. The range of viral load for the medium class is from 1000 to 10,000 IU/ml, and for "class high" it is above 10,000 IU/ml. As the viral loads of these two classes differ less, the spectral data sets of a few samples show a little mixing. Overall, the SERS spectral datasets are found clustered, with clear differentiation between the classes of different viral loads. Fig. 4(c-ii) shows the loadings of class medium VL versus class high VL. The SERS features are labelled by solid, dotted and dashed lines. The most prominent SERS features/loadings include 733 cm⁻¹ and 1418 cm⁻¹, which are the highest peaks of the SERS dataset and indicate, among others, the reason for the differentiation between the two data sets. Positive loadings are observed at 733(vs), 957(m), 1032(vs), 1111(w), 1221(s), 1419(vs), 1566(m), 1655(w) and 1788(m) cm⁻¹, and negative loadings are observed at 506(w), 617(w), 776(m), 816(w), 945(m), 1010(m), 1098(w), 1214(m), 1373(w), 1511(w) and 1705(w) cm⁻¹.

Fig. 5 shows pairwise PCA scatter plots and loadings for comparison of all diseased classes versus the negative control class. The negative control class is indicated by black color and the other classes by different colors. Fig. 5(a) shows the comparison of the low VL class versus the negative control class; magenta dots are the spectral data of low VL (15 samples), which lie on the positive side of the PC-1 line, and black dots indicate the negative control class (13 samples)


Fig. 5. Pairwise PCA scatter plots of HBV positive samples, comparing the low VL, medium VL and high VL classes with negative control samples: (a) low VL vs negative control, (b) medium VL vs negative control, (c) high VL vs negative control, (d) loadings of low, medium and high VL class vs negative control.

that lie on the negative side of PC-1. Fig. 5(b) shows the comparison of medium VL versus negative control, where blue dots are the spectral data of the medium VL class (13 samples), which lie on the positive side of the PC-1 line, and black dots indicate the negative control class (13 samples), which lie on the negative side of PC-1. Fig. 5(c) shows the comparison of the high VL class versus the negative control class; the spectral data of the high VL class (8 samples) are clustered on the positive side of the PC-1 line, and black dots indicate the negative control class (13 samples), which lie on the negative side of PC-1. Fig. 5(d) shows the loadings of all diseased classes in comparison with the negative control class. The positive and negative loadings indicate the SERS spectral features associated with the increasing severity of the disease as the viral load of HBV increases, and can be considered SERS spectral markers of HBV infection. Positive loadings are the features that increase with increasing viral load; they correspond to the spectra clustered on the positive side of PC-1 in the scatter plot. Negative loadings are the features that decrease with increasing viral load; they correspond to the spectra clustered on the negative side of the PC-1 line in the scatter plot. The positive loadings are observed at 658, 733, 850, 1031, 1111, 1337, 1418, 1456, 1566, 1604 and 1788 cm⁻¹, and the negative loadings are observed at 506, 617, 776, 816, 945, 1010, 1098, 1211, 1373, 1511 and 1705 cm⁻¹.

3.3. Partial least square discriminant analysis (PLS-DA)

Partial least square discriminant analysis (PLS-DA) is a data modelling technique that is applied for predictive and descriptive modelling in addition to the selection of variables. PLS-DA is a versatile algorithm which has recently been used in many areas of forensic science for evidence analysis. It is widely used for disease classification in medical sciences. Here PLS-DA is used for checking the validity of the classification system of true healthy, pseudo healthy and diseased classes. PLS-DA is applied to all five classes of the SERS spectral dataset (low VL, medium VL, high VL, negative control and true healthy). For this purpose, 60% of the SERS data was selected for validation and for selection of the optimal number of latent variables, and the remaining 40% of the data was used for calculation of the number of latent variables (NLV). The selected number of latent variables for the SERS data was 17. As shown in the scores plot of LV-1 and LV-2 in Fig. 6(a), the score plot gave clear differentiation between the HBV diseased, true healthy and pseudo healthy classes. PLS-DA was developed in order to check the authenticity of the constructed classes. Negative control is indicated by black color and true healthy by blue color; both of their SERS spectral data sets are clustered on the positive side of the line intersecting the y axis and all HBV positive classes are clustered on


Fig. 6. (a) PLS-DA Scatter diagram of HBV diseased class (red) vs negative control (Black) and true healthy (Blue) class. (b) Receiver operating characteristic curve (ROC)
showing area under curve (AUC).

the negative side. For the PLS-DA model, 19 latent variables are selected, which are used to check the diagnostic ability of this technique. Fig. 6(a) shows the PLS-DA score plot for the SERS dataset of all the different classes of samples, including the low, medium and high viral load classes, and shows that the healthy and diseased classes are well differentiated from each other on the basis of the presence of different levels of viral load. The SERS spectral data of the three diseased classes show some mixing because of the small difference in viral load between them, but they show good differentiation from the healthy samples. The calculated parameters of PLS-DA are an accuracy of 95%, a precision of 74%, a sensitivity of 89% and a specificity of 98%, which reveal that the constructed model is valid and can be useful for the classification of different HBV positive samples and the discrimination of healthy samples from HBV positive samples. It is observed that the PLS-DA plot explains the difference between the classes more clearly than the PCA scatter plot.

The extent of classification is further explained by generating a ROC curve, which is used for cross validation of the classification model. Fig. 6(b) shows that the obtained value of the area under the curve (AUC) for the SERS dataset is 0.78. The maximum value for AUC is 1, which corresponds to 100% accuracy of the model, and a zero value indicates that the model is not valid. As the AUC value of the SERS dataset is close to 1, it can be stated that the constructed model is valid and shows excellent performance for classification.

Table 3
Evaluation parameters of PLS-DA plot.

Parameters          Values
Area under curve    0.78
Accuracy            95%
Precision           74%
Sensitivity         89%
Specificity         98%

3.4. Partial least square regression

PLSR is applied to further extend the capability of the SERS method for prediction of VLs. For construction of the PLSR model, and to avoid bias in the dataset, the SERS spectral data (555 spectra) are taken in a single matrix and randomly divided into two parts, 60% (333 spectra) and 40% (222 spectra), used as the calibration dataset and the validation/test dataset, respectively. Thirteen latent variables are used for constructing the PLSR model of 37 samples shown in Figure S2. Fig. 7(a) shows the calibration PLSR model and Fig. 7(b) the predicted PLSR model. Different evaluation parameters, which describe the performance of PLSR, are given in Table 4. The calculated value of goodness of fit (R²) is 0.9031 for calibration and 0.8376 for prediction, which reveals the authenticity of the regression plot. The adjusted R² value is 0.87 for calibration and 0.69 for prediction. The errors recorded for the regression model were small in value, which further confirms the validity of the PLSR model. The errors calculated for calibration are MSE = 0.13, MAE = 0.34, MRE = 0.29 and RMSE = 0.29, and for prediction/validation are MSE = 0.08, MAE = 0.31, MRE = 0.23 and RMSE = 0.37 (see Table 4).

To further elaborate the validity of our constructed PLSR model, one sample was considered as a blind sample. The PLS regression tool was applied to the unknown blind sample S34. The predicted value of blind sample S34 was found to be 2.558, as shown in Fig. 7(b), which is very close to 2.220, indicating that the constructed model is valid.
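The error measures quoted for the PLSR model (MSE, RMSE, MAE, MRE, R² and adjusted R²) are not defined in the text. The following is a minimal sketch of their usual textbook definitions, not the authors' implementation. The function name `regression_metrics`, the argument `n_predictors` and the toy values are hypothetical; in particular, which count of predictors enters the adjusted R² (for example the number of latent variables) is an assumption, since the paper does not say.

```python
import math


def regression_metrics(y_true, y_pred, n_predictors):
    """Return common regression error measures for paired reference/predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    mse = ss_res / n
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R^2 penalises R^2 for the number of predictors (assumed meaning)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(r) for r in residuals) / n,
        "MRE": sum(abs(r) / abs(t) for r, t in zip(residuals, y_true)) / n,
        "R2": r2,
        "adjusted_R2": adj_r2,
    }


# Hypothetical reference and predicted values (NOT data from the study)
y_ref = [2.0, 3.0, 4.0, 5.0, 6.0]
y_hat = [2.5, 2.5, 4.0, 5.5, 5.5]
metrics = regression_metrics(y_ref, y_hat, n_predictors=1)

# Absolute difference for the blind sample, using the two values quoted in the text
blind_abs_error = abs(2.558 - 2.220)
```

With the two values quoted for the blind sample, the absolute difference is about 0.338.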

Fig. 7. (a) PLSR model of calibration for SERS dataset of all diseased, 37 samples. (b) PLSR model of prediction (validation model) for SERS dataset of all diseased, 37 samples.
Red squared dot indicates blind/unknown sample (S37).
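The calibration and validation models in Fig. 7 rest on the random 60/40 division of the 555 spectra described in Section 3.4. A minimal sketch of such a split is given here, assuming a simple shuffle of spectrum indices; the helper name `split_indices` and the seed are hypothetical, and the text does not state which software or procedure the authors actually used.

```python
import random


def split_indices(n_items, calibration_fraction, seed):
    """Shuffle item indices and split them into calibration and validation sets."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_cal = round(n_items * calibration_fraction)
    return indices[:n_cal], indices[n_cal:]


# 555 spectra split 60/40 as stated in the text; the seed value is an arbitrary assumption
calibration_idx, validation_idx = split_indices(555, 0.60, seed=0)
```

The two index lists can then be used to select rows of the spectral matrix for building and testing the regression model.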


Table 4
Details of evaluation parameters of the PLSR model for both the calibration and validation datasets.

Evaluation parameters    Calibration dataset    Validation dataset
RMSE                     0.29                   0.37
R²                       0.90                   0.84
Adjusted R²              0.87                   0.69
Mean absolute error      0.34                   0.31
Mean relative error      0.29                   0.23
Mean square error        0.13                   0.08

4. Conclusions

Surface enhanced Raman spectroscopy is found to be an excellent technique for analyzing PCR products of viral DNA of HBV, as it gave clear identification of biochemical features associated with HBV in blood samples. These SERS features related to DNA can be used for differentiation of increasing viral load of hepatitis B. PCA of the SERS datasets of HBV positive samples in comparison with healthy samples was performed to further elaborate the differentiation of samples of the diseased and healthy classes. Moreover, pairwise PCA was also found helpful for the differentiation of SERS data sets of various classes such as the low VL, medium VL, high VL, negative control (pseudo healthy) and true healthy classes. PLS-DA was performed to further check the validity of the classification, and its sensitivity and specificity were found to be 89% and 98%, respectively. A PLSR model was constructed for prediction of viral loads in HBV positive samples by keeping the SERS spectral data of an unknown viral load as a blind sample. The RMSEC value of the PLSR model was 0.2923 and the value of goodness of fit (R²) was found to be 0.9031, which support the validity of the model.

CRediT authorship contribution statement

Fatima Batool: Writing - original draft. Haq Nawaz: Supervision, Project administration, Validation. Muhammad Irfan Majeed: Conceptualization, Writing - review & editing, Validation. Nosheen Rashid: Conceptualization, Writing - review & editing, Validation. Saba Bashir: Conceptualization, Resources. Saba Akbar: Writing - review & editing. Muhammad Abubakar: Data curation. Shamsheer Ahmad: Software. Muhammad Naeem Ashraf: Software. Saqib Ali: Formal analysis. Muhammad Kashif: Formal analysis. Imran Amin: Formal analysis.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors are obliged to the Higher Education Commission (HEC) of Pakistan for providing financial funding for this research work [Grant # 6400/NRPU and 8494/NRPU].

Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.saa.2021.119722.