
International Journal of Electrical & Computer Sciences IJECS-IJENS Vol: 11 No: 03


Fuzzy Inference System Mamdani to Predicting Conformational Epitope Location


Faizah
Dept. of Computer Science and Electronics, FMIPA UGM
faizah@ugm.ac.id

M.R. Widyanto
Faculty of Computer Science, University of Indonesia
widyanto@cs.ui.ac.id

Asmarinah
Dept. of Biology, Faculty of Medicine, University of Indonesia
asmarinah@fk.ui.ac.id

Abstract: Predicting conformational epitopes is one of the major challenges in the field of vaccine design. Several methods have been developed for predicting conformational epitopes, but most of them are based on the protein sequence and are not very effective. This is the first attempt in this area to predict conformational epitopes using a Mamdani fuzzy inference system. The proposed method is based on amino acid properties and spatial information. The prediction results of the proposed system have high accuracy, and its performance is comparable to that of existing tools.

Keywords: conformational epitope, fuzzy system, prediction, amino acid

I. INTRODUCTION

In the past century, medical research has improved health and increased life expectancy, largely because of success in preventing and treating infectious diseases. Vaccines in particular offer protection against infectious diseases. With the growing need for monoclonal antibodies and vaccines, conformational epitope prediction, especially for viruses, has become more and more desirable. Much effort has been devoted to this purpose, but primarily for linear epitopes. Bioinformatics provides tools that help vaccine designers streamline laboratory work [1]. In genetic engineering, in particular recombinant DNA (deoxyribonucleic acid) technology, researchers explore the amino acid sequence and the DNA sequence of the epitope to create an effective vaccine design. The genetic information of the epitope is assembled into a specially constructed plasmid, which is then transformed into competent cells (e.g. bacteria or yeast) that are subsequently cultured (propagated). The expectation is that the competent cells will produce recombinant protein from the epitope or antigen constructed in the plasmid, so that the protein can be purified and used in vaccination. However, crystallographic studies have shown that most epitopes in protein antigens are conformational [2], while only a few methods have been designed for them. For instance, the conformational epitope predictor (CEP) server [3] is one of the first methods created to identify conformational epitope stretches; it adopts the Voronoi polyhedron of the target protein to find its accessible surface regions and categorizes them as antigenic determinants (AD). Another method, Discotope [4], predicts epitopes using log-odds probability matrices of amino acid residues and structural surface information. One of the most recent predictors of conformational epitopes is PEPOP [5], which utilizes 3-D structural information to predict conformational epitopes and identify immunogenic peptides [6].
In this paper, we propose a novel algorithm that employs a Mamdani fuzzy inference system based on amino acid statistics, spatial information and contact map analysis to predict conformational epitopes in the H5N1 virus. It differs from a previous approach that employed an expert system to identify the location of conformational epitopes [7]. The following sections describe the dataset, the architecture of the method, the results and, in the last section, the conclusions of this study.

II. DATASET

Epitope datasets were constructed from the sources detailed below. In each case, the prediction methods were tested on their ability to detect these epitopes among the full set of overlapping nonamers derived from the proteins that contained the epitopes. The full set of nonamers contains a small number of known epitopes, and the remainder are treated as non-epitopes. Of course, this set of non-epitopes could include epitopes that have not been experimentally verified. However, the majority (see Introduction) would be non-binders with the corresponding MHC molecule. In addition, labelling unverified epitopes as non-epitopes affects rescaled and non-rescaled calculations equally. Previous research has also shown that this property of the non-epitope set does not produce significantly different results [8]. Each set of experimentally defined epitopes was denoted the positive dataset, and the set of non-binding (or unknown) peptides was denoted the negative dataset. A conformational epitope is shown in Figure 1.
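The construction of the positive and negative datasets from overlapping nonamers can be sketched in Python as follows. This is only an illustration: the function names, the toy sequence and the "known" epitope are hypothetical, and matching peptides by exact string equality is an assumption of this sketch, not a procedure stated in the paper.

```python
def overlapping_peptides(sequence, length=9):
    """Return (start, peptide) pairs for every window of `length` residues.

    `start` is 1-based, matching the usual residue numbering.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return [
        (i + 1, sequence[i:i + length])
        for i in range(len(sequence) - length + 1)
    ]


def label_peptides(sequence, known_epitopes, length=9):
    """Split all windows into a positive and a negative dataset."""
    positives, negatives = [], []
    for start, peptide in overlapping_peptides(sequence, length):
        target = positives if peptide in known_epitopes else negatives
        target.append((start, peptide))
    return positives, negatives


# Toy example: a 12-residue sequence yields 12 - 9 + 1 = 4 nonamers.
toy_sequence = "MCNTPTYCDLGK"
toy_epitopes = {"CNTPTYCDL"}  # hypothetical "known" epitope
pos, neg = label_peptides(toy_sequence, toy_epitopes)
print(len(pos), len(neg))  # prints: 1 3
```

Windows that are not in the known set simply fall into the negative dataset, which mirrors the caveat above that unverified epitopes may be hidden among the non-epitopes.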

1110303-9696 IJECS-IJENS June 2011 IJENS


Fig. 1 Conformational Epitope

H5N1 virus structures were extracted from the PDB database, dated December 2010. Only those with a resolution better than 3.0 Å and protein antigens with more than 25 residues were retained. Redundant epitopes were removed at 60% similarity. Eighty-two structures, which included 84 unique epitopes, were finally retained as the training data. The testing data were collected from the training dataset of Discotope [4] and the databases of IEDB and SEPPA [2].

III. METHODS

The first step in identifying an epitope is to obtain 5 amino acid properties: the log-odds ratio, the Parker hydrophilicity scale, the surface accessibility ratio, the residue volume and the surface area. Table 1 shows the log-odds ratio and the Parker hydrophilicity scale. The log-odds ratio is used as the epitope propensity scale; it indicates which amino acids have a high chance of being part of an epitope.

Table 1 Parker hydrophilicity scale and log-odds ratio
Amino acid   Parker   Log-odds ratio
D            2.46     0.691
E            1.86     0.346
N            1.64     1.242
S            1.5      0.145
Q            1.37     1.082
G            1.28     0.189
K            1.26     1.136
T            1.15     0.233
R            0.87     1.18
P            0.3      1.164
H            0.3      1.098
C            0.11     3.519
A            0.03     1.522
Y            0.78     0.03
V            1.27     1.474
M            1.41     0.273
I            2.45     0.713
F            2.78     1.147
L            2.87     1.836
W            3        0.064

The surface accessibility (ASA) of a residue is calculated using the program Surface Racer (Tsodikov, Record, & Sergeev, 2002). The accessible surface is the surface area of the residue that can be reached from outside. This value is divided by the maximum surface area of the residue to obtain the surface accessibility ratio. The maximum surface area of each amino acid is listed in Table 2.

Table 2 Residue Surface Accessibility of Amino Acids
Amino Acid       Residue Surface Accessibility (ASA) (Å²)
Alanine          71.09
Arginine         156.19
Aspartic acid    115.09
Asparagine       114.11
Cysteine         103.15
Glutamic acid    129.12
Glutamine        128.14
Glycine          57.05
Histidine        137.14
Isoleucine       113.16
Leucine          113.16
Lysine           128.17
Methionine       131.19
Phenylalanine    147.18
Proline          97.12
Serine           87.08
Threonine        101.11
Tryptophan       186.12
Tyrosine         163.18
Valine           99.14

ASA values are generally computed using the "rolling ball" algorithm developed by Shrake and Rupley (Shrake & Rupley, 1973). This algorithm uses a probe sphere, usually with a radius of 1.4 Å (approximately the radius of a water molecule), to trace the surface of the molecule whose accessible area is to be calculated. Meanwhile, the residue volume, surface area and side-chain energy of each amino acid are obtained from the amino acid index (Wikipedia), as shown in Table 3.
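The surface accessibility ratio described above is a simple division, sketched here in Python. The dictionary copies the maximum values from Table 2 using one-letter codes; the function name and the example ASA value are hypothetical, and in practice the measured ASA would come from a tool such as Surface Racer.

```python
# Maximum surface area per residue, copied from Table 2 (one-letter codes).
MAX_SURFACE_AREA = {
    "A": 71.09, "R": 156.19, "D": 115.09, "N": 114.11, "C": 103.15,
    "E": 129.12, "Q": 128.14, "G": 57.05, "H": 137.14, "I": 113.16,
    "L": 113.16, "K": 128.17, "M": 131.19, "F": 147.18, "P": 97.12,
    "S": 87.08, "T": 101.11, "W": 186.12, "Y": 163.18, "V": 99.14,
}


def accessibility_ratio(residue, asa):
    """Accessible area of a residue divided by its maximum area (Table 2)."""
    if asa < 0:
        raise ValueError("ASA cannot be negative")
    return asa / MAX_SURFACE_AREA[residue]


# Hypothetical example: a glycine with half of its maximum area exposed.
print(round(accessibility_ratio("G", 28.525), 3))  # prints: 0.5
```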


Table 3 Residue volume and surface area of amino acids
Amino Acid   Residue Volume   Surface Area
A            88.6             115
R            173.4            225
D            111.1            150
N            114.1            160
C            108.5            135
E            138.4            190
Q            143.8            180
G            60.1             75
H            153.2            195
I            166.7            175
L            166.7            170
K            168.6            200
M            162.9            185
F            189.9            210
P            112.7            145
S            89.0             115
T            116.1            140
W            227.8            255
Y            193.6            230
V            140.0            155

The proposed method is shown in Figure 2.

Fig. 2 Methods of research

FIS MAMDANI

The fuzzy inference system used is the Mamdani method. The system has five input parameters and one output parameter. Each input parameter has three triangular membership functions, and the output parameter also has three triangular membership functions. A total of 22 rules are established to produce the epitope output value. The rules are formulated based on observations of the relationship between the input parameters and the epitope values, and are shown in Figure 3. The stages for obtaining the output value in the Mamdani FIS are fuzzification, rule evaluation, implication and defuzzification. The domain of each input and output variable is shown in Table 4.

If PH low and LR low and ASA low and VR low and SA low then value of epitope low
If PH low and LR medium and ASA low and VR low and SA low then value of epitope low
If PH low and LR high and ASA low and VR low and SA low then value of epitope low
If PH low and LR low and ASA medium and VR low and SA low then value of epitope low
If PH low and LR low and ASA high and VR low and SA low then value of epitope low
If PH low and LR low and ASA low and VR medium and SA low then value of epitope low
If PH low and LR low and ASA low and VR high and SA low then value of epitope low
If PH low and LR low and ASA low and VR low and SA medium then value of epitope low
If PH low and LR low and ASA low and VR low and SA high then value of epitope low
If PH medium and LR low and ASA low and VR low and SA high then value of epitope medium
If PH medium and LR medium and ASA low and VR low and SA high then value of epitope medium
If PH medium and LR high and ASA low and VR low and SA high then value of epitope medium
If PH medium and LR low and ASA medium and VR low and SA high then value of epitope medium
If PH medium and LR low and ASA high and VR low and SA high then value of epitope medium
If PH medium and LR low and ASA low and VR medium and SA high then value of epitope medium
If PH medium and LR low and ASA low and VR high and SA high then value of epitope medium
If PH medium and LR low and ASA low and VR low and SA medium then value of epitope medium
If PH medium and LR low and ASA low and VR low and SA high then value of epitope medium
If PH high and LR low and ASA low and VR low and SA low then value of epitope medium
If PH high and LR high and ASA low and VR low and SA low then value of epitope high
If PH high and LR medium and ASA medium and VR medium and SA medium then value of epitope high
If PH high and LR high and ASA high and VR high and SA high then value of epitope high

Fig. 3 Rules of FIS Mamdani

Note: PH = Parker Hydrophilicity; LR = Log-odds Ratio; ASA = Accessible Surface Area; VR = Residue Volume; SA = Surface Area
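As an illustration of how rules such as those in Figure 3 could be evaluated, the following Python sketch implements a small Mamdani system. It is not the authors' implementation. The interval endpoints are copied from Table 4, but several choices are assumptions made here: placing the peak of each triangle at the midpoint of its interval, using min for AND and implication, using max for aggregation, and defuzzifying by centroid. Only three of the 22 rules are included.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a and c and peak b."""
    if x <= a or x >= c:
        return 1.0 if x == b else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Fuzzy-set intervals (left foot, right foot) taken from Table 4. Placing the
# peak at the midpoint of each interval is an assumption of this sketch.
SETS = {
    "PH": {"low": (0, 1.325), "medium": (1, 2), "high": (1.8, 3)},
    "LR": {"low": (0, 1.6), "medium": (0.4, 3.6), "high": (2.4, 4)},
    "ASA": {"low": (0, 80), "medium": (20, 180), "high": (120, 200)},
    "VR": {"low": (0, 100), "medium": (25.66, 225.7), "high": (150, 250)},
    "SA": {"low": (0, 120), "medium": (30, 270), "high": (180, 300)},
    "OUT": {"low": (0, 120), "medium": (30, 270), "high": (180, 300)},
}
INPUTS = ("PH", "LR", "ASA", "VR", "SA")

# Three of the 22 rules from Figure 3: (PH, LR, ASA, VR, SA) -> output.
RULES = [
    (("low", "low", "low", "low", "low"), "low"),
    (("medium", "low", "low", "low", "high"), "medium"),
    (("high", "medium", "medium", "medium", "medium"), "high"),
]


def membership(var, label, x):
    left, right = SETS[var][label]
    return tri(x, left, (left + right) / 2, right)


def mamdani(values, rules=RULES, steps=300):
    """Min for AND and implication, max for aggregation, centroid output."""
    strengths = [
        (min(membership(v, lab, values[v]) for v, lab in zip(INPUTS, ante)), out)
        for ante, out in rules
    ]
    num = den = 0.0
    for i in range(steps + 1):
        y = 300 * i / steps  # output domain [0, 300] from Table 4
        mu = max(min(w, membership("OUT", out, y)) for w, out in strengths)
        num += y * mu
        den += mu
    return num / den if den else None  # None when no rule fires


# Hypothetical residue whose inputs sit at the peaks of the third rule's sets;
# the score lands at the centre of the "high" output set (about 240).
score = mamdani({"PH": 2.4, "LR": 2.0, "ASA": 100, "VR": 125.68, "SA": 150})
print(score)
```

A complete system would list all 22 rules in `RULES`; the inference code itself would not change.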

Table 4 Domain and fuzzy representation of the input and output variables
Variable                        Domain            Low          Medium           High
Input
Parker hydrophilicity (PH)      [0.03, 3]         [0, 1.325]   [1, 2]           [1.8, 3]
Log-odds ratio (LR)             [0.03, 3.19]      [0, 1.6]     [0.4, 3.6]       [2.4, 4]
Accessible Surface Area (ASA)   [57.05, 186.12]   [0, 80]      [20, 180]        [120, 200]
Volume Residue (VR)             [60.1, 227.8]     [0, 100]     [25.66, 225.7]   [150, 250]
Surface Area (SA)               [75, 255]         [0, 120]     [30, 270]        [180, 300]
Output
Epitope Value                   [0, 300]          [0, 120]     [30, 270]        [180, 300]

IV. ALGORITHM OF EPITOPE IDENTIFICATION

The epitope identification algorithm can be explained as follows:
1. Enter the *.pdb file of the antigen/protein to be identified.
2. From the 3D structure of the protein, obtain the five main attribute values that are used as input parameters of the fuzzy inference system. These parameters are the epitope propensity scale, the Parker hydrophilicity scale, the contact value, the surface accessibility ratio, the residue volume and the surface area.
3. Fuzzify the input parameters.
4. Apply Mamdani fuzzy inference. The result is an epitope score, which is compared with a threshold. If the score exceeds the threshold, the residue is an epitope residue.
5. Check whether the epitope residues lie within a distance of 6 Å of each other; if so, the residues are defined as a conformational epitope.

This algorithm is shown in Figure 4.

Fig. 4 Algorithm of Epitope Prediction
(Flowchart: Antigen/Protein (PDB file) -> extraction of 5 attributes/properties of amino acids -> FIS MAMDANI -> epitope score -> epitope score > threshold? No: not an epitope. Yes: distance check; if the distance condition holds: conformational epitope, otherwise: linear epitope.)
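The last two steps of the algorithm (thresholding the fuzzy score and checking the 6 Å distance) can be sketched in Python as below. The reading of the distance step used here, namely that an epitope residue is kept when at least one other epitope residue lies within 6 Å, is an assumption of this sketch. The scores, coordinates and the threshold of 150 are hypothetical values chosen only for illustration.

```python
import math


def epitope_residues(scores, threshold):
    """Step 4: keep residues whose fuzzy score exceeds the threshold."""
    return [res for res, score in scores.items() if score > threshold]


def conformational_epitope(residues, coords, max_distance=6.0):
    """Step 5: keep epitope residues that have another epitope residue
    within `max_distance` angstroms."""
    selected = []
    for r in residues:
        if any(
            o != r and math.dist(coords[r], coords[o]) <= max_distance
            for o in residues
        ):
            selected.append(r)
    return selected


# Hypothetical scores and coordinates for four residues.
scores = {10: 210.0, 11: 95.0, 12: 190.0, 40: 230.0}
coords = {
    10: (0.0, 0.0, 0.0),
    11: (3.8, 0.0, 0.0),
    12: (5.0, 2.0, 0.0),
    40: (30.0, 0.0, 0.0),
}
candidates = epitope_residues(scores, threshold=150.0)
print(candidates)                                  # [10, 12, 40]
print(conformational_epitope(candidates, coords))  # [10, 12]
```

Residue 40 passes the score threshold but has no epitope neighbour within 6 Å, so it is not reported as part of a conformational epitope in this sketch.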


V. RESULT

a. EXPERIMENT SCENARIO

The experiments use data from IRD H5N1, DB Structure and the Conformational Neuraminidase Epitope Database (CNED). Besides the main data, more homologous VDAC3 data are also tested for comparison. The experiment scenarios are shown in Figure 5.

Fig. 5 Experiment Scenario

The experiment scenarios can be explained as follows:

1. Scenario 1 (Test CED)
In scenario 1, the results of the experiment using fuzzy Mamdani are compared with conformational epitope data that have already been identified. The data were obtained from CED, which can be accessed at http://immunet.cn/ced/. The conformational epitope locations of the antigens stored in CED are known, so they can be used to test the accuracy of the proposed method on the same data.

2. Scenario 2 (Test Data H5N1)
In scenario 2, the data tested are H5N1 virus data. These data are used because the results of the experiment will be very beneficial for disease prevention (in the form of vaccines) as well as for drug design, especially for influenza vaccines. In contrast to the data obtained from CED, which tend to vary, the H5N1 virus data have some similarities across several variants, so they can be observed more easily. For comparison, the results of fuzzy Mamdani are compared with the predictions of the Discotope server, which can be accessed at http://www.cbs.dtu.dk/services/DiscoTope/. Among the prediction tools available so far, Discotope has the highest prediction accuracy, so it can be used to assess the accuracy of the proposed method.

3. Scenario 3 (Test Data Homogeneous)
In scenario 3, the data tested are VDAC3 data. Unlike the varied data in scenarios 1 and 2, the VDAC3 data consist of only 1 *.pdb file. These data can be used as test data because the sequence is not very long and does not have many variants. For comparison, the results of fuzzy Mamdani are compared with the results of the SEPPA server and the Discotope server. The SEPPA server can be accessed at http://lifecenter.sgst.cn/seppa/.

b. EXPERIMENT RESULT

Scenario 1
In scenario 1, 10 samples taken from the CED data are used for testing. The results are shown in Table 5.

Table 5 Comparison of epitope locations from CED and FMamdani

PDB_ID: 1WEJ/F
Epitope Location (CED): HGLFGRK(33-39)+GITWKEETLME(56-66)+AYLKKATNE(96-104)
Epitope Location (FMamdani): 1-2, 4-5, 21-28, 37-58, 62-63, 66-67, 69-70, 72-81, 83-84, 86-89, 104

PDB_ID: 1XUM
Epitope Location (CED): I60E61+YVSI(82-85)+EIR(107-109)+FLGIF(130-134)+E157+K183
Epitope Location (FMamdani): 54-55, 66, 70-71, 75, 215, 225

PDB_ID: 1QGT/B
Epitope Location (CED): PSD(20-22)+PSIRD(25-29)+IR(126-127)
Epitope Location (FMamdani): 1-5, 7-8, 22, 45-46, 48-50, 74-75, 77-80, 92, 128-136, 143

PDB_ID: 1QGT/C
Epitope Location (CED): PSD(20-22)+PPAY(129-132)
Epitope Location (FMamdani): 1-5, 7-8, 22, 45-46, 48-49, 75, 77-78, 80, 92, 128-137, 142

PDB_ID: 1IAI/H
Epitope Location (CED): TNYG(30-33)+WNYT(50,52,54,59)+YNYY(101,104-106)
Epitope Location (FMamdani): 1, 14-17, 31, 41-44, 47, 55, 61-63, 66-67, 85, 87-89, 101-107, 109, 136-137, 165, 168, 198-200, 210

PDB_ID: 1IAI/L
Epitope Location (CED): D28+R68+HYSTF(91-94,96)
Epitope Location (FMamdani): 1, 10, 12, 40-43, 45, 56-60, 93-95, 109, 122, 142-143, 145, 149-158, 164-165, 167, 169-170, 184-185, 187-191, 201-203, 210-214

PDB_ID: 1TPX/A
Epitope Location (CED): KQHTVTTTTKGE(188-199)
Epitope Location (FMamdani): 129-130, 133, 135-139, 141-160, 162, 168-177, 201-208, 222-228

PDB_ID: 1H0D/C
Epitope Location (CED): GLTSPCKD(34-41)+GGSPWPP(85-91)
Epitope Location (FMamdani): 2-5, 7-8, 10-11, 15-20, 24, 28, 31-34, 37-38, 48-52, 60-66, 68, 85-86, 89-91, 109, 118-119, 122-123

PDB_ID: 1A7C/A
Epitope Location (CED): NKD(87-89)+QWK(174-176)+HGDT(229-232)+NRS(329-331)
Epitope Location (FMamdani): 2-3, 27, 30, 52-53, 60, 68-70, 81, 83-90, 107-108, 142, 146-147, 149, 172, 174, 176-183, 185-186, 193-195, 197-198, 206-207, 214, 216-218, 229-231, 242, 244, 261, 264-269, 291-291, 294, 302, 313, 330-348, 350, 366

PDB_ID: 1NDG/C
Epitope Location (CED): RHGNYR(14-16,19-21)+WW(62-63)+SRNLN(72-75,77)+TNKKISDG(89,93,96-98,100-102)
Epitope Location (FMamdani): 1-14, 22-24, 28-29, 70-73, 155-162, 235-236, 242, 244, 321, 370-374, 431-432, 509-511, 525, 538-539

PDB_ID: 1DAB/A
Epitope Location (CED): TWDDD(99-103)+GGFGPVLDGW(252-261)
Epitope Location (FMamdani): 616, 619-623, 644-649, 667-668, 670-671, 701-703



AAC34264.1 98-99, 129-131, 133-134, 136-137, 140, 157, 185188, 206-209, 232-233,


68-98, 129-131, 133-134, 136137, 140, 157, 185-188, 206209, 232-233, 235-239, 258, 296, 315-317, 326-331, 416-452

The prediction evaluation results and accuracy for scenario 1 are shown in Table 6.

235-239, 258, 296, 315317, 326-331, 341-342,

Table 6 Prediction Evaluation Results Scenario 1
PDB_ID/chain   TP   FP   TN    FN   Sen(%)   Spec(%)   Ppv(%)   Acc(%)
1WEJ/F         10   43   40    13   43.48    48.19     18.87    47.17
1XUM           0    8    202   16   0.00     96.19     0.00     89.38
1QGT/B         3    8    56    12   20.00    87.50     27.27    74.68
1QGT/C         1    22   112   49   2.00     83.58     4.35     61.41
1IAI/H         10   10   65    28   26.32    86.67     50.00    66.37
1IAI/L         7    44   107   27   20.59    70.86     13.73    61.62
1TPX/A         17   70   280   49   25.76    80.00     19.54    71.39
1H0D/C         5    16   49    25   16.67    75.38     23.81    56.84
1A7C/A         10   98   174   41   19.61    63.97     9.26     56.97
1NDG/C         7    67   197   49   12.50    74.62     9.46     63.75
1DAB/A         2    33   119   14   12.50    78.29     5.71     72.02
1WEJ/F         7    57   46    22   24.14    44.66     10.94    40.15
1XUM           3    10   140   46   6.12     93.33     23.08    71.86
1QGT/B         20   16   288   47   29.85    94.74     55.56    83.02
1QGT/C         14   26   146   35   28.57    84.88     35.00    72.40
1IAI/H         14   82   72    32   30.43    46.75     14.58    43.00
1IAI/L         2    89   103   21   8.70     53.65     2.20     48.84
1TPX/A         15   72   212   30   33.33    74.65     17.24    69.00
average                             20.03    74.33     18.92    63.88
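The four measures reported in Table 6 follow the standard definitions based on true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). A minimal Python sketch is given below; the function name is hypothetical, and returning 0 for an empty denominator is a convention chosen here rather than something stated in the paper. The example reproduces the first row of Table 6.

```python
def evaluate(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV and accuracy as percentages."""
    def pct(numerator, denominator):
        return 100.0 * numerator / denominator if denominator else 0.0

    return {
        "sen": pct(tp, tp + fn),
        "spec": pct(tn, tn + fp),
        "ppv": pct(tp, tp + fp),
        "acc": pct(tp + tn, tp + fp + tn + fn),
    }


# Counts of the first row of Table 6 (1WEJ/F).
row = evaluate(tp=10, fp=43, tn=40, fn=13)
print({name: round(value, 2) for name, value in row.items()})
# {'sen': 43.48, 'spec': 48.19, 'ppv': 18.87, 'acc': 47.17}
```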

352-354, 375, 384, 416424,440,449-452 89, 92-93, 123-125, 127128, 130-131, 151-152, 89-93, 123-128, 130-131, 151152, 179-182, 201-203, 226227, 320-325, 335-336, 346348, 369, 378, 410-418, 434, 440-446

179-182, 201-203, 226-227, 229-232, 252, 290, 309311, 320-325, 335-336,

346-348, 369, 378, 410418, 434, 443-446 89, 92-93, 123-125, 127128, 130-131, 134, 151, 179-182, 198, 200-203,226227, 229-233, 252, 290, 309-310, 320-325, 335-336, 346-348, 369, 378, 410418, 434, 441, 442-446 92-93, 123-125, 127-128, 130-131, 134, 151, 179182, 200-203, 226-227, 134, 151, 179-182, 200-203, 226-227, 229-232, 252, 290, 309-311, 320-325, 335-336, 86-89, 92-93, 123-125, 127-128, 151-154, 309-325, 229-233, 252, 290, 335-336, 346-348,

369, 378, 410-418, 434, 441-446

229-232, 252, 290, 309311, 72.40% 43.00% 48.84% 69.00% 63.88% AAD16788.1 320-325, 335-336,

346-348, 369, 378, 410-418, 434, 443-446

346-348, 369, 378, 410418, 434, 443-446 90, 129, 93-94,124-126, 131-132, 12890, 93-94,124-126, 152-153, 321-326, 128-129, 180-183, 336-337,

152-153, 227-

131-132, 310-311,

180-183,

201-204,

228,230-234, 291, 310-311, 321-326, 336-337, 347-349,

347-349, 370, 379,411-420,435, 444-447

The experimental results show that, when compared with the epitope locations in CED, the Mamdani fuzzy predictions reach fairly high accuracy on some data, in some cases almost 90%. The average specificity (74.33%) is quite high while the average sensitivity (20.03%) is low, which means that non-epitope residues are recognized correctly more often than epitope residues are.

Scenario 2
In scenario 2, the primary data are H5N1 virus data. Several varied samples are tested with the Discotope server and with fuzzy Mamdani (the proposed method). The prediction results are shown in Table 7.

Table 7 Prediction results of Discotope and fuzzy Mamdani
PDB_ID AAC14419.1 Discotope 77-78, 108-110, 112-113, 115-116, 136,165-167, 186188, 211-212, 214-217,237, 275, 294-295, 305-310, 354, 419, FMam 76-78, 112-113, 115-116,

370, 379,411-420,435, 444447 AAD16789.1 90, 93-94, 124-126, 128129, 131-132, 152, 180183, 201-204, 227-228,230233, 253, 291, 310-312, 321-326, 336-337, 347-349, 370, 379, 411-419, 90, 93-94, 124-126, 128-129, 131-132, 152, 180-183, 201204, 227-228,230-233, 253, 291, 310-312, 321-326, 411-419,

435,442, 444-447

435,442, 444-447 AAD16790.1 93-94, 124-126, 128-129, 131-132, 152, 180-183, 93-94, 124-126, 128-129, 131132, 152, 180-183, 201-204, 370, 379,

201-204, 227-228, 230-234, 291,310-312, 321-326, 336337,347-349, 370, 379,

336-337,347-349,

411-419, 435, 444-447

411-419, 435, 444-447 AAD16791.1 93-94, 124-126, 128-129, 131-132, 152, 180-183, 93-126, 128-129, 131-132, 152, 180-183, 201-204, 227-228,

201-204, 227-228, 230-234, 253, 291, 310-312, 321326, 336-337, 347-349,

230-234, 253, 291, 310-312, 321-326, 336-337, 347-349

136,165-167, 186-188, 211-212, 214-217, 275, 294-295, 320321,331-333, 354, 363,380, 395404, 419, 423-431 AAD16792.1

370, 379, 411-419, 444-447 94, 124-128, 132, 152, 180182, 202-204, 228-232, 336, 132-152, 228-232, 180-182, 310-312, 202-204, 322-326,

320-321,331-333, 363,380, 428-431 395-404,

310-312,

322-326,

336, 348, 370,412-418, 444-447

348, 370,412-418, 444-447



PDB_ID: 2FK0/A
Discotope: 20-22, 33-34, 45-47, 79, 81, 82, 103-104, 122, 125-126, 128-129, 158-160, 162-173, 186-190, 192-193, 197-199, 222, 239-240, 242, 263-264, 289, 291-292, 298, 312, 323-324
FMam: 20-34, 45-47, 79, 81, 82, 103-104, 122, 125-126, 128-129, 158-160, 162-173, 186-190, 192-193, 197-199, 222, 239-240, 242, 263-264, 289

PDB_ID: 2KAD/A/B/C/D
Discotope: 22-23, 45-46
FMam: 20-23, 43-46

PDB_ID: 2KQT/A/B/C/D
Discotope: 22
FMam: 22-24

PDB_ID: 3C9J/A/B/C/D
Discotope: 25
FMam: 20-25

PDB_ID: 3F5T
Discotope: 21-22, 24-27, 30, 41, 45, 48-49, 51, 66-82, 89-91, 94-97, 100-101, 117, 120, 159, 161-162, 184, 194
FMam: 21-27, 30, 41, 45, 48-49, 51, 66-82, 89-91, 94-97, 100-101, 117, 120, 159, 161-162, 184-194

The evaluation results and prediction accuracy for scenario 2 are shown in Table 8. In the evaluation of scenario 2, sensitivity and specificity are computed by comparing the predictions of the proposed method with the Discotope predictions, which are treated as the actual data.

Table 8 Prediction Evaluation Results FMamdani
PDB_ID/chain   TP   FP   TN    FN   Sen(%)   Spec(%)   Ppv(%)   Acc(%)
AAC14419.1     54   1    263   78   40.91    99.62     98.18    80.05
AAC34264.1     97   8    41    41   70.29    83.67     92.38    73.80
AAC40507.1     93   8    111   88   51.38    93.28     92.08    68.00
AAD16786.1     21   11   220   55   27.63    95.24     65.63    78.50
AAD16787.1     25   1    178   35   41.67    99.44     96.15    84.94
AAD16788.1     58   6    303   22   72.50    98.06     90.63    92.80
AAD16789.1     44   10   189   82   34.92    94.97     81.48    71.69
AAD16790.1     40   7    29    67   37.38    80.56     85.11    48.25
AAD16791.1     65   15   351   58   52.85    95.90     81.25    85.07
AAD16792.1     23   9    133   53   30.26    93.66     71.88    71.56
2FK0/A         36   1    315   84   30.00    99.68     97.30    80.50
2KAD/A/B/C/D   4    4    37    0    100.00   90.24     50.00    91.11
2KQT/A/B/C/D   1    3    21    0    100.00   87.50     25.00    88.00
3C9J/A/B/C/D   1    4    19    0    100.00   82.61     20.00    83.33
3F5T           91   3    353   31   74.59    99.16     96.81    92.89
Average                             57.63    92.91     76.26    79.37

The experimental results show that, when the proposed method is tested on varied H5N1 data, the sensitivity is higher than on the data used in scenario 1. The accuracy is fairly high on some data, in some cases above 90%, but the overall accuracy is still not good enough. This may be because the data are more varied than those in scenario 1; the type of data being tested greatly affects the prediction outcome. On the H5N1 test data, the average sensitivity was above 50%, meaning that a fairly large share of the epitope residues is also identified as epitope by the proposed method. For conformational epitope prediction, a value above 50% is a good value, especially if the accuracy is also high.

Scenario 3
In scenario 3, homogeneous data are used, i.e. data that have the same sequence even though there are several variants. The data tested are VDAC3 data, shown in Figure 6.

>gi|5733504|gb|AAD49610.1| voltage-dependent anion channel VDAC3 [Homo sapiens]
MCNTPTYCDLGKAAKDVFNKGYGFGMVKIDLKTKSCSGVEFSTSGHAYTDTGKASGNLETKYKVCNYGLT
FTQKWNTDNTLGTEISWENKLAEGLKLTLDTIFVPNTGKKSGKLKASYKRDCFSVGSNVDIDFSGPTIYG
WAVLAFEGWLAGYQMSFDTAKSKLSQNNFALGYKAADFQLHTHVNDGTEFGGSIYQKVNEKIETSINLAW
TAGSNNTRFGIAAKYMLDCRTSLSAKVNNASLIGLGYTQTLRP

Fig. 6 Data VDAC 3 Homo sapiens

The VDAC3 data are submitted to the SEPPA server, the Discotope server and FMam, and the conformational epitope locations identified are shown in Table 8.

Table 8 Prediction Results of VDAC 3 on SEPPA, Discotope and F Mamdani
ID_PDB: 2JK4
Epitope Location (SEPPA): 1, 39-40, 54, 94-95, 109-111, 137-138, 162-165, 180, 200, 201-205, 215-216, 218, 228-234, 253-254, 269, 271-272, 298-300
Epitope Location (Discotope): 1-4, 10, 38-42, 54-56, 67-69, 78, 80-84, 94-95, 106-113, 136-137, 163-166, 168-169, 170-180, 190-191, 200-205, 215-219, 231-232, 255-256, 269-274, 287-288
Epitope Location (FMam): 1-4, 8-10, 38-42, 54-56, 67-69, 78, 80-84, 94-95, 106-113, 136-137, 163-166, 168-169, 205, 231-232, 255-256, 269-274, 287-288

In scenario 3, the evaluation is done by comparing the predictions of the proposed method with the predictions of the SEPPA server and the Discotope server. The evaluation results are shown in Table 9 and Table 10.

Table 9 Evaluation results in comparison with SEPPA
PDB_ID/chain   TP   FP   TN    FN   Sen(%)   Spec(%)   Ppv(%)   Acc(%)
2JK4           40   18   102   10   80.00    85.00     68.97    85.53

Table 10 Evaluation results in comparison with Discotope
PDB_ID/chain   TP   FP   TN    FN   Sen(%)   Spec(%)   Ppv(%)   Acc(%)
2JK4           68   12   50    15   81.93    80.65     85.00    81.38
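Comparing two predictors, as done in scenarios 2 and 3, requires turning the residue lists into sets and counting the agreements. The Python sketch below shows one way this could be done; the function names are hypothetical, the example uses an invented 20-residue protein rather than data from the tables, and counting true negatives over all residues from 1 to the sequence length is an assumption, since the paper does not state how TN was obtained.

```python
def parse_residues(text):
    """Turn a list such as '1-4, 8-10, 78' into a set of residue numbers."""
    residues = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(p) for p in part.split("-"))
            residues.update(range(start, end + 1))
        else:
            residues.add(int(part))
    return residues


def confusion_counts(predicted, reference, sequence_length):
    """Count TP, FP, TN and FN over residues 1..sequence_length."""
    everything = set(range(1, sequence_length + 1))
    tp = len(predicted & reference)
    fp = len(predicted - reference)
    fn = len(reference - predicted)
    tn = len(everything - predicted - reference)
    return tp, fp, tn, fn


# Hypothetical 20-residue protein, not taken from the tables above.
predicted = parse_residues("1-4, 8-10, 15")
reference = parse_residues("3-6, 9, 15-16")
print(confusion_counts(predicted, reference, sequence_length=20))
# (4, 4, 9, 3)
```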

From the conformational epitope locations predicted on the VDAC3 data by SEPPA, Discotope and fuzzy Mamdani, it can be seen that the results tend to be similar, although none of the three methods gives exactly the same result. The VDAC3 data are very different from the H5N1 virus data and from the data of the other proteins tested in scenario 1 and scenario 2. Although VDAC3 has many variants, the protein has only 1 id_pdb and the sequence is the same for all variants. Based on the DNA Star analysis, the conformational epitope is ideally located in exons 5-8, which begin at id_residue 108. On this basis, and given the results in Table 9 and Table 10, the predictions that are approximately correct are those made by Discotope and by the proposed fuzzy Mamdani system. The distribution of the conformational epitope locations that correspond to the DNA Star analysis for the 3 methods is shown in Figure 7. A visualization of the conformational epitope location is shown in Figure 8.

ID_PDB: 2JK4
SEPPA: 1, 39-40, 54, 94-95, 109-111, 137-138, 162-165, 180, 200, 201-205, 215-216, 218, 228-234, 253-254, 269, 271-272, 298-300
Discotope: 1-4, 10, 38-42, 54-56, 67-69, 78, 80-84, 94-95, 106-113, 136-137, 163-166, 168-169, 170-180, 190-191, 200-205, 215-219, 231-232, 255-256, 269-274, 287-288
Fuzzy Mamdani: 1-4, 8-10, 38-42, 54-56, 67-69, 78, 80-84, 94-95, 106-113, 136-137, 163-166, 168-169, 205, 231-232, 255-256, 269-274, 287-288

Fig. 7 Conformational Epitope Distribution Locations

Fig. 8 Visualization of conformational epitope location on VDAC3

VI. DISCUSSION AND CONCLUSION

Methods for conformational epitope identification need to be developed because conformational epitopes make up about 90% of B-cell epitopes. Previous studies have focused more on the identification of linear epitopes, which make up only about 10% of B-cell epitopes. Developing these methods becomes more important because the identification of the epitope location directly affects the success of vaccine development, the epitope being a major component of a vaccine. An accurate identification method is expected to accelerate vaccine development and to save significant development costs, which is extremely useful in Indonesia, where the sources of research data are highly varied. The method was developed in stages: building the epitope prediction algorithm, developing the tool with a Mamdani fuzzy system, and testing the accuracy of the system against other existing methods. Tests were performed on 3 scenarios, namely the accuracy test, the statistical test and the homogeneous data test (VDAC3 protein data). From the steps conducted in this study, several conclusions can be drawn:
1. The conformational epitope prediction algorithm developed is able to provide predictions with high accuracy. This can be seen in the results of test scenario 1, where data with known conformational epitope locations in CED were tested using the FMam tool developed, giving an average accuracy of 63.88% and up to 89.38% on individual structures (Table 6).
2. The use of fuzzy systems, in this case contact-map-based fuzzy inference, can provide a solution for conformational epitope prediction, a problem that previously developed methods did not describe in detail. The results obtained were quite good, as evidenced by the tests on the 3 scenarios.
3. The evaluation results on varied data, ranging from epitope data and virus protein data to VDAC3 protein data, indicate that the method built is better than other existing methods.

Some possibilities for subsequent research are as follows:
1. Conformational epitope prediction followed by determination of the best location. This would greatly assist the development of vaccines and the development of contraceptive devices using VDAC3 data.
2. Other fuzzy methods, such as fuzzy SVM, could be used to perform epitope prediction, and would probably give different results.


REFERENCE
[1] Dong Xu, James M. Keller, Mihail Popescu, Rajkumar Bondugula. (2008). Applications of Fuzzy Logic in Bioinformatics. London: Imperial College Press.
[2] Jing Sun, Di Wu, Tianlei Xu, Xiaojing Wang, Xiaolian Xu, Lin Tao, Y. X. Li and Z. W. Cao. (May 22, 2009). SEPPA: a computational server for spatial epitope prediction of protein antigens. Nucl. Acids Res. (2009) 37 (suppl 2): W612-W616. doi: 10.1093/nar/gkp417
[3] Urmila Kulkarni-Kale, Shriram Bhosle and A. S. Kolaskar. (2005). CEP: a conformational epitope prediction server. Oxford Journals Volume 22, Web Server Issue, pp. W168-W171.
[4] P. H. Andersen, Morten Nielsen, Ole Lund. (2006). Prediction of residues in discontinuous B-cell epitopes using protein 3D structures. Protein Science 15: 2558-2567.
[5] Violaine Moreau, Cécile Fleury, Dominique Piquer, Christophe Nguyen, Nicolas Novali, Sylvie Villard, Daniel Laune, Claude Granier and Franck Molina. (January 30, 2008). PEPOP: computational design of immunogenic peptides. BMC Bioinformatics 2008, 9:71. doi:10.1186/1471-2105-9-71
[6] Joo Chuan Tong, Tin Wee Tan, Shoba Ranganathan. (2006, September 14). Methods and protocols for prediction of immunogenic epitopes. Briefings in Bioinformatics Vol. 8, No. 2, 96-108.
[7] Faizah, Widyanto, M.R., Amaliah, Bilqis. (2010). Prediction of Conformational Epitope Using Expert System. Proceeding ICACSIS 2010.
[8] Peters B, Bui HH, Sidney J, Weng Z, Loffredo JT, et al. (2005). A computational resource for the prediction of peptide binding to Indian rhesus macaque MHC class I molecules. Vaccine 23: 5212-5224.
[9] Savoie, C.J., Kamikawaji, N., Sasazuki, T. (1999). Use of BONSAI decision trees for the identification of potential MHC Class I peptide epitope motifs. Pacific Symposium on Biocomputing 4: 182-189.
[10] Lundegaard, Claus, Lund, Ole, Nielsen, Morten. (2003). Prediction of epitopes using neural network based methods. Journal of Immunological Methods.
[11] Wang B, Hua RH, Tian Z-J, Chen N-S, Zhao F-R, Liu T-Q, Wang Y-F, Tong G-Z. (2009). Identification of a virus-specific and conserved B-cell epitope on NS1 protein of Japanese encephalitis virus. Virus Res 141: 90-5.
[12] Zhao, Yingdong, Pinilla, Clemencia, Valmori, Danila, Martin, Roland, Simon, Richard. (2003). Application of support vector machines for T-cell epitopes prediction. Bioinformatics 19 (15): 1978-1984.
[13] Bhasin, M. and Raghava, G. P. S. (2004). Prediction of CTL epitopes using QM, SVM and ANN techniques. Vaccine 22: 3195-204.
[14] Pedro A. Reche and Ellis L. Reinherz. (2007). Definition of MHC supertypes through clustering of MHC peptide binding repertoires. Methods in Molecular Biology, Volume 409, Part II, 163-173. DOI: 10.1007/978-1-60327-118-9_11
