Results from PredictProtein for predict_h31294

TOC for file /home/phd/server/work/predict_h31294
1. The following information has been received by the server
2. PROSITE motif search (A Bairoch; P Bucher and K Hofmann)
3. SEG low-complexity regions (J C Wootton & S Federhen)
4. ProDom domain search (E Sonnhammer, Corpet, Gouzy, D Kahn)
5. PSI-BLAST alignment header
6. MAXHOM alignment header
7. MAXHOM alignment
8. COILS prediction (A Lupas)
9. PHD information about accuracy
10. PHD predictions
11. Ambivalent Sequence Predictor (Malin Young, Kent Kirshenbaum, Stefan Highsmith)
12. PROF predictions
13. GLOBE prediction of globularity

END of TOC

BEG of results for file /home/phd/server/work/predict_h31294

The following information has been received by the server
reference predict_h31294 (Nov 13, 2003 11:19:45)
reference pred_h31294 (Nov 13, 2003 10:48:37)
PPhdr from: Jeremy.Derrick@umist.ac.uk
PPhdr resp: MAIL
PPhdr orig: HTML
PPhdr want: HTML
PPhdr password(###)
prediction of: - default prediction of: - PROFsec PROFacc PROFhtm ProSite SEG ProDom COILS NLS nor
return msf format
ret html

ret store
# default: single protein sequence description=PilG
MAKNGGFSLF AKKEKRFIFE GRHSASDKLV NGEVSAFTEE EARKKLAKRG IRPLQITRVK TSSKRKITQE DITVFTRQLS TMIKAGLPL

PROSITE motif search (A Bairoch; P Bucher and K Hofmann)
------------------------------------------------------------
Pattern-ID: CAMP_PHOSPHO_SITE PS00004 PDOC00004
Pattern-DE: cAMP- and cGMP-dependent protein kinase phosphorylation site
Pattern:    [RK]{2}.[ST]
     65     RKIT
    272     RKGT

Pattern-ID: PKC_PHOSPHO_SITE PS00005 PDOC00005
Pattern-DE: Protein kinase C phosphorylation site
Pattern:    [ST].[RK]
     26     SDK
     62     SSK
    250     SIK

Pattern-ID: CK2_PHOSPHO_SITE PS00006 PDOC00006
Pattern-DE: Casein kinase II phosphorylation site
Pattern:    [ST].{2}[DE]
     24     SASD
     38     TEEE
     68     TQED
    104     SMTE
    152     SLLD
    217     TVMD
    345     SIGE
    352     SLDD

Pattern-ID: TYR_PHOSPHO_SITE PS00007 PDOC00007
Pattern-DE: Tyrosine kinase phosphorylation site
Pattern:    [RK].{2,3}[DE].{2,3}Y
    131     KYFDRFY

Pattern-ID: MYRISTYL PS00008 PDOC00008
Pattern-DE: N-myristoylation site
Pattern:    G[^EDRKHPFYW].{2}[STAGCN][^P]
    119     GSSLSR
    148     GVLESL
    302     GAAGNL
    323     GLSMTS
    329     GMRATE
    388     GLVIGT

Pattern-ID: T2SP_F PS00874 PDOC00682
Pattern-DE: Bacterial type II secretion system protein F signature
Pattern:    [KRQ][LIVMA].{2}[SAIV][LIVM].[TY]P.{2}[LIVM].{3}[STAGV].{6}[LMY].{3}[LIVMF]{2}P
    170     KVKTALTYPVSVIAVAIGLVFVMMIFVLP
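
The PROSITE hits above can be re-derived with ordinary regular expressions, since the patterns are already printed in regex-like syntax. The following is a minimal sketch, not the server's own scanner: it assumes the 410-residue query as printed in the COILS section of this report, and it assumes matches are reported left to right without overlap (which is what `re.finditer` does and what the listed PKC hits suggest). The names `SEQ`, `PATTERNS` and `scan` are illustrative only.

```python
import re

# Query sequence (410 residues) as printed in the COILS section of this report.
SEQ = (
    "MAKNGGFSLFAKKEKRFIFEGRHSASDKLVNGEVSAFTEEEARKKLAKRG"
    "IRPLQITRVKTSSKRKITQEDITVFTRQLSTMIKAGLPLMQAFEIVARGH"
    "GNPSMTEMLMEIRGEVEQGSSLSRAFSNHPKYFDRFYCNLVAAGETGGVL"
    "ESLLDKLAIYKEKTQAIRKKVKTALTYPVSVIAVAIGLVFVMMIFVLPAF"
    "KEVYANMGAELPALTQTVMDMSDFFVSYGWMVLIALGFAIYGFLKLKARS"
    "IKIQRRMDAILLRMPIFGDIVRKGTIARWGRTTATLIAAGVPLVDVLDST"
    "AGAAGNLIYEEATREIRTRVIQGLSMTSGMRATELFPNMMLQMSSIGEES"
    "GSLDDMLNKAAEFYEDEVDNAVGRLSAMMEPIIIVILGLVIGTLLVAMYL"
    "PLFNLGNVVA"
)

# Patterns copied from the "Pattern:" lines of the report.
PATTERNS = {
    "CAMP_PHOSPHO_SITE": r"[RK]{2}.[ST]",
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",
    "TYR_PHOSPHO_SITE": r"[RK].{2,3}[DE].{2,3}Y",
}


def scan(pattern, seq):
    """Return (1-based start, matched text) for non-overlapping matches."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, seq)]


hits = {name: scan(rx, SEQ) for name, rx in PATTERNS.items()}
for name, found in hits.items():
    print(name, found)
```

These four motifs are short and match frequently by chance, so a hit here only locates a candidate site.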

SEG low-complexity regions (J C Wootton & S Federhen)
prot (#) default: single protein sequence description=pilg /home/phd/server/work/predict_h31294
/home/phd/server/work/predict_h31294.segNormGcg  Length: 410  11-Jul-99  Check: 2818 ..

    1  MAKNGGFSLF AKKEKRFIFE GRHSASDKLV NGEVSAFTEE EARKKLAKRG
   51  IRPLQITRVK TSSKRKITQE DITVFTRQLS TMIKAGLPLM QAFEIVARGH
  101  GNPSMTEMLM EIRGEVEQGS SLSRAFSNHP KYFDRFYCNL VAAGETGGVL
  151  ESLLDKLAIY KEKTQAIRKK VKTALTYPVS VIAVAIGLVF VMMIFVLPAF
  201  KEVYANMGAE LPALTQTVMD MSDFFVSYGW MVLIALGFAI YGFLKLKARS
  251  IKIQRRMDAI LLRMPIFGDI VRKGTIARWG RTTATLIAAG VPLVDVLDST
  301  AGAAGNLIYE EATREIRTRV IQGLSMTSGM RATELFPNMM LQMSSIGEES
  351  GSLDDMLNKA AEFYEDEVDN AVGRLSAMME Pxxxxxxxxx xxxxxxAMYL
  401  PLFNLGNVVA
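
SEG marks low-complexity residues by replacing them with `x` in the sequence listing. As a small sketch of how the masked interval can be read back from one line of that listing, the function below finds runs of `x` and converts them to 1-based residue positions. The function name `masked_runs` and its `offset` parameter (the residue number printed at the start of the line) are assumptions made for illustration.

```python
import re


def masked_runs(line, offset):
    """Return 1-based (start, end) for each run of 'x' in one sequence line.

    `line` is one row of the GCG-style listing (blocks of ten residues
    separated by spaces); `offset` is the residue number of its first residue.
    """
    residues = line.replace(" ", "")
    return [
        (m.start() + offset, m.end() + offset - 1)
        for m in re.finditer(r"x+", residues)
    ]


# Row starting at residue 351 in the SEG output above.
runs = masked_runs("GSLDDMLNKA AEFYEDEVDN AVGRLSAMME Pxxxxxxxxx xxxxxxAMYL", 351)
print(runs)
```

The masked stretch falls inside the last membrane helix reported by PHDhtm (380 - 401), which is a common outcome for hydrophobic, compositionally biased segments.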

ProDom domain search (E Sonnhammer, Corpet, Gouzy, D Kahn)
Identities computed with respect to: (query) prot
Colored by: consensus/70% and property
HSP processing: ranked

    ID         description                      score  P(N)     N  identity
    prot       (#) default: single protein...                        100.0%
 1  PD034811   p2000.1 (2) Q51105(1) Q5707...     540  5.4e-71  1     99.1%
 2  PD001695   p2000.1 (56) GSPF(15) PILC(...     238  1.7e-36  2     44.9%
 3  PD002882   p2000.1 (29) GSPF(8) PILC(2...     208  1.8e-22  1     47.6%
 4  PD006803   p2000.1 (12) HOFC(2) // PR...      184  8.1e-19  1     66.7%
 5  PD190313   p2000.1 (2) O84574(1) Q9Z78...     167  6.1e-16  1     27.5%
 6  PD039976   p2000.1 (4) CMG2(1) O06667(...     101  6.5e-14  3     22.8%
 7  PD096544   p2000.1 (1) GSPF_XANCP // G...      91  7.9e-06  1     31.4%
 8  PD096546   p2000.1 (1) HOFC_HAEIN // P...      64  8.9e-05  2     28.0%
 9  PD007713   p2000.1 (9) // PROTEIN CONS...      67  0.00010  2     19.3%
    consensus/100%
    consensus/90%
    consensus/80%
    consensus/70%

25 [
.
.
ASDKLVNGEVSAFTEEEARK
-------------------AKGKKVKGQLEADSEREARQ
-------------------------------------------------------------------------------------------------------------------------------------....................
....................

--- ------------------------------------------------------------
--- Again: these results were obtained based on the domain data-
--- base collected by Daniel Kahn and his coworkers in Toulouse.
---
--- PLEASE quote:
---       F Corpet, J Gouzy, D Kahn (1998). The ProDom database
---       of protein domain families. Nucleic Ac Res 26:323-326.
---
--- The general WWW page is on:
---       http://www.toulouse.inra.fr/prodom.html
---
--- For WWW graphic interfaces to PRODOM, in particular for your
--- protein family, follow the following links (each line is ONE
--- single link for your protein!!):
---
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD034811 ==> multiple alignment,
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD034811 ==> graphical output of
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD001695 ==> multiple alignment,
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD001695 ==> graphical output of
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD002882 ==> multiple alignment,
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD002882 ==> graphical output of
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD006803 ==> multiple alignment,
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD006803 ==> graphical output of
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD190313 ==> multiple alignment,
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD190313 ==> graphical output of
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD039976 ==> multiple alignment,
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD039976 ==> graphical output of
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD096544 ==> multiple alignment,
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD096544 ==> graphical output of
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD096546 ==> multiple alignment,
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD096546 ==> graphical output of
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD007713 ==> multiple alignment,
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD007713 ==> graphical output of
---
--- NOTE: if you want to use the link, make sure the entire line
---       is pasted as URL into your browser!
---
--- END of PRODOM
--- ------------------------------------------------------------

PSI-BLAST alignment header
-----------------------------------------------------------
PSI-BLAST multiple sequence alignment
-----------------------------------------------------------
PSI-BLAST ALIGNMENT HEADER: ABBREVIATIONS FOR SUMMARY
SEQLENGTH : 410
ID        : identifier of aligned (homologous) protein
LSEQ2     : length of aligned sequence
IDE       : percentage of pairwise sequence identity
SIM       : percentage of similarity
LALI      : number of residues aligned
LGAP      : number of residues in all indels
BSCORE    : blast score (bits)
BEXPECT   : blast expectation value
OMIM      : OMIM (Online Mendelian Inheritance in Man) ID
PROTEIN   : one-line description of aligned protein
'!'       : indicates lower scoring alignment that is combined
            with the higher scoring adjacent one
PSI-BLAST ALIGNMENT HEADER: SUMMARY

ID                        LSEQ2  IDE  SIM  LALI  LGAP  BSCORE  BEXPECT  PROTEIN
trembl|Q57076|Q57076        410   95   95   410     0     388    e-107  PilG.
trembl|Q9JR97|Q9JR97        410   96   96   410     0     388    e-107  Pilus-assembly protein (P
trembl|Q51105|Q51105        410   95   95   410     0     387    e-106  PilG.
trembl|Q888U1|Q888U1        405   40   63   398     4     360    2e-98  Type IV pilus biogenesis
trembl|Q8P670|Q8P670        418   43   63   395     5     353    3e-96  Fimbrial assembly protein
trembl|O54482|O54482        413   37   62   397     5     351    1e-95  Type IV pilus assembly pr
trembl|Q8PHK9|Q8PHK9        418   41   63   398     5     350    3e-95  Fimbrial assembly protein
trembl|Q9PAI0|Q9PAI0        490   42   62   392     5     348    6e-95  Fimbrial assembly protein
swiss|P45793|TAPC_AERHY     413   38   62   397     5     348    7e-95  Type IV pilus assembly pr
trembl|Q9ZEL5|Q9ZEL5        405   42   64   398     4     347    2e-94  PilC protein.
trembl|Q82WR6|Q82WR6        406   52   69   395     3     347    2e-94  Bacterial type II secreti
trembl|Q9F669|Q9F669        383   41   65   379     3     347    2e-94  Pilin biogenesis protein
trembl|Q87AA5|Q87AA5        420   42   62   392     5     345    6e-94  Fimbrial assembly protein
trembl|Q8EJP7|Q8EJP7        423   40   61   396     5     345    7e-94  Type IV pilus biogenesis
swiss|P22609|PILC_PSEAE     374   45   67   368     3     344    1e-93  Type 4 fimbrial assembly
trembl|Q8XVK4|Q8XVK4        421   48   69   392     1     343    3e-93  Probable fimbrial assembl
trembl|Q87TD5|Q87TD5        405   26   48   395    11     339    3e-92  General secretion pathway
swiss|P31705|GSPF_ERWCA     408   28   50   395    14     339    6e-92  General secretion pathway
swiss|P45780|GSPF_VIBCH     406   26   50   395    12     338    7e-92  General secretion pathway
trembl|Q8DDT2|Q8DDT2        405   26   50   395    11     336    5e-91  Type II secretory pathway
swiss|P31743|GSPF_AERHY     388   27   49   376    11     334    2e-90  General secretion pathway
trembl|Q8DC28|Q8DC28        408   38   57   396     5     329    3e-89  Type IV pilin biogenesis
trembl|Q46524|Q46524        407   38   61   394     6     329    5e-89  FimO.
trembl|Q9RTA2|Q9RTA2        406   33   57   395    12     328    8e-89  Pilin biogenesis protein.
trembl|O32568|O32568        407   26   47   394    13     328    9e-89  ETPF protein.
trembl|Q8EKC7|Q8EKC7        407   31   53   395    13     326    4e-88  General secretion pathway
trembl|Q87LT6|Q87LT6        407   36   57   396     5     325    6e-88  Type IV pilin biogenesis
swiss|Q9X4G9|PILC_VIBCH     408   38   58   395     5     324    1e-87  Type IV pilin assembly pr
trembl|O68432|O68432        406   42   65   392     5     323    3e-87  Pilus assembly protein Pi
swiss|Q00513|GSPF_PSEAE     405   29   49   394    12     322    6e-87  General secretion pathway
trembl|Q8DCI0|Q8DCI0        406   30   53   394    12     321    1e-86  Type II secretory pathway
trembl|Q9ZF84|Q9ZF84        405   29   49   394    12     318    1e-85  General secretory pathway
trembl|Q9LAN5|Q9LAN5        405   29   49   394    12     318    1e-85  GspF.
trembl|O85599|O85599        409   37   56   396     6     317    1e-85  Type IV pilin biogenesis
trembl|Q8XUR9|Q8XUR9        403   30   49   394    10     316    3e-85  Probable general secretor
trembl|Q9F1P9|Q9F1P9        405   28   49   394    12     316    3e-85  GspF.
trembl|Q56739|Q56739        409   37   56   396     6     310    2e-83  VvpC.
trembl|Q8VRM8|Q8VRM8        407   29   51   394    13     309    6e-83  Hypothetical type II secr
trembl|Q8VPC6|Q8VPC6        407   28   51   394    13     308    1e-82  Hypothetical type II secr
swiss|P31704|GSPF_ERWCH     404   26   47   392    16     307    1e-82  General secretion pathway
trembl|Q8VRL3|Q8VRL3        441   33   54   400    13     307    2e-82  Competence protein PilC.
trembl|Q8RAF9|Q8RAF9        403   30   50   394     9     306    4e-82  General secretory pathway
trembl|Q87LB2|Q87LB2        407   28   51   394    12     306    5e-82  MSHA biogenesis protein M
swiss|P41441|GSPF_ECOLI     398   29   50   392     6     305    6e-82  Putative general secretio
trembl|Q8XI36|Q8XI36        401   27   51   394     8     305    7e-82  Probable pilin biogenesis
trembl|Q9I5N8|Q9I5N8        404   27   50   394    11     305    7e-82  Probable type II secretio
trembl|O66951|O66951        408   32   54   395    12     304    1e-81  Fimbrial assembly protein
trembl|Q8FCZ1|Q8FCZ1        398   29   50   392     6     303    3e-81  Putative general secretio
trembl|Q8EA01|Q8EA01        406   29   53   394    10     302    5e-81  MSHA biogenesis protein M
trembl|AAK35047|AAK35047    399   28   53   395     6     301    1e-80  Type II protein secretion
trembl|Q9AGM6|Q9AGM6        399   27   53   395     6     301    1e-80  Putative inner membrane p
trembl|Q8YUB0|Q8YUB0        407   31   53   390    12     299    4e-80  Pilin biogenesis protein.
trembl|Q8RT49|Q8RT49        371   41   61   363     9     298    1e-79  Pilin biogenesis protein
trembl|Q9KUV6|Q9KUV6        407   28   53   394    13     297    2e-79  MSHA biogenesis protein M
trembl|Q9ZFX5|Q9ZFX5        404   30   51   394    11     297    2e-79  Outer membrane secretion
trembl|Q8ZBI3|Q8ZBI3        399   30   52   393     1     296    6e-79  Putative type II secretio
swiss|P15745|GSPF_KLEPN     381   25   45   375    27     294    2e-78  General secretion pathway
trembl|Q8F3M8|Q8F3M8        408   25   50   392    13     292    8e-78  General secretory pathway
trembl|Q88Q63|Q88Q63        402   30   55   393     4     291    1e-77  Type IV pili biogenesis p
trembl|Q8P5B8|Q8P5B8        398   26   47   392     6     290    3e-77  Type II secretion system
trembl|Q9ABQ1|Q9ABQ1        403   28   53   392    11     287    2e-76  General secretion pathway
trembl|Q9PD61|Q9PD61        405   28   48   394    12     285    1e-75  General secretory pathway
trembl|Q8PPI9|Q8PPI9        402   28   49   393     8     284    1e-75  Type II secretion system
trembl|Q87DF2|Q87DF2        405   27   48   394    12     284    2e-75  General secretory pathway
trembl|Q8GBE4|Q8GBE4        405   23   43   391    18     284    2e-75  Yts1F protein.
trembl|Q8Z9F8|Q8Z9F8        400   25   47   394     3     282    5e-75  Protein transport protein
trembl|Q8PGS3|Q8PGS3        391   28   48   380    11     282    5e-75  General secretion pathway
trembl|Q8ZRT4|Q8ZRT4        400   25   47   394     3     282    8e-75  Putative component in typ
swiss|P31744|GSPF_XANCP     391   27   48   380    11     281    1e-74  General secretion pathway
swiss|P36646|HOFC_ECOLI     400   25   47   394     3     280    3e-74  Protein transport protein
swiss|P36641|PILC_PSEPU     401   31   57   393     3     279    4e-74  Type 4 fimbrial assembly
trembl|Q52293|Q52293        400   27   47   394     6     279    4e-74  XcpS protein.
trembl|Q88P05|Q88P05        400   28   47   394     6     278    1e-73  Type II secretion pathway
trembl|Q83SM8|Q83SM8        400   26   47   394     3     277    2e-73  Putative integral membran
trembl|AAP15649|AAP15649    400   26   47   394     3     277    2e-73  Putative integral membran
trembl|Q8X977|Q8X977        400   26   47   394     3     277    3e-73  Putative integral membran
trembl|Q56612|Q56612        363   29   54   352    13     277    3e-73  MshG (Fragment).
trembl|Q8FL54|Q8FL54        400   25   47   394     3     276    5e-73  Protein transport protein
trembl|CAD73219|CAD73219    404   26   50   389    12     272    7e-72  Type 4 fimbrial assembly
trembl|Q88HD6|Q88HD6        403   25   48   384     8     259    6e-68  Type II secretion pathway
trembl|Q8KRY4|Q8KRY4        407   24   48   394    13     258    1e-67  PilC.
trembl|Q8CZX4|Q8CZX4        348   26   50   345     1     254    2e-66  Putative general protein
trembl|Q821J4|Q821J4        391   24   48   384     5     250    4e-65  General secretion pathway
trembl|Q9X2W2|Q9X2W2        395   24   47   383     9     249    8e-65  XcpS.
trembl|Q87ZV6|Q87ZV6        400   28   48   394     8     248    9e-65  General secretion pathway
trembl|O84574|O84574        391   25   47   384     5     247    2e-64  GEN. secretion protein F.
trembl|Q82U86|Q82U86        394   22   47   390     4     247    2e-64  Bacterial type II secreti
trembl|Q97HA8|Q97HA8        396   20   44   393     6     247    3e-64  General secretion pathway
trembl|Q9JRM8|Q9JRM8        407   23   47   395    13     246    4e-64  PilC.
trembl|Q9Z789|Q9Z789        391   26   47   384     5     246    5e-64  General secretion protein
trembl|AAP98775|AAP98775    391   26   47   384     5     246    5e-64  XcpS.
swiss|P44621|HOFC_HAEIN     406   24   50   393    13     243    3e-63  Protein transport protein
trembl|Q9PJH0|Q9PJH0        391   23   46   384     5     240    3e-62  General secretion pathway
trembl|Q8KIK8|Q8KIK8        409   26   45   366    16     229    7e-59  LsdF precursor.
trembl|CAD77776|CAD77776    479   27   52   340    32     228    9e-59  Type IV fimbrial assembly
trembl|Q8XX09|Q8XX09        400   22   40   374     9     228    1e-58  Putative GSPF-related tra
trembl|Q9WXU9|Q9WXU9        399   24   47   388     9     227    2e-58  General secretion pathway
trembl|O67317|O67317        394   25   46   383    14     224    2e-57  Fimbrial assembly protein
trembl|Q83EZ9|Q83EZ9        351   30   55   346     0     223    4e-57  Type IV-A pilus assembly
trembl|Q82V46|Q82V46        406   21   42   392    13     222    1e-56  Bacterial type II secreti
trembl|AAP77714|AAP77714    414   24   47   373    10     219    7e-56  Hypothetical protein.
trembl|CAD79205|CAD79205    387   24   47   343     2     218    1e-55  General secretion pathway
trembl|Q89HH1|Q89HH1        403   24   45   361    10     217    2e-55  Bll6020 protein.
trembl|AAP87276|AAP87276    393   18   43   390     6     212    7e-54  CtsF.
trembl|AAP87275|AAP87275    392   18   43   390     6     211    1e-53  CtsF.
trembl|Q8XTG2|Q8XTG2        395   22   45   390     5     211    2e-53  Probable general secretio
trembl|Q9I0G2|Q9I0G2        395   22   44   391     5     210    4e-53  Probable type II secretio
trembl|Q8XSK5|Q8XSK5        397   20   40   392     6     209    5e-53  Putative general secretio
trembl|AAP95988|AAP95988    397   24   47   395     4     203    3e-51  Protein transport protein
trembl|Q988A9|Q988A9        405   20   41   392    13     201    2e-50  General secretion protein
trembl|Q8RHE3|Q8RHE3        346   22   49   344     1     199    7e-50  General secretion pathway
trembl|P74465|P74465        298   31   58   264     0     195    7e-49  Pilin biogenesis protein.
trembl|Q9CPF7|Q9CPF7        380   24   49   370    13     195    8e-49  HofC.
trembl|EAA24563|EAA24563    346   22   48   344     1     194    2e-48  General secretion pathway
trembl|Q818L1|Q818L1        345   15   40   342     5     194    2e-48  ComG operon protein 2.
trembl|Q894E5|Q894E5        351   21   45   343     6     192    1e-47  Putative general secretio
trembl|Q81LZ2|Q81LZ2        343   15   39   338     5     183    4e-45  ComG operon protein 2.
trembl|Q8DMJ7|Q8DMJ7        233   33   57   230     1     176    4e-43  PilC protein.
trembl|Q92C08|Q92C08        343   16   39   340    10     173    4e-42  ComGB protein.
trembl|Q8CP29|Q8CP29        355   14   36   323     5     171    1e-41  DNA transport machinery p
trembl|Q8Y7D7|Q8Y7D7        343   15   38   340    10     165    7e-40  ComGB protein.
trembl|Q8XJB7|Q8XJB7        339   18   41   326    10     164    2e-39  Probable fimbrial assembl
trembl|Q8VQ72|Q8VQ72        322   14   36   318     9     160    3e-38  Late competence protein C
swiss|Q9K920|CMGB_BACHD     323   19   41   322     7     159    8e-38  ComG operon protein 2 hom
trembl|CAD77260|CAD77260    346   17   41   293    14     151    2e-35  General secretion pathway
swiss|P25954|CMGB_BACSU     323   16   39   323     5     143    4e-33  ComG operon protein 2.
trembl|Q93I66|Q93I66        341   16   40   306    12     131    2e-29  CofI.
trembl|Q9CDT9|Q9CDT9        357   16   38   326    23     128    1e-28  Competence protein ComGB.
trembl|Q8E7J4|Q8E7J4        363   15   35   309    21     125    1e-27  Hypothetical protein.
trembl|Q8E235|Q8E235        282   15   36   278    17     125    1e-27  Competence protein CglB.
swiss|P29487|TCPE_VIBCH     340   16   37   286    12     123    5e-27  Toxin coregulated pilus b
trembl|AAK20793|AAK20793    340   16   37   286    12     123    5e-27  Toxin-coregulated pilus b
trembl|Q9ZF71|Q9ZF71        186   37   50   162     0     121    2e-26  PilC (Fragment).
trembl|Q833B9|Q833B9        348   17   37   329    19     121    2e-26  Competence protein.
trembl|Q99TV1|Q99TV1        356   14   40   343     5     119    5e-26  Hypothetical protein SAV1
trembl|Q8CXD8|Q8CXD8        346   15   40   337    10     116    5e-25  DNA transport machinery p
trembl|Q47071|Q47071        352   21   41   289    30     115    9e-25  BFPE.
trembl|Q47021|Q47021        179   26   48   149     1     114    2e-24  Hypothetical protein.
trembl|Q47020|Q47020        276   22   43   150     1     114    2e-24  Hypothetical protein.
trembl|Q8DS50|Q8DS50        282   15   32   280    17     114    3e-24  Putative ABC transporter
trembl|O06667|O06667        282   18   35   280    17     113    6e-24  Putative ABC transporter
trembl|Q9R726|Q9R726        156   28   50   146    10     112    8e-24  XpsF (Fragment).
trembl|O86278|O86278        348   16   37   326    24     112    1e-23  Orf348 protein.
trembl|Q83Z76|Q83Z76        324   17   41   289    12     110    4e-23  CfcI.
trembl|Q8P2Y0|Q8P2Y0        363   15   38   326    19     108    2e-22  Putative competence prote
trembl|Q9A1T8|Q9A1T8        282   16   38   281    17     106    5e-22  Putative competence prote
trembl|Q879Q5|Q879Q5        282   16   38   281    17     106    5e-22  Putative competence prote
trembl|Q8K8V9|Q8K8V9        344   15   38   326    19     106    7e-22  Putative ABC transporter
trembl|Q8DMJ6|Q8DMJ6        166   29   50   134    10     105    2e-21  PilC protein.
trembl|P74464|P74464        165   29   48   134     9     104    2e-21  General secretion pathway
trembl|CAD75879|CAD75879    357   13   32   234    12     104    3e-21  Probable type IV pilus as
trembl|AAP84206|AAP84206    359   18   36   260    17     103    4e-21  PilR2.
trembl|O85194|O85194        290   17   36   286    17     100    4e-20  Competence protein (Compe
trembl|Q8DN87|Q8DN87        363   17   36   333    25     100    4e-20  Competence protein.
trembl|Q841M8|Q841M8        113   32   58   106     0      98    2e-19  Hypothetical protein (Fra
trembl|CAD75878|CAD75878    362   20   44   141     2      92    1e-17  Probable type IV pilin bi
trembl|Q9XD72|Q9XD72        126   23   49   126     0      91    2e-17  LspF (Fragment).
trembl|Q88V36|Q88V36        349   14   33   329    29      70    4e-11  ComG operon protein 2.
trembl|Q93AE7|Q93AE7        366   16   38   271    17      68    2e-10  Integral membrane protein
trembl|Q93D63|Q93D63        365   16   38   270    18      62    1e-08  PilR.
trembl|Q9F537|Q9F537        369   15   36   285    21      58    1e-07  PilR protein.
trembl|O07376|O07376        365   15   33   261    20      58    3e-07  Integral membrane protein
trembl|AAP86096|AAP86096    346   19   36   245    25      55    1e-06  Putative component of typ
trembl|Q9ZIV0|Q9ZIV0        361   17   35   264    16      54    2e-06  PilR (Putative membrane p
trembl|EAA02065|EAA02065    250   14   34   234    25      46    0.001  EbiP618 (Fragment).
trembl|Q8KR26|Q8KR26        369   13   34   250    18      39     0.10  Integral membrane protein
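
For downstream filtering it can help to turn a summary row into typed fields. The sketch below assumes each hit is available as one whitespace-separated row in the column order of the header (ID, LSEQ2, IDE, SIM, LALI, LGAP, BSCORE, BEXPECT, PROTEIN); `Hit` and `parse_row` are hypothetical names, not part of PredictProtein. One detail worth handling explicitly: BLAST prints very small expectation values without a mantissa (for example `e-107`), which `float()` rejects unless a leading `1` is added.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    ident: str      # database|accession|name
    lseq2: int      # length of aligned sequence
    ide: int        # % pairwise identity
    sim: int        # % similarity
    lali: int       # residues aligned
    lgap: int       # residues in all indels
    bscore: int     # blast score (bits)
    bexpect: float  # blast expectation value
    protein: str    # one-line description (truncated in the report)


def parse_row(row: str) -> Hit:
    """Parse one summary row; the description may contain spaces."""
    parts = row.split(None, 8)
    ident, lseq2, ide, sim, lali, lgap, bscore, bexpect = parts[:8]
    protein = parts[8].strip() if len(parts) > 8 else ""
    if bexpect.startswith("e"):  # "e-107" means 1e-107
        bexpect = "1" + bexpect
    return Hit(ident, int(lseq2), int(ide), int(sim), int(lali),
               int(lgap), int(bscore), float(bexpect), protein)


hit = parse_row("swiss|P45793|TAPC_AERHY 413 38 62 397 5 348 7e-95 Type IV pilus assembly pr")
top = parse_row("trembl|Q57076|Q57076 410 95 95 410 0 388 e-107 PilG.")
print(hit)
print(top.bexpect)
```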

--- PSI-BLAST ALIGNMENT

MAXHOM alignment header
--- ------------------------------------------------------------
--- MAXHOM multiple sequence alignment
--- ------------------------------------------------------------
---
--- MAXHOM ALIGNMENT HEADER: ABBREVIATIONS FOR SUMMARY
--- ID     : identifier of aligned (homologous) protein
--- STRID  : PDB identifier (only for known structures)
--- IDE    : percentage of pairwise sequence identity
--- WSIM   : percentage of weighted similarity
--- LALI   : number of residues aligned
--- NGAP   : number of insertions and deletions (indels)
--- LGAP   : number of residues in all indels
--- LSEQ2  : length of aligned sequence
--- ACCNUM : SwissProt accession number
--- OMIM   : OMIM (Online Mendelian Inheritance in Man) ID
--- NAME   : one-line description of aligned protein
---
--- MAXHOM ALIGNMENT HEADER: SUMMARY
ID          STRID  IDE  WSIM  LALI  NGAP  LGAP  LSEQ2  ACCNUM  NAME
pilc_pseae          46    63   398     2     4    406  P22609  FIMBRIAL ASSEMBLY PROTEIN
pilc_vibch          41    56   394     2     5    408  Q9X4G9  pilC).
tapc_aerhy          40    56   408     2     5    413  P45793  TYPE IV PILUS ASSEMBLY PR
pilc_psepu          33    51   399     2     3    401  P36641  FIMBRIAL ASSEMBLY PROTEIN
gspf_pseae          32    47   384     2    11    405  Q00513  GENERAL SECRETION PATHWAY
gspf_erwca          31    47   385     2    13    408  P31705  OUTF).
gspf_ecoli          31    46   385     2     5    398  P41441  PROTEIN HOFF).
gspf_aerhy          31    45   370     3    13    388  P31743  GENERAL SECRETION PATHWAY
gspf_xancp          30    42   379     3    12    390  P31744  GENERAL SECRETION PATHWAY
gspf_vibch          28    44   385     2    11    406  P45780  EPSF).
gspf_klepn          28    42   365     3    26    381  P15745  PULF).
gspf_erwch          28    41   383     4    15    404  P31704  OUTF).
hofc_ecoli          28    38   395     2     2    400  P36646  PROTEIN TRANSPORT PROTEIN
hofc_haein          27    38   393     5    14    406  P44621  PROTEIN TRANSPORT PROTEIN
---
--- MAXHOM ALIGNMENT: IN MSF FORMAT
---
--- Version of database searched for alignment:
--- SWISS-PROT release 41 (02/2003) with 122 564 proteins
---

MAXHOM alignment
Identities computed with respect to: (1) predict_h3120
Colored by: consensus/70% and property

    ID              identity  alignment (from column 1)
 1  predict_h3120    100.0%   MAKNGGFSLFAKKEKRFIFEGRHSASDKLVNGEVSAFTEEEARKKLAKRGIRPLQITRVKTSSKRKITQ
 2  pilc_pseae        45.9%   ----------ALKTSVFIWEGTDKKGAK-VKGELTGQNPMLVKAHLRKQGINPLKVRKKGISlgKKVKP
 3  pilc_vibch        41.0%   --------------KNYRWKG-INSNGKKVSGQMLAISEIEVRDKLKDQHIQIKKLKKGSVSLLARLTh
 4  tapc_aerhy        39.6%   MATLTQKQNAPKKVFAFRWSGVNRKGQK-VSGELQADSINTVKAELRKQGVNVTKVSKKssKGGAKIKP
 5  pilc_psepu        33.0%   ---------MNPSIRLYAWQG-TNADGLAVSGQMAGRSPAYVRAGLLRQGILVARLRPAGRAWrkRREK
 6  gspf_pseae        31.2%   ------------------------PSGRQQKGVLEADSARQVRQLLRERQLAPLDVKPTRTREqrGLSA
 7  gspf_erwca        31.1%   ------------------------AQGKKCRGTQEADSARQARQLLRERGLVPLSVDENrlRRKIRLST
 8  gspf_ecoli        30.8%   -----------------------TQDGQKLQGIIDANDERQARLRLREEGLFLLDIRPQKSSgrPRISH
 9  gspf_aerhy        29.8%   -------------------------------------SARQVRQQLREQGLTPLEVNETTEKAkrGAST
10  gspf_xancp        29.1%   ----------------------------MLDGQMEAANDAEVALRLQEQGHLPVETRLATGEnkKPFDN
11  gspf_vibch        28.8%   ------------------------AKGRHKKGVIEGDNARQVRQRLKEQSLVPMEVVETQVKakRGIST
12  gspf_klepn        26.9%   ------------------------EQGKPRRGVQQADSARHARQLLREKGWLALDIDPAAGGGrrRTSA
13  gspf_erwch        28.4%   -----------------------NAQGKKSQGMQEADSARHARQLLREKGLVPVKIEEQRGEarSHRIA
14  hofc_ecoli        28.2%   -------------SKQLWRWHGITGDGNAQDGMLWAESRTLLLMALQQQMVTPLSLKRIAINS-AQWRG
15  hofc_haein        27.0%   -------------TKKLFYYQASNPLNQKQKGSIIADTKQQAHFQLISRGLTHIKLQQ-NWQFGAKPKN
    consensus/100%            .....................................s...hh..L.pp.h..hph...t.t.......
    consensus/90%             ..............................pG...u.s...st..L.ppth..hplp..t.t...thp.
    consensus/80%                       ...............tsp..pG...utsttps+.hLpcpulhshclp.tthp.ttthpt
    consensus/70%                          ...........ttsp..pG.htApotpps+ttLpcpGlhslclp.tphptttphps

COILS prediction (A Lupas)
--- COILS HEADER: SUMMARY
COILS version 2.2: R.B. Russell, A.N. Lupas, 1999
using MTIDK matrix.
weights: a,d=2.5 and b,c,e,f,g=1.0
For the threshold of 5 ( probability > 0.5):
>prot
window size = 14   14 residues in coiled coil domain
window size = 21    0 residues in coiled coil domain
window size = 28    0 residues in coiled coil domain

              .    :    .    :    .    :    .    :    .    5
seq       MAKNGGFSLFAKKEKRFIFEGRHSASDKLVNGEVSAFTEEEARKKLAKRG
frame-14  aabcdefaabcdefabcdefgabcdefgabcdefaabcdefgabcdefgg
frame-21  aabcdefabcdefgabcdefgabcdefgabcdefgaabcdefgabcdefg
frame-28  abcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgg
prob-14   --------------------------------------------------
prob-21   --------------------------------------------------
prob-28   --------------------------------------------------

              .    :    .    :    .    :    .    :    .   10
seq       IRPLQITRVKTSSKRKITQEDITVFTRQLSTMIKAGLPLMQAFEIVARGH
frame-14  abcdefgabcdefgabcdefgabcdefgabcdefgefgabcdefgabcde
frame-21  abcdefgabcdefgabcdefgabcdefgabcdefgabcdefgefabcdef
frame-28  abcdefgabcdefgabcdefgabcdefgabcdefgabcdefgababcdef
prob-14   --------------------------------------------------
prob-21   --------------------------------------------------
prob-28   --------------------------------------------------

              .    :    .    :    .    :    .    :    .   15
seq       GNPSMTEMLMEIRGEVEQGSSLSRAFSNHPKYFDRFYCNLVAAGETGGVL
frame-14  fabcabcdefgabcdefgdefabcdefgabcdefgdefgabcabcdefga
frame-21  gabcabcdefgabcdefgabcdefgdefgbcdabcdefgabcabcdefga
frame-28  gabcdefgabcdefgabcdefgabcdefgefgabcdefgabcabcdefga
prob-14   -------------------------------------------------9
prob-21   -------------------------------------------------2
prob-28   -------------------------------------------------1

              .    :    .    :    .    :    .    :    .   20
seq       ESLLDKLAIYKEKTQAIRKKVKTALTYPVSVIAVAIGLVFVMMIFVLPAF
frame-14  bcdefgabcdefgdefgefgefggfggefgaabcdefgabcdeabcabcd
frame-21  bcdefgabcdefgabcdefgdefgefgefgefgcdefgaabcdefgabcd
frame-28  bcdefgabcdefgabcdefgabcdefgefgefgdefgfgabcdefgabcd
prob-14   9999999999999-------------------------------------
prob-21   22222222222222222222------------------------------
prob-28   111111111111111111111111111-----------------------

              .    :    .    :    .    :    .    :    .   25
seq       KEVYANMGAELPALTQTVMDMSDFFVSYGWMVLIALGFAIYGFLKLKARS
frame-14  efgabcdefgabcdefgabcdefgefgdefgeabcabcdeabcdefgabc
frame-21  efgabcdefgabcdefgabcdefgabcdefgeaababcdefgabcdefga
frame-28  efgabcdefgabcdefgabcdefgabcdefgabcdabcdefgabcdefga
prob-14   --------------------------------------------------
prob-21   --------------------------------------------------
prob-28   --------------------------------------------------

              .    :    .    :    .    :    .    :    .   30
seq       IKIQRRMDAILLRMPIFGDIVRKGTIARWGRTTATLIAAGVPLVDVLDST
frame-14  defgefgbcdefggbcdefgabcdeabcdefgabcdefgefgaabcdefg
frame-21  bcdefgabcdefggbcdefgabcdefgcdefgabcabcdefgaabcdefg
frame-28  bcdefgabcdefgabcdefgabcdefgcdefgabcabcdefgaabcdefg
prob-14   --------------------------------------------------
prob-21   --------------------------------------------------
prob-28   --------------------------------------------------

              .    :    .    :    .    :    .    :    .   35
seq       AGAAGNLIYEEATREIRTRVIQGLSMTSGMRATELFPNMMLQMSSIGEES
frame-14  abcdefgeabcdefgabcdefgcdefgefgdabcdabcdefgabcabcde
frame-21  abcdefgabcdefgbcdefgabcdefgeabcdefgabcabcdefgabcde
frame-28  abcdefgabcdefgabcdefgbcdefgfabcabcdefgabcdabcdefga
prob-14   --------------------------------------------------
prob-21   --------------------------------------------------
prob-28   --------------------------------------------------

              .    :    .    :    .    :    .    :    .   40
seq       GSLDDMLNKAAEFYEDEVDNAVGRLSAMMEPIIIVILGLVIGTLLVAMYL
frame-14  fgabcdefgaabcdefgabcdefgabcdefgdeabcdefgabcdefgdef
frame-21  fgaabcdefgabcdefgabcdefgabcdefgdefgabcdefgcdefgefg
frame-28  bcdefgabcdefgabcdefgdefgabcdefgabcdefgdefgefgdefge
prob-14   --------------------------------------------------
prob-21   --------------------------------------------------
prob-28   --------------------------------------------------

              .    :    .    :    .    :    .    :    .   45
seq       PLFNLGNVVA
frame-14  gabcdefggg
frame-21  gabcdefgfg
frame-28  fgdefgdefg
prob-14   ----------
prob-21   ----------
prob-28   ----------
// End
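
The summary counts ("14 residues in coiled coil domain" for window 14, 0 for windows 21 and 28) follow from the per-residue `prob-*` rows. The sketch below assumes each residue's probability is printed as a single digit in tenths, with `-` where nothing is reported, and that "threshold of 5" means a digit strictly greater than 5; neither convention is spelled out in the report. The helper name `coiled_residues` is illustrative.

```python
def coiled_residues(prob_line, threshold=5):
    """Count residues whose one-digit probability exceeds the threshold."""
    return sum(1 for ch in prob_line if ch.isdigit() and int(ch) > threshold)


# Per-residue rows for the 410-residue query, rebuilt from the blocks above:
# window 14 scores 9 for residues 150-163, window 21 scores 2 for 150-170,
# window 28 scores 1 for 150-177; all other residues are '-'.
prob14 = "-" * 149 + "9" * 14 + "-" * (410 - 163)
prob21 = "-" * 149 + "2" * 21 + "-" * (410 - 170)
prob28 = "-" * 149 + "1" * 28 + "-" * (410 - 177)

for name, line in (("14", prob14), ("21", prob21), ("28", prob28)):
    print("window size =", name, coiled_residues(line))
```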

PHD information about accuracy
****************************************************************************
*
*    PHD: Profile fed neural network systems from HeiDelberg
*    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*
*    Prediction of:
*        secondary structure,                   by PHDsec
*        solvent accessibility,                 by PHDacc
*        and helical transmembrane regions,     by PHDhtm
*
*    Author:
*        Burkhard Rost
*        EMBL, 69012 Heidelberg, Germany
*        Internet: Rost@EMBL-Heidelberg.DE
*
*    All rights reserved.
*
****************************************************************************
*
*    The network systems are described in:
*
*    PHDsec:    B Rost & C Sander: JMB, 1993, 232, 584-599.
*               B Rost & C Sander: Proteins, 1994, 19, 55-72.
*    PHDacc:    B Rost & C Sander: Proteins, 1994, 20, 216-226.
*    PHDhtm:    B Rost et al.: Prot. Science, 1995, 4, 521-533.
*
****************************************************************************

PHD predictions

PHD predictions for predict_h31294
Different levels of data:
1. PHD brief
2. PHD normal

PHDhtm summary
NHTM=4
PHDhtm detected 4 membrane helices for the best model. The second best model contained 3 helices.
TOP=in
PHDhtm predicted the topology in, i.e. the first loop region is in (Note: this prediction may be
problematic when the sequence you sent starts or ends with a region predicted in a membrane helix!)
Reliability of best model=1 (0 is low, 9 is high)
Zscore for best model=0.920
Difference of positive charges (K+R) inside - outside=-8.931 (the higher the value, the more reliable)
Reliability of topology prediction =8 (0 is low, 9 is high)
Details of the strength of each predicted membrane helix:
(sorted by strength, strongest first)
N HTM   Total score   Best HTM   c-N
1       0.7866        0.9100     380 - 401
2       0.8267        0.8913     177 - 197
3       0.8657        0.8808     224 - 244
4       0.8765        0.6006     269 - 290

Overview over transmembrane segments:
Positions   Segments   Explain
  1- 176    i1         inside region 1
177- 197    M1         membrane helix 1
198- 223    o1         outside region 1
224- 244    M2         membrane helix 2
245- 268    i2         inside region 2
269- 290    M3         membrane helix 3
291- 379    o2         outside region 2
380- 401    M4         membrane helix 4
402- 410    i3         inside region 3
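
The segment overview can be expanded into a per-residue topology string in the `PiMohtm` alphabet used later in the report (`T` = transmembrane helix, `i` = inside, `o` = outside). This is a minimal sketch using the positions from the table above; the names `SEGMENTS` and `topology_string` are illustrative, and the contiguity check is an added safeguard rather than something PHDhtm itself documents.

```python
# (start, end, symbol) taken from "Overview over transmembrane segments".
SEGMENTS = [
    (1, 176, "i"),
    (177, 197, "T"),
    (198, 223, "o"),
    (224, 244, "T"),
    (245, 268, "i"),
    (269, 290, "T"),
    (291, 379, "o"),
    (380, 401, "T"),
    (402, 410, "i"),
]


def topology_string(segments):
    """Expand contiguous 1-based segments into one symbol per residue."""
    out = []
    expected = 1
    for start, end, symbol in segments:
        if start != expected or end < start:
            raise ValueError(f"segments not contiguous at {start}-{end}")
        out.append(symbol * (end - start + 1))
        expected = end + 1
    return "".join(out)


topo = topology_string(SEGMENTS)
print(len(topo), topo.count("T"))
```

The first 87 symbols are all `i`, which agrees with the `PiMohtm` row shown for residues 1-87 in the brief results.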


Residue composition for your protein:
%A: 9.8 %C: 0.2 %D: 3.4 %E: 6.6 %F: 5.1
%G: 7.8 %H: 0.7 %I: 7.6 %K: 6.1 %L: 10.2
%M: 5.6 %N: 2.9 %P: 2.9 %Q: 2.4 %R: 6.1
%S: 6.1 %T: 5.8 %V: 7.6 %W: 0.5 %Y: 2.4
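
The composition table is a simple percentage count over the 410-residue query. As a sketch, the code below recomputes it with `collections.Counter`, assuming the sequence as printed in the COILS section of this report and rounding to one decimal as the table does; `SEQ` and `composition` are illustrative names.

```python
from collections import Counter

# Query sequence (410 residues) as printed in the COILS section of this report.
SEQ = (
    "MAKNGGFSLFAKKEKRFIFEGRHSASDKLVNGEVSAFTEEEARKKLAKRG"
    "IRPLQITRVKTSSKRKITQEDITVFTRQLSTMIKAGLPLMQAFEIVARGH"
    "GNPSMTEMLMEIRGEVEQGSSLSRAFSNHPKYFDRFYCNLVAAGETGGVL"
    "ESLLDKLAIYKEKTQAIRKKVKTALTYPVSVIAVAIGLVFVMMIFVLPAF"
    "KEVYANMGAELPALTQTVMDMSDFFVSYGWMVLIALGFAIYGFLKLKARS"
    "IKIQRRMDAILLRMPIFGDIVRKGTIARWGRTTATLIAAGVPLVDVLDST"
    "AGAAGNLIYEEATREIRTRVIQGLSMTSGMRATELFPNMMLQMSSIGEES"
    "GSLDDMLNKAAEFYEDEVDNAVGRLSAMMEPIIIVILGLVIGTLLVAMYL"
    "PLFNLGNVVA"
)


def composition(seq):
    """Return {residue: percentage of sequence}, rounded to one decimal."""
    counts = Counter(seq)
    return {aa: round(100.0 * n / len(seq), 1) for aa, n in sorted(counts.items())}


comp = composition(SEQ)
for aa, pct in comp.items():
    print(f"%{aa}: {pct}")
```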

AA :

amino acid sequence

pH_sec:

'probability' for assigning helix (1=high, 0=low)

pL_sec:

'probability' for assigning neither helix, nor strand (1=high,
0=low)

PHD_htm: PHD predicted membrane helix: M=helical transmembrane region,
blank=non-membrane
PHD = PHD: Profile network prediction HeiDelberg
PHDrhtm: refined PHD prediction: M=helical transmembrane region,
blank=non-membrane
PiMohtm: PHD prediction of membrane topology: T=helical transmembrane region,
i=inside of membrane, o=outside of membrane
pT_htm:

'probability' for assigning transmembrane helix

pN_htm:

'probability' for assigning globular region

PHD results (brief)

         ....,....1....,....2....,....3....,....4....,....5....,....6....,....7....,....8....,..
AA       MAKNGGFSLFAKKEKRFIFEGRHSASDKLVNGEVSAFTEEEARKKLAKRGIRPLQITRVKTSSKRKITQEDITVFTRQLSTMIKAGL
PHD_htm
PiMohtm  iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii

PHD results (normal)

         ....,....1....,....2....,....3....,....4....,....5....,....6....,....7....,....8....,..
AA       MAKNGGFSLFAKKEKRFIFEGRHSASDKLVNGEVSAFTEEEARKKLAKRGIRPLQITRVKTSSKRKITQEDITVFTRQLSTMIKAGL
PHD_htm
PHDrhtm
PiMohtm  iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii

Ambivalent Sequence Predictor (Malin Young, Kent Kirshenbaum, Stefan Highsmith)
TOP - BOTTOM - ASP

Ambivalent Sequence Predictor (ASP v1.0) mmy

PROF predictions

TOP - BOTTOM - PROF

PROF predictions for query
Contents:
SYNOPSIS of prediction for query
1. secondary structure class
2. secondary structure composition
3. surface/core ratio
HEADER information
1. about protein
2. about alignment
3. residue composition
4. about method used
5. please quote
6. copyright & contact
7. abbreviations used
BODY with predictions
Different levels of data:
1. PROF normal
2. PROF detail

SYNOPSIS of prediction for query

PROFsec summary: overall, your protein can be classified as
'mixed', given the following classes:

'all-alpha': %H > 45% AND %E < 5%
'all-beta': %H < 5% AND %E > 45%
'alpha-beta': %H > 30% AND %E > 20%
'mixed': all others

Predicted secondary structure composition for your protein:
sec str type   H       E      L
% in protein   73.41   5.12   21.46
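
The class assignment can be reproduced from the helix (%H) and strand (%E) percentages using the four class definitions listed above. The sketch below is an illustration only; the function name `secondary_structure_class` is an assumption and not part of PROFsec.

```python
def secondary_structure_class(pct_h, pct_e):
    """Apply the four class definitions from the PROFsec synopsis."""
    if pct_h > 45 and pct_e < 5:
        return "all-alpha"
    if pct_h < 5 and pct_e > 45:
        return "all-beta"
    if pct_h > 30 and pct_e > 20:
        return "alpha-beta"
    return "mixed"


# Values from the composition table: %H = 73.41, %E = 5.12.
print(secondary_structure_class(73.41, 5.12))
```

With %E = 5.12 the protein narrowly misses the 'all-alpha' condition (%E < 5%), which is why it falls into 'mixed' despite being predicted as predominantly helical.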

Predicted solvent accessibility composition (core/surface ratio) for your protein:
Classes used:
e: residues exposed with more than 16% of their surface
b: all other residues.
The subsets are for the fractions of residues predicted at higher levels of reliability, i.e. accuracy. This set
covers 36% of all residues.
accessib type           b       e
% in protein            58.05   41.95

sub...: accessib type   b       e
...set: % in subset     83.11   16.89

HEADER information

About your protein:
prot_id   query

prot_nres 410
prot_nali 121
prot_nchn 1
prot_nfar 106

About the alignment used:
ali_orig /home/phd/server/work/predict_h31294.hsspPsiFil
Residue composition for your protein:
%A: 9.8 %C: 0.2 %D: 3.4 %E: 6.6 %F: 5.1
%G: 7.8 %H: 0.7 %I: 7.6 %K: 6.1 %L: 10.2

%M: 5.6 %N: 2.9 %P: 2.9 %Q: 2.4 %R: 6.1
%S: 6.1 %T: 5.8 %V: 7.6 %W: 0.5 %Y: 2.4
About the PROF methods used:
prof_fpar acc=/home/phd/server/pub/prof/net/PROFboth_best.par
prof_nnet acc=6
Please quote:
1. PROF: B Rost & C Sander (1993) J Mol Biol, 232:584-599
2. PROFhtm: B Rost, P Fariselli & R Casadio (1996) Prot Science, 5:1704-1718
Copyright & Contact:
Burkhard Rost, CUBIC NYC / LION Heidelberg
Email: rost@columbia.edu
WWW: http://cubic.bioc.columbia.edu
Fax: +1-212-305 3773

ABBREVIATIONS used:
AA      : amino acid sequence
OBS_sec : observed secondary structure: H=helix, E=extended (sheet), blank=other (loop)
PROF_sec: PROF predicted secondary structure: H=helix, E=extended (sheet), blank=other (loop)
          PROF = PROF: Profile network prediction HeiDelberg
Rel_sec : reliability index for PROFsec prediction (0=low to 9=high)
          Note: for the brief presentation strong predictions marked by '*'
SUB_sec : subset of the PROFsec prediction, for all residues with an expected average accuracy > 82% (tables in header)
          NOTE: for this subset the following symbols are used:
          L: is loop (for which above ' ' is used)
          .: means that no prediction is made for this residue, as the reliability is: Rel < 5
pH_sec  : 'probability' for assigning helix (1=high, 0=low)
pE_sec  : 'probability' for assigning strand (1=high, 0=low)
pL_sec  : 'probability' for assigning neither helix, nor strand (1=high, 0=low)
O_2_acc : observed relative solvent accessibility (acc) in 2 states: b = 0-16%, e = 16-100%.
P_2_acc : PROF predicted relative solvent accessibility (acc) in 2 states: b = 0-16%, e = 16-100%.
O_3_acc : observed relative solvent accessibility (acc) in 3 states: b = 0-9%, i = 9-36%, e = 36-100%.

P_3_acc : PROF predicted relative solvent accessibility (acc) in 3 states: b = 0-9%, i = 9-36%, e = 36-100%.
OBS_acc : observed relative solvent accessibility (acc) in 10 states: a value of n (=0-9) corresponds to a relative acc. of between n*n % and (n+1)*(n+1) % (e.g. for n=4: 16-25%).
PROF_acc: PROF predicted relative solvent accessibility (acc) in 10 states: a value of n (=0-9) corresponds to a relative acc. of between n*n % and (n+1)*(n+1) % (e.g. for n=4: 16-25%).
Rel_acc : reliability index for PROFacc prediction (0=low to 9=high)
          Note: for the brief presentation strong predictions marked by '*'
SUB_acc : subset of the PROFacc prediction, for all residues with an expected average correlation > 0.69 (tables in header)
          NOTE: for this subset the following symbols are used:
          I: is intermediate (for which above ' ' is used)
          .: means that no prediction is made for this residue, as the reliability is: Rel < 4
ali_orig : input file
prof_fpar: name of parameter file, used [w]
prof_nnet: number of networks used for prediction [d]
prof_skip: note: sequence stretches with fewer than 9 residues are not predicted, the symbol '*' is used!
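
The accessibility encodings can be made concrete with a short sketch. A 10-state value n covers the interval from n*n % to (n+1)*(n+1) %, so n=4 maps to 16-25% and n=5 to 25-36%. The function names below are illustrative assumptions, and treating the boundary values (9%, 16%, 36%) as belonging to the lower class is also an assumption, based on the definition of 'e' as "more than 16%" given in the synopsis.

```python
def acc_range(n):
    """Relative accessibility interval (in %) for a 10-state value n (0-9)."""
    if not 0 <= n <= 9:
        raise ValueError("10-state accessibility value must be 0-9")
    return n * n, (n + 1) * (n + 1)


def two_state(rel_acc):
    """2-state class: b = 0-16%, e = 16-100% (boundary assigned to b)."""
    return "e" if rel_acc > 16 else "b"


def three_state(rel_acc):
    """3-state class: b = 0-9%, i = 9-36%, e = 36-100% (boundaries assigned downward)."""
    if rel_acc > 36:
        return "e"
    if rel_acc > 9:
        return "i"
    return "b"


for n in range(10):
    low, high = acc_range(n)
    print(f"{n}: {low}-{high}%")
```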

BODY with predictions for query
PROF results (normal)
          ....,....1....,....2....,....3....,....4....,....5....,....6....,....7....,....8....,..
AA        MAKNGGFSLFAKKEKRFIFEGRHSASDKLVNGEVSAFTEEEARKKLAKRGIRPLQITRVKTSSKRKITQEDITVFTRQLSTMIKAGL
Rel_sec   933431126642111122221302457023234540346888888774387305664111101232375678888888988873377
SUB_sec   L.......HH...............LL......E....HHHHHHHHH..LL..EEE...........LHHHHHHHHHHHHHHH..LL
OBS_sec, PROF_sec (column alignment lost; rows as extracted):
          HHHHHHHHHHEEEEEEE
          EEEEEEEE HHHHHHHHHH
          EEEEEE
          HHHHHHHHHHHHHHHH

Rel_acc   301000130162222243332221332422223132203539364936345301352110000313232213209630933833023
SUB_acc   ..........b.....b..........e...........e.b.eib.e.eb....b..................bb..b..b.....
O_3_acc, P_3_acc (column alignment lost; rows as extracted):
          bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
          ebe bbbbbbb beeb beb eeeeee ebeb b
          e b e b e ebebb b e eeeeeee eee bbbbb bbb bbe bb
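
The SUB_sec row is described in the abbreviations as the PROF_sec prediction filtered by the reliability index: residues with Rel < 5 are replaced by '.', and a confidently predicted loop (blank in PROF_sec) is written as 'L'. A minimal sketch of that filter follows; the function name `reliable_subset` is an assumption, and the input strings are made-up toy data, not rows from this report.

```python
def reliable_subset(pred, rel, min_rel=5):
    """Mask low-reliability residues in a per-residue prediction string.

    pred: one state per residue (H, E, or blank for loop)
    rel:  one reliability digit (0-9) per residue
    """
    if len(pred) != len(rel):
        raise ValueError("prediction and reliability rows differ in length")
    out = []
    for state, digit in zip(pred, rel):
        if int(digit) < min_rel:
            out.append(".")  # no prediction reported for this residue
        else:
            out.append("L" if state == " " else state)
    return "".join(out)


# Toy example (hypothetical data for illustration only).
toy_pred = " HHHH EE"
toy_rel = "93567248"
print(reliable_subset(toy_pred, toy_rel))  # L.HHH..E
```

The same pattern would apply to SUB_acc with a threshold of 4 and 'I' in place of 'L' for the blank (intermediate) state.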

PROF results (detail)

          ....,....1....,....2....,....3....,....4....,....5....,....6....,....7....,....8....,..
AA        MAKNGGFSLFAKKEKRFIFEGRHSASDKLVNGEVSAFTEEEARKKLAKRGIRPLQITRVKTSSKRKITQEDITVFTRQLSTMIKAGL
pH_sec, pE_sec, pL_sec: per-residue 'probability' plots (nine rows each); column alignment lost
OBS_acc, PROF_acc: per-residue accessibility plots (nine rows each); column alignment lost

GLOBE prediction of globularity
----- GLOBE: prediction of protein globularity -----------------------
nexp = 172 (number of predicted exposed residues)
nfit = 163 (number of expected exposed residues)
diff = 9.00 (difference nexp-nfit)
=====> your protein appears as compact, as a globular domain
GLOBE: further explanations preliminarily in:
http://www.columbia.edu/~rost/Papers/98globe.html
END of GLOBE

END of results for file predict_h31294

Quotes for methods
1. PredictProtein: B Rost (1996) Methods in Enzymology, 266:525-539

Author: B Rost
Contact: liu@cubic.bioc.columbia.edu
Url: http://cubic.bioc.columbia.edu
Version: 1.99.08
Description: PredictProtein is the acronym for all prediction programs run.
2. ProSite:
K Hofmann, P Bucher, L Falquet, A Bairoch:: The PROSITE database, its status in 1999. Nucleic Acids
Res, 27, 215-219, 1999
Author: Kay Hofmann, Philip Bucher, and Amos Bairoch (SIB, Geneva, Switzerland)
Contact: Christian.Sigrist@isb-sib.ch
Url: http://www.expasy.ch/prosite/
Version: 99.07
Description: PROSITE is a database of functional motifs. ScanProsite, finds all functional motifs in your
sequence that are annotated in the ProSite db
3. SEG:
J C Wootton, and S Federhen:: Analysis of compositionally biased regions in sequence databases. Methods
in Enzymology, 266, 554-571, 1996
Author: John C Wootton and Scott Federhen (NCBI, Washington)
Contact: federhen@ncbi.nlm.nih.gov
Url: http://trex.musc.edu/manuals/unix/seg.html
Version: 1994
Description: SEG divides sequences into regions of low-, and high-complexity. Low-complexity regions
typically correspond to 'simple sequences' or 'compositionally-biased' regions
4. ProDom:
F Corpet, F Servant, J Gouzy, and D Kahn:: ProDom and ProDom-CG: tools for protein domain analysis
and whole genome comparisons. Nucleic Acids Res, 28, 267-269, 2000
Author: Florence Corpet, Florence Servant, Jerome Gouzy, and Daniel Kahn
Contact: Jerome.Gouzy@toulouse.inra.fr
Url: http://protein.toulouse.inra.fr/prodom.html
Version: 2000.1
Description: ProDom is a database of putative protein domains. The database is searched with BLAST
for domains corresponding to your protein

5. MaxHom:
C Sander, and R Schneider:: Database of Homology-Derived Structures and the Structural Meaning of
Sequence Alignment. Proteins, 9, 56-68, 1991
Author: Reinhard Schneider (LION, Boston) and Chris Sander (Millennium, Boston)
Contact: rost@columbia.edu
Url: local
Version: 1.99.04
Description: MaxHom is a dynamic multiple sequence alignment program which finds similar sequences
in a database.
6. MView:
N P Brown, C Leroy, and C Sander:: MView: A Web compatible database search or multiple alignment
viewer. Bioinformatics, 14, 380-381, 1998
Author: Nigel Brown
Contact: nbrown@nimr.mrc.ac.uk
Url: http://mathbio.nimr.mrc.ac.uk/~nbrown/mview/
Copyright: Copyright (C) Nigel P. Brown, 1997-1998. All rights reserved.
Version: 1.40.2
Description: MView is a program converting multiple sequence alignments into fancy HTML formatted
output
7. PHD:
B Rost:: PHD: predicting one-dimensional protein structure by profile based neural networks. Methods in
Enzymology, 266, 525-539, 1996
Author: Burkhard Rost (CUBIC, Columbia Univ, New York)
Contact: rost@columbia.edu
Url: http://cubic.bioc.columbia.edu/predictprotein
Version: 1.96
Description: PHD is a suite of programs predicting 1D structure (secondary structure, solvent
accessibility) from multiple sequence alignments
8. PHDhtm: B Rost, P Fariselli & R Casadio (1996) Protein Science, 5:1704-1718
Author: B Rost
Contact: rost@columbia.edu
Url: http://cubic.bioc.columbia.edu
Version: 1.96
Description: PHDhtm predicts the location and topology of transmembrane helices from multiple
sequence alignments.
9. PROF:
B Rost:: PROF: predicting one-dimensional protein structure by profile based neural networks.
unpublished, 2000
Author: Burkhard Rost (CUBIC, Columbia Univ, New York)
Contact: rost@columbia.edu
Url: http://cubic.bioc.columbia.edu/predictprotein
Version: 2000_04
Description: Improved version of PHD: Profile-based neural network prediction of protein structure
10. PROFsec: B Rost (2000) in submission
Author: B Rost
Contact: rost@columbia.edu
Url: http://cubic.bioc.columbia.edu
Version: 2000_04
Description: PROFsec predicts secondary structure from multiple sequence alignments.
11. PROFacc: B Rost (2000) in submission
Author: B Rost
Contact: rost@columbia.edu
Url: http://cubic.bioc.columbia.edu

Version: 2000_04
Description: PROFacc predicts per residue solvent accessibility from multiple sequence alignments.
12. GLOBE: B Rost::Short yeast ORFs: expressed protein or not? unpublished, 2000
Author: Burkhard Rost (CUBIC, Columbia Univ, New York)
Contact: rost@columbia.edu
Url: http://cubic.bioc.columbia.edu/predictprotein
Version: 1.98.05
Description: GLOBE predicts the globularity of a protein
13. COILS:
A Lupas:: Prediction and Analysis of Coiled-Coil Structures. Methods in Enzymology, 266, 513-525, 1996
Author: Andrei Lupas (Max Planck Institute, Tuebingen, Germany)
Contact: andrei.lupas@tuebingen.mpg.de
Url: local
Version: 1999_2.2
Description: COILS finds coiled-coil regions in your protein
14. ASP: Young et al.:: Protein Science(1999) 8:1752-64.
Author: Malin Young (Sandia National Laboratory), Kent Kirshenbaum(Caltech), and Stefan Highsmith
Contact: mmyoung@sandia.gov; kent@cheme.caltech.edu; shighsmith@sf.uop.edu
Version: 1.0
Description: ASP finds regions that are most likely to behave as switches in proteins known to exhibit
this behavior
