
Vol. 23 no. 20 2007, pages 2660–2664


BIOINFORMATICS DISCOVERY NOTE doi:10.1093/bioinformatics/btm411

Sequence analysis

The DOMON domains are involved in heme and sugar recognition


Lakshminarayan M. Iyer, Vivek Anantharaman and L. Aravind*
National Center for Biotechnology Information, National Library of Medicine and National Institutes of Health,
Bethesda, MD 20894, USA
Received on June 13, 2007; revised on July 19, 2007; accepted on August 8, 2007
Advance Access publication September 18, 2007
Associate Editor: John Quackenbush

Downloaded from http://bioinformatics.oxfordjournals.org at NIH Library on April 1, 2010


*To whom correspondence should be addressed.

ABSTRACT
We expand the functionally uncharacterized DOMON domain superfamily to identify several novel families, including the first prokaryotic representatives. Using several computational tools we show that it is involved in ligand binding, either as heme- or sugar-binding domains. We present evidence that the DOMON domain along with the DM13 domain comprises a novel electron-transfer system potentially involved in oxidative modification of animal cell-surface proteins. Other novel versions might function as sugar sensors of histidine kinases of bacterial two-component systems.
Contact: aravind@ncbi.nlm.nih.gov or aravind@mail.nih.gov
Supplementary information: Supplementary data are available at Bioinformatics online and also at ftp://ftp.ncbi.nih.gov/pub/aravind/domon/.

1 INTRODUCTION
The DOMON (dopamine β-monooxygenase N-terminal) domain, also called DoH, was originally identified in several secreted or cell-surface proteins from plants and animals (Aravind, 2001; Ponting, 2001). It usually occurs fused to other domains such as the Cu-ascorbate-dependent monooxygenases associated with catecholamine metabolism (the eponymous enzyme in which it was found), cytochrome b561 and adhesion modules such as EGF, reelin and SEA. This β-strand-rich domain was predicted to adopt a β-sandwich-like fold, and based on its domain architectural contexts it was predicted to mediate protein–protein interactions (Aravind, 2001; Ponting, 2001). However, there is little to no experimental evidence supporting such a function and the domain’s biochemical role remains poorly characterized. The explosive growth in sequence and structure databases often provides new leads that allow uncovering the functions of previously enigmatic domains. Using sensitive sequence and structural analysis, we show that DOMON is widely distributed outside of animals and plants in fungi, various protists, bacteria and archaea. Based on the contextual and structural information gleaned from these newly identified forms, we show that the DOMON domains are predominantly heme- or sugar-binding domains. We show that the DOMON superfamily has been widely utilized in various contexts involving redox reactions, potentially as a direct participant in the electron transfer process.

2 RESULTS
2.1 Identification of the extended DOMON superfamily
To obtain a more complete understanding of the evolution and functions of DOMON domains, we initiated searches of NR and a locally compiled database of unfinished eukaryotic genomes using PSI-BLAST and an input position-specific score matrix representing all previously identified DOMON domains (see Supplementary Material for a detailed description of materials and methods). These searches recovered novel DOMON homologs in diverse protists such as ciliates, oomycetes, diatoms and Naegleria. With further transitive searches we also retrieved the N-terminal cytochrome domain of the fungal cellobiose dehydrogenases (CDH), and bacterial proteins from diverse taxa, such as the ethylbenzene dehydrogenase γ subunit (EDHγ), the C-terminal domain of certain NirT proteins and the carbohydrate binding domain family 9 (CBD9) domains of xylanases and bacterial extracellular cellulases. For example, PSI-BLAST searches initiated with the DOMON domain of the human dopamine β-monooxygenase (DβM, gi: 30474, region 50–166) against NR recovered, with significant E-values (E < 0.001) in eight iterations, several novel representatives in ciliates, Dictyostelium and prokaryotes such as Thermococcus and Roseobacter, in addition to previously reported versions. The search also retrieved the cytochrome domain of the fungal CDH (PDB: 1D7B; E < 10⁻¹⁵). Further searches initiated with the DOMON-containing Tetrahymena protein TTHERM_00460560 (gi: 118371105) retrieved several bacterial proteins, such as the C-terminal region of a NirT homolog from Colwellia (gi: 71278993), which in turn recovered EDHγ (PDB: 2ivf, chain C). Finally, PSI-BLAST searches with EDHγ recovered the CBD9 domains of bacterial xylanases and cellulases (PDB: 1I82A) and a set of related proteins from fungi, ciliates and Dictyostelium (e.g. Gibberella zeae FG07921.1) with significant E-values. Using the DALI program we also conducted structure searches of the PDB database with CDH, EDHγ and the CBD9 domain as queries. These usually recovered each other and also the bacterial Glucodextranase C-terminal domain as best hits. The above structures are classified under the CBD9 superfamily of the immunoglobulin (Ig) fold in the SCOP database (http://

© 2007 The Author(s)


This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

STRAND-1 STRAND-2 STRAND-3 STRAND-4 STRAND-5


Sec Structure EEEE....... ..EEEEE.......EEEEE.....eEEEEEEee.. ..EEEEEE.. .....
DOPO_Hsap_118791 24 QGSAPRES 4 HIPLDP 2 SLELSWNV 2 TQEAIHFQLLVRR 1 KAGVLFGMSD 1 GEL 4
MOXD1_Hsap_17511810 10 WGLLPGTA 10 RTLLDS 2 KYWLGWSQ 2 SQIAFRLQVRT-- AGYVGFGFSP 1 GAM 4
SDR2_Mmus_6677899 198 CGNKKFCV 3 LNCDPE 2 PACVFLSF 2 DNQSVMVEMSGPS DGYVSFAFSH 1 QWM 4
CDH_Pchr_11514399 (PDB:1d7b) 5 QFTDPTTG 3 TGITDP 2 DVTYGFVF 8 QSTEFIGEVVAPI 1 SKWIGIALG- GAM 3
UM06431.1_Umay_71024697 23 ESSDQASL 3 TSCNSD 1 CLRVVYSP SEKRMNMTMSANG 4 IGWYAVGTG- KQM 4
TTHERM_00689870_Tthe_89308760 29 FQQNQNDL 4 SSLSLT 1 GITLNFEI 1 GTDIVFMIESTKP QGWIGLGFG- ATM 4
AT5G54830_Atha_15239759 165 NNSEPFKA 4 DNCKKL 2 KYRLRWSL 2 EKGYVDIGLEATT 2 LNYMAFGWAK 5 NLM 4
Mthe_1462_Mthe_116754758 13 LLLFSGCI 22 VGVNEY 17 VMTISWKI DDEYLYMALNGST RGWLAIGFEP 1 EWM 4
cbdb_A1609_Dsp._73749307 33 SLLFSSCA 33 ITVSEY 8 NFTLFSRT DDQYVYIGIKAKA TGWISIGFQP 3 KGH 4
TK1382_Tkod_57641317 19 AHLSLGCL 21 ISRGEY 7 EMSLHFRV ENDTLYVGISAET HGWVAIGFGG 1 PGM 4
ebdC_Asp._56476742 (PDB:2ivfC) 2 KAKRVPGG 19 FEMFPT 22 RLDVAALH NGSMIALRLKWAS 14 GVGAMFPVAR 4 VTM 6
APE_1297_Aper_14601316 38 NAVYVDGA 17 ISLVSQ 13 TLRVAAAV 1 SEGILAIYLEWED 16 KVAVQFPLSS 5 ICM 6
AT3G62370_Atha_15228739 38 LADFRPGI 15 SEFPLR 14 KMTVKALH -DGRDIYFLLEID 14 SVALMFQIGD 4 HNM 19
cbsA_Ssol_15899519 47 PVYKVVGN 15 PWINIS 15 YLLVKAVW NGSWIIILERWYA 115 RAAIMWYMGS 6 DGM 29
CmaqDRAFT_1308_Cmaq_126353324 45 MAYKVVGS 15 PWFNVS 23 YIMVKAAW NGSYIFILVKWPD 91 RVAIMWYMGS 5 DCM 39
HCH_03667_Hche_83646397 190 VAPLAEKV 16 APVTVT 11 PVEVRALA NPYTIYFSISWPD 35 KLAVMLSKGG 7 VHL 17 HYTTDG 2
blr7997_Bjap_27383108 237 LVVARMSD 16 RPVFIR 13 LVEVRALH DGQKIYFAFRWED 35 KLAVIFSDNP 7 TDL 23 HYTTDA 2
nirB_Pstu_115239 36 DVTLFYPG 12 H11CAGCH4 SDMGEK 19 NAKVQAAN DGENLYLRFTWKQ 15 KIAYMLEGGS 9 CWGSCHGDARTM 21
Sfum_0443_Sfum_116747891 31 ETLAAAKV 15 KPLDIP 12 NVITQAVY TDDEIFFLFKWKD 25 RISLLFEINR 8 CAVTCHGAAGAA 14
FG07921.1_Gzea_46127087 18 PSVSV--- 8 IKFSKS 8 RTEVDLCY TDTALSLQFTAFD 18 YEVVEAFIYK 5 QTY 5
TTHERM_01004980_Tthe_146162880 33 NQLVVAAC 17 IPYFYY 9 STTVQICH 1 NFEIIDITWNVAD 16 QDVVEVFLGT 5 TEY 5

SO2192_Sone_24373747 47 LYAYPLAG 19 QYGEAN 12 LSFTHMVG 1 YDGYLYAFFQVVD PNVVFRGKNA 13 LAP 5
CPS_3981_Cpsy_71279576 84 LYGYKLTE 19 FYNQQQ 9 LNFKASVG 1 YNNYLYLFFQVVD NRLIFRGENT 13 SDT 5
xynA_Tmar_14719787 (PDB:1i8A) 2 VATAKYGT 14 EEIETK 11 TAKVRVLW DENYLYVLAIVKD 14 SVEIFIDENN 4 YYE 2
GB2207_03065_Mgam_90417401 9 CIPTVSAG 18 TDFVTV 10 GTQVRVIT NSDGIFVGFSNYQ 20 RNIVSIDFDS TAL 2
consensus/80% ........ ..h... ...h.h.. s.p.hhh.h.h.s ...h.h.... ...
STRAND-6 STRAND-7 STRAND-8 STRAND-9 STRAND-10 <----STRAND 11----------->
Sec Structure EEEEEE ...EEEEEE.....EEEEEEEEE......eEEEEEEEEe......EEEEEEEE.......EE....EEE.-------...-EEEE
DOPO_Hsap_118791 LVVLWT 4 AYFADA 15 YQLLQVQRT PEGLTLLFKRPFGT 12 VHLVYGILE 7 AING SGLQ-------- 1 GLQRV 181\DOMON
MOXD1_Hsap_17511810 IVVGGV 4 PYLQDY 15 YHLEYAMEN STHTIIEFTRELHT 12 VRVIWAYHH 9 YHDS ---N-------- 1 GTKSL 169|
SDR2_Mmus_6677899 AYLCIR 4 VDIQPS 15 LEDMAWRLA DGVIQCSFRRNITL 13 YIFFAEGPS 6 RHSQ QPLI-------- TYEKY 352|
CDH_Pchr_11514399 (PDB:1d7b) LLLVAW 5 IVSSTR 17 TTLPETTIN STHWKWVF-RCQGC 14 GVLAWAFSN 14 EHTD FGFF-------- GIDYS 174|
UM06431.1_Umay_71024697 MMIGWV 5 VVMSQR 18 MEPKHSFSN SSGTVWTWSFPMSG 8 TPFIWASNK 13 RHTA FGSI-------- TLDLT 182|
TTHERM_00689870_Tthe_89308760 MAIFFA 5 PAVQDA 15 WTLLGSSIT SNSFQMKAKRALNT 14 YNFCIAWSS 5 YHDS YYSY-------- SITLT 181|
AT5G54830_Atha_15239759 VVVTGI 5 PFADDF 36 TKLVYGHRI DGVSFVRYRRPLND 14 LTVIWALGV 14 NHGG 5 FGHF-------- SLNLS 362|
Mthe_1462_Mthe_116754758 MVLGVV 3 ARVLDE 20 ILEYGGKSY GTHTVAEFRRRLDT 13 VSIIWAMSD 6 KHNI 1 -YGE-------- GLIYL 203|
cbdb_A1609_Dsp._73749307 FALGGV 4 AYIYDL 19 ILEYGGTES GGYTILEFKRLLTT 12 NNILWAYSD 6 MHIA -EGT-------- GKINI 225|
TK1382_Tkod_57641317 IVIAYV 5 GEISDS 20 ILSYGGRED ENGTVVEFSRPLNT 13 FRIIWAYGP 6 MHIK AGH--------- IYVTL 199/
ebdC_Asp._56476742 (PDB:2ivfC) VNAWYW 7 MEIVAE 14 DLKAVAQHR NGEWNVILCRSMAT 12 SKIAFAVWS 8 RKSY 1 GEF--------- VDFEI 212\EDHg
APE_1297_Aper_14601316 VSIVLW 6 ETLIAG 24 LVPPEAQVW 7 DGKWMVVLYRPTGS 13 TSVAFAVWQ 8 KKS- 1 SAW--------- FTMRL 257|
AT3G62370_Atha_15228739 VDIMHF 7 GRLYGG 46 HGAWWHSSF 18 KGTYYFEFSRPLRT 15 AKMSVAFWY 10 GHYT 1 NCD--------- WTPLD 303/
cbsA_Ssol_15899519 ANIWMW 6 NNATYD 60 FIWTGATYQ NGYWTVEFARPLAV 16 YYVAFAVWQ 8 DKS- ITS--------- NFLTL 419\cytb
CmaqDRAFT_1308_Cmaq_126353324 ADIWEW 6 PEGQNF 61 SNYAGAKYE DGYWIVEFVRPLKA 13 YDVAFGVWL 8 DKS- ISA--------- SFIPL 408/558/556
HCH_03667_Hche_83646397 RDVWHW 6 DMAVLD 80 DGLPSVLWM 17 DGRWYLELARARET 14 WVAPFDHAQ 1 RHA- 1 HHR--------- PLRLR 505\HCH
blr7997_Bjap_27383108 IDMWQW 6 MLGRVD 103 TILPGVIIA 17 NGHWTLELTRNMKS 16 WVAVFDHAQ 1 RHA- 1 HNR--------- PVRVV 579/_03667
nirB_Pstu_115239 YDLNQW 9 GYVATE 8 LVDAQGKLD GDTWTVVFTRKFAG 12 YNFGFAIHD 6 YHH- VSL--------- GYSLG 279\NIRT-C
Sfum_0443_Sfum_116747891 GDLWHW 6 PYKSAD 66 DTLTYRMPK 16 DGGWTVMLSRKLDT 16 ALALFDDSM 3 SYD- SEAL-------- VLEFG 324/
FG07921.1_Gzea_46127087 NPNNVT 16 PFDHFF 9 TAETQLNKR AKKW-VSKAQIPLG 13 WRMNFFRTV 21 FHI- 1 KFF--------- GHVNF 220\Gzea_FG
TTHERM_01004980_Tthe_146162880 NPQGAL 14 ISDSLI 5 GIFYSAQIA DFGY-KANAQIPVK 10 LKGNFFRID 21 FHV- 1 SQF--------- GDIFL 238/07921.1
SO2192_Sone_24373747 RYIVAT 3 GWISAF 14 EVRIQGQWA 2 DVGY-NIELRIPLD 5 LGFAIADVN 1 SKY- 1 DV---------- AAVVG 223\SO2192
CPS_3981_Cpsy_71279576 RFIISN 3 GWISAF 12 APQIQGHWL 2 SQGY-NIELRVPLD 5 IAFAFYDVD KTN- 1 EA---------- VSAIG 254/
xynA_Tmar_14719787 (PDB:1i8A) DAQFRV 3 NEQTFG 6 RFKTAVKLI EGGY-IVEAAIKWK 7 TVIGFNIQV 7 QRVG 6 PTNNSWRDPSKF GNLRL 187\CBD9
GB2207_03065_Mgam_90417401 YDFTVG 3 SQQDGI 12 TWYSQTSSN KDYW-YSEIHIPWT 11 KKIALWFS- -RVV 4 LR-FAFPNAYYS 2 TFMED 199/
consensus/80% ..hh.h ...... ......... .ss.h.h.b.p...s......h.h.........p.. ------------ .....

Fig. 1. Multiple alignment of the DOMON superfamily. Proteins are represented by their gene names, species abbreviations and GIs. The coloring
reflects the consensus at 80% conservation calculated from a more extensive alignment. Ligand interacting residues are shaded red. Consensus
abbreviations are h: hydrophobic, s: small, p: polar and b: big. Refer to the Supplementary Material for species abbreviations.
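
As an illustration of how a consensus line of the kind shown in Fig. 1 could be derived, the following is a minimal Python sketch, not the procedure used for the figure. The residue class memberships in `RESIDUE_CLASSES`, the treatment of gaps as non-conserved positions, and the reporting of a single residue that itself reaches the threshold are all assumptions made for this example; the paper only names the classes (h, s, p, b) and the 80% threshold.

```python
from collections import Counter

# Hypothetical residue classes (assumption: the paper names the classes
# but does not list which amino acids belong to each).
RESIDUE_CLASSES = {
    "h": set("ACFILMVWY"),    # hydrophobic
    "s": set("AGSCDNPTV"),    # small
    "p": set("DEHKNQRST"),    # polar
    "b": set("EFHIKLMQRWY"),  # big
}


def column_consensus(column, threshold=0.8):
    """Return a consensus symbol for one alignment column.

    Gaps count against conservation because the fraction is taken over
    all sequences in the column, not only the non-gap ones.
    """
    n = len(column)
    residues = [c.upper() for c in column if c not in "-."]
    if n == 0 or not residues:
        return "."
    # A single amino acid that reaches the threshold is reported as itself.
    aa, count = Counter(residues).most_common(1)[0]
    if count / n >= threshold:
        return aa
    # Otherwise try the classes, smallest (most specific) class first.
    for symbol, members in sorted(RESIDUE_CLASSES.items(), key=lambda kv: len(kv[1])):
        if sum(r in members for r in residues) / n >= threshold:
            return symbol
    return "."


def consensus(alignment, threshold=0.8):
    """Consensus string for equal-length aligned sequences."""
    return "".join(column_consensus(col, threshold) for col in zip(*alignment))
```

For instance, `consensus(["LKG", "IKG", "VRA", "LKS", "MKD"])` yields one symbol per column, with `.` marking columns where no residue or class reaches the threshold.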

scop.mrc-lmb.cam.ac.uk/scop/). All the retrieved sequences were classified into distinct families by first clustering with the BLASTCLUST program (Supplementary Material) and then further combining the clusters using uniquely shared sequence features and shared domain architectures. By this we obtained at least nine distinct protein families: classical DOMON (of which the fungal CDH is a member), EDHγ-cytochrome domain, CbsA/cytochrome b558/556, Hahella HCH_03667-like, NirT C-terminal-like, Shewanella SO2192-like, CBD9-like, Gibberella zeae FG07921.1-like and Glucodextranase C-terminal domain-like families (Table 1 in Supplementary Material). Hereinafter, we refer to this unified monophyletic assemblage of domains as the DOMON superfamily.
A comprehensive multiple alignment of the superfamily shows that the Ig-like β-sandwich DOMON superfamily has 10–11 strands, and shares several unique structural features (Figs 1 and 2). These include: (1) a common ligand-binding interface; (2) several additional strands beyond the core seven strands of the classical Ig-like fold, including two additional N-terminal strands and an extra strand in the ligand-binding sheet; and (3) a characteristic long loop between strands 5 and 6 of the conserved core that folds against the ligand-binding β-sheet and provides an interface for ligand contact. Several other β-sandwich domains have been previously identified as carbohydrate (e.g. Cellulose binding domain II and III)- or heme (e.g. Cytochrome f)-binding modules (http://scop.mrc-lmb.cam.ac.uk/scop/). However, the DOMON domain differs from all of them, both in terms of the position and specific mode of ligand interaction, and number of strands in the β-sandwich. These differences suggest innovation of specific ligand-binding features in the DOMON superfamily after its divergence from the generic group of ligand-binding β-sandwich domains.
The defining conserved sequence features of the DOMON superfamily (Figs 1 and 2) include: (1) multiple hydrophobic residues that contribute to the hydrophobic core of the strands of the β-sandwich, and small residues found at the boundaries of strands and loops. (2) A strongly conserved charged residue



Fig. 2. Structural features of DOMON domains, domain architectures and conserved gene neighborhoods. The conserved loop, ligand-binding
residues and conserved arginine are denoted in the ribbon diagrams. Domain architectures and gene neighborhoods are labeled with the gene name,
species abbreviation and GI of the DOMON-containing protein. Block arrows are used to depict genes in gene neighborhoods, with the red block
arrow denoting the DOMON-containing protein. Other domain notations: black rectangle: signal peptide, TM: Transmembrane helix and
GBD: Galactose binding domain. Refer to the Supplementary Material for species abbreviations.

(usually arginine/lysine) at the end of strand 9. The strong conservation of this non-ligand-binding residue suggests that it may have a structural role, such as stabilizing the loop between strands 9 and 10 or mediating conformational changes and (3) a polar residue (usually histidine, lysine or arginine) that interacts with or coordinates ligands.

2.2 Deciphering the DOMON domain’s function: evidence from sequence and structure
The above characterization of the DOMON superfamily resulted in identification of previously experimentally characterized versions binding different ligands. Of these, EDHγ and CDH bind a single heme moiety, NirT-C binds a di-heme cofactor and the CBD9 binds either glucose or cellobiose (Devreese et al., 2000; Hallberg et al., 2000; Kloer et al., 2006; Notenboom et al., 2001). Four of these, representing diverse families of this domain, have crystal structures, with three of them containing a bound soluble ligand (Fig. 2). In all these structures the ligand is bound in a strikingly similar fashion, in a comparable pocket formed by one of the sheets of the β-sandwich (Fig. 2). The presence of a similarly bound ligand, and conservation of the essential structural features (Figs 1 and 2) required for the maintenance of the binding pocket, suggest that soluble ligand binding is likely to be the conserved function of this domain. To understand the shared ligand-binding features of the superfamily, we extracted all ligand-interacting residues from the available structures and compared them with the conservation pattern seen in the multiple alignment.
As suggested by previous studies, the two heme-binding versions EDHγ and CDH contain a conserved methionine in the curved loop between conserved strands 5 and 6, which is directly linked to the heme (Hallberg et al., 2000; Kloer et al., 2006). These heme-binding versions also share a histidine or lysine residue at the beginning of the terminal strand that directly contacts the ligand. The primary ligand-contacting residue in the sugar-binding CBD9 family is a conserved arginine that occupies the same position as the heme-interacting H/K residue of the above versions, and makes contacts with the polar groups of the sugar moiety. The CBD9 family also contains a unique conserved tryptophan in a large insert in the last strand, which was previously shown to stack against the sugar (Notenboom et al., 2001; Figs 1 and 2). At least six of the nine families, namely DOMON, EDHγ-like, cytochrome b558/556-like, HCH_03667-like, NirT C-terminus-like and Gibberella zeae FG07921.1-like families, have the conserved ligand-contacting histidine or lysine at the base of the terminal strand. Of these, most members of the DOMON, EDHγ-like and cytochrome b558/556-like families also contain the methionine residue in the insert between strands 5 and 6, strongly supporting a heme-binding function for these versions. Despite lacking the methionine, the corresponding inserts of the NirT C-terminus and HCH_03667 families contain conserved histidine residues which could provide an alternative ligand for heme (Fig. 1). Some members of the former family also contain a conserved insert between strands 1 and 2 with two cysteine and histidine residues that might contribute to binding a second heme. The functionally obscure Gibberella zeae FG07921.1-like proteins have a conserved tyrosine in the insert in place of the


methionine. Given that tyrosine has been observed as a heme ligand in unrelated heme-binding Ig-fold proteins, it is possible that it functionally substitutes for the methionine (Fig. 1).
The Glucodextranase C-terminal domain family shares with the CBD9 family the conserved arginine at the base of the terminal strand and the distinctive insert in the terminal strand containing the conserved tryptophan, suggesting a similar sugar-binding function. The Shewanella SO2192-like family is also related to the sugar-binding versions. In spite of sharing a common ligand, very few of the other residues lining the binding site in the DOMON domains of CDH and EDHγ are conserved across both the known and predicted heme-binding versions. Likewise, few residues beyond those described above are shared by binding sites of the known or predicted sugar-binding forms. At least in the case of the heme-binding forms, this might indicate that the only other constraint on the ligand-binding site is the maintenance of its general hydrophobicity. Thus, the poorly conserved ligand-contacting residues only generically complement the primary residues by making non-specific contacts. It is, however, possible that these differences in the binding pocket have a more subtle effect on the redox properties of the bound heme. Comparisons of the carbohydrate-binding versions suggest that at least in some cases the differences might translate into differences in terms of the ligand bound. Nevertheless, the presence of at least one shared polar ligand-contacting position in the sugar and heme-binding versions at the base of the terminal strand supports an ancestral ligand-binding role for the DOMON superfamily.

2.3 Deciphering the DOMON domain’s function: evidence from contextual information
Contextual information in the form of domain architectures and gene neighborhoods is often used to gain functional insights into poorly characterized domains and domain families. Most proteins of the DOMON superfamily are secreted and contain a signal peptide. In the heme-binding versions, the greatest diversity of architectures was seen in the eukaryotic version of the DOMON family. One predominant architectural theme was association with cytochromes or enzymatic domains whose activity involved redox or electron transfer reactions. Thus, DOMON is fused to (1) a transmembrane cytochrome b561 domain in several proteins from diverse eukaryotes (Supplementary Material) and in the bacterial HCH_03667-like proteins, (2) Cu-ascorbate-dependent monooxygenases in dopamine β-monooxygenase-like proteins of animals, chlorophyte algae and diatoms, (3) cytochrome b5-like ferric reductases in ciliates, Phytophthora and Naegleria, (4) various Rossmann fold FAD and NAD binding oxidoreductase domains as in the fungal CDH and other oxidoreductases from Phytophthora and Naegleria and (5) a cytochrome c domain in certain NirT proteins (e.g. Colwellia; gi: 71278993) (Gross et al., 2005). Thus, in several cases the DOMON domain is fused to multiple cytochrome or Rossmann oxidoreductase domains in the same polypeptide (Fig. 2, Supplementary Material). Interestingly, an archaeal DOMON-containing protein (Methanosaeta Mthe_1462) is fused to a ferritin family iron-binding domain.
Experimental studies on the fungal CDH have shown that the heme bound by the DOMON domain transfers electrons to the flavin ligand of the oxidoreductase domain during oxidation of cellobiose or cello-oligosaccharides (Stoica et al., 2006). The animal SDR2-like proteins that are fused to a cytochrome b561 domain have been shown to function as ferric reductases (Vargas et al., 2003). Together, these observations suggest that the heme-binding versions of the DOMON domains are cytochromes mediating electron transfers in redox reactions. This prediction offers an important functional clue regarding the several animal extracellular matrix proteins in which the DOMON domain is fused to other adhesion modules such as EGF, reelins, trypsin-inhibitor and SEA domains and the poly-DOMON domain proteins (Aravind, 2001). An examination of these proteins shows that at least one copy of the DOMON domain in them contains the conventional heme ligands, the methionine and histidine/lysine. This suggests that rather than being passive extracellular structural proteins they are likely to function as cytochromes involved in as yet unidentified redox reactions potentially related to protein hydroxylation or oxidative cross-linking. In many of the above animal proteins the DOMON domain typically occurs with the DM13 domain, which is also predicted to have a β-strand-rich fold. The DM13 domain interestingly contains a nearly absolutely conserved cysteine, which can be potentially involved in a redox reaction either as a naked thiol group or by binding a prosthetic group like heme. The DOMON domains of some members of the dopamine β-monooxygenase family, like MOXD1, contain a conventional heme-binding pocket, suggesting that they function as cytochromes providing electrons for mediating the monooxygenase reaction. The DOMONs of vertebrate DβM and the arthropod and nematode tyramine β-hydroxylase lack the heme-coordinating methionine and histidine. However, the ligand-binding pocket is predicted to remain intact (Fig. 1), suggesting that they either bind an unknown ligand or weakly bind a heme, which might be critical for the properties of these enzymes.
DOMON domains of the Shewanella SO2192-like family are all extracellular/periplasmic domains of receptor histidine kinases (Fig. 2), which are encoded by a predicted operon also containing a neighboring gene for a protein with an HTH fused to a receiver domain. These proteins are likely to comprise a two-component system that potentially senses environmental sugars with the DOMON domain. The CBD9 and the Glucodextranase C-terminal type DOMON domains are found in proteins fused to other sugar-binding domains, S-layer homology domains and sugar transferases, or in operons encoding possible sugar transporters, consistent with their role in polysaccharide metabolism.

3 EVOLUTIONARY HISTORY AND CONCLUSIONS
Our investigations show that in addition to eukaryotes the DOMON superfamily is widely distributed in phylogenetically diverse bacteria and sporadically in archaea. This domain shows the greatest diversity in the bacterial superkingdom both in terms of number of different families and domain architectures. This is also consistent with the extraordinary diversity of other versions of the β-sandwich seen in bacteria. These observations suggest a possible origin for this domain in


bacteria followed by possible dispersion through lateral transfer to eukaryotes and certain archaea. Divergence of the domain into heme- and sugar-binding versions also appears to have occurred in the bacteria. At least three distinct families of the DOMON superfamily have been transferred to eukaryotes, of which the classical DOMON and the Gibberella zeae FG07921.1-like families are present in a wide range of eukaryotes, suggesting an early transfer. Likewise, we were also able to identify several bacterial versions (Supplementary Material) of the DM13 domain, previously known only from animals. Thus, the DM13 domain and the classical family of the DOMON domain, both ultimately of bacterial origin, appear to have proliferated in animal extracellular proteins. We believe the identification of the DOMON domain as a cytochrome or a sugar-binding domain would help in better understanding its biochemical properties. In particular, we hope that it might help in exploring a hitherto unknown predicted electron transfer system possibly involved in modifying animal extracellular proteins.

ACKNOWLEDGEMENT
The authors acknowledge the Intramural Research Program of the NLM, National Institutes of Health, USA, for funding their research.

Conflict of Interest: none declared.

REFERENCES
Aravind,L. (2001) DOMON: an ancient extracellular domain in dopamine beta-monooxygenase and other proteins. Trends Biochem. Sci., 26, 524–526.
Devreese,B. et al. (2000) Primary structure characterization of a Rhodocyclus tenuis diheme cytochrome c reveals the existence of two different classes of low-potential diheme cytochromes c in purple phototropic bacteria. Arch. Biochem. Biophys., 381, 53–60.
Gross,R. et al. (2005) Site-directed modifications indicate differences in axial haem c iron ligation between the related NrfH and NapC families of multihaem c-type cytochromes. Biochem. J., 390, 689–693.
Hallberg,B.M. et al. (2000) A new scaffold for binding haem in the cytochrome domain of the extracellular flavocytochrome cellobiose dehydrogenase. Structure, 8, 79–88.
Kloer,D.P. et al. (2006) Crystal structure of ethylbenzene dehydrogenase from Aromatoleum aromaticum. Structure, 14, 1377–1388.
Notenboom,V. et al. (2001) Crystal structures of the family 9 carbohydrate-binding module from Thermotoga maritima xylanase 10A in native and ligand-bound forms. Biochemistry, 40, 6248–6256.
Ponting,C.P. (2001) Domain homologues of dopamine beta-hydroxylase and ferric reductase: roles for iron metabolism in neurodegenerative disorders? Hum. Mol. Genet., 10, 1853–1858.
Stoica,L. et al. (2006) Direct electron transfer – a favorite electron route for cellobiose dehydrogenase (CDH) from Trametes villosa. Comparison with CDH from Phanerochaete chrysosporium. Langmuir, 22, 10801–10806.
Vargas,J.D. et al. (2003) Stromal cell-derived receptor 2 and cytochrome b561 are functional ferric reductases. Biochim. Biophys. Acta, 1651, 116–123.
