JOURNAL OF COMPUTING, VOLUME 2, ISSUE 10, OCTOBER 2010, ISSN 2151-9617

HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING/
WWW.JOURNALOFCOMPUTING.ORG 81


COMPARISON OF PROMOTER
SEQUENCES BY ALIGNMENT OF MOTIF
SEQUENCES

Meera A. (1), Lalitha Rangarajan (2), Poonam V Reddy (3), Arun Chandrashekar (4)

Abstract— In this paper we propose a method to compare promoter sequences by aligning the sequences of motifs present in
them. Alignment is performed using dynamic programming. The motifs, i.e. the binding sites of transcription factors (TFs), are
extracted from the promoter sequences using the ‘TF SEARCH’ tool. The resulting sequences of motifs are then aligned and a
match score is obtained. As a case study, we have used promoter sequences of the enzyme citrate synthase of the central
metabolic pathway from different mammals, extracted from the NCBI database. Results reveal high similarity between motif
sequences of different organisms on the same chromosome. Some similarity is also present among motif sequences on
different chromosomes of the same organism.
Index Terms— Dynamic programming, Pattern matching, Promoter sequence, Transcription factors (TFs), Transcription factor
binding sites (TFBS)
——————————

——————————
1 INTRODUCTION
 
Sequence alignment is of crucial importance for all aspects
of biological sequence analysis. Virtually all methods of nucleic
acid and protein sequence analysis rely directly or indirectly on
alignments, so the output of these methods depends on the quality
of the underlying alignments. Applications of sequence
alignment include genome sequence annotation and gene prediction,
phylogeny reconstruction, RNA and protein structure
analysis, and functional classification of proteins.
Biologists rely heavily on comparison of DNA sequences
to understand many biological problems. In promoters,
however, primary sequence comparisons have limitations. In
the process of aligning nucleotides, the structure of motifs may be
disturbed. Although similar sequences do tend to have similar
functions, the functionality is determined by the regulatory region.
Often similar functions are encoded in higher order sequence
elements, such as structural motifs in amino acid sequences, and
the relation between these and the underlying primary sequence
may not be univocal. As a result, similar functions are
frequently encoded by diverse sequences (Blanco E, Messeguer X
et al., 2006).
The information for the control of the initiation of RNA
synthesis by RNA polymerase II is mostly contained in the
gene promoter, a region usually 200 to 2,000 nucleotides long
upstream of the transcription start site (TSS) of the gene.
TFBSs are typically 5 to 8 nucleotides long, and one promoter
region usually contains many of them to harbor different TFs
(Wray GA et al., 2003). The motifs appear to be arranged in
specific configurations that confer on each gene an
individualised spatial and temporal transcription program
(Wray GA et al., 2003). However, TFBSs associated with the same
TF are known to tolerate sequence substitutions without losing
functionality, and are often not conserved. Consequently,
promoter regions of genes with similar expression patterns may
not show sequence similarity, even though they may be
regulated by similar configurations of TFs. For instance, only
about 30% to 40% of the promoter regions are conserved
between human and chicken orthologous genes. Despite the
recent progress due to the development of techniques based on
phylogenetic footprinting (Wasserman WW et al., 2004), the lack
of nucleotide sequence conservation between functionally related
promoter regions may partially explain the still limited
success of currently available computational methods for promoter
characterization (Fickett JW et al., 1997; Tompa M et al., 2005).
A large amount of work has been carried out in aligning coding
regions of DNA sequences for finding homology between
different species (Stephan et al., 1990). Local pairwise alignment
methods such as Smith-Waterman (1981), BLAST (Altschul
et al., 1990), BLASTZ (Schwartz et al., 2000), SSAHA
(Ning et al., 2001), and BLAT (Kent, 2002) are able to pinpoint
locations of rearrangements between two sequences, and are
suitable for aligning nucleotide sequences.
CONREAL, a software tool, aligns promoter sequences
(Eugene et al., 2005). The output of this method consists of
a graph, an alignment and a table with predicted TFBSs. The graph
shows aligned positions in the orthologous sequences and the
————————————————
- 1 Meera A. is with B.M.S College of Engineering, Bangalore.
- 2 Lalitha Rangarajan is with University of Mysore, Manasagangotri, Mysore.
- 3 Poonam V Reddy is with University of Mysore, Manasagangotri, Mysore.
- 4 Arun Chandrashekar is with M.V.J College of Engineering, Bangalore.


density of predicted TFBSs, facilitating general assessment of
predictions and identification of TFBS "hot spots". A pairwise
alignment of sequences provides single nucleotide resolution
information, whereas a table with predicted conserved TFBSs,
sorted by position in the alignment and linked to the JASPAR and
TRANSFAC databases, supplies more detailed information on
the transcription factors. However, no comparison using a matching
score is made. CONREAL focuses on generating an ordered chain
of conserved TFBSs, and thus does not align regions that do not
contain them. SITEBLAST (Michael, M. et al., 2005) is a BLAST-like
heuristic in which the TFBS hits are used as seeds and the
sequence of hit pairs is aligned using a scoring scheme that
considers clustering of sites, binding affinity and conservation,
though the underlying sequences themselves are not aligned.
Other approaches such as MONKEY (Alan M Moses et al., 2004)
explicitly take into account evolutionary properties of the
TFBSs. MONKEY identifies conserved transcription factor binding
sites in multispecies alignments. It employs probabilistic
models of factor specificity and binding site evolution, on the
basis of which the likelihood that putative sites are conserved
is calculated and statistical significance is assigned to
each hit. The authors illustrate how the significance of real sites
increases with evolutionary distance and explore the relationship
between conservation and function, but still perform the
alignment independently of the annotation step.
One of the major challenges facing biologists is to understand
the varied and complex mechanisms governing the regulation
of gene expression. Sequence conservation across different
species is an important indicator of functionality. Phylogenetic
footprinting refers to the identification of functional
regions by comparing orthologous genomic sequences
between species (Fickett and Wasserman, 2000). With more
sequenced genomes available, comparative analysis of non-coding
regions has become an important approach in detecting
promoters or regulatory regions in general (Bejerano et al.,
2004). A phylogenetic tree approach has been used to find evolutionary
distance and gene order in metabolic pathways using the structural
information inherent in the promoter sequences, such as the
presence or absence of TFBSs in the sequence (Meera A et al.,
2009).
Alignment algorithms are a central area of research in bioinformatics.
The present paper places emphasis on a method to
align promoter sequences of different organisms using dynamic
programming to arrive at a new distance measure to compare
promoter sequences. As a case study, promoter sequences of
citrate synthase of different mammals are compared. The motifs
are extracted using the ‘TF SEARCH’ tool, which returns three
values for each extracted motif: the motif number as
given in the TRANSFAC database, the corresponding transcription
factor, and the percentage similarity of the extracted motif
to the consensus motif sequence. The motifs are the binding
sites of TFs; the nucleotides between motifs are ignored in
the present study, since the function of these nucleotides has not
yet been studied in depth. The strings of motifs of two given promoter
sequences are aligned. The algorithm is similar to the alignment of
coding regions proposed by Smith and Waterman (1981).
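
As an illustration of the representation described above, the following minimal Python sketch stores each extracted motif as a record with the three reported values and reduces a promoter to its ordered string of motif labels. The `Motif` record and the `motif_string` helper are hypothetical names introduced here for illustration; they are not the output format of the ‘TF SEARCH’ tool. The sample values are the first three motifs of Sequence 1 in the worked example of Section 2.

```python
from typing import List, NamedTuple


class Motif(NamedTuple):
    """One extracted motif (hypothetical record layout, not the tool's own format)."""
    label: int    # motif number as given in the TRANSFAC database
    tf: str       # transcription factor reported to bind the motif
    score: float  # percentage similarity to the consensus motif


def motif_string(hits: List[Motif]) -> List[int]:
    """Keep only the ordered motif labels; nucleotides between motifs are ignored."""
    return [h.label for h in hits]


# First three motifs of Sequence 1 from the worked example.
hits = [Motif(216, "TATA", 86.2), Motif(223, "STATx", 82.0), Motif(75, "GATA-1", 91.0)]
print(motif_string(hits))
```

Keeping the TF name and the percentage alongside the label means the same records can feed all three alignment methods without re-extracting the motifs.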

2 METHODOLOGY
We have developed three methods of determining the optimal
match between two strings of motifs.
Method 1:
Promoter sequences upstream of the TSS of the enzyme
citrate synthase of different mammals have been
extracted from the NCBI database. Citrate synthase is an enzyme
of the TCA (Krebs) cycle of the central metabolic pathway, which
is present in almost all organisms. Motifs (TFBSs) are extracted
using the ‘TF SEARCH’ tool. The motif sequences,
which represent the promoter sequences, are aligned using
dynamic programming.
A score is assigned to an alignment as
follows: if a motif in sequence S1 matches its corresponding
motif in sequence S2 it receives a score of 1
(match); otherwise it receives a score of 0 (mismatch),
whereas in nucleotide alignment this score is -1; and if
one of the two aligned positions is a space, it receives a
score of -1 (gap). The total score of an alignment is the sum
of these values, and the number of matches is reported as the
match score. The optimal alignment problem is
to find the maximal score over all possible alignments between
the two sequences. This maximal score can be used to
measure the similarity between the two sequences. In
aligning coding regions, the alignment process begins from
the start of the coding regions of the two sequences and proceeds
forward. In the proposed alignment of two sequences of
motifs, the TATA boxes close to the start codon are anchored and
alignment proceeds backwards.
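The optimal alignment score F[i,j] for S1[1...i] and S2[1...j] is
the largest of the following three values:

\[
F[i,j] = \max
\begin{cases}
F[i,j-1] - 1 \\
F[i-1,j-1] + s(i,j) \\
F[i-1,j] - 1
\end{cases}
\]

where s(i,j) = 1 if motif i of S1 matches motif j of S2 and s(i,j) = 0 otherwise. Here
  F[i,j-1] is the optimal alignment score of S1[1…i] and S2[1…j-1],
  F[i-1,j-1] is the optimal alignment score of S1[1…i-1] and S2[1…j-1],
  F[i-1,j] is the optimal alignment score of S1[1…i-1] and S2[1…j].
The boundary values are F[0,j] = -j and F[i,0] = -i; because the anchored TATA boxes match, F[1,1] = 1.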
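
The recurrence can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `align_motifs` is hypothetical, the motif lists are assumed to be ordered starting from the anchored TATA box, and ties in the traceback are broken in favour of the diagonal move. The two input lists are the motif labels of the hypothetical sequences used in the worked example of this section.

```python
def align_motifs(s1, s2, match=1, mismatch=0, gap=-1):
    """Global alignment of two motif-label lists.

    Returns (score, aligned_s1, aligned_s2); None marks a gap.
    """
    n, m = len(s1), len(s2)
    # F[i][j] = best score for aligning s1[:i] with s2[:j]
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + sub,  # align motif i with motif j
                          F[i - 1][j] + gap,      # gap in s2
                          F[i][j - 1] + gap)      # gap in s1
    # Trace back from the bottom-right cell to recover one optimal alignment.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and s1[i - 1] == s2[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + sub:
            a1.append(s1[i - 1]); a2.append(s2[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            a1.append(s1[i - 1]); a2.append(None); i -= 1
        else:
            a1.append(None); a2.append(s2[j - 1]); j -= 1
    return F[n][m], a1[::-1], a2[::-1]


seq1 = [216, 223, 75, 126, 159, 101, 109, 96, 101, 77]
seq2 = [216, 241, 75, 126, 101, 96, 100, 77]
score, a1, a2 = align_motifs(seq1, seq2)
matches = sum(1 for x, y in zip(a1, a2) if x is not None and x == y)
gaps = sum(1 for x, y in zip(a1, a2) if x is None or y is None)
print(score, matches, gaps)
```

For these two lists the total score is the number of matches minus the number of gaps, so the result can be checked directly against the match and gap counts reported for the example.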
The method is illustrated using the following two hypothetical
motif sequences.

Suppose that the motifs extracted from two promoter
sequences are as given below.
Sequence 1:

Motif label   TF       Score (%)
216           TATA     86.2
223           STATx    82
75            GATA-1   91
126           GATA-1   88.5
159           C/EBPb   84.2
101           CdxA     91
109           C/EBPb   85
96            Pbx-1    81.5
101           CdxA     90.5
77            GATA-3   92.5

 
Sequence 2 (motif labels, in order): 216, 241, 75, 126, 101, 96, 100, 77

Table 1: Computation of partial matching score and the
optimal alignment of the sequences above using method 1
(rows: motifs of Sequence 1; columns: motifs of Sequence 2)

            216   241    75   126   101    96   100    77
        0    -1    -2    -3    -4    -5    -6    -7    -8
216    -1     1     0    -1    -2    -3    -4    -5    -6
223    -2     0     1     0    -1    -2    -3    -4    -5
75     -3    -1     0     2     1     0    -1    -2    -3
126    -4    -2    -1     0     3     2     1     0    -1
159    -5    -3    -2    -1     2     3     2     1     0
101    -6    -4    -3    -2     1     3     2     1     0
109    -7    -5    -4    -3     0     2     3     2     1
96     -8    -6    -5    -4    -1     1     3     2     1
101    -9    -7    -6    -5    -2     0     2     3     2
77    -10    -8    -7    -6    -3    -1     1     2     4
 

The optimal alignment is as follows:

216   223   75   126   159   101   109   96   101   77
216   241   75   126    -    101    -    96   100   77

The match score is 6 and the gap score is 2.



Method 2:
Table 2: Computation of partial matching score and the
optimal alignment of the sequences above using method 2

Some TFs are able to tolerate some alterations in the sequence
of their binding sites. For example, CdxA can bind to motifs
101 (GTTAATA) and 100 (CATAAAG). In method 1 these
motifs are considered to be distinct, whereas in this method
they are not differentiated. Each motif is labeled after the
corresponding TF that is reported to bind to it. Thus the number of
distinct motifs is reduced.

The optimal alignment is as follows:

TATA   STATx   GATA-1   GATA-1   C/EBPb   CdxA   C/EBPb   Pbx-1   CdxA   GATA-3
TATA   Nkx-2   GATA-1   GATA-1     -      CdxA     -      Pbx-1   CdxA   GATA-3

The match score obtained is 7 and the gap score is 2.
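
The relabeling step of method 2 can be sketched as follows. This is a hedged illustration rather than the authors' code: `alignment_score` and `LABEL_TO_TF` are hypothetical names, and the label-to-TF mapping is simply read off the two example motif lists. Each motif label is replaced by the name of its TF before the same dynamic programming scoring is applied.

```python
def alignment_score(s1, s2, match=1, mismatch=0, gap=-1):
    """Optimal global alignment score, keeping only one row of the matrix."""
    prev = [j * gap for j in range(len(s2) + 1)]
    for i, a in enumerate(s1, start=1):
        curr = [i * gap]
        for j, b in enumerate(s2, start=1):
            diag = prev[j - 1] + (match if a == b else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Motif label -> TF name, taken from the two example sequences.
LABEL_TO_TF = {
    216: "TATA", 223: "STATx", 241: "Nkx-2", 75: "GATA-1", 126: "GATA-1",
    159: "C/EBPb", 109: "C/EBPb", 101: "CdxA", 100: "CdxA",
    96: "Pbx-1", 77: "GATA-3",
}

seq1 = [216, 223, 75, 126, 159, 101, 109, 96, 101, 77]
seq2 = [216, 241, 75, 126, 101, 96, 100, 77]

tf1 = [LABEL_TO_TF[label] for label in seq1]
tf2 = [LABEL_TO_TF[label] for label in seq2]

by_label = alignment_score(seq1, seq2)  # method 1: distinct motif labels
by_tf = alignment_score(tf1, tf2)       # method 2: motifs grouped by TF
print(by_label, by_tf)
```

Motifs 101 and 100 both map to CdxA, so the pair that counts as a mismatch under method 1 becomes a match here, which raises the total score by one.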

Method 3:
The ‘TF SEARCH’ tool assigns scores to extracted motifs.
In this method we take these scores into consideration
while finding the maximum alignment. If a motif in
sequence S1 matches the corresponding motif in sequence
S2, the match score assigned is the minimum of the respective
percentages (as given by the ‘TF SEARCH’ tool) of the two motifs
considered, unlike the match score of 1 in the previous methods.
The scores assigned to motifs extracted by the tool are >80%.
As the match score can be as low as 0.8, the gap penalty has been
increased to 1.2. This is done to avoid too many gaps in the
alignment.
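
A minimal sketch of the weighted scoring of method 3 is given below. The function name `weighted_alignment_score` is hypothetical, and two details are assumptions rather than statements from the text: motifs are matched on their labels, and a mismatch scores 0 as in method 1. The (label, percentage) pairs are those of the two example sequences.

```python
def weighted_alignment_score(s1, s2, gap=-1.2):
    """Global alignment score where a match is worth min(percentages) / 100.

    s1 and s2 are lists of (motif_label, percentage) pairs.
    """
    prev = [j * gap for j in range(len(s2) + 1)]
    for i, (label_a, pct_a) in enumerate(s1, start=1):
        curr = [i * gap]
        for j, (label_b, pct_b) in enumerate(s2, start=1):
            # Assumed: mismatches score 0, matches score the smaller percentage.
            sub = min(pct_a, pct_b) / 100.0 if label_a == label_b else 0.0
            curr.append(max(prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


seq1 = [(216, 86.2), (223, 82.0), (75, 91.0), (126, 88.5), (159, 84.2),
        (101, 91.0), (109, 85.0), (96, 81.5), (101, 90.5), (77, 92.5)]
seq2 = [(216, 96.2), (241, 92.0), (75, 81.0), (126, 88.5),
        (101, 94.2), (96, 81.0), (100, 95.0), (77, 81.5)]

score_gap_12 = weighted_alignment_score(seq1, seq2, gap=-1.2)
score_gap_10 = weighted_alignment_score(seq1, seq2, gap=-1.0)
print(round(score_gap_12, 3), round(score_gap_10, 3))
```

In this sketch the six matched motifs contribute 5.092 in total and the two unavoidable gaps are charged at the chosen penalty, so the result depends directly on the `gap` argument; with `gap=-1.0` the routine returns about 3.09.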
Table 2 matrix (method 2; rows: TFs of Sequence 1, columns: TFs of Sequence 2):

               TATA  Nkx-2  GATA1  GATA1  CdxA  Pbx-1  CdxA  GATA3
           0     -1     -2     -3     -4    -5     -6    -7     -8
TATA      -1      1      0     -1     -2    -3     -4    -5     -6
STATx     -2      0      1      0     -1    -2     -3    -4     -5
GATA1     -3     -1      0      2      1     0     -1    -2     -3
GATA1     -4     -2     -1      1      3     2      1     0     -1
C/EBPb    -5     -3     -2      0      2     3      2     1      0
CdxA      -6     -4     -3     -1      1     3      2     3      2
C/EBPb    -7     -5     -4     -2      0     2      3     2      3
Pbx-1     -8     -6     -5     -3     -1     1      3     3      2
CdxA      -9     -7     -6     -4     -2     0      2     4      3
GATA3    -10     -8     -7     -5     -3    -1      1     3      5

Sequence 2 (motif labels, TFs and scores):

Motif label   TF       Score (%)
216           TATA     96.2
241           Nkx-2    92
75            GATA-1   81
126           GATA-1   88.5
101           CdxA     94.2
96            Pbx-1    81
100           CdxA     95
77            GATA-3   81.5




Table 3: Computation of partial matching score and the optimal alignment of the sequences above using method 3
(rows: motifs of Sequence 1; columns: motifs of Sequence 2)

Motif           216     241      75     126     101      96     100      77
          0      -1      -2      -3      -4      -5      -6      -7      -8
216      -1   +0.86   -0.14   -1.14   -2.14   -3.14   -4.14   -5.14   -6.14
223      -2   -0.14   +0.86   -0.14   -1.14   -2.14   -3.14   -4.14   -5.14
75       -3   -1.14   -0.14   +1.67   +0.67   -0.14   -1.14   -2.14   -3.33
126      -4   -2.14   -1.14   +0.67   +2.56   +1.56   +0.56   -0.44   -1.44
159      -5   -3.14   -2.14   -0.33   +1.56   +2.56   +1.56   +0.56   -0.44
101      -6   -4.14   -3.14   -1.33   +0.56   +2.47   +2.56   +1.56   +0.56
109      -7   -5.14   -4.14   -2.33   -0.44   +1.47   +1.56   +2.56   +1.56
96       -8   -6.14   -5.14   -3.33   -1.44   +0.47   +2.28   +1.56   +2.56
101      -9   -7.14   -6.14   -4.33   -2.44   -0.54   +1.28   +2.28   +1.56
77      -10   -8.14   -7.14   -5.33   -3.44   -1.54   +0.28   +1.28   +3.09


The optimal alignment is as follows:

216   223   75   126   159   101   109   96   101   77
216   241   75   126    -    101    -    96   100   77

The match score obtained is 6 and the gap score is 2.

3 RESULTS AND DISCUSSION
Tables 4 and 5 show the matching scores obtained using the first
two methods. In addition, Table 4 gives the BLAST
result of the comparison between the respective coding regions in
row 1 of each cell of the table. In all the methods, % similarity is
calculated as (matching score / maximum of the lengths of the two
motif sequences) * 100.
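
The percentage calculation can be sketched as below. This is an illustrative reading, not a definitive implementation: the names `match_count` and `percent_similarity` are hypothetical, the "matching score" is assumed to be the number of matched motifs along one optimal alignment (as in "match score is 6" in the worked example), and both sequences are assumed to be non-empty.

```python
def match_count(s1, s2, match=1, mismatch=0, gap=-1):
    """Number of matched motifs along one optimal global alignment."""
    # Each cell holds (alignment score, matches on the path chosen so far);
    # tuples compare by score first, so the score follows the usual recurrence.
    prev = [(j * gap, 0) for j in range(len(s2) + 1)]
    for i, a in enumerate(s1, start=1):
        curr = [(i * gap, 0)]
        for j, b in enumerate(s2, start=1):
            hit = a == b
            diag_score, diag_matches = prev[j - 1]
            up_score, up_matches = prev[j]
            left_score, left_matches = curr[j - 1]
            curr.append(max(
                (diag_score + (match if hit else mismatch), diag_matches + int(hit)),
                (up_score + gap, up_matches),
                (left_score + gap, left_matches),
            ))
        prev = curr
    return prev[-1][1]


def percent_similarity(s1, s2):
    """(matching score / maximum of the two lengths) * 100; inputs must be non-empty."""
    return 100.0 * match_count(s1, s2) / max(len(s1), len(s2))


seq1 = [216, 223, 75, 126, 159, 101, 109, 96, 101, 77]
seq2 = [216, 241, 75, 126, 101, 96, 100, 77]
print(percent_similarity(seq1, seq2))
```

Dividing by the longer of the two motif sequences keeps the value at or below 100 and penalises pairs whose promoters differ greatly in the number of motifs.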















Table 4: Alignment scores of coding and motif sequences of citrate synthase of some mammals
Each cell is written as coding/motif: the first value (row 1 of the cell) is the BLAST similarity (%) of the coding regions; the second value (row 2 of the cell) is the alignment score (%) of the motif sequences.

organism   Rat7       Can10      Pan12      Pan3       Hs6        Hs12       Bos5       Sus5       Mac11      Hs19       Pan19      Bos10
Rat7       100/100    89/3.57    89/3.42    81/3.16    <80/2.26   89/3.16    89/2.77    89/3.57    89/2.84    88/3.63    88/4.08    88/3.11
Can10      89/3.57    100/100    93/18.22   85/19.23   <80/18.87  93/17.48   93/20.42   93/11.11   93/34.30   92/19.12   92/19.47   92/20
Pan12      89/3.42    93/18.22   100/100    83/14.30   85/17.26   99/10.94   92/11.06   93/4.1     98/15.49   84/15.01   97/12.53   92/8.89
Pan3       81/3.16    85/19.23   83/14.30   100/100    81/20.16   87/9.05    84/4.25    84/16.97   88/20.36   87/12.44   87/12.22   87/7.56
Hs6        <80/2.26   <80/18.87  85/17.26   81/20.16   100/100    85/15.48   83/16.77   84/15.5    86/22.58   85/17.58   85/18.06   83/20.48
Hs12       89/3.16    93/17.48   99/10.94   87/9.05    85/15.48   100/100    92/12.76   93/16.75   97/16.5    97/15.32   97/16.97   91/11.33
Bos5       89/2.77    93/20.42   92/11.06   84/4.25    83/16.77   92/12.76   100/100    93/17.23   92/14.68   91/14.46   91/3.82    98/5.96
Sus5       89/3.57    93/11.11   93/4.1     84/16.97   84/15.5    93/16.75   93/17.23   100/100    93/25.5    92/16.46   93/16.82   92/16
Mac11      89/2.84    93/34.30   98/15.49   88/20.36   86/22.58   97/16.5    92/14.68   93/25.5    100/100    95/16.05   95/20.4    84/15.56
Hs19       88/3.63    92/19.12   84/15.01   87/12.44   85/17.58   97/15.32   91/14.46   92/16.46   95/16.05   100/100    98/10.34   84/14
Pan19      88/4.08    92/19.47   97/12.53   87/12.22   85/18.06   97/16.97   91/3.82    92/16.82   95/20.4    98/10.34   100/100    84/13.33
Bos10      88/3.11    92/20      92/8.89    87/7.56    83/20.48   91/11.33   98/5.96    92/16      84/15.56   84/14      84/13.33   100/100


Table 5: Alignment scores between motif sequences using method 2

organism  Rat7    Can10   Pan12   Pan3    Hs6     Hs12    Bos5    Sus5    Mac11   Hs19    Pan19   Bos10
Rat7      100     17.3    13.4    17.2    17.1    11.65   16.4    16.66   22.5    18.4    18.75   21.1
Can10     17.3    100     22.3    22.2    20.9    22.1    23.1    16.03   20.89   17.92   18.5    19.8
Pan12     13.4    22.3    100     9.9     20.8    14.6    14.7    20.5    17.8    19.4    10.3    12.2
Pan3      17.2    22.2    9.9     100     24.4    9.7     7.2     20.6    20.5    19.1    12.9    8.7
Hs6       17.1    20.9    20.8    24.4    100     19.4    21.9    18.2    24.4    18.7    18.7    21.3
Hs12      11.65   22.1    14.6    9.7     19.4    100     16.4    20.2    17.6    19.9    12.9    11.4
Bos5      16.4    23.1    14.7    7.2     21.9    16.4    100     19.3    20.1    15.6    13.7    14.2
Sus5      16.66   16.03   20.5    20.6    18.2    20.2    19.3    100     27.4    17.5    17.7    17.3
Mac11     22.5    20.89   17.8    20.5    24.4    17.6    20.1    17.4    100     21.3    21.1    18.5
Hs19      18.4    17.92   19.4    19.1    18.7    19.9    15.6    14.5    21.3    100     16.1    15.6
Pan19     18.75   18.5    10.3    12.9    18.7    12.9    13.7    14.7    21.1    16.1    100     14.3
Bos10     21.1    19.8    12.2    8.7     21.3    11.4    14.2    15.3    18.5    15.6    14.3    100

It may be observed that the match score has increased when compared to Table 4.
In the above tables:
Rat 7 - Rattus chromosome 7
Can 10 - Canis familiaris chromosome 10
Pan 12 - Pan troglodytes chromosome 12
Pan 3 - Pan troglodytes chromosome 3
Hs 6 - Homo sapiens chromosome 6
Hs 12 - Homo sapiens chromosome 12
Bos 5 - Bos taurus chromosome 5
Sus 5 - Sus scrofa chromosome 5
Mac 11 - Macaca mulatta chromosome 11
Hs 19 - Homo sapiens chromosome 19
Pan 19 - Pan troglodytes chromosome 19
Bos 10 - Bos taurus chromosome 10





Table 4 shows the alignment scores of the coding and motif
sequences of some mammals. The number of matches
between two motif sequences is converted to a percentage.
The following pairs of chromosomes of the same
organism show high similarity in the alignment of promoter
sequences:
Pan troglodytes chromosome 3 and Pan troglodytes
chromosome 19; Homo sapiens chromosome 12 and Homo sapiens
chromosome 6. We also observe that Homo sapiens
chromosome 19 and Pan troglodytes chromosome 19, and likewise
Homo sapiens chromosome 12 and Pan troglodytes chromosome 12,
have nearly identical scores when compared
with all organisms across all chromosomes considered, in
coding as well as promoter regions. There is no clear evidence
of a direct relationship between the coding and promoter regions of
any of the organisms, i.e. an increase in the alignment score
of the coding region does not correspond to an increase in the score of
the promoter region. Rattus chromosome 7 has more or
less the same similarity with all organisms in coding
as well as promoter regions (row 1 of Table 4).
Table 5 gives the alignment scores between motif sequences
using method 2. It may be observed that the
match score has increased when compared to Table 4. This
is expected because the different binding motifs of the same TF
have been grouped and assigned a common TF name. For
example, the TF caudal type homeodomain protein / cardiac specific
homeo box (CdxA) binds to CAATAAAACT, AACACGTTATT,
AATAAATG, CATTTAAG, ACTTAAATT,
TTGTGCAATA, ACTTAAAT and ACACGTTA.
These motifs are not differentiated in method 2. Hence the
alignment score has increased.
In method 3, the alignment is performed with a new
match score set to the minimum of the percentages of the
matching motifs. The motifs have
been assigned percentages by the motif extraction program,
and these are all found to be >80%. Hence the match score
can drop to 0.8, unlike the fixed score of 1 in the previous methods.
The gap penalty is set to -1.2, which falls within the
range recommended for gap penalties
(Mount D, 2001). Too high a gap penalty does not introduce
gaps and too low a value introduces too many gaps,
both resulting in poor alignment. In the present case study
we have found that the alignment score is the same as that in
method 1. Comparison of promoter sequences tells us
about differential expression. The results reported here may
be verified with transcriptome data.
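
The trade-off in the choice of gap penalty discussed above can be seen on a toy example. The sketch below is illustrative only: the function name `align_with_gap_penalty`, the three-motif sequences and the fixed match score of 0.8 (the lowest value a match can take in method 3) are assumptions made for this example. It reports the optimal score together with the number of gaps on one optimal alignment, preferring fewer gaps when scores tie.

```python
def align_with_gap_penalty(s1, s2, gap, match=0.8, mismatch=0.0):
    """Return (score, number_of_gaps) for one optimal global alignment."""
    # Each cell holds (score, -gaps) so that max() prefers fewer gaps on ties.
    prev = [(j * gap, -j) for j in range(len(s2) + 1)]
    for i, a in enumerate(s1, start=1):
        curr = [(i * gap, -i)]
        for j, b in enumerate(s2, start=1):
            sub = match if a == b else mismatch
            curr.append(max(
                (prev[j - 1][0] + sub, prev[j - 1][1]),      # pair the two motifs
                (prev[j][0] + gap, prev[j][1] - 1),          # gap in s2
                (curr[j - 1][0] + gap, curr[j - 1][1] - 1),  # gap in s1
            ))
        prev = curr
    score, negative_gaps = prev[-1]
    return score, -negative_gaps


s1 = ["A", "B", "C"]
s2 = ["B", "C", "A"]

strict = align_with_gap_penalty(s1, s2, gap=-1.2)   # gaps cost more than two matches gain
lenient = align_with_gap_penalty(s1, s2, gap=-0.5)  # gaps are cheap enough to pay for
print(strict, lenient)
```

With the penalty of 1.2 the two possible matches (2 x 0.8) do not pay for the two gaps they require, so the ungapped alignment wins; with a penalty of 0.5 the gapped alignment is preferred.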

4 CONCLUSIONS AND FUTURE ENHANCEMENT

The promoter sequences upstream of the coding regions of
citrate synthase in the TCA (Krebs) cycle are extracted from
the NCBI database. The motif sequences of these
promoters are then extracted using the ‘TF SEARCH’ tool.
The motif sequences of different
mammals are compared by aligning pairs of motif
sequences using dynamic programming. The results of the
alignment of motif sequences are also compared with the
alignment results of the coding regions.

It has been reported (Sara J et al., 2006) that TFBSs
closer to the TATA box are important. We are therefore working
on allowing the user to select the region of the promoter
to be compared. In addition, a phylogenetic tree can be constructed using
the combined score of the regulatory region and the coding
region, which is perhaps more meaningful. Different
uses of the percentages reported by TF search tools are also being
explored.


REFERENCES

[1]. Blanco E, Messeguer X, Smith TF, Guigó R, “Transcription
Factor Map Alignment Of Promoter Regions,” PLoS
Comput Biol 2(5): e49. DOI: 10.1371/journal.pcbi.0020049,
2006
[2]. Wray GA, Hahn MW, Abouheif E, Balhoff JP, Pizer M, et
al., “The Evolution Of Transcriptional Regulation In Eukaryotes,”
Mol Biol Evol 20: 1377–1419, 2003
[3]. Wasserman WW, Sandelin A, “Applied Bioinformatics For
The Identification Of Regulatory Elements,” Nat Rev Genet
5: 276–286, 2004
[4]. Fickett, J.W., Hatzigeorgiou, “Eukaryotic Promoter Recognition,”
Genome Res 7: 861–878, 1997
[5]. Stefan Kurtz, Adam Phillippy, Arthur L Delcher, Michael
Smoot, Martin Shumway, Corina Antonescu and Steven L
Salzberg, “Versatile And Open Software For Comparing
Large Genomes,” Genome Biology, 5: R12, 2004
[6]. Smith, T.F. and Waterman, M.S., “Identification Of Common
Molecular Subsequences,” J. Mol. Biol. 147: 195–197, 1981
[7]. Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J.,
“Basic Local Alignment Search Tool,” J. Mol. Biol.
215: 403–410, 1990
[8]. Schwartz, S., Kent, W.J., Smit, A., Zhang, Z., Baertsch, R.,
Hardison, R.C., Haussler, D. and Miller, W., “Human–Mouse
Alignments With BLASTZ,” Genome Res. 13: 103–107, 2003
[9]. Ning, Z., Cox, A.J. and Mullikin, J.C., “SSAHA: A Fast
Search Method For Large DNA Databases,” Genome Res.
11: 1725–1729, 2001
[10]. Kent, W.J., “BLAT: The BLAST-Like Alignment Tool,”
Genome Res. 12: 656–664, 2002
[11]. Eugene Berezikov, Victor Guryev and Edwin Cuppen,
“CONREAL Web Server: Identification And Visualization
Of Conserved Transcription Factor Binding Sites,” Nucleic
Acids Research, Vol. 33, Web Server issue, W447–W450,
doi:10.1093/nar/gki378, 2005
[12]. Michael, M. et al., “SITEBLAST—Rapid And Sensitive Local
Alignment Of Genomic Analysis Of Transcription-Factor
Binding Affinity,” Cell, 124, 47–59, 2005
[13]. Alan M Moses, Derek Y Chiang, Daniel A Pollard, Venky N
Iyer and Michael B Eisen, “MONKEY: Identifying Conserved
Transcription Factor Binding Sites In Multiple Alignments
Using A Binding Site-Specific Evolutionary Model,”
Genome Biology, vol. 5, issue 2, article 98, 2004
[14]. Fickett, J.W. and Wasserman, W.W., “Discovery And Modeling
Of Transcriptional Regulatory Regions,” Curr. Opin.
Biotech. 11: 19–24, 2000
[15]. Bejerano G, Siepel A.C, Kent W.J. et al., “Computational
Screening Of Conserved Genomic DNA In Search Of
Functional Noncoding Elements,” Nat Methods 2: 535–545,
2005
[16]. Meera A, Lalitha Rangarajan, Savithri Bhat, “Computational
Approach Towards Finding Evolutionary Distance
And Gene Order Using Promoter Sequences Of Central
Metabolic Pathway,” Interdisciplinary Sciences:
Computational Life Sciences, DOI: 10.1007/s12539-009-0017-3
[SpringerLink], 2009
[17]. Mount, D., Bioinformatics: Sequence And Genome Analysis.
Cold Spring Harbor, NY: Cold Spring Harbor Laboratory
Press, 2001
[18]. Sara J. Cooper, Nathan D. Trinklein, Elizabeth D. Anton,
Loan Nguyen, and Richard M. Myers, “Comprehensive
Analysis Of Transcriptional Promoter Structure And
Function In 1% Of The Human Genome,”
Genome Res. 16: 1–10, doi:10.1101/gr.4222606, 2006
(originally published online Dec 12, 2005).






































































































© 2010 Journal of Computing Press, NY, USA, ISSN 2151-9617
http://sites.google.com/site/journalofcomputing/
JOURNAL OF COMPUTING, VOLUME 2, ISSUE 10, OCTOBER 2010, ISSN 2151-9617
HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING/
WWW.JOURNALOFCOMPUTING.ORG 90

matted from the original submission for page layout reasons.
This includes the possibility that some in-line equations will
be made display equations to create better flow in a para-
graph. If display equations do not fit in the two-column for-
mat, they will also be reformatted. Authors are strongly en-
couraged to ensure that equations fit in the given column
width.
6 HELPFUL HINTS
6.1 Figures and Tables
Because IEEE Computer Society staff will do the final
formatting of your paper, some figures may have to be
moved from where they appeared in the original submis-
sion. Figures and tables should be sized as they are to
appear in print. Figures or tables not correctly sized will
be returned to the author for reformatting.
Detailed information about the creation and submis-
sion of images for articles can be found at:
http://computer.org/author/transguide/. We strongly encour-
age authors to carefully review the material posted here
to avoid problems with incorrect files or poorly formatted
graphics.
Place figure captions below the figures; place table
titles above the tables. If your figure has two parts, in-
clude the labels “(a)” and “(b)” as part of the artwork.
Please verify that the figures and tables you mention in
the text actually exist. Figures and tables should be called
out in the order they are to appear in the paper. For ex-
ample, avoid referring to figure “8” in the first paragraph
of the article unless figure 8 will again be referred to after
the reference to figure 7. Please do not include figure
captions as part of the figure. Do not put captions in
“text boxes” linked to the figures. Do not put borders
around the outside of your figures. Per IEEE Computer
Society, please use the abbreviation “Fig.” even at the
beginning of a sentence. Do not abbreviate “Table.”
Tables are numbered numerically.
Figures may only appear in color for certain journals.
Please verify with IEEE Computer Society that the journal
you are submitting to does indeed accept color before
submitting final materials. Do not use color unless it is
necessary for the proper interpretation of your figures.
Figures (graphs, charts, drawings, or tables) should be
named fig1.eps, fig2.ps, etc. If your figure has multiple
parts, please submit them as a single figure. Please do not
give figure files descriptive names. Author photograph files should
be named after the author’s LAST name. Please avoid
naming files with the author’s first name or an abbreviated
version of either name to avoid confusion. If a graphic is to
appear in print as black and white, it should be saved and
submitted as a black and white file (grayscale or bitmap).
If a graphic is to appear in color, it should be submitted as
an RGB color file.
Figure axis labels are often a source of confusion. Use
words rather than symbols. As an example, write the
quantity “Magnetization,” or “Magnetization M,” not just
“M.” Put units in parentheses. Do not label axes only
with units. As in Fig. 1, for example, write “Magnetization
(A/m)” or “Magnetization (A·m⁻¹),” not just
“A/m.” Do not label axes with a ratio of quantities and
units. For example, write “Temperature (K),” not “Tem-
perature/K.” Table 1 shows some examples of units of
measure.
Multipliers can be especially confusing. Write “Magnetization
(kA/m)” or “Magnetization (10³ A/m).” Do not
write “Magnetization (A/m) × 1,000” because the reader
would not know whether the top axis label in Fig. 1 meant
16,000 A/m or 0.016 A/m. Figure labels should be legible,
approximately 8 to 12 point type. When creating your
graphics, especially in complex graphs and charts, please
ensure that line weights are thick enough that when repro-
duced at print size, they will still be legible. We suggest at
least 1 point.
6.3 Footnotes
Number footnotes separately in superscripts (Insert | Footnote).¹
Place the actual footnote at the bottom of the column
in which it is cited; do not put footnotes in the reference list
(endnotes). Use letters for table footnotes (see Table 1).
Please do not include footnotes in the abstract and avoid
using a footnote in the first column of the article. This will
cause it to appear on top of the affiliation box, making the layout
look confusing.
6.4 Lists
The IEEE Computer Society style is to create displayed
lists if the number of items in the list is longer than three.
For example, within the text lists would appear 1) using a
number, 2) followed by a close parenthesis. However,
longer lists will be formatted so that:
1. Items will be set outside of the paragraphs.
2. Items will be punctuated as sentences where it is
appropriate.
3. Items will be numbered, followed by a period.

¹ It is recommended that footnotes be avoided (except for the unnumbered
footnote with the receipt date on the first page). Instead, try to
integrate the footnote information into the text.

Fig. 1. Magnetization as a function of applied field. Note that “Fig.” is
abbreviated. There is a period after the figure number, followed by one
space. It is good practice to briefly explain the significance of the figure
in the caption.
6.5 Theorems and Proofs
Theorems and related structures, such as axioms, corollaries,
and lemmas, are formatted using a hanging indent para-
graph. They begin with a title and are followed by the text,
in italics.
Theorem 1. Theorems, corollaries, lemmas, and related struc-
tures follow this format. They do not need to be numbered,
but are generally numbered sequentially.
Proofs are formatted using the same hanging indent for-
mat. However, they are not italicized.
Proof. The same format should be used for structures such
as remarks, examples, and solutions (though these
would not have a Q.E.D. box at the end as a proof does). 
7 END SECTIONS
7.1 Appendices
Appendices, if needed, appear before the acknowledgment.
In the event multiple appendices are required, they will be
labeled “Appendix A,” “Appendix B,” etc. If an article does
not meet submission length requirements, authors are
strongly encouraged to make their appendices supplemental
material.
IEEE Computer Society Transactions accepts supple-
mental materials for review with regular paper submis-
sions. These materials may be published on our Digital
Library with the electronic version of the paper and are
available for free to Digital Library visitors. Please see our
guidelines below for file specifications and information.
Any submitted materials that do not follow these specifi-
cations will not be accepted. All materials must follow US
copyright guidelines and may not include material pre-
viously copyrighted by another author, organization or
company. More information can be found at
http://computer.org/author/transguide/SuppMat.htm.
7.2 Acknowledgments
The preferred spelling of the word “acknowledgment” in
American English is without an “e” after the “g.” Use the
singular heading even if you have many acknowledg-
ments. Avoid expressions such as “One of us (S.B.A.)
would like to thank ... .” Instead, write “F. A. Author
thanks ... .” Sponsor and financial support acknowledg-
ments are included in the acknowledgment section. For
example: This work was supported in part by the US De-
partment of Commerce under Grant BS123456 (sponsor
and financial support acknowledgment goes here). Researchers
who contributed information or assistance to the
article should also be acknowledged in this section.
7.3 References
Unfortunately, the Computer Society document translator
cannot handle automatic endnotes in Word; therefore,
type the reference list at the end of the paper using the
“References” style. See the IEEE Computer Society’s style
for reference formatting at: http://computer.org/author/style/transref.htm.
The order in which the references are submitted in the
manuscript is the order in which they will appear in the
final paper, i.e., references submitted nonalphabetized
will remain that way.
Please note that the references at the end of this docu-
ment are in the preferred referencing style. Within the
text, use “et al.” when referencing a source with more
than three authors. In the reference section, give all authors’
names; do not use “et al.” Do not place a space between
an author’s initials. Papers that have not been published
should be cited as “unpublished” [4]. Papers that
have been submitted or accepted for publication should
be cited as “submitted for publication” [5]. Please give
affiliations and addresses for personal communications
[6].
Capitalize all the words in a paper title. For papers pub-
lished in translation journals, please give the English citation
first, followed by the original foreign-language citation [7].
7.4 Additional Formatting and Style Resources
Additional information on formatting and style issues can be
obtained in the IEEE Computer Society Style Guide, which is
posted online at: http://computer.org/author/style/. Click on
the appropriate topic under the Special Sections link.
4 CONCLUSION
Although a conclusion may review the main points of
the paper, do not replicate the abstract as the conclu-
sion. A conclusion might elaborate on the importance
of the work or suggest applications and extensions.
Authors are strongly encouraged not to call out multiple
figures or tables in the conclusion; these should be
referenced in the body of the paper.
TABLE 1
UNITS FOR MAGNETIC PROPERTIES

Statements that serve as captions for the entire table do not need footnote letters.
a Gaussian units are the same as cgs emu for magnetostatics; Mx = maxwell,
G = gauss, Oe = oersted; Wb = weber, V = volt, s = second, T = tesla, m =
meter, A = ampere, J = joule, kg = kilogram, H = henry.
ACKNOWLEDGMENT
The authors wish to thank A, B, C. This work was sup-
ported in part by a grant from XYZ.
REFERENCES
[1] J.S. Bridle, “Probabilistic Interpretation of Feedforward Classification
Network Outputs, with Relationships to Statistical Pattern Recogni-
tion,” Neurocomputing—Algorithms, Architectures and Applications, F. Fo-
gelman-Soulie and J. Herault, eds., NATO ASI Series F68, Berlin: Sprin-
ger-Verlag, pp. 227-236, 1989. (Book style with paper title and editor)
[2] W.-K. Chen, Linear Networks and Systems. Belmont, Calif.:
Wadsworth, pp. 123-135, 1993. (Book style)
[3] H. Poor, “A Hypertext History of Multiuser Dimensions,”
MUD History, http://www.ccs.neu.edu/home/pb/mud-
history.html. 1986. (URL link *include year)
[4] K. Elissa, “An Overview of Decision Theory,” unpublished.
(Unpublished manuscript)
[5] R. Nicole, "The Last Word on Decision Theory," J. Computer
Vision, submitted for publication. (Pending publication)
[6] C. J. Kaufman, Rocky Mountain Research Laboratories, Bould-
er, Colo., personal communication, 1992. (Personal communica-
tion)
[7] D.S. Coming and O.G. Staadt, "Velocity-Aligned Discrete
Oriented Polytopes for Dynamic Collision Detection," IEEE
Trans. Visualization and Computer Graphics, vol. 14, no. 1, pp. 1-12,
Jan/Feb 2008, doi:10.1109/TVCG.2007.70405. (IEEE Transactions)
[8] S.P. Bingulac, “On the Compatibility of Adaptive Controllers,”
Proc. Fourth Ann. Allerton Conf. Circuits and Systems Theory, pp.
8-16, 1994. (Conference proceedings)
[9] H. Goto, Y. Hasegawa, and M. Tanaka, “Efficient Scheduling
Focusing on the Duality of MPL Representation,” Proc. IEEE
Symp. Computational Intelligence in Scheduling (SCIS ’07), pp. 57-
64, Apr. 2007, doi:10.1109/SCIS.2007.367670. (Conference pro-
ceedings)
[10] J. Williams, “Narrow-Band Analyzer,” PhD dissertation, Dept. of
Electrical Eng., Harvard Univ., Cambridge, Mass., 1993. (Thesis
or dissertation)
[11] E.E. Reber, R.L. Michell, and C.J. Carter, “Oxygen Absorption
in the Earth’s Atmosphere,” Technical Report TR-0200 (420-46)-
3, Aerospace Corp., Los Angeles, Calif., Nov. 1988. (Technical
report with report number)
[12] L. Hubert and P. Arabie, “Comparing Partitions,” J. Classifica-
tion, vol. 2, no. 4, pp. 193-218, Apr. 1985. (Journal or magazine
citation)
[13] R.J. Vidmar, “On the Use of Atmospheric Plasmas as Electromagnetic
Reflectors,” IEEE Trans. Plasma Science, vol. 21, no. 3, pp. 876-880,
available at http://www.halcyon.com/pub/journals/21ps03-vidmar,
Aug. 1992. (URL for Transaction, journal, or magazine)
[14] J.M.P. Martinez, R.B. Llavori, M.J.A. Cabo, and T.B. Pedersen,
"Integrating Data Warehouses with Web Data: A Survey," IEEE
Trans. Knowledge and Data Eng., preprint, 21 Dec. 2007,
doi:10.1109/TKDE.2007.190746. (PrePrint)

First A. Author Biographies should be limited to one paragraph
consisting of the following: sequentially ordered list of degrees,
including years achieved; sequentially ordered places of employment,
concluding with current employment; association with any official
journals or conferences; major professional and/or academic
achievements, e.g., best paper awards and research grants; any publication
information (number of papers and titles of books published); current
research interests; association with any professional associations.

Second B. Author Jr. biography appears here. Degrees achieved
followed by current employment are listed, plus any major academic
achievements.

Third C. Author is a member of the IEEE and the IEEE Computer
Society.
