
Parallelizing Partial Digest Problem on Multicore System

Hazem M. Bahig1,2(✉), Mostafa M. Abbas3,4, and M.M. Mohie-Eldin5

1 Computer Science Division, Department of Mathematics, Faculty of Science, Ain Shams University, Cairo 11566, Egypt
  Hazem.m.bahig@gmail.com
2 College of Computer Science and Engineering, Hail University, Hail, Kingdom of Saudi Arabia
3 Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
  mohamza@hbku.edu.qa
4 KINDI Center for Computing Research, College of Engineering, Qatar University, Doha, Qatar
5 Department of Mathematics, Faculty of Science, Al-Azhar University, Cairo, Egypt

Abstract. The partial digest problem, PDP, is one of the methods used in
restriction mapping to characterize a fragment of DNA. The main challenge of
PDP is the exponential time of the best exact sequential algorithm in the worst
case. In this paper, we reduce the running time for generating the solution of
PDP by designing an efficient parallel algorithm. The algorithm is based on
parallelizing the fastest sequential algorithm for PDP. The experimental study on
a multicore system shows that the running time of the proposed algorithm
decreases as the number of processors increases. Also, the speedup scales well
as the number of processors increases.

Keywords: Partial digest problem · Parallel algorithm · Scalability · Multicore

1 Introduction

Physical mapping of the genome is one of the fundamental steps in genome studies.
One of the methods used in physical mapping is the digestion of DNA with one
restriction enzyme; this is the partial digestion process. The enzyme cuts the double-stranded
DNA within a specific short sequence of nucleotides called a restriction site.
After that, we measure the lengths of the obtained fragments and reconstruct the original
ordering of these fragments [1]. For example, the restriction enzyme TaqI cuts the
luciferase gene at the tcga sequence [2].
Several applications of genomic studies require genome mapping, such as
determining the order of genes, extracting distinctive short fragments of a DNA
sequence, and comparing the genomes of various species [3–7].

The combinatorial problem for partial digestion is called the partial digest problem,
PDP. Assume that the set of restriction site locations is represented as the set X = {x0,
x1, …, xn} and the multiset of lengths of DNA fragments is represented as the multiset
D = {d1, d2, …, dm}. The PDP is defined as follows [8]: given a multiset
D = {d1, d2, …, dm}, find a set X = {x0, x1, …, xn} such that
ΔX = {| xj − xi | : 0 ≤ i < j ≤ n} = D.
For example, the output of the partial digestion process when we use the restriction
enzyme tcga on the luciferase gene is D = {9, 30, 100, 170, 293, 302, 393, 402, 462,
562, 632, 732, 855, 864, 945, 954, 975, 984, 1025, 1034, 1247, 1277, 1347, 1377,
1809, 1839, 1979, 2009}. The goal of the PDP is to find the set of restriction site
locations, which is X = {0, 30, 975, 984, 1277, 1377, 1839, 2009} [2].
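To make the definition concrete, the following self-contained C++ fragment (an illustration added here, not part of the algorithms discussed in this paper) builds ΔX from the set X of the luciferase example and checks that it equals the multiset D listed above.

    // Sketch: verify that the pairwise differences of X equal the multiset D
    // for the luciferase example above.
    #include <algorithm>
    #include <cassert>
    #include <cstdlib>
    #include <vector>

    int main() {
        std::vector<long> X = {0, 30, 975, 984, 1277, 1377, 1839, 2009};
        std::vector<long> D = {9, 30, 100, 170, 293, 302, 393, 402, 462, 562,
                               632, 732, 855, 864, 945, 954, 975, 984, 1025, 1034,
                               1247, 1277, 1347, 1377, 1809, 1839, 1979, 2009};
        std::vector<long> deltaX;                          // all |xj - xi| with i < j
        for (std::size_t i = 0; i < X.size(); ++i)
            for (std::size_t j = i + 1; j < X.size(); ++j)
                deltaX.push_back(std::labs(X[j] - X[i]));
        std::sort(deltaX.begin(), deltaX.end());
        std::sort(D.begin(), D.end());
        assert(deltaX == D);                               // DeltaX = D holds for this example
        return 0;
    }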
The complexity analysis of the exact solution for PDP is still an open problem
[9–11]. Many research papers have been introduced to find exact and approximate
solutions for PDP [12–20]. The main challenge for finding the exact solution of PDP is
the exponential time required by the best known sequential algorithm in the
worst case. In [21], Zhang gave an example of the worst case instances. Before 2016,
the best practical sequential algorithm for PDP was the algorithm designed by Skiena,
Smith, and Lemke [18]. Recently, Fomin presented an algorithm for PDP, in [19],
which is faster than the Skiena, Smith, and Lemke algorithm in some cases. However,
the Skiena, Smith, and Lemke algorithm is still better than Fomin's algorithm on Zhang's
instances. In the same year, Abbas and Bahig [20] proposed the fastest exact sequential
algorithm for PDP. For Zhang's data, the improvement is greater than 75% over the
Skiena, Smith, and Lemke algorithm.
The goal of this research paper is to reduce the running time of the fastest exact
sequential algorithm [20] because, for large values of n, the running time of the
algorithm proposed by Abbas and Bahig is still high. The algorithm takes
approximately 19 h for n = 90, while the running time of the Skiena, Smith, and
Lemke algorithm for the same n is greater than one day. To achieve this goal, we
use high performance computing to speed up the fastest exact sequential algorithm.
The rest of this paper is organized as follows. In Sect. 2, we briefly describe the fastest exact
sequential algorithm for PDP, the BBb2 algorithm. In Sect. 3, we introduce a new
parallel algorithm for PDP on a multicore system that is based on the BBb2 algorithm. In
Sect. 4, we study the proposed algorithm experimentally with respect to running time,
memory consumption, and scalability in the worst case. Section 5 contains the conclusion
of our work.

2 BBb2 Algorithm

The BBb2 algorithm was proposed by Abbas and Bahig [20] to find the
exact solution of the PDP. The algorithm is based on two main stages. In the first stage,
the algorithm applies the breadth-first strategy while using the two bounding conditions
suggested by Skiena, Smith, and Lemke [18]. In addition, the BBb2 algorithm
deletes all repeated subproblems at the same level; for more details about the condition
for repeated subproblems, see Theorems 1 and 2 in [20]. The subroutine used to
traverse the tree level by level is called GenerateNextLevel [20]. In the first stage,
the search tree is traversed with the breadth-first strategy for a certain number of
levels; this number of levels, aM, is determined by a subroutine called Find_aM, see
[20]. In the second stage of the BBb2 algorithm, the subproblems at level aM are solved
individually, using the breadth-first strategy. The values of all elements at the current
level are represented by the two lists LD and LX. The detailed steps of the BBb2
algorithm are given in [20].
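To illustrate the structure described above, the following is a minimal, self-contained C++ sketch of a simplified breadth-first PDP solver in the spirit of BBb2. It keeps only the basic bounding condition (a candidate point is accepted only if all of its distances to the already placed points occur in the remaining D); the duplicate-removal theorems, the Find_aM level split, and the other optimizations of [20] are deliberately omitted, so the sketch should not be read as the authors' implementation.

    // Simplified breadth-first PDP solver (illustrative sketch only).
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <iterator>
    #include <set>
    #include <vector>

    using Multiset = std::multiset<long>;
    struct State { Multiset D; std::set<long> X; };     // corresponds to (LD, LX)

    // Try to place point p: every distance |p - x|, x in X, must occur in D.
    // On success those distances are removed from D and p is added to X.
    static bool place(long p, Multiset& D, std::set<long>& X) {
        std::vector<long> removed;
        for (long x : X) {
            auto it = D.find(std::labs(p - x));
            if (it == D.end()) {                        // bounding condition fails
                for (long d : removed) D.insert(d);     // undo partial removal
                return false;
            }
            removed.push_back(*it);
            D.erase(it);
        }
        X.insert(p);
        return true;
    }

    int main() {
        Multiset D = {9, 30, 100, 170, 293, 302, 393, 402, 462, 562,
                      632, 732, 855, 864, 945, 954, 975, 984, 1025, 1034,
                      1247, 1277, 1347, 1377, 1809, 1839, 1979, 2009};
        long width = *D.rbegin();                       // maximum(D)
        D.erase(std::prev(D.end()));
        std::vector<State> level = { {D, {0, width}} }; // root of the search tree

        std::vector<std::set<long>> solutions;
        while (!level.empty()) {                        // one breadth-first level per pass
            std::vector<State> next;
            for (State& s : level) {
                if (s.D.empty()) { solutions.push_back(s.X); continue; }
                long y = *s.D.rbegin();                 // largest remaining distance
                for (long p : {y, width - y}) {         // the two candidate positions
                    State t = s;
                    if (place(p, t.D, t.X)) next.push_back(std::move(t));
                }
            }
            level = std::move(next);
        }
        // Note: mirror and duplicate solutions are not filtered in this sketch.
        for (const auto& X : solutions) {
            for (long x : X) std::cout << x << ' ';
            std::cout << '\n';
        }
        return 0;
    }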

3 Parallel Breadth-Breadth Algorithm

In this section, we propose a parallel algorithm, PBBb2, for PDP based on the
BBb2 algorithm under a multicore architecture.
In the PBBb2 algorithm, we parallelize the two main stages of the BBb2 algorithm.
In the parallelization of the first stage, we build the solution tree of PDP with the breadth-first
strategy sequentially until the number of subproblems at a level is greater than or
equal to the number of processors, P. After that, we assign the subproblems to the
processors to work on them until level aM. We can summarize the main steps of the
parallelization of the first stage as follows (an illustrative OpenMP sketch is given after the list).

1. Apply lines 1–6 of the BBb2 algorithm, where T is a list containing the values of D
and X for each subproblem; initially, T contains the single element (D, {0, maximum(D)}).
2. Repeat the following until level aM is reached.
(a) If the number of elements of T is less than P, then apply the procedure
GenerateNextLevel repeatedly, at least once, on T until the number of elements
of T is greater than or equal to P or until level aM is reached. If the algorithm
reaches level aM, terminate the first stage and go to
the second stage.
(b) If the number of elements of T is greater than or equal to P, then do the
following:
(i) Remove the first k*P elements from the list T and assign them to a new
temporary list R, where k is an integer and k = ⌊|T|/P⌋.
(ii) Each processor, pi, works dynamically on one element, e, from R as
follows:
• Add the element e to a temporary list Wi.
• Call the procedure GenerateNextLevel until level aM is reached and
save the output in the list Ti.
(iii) The first processor (among the P processors working on the elements of
R) that finishes its work in (ii) goes back to Step (a).
3. Add the elements of Ti, 0 ≤ i ≤ P − 1, to the list T.
4. Remove the duplicates from the list T.
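A minimal OpenMP sketch of the work distribution in step 2(b) above is given next. It reuses the State structure and the place helper from the sketch in Sect. 2 and, for brevity, expands only one level of each assigned element instead of calling GenerateNextLevel up to level aM; it merely illustrates the dynamic assignment of the elements of R to the processors and the per-processor lists Ti, and is not the authors' code.

    // Illustrative OpenMP pattern for step 2(b): the elements of R are taken
    // dynamically by the P threads; each thread appends its results to its own
    // list Ti, and the lists are merged into T afterwards (steps 3 and 4).
    #include <omp.h>

    std::vector<State> expand_level_parallel(const std::vector<State>& R,
                                             long width, int P) {
        std::vector<std::vector<State>> Ti(P);          // one output list Ti per processor
        #pragma omp parallel for schedule(dynamic, 1) num_threads(P)
        for (int i = 0; i < static_cast<int>(R.size()); ++i) {
            int id = omp_get_thread_num();
            const State& s = R[i];
            if (s.D.empty()) { Ti[id].push_back(s); continue; }  // already complete
            long y = *s.D.rbegin();                     // largest remaining distance
            for (long p : {y, width - y}) {             // the two candidate positions
                State t = s;
                if (place(p, t.D, t.X)) Ti[id].push_back(std::move(t));
            }
        }
        std::vector<State> T;                           // step 3: merge the lists Ti into T
        for (auto& part : Ti) T.insert(T.end(), part.begin(), part.end());
        return T;                                       // step 4 (duplicate removal) omitted here
    }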
In the parallelization of the second stage, we assign the elements of the list T to the
processors, and then each processor works on its assigned element until reaching a leaf of the
search tree in the breadth-first manner, or until the bounding conditions cut this element.
We can summarize the parallelization of the second stage in the following steps
(an illustrative sketch is given after the list).
Repeat the following until the list T is empty.
1. If the number of elements of T is less than P, then apply the procedure
GenerateNextLevel repeatedly, at least once, on T until the number of elements of T is
greater than or equal to P or until the list T is empty. If the list T is empty,
terminate the second stage.
2. If the number of elements of T is greater than or equal to P, then do the
following:
(a) Remove the first k*P elements from the list T and assign them to a new temporary
list R, where k is an integer and k = ⌊|T|/P⌋.
(b) Each processor, pi, works dynamically on each element e ∈ R by executing
lines 10 to 16 of the BBb2 algorithm. If the processor pi finds a solution,
say si, it adds si to the set of solutions S if it does not already exist in S.
(c) The first processor (among the P processors working on the elements of R)
that finishes its work in (b) goes back to Step 1.
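The second stage can be sketched in the same illustrative style (again reusing State and place from the Sect. 2 sketch and making the same simplifying assumptions): each thread drives one assigned element down to the leaves in a breadth-first manner, and any solution is inserted into the shared set S inside a critical section, so duplicates are discarded automatically.

    // Illustrative sketch of step 2(b) of the second stage (not the authors' code).
    #include <omp.h>

    void solve_stage2_parallel(const std::vector<State>& R, long width, int P,
                               std::set<std::set<long>>& S) {
        #pragma omp parallel for schedule(dynamic, 1) num_threads(P)
        for (int i = 0; i < static_cast<int>(R.size()); ++i) {
            std::vector<State> level = { R[i] };
            while (!level.empty()) {                    // breadth first down to the leaves
                std::vector<State> next;
                for (State& s : level) {
                    if (s.D.empty()) {
                        #pragma omp critical
                        S.insert(s.X);                  // skipped if already present in S
                        continue;
                    }
                    long y = *s.D.rbegin();
                    for (long p : {y, width - y}) {
                        State t = s;
                        if (place(p, t.D, t.X)) next.push_back(std::move(t));
                    }
                }
                level = std::move(next);
            }
        }
    }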

4 Performance of PBBb2 Algorithm

In this section, we evaluate the performance of the PBBb2 algorithm by comparing it
with the fastest sequential algorithm, BBb2, on Zhang's data [21] with respect to running time
and memory consumption. We also measure the scalability of the PBBb2 algorithm.
To evaluate the performance of the PBBb2 algorithm, we use the test methodology
presented in [20], while conducting the experiments on a multicore machine. The machine
used in the experimental study is a dual octa-core Intel Xeon E5-2690.
Each processor has a speed of 2.9 GHz. The memory of the
machine is 128 GB RAM, while the cache is 20 MB. Both algorithms were implemented
in the C++ language with OpenMP directives.
In the case of the running time, Table 1 shows the running time of the
PBBb2 and BBb2 algorithms for 35 ≤ n ≤ 90, where the symbols 's', 'm', and 'h'
stand for second, minute, and hour, respectively. The number of processors used to
run the PBBb2 algorithm is P = 2, 4, 6, 8, and 10. If P = 1, the PBBb2 algorithm is
equivalent to the BBb2 algorithm. It is clear that, for fixed values of n, the running time
of the PBBb2 algorithm using P > 1 is smaller than that of the BBb2 algorithm. Also, the
running time of the PBBb2 algorithm decreases as the number of processors increases,
as shown in Fig. 1. In Fig. 1, we use a log scale to represent the running time of PBBb2.

Table 1. Running time for the BBb2 and PBBb2 algorithms

n  | BBb2 (P = 1) | PBBb2 (P = 2) | P = 4    | P = 6   | P = 8   | P = 10
35 | 0.146 s      | 0.079 s       | 0.04 s   | 0.03 s  | 0.025 s | 0.016 s
40 | 0.453 s      | 0.228 s       | 0.125 s  | 0.091 s | 0.078 s | 0.045 s
45 | 1.558 s      | 0.793 s       | 0.441 s  | 0.315 s | 0.28 s  | 0.192 s
50 | 3.923 s      | 1.887 s       | 0.941 s  | 0.714 s | 0.574 s | 0.524 s
55 | 15.801 s     | 6.961 s       | 3.873 s  | 2.932 s | 2.426 s | 2.233 s
60 | 33.134 s     | 15.736 s      | 8.685 s  | 5.956 s | 4.884 s | 4.239 s
65 | 1.967 m      | 0.898 m       | 0.495 m  | 0.352 m | 0.299 m | 0.26 m
70 | 6.438 m      | 2.803 m       | 1.615 m  | 1.099 m | 0.966 m | 0.813 m
75 | 30.027 m     | 16.109 m      | 7.999 m  | 5.507 m | 3.388 m | 2.969 m
80 | 44.329 m     | 22.83 m       | 10.811 m | 6.981 m | 5.984 m | 5.144 m
85 | 4.655 h      | 2.387 h       | 1.056 h  | 0.877 h | 0.824 h | 0.773 h
90 | 18.525 h     | 11.52 h       | 6.104 h  | 4.546 h | 3.74 h  | 3.526 h

Figure 2 represents the scalability of the PBBb2 algorithm as a function of the
number of processors P and the problem size n, where the speedup is defined as the
ratio between the running time of the fastest sequential algorithm and that of the parallel
algorithm. Figure 2 demonstrates that the speedup scales well as the number of
processors increases.
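For example, using the values in Table 1 for n = 90: with P = 4 the speedup is 18.525 h / 6.104 h ≈ 3.04, an efficiency of about 3.04/4 ≈ 76%, and with P = 2 the speedup is 18.525 h / 11.52 h ≈ 1.61, an efficiency of about 80%.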

Fig. 1. Running time of the PBBb2 algorithm as a function of the number of processors (a: 35 ≤ n ≤ 50, time in seconds; b: 55 ≤ n ≤ 75, time in minutes; c: 80 ≤ n ≤ 90).

Figure 3 shows the memory consumption of the PBBb2 algorithm using different
numbers of processors. We use a log scale to represent the memory consumption of
the PBBb2 algorithm. From Fig. 3, we can note that, in most instances, the memory
consumption of the PBBb2 algorithm increases as the number of processors increases.
The reason for the increased memory consumption under parallelism is that the
P processors work on P different subproblems at the same time. In other words, the
processors build different subtrees at the same time in the breadth-first manner.

Fig. 2. Scalability (speedup) of the PBBb2 algorithm as a function of the number of processors (a: 35 ≤ n ≤ 50; b: 55 ≤ n ≤ 70; c: 75 ≤ n ≤ 90).



Fig. 3. Memory consumption (MByte) for the PBBb2 algorithm as a function of the number of processors (a: 35 ≤ n ≤ 50; b: 55 ≤ n ≤ 70; c: 75 ≤ n ≤ 90).

5 Conclusions

In this research paper, we parallelized the fastest exact sequential algorithm for the partial digest
problem, PDP. The main challenge of PDP is the exponential time of the best exact
sequential algorithm in the worst case. The proposed algorithm is based on working on
many independent subproblems at the same time and traversing the search tree with the
breadth-first strategy. The experimental results on a multicore system have shown that
the running time of the parallel algorithm decreases as the number of processors
increases. The average efficiency of the PBBb2 algorithm is 88.53%. Also, the speedup
scales well as the number of processors increases.

References
1. Pevzner, P.: DNA physical mapping and alternating eulerian cycles in colored graphs.
Algorithmica 13(1–2), 77–105 (1995)
2. Devine, J.H., Kutuzova, G.D., Green, V.A., Ugarova, N.N., Baldwin, T.O.: Luciferase from
the east European firefly Luciola mingrelica: cloning and nucleotide sequence of the cDNA,
overexpression in Escherichia coli and purification of the enzyme. Biochimica et Biophysica
Acta (BBA)-Gene Struct. Expr. 1173(2), 121–132 (1993)
3. Baker, M.: Gene-editing nucleases. Nat. Methods 9(1), 23–26 (2012)
4. Sambrook, J., Fritsch, E.F., Maniatis, T.: Molecular Cloning. A Laboratory Manual, 2nd
edn., pp. 1.63–1.70. Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989)
5. He, X., Hull, V., Thomas, J.A., Fu, X., Gidwani, S., Gupta, Y.K., Black, L.W., Xu, S.Y.:
Expression and purification of a single-chain Type IV restriction enzyme Eco94GmrSD and
determination of its substrate preference. Sci. Rep. 5, 9747 (2015)
6. Narayanan, P.: Bioinformatics: A Primer. New Age International (2005)
7. Dear, P.H.: Genome mapping. eLS (2001)
8. Jones, N.C., Pevzner, P.: An Introduction to Bioinformatics Algorithms. MIT Press,
Cambridge (2004)
9. Lemke, P., Werman, M.: On the complexity of inverting the autocorrelation function of a
finite integer sequence, and the problem of locating n points on a line, given the C(n, 2)
unlabelled distances between them. Preprint 453 (1988)
10. Daurat, A., Gérard, Y., Nivat, M.: Some necessary clarifications about the chords’ problem
and the partial digest problem. Theoret. Comput. Sci. 347(1–2), 432–436 (2005)
11. Cieliebak, M., Eidenbenz, S., Penna, P.: Noisy Data Make the Partial Digest Problem
NP-Hard. Springer, Heidelberg (2003)
12. Pandurangan, G., Ramesh, H.: The restriction mapping problem revisited. J. Comput. Syst.
Sci. 65(3), 526–544 (2002)
13. Błażewicz, J., Formanowicz, P., Kasprzak, M., Jaroszewski, M., Markiewicz, W.T.:
Construction of DNA restriction maps based on a simplified experiment. Bioinformatics
17(5), 398–404 (2001)
14. Blazewicz, J., Burke, E.K., Kasprzak, M., Kovalev, A., Kovalyov, M.Y.: Simplified partial
digest problem: enumerative and dynamic programming algorithms. IEEE/ACM Trans.
Comput. Biol. Bioinf. 4(4), 668–680 (2007)
15. Karp, R.M., Newberg, L.A.: An algorithm for analysing probed partial digestion
experiments. Comput. Appl. Biosci. 11(3), 229–235 (1995)
16. Nadimi, R., Fathabadi, H.S., Ganjtabesh, M.: A fast algorithm for the partial digest problem.
Jpn J. Ind. Appl. Math. 28(2), 315–325 (2011)
17. Ahrabian, H., Ganjtabesh, M., Nowzari-Dalini, A., Razaghi-Moghadam-Kashani, Z.:
Genetic algorithm solution for partial digest problem. Int. J. Bioinform. Res. Appl. 9(6),
584–594 (2013)
18. Skiena, S.S., Smith, W.D., Lemke, P.: Reconstructing sets from interpoint distances. In:
Proceedings of the Sixth Annual Symposium on Computational Geometry, pp. 332–339.
ACM (1990)
19. Fomin, E.: A simple approach to the reconstruction of a set of points from the multiset of n²
pairwise distances in n² steps for the sequencing problem: II. Algorithm. J. Comput. Biol. 23,
1–7 (2016)
20. Abbas, M.M., Bahig, H.M.: A fast exact sequential algorithm for the partial digest problem.
BMC Bioinform. 17, 1365 (2016)
21. Zhang, Z.: An exponential example for a partial digest mapping algorithm. J. Comput. Biol.
1(3), 235–239 (1994)
