Professional Documents
Culture Documents
Prior Solutions
Parallel Binary Search implementations with OpenMP do exist, however; rarely is there any specifically real-world timing
data comparison between a serial and parallel implementation. Due to such circumstance, this scientific paper aims to
compare the real-world performance between a parallel and serial implementation. It also aims to investigate any real-world
performance differences between a serial and parallel approach to Binary Search.
2 Design
Typical Serial Binary Search
Typically, a serial approach to non-recursive Binary Search algorithm goes as follows:
# pragma omp s e c t i o n
t h r e a d _ o n e = binary Search_openmp ( 0 , q u a r t e r _ s l i c e , s e a r c h V a l ) ;
# pragma omp s e c t i o n
t h r e a d _ t w o = binary Search_openmp ( q u a r t e r _ s l i c e + 1 , middle , s e a r c h V a l ) ;
# pragma omp s e c t i o n
t h r e a d _ t h r e e = binary Search_openmp ( middle + 1
# pragma , quarter_slice
∗
3 , s e a r c h V a l ) ; omp s e c t i o n
} t h r e a d _ f o u r = binary Search_openmp ( ( q u a r t e r _ s l i c e ∗ 3 ) + 1 , l a s t , s e a r c h V a l ) ;
}
Results
Below, contains both timing results between Parallel Binary Search and Parallel Binary Search using OpenMP. Timing data
was recorded using omp_get_wtime(). Four runs were taken per entry and timing data was averaged. The operation was a
Binary Search, where the user will input any number between 1 and 200,000,00 to search for.
Thread Run time (seconds)
Count
1 0.0000061
2 0.0000269
4 0.0000421
8 0.0000564
Table 2. Serial Binary Search Timing Table 3. Parallel Binary Search Timing
Comparison
k
Efficiency Estimate log(n
2)
k2
Table 4. Theoretical Estimates of Parallel Binary Search
Real-World Results
Referring to the data displayed in Table 2 and Table 3, the serial implementation of Binary Search is significantly faster than
the parallel implementation. As the number of threads performing Binary Search increased, the operation took exponentially
longer. There are a few reasons I can hypothesize as to why these results occurred.
Effectivity Limits of Parallel Solutions
One crucial idea to note, when finding a parallel approach to a problem, is that the serial operation needs to be costly
enough to computer resources to parallelize. Often times, complex calculations such as matrix multiplication or dot product
are used to demonstrate parallel effectivity, for a specific reason. That is, the operation needs to be costly enough to
outweigh the performance repercussions of a parallel solution. Specific performance repercussions such as thread spawning,
excessive function calls, voltage and thermal limitations across several threads, and cache hierarchy issues can all play a
part in slowing down performance. In the case of Binary Search, the operation simply is not costly enough to make useful
out of a parallel approach. OpenMP is not a solution to all problems.
Implementation
Listing 3. binarySearch_openMP.c
#include"stdio.h"
#include"stdlib.h
"#include"string.
h"
# i n c l u d e < s y s / t im e . h> / / High r e s o l u t i o n t i m e r
# i n c l u d e <omp . h>
/ / High r e s o l u t i o n t i m e r
inlineuint64_t rdtsc()
{
u i n t 3 2 _ t lo , h i ;
__asm volatile__(
" x o r l %%eax , %%e ax \ n "
"cpuid\n"
"rdtsc\n"
: " =a " ( l o ) , " =d " ( h i )
:
: "%ebx " , "%ecx " ) ;
return ( u i n t 6 4 _ t ) h i << 32 | l o ;
}
/ ∗
void r e a d I n p u t F i l e ( ) −
Takes c o n t e n t s from f i l e i n p u t 1 . t x t and p l a c e s i t i n t o array randomNums [ ]
∗/
void r e a d I n p u t F i l e ( )
{
FILE i n∗p u t F i l e ;
i n p u t F i l e = fopen ( " i n p u t 1 . t x t " , " r " ) ;
inti;
i f ( i n p u t F i l e == NULL)
{
p r i n t f ( " E r r o r Reading F i l e i n p u t 1 . t x t \ n " ) ; e
xit(0);
}
close(inputFile);
printf("\n");
}
/ ∗
void b i n a r y S e a r c h _ s e r i a l ( ) −
Performs s e r i a l b i nary s earch on randomNums [ ]
∗/
intbinarySearch_serial(int searchVal)
{
i n t num_elements = ARR_SIZE ;
intfirst=0;
i n t l a s t = num_elements − 1 ;
while ( f i r s t <= l a s t )
{
i n t middle = f i r s t + ( l a s t − f i r s t ) / 2 ;
/ ∗
void binary Search_openmp ( ) −
Performs p a r a l l e l b i nary s earch u s i ng OpenMP on randomNums [ ]
∗/
i n t binary Search_openmp ( i n t f i r s t , i n t l a s t , i n t s e a r c h V a l )
{
/ / / / P r i n t c u r r e n t t h r ead number
/ / i n t t i d = omp_get_thread_num ( ) ;
/ / p r i n t f ( " Hello World from t h r ead = %d \ n " , t i d ) ;
while ( f i r s t <= l a s t )
{
i n t middle = f i r s t + ( l a s t − f i r s t ) / 2 ;
/ ∗
void s e r i a l _ w o r k ( ) −
Performs a l l work r e l a t e d t o S e r i a l code , i n c l u d i n g :
— Allprint statements
— Timing o f S e r i a l work
— Binary s earch f u n c t i o n c a l l s
∗/
void s e r i a l _ w o r k ( i n t s e a r c h V a l )
{
intresult;
double s t a r t , end , t o t a l _ t i m e ;
s t a r t = omp_get_wtime ( ) ;
/ / Perform s e r i a l Binary s earch with t i m i n g
r e s u l t = b i n a r y S e a r c h _ s e r i a l ( s e a r c h V a l ) ; / / Binary s earch here
end = omp_get_wtime ( ) ;
t o t a l _ t i m e = end − s t a r t ;
/ / P r i n t r e s u l t s o f s e r i a l Binary s earch
if (result!= 1)−
{
p r i n t f ( " Element %d found ! At i n d e x %d \ n " , s e a r c h V a l , r e s u l t ) ;
}
else
{
p r i n t f ( " Element %d n o t found \ n " , s e a r c h V a l ) ;
}
printf("\n");
}
/ ∗
void p a r a l l e l _ w o r k ( ) −
Performs a l l work r e l a t e d t o P a r a l l e l i z e d code , i n c l u d i n g :
— Allprint statements
— Timing o f p a r a l l e l work
— Binary s earch f u n c t i o n c a l l s
∗/
void p a r a l l e l _ w o r k ( i n t s e a r c h V a l , i n t num_threads )
{
intresult;
double s t a r t , end , t o t a l _ t i m e ;
/ / Array w i l l be s l i c e d i n t o s e c t i o n s
intthread_one, thread_two, thread_three, thread_four;
i n t q u a r t e r _ s l i c e = middle / 2 ;
s t a r t = omp_get_wtime ( ) ;
# pragma omp s e c t i o n
t h r e a d _ o n e = binary Search_openmp ( 0 , q u a r t e r _ s l i c e , s e a r c h V a l ) ;
# pragma omp s e c t i o n
t h r e a d _ t w o = binary Search_openmp ( q u a r t e r _ s l i c e + 1 , middle , s e a r c h V a l ) ;
# pragma omp s e c t i o n
t h r e a d _ t h r e e = binary Search_openmp ( middle + 1
# pragma , quarter_slice
∗
3 , s e a r c h V a l ) ; omp s e c t i o n
} t h r e a d _ f o u r = binary Search_openmp ( ( q u a r t e r _ s l i c e ∗ 3 ) + 1 , l a s t , s e a r c h V a l ) ;
}
end = omp_get_wtime ( ) ;
t o t a l _ t i m e = end − s t a r t ;
p r i n t f ( " Work t o ok %f s e c o n d s \ n " , t o t a l _ t i m e ) ;
/ / P r i n t r e s u l t s o f s e r i a l Binary s earch
if (result!= 1)−
{
p r i n t f ( " Element %d found ! At i n d e x %d \ n " , s e a r c h
} Val, result);
else
{
printf("Entervaluetosearchfor: \n");s
c a n f ( "%d " , &s e a r c h V a l ) ;
/ / Perform a l l s e r i a l work
serial_work(searchVal);
/ / Perform a l l p a r a l l e l i z e d work
p a r a l l e l _ w o r k ( s e a r c h V a l , num_threads ) ;
return 0 ;
}
Listing 4. integer_generator.py
’’’
Generates random i n t e g e r s and p l a c e s a l l data i n t o i n p u t 1 . t x t Each i n t
e g e r w i l l be w r i t t i n i n t o a new l i n e i n i n p u t 1 . t x t
’’’
import random
n u m s _ t o _ g e n e r a t e = 200000000
lower_bound = 0
upper_bound = n u m s _ t o _ g e n e r a t e
f o r i in range ( n u m s _ t o _ g e n e r a t
e):line=str(i)
output_file.write(line)
output_file.write(’\n’)
output_file.close()