Chapter-5:
Data Structures and Algorithms
5.2 Unsorted vs Sorted Lists: Binary Search Algorithm
Programming using Data-Structures and Algorithms
●
The OOP way:
– Focus on “What to do” rather than “How to do it” (abstraction)
– Class takes responsibility for handling index numbers privately
– Provide higher level interfaces to handle any data structures and
organize more complex data
●
Example using Lists:
– Explicit search: scan through all the items of the list and return True
if the item is found, False otherwise
– Implicit search: use a higher level interface (no need to bother
about the detailed implementation). Just do: (item in list)
●
Obviously a programmer should:
– be able to take advantage of any built-in options (application focus)
– have some kind of understanding of the underlying implementation
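The two search styles above can be compared side by side. This is a minimal sketch; the function name explicit_search and the sample list are illustrative assumptions, not part of the original slides.

```python
def explicit_search(items, target):
    """Scan every item and report whether target is present (hypothetical helper)."""
    for value in items:
        if value == target:
            return True
    return False

items = [4, 2, 1, 3]

# Explicit search: we write the scan ourselves
print(explicit_search(items, 1))   # True
print(explicit_search(items, 7))   # False

# Implicit search: the built-in 'in' operator hides the scan
print(1 in items)                  # True
print(7 in items)                  # False
```

Both forms give the same answers; the implicit form simply leaves the scanning loop to the list class.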
Basic operations on Lists
●
Insertion of new item (append)
list1 = [4, 2, 1, 3]
list1.append(5) #append at the end
●
Shifting items (shift up or down)
#shift up- all 5 items (a free slot is needed at the end first)
list1.append(None)
for i in range(5,0,-1):
    list1[i]=list1[i-1]
#shift down- all 5 items (list1[0] is overwritten)
for i in range(5):
    list1[i]=list1[i+1]
list1.pop() #drop the leftover duplicate in the last slot
●
Searching: step through the list (scan) until the item is found.
Some built-in approaches:
– list1.index(item) #returns index of item
– (item in list1) #returns True/False
●
Permutation of two items
#traditional way
temp=list1[1]
list1[1] = list1[2]
list1[2]=temp
#python way
list1[1],list1[2]=list1[2],list1[1]
●
Deleting an item
– Begin with a search
– Once item found, shift all items with higher index down
– Example (deleting 2): 4 2 1 3 5 becomes 4 1 3 5
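Deleting an item combines the two steps named on this slide: a search, then a shift down of every item with a higher index. The sketch below assumes a hypothetical helper called delete_item; in everyday code list1.remove(item) does the same job.

```python
def delete_item(items, target):
    """Remove the first occurrence of target by search + shift down.

    Returns True if an item was removed, False if target was not found.
    (Hypothetical helper written for illustration.)
    """
    # Step 1: search for the item
    pos = -1
    for i in range(len(items)):
        if items[i] == target:
            pos = i
            break
    if pos == -1:
        return False

    # Step 2: shift all items with a higher index down by one position
    for i in range(pos, len(items) - 1):
        items[i] = items[i + 1]

    # The last slot now holds a duplicate, so drop it
    items.pop()
    return True

list1 = [4, 2, 1, 3, 5]
delete_item(list1, 2)
print(list1)   # [4, 1, 3, 5]
```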
Unsorted Lists
●
Lists represent a sequence of N items that are unsorted by default
●
To analyze a particular algorithm that operates on lists, one must look
at the operation counts.
●
A computer program isn't able to see the big picture. Algorithms must
rely on performing basic steps: (i) Compare two items, (ii) Swap two
items; (iii) Remove or Insert one item; (iv) Move/Shift an item
●
Searching:
– best case: only 1 comparison needed to find the correct item
– worst case: N comparisons needed (scan through the entire list)
– on average: about N/2 comparisons
●
Insertion (at the end of the list): 0 comparisons, 1 move (assignment)
●
Deletion: N/2 comparisons (search), N/2 moves (shift down)
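The comparison counts above can be checked by instrumenting a linear search. This is a sketch under stated assumptions: the function name linear_search_count and the sample data are illustrative. Averaged over all N items that are present, the exact count is (N+1)/2, which is approximately N/2 for large N.

```python
def linear_search_count(items, target):
    """Return (index, comparisons); index is -1 if target is absent."""
    comparisons = 0
    for i, value in enumerate(items):
        comparisons += 1
        if value == target:
            return i, comparisons
    return -1, comparisons

items = [4, 2, 1, 3, 5]

print(linear_search_count(items, 4))   # best case: (0, 1)
print(linear_search_count(items, 5))   # worst case, item present: (4, 5)
print(linear_search_count(items, 9))   # item absent: (-1, 5)

# Average number of comparisons over every item in the list
total = sum(linear_search_count(items, x)[1] for x in items)
print(total / len(items))              # 3.0, i.e. (N+1)/2 with N=5
```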
Unsorted Lists
●
Summary:
– insertion is fast
– search is “slow”, depends linearly on the number of items N
– deletion is even “slower”
●
Concrete example of linear search
https://www.youtube.com/watch?v=oc9H8bo8yg0
Can we do better if the list is sorted?
●
Concrete example of binary search (2 minutes into the video)
https://www.youtube.com/watch?v=REhqoLlRJwY
●
Bottom line: binary search can make you rich...
Sorted Lists
●
The N data items are stored in ascending or descending order
●
Why use an ordered (sorted) list?
– Searching an unordered list is rather slow (N/2 comparisons on
average): it uses a “linear search” algorithm
– Searching an ordered list is very fast using the “binary search”
algorithm (a good example of a data structure finely tuned to improve
the efficiency of an algorithm)
●
Drawback
– List must be sorted
– Insertion takes longer: all items with higher key values must be
shifted up
●
Ordered lists are therefore useful in situations where searches are
frequent but insertions and deletions are not.
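The insertion cost can be illustrated with Python's standard bisect module, which finds the insertion point in a sorted list and inserts there. This is only a sketch: the sample list and the variable names are assumptions. Finding the position is fast, but every item after it still has to be shifted up.

```python
import bisect

sorted_list = [1, 3, 5, 7, 9]
new_item = 4

# Position where new_item must go to keep the list sorted
pos = bisect.bisect_left(sorted_list, new_item)

# Every item from pos to the end has a higher key and must shift up
items_to_shift = len(sorted_list) - pos

# insort performs the search and the insertion in one call
bisect.insort(sorted_list, new_item)

print(pos)              # 2
print(items_to_shift)   # 3
print(sorted_list)      # [1, 3, 4, 5, 7, 9]
```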
Sorted Lists- Binary search
●
Analogy: The guess-a-number game!
– Choose a number between 1 and 100: 33
– OK, let me guess...
●
(1-100) 50? ... nope, too high
●
(1-49) 25? ... nope, too low
●
(26-49) 37? ... nope, too high
●
(26-36) 31? ... nope, too low
●
(32-36) 34? ... nope, too high
●
(32-33) 32? ... nope, too low
●
(33-33) 33? ... Correct!
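The guessing game can be replayed in a few lines. This is a sketch; the function name guess_number is an assumption made for illustration. Each guess is the midpoint of the remaining range (rounded down), and the range is cut according to the "too high" or "too low" answer.

```python
def guess_number(secret, low=1, high=100):
    """Return the list of guesses made while halving the range [low, high]."""
    guesses = []
    while low <= high:
        guess = (low + high) // 2   # midpoint of the current range
        guesses.append(guess)
        if guess == secret:
            return guesses
        elif guess > secret:        # too high: keep the lower part
            high = guess - 1
        else:                       # too low: keep the upper part
            low = guess + 1
    return guesses

print(guess_number(33))   # [50, 25, 37, 31, 34, 32, 33]
```

The seven guesses match the sequence in the game above.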
Binary Search Algorithm
[Figure: the search range in mylist, with markers lower, mid and upper, and the target x]
def binarySearch(mylist, x):
    """Iterative Binary Search Function
    It returns location of x in given list 'mylist' if present,
    else returns -1"""
    lower=0 #original lowerbound
    upper=len(mylist)-1 #original upperbound
    while lower <= upper:
        mid = lower + (upper - lower)//2 # find the middle
        # Check if x is present at mid
        if mylist[mid] == x:
            return mid #found it- return index
        # If x is greater, ignore left half
        elif mylist[mid] < x:
            lower = mid + 1
        # If x is smaller, ignore right half
        else:
            upper = mid - 1
    # If we reach here, then the element was not present
    return -1

a=[2,45,78,100,234,345,444]
print(binarySearch(a, 100))
print(binarySearch(a, 101))

Output:
3
-1
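For comparison, Python's standard library already provides binary search on sorted lists through the bisect module. The wrapper below is a sketch, and its name binary_search_bisect is an assumption; bisect_left only returns an insertion point, so the wrapper has to check that the item is really there.

```python
import bisect

def binary_search_bisect(sorted_items, x):
    """Return the index of x in sorted_items, or -1 if x is absent."""
    pos = bisect.bisect_left(sorted_items, x)   # leftmost insertion point
    if pos < len(sorted_items) and sorted_items[pos] == x:
        return pos
    return -1

a = [2, 45, 78, 100, 234, 345, 444]
print(binary_search_bisect(a, 100))   # 3
print(binary_search_bisect(a, 101))   # -1
```

This is the "take advantage of built-in options" approach: same results as the hand-written loop, with the index handling left to the library.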
Binary Search: Number of steps?
●
In the worst case scenario, the algorithm progresses until one item is left in
the search range. In practice, at each step the range of values is
approximately divided by 2 (+ or – 1 item).
●
In theory, we can consider that the number of items N keeps being exactly
divided by 2 until approximately one item is left
●
Example: N=100
Step s   N/2^s                          Number of items (rounded up)
1        100/2 = 50.0         (N/2^1)   50
2        50/2 = 25.0          (N/2^2)   25
3        25/2 = 12.5          (N/2^3)   13
4        12.5/2 = 6.25        (N/2^4)   7
5        6.25/2 = 3.125       (N/2^5)   4
6        3.125/2 = 1.5625     (N/2^6)   2
7        1.5625/2 = 0.78125   (N/2^7)   1
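The halving in the example can be reproduced with a short loop. This is a sketch that assumes the remaining range is rounded up at each step; the function name halving_steps is illustrative.

```python
import math

def halving_steps(n):
    """Count how many times n must be halved (rounding up) to reach 1 item."""
    steps = 0
    while n > 1:
        n = math.ceil(n / 2)
        steps += 1
    return steps

print(halving_steps(100))   # 7
```

For N=100 the remaining item counts are 50, 25, 13, 7, 4, 2, 1, which is 7 steps.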
Binary Search: Analysis
●
Mathematically, the number of steps s needed must satisfy: $\frac{N}{2^s} = 1$
●
Problem: N is given, find s such that $N = 2^s$
Solution: $s = \frac{\log(N)}{\log(2)} = \log_2(N)$
– log = ln = $\log_e$ is the natural log
– $\log(e) = 1$ [log base e] and $\log_2(2) = 1$ [log base 2]
Linear vs Binary search
N               Linear search s=N/2   log2(N)   Binary search s
10              5                     3.32      4
100             50                    6.64      7
1,000           500                   9.97      10
10,000          5,000                 13.3      14
100,000         50,000                16.6      17
1,000,000       500,000               19.9      20
10,000,000      5,000,000             23.3      24
100,000,000     50,000,000            26.6      27
1,000,000,000   500,000,000           29.9      30
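The table can be regenerated from the two formulas: N/2 for linear search and log2(N) rounded up for binary search. This is a minimal sketch; the function name search_steps is an assumption.

```python
import math

def search_steps(n):
    """Return (linear, binary) step estimates for a list of n items."""
    linear = n // 2                     # average comparisons, linear search
    binary = math.ceil(math.log2(n))    # log2(N) rounded up, binary search
    return linear, binary

# Print one row per power of ten, from 10 to 1,000,000,000
for k in range(1, 10):
    n = 10 ** k
    linear, binary = search_steps(n)
    print(f"{n:>13,} {linear:>11,} {math.log2(n):>6.2f} {binary:>3}")
```

Going from ten items to a billion multiplies the linear search cost by 100 million, while binary search grows only from 4 steps to 30.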