
RABIN KARP ALGORITHM

20MIA1145 VARUN VAISHNAV

G-drivelink: https://drive.google.com/file/d/1KlXyM7ZGDifubIeAPz8NHcXM4T_wOegQ/view?usp=sharing
CONTENTS
• What is Rabin Karp Algorithm

• History of Rabin Karp

• How Rabin Karp Works

• Examples of Rabin Karp

• Complexity

• Real Life Applications


What is Rabin Karp
• The Rabin-Karp algorithm is a string-matching algorithm that uses a
hash function to search for a pattern in a text. Unlike naive string
matching, it does not compare characters at every position in the initial
phase; it first filters out the positions whose hash does not match and
only then performs a character comparison.

• It uses a rolling hash to quickly filter out positions of the text that cannot
match the pattern, and then checks for a match at the remaining
positions. It first compares the hash values of the strings rather than the
strings themselves, and compares characters only where the hash values agree.
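The rolling hash mentioned above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names, the base of 256 and the prime modulus 1_000_003 are assumptions chosen for the example and do not come from the slides.

```python
def direct_hash(s, base=256, q=1_000_003):
    """Hash a string from scratch as a base-`base` number modulo q."""
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % q
    return h


def window_hashes(text, m, base=256, q=1_000_003):
    """Return the hash of every length-m window of text.

    Only the first window is hashed from scratch; each later hash is
    derived from the previous one in constant time.
    """
    n = len(text)
    if m <= 0 or m > n:
        return []
    high = pow(base, m - 1, q)      # weight of the leftmost character
    h = direct_hash(text[:m], base, q)
    hashes = [h]
    for s in range(1, n - m + 1):
        # Drop text[s - 1], shift left by one digit, append text[s + m - 1].
        h = ((h - ord(text[s - 1]) * high) * base + ord(text[s + m - 1])) % q
        hashes.append(h)
    return hashes


hashes = window_hashes("abracadabra", 4)
# Windows 0 and 7 are both "abra", so their hashes agree.
print(hashes[0] == hashes[7])
```

Because each update touches only the outgoing and the incoming character, the cost per position does not depend on the pattern length.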
History of Rabin Karp
• The Rabin–Karp algorithm or Karp–Rabin algorithm is a string searching
algorithm created by Richard M. Karp and Michael O. Rabin in the year
1987, that uses hashing to find an exact match of a pattern string in a text.

• The basic principle employed in the Rabin-Karp algorithm is hashing. Every
substring of the given text that has the same length as the pattern is
converted to a hash value and compared with the hash value of the pattern;
only substrings whose hash values match are candidates for a match.
How does Rabin Karp work?
• The Rabin–Karp algorithm proceeds by computing, at each position of the
text, the hash value of a string starting at that position with the same
length as the pattern. If this hash value equals the hash value of the
pattern, it performs a full comparison at that position.

• If the hash values are unequal, the algorithm computes the hash value
of the next M-character sequence, where M is the length of the pattern.
If the hash values are equal, the algorithm compares the pattern and the
M-character sequence character by character. In this way, there is only
one hash comparison per text subsequence, and character matching is only
required when the hash values match.
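The steps above can be put together in a short Python sketch. The function name rabin_karp, the base of 256 and the prime modulus 1_000_003 are assumptions made for illustration; the slides do not prescribe particular values.

```python
def rabin_karp(text, pattern, base=256, q=1_000_003):
    """Return every shift at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, q)      # weight of the leftmost character
    p_hash = t_hash = 0
    for i in range(m):              # hash the pattern and the first window
        p_hash = (p_hash * base + ord(pattern[i])) % q
        t_hash = (t_hash * base + ord(text[i])) % q
    matches = []
    for s in range(n - m + 1):
        # Full comparison only when the hash values are equal.
        if t_hash == p_hash and text[s:s + m] == pattern:
            matches.append(s)
        if s < n - m:               # roll the hash to the next window
            t_hash = ((t_hash - ord(text[s]) * high) * base
                      + ord(text[s + m])) % q
    return matches


print(rabin_karp("abracadabra", "abra"))  # [0, 7]
```

The slice comparison is what rules out spurious hits, so the result is correct for any choice of modulus; a larger q only makes those extra comparisons rarer.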
EXAMPLES OF RABIN KARP
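Let's use a problem as an example here:
For string matching, working modulo q = 11, how many spurious hits does the
Rabin-Karp matcher encounter in the text T = 31415926535.......
when looking for the pattern P = 26?
Here,
T = 31415926535
T.length = 11
P = 26
q = 11, so
P mod q = 26 mod 11 = 4
We must find every 2-digit window of T whose value mod q equals 4, and
then check which of those windows is an exact match of P: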

T= 3 1 4 1 5 9 2 6 5 3 5

P= 2 6

S=0: window 31, 31 mod 11 = 9, not equal to 4

S=1: window 14, 14 mod 11 = 3, not equal to 4

S=2: window 41, 41 mod 11 = 8, not equal to 4

S=3: window 15, 15 mod 11 = 4, equal to 4 (Hit)

S=4: window 59, 59 mod 11 = 4, equal to 4 (Hit)

S=5: window 92, 92 mod 11 = 4, equal to 4 (Hit)

S=6: window 26, 26 mod 11 = 4, equal to 4 (Exact match)

S=7: window 65, 65 mod 11 = 10, not equal to 4

S=8: window 53, 53 mod 11 = 9, not equal to 4

S=9: window 35, 35 mod 11 = 2, not equal to 4

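The pattern occurs with shift 6 because it is an exact match. The hits at
shifts 3, 4 and 5 have the same value mod 11 but are not equal to P, so the
matcher encounters 3 spurious hits.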
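The worked example can be checked with a few lines of Python. This sketch mirrors the slide arithmetic by taking each window's value mod q directly instead of using a rolling update; the function name classify_hits is hypothetical.

```python
def classify_hits(text, pattern, q):
    """Split the hash hits of a digit string into exact and spurious shifts."""
    m = len(pattern)
    p_val = int(pattern) % q
    exact, spurious = [], []
    for s in range(len(text) - m + 1):
        window = text[s:s + m]
        if int(window) % q == p_val:        # hash hit
            if window == pattern:
                exact.append(s)             # genuine occurrence
            else:
                spurious.append(s)          # same hash, different digits
    return exact, spurious


exact, spurious = classify_hits("31415926535", "26", 11)
print(exact)     # [6]
print(spurious)  # [3, 4, 5]
```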


COMPLEXITY
• The running time of the Rabin-Karp algorithm in the worst case is
O((n-m+1)m), where n is the length of the text and m is the length of the
pattern, but it has a good average-case running time. If the expected
number of valid shifts is small, O(1), and the prime q is chosen to be
quite large, then the Rabin-Karp algorithm can be expected to run in time
O(n+m) plus the time required to process spurious hits.

• The time complexity of the searching phase of the Karp-Rabin
algorithm is O(mn) in the worst case (for instance, when searching for
a^m in a^n). Its expected number of text character comparisons is O(n+m).
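The worst case can be observed by counting character comparisons. The sketch below is illustrative only: the function name count_comparisons, the base of 256 and the small prime 101 are assumptions. When the pattern is m copies of "a" and the text is n copies of "a", every window is a hash hit and is verified in full, which gives (n-m+1)m comparisons.

```python
def count_comparisons(text, pattern, base=256, q=101):
    """Count the character comparisons made while verifying hash hits."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0
    high = pow(base, m - 1, q)      # weight of the leftmost character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % q
        t_hash = (t_hash * base + ord(text[i])) % q
    comparisons = 0
    for s in range(n - m + 1):
        if t_hash == p_hash:        # hash hit: verify character by character
            for j in range(m):
                comparisons += 1
                if text[s + j] != pattern[j]:
                    break
        if s < n - m:               # roll the hash to the next window
            t_hash = ((t_hash - ord(text[s]) * high) * base
                      + ord(text[s + m])) % q
    return comparisons


n, m = 20, 5
print(count_comparisons("a" * n, "a" * m))  # 80, i.e. (n - m + 1) * m
```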
REAL LIFE APPLICATIONS
• Plagiarism Detection: The documents to be compared are decomposed
into string tokens and compared using string matching algorithms. Thus,
these algorithms are used to detect similarities between them and
declare if the work is plagiarized or original.

• Bioinformatics and DNA Sequencing: Bioinformatics involves applying
information technology and computer science to problems involving
genetic sequences. String matching algorithms are used in DNA analysis
to find the occurrences of a set of patterns in a sequence.
• Digital Forensics: String matching algorithms are used to locate specific
text strings of interest in digital forensic data, which are useful for
the investigation.