Written by:
Faculty:
Indah Ayu Yuliani, ST, MM.
Class:
2SE1
CEP CCIT - Fakultas Teknik Universitas Indonesia Gedung Engineering Center Lt. 1,
Kampus Baru UI Depok 16424
PREFACE
The author gives praise and thanks to God Almighty, whose abundant grace and blessings
made it possible to complete this paper, "Rabin Karp Algorithm Implementation With Java",
in a timely manner. This paper was prepared to fulfill the assignment of the Information and
Communication Technology course.
I would like to express my gratitude to the lecturers of the Java Programming subject, who
have allowed me to compile this paper. I am aware that this paper is far from perfect. For this
reason, I welcome suggestions and criticism so that the next paper can be improved.
Thank you, and I hope this paper makes a positive contribution for all of us.
Author
TABLE OF CONTENTS
PREFACE
TABLE OF CONTENTS
TABLE OF FIGURES
CHAPTER I
INTRODUCTION
1.1 Background
1.2 Writing Objective
1.3 Problem Domain
1.4 Writing Methodology
1.5 Writing Framework
CHAPTER II
BASIC THEORY
2.1 Algorithm
2.2 Rabin Karp's algorithm
2.3 How Rabin-Karp Algorithm Works
2.4 Compare Rabin Karp Algorithm with Another Algorithm
2.5 Advantages and disadvantages of the Rabin Karp Algorithm
CHAPTER III
PROBLEM ANALYSIS
3.1 Rabin-Karp Implementation on Java Programming
3.2 Dry Run Table
CHAPTER IV
CONCLUSION AND SUGGESTION
4.1 Conclusion
4.2 Suggestion
BIBLIOGRAPHY
TABLE OF FIGURES
CHAPTER I
INTRODUCTION
1.1 Background
With advances in information technology, some work, such as processing data, can be done
more easily with the help of computers. Data processed with the help of a computer can be
handled more effectively and efficiently to produce the desired information. However,
conveniences such as copying digital files can also have a negative impact on the interests
of groups and individuals, one of which is document plagiarism. One way to detect the
copying of a document is to compare the journals suspected of being copied. The comparison
is done by calculating the percentage of similarity of the words in the journals. An
algorithm that can be used for this purpose is the Rabin-Karp algorithm.
application according to the function of the Rabin-Karp algorithm, and a comparison of the
Rabin-Karp algorithm with one of the other algorithms.
1. Chapter I Introduction
This chapter describes the background of the problem, the problem boundary, the
purpose of the writing, the writing methodology used, and the writing framework of
the paper.
2. Chapter II Basic Theory
This chapter describes what an algorithm is in general, what the Rabin-Karp algorithm
is, and the principles and functions of the Rabin-Karp algorithm.
3. Chapter III Problem Analysis
This chapter discusses in more depth some examples of pattern searching using the
Rabin-Karp algorithm, explains how the Rabin-Karp algorithm works, and compares
the Rabin-Karp algorithm with one of the other algorithms.
4. Chapter IV Conclusion and Suggestion
This chapter contains the author's conclusions based on the experience gained from
the research, along with useful suggestions from various sources.
CHAPTER II
BASIC THEORY
2.1 Algorithm
Several experts define the algorithm as follows:
From these definitions, it can be concluded that an algorithm is a series of systematic
(sequential) steps to solve a problem. These steps are not written in a programming
language; they are steps that will later be converted into a programming language.
1. Remove punctuation and convert the source text and the word you want to search
for into words without capital letters.
2. Divide the text into grams of the length specified by the k-gram value.
3. Compute the hash value of each gram with the hash function.
4. Look for hash values that are the same between the two texts.
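The four steps above can be sketched in Java as follows. This is only an illustrative sketch, not the paper's program: the class name KGramSimilarity, the base D = 256, the modulus Q = 10007, and the choice to strip whitespace together with punctuation are all assumptions made for this example.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class KGramSimilarity {
    // Assumed constants (not given in the paper): hash base and prime modulus.
    static final int D = 256;
    static final int Q = 10007;

    // Step 1: lowercase the text and drop everything except letters and digits.
    static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
    }

    // Steps 2 and 3: split the text into k-grams and hash each one.
    static Set<Integer> kGramHashes(String text, int k) {
        Set<Integer> hashes = new HashSet<>();
        String s = normalize(text);
        for (int i = 0; i + k <= s.length(); i++) {
            int h = 0;
            for (int j = 0; j < k; j++) {
                h = (h * D + s.charAt(i + j)) % Q;
            }
            hashes.add(h);
        }
        return hashes;
    }

    // Step 4: count the hash values that occur in both texts.
    static int sharedHashes(String a, String b, int k) {
        Set<Integer> common = kGramHashes(a, k);
        common.retainAll(kGramHashes(b, k));
        return common.size();
    }
}
```

Because different grams can collide on the same hash value, a count obtained this way is an estimate of similarity rather than an exact measure.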
2.3 How Rabin-Karp Algorithm Works
1. Hash Functions: The Rabin-Karp algorithm uses a hash function to calculate the hash
value of the searched pattern and of the constantly shifting window within the text.
This hash function must be deterministic, meaning that if the inputs provided are
the same, then the output will always be the same. In addition, an efficient hash
function is very important for the performance of the algorithm.
2. Initialization: The first step in the Rabin-Karp algorithm is to initialize. We need to
calculate the hash value of the searched pattern and of the first window in the text.
The first window must have the same length as the pattern being searched. The
hash value of the pattern is calculated only once and is reused in every later
comparison.
3. Hash Comparison: After initialization, the Rabin-Karp algorithm compares the hash
value of the searched pattern with the hash value of the first window in the text. If
these hash values are the same, there is a possible match. However, there may be
false positives, which occur when two different strings produce the same hash
value. Therefore, after a match has occurred in the hash value, it is necessary to do
a character-by-character comparison to confirm the actual match.
4. Slide Window: If no match occurs in the previous step, the window is shifted to
the right by one character. The hash value of the new window is calculated from
the hash value of the previous window, the character that leaves the window, and
the new character that enters it. This avoids recalculating the hash of the
overlapping substring from scratch.
5. Steps 3 and 4 are repeated until the window reaches the end of the text or a match is
found for the pattern you're looking for. If a match occurs, we can take appropriate
action, such as noting the position of the match or stopping the search if we just
want to know whether the pattern is present in the text.
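The window update in step 4 can be illustrated with a small Java sketch. The names RollingHashDemo, hash, highOrderWeight, and roll are hypothetical, and the constants D = 256 and Q = 101 are assumptions chosen for this example; the point is only that rolling the previous hash forward gives the same value as hashing the new window from scratch.

```java
class RollingHashDemo {
    // Assumed constants (not from the paper): hash base and a small prime modulus.
    static final int D = 256;
    static final int Q = 101;

    // Hash of s[start .. start+len-1], computed from scratch with Horner's rule.
    static int hash(String s, int start, int len) {
        int h = 0;
        for (int i = 0; i < len; i++) {
            h = (h * D + s.charAt(start + i)) % Q;
        }
        return h;
    }

    // Weight of the leftmost character in a window of length len: D^(len-1) mod Q.
    static int highOrderWeight(int len) {
        int w = 1;
        for (int i = 0; i < len - 1; i++) {
            w = (w * D) % Q;
        }
        return w;
    }

    // Slide the window one character to the right: drop 'outgoing', append 'incoming'.
    static int roll(int prev, char outgoing, char incoming, int weight) {
        int h = (D * (prev - outgoing * weight) + incoming) % Q;
        return h < 0 ? h + Q : h; // Java's % can return a negative remainder
    }
}
```

Each call to roll does a constant amount of work, regardless of the window length, which is where the algorithm saves time compared with rehashing every window.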
1. Rabin-Karp traces the text characters one by one in a contiguous series, and its
comparison process (the hash key calculation) is relatively easy (with Horner's rule,
each hash key can be calculated from the previous hash key), while Knuth-Morris-
Pratt "jumps" over several characters in the series after processing the border (a
prefix of the pattern that is also a suffix), which is relatively more difficult to
compute.
2. Rabin-Karp has a worse worst-case complexity than Knuth-Morris-Pratt, O(nm)
against O(n + m), which implies a longer string matching time in unfavorable cases.
3. Rabin-Karp needs less extra memory than Knuth-Morris-Pratt, which needs to store
the border (prefix and suffix) table.
So, it can be seen that the Rabin-Karp algorithm and the KMP algorithm each have their
own advantages and disadvantages, and we can choose between them according to the
needs of the program we want to create.
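To make the extra memory used by Knuth-Morris-Pratt concrete, the sketch below builds its border table for a pattern. It is a standard textbook construction, not code from this paper, and the names KmpBorderTable and borders are chosen only for illustration.

```java
class KmpBorderTable {
    // b[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    static int[] borders(String pattern) {
        int[] b = new int[pattern.length()];
        int k = 0; // length of the border currently being extended
        for (int i = 1; i < pattern.length(); i++) {
            // Fall back to shorter borders until the next character fits.
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) {
                k = b[k - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            b[i] = k;
        }
        return b;
    }
}
```

The table has one entry per pattern character, whereas Rabin-Karp only keeps a few integers (the two hash values and the multiplier).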
CHAPTER III
PROBLEM ANALYSIS
3.1 Rabin-Karp Implementation on Java Programming
Figure 3.1 Code 1
Defines a static method named "search" with three parameters: "pattern" (String data
type), "txt" (String data type), and "q" (integer data type). This method looks for a
given pattern in a text using the Rabin-Karp algorithm.
4. int m = pattern.length()
Declares a local variable "m" with an integer data type. Its value is the length of the
string "pattern", that is, the length of the pattern to be searched.
5. int n = txt.length()
Declares a local variable "n" with an integer data type. Its value is the length of the
string "txt", that is, the length of the text to be searched.
6. int i, j
Declares two local variables "i" and "j" with an integer data type. These variables are
used as loop variables in later iterations.
7. int p = 0
Declares a local variable "p" with an integer data type and gives it an initial value of 0.
This variable is used to store the hash value of the pattern.
8. int t = 0
Declares the local variable "t" and gives it an initial value of 0. This variable is used
to store the hash value of the current text window as the iteration progresses.
9. int h = 1
Declares a local variable "h" with an integer data type and gives it an initial value of 1.
After the loop in items 10 and 11, "h" holds d^(m-1) mod q, the multiplier of the
leftmost character of a window, which is needed when the window slides.
10. for (i = 0; i < m - 1; i++)
Loop that computes the value of "h". This loop runs from 0 to m-2, that is, m-1 times,
where m is the length of the pattern.
11. h = (h * d) % q
Updates the value of the variable "h" by multiplying it by "d" and then taking the
remainder of the division by "q". This is done on each iteration of the loop.
12. for (i = 0; i <= n - m; i++)
Loop over each possible starting position of the pattern in the text. This loop runs
from 0 to n-m, where n is the length of the text.
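The quantities set up by these statements can also be written as formulas. This is a sketch in common Rabin-Karp notation; the subscripted symbol t_i (the hash of the window that starts at position i) is introduced here only for illustration, while d, q, m, h, p, and t are the variables of the code.

```latex
\begin{align*}
h       &= d^{\,m-1} \bmod q \\
p       &= \Bigl(\sum_{k=0}^{m-1} \mathrm{pattern}[k]\; d^{\,m-1-k}\Bigr) \bmod q \\
t_0     &= \Bigl(\sum_{k=0}^{m-1} \mathrm{txt}[k]\; d^{\,m-1-k}\Bigr) \bmod q \\
t_{i+1} &= \bigl(d\,(t_i - \mathrm{txt}[i]\; h) + \mathrm{txt}[i+m]\bigr) \bmod q
\end{align*}
```

The last line shows why h is needed: subtracting txt[i] times h removes the contribution of the leftmost character before the remaining characters are shifted by one power of d.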
Figure 3.2 Code 2
Figure 3.3 Code 3
1. if (p == t)
Checks whether the hash value of the pattern (p) is equal to the hash value of the text
window (t) at the current position. If they are equal, there is a possible pattern match
at the current position in the text.
2. for (j = 0; j < m; j++)
Loop that checks the pattern against the text character by character at the current
position. This loop runs from 0 to m-1, where m is the length of the pattern.
3. if (txt.charAt(i + j) != pattern.charAt(j))
Checks whether the character in the text at the current position + j differs from the
character in the pattern at position j. If the characters differ, the loop is terminated
with a 'break' statement.
4. if (j == m)
Checks whether the previous loop ran to the end of the pattern (j == m). If yes, it
means that the entire pattern matches the text at the current position. In this case,
the pattern is found in the text.
5. System.out.println("Pattern is found at position: " + (i + 1))
Prints a message indicating that the pattern was found at the current position in the
text. The position is printed as i + 1, so positions are counted from 1.
6. if (i < n - m)
Checks whether there is still a window left to examine. If i is less than n-m, there
are still characters in the text that have not been processed.
7. t = (d * (t - txt.charAt(i) * h) + txt.charAt(i + m)) % q
In this formula, "t" on the right-hand side is the hash value of the previous window,
txt.charAt(i) is the character removed from the sliding window, "h" is the multiplier
corresponding to the removed character, txt.charAt(i + m) is the new character
inserted into the sliding window, and "d" and "q" are constants.
8. if (t < 0)
Checks whether the text hash value (t) became negative after the previous operation.
If yes, t is replaced by t + q so that the hash value stays in the range 0 to q-1.
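Putting the statements explained above together gives a method along the following lines. This is a reconstruction for illustration, not the original listing from the figures: the value d = 256 is an assumption (the paper's value is not shown in the text), the class name RabinKarpSearch is hypothetical, and the method returns the 1-based positions in a list instead of printing them so that the result is easy to check.

```java
import java.util.ArrayList;
import java.util.List;

class RabinKarpSearch {
    // Assumption: d is the alphabet size; 256 is a common choice.
    static final int d = 256;

    // Returns the 1-based positions at which 'pattern' occurs in 'txt'.
    // q should be a small prime so that d * q stays well inside the int range.
    static List<Integer> search(String pattern, String txt, int q) {
        List<Integer> positions = new ArrayList<>();
        int m = pattern.length();
        int n = txt.length();
        if (m == 0 || m > n) {
            return positions; // nothing to search for
        }
        int i, j;
        int p = 0; // hash of the pattern
        int t = 0; // hash of the current text window
        int h = 1; // becomes d^(m-1) mod q

        for (i = 0; i < m - 1; i++) {
            h = (h * d) % q;
        }
        // Initial hashes of the pattern and of the first window.
        for (i = 0; i < m; i++) {
            p = (d * p + pattern.charAt(i)) % q;
            t = (d * t + txt.charAt(i)) % q;
        }
        for (i = 0; i <= n - m; i++) {
            if (p == t) {
                // Hashes agree: confirm character by character.
                for (j = 0; j < m; j++) {
                    if (txt.charAt(i + j) != pattern.charAt(j)) {
                        break;
                    }
                }
                if (j == m) {
                    positions.add(i + 1);
                }
            }
            if (i < n - m) {
                // Slide the window: drop txt[i], append txt[i + m].
                t = (d * (t - txt.charAt(i) * h) + txt.charAt(i + m)) % q;
                if (t < 0) {
                    t = t + q;
                }
            }
        }
        return positions;
    }
}
```

For example, calling RabinKarpSearch.search("abc", "abcabc", 13) yields the positions 1 and 4.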
Figure 3.4 Code 4
In this section we use the Scanner class to read the text and the pattern to be searched; the
program then displays the positions at which the pattern occurs in the text.
After the text and the pattern have been entered through the Scanner, we can see the
positions in the text where the pattern matches.
and text
| Line | Code | Explanation |
|------|------|-------------|
| 18 | p = (d * p + pattern.charAt(i)) % q; | Calculate pattern hash |
| 19 | t = (d * t + txt.charAt(i)) % q; | Calculate text hash |
| 23 | for (i = 0; i <= n - m; i++) { | Iterate through the text |
| 24 | if (p == t) { | Compare pattern and text hash |
| 25 | for (j = 0; j < m; j++) { | Check for pattern match |
| 26 | if (txt.charAt(i + j) != pattern.charAt(j)) break; | Compare characters; break the loop if characters don't match |
| 29 | if (j == m) | Check if the entire pattern matches |
| 30 | System.out.println("Pattern is found at position: " + (i + 1)); | Print the position of the match |
| 34 | if (i < n - m) { | Update text hash and handle negative values |
| 35 | t = (d * (t - txt.charAt(i) * h) + txt.charAt(i + m)) % q; | Update text hash value |
| 36 | if (t < 0) | Handle negative hash values |
| 37 | t = (t + q); | Add q to the negative hash value |
| 40 | public static void main(String[] args) { | Entry point of the program |
| 41 | Scanner scanner = new Scanner(System.in); | Create a Scanner object for input |
| 42 | System.out.print("Enter the text: "); | Prompt for text input |
| 43 | String txt = scanner.nextLine(); | Read the text input |
| 44 | System.out.print("Enter the pattern to search: "); | Prompt for pattern input |
| 45 | String pattern = scanner.nextLine(); | Read the pattern input |
| 47 | int q = 13; | Define the prime number |
| 48 | scanner.close(); | Close the scanner |
| 50 | search(pattern, txt, q); | Call the search method to find the pattern in the text |
| 52 | } | Close the main method |
| 54 | } | Close the class |
CHAPTER IV
CONCLUSION AND SUGGESTION
4.1 Conclusion
The conclusion of this paper is that the Rabin-Karp algorithm uses hashing with a
predetermined formula to detect similarities: the two texts being compared are transformed
into series of numbers based on the ASCII table. If the hash value of the pattern equals the
hash value of a text window and the characters also match, the pattern occurs in the text. The
greater the number of characters or sentences in the text, the longer it takes to detect the
degree of similarity, because the Rabin-Karp algorithm examines the characters one by one.
Overall, the Rabin-Karp algorithm is a powerful tool for string matching, especially when
dealing with large texts and multiple pattern searches. However, remember to consider the
potential limitations of the algorithm, such as collisions in the hash function and the need for
efficient hash calculations.
4.2 Suggestion
The hashing method of the Rabin-Karp algorithm can be used in keyword searches: a
program can be created to search for words that appear often in a text or that are unique.
The resulting program will have advantages and disadvantages, but the disadvantages can
be overcome by developing the program further.
BIBLIOGRAPHY
[4] Putra, N. P., & Sularno, S. (2019). Penerapan Algoritma Rabin-Karp Dengan
Pendekatan Synonym Recognition Sebagai Antisipasi Plagiarisme Pada
Penulisan Skripsi. Jurnal Teknologi Dan Sistem Informasi Bisnis, 1(2), 130-140.
[5] Yusuf, B., Vivianie, S., Marsya, J. M., & Sofyan, Z. (2019, November). Analisis
Perbandingan Algoritma Rabin-Karp dan Ratcliff/Obershelp untuk Menghitung
Kesamaan Teks dalam Bahasa Indonesia. In SEMINAR NASIONAL APTIKOM
(SEMNASTIK) (pp. 61-69).
[9] Lede, P. A. R. L., Fanggidae, A., & Polly, Y. T. (2016). Implementasi Algoritma
Rabin-Karp Untuk Mendeteksi Dugaan Plagiarisme Berdasarkan Tingkat
Kemiripan Kata Pada Dokumen Teks. Jurnal Komputer dan Informatika
(JICON), 2(1), 50-64.