The Rabin-Karp algorithm is a string matching algorithm that uses hashing to efficiently find occurrences of a pattern within a text. It employs a rolling hash function to calculate the hash values of substrings and compares the hash values to determine potential matches.
Here's the pseudocode for the Rabin-Karp algorithm:
RabinKarp(pattern, text):
    n = length(text)
    m = length(pattern)
    patternHash = hash(pattern[0...m-1])
    textHash = hash(text[0...m-1])
    occurrences = []
    for i = 0 to n - m do
        if patternHash = textHash then
            if pattern[0...m-1] = text[i...i+m-1] then
                occurrences.append(i)
        if i < n - m then
            textHash = rehash(textHash, text[i], text[i+m])
    return occurrences
Let's go through the pseudocode step by step:
1. `n` represents the length of the text, and `m` represents the length of the pattern.
2. Calculate the hash value of the pattern using the `hash` function. This function converts the characters of the pattern into a numerical hash value.
3. Calculate the hash value of the first substring of length `m` from the text using the `hash` function.
4. Initialize an empty list `occurrences` to store the starting indices of pattern occurrences.
5. Iterate `i` from 0 to `n - m` (inclusive) to slide the pattern window over the text.
6. If the pattern hash value is equal to the text hash value, compare the pattern with the substring of the text starting at index `i` character by character, because equal hashes can also result from a collision. If they match, add `i` to the `occurrences` list.
7. If `i` is less than `n - m`, update the text hash value using the `rehash` function. This function updates the hash in constant time by removing the contribution of the outgoing character `text[i]` and adding the contribution of the incoming character `text[i+m]`.
8. Finally, return the `occurrences` list containing the starting indices of pattern matches.
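The steps above can be sketched in Python as follows. This is a minimal sketch rather than a definitive implementation: the polynomial hash, the `base` and `mod` parameters, and the early return for an empty or over-long pattern are assumptions that the pseudocode does not specify.

```python
def rabin_karp(pattern, text, base=256, mod=1_000_000_007):
    """Return the starting indices of every occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        # Assumption: the pseudocode does not cover these cases; report no matches.
        return []

    # Weight of the leftmost character of a window: base^(m-1) mod mod.
    high = pow(base, m - 1, mod)

    # Hash the pattern and the first window of the text.
    pattern_hash = 0
    text_hash = 0
    for k in range(m):
        pattern_hash = (pattern_hash * base + ord(pattern[k])) % mod
        text_hash = (text_hash * base + ord(text[k])) % mod

    occurrences = []
    for i in range(n - m + 1):
        # Equal hashes only signal a candidate; confirm it character by character.
        if pattern_hash == text_hash and text[i:i + m] == pattern:
            occurrences.append(i)
        if i < n - m:
            # Remove text[i], shift the window, and bring in text[i + m].
            text_hash = ((text_hash - ord(text[i]) * high) * base
                         + ord(text[i + m])) % mod
    return occurrences


print(rabin_karp("aba", "abababa"))
```

Overlapping occurrences are reported as well, since the window advances by one position at a time.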
The `hash` function converts a string into a numerical hash value; a common choice is a polynomial hash computed modulo a large prime. The `rehash` function derives the hash of the next window from the previous hash, the leftmost character being removed, and the new rightmost character being added, so it runs in constant time instead of rehashing all `m` characters.
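One possible way to write these two functions is sketched below, assuming a polynomial hash. The names `poly_hash`, `BASE`, and `MOD` are illustrative, and this `rehash` takes the window length `m` as an extra argument that the pseudocode leaves implicit.

```python
BASE = 256             # assumed radix for the polynomial hash
MOD = 1_000_000_007    # assumed large prime modulus


def poly_hash(s):
    """Polynomial hash: s[0]*BASE^(m-1) + ... + s[m-1], reduced modulo MOD."""
    h = 0
    for ch in s:
        h = (h * BASE + ord(ch)) % MOD
    return h


def rehash(old_hash, outgoing, incoming, m):
    """Slide a window of length m one position to the right in O(1)."""
    high = pow(BASE, m - 1, MOD)  # weight of the outgoing (leftmost) character
    return ((old_hash - ord(outgoing) * high) * BASE + ord(incoming)) % MOD


# Check that the rolling update agrees with hashing each window from scratch.
text = "abracadabra"
m = 4
h = poly_hash(text[:m])
for i in range(len(text) - m):
    h = rehash(h, text[i], text[i + m], m)
    assert h == poly_hash(text[i + 1:i + 1 + m])
```

In a real loop, `high` would be computed once outside `rehash`; it is recomputed here only to keep the function self-contained.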
Please note that the pseudocode assumes 0-based indexing for arrays and lists, and that the pattern is no longer than the text. Also, keep in mind that the Rabin-Karp algorithm has an average-case time complexity of O(n + m), where n is the length of the text and m is the length of the pattern. However, in the worst case (when many windows share the pattern's hash, whether through collisions or genuine matches), every such window needs an O(m) comparison and the time complexity can degrade to O(n * m).
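To make the worst case concrete, the following sketch counts how many windows pass the hash check and therefore need a full comparison. The function `count_candidates` and the deliberately useless modulus of 1 are illustrative assumptions, chosen only to force collisions.

```python
def count_candidates(pattern, text, mod, base=256):
    """Return (windows whose hash equals the pattern's hash, true matches).

    Assumes 1 <= len(pattern) <= len(text).
    """
    n, m = len(text), len(pattern)
    high = pow(base, m - 1, mod)
    pattern_hash = text_hash = 0
    for k in range(m):
        pattern_hash = (pattern_hash * base + ord(pattern[k])) % mod
        text_hash = (text_hash * base + ord(text[k])) % mod

    candidates = matches = 0
    for i in range(n - m + 1):
        if pattern_hash == text_hash:
            candidates += 1          # this window costs an O(m) comparison
            if text[i:i + m] == pattern:
                matches += 1
        if i < n - m:
            text_hash = ((text_hash - ord(text[i]) * high) * base
                         + ord(text[i + m])) % mod
    return candidates, matches


# With mod = 1 every hash is 0, so all 98 windows collide and none match.
print(count_candidates("abc", "x" * 100, mod=1))
# With a large prime modulus, no window of this text passes the hash check.
print(count_candidates("abc", "x" * 100, mod=1_000_000_007))
# Genuine matches at every position also force a comparison per window.
print(count_candidates("a" * 10, "a" * 1000, mod=1_000_000_007))
```

The first and third calls trigger a comparison at every window, which is the O(n * m) behaviour described above; the second shows the typical case where the hash filters out the mismatches.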