Matching patterns in a long text string

I wrote this code and it works in some cases. Sometimes, however, this fails, and I just don't understand why. Can someone please help me identify the error?

It works for: String: ishanthakkar ishan patter: ishan

But this fails for:

String: cpr ograming patter: cpr

A source:

 #include <stdio.h> #include <stdlib.h> #include <string.h> int *compute_prefix_function(char *pattern, int psize) { int k = -1; int i = 1; int *pi = malloc(sizeof(int)*psize); if (!pi) return NULL; pi[0] = k; for (i = 1; i < psize; i++) { while (k > -1 && pattern[k+1] != pattern[i]) k = pi[k]; if (pattern[i] == pattern[k+1]) k++; pi[i] = k; } return pi; } 

// This function finds the matching string at O ​​(n) time, so repeat it through the text string only once when the unmatching character is detected; it goes with the next character and starts comparing with the first character of the string to be searched, i.e. pattern

 int kmp(char *target, int tsize, char *pattern, int psize) { int i; int *pi = compute_prefix_function(pattern, psize); int k = -1; if (!pi) return -1; for (i = 0; i < tsize; i++) { while (k > -1 && pattern[k+1] != target[i]) k = pi[k]; if (target[i] == pattern[k+1]) k++; if (k == psize - 1) { free(pi); return ik; } } free(pi); return -1; } int main(int argc, const char *argv[]) { char target[200]; char *ch = target; char pattern[20]; int i; printf("Enter the string: \n"); fgets(target,100,stdin); printf("Enter the string to be matched: \n"); fgets(pattern,20,stdin); i = kmp(target, strlen(target), pattern, strlen(pattern)); if (i >= 0) printf("matched @: %s\n", ch + i); getch(); return 0; } 
+4
source share
2 answers

The fgets function reads and includes the ending of CR (or CRLF) in the line.

Add a chomp() function, for example

 void chomp(char *s) { int n = strlen(s); while (n && (s[n-1]==10 || s[n-1]==13)) s[--n] = 0; } 

which removes any CR or LF at the end of the line.
Then the chomp() pattern and target before calling kmp() (and after scanf() )

 chomp(target); chomp(pattern); i = kmp(target, strlen(target), pattern, strlen(pattern)); 

the program should behave better.


Note: 10 is '\n' (LF) and 13 is '\r' (CR)

+2
source

Get the key:

 i = kmp(target, strlen(target), pattern, strlen(pattern)); 

passed the length of the string + 1 (for the null character), so it gave a false result for some text string

 i = kmp(target, strlen(target)-1, pattern, strlen(pattern)-1); 

works in all cases!

Thank you all for your time!

+1
source

Source: https://habr.com/ru/post/1485421/


All Articles