C string comparison with hash compare

I need to compare a string with several other constant strings in c. I am curious that the faster is the hash string that I am going to compare and compare it with all other hashes of the constant string or just compare the strings as strings. thanks in advance

Thank you for the answers that I am going to make with many comparisons. can someone give me a good, fast, resource intensive algorithm to use? The only hash that I know of is MD5, and I have a feeling that this is more than a kill.

I also want to add that strings can be 20 or 30 characters in max, with most of them about 7.

+3
source share
10 answers

Will the comparison be done one or more times? If the comparison is made only once, you will most likely be able to conduct a direct comparison. If you need to compare so many lines with this set of constant lines, then you can probably save time in the long run by executing it with hashes.

This is a fairly simple problem that you can easily write in both directions and see which one is best for a representative set of input data.

+8
source

, Aho-Corasick, trie ( ).

+4

, - O (n). O (n), Oh. , , , . .

- .

+4

- - . , - ( ), strcmp .

+3

, , , bsearch, , . NULL, , , , , , , .

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* cmp function for qsort and bsearch */
static int pstrcmp(const void *a, const void *b)
{
  return strcmp(*(char * const *)a, *(char * const *)b);
}

/* check an input against the list of known strings */
static char *check_for_match(char *input)
{
  static char *static_list[] = { "one", "two", "three", "four", "five" };
  static int nelems;

  /* this sorts the list, for demonstration purposes, but if the list
     is static then it could be sorted prior to compiling */
  if (! nelems)
  {
    nelems = sizeof(static_list) / sizeof(*static_list);
    qsort(static_list, nelems, sizeof(*static_list), pstrcmp);
  }


  return bsearch(&input, static_list, nelems, sizeof(*static_list), pstrcmp);
}

int main(int argc, char *argv[])
{
  if (check_for_match("should_not_match"))
  {
    printf("Match found.\n");
  } else {
    printf("No match found.\n");
  }

  if (check_for_match("two"))
  {
    printf("Match found.\n");
  } else {
    printf("No match found.\n");
  }
  return EXIT_SUCCESS;
}
+3

. ? ? ?

, .

+1

, " ".

: - S -, S .

, " " . :

+1

-. ...

0

, , , , , log2(n) (, 10 1024 20 1000000 ). , , . , , .

0

Thank you for the answers that I am going to make many comparisons. Can anyone give me a good, fast, low resource-intensive algorithm to use? The only hash that I know of is MD5 and I have the feeling that it is a kill.

Murmur's hash is simple, fast and behaves well in statistical tests.

0
source

Source: https://habr.com/ru/post/1758632/


All Articles