Determine if one string is a prefix of another

I wrote a simple function that determines whether str1 is a prefix of str2. It looks like this (in JS):

    function isPrefix(str1, str2)
    // determine if str1 is a prefix of a candidate string
    {
        if(str2.length < str1.length) // candidate string can't be smaller than prefix string
            return false;

        var i = 0;
        while(str1.charAt(i) == str2.charAt(i) && i <= str1.length)
            i++;
        if(i < str1.length) // i terminated => str 1 is smaller than str 2
            return false;
        return true;
    }

As you can see, it scans the entire length of the prefix string to determine whether it is a prefix of the candidate string. That is O(N) complexity, which is not bad, but it becomes a problem when I have a huge dataset to loop over to find which strings start with the prefix string. The complexity then becomes multiplicative, O(M * N), where M is the total number of strings in the dataset. Not good.
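To make the cost concrete, the kind of scan I mean can be sketched roughly as follows. The name filterByPrefix and the sample array are only illustrative, and the built-in substring comparison stands in for my own function; each of the M candidates still costs up to N character comparisons.

```javascript
// Hypothetical helper: brute-force scan, one prefix check per candidate.
function filterByPrefix(prefix, candidates) {
    var matches = [];
    for (var i = 0; i < candidates.length; i++) {
        // Compare at most prefix.length characters of each candidate.
        if (candidates[i].substring(0, prefix.length) === prefix) {
            matches.push(candidates[i]);
        }
    }
    return matches;
}

// Illustrative data only.
var words = ["random", "rapid", "romance", "romania", "rome", "rose"];
console.log(filterByPrefix("rom", words)); // romance, romania, rome
```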

I searched the Internet a bit to find out whether a Patricia / radix trie, in which strings are stored by their shared prefixes, would be the best answer. Even then, when I insert or look up a string, there will still be significant overhead in string matching if I use the prefix-checking function above.

Say I had a prefix string 'rom' and a set of candidate words:

var dataset = ["random", "rapid", "romance", "romania", "rome", "rose"];

which would look like this in a radix trie:

                r
              /   \
             a     o
            / \   / \
         ndom pid se  m
                     / \
                    an  e
                   /  \
                  ia   ce

This means that at each node I would use the prefix-matching function to determine which node holds a value matching the prefix string at the current index. Either way, this solution still seems cumbersome and doesn't sit well with me. Is there something better, or failing that, can I improve the core prefix-matching function?

+4
2 answers

It looks like you have two different problems.

One of them is determining whether a string is contained as a prefix in another string. For this, I would suggest using a function already implemented in the language's string library. In JavaScript you can do this:

    if (str2.indexOf(str1) === 0) {
        // string str1 is a prefix of str2
    }

See the documentation for String.indexOf here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/indexOf
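One caveat: when str1 is not at position 0, indexOf goes on to look for it in the rest of str2 before returning. A minimal sketch of two alternatives that only examine position 0, assuming a runtime that may or may not provide String.prototype.startsWith (ES2015); the wrapper name hasPrefix is hypothetical.

```javascript
// Hypothetical wrapper name; it relies only on built-in String methods.
function hasPrefix(candidate, prefix) {
    if (typeof candidate.startsWith === "function") {
        // Built-in prefix test where the runtime provides it.
        return candidate.startsWith(prefix);
    }
    // Fallback: with fromIndex 0, lastIndexOf can only match at position 0.
    return candidate.lastIndexOf(prefix, 0) === 0;
}

console.log(hasPrefix("romance", "rom")); // true
console.log(hasPrefix("rose", "rom"));    // false
```

Either form can replace the indexOf test above without changing the result.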

The other problem is finding, in a bunch of strings, which ones have the given string as a prefix. For that, building a data structure such as a trie or the one you mention looks like the way to go if you want fast look-ups.
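A minimal sketch of that idea, assuming a plain one-character-per-node trie rather than the compressed radix trie from the question; all function names here are hypothetical. Walking down to the prefix costs one step per prefix character, no matter how many strings are stored, and only the matching subtree is visited afterwards.

```javascript
// Plain (uncompressed) trie: one node per character. Not a radix trie.
function createNode() {
    return { children: {}, isWord: false };
}

function insert(root, word) {
    var node = root;
    for (var i = 0; i < word.length; i++) {
        var ch = word.charAt(i);
        if (!node.children.hasOwnProperty(ch)) {
            node.children[ch] = createNode();
        }
        node = node.children[ch];
    }
    node.isWord = true; // marks the end of a stored string
}

// Depth-first walk that gathers every stored string below a node.
function collect(node, path, out) {
    if (node.isWord) {
        out.push(path);
    }
    var keys = Object.keys(node.children).sort();
    for (var i = 0; i < keys.length; i++) {
        collect(node.children[keys[i]], path + keys[i], out);
    }
}

function wordsWithPrefix(root, prefix) {
    var node = root;
    for (var i = 0; i < prefix.length; i++) {
        var ch = prefix.charAt(i);
        if (!node.children.hasOwnProperty(ch)) {
            return []; // no stored string starts with this prefix
        }
        node = node.children[ch];
    }
    var out = [];
    collect(node, prefix, out);
    return out;
}

var trie = createNode();
["random", "rapid", "romance", "romania", "rome", "rose"].forEach(function (w) {
    insert(trie, w);
});
console.log(wordsWithPrefix(trie, "rom")); // romance, romania, rome
```

A radix trie would merge chains of single-child nodes into one edge to save memory; the lookup idea stays the same.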

+7

Check this Stack Overflow thread: How to check if a string "StartsWith" another string? The Byers solution seems very efficient. Also, for Java, the String "endsWith" and "startsWith" methods are built in: http://docs.oracle.com/javase/tutorial/java/data/comparestrings.html

+1

Source: https://habr.com/ru/post/1500223/

