I wrote a simple function that determines if str1 is the str2 prefix. This is a very simple function that looks like this (in JS):
function isPrefix(str1, str2) // determine if str1 is a prefix of a candidate string { if(str2.length < str1.length)
As you can see, it scans the entire length of the prefix line to determine if it is the candidate line prefix. This means O (N) complexity, which is not bad, but it becomes a problem when I have a huge dataset to examine a loop to determine which lines have a prefix line as part of the prefix. This makes the complexity plural, like O (M * N), where M is the total number of rows in a given dataset. Not good.
I researched the Internet a bit to determine if Patricia / Radix trie would be the best answer. Where strings are stored as prefixes. Even when I try to insert / find a string, there will be significant overhead in string matching if I use the aforementioned gauging prefix function.
Say I had a prefix string 'rom' and a set of candidate words
var dataset = ["random", "rapid", "romance", "romania", "rome", "rose"];
who would like this in a radix trie:
r / \ ao / \ / \ ndom pid se m / \ an e / \ ia ce
This means that for each node I will use the prefix matching function to determine which node has a value corresponding to the prefix line in the index. One way or another, this decision still seems difficult and doesn't fit well with me. Is there something better or in any case, I can improve the function of matching the main prefixes?
source share