You can use Boyer-Moore, which involves preprocessing the keywords, but you cannot beat the worst case of O(KN), where K is the sum of the lengths of the keywords and N is the length of the string. The best case is, of course, sublinear, but you cannot get sublinear performance in the worst case.
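To make the point concrete, here is a minimal sketch of Boyer-Moore for a single pattern, using only the bad-character rule (the full algorithm also uses the good-suffix rule). The function name and simplifications are mine, not from the original answer. On favorable inputs it skips many characters per shift (the sublinear best case), but adversarial inputs still force it to look at characters proportional to the pattern and text lengths:

```python
def boyer_moore_search(text, pattern):
    """Boyer-Moore with the bad-character rule only (a common
    simplification). Returns the first match index, or -1."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return -1
    # Rightmost position of each character in the pattern.
    last = {c: i for i, c in enumerate(pattern)}
    i = m - 1  # index into text
    j = m - 1  # index into pattern (compared right to left)
    while i < n:
        if text[i] == pattern[j]:
            if j == 0:
                return i  # full match found at position i
            i -= 1
            j -= 1
        else:
            # Shift past the mismatched text character; in the best
            # case this jumps a whole pattern length at once.
            i += m - min(j, 1 + last.get(text[i], -1))
            j = m - 1
    return -1
```

For multiple keywords, running this once per keyword is what gives the O(KN) worst-case bound quoted above.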
Please note that comparisons are not free. You cannot compare two strings in O(1): to verify that they are equal, you have to iterate over their characters. Hashing gives you a constant-time comparison of hashes, but that alone is not enough, since two different strings can have the same hash. This does not mean hashing is not useful, but it does not change the worst-case complexity.
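A Rabin-Karp style rolling hash (my choice of illustration; the original answer names no specific scheme) shows exactly why: the hash comparison is O(1), but every hash hit still has to be confirmed character-by-character, because collisions are possible:

```python
def rabin_karp(text, pattern, base=256, mod=1_000_003):
    """Rabin-Karp search. Hash values are compared in O(1), but a
    hash match is only a candidate: distinct strings can collide,
    so each hit is verified with a real character comparison."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return -1
    high = pow(base, m - 1, mod)  # weight of the leading character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    for i in range(n - m + 1):
        # Equal hashes only *suggest* equality; confirm by comparing
        # the actual characters before reporting a match.
        if t_hash == p_hash and text[i:i + m] == pattern:
            return i
        if i < n - m:
            # Roll the window: drop text[i], append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return -1
```

With an adversarial input full of collisions, every window degenerates into a full character comparison, which is why hashing leaves the worst case unchanged.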
In the end, you need to compare characters, and Boyer-Moore provides a very good way to do this. Of course, with a hash-based set you can rule out certain keywords in expected constant time, but that does not change the fact that in the worst case (and in many other cases) you still need to compare characters.
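Here is what that filtering step might look like; the set contents and function are placeholders of my own for illustration. A set lookup rejects most non-keywords in expected O(1), but note that even a successful lookup ends with a character-by-character equality check inside the hash table:

```python
# Hypothetical keyword set; membership tests are expected O(1),
# but a hash hit still triggers a full string comparison internally.
KEYWORDS = {"while", "for", "return"}

def is_keyword(token):
    """Expected constant time, yet the final confirmation of a hit
    is still a character comparison -- hashing only filters."""
    return token in KEYWORDS
```

This is a useful filter, not a way around the comparison cost the paragraph above describes.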
Also note that, depending on what we assume about the data and how we build our indexing structure (or structures), very good actual runtimes can be achieved. Just because the worst-case complexity is not sublinear does not mean the actual execution time will not be very fast. There is no single simple or one-size-fits-all correct solution; the problem can be approached in many ways. When it comes to information retrieval, there is never a quick and dirty answer that solves all your problems.
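As one example of such an indexing structure (my own sketch, not something prescribed by the answer), you can bucket keywords by their first character, so that a single pass over the text only attempts the candidates that could possibly start at each position:

```python
from collections import defaultdict

def build_index(keywords):
    """Bucket keywords by first character so the scan below only
    tries candidates that can actually begin at the current position."""
    index = defaultdict(list)
    for kw in keywords:
        if kw:
            index[kw[0]].append(kw)
    return index

def find_keywords(text, index):
    """One pass over the text; worst case is still O(KN), but on
    typical data most positions are dismissed after one dict probe."""
    hits = []
    for i, c in enumerate(text):
        for kw in index.get(c, ()):
            if text.startswith(kw, i):
                hits.append((i, kw))
    return hits
```

The worst-case bound is unchanged, but on realistic inputs most text positions are rejected immediately, which is exactly the gap between worst-case complexity and actual runtime.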