I am curious what the most efficient (or most commonly used) algorithm is for counting the number of occurrences of a string in a piece of text.
From what I have read, the Boyer-Moore string search algorithm is the standard for finding strings, but I'm not sure whether counting occurrences efficiently is the same problem as finding a string.
In Python, this is what I want:
    text_chunk = "one two three four one five six one"
    occurrence_count(text_chunk, "one")  # gives 3
EDIT: It seems that Python's str.count serves as such a method; however, I cannot find which algorithm it uses.
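For reference, here is a minimal sketch of what such a counting function could look like when built on repeated calls to str.find. The name occurrence_count and the overlapping flag are hypothetical, chosen only for illustration; this makes no claim about how str.count is implemented internally.

```python
def occurrence_count(text, pattern, overlapping=False):
    """Count occurrences of pattern in text using repeated str.find calls.

    Hypothetical helper: with overlapping=False it behaves like str.count
    for non-empty patterns; with overlapping=True matches may share characters.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Skip past the whole match, or advance by one to allow overlaps.
        step = 1 if overlapping else len(pattern)
        start = text.find(pattern, start + step)
    return count
```

Counting is simply searching repeated from the end (or the next character) of the previous match, so any single-pattern search algorithm can be reused for it.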
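For several patterns at once, Aho-Corasick runs in O(m + n + z), where m is the length of the text, n is the total length of the patterns, and z is the number of matches. Building the automaton takes O(n), and scanning the text takes O(m + z).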
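As a rough illustration of the Aho-Corasick idea, the sketch below builds a trie with failure links and counts every occurrence (overlaps included) of each pattern in one pass over the text. The function names build_automaton and count_all are hypothetical, patterns are assumed to be non-empty and distinct, and the dictionary-based transitions favor readability over speed.

```python
from collections import deque

def build_automaton(patterns):
    """Build trie transitions, failure links and output lists (hypothetical helper)."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # indices of the patterns that end at each state
    for idx, pat in enumerate(patterns):
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(idx)
    # Breadth-first traversal: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out

def count_all(text, patterns):
    """Return {pattern: number of occurrences in text}, overlaps included."""
    goto, fail, out = build_automaton(patterns)
    counts = [0] * len(patterns)
    state = 0
    for ch in text:
        # Follow failure links until a transition on ch exists or we reach the root.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for idx in out[state]:
            counts[idx] += 1
    return dict(zip(patterns, counts))
```

For a single short pattern this is more machinery than needed; it pays off when many patterns have to be counted in the same text.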
Alternatively, there is the hash-based Rabin-Karp algorithm.
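A minimal sketch of Rabin-Karp used for counting is shown below. The function name rabin_karp_count and the base and mod defaults are assumptions made for illustration; each hash hit is verified by a direct comparison, so collisions cannot inflate the count.

```python
def rabin_karp_count(text, pattern, base=256, mod=1_000_000_007):
    """Count occurrences (overlaps included) of pattern in text with a rolling hash."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0
    # Weight of the leading character in a window of length m.
    high = pow(base, m - 1, mod)
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    count = 0
    for i in range(n - m + 1):
        # Compare the actual characters only when the hashes agree.
        if p_hash == t_hash and text[i:i + m] == pattern:
            count += 1
        if i < n - m:
            # Slide the window: drop text[i], append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return count
```

Note that this variant counts overlapping matches, so it can return a larger number than str.count for patterns that overlap themselves.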
In many languages, a built-in search function such as indexOf or strpos can be called in a loop to count matches.
Hellnar, here is a counting example for strings:
    """
    The counting algorithm counts the occurrences of each character in a string.
    This allows you to compare anagrams and strings themselves.
    ex. animal, lamina: a=2, n=1, i=1, m=1, l=1
    """

    def count_occurrences(s):
        # Map each character to the number of times it appears in s.
        occurrences = {}
        for char in s:
            occurrences[char] = occurrences.get(char, 0) + 1
        return occurrences

    def is_matched(s1, s2):
        # Strings of different lengths cannot match character for character.
        if len(s1) != len(s2):
            return False
        s1_count_table = count_occurrences(s1)
        for char in s2:
            # Consume one occurrence from s1's table for each character of s2.
            if s1_count_table.get(char, 0) > 0:
                s1_count_table[char] -= 1
            else:
                return False
        return True

    # is_matched("animal", "lamina")   -> True
    # is_matched("animal", "laminar")  -> False
This example returns True if the strings match and False otherwise. Keep in mind that this algorithm counts the number of times each character appears in a string, which is useful for anagrams.
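The same per-character counting can also be sketched with collections.Counter from the standard library. The function name is_anagram is hypothetical; this version treats two strings as matching only when every character appears the same number of times in both.

```python
from collections import Counter

def is_anagram(s1, s2):
    """Return True when s1 and s2 contain exactly the same characters with the same counts."""
    # Counter builds the character -> count table in a single pass over each string.
    return Counter(s1) == Counter(s2)
```

Comparing the two tables directly avoids mutating a table while iterating, at the cost of building a second one.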
Source: https://habr.com/ru/post/1744035/