Efficiently determine whether part of a string matches any key of a dict?

I have many (> 100,000) string lines in a list where a subset might look like this:

str_list = ["hello i am from denmark", "that was in the united states", "nothing here"]

I also have a dict like this (in reality it will have about 1,000 entries):

dict_x = {"denmark" : "dk", "germany" : "ger", "norway" : "no", "united states" : "us"}

For all the lines in the list that contain any of the dict keys, I want to replace the whole line with the corresponding dict value. The expected result should be like this:

str_list = ["dk", "us", "nothing here"]

What is the most efficient way to do this, given the number of lines I have and the length of the dict?

Additional information: a line contains at most one of the dict keys.

+4
5 answers

Assuming that:

lst = ["hello i am from denmark", "that was in the united states", "nothing here"]
dict_x = {"denmark" : "dk", "germany" : "ger", "norway" : "no", "united states" : "us"}

You can do:

res = [dict_x.get(next((k for k in dict_x if k in my_str), None), my_str) for my_str in lst]

which returns:

print(res)  # -> ['dk', 'us', 'nothing here']

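This may look intimidating (it is the kind of thing python-ninjas, aka list-comprehension lovers, write), but it is straightforward: next returns the first key found in my_str, or None instead of raising StopIteration when no key matches, and get then falls back to my_str itself.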
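To make the two fallbacks explicit, here is the same logic unrolled into a small helper. The function name replace_line is only illustrative and does not come from the answer above.

```python
dict_x = {"denmark": "dk", "germany": "ger", "norway": "no", "united states": "us"}
lst = ["hello i am from denmark", "that was in the united states", "nothing here"]


def replace_line(my_str, mapping):
    """Return the value of the first key found in my_str, else my_str itself."""
    # next() yields the first key that occurs in my_str; its second argument
    # (None) is returned instead of raising StopIteration when nothing matches.
    first_key = next((k for k in mapping if k in my_str), None)
    # There is no entry for None, so get() falls back to the original string.
    return mapping.get(first_key, my_str)


res = [replace_line(s, dict_x) for s in lst]
print(res)  # -> ['dk', 'us', 'nothing here']
```

Each line still costs up to one substring test per key, so this is the same amount of work as the one-liner, just easier to read and debug.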

+1

A plain loop with a for/else clause also works:

input_strings = ["hello i am from denmark",
                 "that was in the united states",
                 "nothing here"]
dict_x = {"denmark" : "dk", "germany" : "ger", "norway" : "no", "united states" : "us"}

output_strings = []

for string in input_strings:
    for key, value in dict_x.items():
        if key in string:
            output_strings.append(value)
            break
    else:
        output_strings.append(string)
print(output_strings)
+3

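Something like this should work. It replaces each string with the value of the first matching key and leaves the string unchanged if no key matches: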

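strings = ["hello i am from denmark", "that was in the united states", "nothing here"]
dict_x = {"denmark" : "dk", "germany" : "ger", "norway" : "no", "united states" : "us"}

converted = []
for string in strings:
    updated_string = string  # stays unchanged unless a key matches
    for key, value in dict_x.items():
        if key in string:
            updated_string = value
            break  # at most one key per line, so stop at the first match
    converted.append(updated_string)
print(converted)  # -> ['dk', 'us', 'nothing here']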
+1

Try this:

str_list = ["hello i am from denmark", "that was in the united states", "nothing here"]

dict_x = {"denmark" : "dk", "germany" : "ger", "norway" : "no", "united states" : "us"}

for k, v in dict_x.items():
    for i in range(len(str_list)):
        if k in str_list[i]:
            str_list[i] = v

print(str_list)

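This loops over the dict's keys and values and, for each pair, over the list. Whenever the key occurs in a list element, that element is replaced in place by the value.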
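One thing to watch with the in-place version: each key is tested against the already updated list, so a value written earlier can itself match a later key. The sketch below uses a made-up mapping (chained_map, not from the question) to show the effect, next to a variant that only ever tests the original strings. Both function names are illustrative.

```python
def replace_in_place(lines, mapping):
    # Same loop order as the answer above, applied to a copy of the list.
    lines = list(lines)
    for k, v in mapping.items():
        for i in range(len(lines)):
            if k in lines[i]:
                lines[i] = v
    return lines


def replace_from_original(lines, mapping):
    # Keys are always tested against the original string, never a replaced value.
    result = list(lines)
    for k, v in mapping.items():
        for i, line in enumerate(lines):
            if k in line:
                result[i] = v
    return result


# Hypothetical mapping where one value ("usa") is also a key.
chained_map = {"united states": "usa", "usa": "america"}
lines = ["that was in the united states"]

print(replace_in_place(lines, chained_map))       # -> ['america']
print(replace_from_original(lines, chained_map))  # -> ['usa']
```

With the dict from the question no value contains a key, so both variants give the same result there.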

+1

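You can subclass dict, override __getitem__ so that a lookup matches by substring, and then use a list comprehension: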

class dict_contains(dict):
    def __getitem__(self, value):
        key = next((k for k in self.keys() if k in value), None)
        return self.get(key)

str1 = "hello i am from denmark"
str2 = "that was in the united states"
str3 = "nothing here"

lst = [str1, str2, str3]

dict_x = dict_contains({"denmark" : "dk", "germany" : "ger", "norway" : "no", "united states" : "us"})

res = [dict_x[i] or i for i in lst]

print(res)  # -> ['dk', 'us', 'nothing here']
+1
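Since the question is about efficiency, it may be worth timing candidates on data shaped like the real input rather than guessing. Below is a rough sketch using timeit. The function names and the repeated sample are assumptions for illustration. The precompiled regex alternation is an extra candidate that none of the answers above propose; it relies on the stated guarantee that a line contains at most one key, because it returns the leftmost match rather than the first key in dict order.

```python
import re
import timeit

dict_x = {"denmark": "dk", "germany": "ger", "norway": "no", "united states": "us"}
str_list = ["hello i am from denmark", "that was in the united states", "nothing here"]


def with_loop(lines, mapping):
    # for/else approach from the answers above.
    out = []
    for line in lines:
        for key, value in mapping.items():
            if key in line:
                out.append(value)
                break
        else:
            out.append(line)
    return out


def with_regex(lines, mapping):
    # One alternation over all escaped keys, longest first so that a longer
    # key is preferred over a shorter key that is its prefix.
    if not mapping:
        return list(lines)
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    out = []
    for line in lines:
        match = pattern.search(line)
        out.append(mapping[match.group(0)] if match else line)
    return out


# Both candidates must agree before their timings are worth comparing.
assert with_loop(str_list, dict_x) == with_regex(str_list, dict_x)

# Repeat the sample to get a measurable workload; real data will behave differently.
sample = str_list * 1000
for func in (with_loop, with_regex):
    seconds = timeit.timeit(lambda: func(sample, dict_x), number=5)
    print(f"{func.__name__}: {seconds:.3f} s")
```

The numbers depend heavily on the number of keys and the share of lines that match, so run this with the real dict and a representative slice of the real list before drawing conclusions.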

Source: https://habr.com/ru/post/1695159/

