List item position

I have this dictionary:

db = {
    'www.baurom.ro': {
        0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    },
    'slbz2': {
        0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    },
}

And the list:

lista=['www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.listafirme.ro', 'www.romanian-companies.eu', 'www.risco.ro']

What I am doing now:

for x in lista:
     if x in db:
        db[x][0][lista.index(x)]+=1

In other words, I want to calculate how many times each site appears in the list and at what position. This works, but in this example it will return something like:

{0: [7, 0, 0, 0, 0, 0, 0, 0, 0, 0]}

while I would like it to be:

{0: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]}

How can I achieve this? I could use a counter variable, initialize it with var = 0, increment it with += 1, and use it as an artificial index, but is there a more "pythonic" way to do this?

+4
3 answers

If I understand your question correctly, you already have the dictionary db and you are looking for enumerate.

And your code will look like this:

for index, element in enumerate(lista):
    if element in db:
        db[element][0][index] = 1 
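
For context, list.index(x) returns the position of the first matching element, which is why the original loop kept incrementing slot 0. A minimal sketch contrasting the two approaches, using a shortened list and the hypothetical names counts_index and counts_enum chosen only for this illustration:

```python
# Shortened, hypothetical data for illustration only.
lista = ['www.baurom.ro', 'www.baurom.ro', 'www.risco.ro']

# list.index() always returns the FIRST position of a value,
# so every duplicate lands in the same slot.
counts_index = [0] * len(lista)
for x in lista:
    if x == 'www.baurom.ro':
        counts_index[lista.index(x)] += 1

# enumerate() yields the real position of each element.
counts_enum = [0] * len(lista)
for index, element in enumerate(lista):
    if element == 'www.baurom.ro':
        counts_enum[index] += 1

print(counts_index)  # [2, 0, 0]
print(counts_enum)   # [1, 1, 0]
```

Using += 1 or = 1 inside the enumerate loop gives the same result here, because each position is visited exactly once.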
+1

Another option is a list comprehension:

for entry in db:
    db[entry][0] = [int(x == entry) for x in lista]
print(db)  # {'slbz2': {0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}, 'www.baurom.ro': {0: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0], 1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}}

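This loops over every key of the dictionary and compares it with each element of lista. The comparison returns True or False, and int() converts that bool to an integer (True -> 1, False -> 0).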
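
The bool-to-int step can be checked in isolation. A small sketch with made-up values (sites, target and flags are names chosen for this example only):

```python
sites = ['a.ro', 'b.ro', 'a.ro']
target = 'a.ro'

# Each comparison yields True or False; int() maps them to 1 or 0.
flags = [int(site == target) for site in sites]
print(flags)  # [1, 0, 1]
```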


If lista is much shorter than the dictionary, you can loop over only the keys that actually occur in it:

for entry in set(x for x in lista if x in db):
    # same body as before: mark every position where entry occurs
    db[entry][0] = [int(x == entry) for x in lista]

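This way, you only process keys of the dictionary that also appear in lista. Thanks to the set, each such element of lista is processed only once (for example, 'www.baurom.ro' is handled as a key once, even though it appears several times in lista).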

0

If I understand your problem correctly, you can simply iterate over lista and create db as needed:

urls = ['www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.listafirme.ro', 'www.romanian-companies.eu', 'www.risco.ro']
n = len(urls)
db = {}

for i, url in enumerate(urls):
    if url not in db:
        db[url] = {0: [0] * n} # NOTE: Use numpy for large arrays
    db[url][0][i] = 1

print(db)
# {'www.romanian-companies.eu': {0: [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]}, 'www.risco.ro': {0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]}, 'www.listafirme.ro': {0: [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]}, 'www.baurom.ro': {0: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]}}

It requires only one pass through lista and should be very fast.

If you have a list of interesting URLs, you can use this option:

from collections import defaultdict

urls = ['www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.listafirme.ro', 'www.romanian-companies.eu', 'www.risco.ro']

interesting_urls = {'www.baurom.ro', 'slbz2'}

n = len(urls)

def url_array():
    return {0: [0] * n, 1: [0] * n}

db = defaultdict(url_array)

for i, url in enumerate(urls):
    if url in interesting_urls:
        db[url][0][i] = 1

print(db)
# defaultdict(<function url_array at 0x7fe8a95b87d0>, {'www.baurom.ro': {0: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0], 1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}}) 
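
One thing to keep in mind with this variant: a defaultdict only creates an entry when a key is first accessed, so an interesting URL that never occurs in urls (here 'slbz2') gets no entry at all. If every interesting URL should be present, one possible sketch is to pre-populate a plain dict instead. The shortened urls list below is illustrative, not the original data:

```python
urls = ['www.baurom.ro', 'www.baurom.ro', 'www.risco.ro']
interesting_urls = {'www.baurom.ro', 'slbz2'}
n = len(urls)

# Create an entry for every interesting URL up front.
db = {url: {0: [0] * n, 1: [0] * n} for url in interesting_urls}

for i, url in enumerate(urls):
    if url in db:
        db[url][0][i] = 1

print(db['www.baurom.ro'][0])  # [1, 1, 0]
print(db['slbz2'][0])          # [0, 0, 0]
```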
0

Source: https://habr.com/ru/post/1680938/

