I am an inexperienced programmer working on several bioinformatics exercises in Python.
One of the problems involves counting the elements in the set intersection between groups of names and storing those counts in a dictionary. There are two lists of 2,000 name groups each; the names in the groups are Latin species names. For instance:
list_of_name_groups_1 = [
    ['Canis Lupus', 'Canis Latrans'],
    ['Euarctos Americanus', 'Lynx Rufus'],
    ...
]
list_of_name_groups_2 = [
    ['Nasua Narica', 'Odocoileus Hemionus'],
    ['Felis Concolor', 'Peromyscus Eremicus'],
    ['Canis Latrans', 'Cervus Canadensis'],
    ...
]
I need a dictionary that contains the size of the intersection for every pair of name groups, for example:
>>> intersections
{ (0, 0): 0, (0, 1): 0, (0, 2): 1, (1, 0): 0, (1, 1): 0, (1, 2): 0,
  (2, 0): 1, (2, 1): 0, (2, 2): 0 }
('Canis Latrans' occurs in element 0 of the first list and in element 2 of the second list.)
I have an implementation that works, but it is too slow:
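overlap = {}
# enumerate() supplies the integer indices used as dictionary keys
for i, group_1 in enumerate(list_of_name_groups_1):
    for j, group_2 in enumerate(list_of_name_groups_2):
        overlap[(i, j)] = len(set(group_1) & set(group_2))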
Is there a faster way to compute the sizes of these intersections?
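For reference, here is a minimal sketch of one direction that might help, assuming most pairs of groups share no names at all. Instead of intersecting every pair, it builds an inverted index from each name to the groups of the second list that contain it, then touches only the pairs that actually share a name. The function name `count_overlaps` and the variables `groups_1` and `groups_2` are hypothetical, chosen for illustration, and whether this is faster on the real data would need to be measured.

```python
from collections import Counter, defaultdict


def count_overlaps(groups_1, groups_2):
    """Count shared names for each (i, j) pair of groups that shares any name."""
    # Inverted index: name -> indices of the groups in groups_2 that contain it.
    index_2 = defaultdict(set)
    for j, group in enumerate(groups_2):
        for name in group:
            index_2[name].add(j)

    # A Counter returns 0 for pairs that were never incremented,
    # so pairs with an empty intersection need no explicit entry.
    overlap = Counter()
    for i, group in enumerate(groups_1):
        for name in set(group):  # set() so a repeated name is counted once
            for j in index_2.get(name, ()):
                overlap[(i, j)] += 1
    return overlap


# Small sample taken from the example lists in the question.
groups_1 = [
    ['Canis Lupus', 'Canis Latrans'],
    ['Euarctos Americanus', 'Lynx Rufus'],
]
groups_2 = [
    ['Nasua Narica', 'Odocoileus Hemionus'],
    ['Felis Concolor', 'Peromyscus Eremicus'],
    ['Canis Latrans', 'Cervus Canadensis'],
]

overlap = count_overlaps(groups_1, groups_2)
print(overlap[(0, 2)])  # 1
print(overlap[(1, 1)])  # 0
```

Unlike the nested loops, this stores only the non-empty intersections; looking up any other pair still yields 0. If a plain dictionary with an entry for all pairs is required, the zeros would have to be filled in afterwards, which is itself a pass over every pair.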