Python merging unsorted lists - algorithm analysis

For two arrays with the following structure:

array = [(index_1, item_1), (index_2, item_2), ..., (index_n, item_n)]

The elements inside an array may be unordered, for example these two Python lists:

arr1 = [(1,'A'), (2, 'B'), (3,'C')]
arr2 = [(3,'c'), (2, 'b'), (1,'a')]

I would like to analyze the merging of these arrays. I can think of two ways to do the merge. The first is a naive nested iteration over both arrays:

merged = []
for item in arr1:
    for item2 in arr2:
        # compare every pair of keys: len(arr1) * len(arr2) comparisons
        if item[0] == item2[0]:
            merged.append((item[0], item[1], item2[1]))

# merged
# [(1, 'A', 'a'), (2, 'B', 'b'), (3, 'C', 'c')]

This naive approach should be O(n ** 2) in big-O terms.

A slightly better (?) approach is to sort the arrays first (Python's sort is O(n log n)):

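arrs = sorted(arr1, key=lambda t: t[0])
arr2s = sorted(arr2, key=lambda t: t[0])

merged_s = []
# assumes both lists have the same length and the same set of keys
for idx, item in enumerate(arrs):
    merged_s.append(tuple(list(item) + [arr2s[idx][1]]))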

So this approach is O(n log n) overall. Is this analysis correct?
What about the case where the lists have unequal lengths m and n?
Is there a more efficient way than sorting first?

+4

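For n > m, the naive approach is O(n * m) and the sort-based approach is O(n log n). (NB: the sort-based code as written breaks when n != m: if len(arr1) > len(arr2), the index runs past the end of arr2.)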
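
For lists of unequal length, the sort-based idea can still be made to work by walking both sorted lists with two separate indices instead of one shared index. The following is only a minimal sketch, not code from the thread: the function name `merge_sorted_join` is hypothetical, and it assumes keys are unique within each list and comparable with `<`.

```python
def merge_sorted_join(arr1, arr2):
    """Join two lists of (key, value) pairs on key; lengths may differ."""
    a = sorted(arr1, key=lambda t: t[0])  # O(n log n)
    b = sorted(arr2, key=lambda t: t[0])  # O(m log m)
    merged = []
    i = j = 0
    # Advance whichever side holds the smaller key; emit a row on a match.
    while i < len(a) and j < len(b):
        if a[i][0] == b[j][0]:
            merged.append((a[i][0], a[i][1], b[j][1]))
            i += 1
            j += 1
        elif a[i][0] < b[j][0]:
            i += 1
        else:
            j += 1
    return merged


arr1 = [(1, 'A'), (2, 'B'), (3, 'C'), (4, 'D')]
arr2 = [(3, 'c'), (2, 'b'), (1, 'a')]
print(merge_sorted_join(arr1, arr2))
# [(1, 'A', 'a'), (2, 'B', 'b'), (3, 'C', 'c')]
```

The merge loop itself is linear in n + m, so the two sorts dominate the cost.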

Grouping the values by key in a dictionary is a) O(n + m) and b) [...]:

import itertools
arr1 = [(1,'A'), (2, 'B'), (3,'C'), (4, 'D')]
arr2 = [(3,'c'), (2, 'b'), (1,'a'), (5, 'E')]

output_dict = {}
for key, value in itertools.chain(arr1, arr2): # I like itertools
    output_dict.setdefault(key, []).append(value)
output = [(key,)+tuple(values) for key, values in output_dict.items() if len(values)==2]

Output:

[(1, 'A', 'a'), (2, 'B', 'b'), (3, 'C', 'c')]
+1
arr1 = [(1,'A'), (2, 'B'), (3,'C')]
arr2 = [(3,'c'), (2, 'b'), (1,'a')]
key2value = dict()
for item in arr1:
    key2value[item[0]] = [item[1]]
for item in arr2:
    try:
        value = key2value[item[0]]
        value.append(item[1])
    except KeyError:  # key appears only in arr2
        key2value[item[0]] = [item[1]]

# dict.items() is the Python 3 spelling of iteritems()
result = [tuple([key] + value) for key, value in key2value.items()]

The time complexity is O(m + n), where m = len(arr1) and n = len(arr2).

0

O(N log N) complexity should be preferred over O(N²). Compare 1000 · log2(1000) ≈ 9966 with 1000² = 1000000.
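
The same comparison can be reproduced for other input sizes. This is a small illustrative sketch; the helper name `growth` is hypothetical and the sizes are arbitrary.

```python
import math


def growth(n):
    """Return (n * log2(n), n ** 2) for an input size n."""
    return n * math.log2(n), n ** 2


for n in (10, 1000, 1000000):
    n_log_n, n_squared = growth(n)
    print(f"n={n}: n*log2(n) ~ {n_log_n:.0f}, n**2 = {n_squared}")
```

For n = 1000 this prints the figures quoted above (about 9966 versus 1000000), and the gap widens as n grows.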

In any case, all the analyses on this page are incorrect, since they assume that operations on Python structures, such as appending or inserting into a dictionary, take constant time, which is false.
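
Whether dictionary insertion behaves like a constant-time operation for a given workload can be checked empirically. The following is a rough sketch under the assumption that `timeit` measurements on one machine are indicative; the helper name `time_per_insert` and the sizes are hypothetical choices, and the numbers will vary from run to run.

```python
import timeit


def time_per_insert(n):
    """Average seconds per dict insertion when building a dict of n int keys."""
    def build():
        d = {}
        for k in range(n):
            d[k] = k
        return d

    # Take the best of several runs to reduce noise from other processes.
    best = min(timeit.repeat(build, number=1, repeat=5))
    return best / n


for n in (1000, 10000, 100000):
    print(n, time_per_insert(n))
```

If the per-insert time stays roughly flat as n grows, insertion is effectively constant on average for that workload; if it climbs steadily, the O(m + n) estimates above are too optimistic.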


Source: https://habr.com/ru/post/1606704/

