The fastest way to find 2 numbers from two lists that total x

My code is:

n = 3 a1 = 0 b1 = 10 a2 = 2 b2 = 2 if b1>n: b1=n if b2>n: b2=n diap1 = [x for x in range(a1, b1+1)] diap2 = [x for x in range(a2, b2+1)] def pairs(d1, d2, n): res = 0 same = 0 sl1 = sorted(d1) sl2 = sorted(d2) for i in sl1: for j in sl2: if i+j==n and i!=j: res+=1 elif i+j==n and i==j: same+=1 return(res+same) result = pairs(diap1, diap2, n) print(result) 

NOTE. n, a1, b1, a2, b2 may vary . The code should find 2 numbers from 2 lists (1 from each), which in total are n. For example: the pairs (a, b) and (b, a) are different , but (a, a) and (a, a) are the same pair . So, the output of my code is correct, and for code above 1 (1, 2), but for large inputs it takes too much time. How can I optimize it to work faster?

+5
source share
5 answers

Use set () for a quick search ...

 setd2 = set(d2) 

Do not try to use all possible pairs of numbers. Once you lock the number from the first list, say i, just look to see if (ni) is in the second set.

 for i in sl1: if (ni) in setd2: # found match else: # no match in setd2 for i 
+4
source

Below you can work faster and find two numbers whose sum is n and store them also in the list of tuples.

 s1 = set(list1) s2 = set(list2) nums = [] for item in s1: if n-item in s2: nums.append((item, n-item)) 
+1
source

The accepted answer is very easy to understand and implement, but I just had to share this method. You can see that your question is the same as this one .
This answer is particularly interesting because you do not need extra space when inserting into a set. I included the algorithm in my answer.

If arrays are sorted, you can do this in linear and persistent storage.

  • Start with two pointers, one pointing to the smallest element of A, the other pointing to the largest element of B.
  • Calculate the sum of the pointed elements.
  • If it is less than k, draw a pointer to A so that it points to the next largest element.
  • If it is greater than k, reduce the pointer to B so that it points to the next smallest element.
  • If that’s accurate, then you’ve found a couple. Move one of the pointers and continue to search for the next pair.

If the arrays are initially unsorted, you can sort them first and then use the algorithm above.

+1
source

Thank you for clearly defining your question and for providing the code with an example that you are trying to optimize.

Using two key definitions from your question and notation you provided that I limited my attempt to optimize the use of lists and added the ability to randomly change the values ​​associated with n, a1, b1, a2 and b2.

To show the results of the optimization, I created a module that includes using the random.randit function to create different list sizes and the timeit.Timer function to fix the amount of time that your original pair () takes, as well as my proposed optimization in pairs2 () function .

In the pairs2 () function, you will notice that each iteration loop contains break. They eliminate unnecessary iteration through each list once the required criteria are met. It should be noted that the size of the lists is increasing, the number of pairs2 () is increasing compared to pairs ().

Test Module Code:

 import random from timeit import Timer max_value = 10000 n = random.randint(1, max_value) a1 = random.randint(0, max_value) b1 = random.randint(1, max_value+1) a2 = random.randint(0, max_value) b2 = random.randint(1, max_value+1) if b1>n: b1=n if b2>n: b2=n if a1>=b1: a1 = random.randint(0, b1-1) if a2>=b2: a2 = random.randint(0, b2-1) diap1 = [x for x in range(a1, b1)] diap2 = [x for x in range(a2, b2)] print("Length diap1 =", len(diap1)) print("Length diap2 =", len(diap2)) def pairs(d1, d2, n): res = 0 same = 0 sl1 = sorted(d1) sl2 = sorted(d2) for i in sl1: for j in sl2: if i+j==n and i!=j: res+=1 elif i+j==n and i==j: same+=1 return(res+same) def pairs2(d1, d2, n): res = 0 same = 0 sl1 = sorted(d1) sl2 = sorted(d2) for i in sl1: for j in sl2: if i+j==n and i!=j: res+=1 break elif i+j==n and i==j: same+=1 break if res+same>0: break return(res+same) if __name__ == "__main__": result=0 timer = Timer("result = pairs(diap1, diap2, n)", "from __main__ import diap1, diap2, n, pairs") print("pairs_time = ", timer.timeit(number=1), "result =", result) result=0 timer = Timer("result = pairs2(diap1, diap2, n)", "from __main__ import diap1, diap2, n, pairs2") print("pairs2_time = ", timer.timeit(number=1), "result =", result) 
+1
source

If you pull the value of n from the first list and then find the value of m in the second list so that the sum matches the value found, you can make several shortcuts. For example, if the sum is less, all values ​​from the second list that are less than or equal to m will also not give the correct amount. Similarly, if the amount is greater.

Using this information, I would use the following steps:

  • Set up two heaps, one minimum heap, one maximum heap.
  • Look at the top elements of each heap:
    • If the amount matches the value found, you are done.
    • If the sum exceeds the search value, remove the value from the maximum heap.
    • If the sum is less than the desired value, remove the value from the minimum heap.
  • If the heap is empty, there is no solution.

Note that using heap is an optimization by directly sorting two sequences. However, if you often find that there is no match, sorting the numbers before the algorithm can be faster. The reason for this is that a good sorting algorithm will outperform implicit sorting through heaps rather than its asymptotic complexity, but rather some constant factors.

0
source

Source: https://habr.com/ru/post/1235453/


All Articles