Let's say I want to calculate the difference of two lists C = A - B :
A = [1,2,3,4,5,6,7,8,9] B = [1,3,5,8,9] C = [2,4,6,7]
A and B are sorted with unique integers (not sure if there is a way to tell Python about this list property). I need to keep the order of the elements. AFAIK, there are two possible ways to do this.
Method 1 : convert B to set and use list use to create C:
s = set(B) C = [x for x in A if x not in s]
Method 2 Direct use of the list:
C = [x for x in A if x not in B]
Why is #1 more effective than #2 ? Is there no overhead to convert to set? What am I missing here?
Some performance tests are provided in this answer.
UPDATE:. I know that the average search time O(1) corresponds to the average of the list O(n) , but if the original list A contains about a million integers, 'Does creating a collection actually take longer?