Delete Python List Item

I have two lists,

 l1 = [1,2,3,4,5,6]
 l2 = [3,2]

I want to remove the items of l1 that are also in l2. To do that, I did something like this:

 for x in l1:
     if x in l2:
         l1.remove(x)

It gives a result like this:

 [1, 3, 4, 5, 6]

but the output should be:

 [1, 4, 5, 6]

Can anyone shed some light on this?

+4
9 answers

This is easily explained as follows.

Consider the first list you have:

 | 1 | 2 | 3 | 4 | 5 | 6 |

Now you start iterating:

 | 1 | 2 | 3 | 4 | 5 | 6 |
   ^

Nothing happens, and the iterator advances:

 | 1 | 2 | 3 | 4 | 5 | 6 |
       ^

2 is deleted:

 | 1 | 3 | 4 | 5 | 6 |
       ^

The iterator advances:

 | 1 | 3 | 4 | 5 | 6 |
           ^

And voila, 3 is still there.
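
To make the skipping visible, here is a minimal sketch that records which elements the loop actually visits. The helper name visited_while_removing is made up for this illustration; it is not part of the question's code.

```python
def visited_while_removing(items, to_remove):
    """Run the buggy remove-while-iterating loop and report what it saw."""
    items = list(items)  # private copy, so the caller's list is untouched
    visited = []
    for x in items:  # iterating over the same list that is mutated below
        visited.append(x)
        if x in to_remove:
            items.remove(x)
    return visited, items


visited, remaining = visited_while_removing([1, 2, 3, 4, 5, 6], [3, 2])
print(visited)    # [1, 2, 4, 5, 6]  (3 is never visited)
print(remaining)  # [1, 3, 4, 5, 6]
```

After 2 is removed, 3 slides into the position the loop has already passed, so the next element visited is 4.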

The solution is to iterate over a copy of the list, for example:

 for x in l1[:]:  # slice of the entire list, i.e. a copy
     if x in l2:
         l1.remove(x)

or to iterate in the opposite direction:

 for x in reversed(l1):
     if x in l2:
         l1.remove(x)

which behaves as follows:

 | 1 | 2 | 3 | 4 | 5 | 6 |
               ^
 | 1 | 2 | 3 | 4 | 5 | 6 |
           ^
 | 1 | 2 | 4 | 5 | 6 |
           ^
 | 1 | 2 | 4 | 5 | 6 |
       ^
 | 1 | 4 | 5 | 6 |
       ^
 | 1 | 4 | 5 | 6 |
   ^
+9

Why not make it a little easier? There is no need to iterate over l1 if we only want to remove the elements present in l2:

 for item in l2:
     while item in l1:
         l1.remove(item)

This gives you the desired result ...

Also, as commenters have noted, if there is a possibility of duplicates:

 l1 = list(filter(lambda x: x not in l2, l1))  # list() is needed on Python 3, where filter returns an iterator

... or one of many other options using list comprehensions.
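
As a rough check that both variants above cope with repeated values, here is a small sketch. The function names and the sample data are invented for the illustration.

```python
def remove_with_while(items, to_remove):
    """Repeatedly call remove() until no occurrence of each value is left."""
    result = list(items)  # copy, so the input list is not modified
    for item in to_remove:
        while item in result:
            result.remove(item)
    return result


def remove_with_filter(items, to_remove):
    """Build a new list that keeps only the values not in to_remove."""
    return list(filter(lambda x: x not in to_remove, items))


data = [1, 2, 3, 2, 4, 3, 5]
print(remove_with_while(data, [3, 2]))   # [1, 4, 5]
print(remove_with_filter(data, [3, 2]))  # [1, 4, 5]
```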

+7

You want the outer loop to read:

 for x in l1[:]: ... 

You cannot change the list while iterating over it and expect reasonable results. The above trick makes a copy of l1 and iterates over the copy.

Please note that if the order of the output list does not matter, and your elements are unique and hashable, you can use a set:

 set(l1).difference(l2) 

which will give you a set as output, but you can easily create a list from it:

 l1 = list(set(l1).difference(l2)) 
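
A short sketch of what the set approach does to duplicates, using made-up input values. Since the iteration order of a set is not something to rely on, the result is sorted before printing.

```python
l1 = [5, 1, 1, 4, 6]
l2 = [4, 6]

# difference() accepts any iterable, so l2 can stay a list.
remaining = set(l1).difference(l2)
print(sorted(remaining))  # [1, 5]  (the duplicate 1 is collapsed)

# The - operator, by contrast, needs a set on both sides.
try:
    set(l1) - l2
    operator_worked = True
except TypeError:
    operator_worked = False
print(operator_worked)  # False
```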
+3

As others have said, you cannot edit a list while you loop over it. A good option here is to use a list comprehension to create a new list.

 removals = set(l2)
 l1 = [item for item in l1 if item not in removals]

We make a set because a membership check on a set is much faster than on a list.
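
One detail worth knowing: the comprehension rebinds the name l1 to a new list, so any other reference to the original list still sees the old contents. Assigning to the slice l1[:] changes the existing list in place instead. A minimal sketch, where the name alias is purely illustrative:

```python
l1 = [1, 2, 3, 4, 5, 6]
alias = l1  # second reference to the same list object
removals = set([3, 2])

# Rebinding: l1 now names a new list, alias still names the old one.
l1 = [item for item in l1 if item not in removals]
print(l1)     # [1, 4, 5, 6]
print(alias)  # [1, 2, 3, 4, 5, 6]

# Slice assignment: the existing list object is modified in place.
alias[:] = [item for item in alias if item not in removals]
print(alias)  # [1, 4, 5, 6]
```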

+2

If the order of l1 and the loss of its duplicates do not matter:

 list(set(l1) - set(l2)) 

The final list() is required only if you need the result as a list. You can also just use the resulting set, which is iterable as well. If you need the result ordered, you can of course call l.sort() on the resulting list.

+2

Edit: I removed my original answer because, although it gave the correct results, it did it for non-intuitive reasons and was not very fast ... so I just left the timings:

 import timeit

 setup = """l1 = list(range(20)) + list(range(20))
 l2 = [2, 3]"""

 stmts = {
 "mgilson": """for x in l1[:]:
     if x in l2:
         l1.remove(x)""",
 "petr": """for item in l2:
     while item in l1:
         l1.remove(item)""",
 "Lattyware": """removals = set(l2)
 l1 = [item for item in l1 if item not in removals]""",
 "millimoose": """for x in l2:
     try:
         while True:
             l1.remove(x)
     except ValueError:
         pass""",
 "Latty_mgilson": """removals = set(l2)
 l1[:] = (item for item in l1 if item not in removals)""",
 "mgilson_set": """l1 = list(set(l1).difference(l2))"""
 }

 for idea in stmts:
     print("{0}: {1}".format(idea, timeit.timeit(setup=setup, stmt=stmts[idea])))

Results (Python 3.3.0 64bit, Win7):

 mgilson_set: 2.5841989922197333
 mgilson: 3.7747968857414813
 petr: 1.9669433777815701
 Latty_mgilson: 7.262900152285258
 millimoose: 3.1890831105541793
 Lattyware: 4.573971325181478
+1

You are modifying the list l1 while you iterate over it, which leads to strange behavior. (3 will be skipped during iteration.)

Iterate over a copy, or change your algorithm to iterate over l2 instead:

 for x in l2:
     try:
         while True:
             l1.remove(x)
     except ValueError:
         pass

(This should perform better than testing if x in l1 explicitly.) Nope, this performs terribly as l1 grows in size.
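
For reference, the same loop wrapped in a function, as a sketch with a hypothetical name (remove_all) and made-up data. It shows that values missing from the list are tolerated, because the ValueError raised by remove() is what ends each inner loop.

```python
def remove_all(items, unwanted):
    """Remove every occurrence of each unwanted value from items, in place."""
    for x in unwanted:
        try:
            while True:
                items.remove(x)  # raises ValueError once x is no longer present
        except ValueError:
            pass
    return items


data = [1, 2, 3, 2, 4]
remove_all(data, [2, 9])  # 9 never occurs; its ValueError is simply swallowed
print(data)  # [1, 3, 4]
```

Each remove() call scans the list from the start, which is consistent with the poor scaling noted above.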

0

FWIW I get significantly different results than @Tim Pietzcker did, using what I think is a more realistic set of input data and using a slightly more rigorous (but otherwise the same) approach to timing for different people's answers.

The names and code snippets are the same as Tim's, except that I added a variant of Lattyware's called Lattyware_rev, which determines which elements to keep rather than which to reject; it turned out to be slower than the former. Please note that the two fastest ones do not preserve the order of l1.

Here is the latest timing code:

 # Python 2 code (uses xrange and the print statement)
 import timeit

 setup = """
 import random
 random.seed(42)  # initialize to constant to get same test values
 l1 = [random.randrange(100) for _ in xrange(100)]
 l2 = [random.randrange(100) for _ in xrange(10)]
 """

 stmts = {
 "Minion91": """
 for x in reversed(l1):
     if x in l2:
         l1.remove(x)
 """,
 "mgilson": """
 for x in l1[:]:  # correction
     if x in l2:
         l1.remove(x)
 """,
 "mgilson_set": """
 l1 = list(set(l1).difference(l2))
 """,
 "Lattyware": """
 removals = set(l2)
 l1 = [item for item in l1 if item not in removals]
 """,
 "Lattyware_rev": """
 keep = set(l1).difference(l2)
 l1 = [item for item in l1 if item in keep]
 """,
 "Latty_mgilson": """
 removals = set(l2)
 l1[:] = (item for item in l1 if item not in removals)""",
 "petr": """
 for item in l2:
     while item in l1:
         l1.remove(item)
 """,
 "petr (handles dups)": """
 l1 = filter(lambda x: x not in l2, l1)
 """,
 "millimoose": """
 for x in l2:
     try:
         while True:
             l1.remove(x)
     except ValueError:
         pass
 """,
 "K.-Michael Aye": """
 l1 = list(set(l1) - set(l2))
 """,
 }

 N = 10000
 R = 3

 timings = [(idea,
             min(timeit.repeat(stmts[idea], setup=setup, repeat=R, number=N)),
            ) for idea in stmts]
 longest = max(len(t[0]) for t in timings)  # length of longest name
 exec(setup)  # get an l1 & l2 just for heading length measurements
 print('fastest to slowest timings of ideas:\n' +\
       ' ({:,d} timeit calls, best of {:d} executions)\n'.format(N, R)+\
       ' len(l1): {:,d}, len(l2): {:,d})\n'.format(len(l1), len(l2)))

 for i in sorted(timings, key=lambda x: x[1]):  # sort by speed (fastest first)
     print "{:>{width}}: {}".format(*i, width=longest)

Output:

 fastest to slowest timings of ideas:
  (10,000 timeit calls, best of 3 executions)
  len(l1): 100, len(l2): 10)

         mgilson_set: 0.143126456832
      K.-Michael Aye: 0.213544010551
           Lattyware: 0.23666971551
       Lattyware_rev: 0.466918513924
       Latty_mgilson: 0.547516608553
                petr: 0.552547776807
             mgilson: 0.614238139366
            Minion91: 0.728920176815
          millimoose: 0.883061820848
 petr (handles dups): 0.984093136969

Of course, please let me know if there is something radically wrong that would explain the radically different results.

0
 l1 = [1, 2, 3, 4, 5, 6]
 l2 = [3, 2]
 [l1.remove(x) for x in l2]
 print(l1)
 # [1, 4, 5, 6]
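
Note that list.remove deletes only the first matching element and raises ValueError when the value is absent, so this one-liner relies on every item of l2 occurring exactly once in l1. A small sketch with made-up inputs:

```python
l1 = [1, 2, 2, 3]
[l1.remove(x) for x in [2]]
print(l1)  # [1, 2, 3]  (only the first 2 is removed)

try:
    [l1.remove(x) for x in [7]]  # 7 is not in l1
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```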
0

Source: https://habr.com/ru/post/1439168/

