How to remove case-insensitive duplicates from a list while maintaining the original list order?

Question

I have a list of strings, for example:

myList = ["paper", "Plastic", "aluminum", "PAPer", "tin", "glass", "tin", "PAPER", "Polypropylene Plastic"]

I want this result (and this is the only acceptable result):

myList = ["paper", "Plastic", "aluminum", "tin", "glass", "Polypropylene Plastic"]

Note that if one element ("Polypropylene Plastic") contains another element ("Plastic"), I would still like to keep both. The case may differ, but an element must otherwise match letter for letter in order to be removed.

The original list order must be kept. All duplicates after the first instance of this item must be deleted. The original case of this first instance must be preserved, as well as the original cases of all non-duplicated elements.

I searched but found only questions that address one requirement or the other, not both.

+4

python list

Crickets Jan 16 '18 at 14:16

6

EDIT: , , . , , .

import string

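def custom_filter(my_list):
    seen = set()
    result_list = []
    for i in my_list:
        # Normalize to capitalized words, e.g. "PAPer" -> "Paper".
        item = string.capwords(i)
        # Fall back to lowercase if that spelling is not in the input.
        if item not in my_list:
            item = item.lower()
        # Note: the normalized form is stored, not the original string.
        if item not in seen:
            result_list.append(item)
            seen.add(item)
    return result_list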


print(custom_filter(myList))

Output:

['paper', 'Plastic', 'aluminum', 'tin', 'glass', 'Polypropylene Plastic']

0

Gábor Fekete Jan 16 '18 at 14:28

mydict = {}
myList = ["paper", "Plastic", "aluminum", "PAPer", "tin", "glass", "tin", "PAPER", "Polypropylene Plastic"]
mynewList = []
for elem in myList:
    key = elem.lower()
    if key in mydict:
        continue  # a case-insensitive duplicate was already kept
    mydict[key] = elem  # remember the first spelling seen
    mynewList.append(elem)
print(mynewList)

['paper', 'Plastic', 'aluminum', 'tin', 'glass', 'Polypropylene Plastic']

, , Jean-François Fabre, .

0

GeneX Jan 16 '18 at 14:38

import pandas as pd

df = pd.DataFrame(myList)
df['lower'] = df[0].str.lower()  # case-insensitive grouping key
# sort=False keeps the groups in order of first appearance
print(df.groupby('lower', sort=False)[0].first().tolist())

Output:

['paper', 'Plastic', 'aluminum', 'tin', 'glass', 'Polypropylene Plastic']

0

Binyamin Even Jan 16 '18 at 14:42

Using collections.defaultdict:

from collections import defaultdict

myList = ["paper", "Plastic", "aluminum", "PAPer", "tin", "glass", "tin", "PAPER", "Polypropylene Plastic"]
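d_dict = defaultdict(list)
for idx, word in enumerate(myList):
    d_dict[word.lower()].append(idx)  # indices of each case-insensitive group

# keep the first index of every group, in original order
print([myList[j] for j in sorted(i[0] for i in d_dict.values())])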

['paper', 'Plastic', 'aluminum', 'tin', 'glass', 'Polypropylene Plastic']

0

Transhuman Jan 16 '18 at 14:48

In reply to @Gábor Fekete, an alternative:

myList = ["paper", "Plastic", "aluminum", "PAPer", "tin", "glass",
          "tin", "PAPER", "Polypropylene Plastic"]

def is_already_in(value, used_elements):
  low = value.lower()
  if low in used_elements:
    return True
  used_elements.add(low)
  return False

used_elements = set()
print([ e for e in myList if not is_already_in(e, used_elements) ])

-1

Alfe Jan 16 '18 at 14:41

Jean-François Fabre · Accepted Answer · 2018-01-16T14:22:24+0000

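This is hard to write as a list comprehension (at least without sacrificing clarity), because filtering out duplicates needs a mark/test memory that is built up along the way.

A set on its own will not do either, because it does not preserve the original order.

The classic approach is a loop with an auxiliary set that stores the lowercased version of every string seen so far. A string is appended to the result only if its lowercased version is not yet in the set: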

myList = ["paper", "Plastic", "aluminum", "PAPer", "tin", "glass", "tin", "PAPER", "Polypropylene Plastic"]
result=[]

marker = set()

for l in myList:
    ll = l.lower()
    if ll not in marker:   # test presence
        marker.add(ll)
        result.append(l)   # preserve order

print(result)

Output:

['paper', 'Plastic', 'aluminum', 'tin', 'glass', 'Polypropylene Plastic']

Using .casefold() instead of .lower() handles subtle "casing" differences in some locales (for example, the German double "s" in Strasse/Straße).
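To make the difference concrete, here is a minimal sketch that runs the same loop with either normalizer. The helper name dedupe and the sample words are illustrative assumptions, not part of the original answer.

```python
def dedupe(items, normalize):
    """Keep the first occurrence of each item, comparing by normalize(item)."""
    marker = set()
    result = []
    for item in items:
        key = normalize(item)
        if key not in marker:   # test presence
            marker.add(key)
            result.append(item)  # preserve order and original spelling
    return result


words = ["Straße", "STRASSE", "strasse"]  # illustrative sample data

by_lower = dedupe(words, str.lower)        # "straße" and "strasse" stay distinct
by_casefold = dedupe(words, str.casefold)  # every spelling folds to "strasse"

print(by_lower)     # ['Straße', 'STRASSE']
print(by_casefold)  # ['Straße']
```

With .lower(), "Straße" survives next to "STRASSE" because "ß" is left untouched; .casefold() maps "ß" to "ss", so all three spellings count as duplicates.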

Edit: it is possible to do this with a list comprehension, but it is quite hacky:

marker = set()
result = [not marker.add(x.casefold()) and x for x in myList if x.casefold() not in marker]

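This relies on set.add returning None: not None is True, so the expression calls set.add (a side effect inside a list comprehension, which is rarely a good thing...) and then evaluates to x in every case. The main disadvantages are:

readability, and the fact that casefold() is called twice per element: once for the membership test and once for storing in the marker set.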
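One possible way to avoid the double casefold() call without the side-effect trick is a dict keyed by the folded string. This is a sketch and not part of the original answer: the function name dedupe_casefold is hypothetical, and the approach assumes Python 3.7+, where dicts preserve insertion order.

```python
def dedupe_casefold(items):
    """Return items without case-insensitive duplicates, first spelling wins."""
    first_seen = {}
    for item in items:
        # setdefault stores the item only if its folded key is new,
        # so casefold() runs once per element.
        first_seen.setdefault(item.casefold(), item)
    # Keys were inserted in order of first appearance (Python 3.7+).
    return list(first_seen.values())


my_list = ["paper", "Plastic", "aluminum", "PAPer", "tin", "glass",
           "tin", "PAPER", "Polypropylene Plastic"]
print(dedupe_casefold(my_list))
# ['paper', 'Plastic', 'aluminum', 'tin', 'glass', 'Polypropylene Plastic']
```

The trade-off is that the dict keeps every original string in memory alongside its key, which the plain marker set does not.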
