Delete values that appear only once in a DataFrame column

I have a DataFrame with various values in a column x. I want to remove the rows whose value appears only once in that column.

So given this:

   x
1 10
2 30
3 30
4 40
5 40
6 50

I should get the following:

   x
2 30
3 30
4 40
5 40

I was wondering if there is a way to do this.

3 answers

You can easily do this using groupby and transform:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame([10, 30, 30, 40, 40, 50], columns=['x'])

In [3]: df = df[df.groupby('x').x.transform(len) > 1]

In [4]: df
Out[4]: 
    x
1  30
2  30
3  40
4  40
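
To see why this works, it may help to look at the intermediate result. The sketch below assumes the same sample data; the names counts, mask and result are only for illustration, and 'size' is used here in place of len (it should give the same group sizes).

```python
import pandas as pd

df = pd.DataFrame([10, 30, 30, 40, 40, 50], columns=['x'])

# For every row, the size of the group its value belongs to,
# aligned to the original index.
counts = df.groupby('x')['x'].transform('size')

# True for rows whose value occurs more than once.
mask = counts > 1

# Boolean indexing keeps only the repeated values.
result = df[mask]
print(result)
```

Because transform returns a Series with the same index as df, it can be used directly as a boolean mask.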

You can use groupby and then filter it:

In [9]:    
df = pd.DataFrame([10, 30, 30, 40, 40, 50], columns=['x'])
df = df.groupby('x').filter(lambda x: len(x) > 1)
df

Out[9]:
    x
1  30
2  30
3  40
4  40

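Another option is duplicated with keep=False, which marks every occurrence of a repeated value, so this keeps only the values that appear more than once: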

df = df.loc[df.duplicated(subset='x', keep=False), :]

Conversely, to keep only the values that appear exactly once:

df = df.loc[~df.duplicated(subset='x', keep=False), :]

To drop repeated values but keep the first occurrence of each value:

df = df.loc[~df.duplicated(subset='x'), :]

The same first-occurrence result can be obtained with drop_duplicates:

df = df.drop_duplicates(subset='x')
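
As a rough side-by-side comparison of the four variants, here is a small sketch on the sample data from the question. The variable names (repeated, singles, first_only, deduped) are made up for this example.

```python
import pandas as pd

df = pd.DataFrame([10, 30, 30, 40, 40, 50], columns=['x'])

# keep=False flags every occurrence of a repeated value.
repeated = df.loc[df.duplicated(subset='x', keep=False), :]

# Negating that mask leaves only values that occur exactly once.
singles = df.loc[~df.duplicated(subset='x', keep=False), :]

# The default keep='first' flags only the second and later occurrences,
# so negating it keeps one row per distinct value.
first_only = df.loc[~df.duplicated(subset='x'), :]

# drop_duplicates does the same thing in one call.
deduped = df.drop_duplicates(subset='x')

print(repeated['x'].tolist())    # [30, 30, 40, 40]
print(singles['x'].tolist())     # [10, 50]
print(first_only['x'].tolist())  # [10, 30, 40, 50]
print(deduped['x'].tolist())     # [10, 30, 40, 50]
```

Only the first variant matches what the question asks for; the others are listed for the related cases.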

Source: https://habr.com/ru/post/1611212/

