Find the position of a value that occurs only once in a dataframe

I am looking for the most idiomatic Python way to return the position (row | column) of a value in a pandas DataFrame.

I have a list of numbers, list = [1,2,3,4,5,8],

and a pandas DataFrame:

df = pd.DataFrame({'A':[1,3,8,8], 'B':[3,3,2,8],'x':[0.4,0.3,0.5,0.8]})

df
Out[2]: 
   A  B    x
0  1  3  0.4
1  3  3  0.3
2  8  2  0.5
3  8  8  0.8

  • I want to compare the numbers from the list with the numbers in the DataFrame (columns ['A'] and ['B']). In the end, I want to know which numbers in the list occur in the DataFrame exactly once.

I could iterate over the DataFrame once for each number in the list, but I don't think that is the best Python way.

  1. I need the position of that value in the DataFrame in the format (row | column), because if a single number is in df['B'], I also need the value from df['A'] in the same row, and if a single number is in df['A'], I need the value from df['B'].

In the end, the result should be a new DataFrame, for example:

dfnew

   SingleNumber  AorB    x
0             1     3  0.4
1             2     8  0.5


+4
2

Setup (note that list is renamed to data here):

import pandas as pd

data = [1,2,3,4,5,8]
df = pd.DataFrame({'A':[1,3,8,8], 'B':[3,3,2,8],'x':[0.4,0.3,0.5,0.8]})

First, flatten columns A and B into a single column of values:

flattened = pd.melt(df, value_vars=['A', 'B'])

This gives:

  variable  value
0        A      1
1        A      3
2        A      8
3        A      8
4        B      3
5        B      3
6        B      2
7        B      8

Next, keep only the values that appear in data:

in_data = flattened[flattened.value.isin(data)]

Then drop every value that occurs more than once:

only_once = in_data.drop_duplicates(subset='value', keep=False)

This gives:

  variable  value
0        A      1
6        B      2
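
The keep=False argument matters here: by default drop_duplicates keeps the first copy of each repeated value, whereas keep=False discards every copy, leaving only values that occur exactly once. A small sketch on the same values as the flattened column (the names keep_first and keep_none are only illustrative):

```python
import pandas as pd

# Same values as the flattened A and B columns above.
values = pd.DataFrame({'value': [1, 3, 8, 8, 3, 3, 2, 8]})

# Default (keep='first'): one copy of every distinct value survives.
keep_first = values.drop_duplicates(subset='value')

# keep=False: any value that appears more than once is removed entirely.
keep_none = values.drop_duplicates(subset='value', keep=False)

print(keep_first['value'].tolist())  # [1, 3, 8, 2]
print(keep_none['value'].tolist())   # [1, 2]
```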

Now select the corresponding rows from the original DF:

new_df = df.iloc[only_once.index // len(df.columns)]

This gives:

   A  B    x
0  1  3  0.4
2  8  2  0.5

Finally, add the single number as a new column:

new_df['single_number'] = only_once.value.values

Result:

   A  B    x  single_number
0  1  3  0.4              1
2  8  2  0.5              2

If you like, you can then call .reset_index(drop=True) so that the index is 0 and 1.


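Edit:

As it turns out, the above does not always work. For example, if df is df = pd.DataFrame({'A':[1,3,8,5], 'B':[3,3,2,8],'x':[0.4,0.3,0.5,0.8]}), then new_df is wrong.

Instead of relying on index arithmetic, it is more robust to reset the index before melting, so that each melted row keeps its original row label: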

df = pd.DataFrame({'A':[1,3,8,5], 'B':[3,3,2,8],'x':[0.4,0.3,0.5,0.8]})
unique = pd.melt(
    df.reset_index(), 
    id_vars='index', 
    value_vars=['A', 'B'],
    value_name='SingleNumber'
).drop_duplicates(subset='SingleNumber', keep=False)

This gives:

   index variable  SingleNumber
0      0        A             1
3      3        A             5
6      2        B             2

Then merge this back onto the original DataFrame using the saved row labels:

new_df = df.merge(unique, left_index=True, right_on='index')

Result:

   A  B    x  index variable  SingleNumber
0  1  3  0.4      0        A             1
6  8  2  0.5      2        B             2
3  5  8  0.8      3        A             5

Afterwards you can reset the index and drop the helper columns if you like.
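
To see why the expression only_once.index // len(df.columns) from the first version can pick the wrong rows: pd.melt places the values of A first and the values of B below them, so with a default integer index the melted label i belongs to original row i % len(df), not i // len(df.columns). The following is a minimal sketch comparing the two on the second example frame; it assumes df has a default RangeIndex, and the names rows_by_division and rows_by_modulo are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 8, 5], 'B': [3, 3, 2, 8], 'x': [0.4, 0.3, 0.5, 0.8]})

flattened = pd.melt(df, value_vars=['A', 'B'])
only_once = flattened.drop_duplicates(subset='value', keep=False)

# Arithmetic from the first version: divides by the number of columns.
rows_by_division = (only_once.index // len(df.columns)).tolist()

# melt stacks whole columns, so the row position repeats every len(df) labels.
rows_by_modulo = (only_once.index % len(df)).tolist()

print(only_once['value'].tolist())  # [1, 5, 2]
print(rows_by_division)             # [0, 1, 2]  (row 1 does not contain a unique value)
print(rows_by_modulo)               # [0, 3, 2]  (rows holding 1, 5 and 2)
```

The modulo form only holds for a default integer index, which is one reason the reset_index and merge version is the safer choice.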

+2

Here is another way, applying a function to each number in the list:

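dfnew = pd.DataFrame([1,2,3,4,5,8], columns=['SingleNumber'])

# Assumption: df_values is not defined in the post. The two-level index used
# below suggests the stacked A and B columns, indexed by (row, column).
df_values = df[['A', 'B']].stack()

def func(row):
    # All positions in A and B that hold this number
    match = df_values[df_values == row['SingleNumber']]
    if len(match) == 1:
        idx = match.index.get_level_values(0)[0]
        col = match.index.get_level_values(1)[0]
        return pd.Series({
                'AorB': df.loc[idx, 'A' if col == 'B' else 'B'], 
                'x': df.loc[idx, 'x']
            })
    # No unique match: return NaNs so that dropna() below removes the row
    return pd.Series({'AorB': float('nan'), 'x': float('nan')})

dfnew.join(dfnew.apply(func, axis=1)).dropna()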
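
The same result can probably be reached without a row-wise apply. The following is a minimal sketch, not the answerer's code: it stacks A and B into a long table, keeps the values from data that occur exactly once, and then looks up the value in the other column. The names long, once and other_col are hypothetical, and the sketch assumes only the two columns A and B are compared.

```python
import pandas as pd

data = [1, 2, 3, 4, 5, 8]
df = pd.DataFrame({'A': [1, 3, 8, 8], 'B': [3, 3, 2, 8], 'x': [0.4, 0.3, 0.5, 0.8]})

# Long format: one line per (original row label, column name) pair.
long = (
    df[['A', 'B']]
    .stack()
    .rename_axis(['row', 'col'])
    .reset_index(name='SingleNumber')
)

# Values from `data` that occur exactly once across A and B.
once = long[long['SingleNumber'].isin(data)].drop_duplicates(
    subset='SingleNumber', keep=False
)

# For a hit in A take the value from B in the same row, and vice versa.
other_col = once['col'].map({'A': 'B', 'B': 'A'})

dfnew = pd.DataFrame({
    'SingleNumber': once['SingleNumber'].to_numpy(),
    'AorB': [df.at[r, c] for r, c in zip(once['row'], other_col)],
    'x': df.loc[once['row'], 'x'].to_numpy(),
})
print(dfnew)
```

Because the original row label travels with each value, this does not depend on index arithmetic.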


+1

Source: https://habr.com/ru/post/1660500/

