Unique Combinations of Values in Pandas DataFrames

Is there an easy way to get the distinct combinations of values in a DataFrame? I have used pd.Series.unique() for single columns, but what about multiple columns?

Sample data:

df = pd.DataFrame(data=[[1, 'a'], [2, 'a'], [3, 'b'], [3, 'b'], [1, 'b'], [1, 'b']], columns=['number', 'letter'])

Expected output:

(1, a)
(2, a)
(3, b)
(1, b)

Ideally, I would like a separate Series object of tuples holding the distinct values.

2 answers

You can zip the columns together and build a set from the resulting tuples (note that a set does not preserve the original row order):

 >>> set(zip(df.number, df.letter))
 {(1, 'a'), (1, 'b'), (2, 'a'), (3, 'b')}
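
If the order of first appearance matters, or a Series of tuples is wanted as the question asks, one possible variation on the zip approach is sketched below. It swaps the set for dict.fromkeys, which is not part of the original answer; the names pairs and unique_pairs are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[1, 'a'], [2, 'a'], [3, 'b'], [3, 'b'], [1, 'b'], [1, 'b']],
    columns=['number', 'letter'],
)

# dict.fromkeys keeps only the first occurrence of each key and, unlike a
# set, remembers insertion order, so the row order is preserved.
pairs = list(dict.fromkeys(zip(df['number'], df['letter'])))

# Wrap the tuples in a Series (object dtype), one tuple per element.
unique_pairs = pd.Series(pairs)
print(unique_pairs)
```

This keeps the combinations in the same order as the expected output in the question.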

IIUC, you can set the index to these columns and then call unique on the resulting index:

 In [165]: idx = df.set_index(['number','letter']).index
           idx.unique()
 Out[165]: array([(1, 'a'), (2, 'a'), (3, 'b'), (1, 'b')], dtype=object)
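
A related route that neither answer mentions is DataFrame.drop_duplicates, which drops repeated rows directly. The sketch below is a suggestion under that assumption rather than the answerer's method; unique_rows and unique_pairs are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[1, 'a'], [2, 'a'], [3, 'b'], [3, 'b'], [1, 'b'], [1, 'b']],
    columns=['number', 'letter'],
)

# Keep the first occurrence of each (number, letter) combination.
unique_rows = df[['number', 'letter']].drop_duplicates()

# Convert the remaining rows to plain tuples and wrap them in a Series.
unique_pairs = pd.Series(list(unique_rows.itertuples(index=False, name=None)))
print(unique_pairs)
```

The intermediate unique_rows stays a DataFrame, which can be handy if the distinct combinations are needed for a later merge or groupby rather than as tuples.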

Source: https://habr.com/ru/post/1232288/

