Dividing two columns of an unstacked data frame

I have two columns in a pandas DataFrame.

Column 1 is named ed and contains strings (e.g. 'a', 'a', 'b', 'c', 'c', 'a')

ed column = ['a','a','b','c','c','a'] 

Column 2 is named job and also contains strings (e.g. 'aa', 'bb', 'aa', 'aa', 'bb', 'cc')

 job column = ['aa','bb','aa','aa','bb','cc'] #these are example values from column 2 of my pandas data frame 

Then I build a frequency table from the two columns:

 my_counts= pdata.groupby(['ed','job']).size().unstack().fillna(0) 

Now, how do I divide the frequencies in one column by the frequencies in another column of this frequency table? I want to take that ratio and pass it to argsort() so that I can sort by the computed ratio, but I don't know how to refer to each column of the resulting table.

1 answer

I initialized the data as follows:

 import pandas as pd

 ed_col = ['a','a','b','c','c','a']
 job_col = ['aa','bb','aa','aa','bb','cc']
 pdata = pd.DataFrame({'ed':ed_col, 'job':job_col})
 # count each (ed, job) pair, pivot job into columns, fill missing pairs with 0
 my_counts= pdata.groupby(['ed','job']).size().unstack().fillna(0)

Now my_counts looks like this:

 job  aa  bb  cc
 ed
 a     1   1   1
 b     1   0   0
 c     1   1   0

To access a column, you can use my_counts.aa or my_counts['aa'] . To access a row, you can use my_counts.loc['a'] .

Thus, the frequencies aa divided by bb are my_counts['aa'] / my_counts['bb']
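One thing worth checking with this example data: row b has a bb count of 0, so the division for that row should give inf rather than raise an error. A minimal sketch that rebuilds the table from the question and computes the ratio (the name ratio is only illustrative):

```python
import pandas as pd

# Same example data as in the question.
pdata = pd.DataFrame({
    'ed': ['a', 'a', 'b', 'c', 'c', 'a'],
    'job': ['aa', 'bb', 'aa', 'aa', 'bb', 'cc'],
})
my_counts = pdata.groupby(['ed', 'job']).size().unstack().fillna(0)

# Element-wise division of two columns, aligned on the 'ed' index.
ratio = my_counts['aa'] / my_counts['bb']

# Row 'b' has bb == 0, so its ratio is inf instead of an exception.
print(ratio)
```

If an infinite ratio is not what you want for such rows, you would need to decide how to treat them (for example, drop or replace them) before sorting.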

and now, if you want to sort it, you can do:

 my_counts.iloc[(my_counts['aa'] / my_counts['bb']).argsort()] 
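As a possible alternative to argsort(), the ratio can be sorted as a Series and its index used to reorder the table by label. This is only a sketch of one option, not the method from the answer above; the names ratio and sorted_counts are illustrative, and kind='mergesort' is chosen here so that rows with equal ratios keep their original order:

```python
import pandas as pd

# Same example data as in the question.
pdata = pd.DataFrame({
    'ed': ['a', 'a', 'b', 'c', 'c', 'a'],
    'job': ['aa', 'bb', 'aa', 'aa', 'bb', 'cc'],
})
my_counts = pdata.groupby(['ed', 'job']).size().unstack().fillna(0)

ratio = my_counts['aa'] / my_counts['bb']

# Sort the ratios (stable sort), then select the table rows in that label order.
order = ratio.sort_values(kind='mergesort').index
sorted_counts = my_counts.loc[order]

print(sorted_counts)
```

Passing ascending=False to sort_values would put the largest ratios first.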

Source: https://habr.com/ru/post/1247113/

