Python pandas dataframe sort_values not working

Question


I have the following pandas data frame that I want to sort by 'test_type':

    test_type         tps          mtt        mem        cpu       90th
  0  sso_1000  205.263559  4139.031090  24.175933  34.817701  4897.4766
  1  sso_1500  201.127133  5740.741266  24.599400  34.634209  6864.9820
  2  sso_2000  203.204082  6610.437558  24.466267  34.831947  8005.9054
  3   sso_500  189.566836  2431.867002  23.559557  35.787484  2869.7670

This is my code for loading the data frame and sorting it. The first print statement produces the data frame shown above.

  import pandas as pd

  df = pd.read_csv(file)  # reads from a csv file
  print(df)
  df = df.sort_values(by=['test_type'], ascending=True)
  print('\nAfter sort...')
  print(df)

After sorting, the printed data frame still looks exactly the same as before.

Program output:

  After sort...
    test_type         tps          mtt        mem        cpu       90th
  0  sso_1000  205.263559  4139.031090  24.175933  34.817701  4897.4766
  1  sso_1500  201.127133  5740.741266  24.599400  34.634209  6864.9820
  2  sso_2000  203.204082  6610.437558  24.466267  34.831947  8005.9054
  3   sso_500  189.566836  2431.867002  23.559557  35.787484  2869.7670

I expect row 3 (test_type sso_500) to be on top after sorting. Can someone help me understand why it does not work as expected?


python pandas

jeffsia 20 sept '16 at 9:12


2 answers

Alternatively, you can extract the numbers from test_type, sort them, and then reindex the DataFrame in the order of the sorted index.

 df.reindex(
     df['test_type'].str.extract(r'(\d+)', expand=False)
     .astype(int).sort_values().index
 ).reset_index(drop=True)
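
For reference, here is a self-contained sketch of this approach on the four test_type values from the question. The other columns are left out for brevity, and the names numbers and sorted_df are only illustrative.

```python
import pandas as pd

# Only the test_type column from the question; the other columns are omitted.
df = pd.DataFrame({'test_type': ['sso_1000', 'sso_1500', 'sso_2000', 'sso_500']})

# Extract the first run of digits from each value and convert it to int.
# With expand=False and a single capture group, the result is a Series.
numbers = df['test_type'].str.extract(r'(\d+)', expand=False).astype(int)

# Sorting the numbers yields the original row labels in numeric order.
# reindex reorders the rows by those labels; reset_index renumbers them.
sorted_df = df.reindex(numbers.sort_values().index).reset_index(drop=True)

print(sorted_df)
```

Here reset_index(drop=True) discards the original row labels. Leave it out if the original labels should be kept.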


Nickil maveli 20 sept '16 at 9:37


Ami tavory · Accepted Answer · 2016-09-20T09:20:16+0000

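Presumably, what you are trying to do is sort by the numerical value after sso_. The test_type column holds strings, so sort_values compares them character by character, and 'sso_1000' comes before 'sso_500' because '1' sorts before '5'. You can sort by the number instead as follows: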

 import numpy as np

 # .iloc selects rows by position (the older .ix indexer was removed in pandas 1.0)
 df.iloc[np.argsort(df.test_type.str.split('_').str[-1].astype(int).values)]

This:

splits each string on _
converts the part after that character to a numeric value
finds the positions that sort those numeric values
reorders the DataFrame according to these positions
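
The same steps can be written out one at a time to make the intermediate values visible. This is only a sketch on the test_type values from the question; the variable names parts, numbers, order and result are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'test_type': ['sso_1000', 'sso_1500', 'sso_2000', 'sso_500']})

# 1. Split each string on '_', giving lists such as ['sso', '1000'].
parts = df['test_type'].str.split('_')

# 2. Take the last piece of each list and convert it to an integer.
numbers = parts.str[-1].astype(int)

# 3. Compute the positions that would sort the numeric values.
order = np.argsort(numbers.values)

# 4. Reorder the rows by those positions.
result = df.iloc[order]

print(order)
print(result)
```

The rows keep their original index labels, so the sso_500 row still carries label 3 after the reordering.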

Example

 In [15]: df = pd.DataFrame({'test_type': ['sso_1000', 'sso_500']})

 In [16]: df.sort_values(by=['test_type'], ascending=True)
 Out[16]:
   test_type
 0  sso_1000
 1   sso_500

 In [17]: df.iloc[np.argsort(df.test_type.str.split('_').str[-1].astype(int).values)]
 Out[17]:
   test_type
 1   sso_500
 0  sso_1000
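
On pandas 1.1 or later, sort_values also accepts a key argument, which gives the same numeric ordering without a separate argsort step. This is a minimal sketch, assuming that pandas version is available and that every value has the form prefix_number. The helper name numeric_suffix is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'test_type': ['sso_1000', 'sso_1500', 'sso_2000', 'sso_500']})

def numeric_suffix(column):
    # Hypothetical helper: receives the whole column as a Series and
    # returns the integer after the last '_' for each value.
    return column.str.split('_').str[-1].astype(int)

# key is applied to the column before sorting; the rows themselves are unchanged.
by_number = df.sort_values(by='test_type', key=numeric_suffix)

print(by_number)
```

The key function must be vectorized: it takes a Series and returns a Series of the same length, unlike the per-element key used by Python's built-in sorted.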
