Get substring from pandas dataframe during filtering

Let's say I have a DataFrame with the following information:

    Name     Points  String
    John     24      FTS8500001A
    Richard  35      FTS6700001B
    John     29      FTS2500001A
    Richard  35      FTS3800001B
    John     34      FTS4500001A

Here's how to get a DataFrame with the sample above:

    import pandas as pd

    keys = ('Name', 'Points', 'String')
    names = pd.Series(('John', 'Richard', 'John', 'Richard', 'John'))
    ages = pd.Series((24, 35, 29, 35, 34))
    strings = pd.Series(('FTS8500001A', 'FTS6700001B', 'FTS2500001A', 'FTS3800001B', 'FTS4500001A'))
    df = pd.concat((names, ages, strings), axis=1, keys=keys)

I want to select every row that matches both of the following criteria: Name is Richard and Points is 35. For those rows, I want to read the 4th and 5th characters of the String column (the two digits immediately after FTS).

The expected output is the numbers 67 and 38.

I've tried several ways to achieve this, but none of them worked. Could you help me?

Thank you very much.
Eduardo

2 answers

Use a boolean mask to filter your df, then use the .str accessor to slice the string:

    In [77]: df.loc[(df['Name'] == 'Richard') & (df['Points'] == 35), 'String'].str[3:5]
    Out[77]:
    1    67
    3    38
    Name: String, dtype: object
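
As a self-contained sketch of the same approach, the snippet below rebuilds the sample data from a dict (a shortcut that is not in the question) and stores the mask in a variable; the names `mask` and `codes` are just illustrative:

```python
import pandas as pd

# Same sample data as in the question, built from a dict for brevity.
df = pd.DataFrame({
    'Name': ['John', 'Richard', 'John', 'Richard', 'John'],
    'Points': [24, 35, 29, 35, 34],
    'String': ['FTS8500001A', 'FTS6700001B', 'FTS2500001A',
               'FTS3800001B', 'FTS4500001A'],
})

# Each comparison gets its own parentheses because & binds tighter than ==.
mask = (df['Name'] == 'Richard') & (df['Points'] == 35)

# .str[3:5] slices every selected string at zero-based positions 3 and 4,
# i.e. the 4th and 5th characters.
codes = df.loc[mask, 'String'].str[3:5]

print(codes.tolist())  # ['67', '38']
```

Note that the result still holds strings, not integers, and keeps the original row labels 1 and 3.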

Pandas string methods

You can build a mask from your criteria and then use the pandas string methods:

    mask_richard = df.Name == 'Richard'
    mask_points = df.Points == 35
    df[mask_richard & mask_points].String.str[3:5]

    1    67
    3    38
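
If the position of the digits is less reliable than the prefix, one possible variation (not part of the answer above) is to capture them with a regular expression via `Series.str.extract` and convert the result to integers. The pattern below assumes every matching string starts with the literal prefix FTS followed by two digits:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Richard', 'John', 'Richard', 'John'],
    'Points': [24, 35, 29, 35, 34],
    'String': ['FTS8500001A', 'FTS6700001B', 'FTS2500001A',
               'FTS3800001B', 'FTS4500001A'],
})

mask_richard = df['Name'] == 'Richard'
mask_points = df['Points'] == 35

# Capture the two digits directly after the assumed "FTS" prefix.
# expand=False returns a Series instead of a one-column DataFrame.
digits = df.loc[mask_richard & mask_points, 'String'].str.extract(
    r'^FTS(\d{2})', expand=False
)

# The captured values are strings; cast them to get actual numbers.
numbers = digits.astype(int)

print(numbers.tolist())  # [67, 38]
```

Rows whose string does not match the pattern would yield missing values, so the integer cast is only safe when every selected row matches.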

Source: https://habr.com/ru/post/988924/

