Extract int from string in pandas

Suppose I have a DataFrame df like

A B
1 V2
3 W42
1 S03
2 T02
3 U71

I want a new column (either appended to df or replacing column B, it doesn't matter) that contains only the integer extracted from column B. That is, I want column C to look like

C
2
42
3
2
71

So if there is a 0 in front of the number, for example 03, I want to get 3, not 03.

How can I do this?

+9
4 answers

You can use the .str accessor and extract the integer with a regular expression:

df['C'] = df['B'].str.extract(r'(\d+)', expand=False).astype(int)
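
For reference, a minimal self-contained sketch of this regex approach on the sample data from the question. The column name C and the use of expand=False (to get a Series back rather than a one-column DataFrame) are choices made for this illustration, and the sketch assumes every value in B contains at least one digit.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'A': [1, 3, 1, 2, 3],
                   'B': ['V2', 'W42', 'S03', 'T02', 'U71']})

# Capture the first run of digits in each string, then convert to int.
# The int conversion is what drops leading zeros ('03' becomes 3).
df['C'] = df['B'].str.extract(r'(\d+)', expand=False).astype(int)

print(df)
```

If some rows might have no digits at all, astype(int) would likely fail on the missing values, so those rows need separate handling.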
+37

Assuming there is always exactly one leading letter:

df['B'] = df['B'].str[1:].astype(int)
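
Since the slice depends on that one-letter assumption, here is a hedged sketch that checks it first and falls back to a regex otherwise. The variable name one_letter_prefix and the fallback itself are illustrative additions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 1, 2, 3],
                   'B': ['V2', 'W42', 'S03', 'T02', 'U71']})

# True only if every value is a single letter followed by digits
one_letter_prefix = df['B'].str.match(r'^[A-Za-z]\d+$').all()

if one_letter_prefix:
    # Safe to drop the first character and convert the rest
    df['C'] = df['B'].str[1:].astype(int)
else:
    # Fall back to pulling out the first run of digits
    df['C'] = df['B'].str.extract(r'(\d+)', expand=False).astype(int)
```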
+2

I wrote a short loop to do this, since my strings were in a list rather than in a DataFrame. You can also add an if statement to account for floats:

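output = ''
text = 'whatever.007'  # renamed from input to avoid shadowing the builtin

for letter in text:
    try:
        # keep the character only if it is a digit
        int(letter)
        output += letter
    except ValueError:
        pass

    # also keep the decimal point so the result can be parsed as a float
    if letter == '.':
        output += letter

output = float(output)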

Or you can call int(output) on the result if you want an integer.
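
A possible alternative for plain lists, sketched with the standard re module. The helper name extract_number is hypothetical. Note that it behaves differently from the loop: it takes only the first number in the string instead of concatenating every digit, and it returns None when there is nothing to parse.

```python
import re

def extract_number(text):
    """Return the first number found in text as a float, or None."""
    # Optional integer part, optional decimal point, then at least one digit
    match = re.search(r'\d*\.?\d+', text)
    return float(match.group()) if match else None

values = ['whatever.007', 'S03', 'no digits']
numbers = [extract_number(v) for v in values]
print(numbers)
```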

0

Prepare a DataFrame like yours:

import re
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 1, 2, 3], 'B': ['V2', 'W42', 'S03', 'T02', 'U71']})

df.head()

Now manipulate it to get the desired result. Wrapping the match in int() drops the leading zeros:

df['C'] = df['B'].apply(lambda x: int(re.search(r'\d+', x).group()))

df.head()


   A    B   C
0  1   V2   2
1  3  W42  42
2  1  S03   3
3  2  T02   2
4  3  U71  71
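
One caveat with this pattern: re.search returns None when a string has no digits, so calling .group() on it raises an AttributeError. A rough sketch of a guard follows, using a hypothetical helper first_int and pandas' nullable Int64 dtype so that missing values stay in an integer column. The sample value 'none' is made up for illustration.

```python
import re
import pandas as pd

def first_int(value):
    """Return the first run of digits in value as an int, or None."""
    match = re.search(r'\d+', str(value))
    return int(match.group()) if match else None

# 'none' is an invented row with no digits, to show the missing case
df = pd.DataFrame({'B': ['V2', 'none', 'S03']})

# Int64 (capital I) is the nullable integer dtype, so a missing value
# does not force the whole column to float
df['C'] = df['B'].apply(first_int).astype('Int64')

print(df)
```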
0

Source: https://habr.com/ru/post/1628605/