Count the actual integers in a column

How do I count the elements in a column that are integers?

Here is what I came up with:

import re
import pandas as pd
pd.Series(["a","2","z","123","a","oops"]).apply(lambda x: x and re.match(r"^\d+$",x) and 1).sum()
==> 2.0

and

def isint(x):
    try:
        int(x)
        return 1
    except ValueError:
        return 0

pd.Series(["a","2","z","123","a","oops"]).apply(isint).sum()
==> 2

Obviously, the second approach is better (it returns an int and generalizes easily to other types such as dates and floats), but I wonder if there is an even better way that would not require me to write my own function.

3 answers

The Series .str attribute offers vectorized string methods:

>>> ser = pd.Series(["a","2","z","123","a","oops"])
>>> ser.str.isdigit().sum()
2
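
Note that str.isdigit() rejects signed values such as "-5" and leaves missing entries as missing. A minimal sketch, assuming signed integers should count and missing values should not, uses a regular expression with Series.str.fullmatch (available in pandas 1.1 and later). The sample data is made up for illustration:

```python
import pandas as pd

# Hypothetical sample data: a signed integer, a float string, and a missing value.
ser = pd.Series(["a", "2", "-5", "1.5", None, "123"])

# fullmatch requires the whole string to match the pattern;
# na=False treats missing values as non-matches.
is_int = ser.str.fullmatch(r"[+-]?\d+", na=False)

count = int(is_int.sum())
print(count)
```

Only "2", "-5" and "123" match the pattern here; "1.5" and the missing value do not.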

I would use the pd.to_numeric() function:

In [62]: pd.to_numeric(s, errors='coerce')
Out[62]:
0      NaN
1      2.0
2      NaN
3    123.0
4      NaN
5      NaN
dtype: float64

In [63]: pd.to_numeric(s, errors='coerce').count()
Out[63]: 2
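
One caveat: pd.to_numeric also parses non-integer strings such as "1.5", so count() counts numbers rather than integers. A sketch that keeps only whole-number values, under the assumption that a string like "2.0" should also count as an integer (the sample data is made up for illustration):

```python
import pandas as pd

# Hypothetical sample data with one non-integer numeric string.
s = pd.Series(["a", "2", "z", "123", "1.5", "oops"])

# Entries that cannot be parsed become NaN.
nums = pd.to_numeric(s, errors="coerce")

# NaN % 1 is NaN, and NaN == 0 is False, so unparseable entries drop out here.
whole = nums % 1 == 0

n_numeric = int(nums.count())  # counts every parseable number, including 1.5
n_integer = int(whole.sum())   # counts only whole-number values
print(n_numeric, n_integer)
```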

You can do something like this:

isint = lambda x: all([ord(i) >= 48 and ord(i) < 58 for i in str(x)])
pd.Series(["a","2","z","123","a","oops"]).apply(isint).sum()
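
Be aware that all() of an empty sequence is True, so the lambda above treats an empty string as an integer. A sketch of the same character check with a guard for that case, assuming empty strings should not count:

```python
import pandas as pd

def isint(x):
    # Convert to str first so non-string values are handled too.
    text = str(x)
    # Require at least one character, since all() of an empty sequence is True.
    return len(text) > 0 and all("0" <= ch <= "9" for ch in text)

# Hypothetical sample data containing an empty string.
ser = pd.Series(["a", "2", "", "123", "a", "oops"])
count = int(ser.apply(isint).sum())
print(count)
```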

Source: https://habr.com/ru/post/1665039/

