Python - Pandas: Selecting Rows That Contain Strings

In my dataset, several rows contain non-numeric characters. I only need the rows in which every value is an integer. What is the best way to do this? The dataset is below: for example, I want to delete rows 2 and 3, since row 2 contains 051A and 04A, and row 3 contains 08B.

1   2017    0   321     3   20  42  18
2   051A    0   321     3   5   69  04A
3   460     0   1633    16  38  17  08B
4   1811    0   822     8   13  65  18
6 answers

Not sure whether apply can be avoided here.

df.apply(lambda x: pd.to_numeric(x, errors = 'coerce')).dropna()

   0       1  2    3  4   5   6     7
0  1  2017.0  0  321  3  20  42  18.0
3  4  1811.0  0  822  8  13  65  18.0
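A self-contained sketch of the same idea may help, since the output above shows two columns upcast to float. The frame below rebuilds the question's sample as strings, which is an assumption about how the data was loaded; the column labels 0..7 are likewise assumed.

```python
import pandas as pd

# Sample data from the question, assumed to be read in as strings.
df = pd.DataFrame(
    [
        ["1", "2017", "0", "321", "3", "20", "42", "18"],
        ["2", "051A", "0", "321", "3", "5", "69", "04A"],
        ["3", "460", "0", "1633", "16", "38", "17", "08B"],
        ["4", "1811", "0", "822", "8", "13", "65", "18"],
    ]
)

# Cells that cannot be parsed as numbers become NaN; rows with any NaN are dropped.
numeric = df.apply(lambda col: pd.to_numeric(col, errors="coerce")).dropna()

# Columns that held NaN were upcast to float, so cast back once the NaNs are gone.
result = numeric.astype(int)
print(result)
```

One caveat: pd.to_numeric also accepts values such as "1.5", so this filters for numbers rather than strictly for integers, and the final astype(int) would truncate them.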

This is very similar to @jpp's solution, but differs in the technique used to check whether a value consists of digits.

df[df.applymap(lambda x: str(x).isdecimal()).all(1)].astype(int)

   0     1  2    3  4   5   6   7
0  1  2017  0  321  3  20  42  18
3  4  1811  0  822  8  13  65  18

Thanks to @jpp for suggesting isdecimal as opposed to isdigit.
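To illustrate why isdecimal is the stricter choice, here is a small sketch comparing the three built-in string predicates. The sample strings are made up for illustration.

```python
# Hypothetical sample strings; escapes are used for the non-ASCII characters.
samples = [
    "18",       # plain decimal digits
    "\u00b2",   # superscript two
    "\u00bd",   # vulgar fraction one half
    "04A",      # digits mixed with a letter
    "-5",       # negative integer written with a sign
]

# Map each string to (isdecimal, isdigit, isnumeric).
checks = {s: (s.isdecimal(), s.isdigit(), s.isnumeric()) for s in samples}

for s, flags in checks.items():
    print(repr(s), flags)
```

The superscript two passes isdigit but not isdecimal, and int() rejects it, so isdecimal matches what astype(int) can actually convert. Note that all three predicates reject "-5", so signed values need a different check.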


Using stack + unstack:

v = df.stack().astype(str)
v.where(v.str.isdecimal()).unstack().dropna().astype(int)

   0     1  2    3  4   5   6   7
0  1  2017  0  321  3  20  42  18
3  4  1811  0  822  8  13  65  18
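The two-line version above can be unpacked step by step. This is only a sketch on a hypothetical two-row frame, with intermediate names (masked, wide) chosen for illustration.

```python
import pandas as pd

# Hypothetical two-row frame: the second row holds one non-integer cell.
df = pd.DataFrame([["1", "2017"], ["2", "051A"]])

v = df.stack().astype(str)            # long Series indexed by (row, column)
masked = v.where(v.str.isdecimal())   # cells that are not decimal strings become NaN
wide = masked.unstack()               # back to the original row/column layout
result = wide.dropna().astype(int)    # drop rows with any NaN, then cast to int
print(result)
```

Stacking first means the string check runs once over a single Series instead of once per cell.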

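For this task, try / except is another option.

pd.DataFrame.applymap applies the check to every element of the DataFrame (in pandas 2.1 and later, applymap is deprecated in favor of pd.DataFrame.map).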

def CheckInt(s):
    try: 
        int(s)
        return True
    except ValueError:
        return False

res = df[df.applymap(CheckInt).all(axis=1)].astype(int)

#    0     1  2    3  4   5   6   7
# 0  1  2017  0  321  3  20  42  18
# 3  4  1811  0  822  8  13  65  18
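A short sketch of what the try / except check accepts and rejects, compared with the string-method approaches. The helper check_int is a hypothetical variant of CheckInt that also catches TypeError so that None does not raise; the sample values are made up.

```python
def check_int(s):
    """Return True if int() accepts the value (hypothetical variant of CheckInt)."""
    try:
        int(s)
        return True
    except (ValueError, TypeError):
        # ValueError: unparsable string; TypeError: unsupported type such as None.
        return False


# Hypothetical sample values covering a few edge cases.
values = ["18", "-5", " 7 ", "04A", "1.5", None]
results = {v: check_int(v) for v in values}

for v, ok in results.items():
    print(repr(v), ok)
```

Unlike isdecimal, this accepts signed and whitespace-padded strings such as "-5" and " 7 ". One thing to watch: a float value like 1.5 (not the string "1.5") would pass, because int(1.5) succeeds by truncating.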

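In one line, I think you can use the convert_objects function from pandas. It converts the object columns to numeric, which yields NaN for values that cannot be converted. Finally, we drop the rows containing NaN. Note that convert_objects is deprecated and was removed in pandas 1.0; pd.to_numeric is its replacement.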

df = df.convert_objects(convert_numeric=True).dropna()

More information can be found in the pandas documentation.


Say the last column in your DataFrame is named Col.

If the type of Col is not string:

df['Col'] = df['Col'].apply(str)

Then one line keeps only the rows whose value in Col is numeric:

df = df.loc[df['Col'].str.isnumeric()]
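Because this filter looks at a single column, a row with a bad value elsewhere would slip through. The sketch below shows the difference on a hypothetical two-column frame (column names A and Col are assumptions) and one possible way to extend the check to every column.

```python
import pandas as pd

# Hypothetical frame: row 1 is bad in both columns, row 2 is bad only in "A".
df = pd.DataFrame({"A": ["2017", "051A", "46B"], "Col": ["18", "04A", "18"]})

# Single-column filter, as in the answer above: row 2 survives.
single = df.loc[df["Col"].str.isnumeric()]

# Checking every column and requiring all of them to pass drops row 2 as well.
all_cols = df.loc[df.apply(lambda col: col.str.isnumeric()).all(axis=1)]

print(single)
print(all_cols)
```

In the question's sample the last column happens to be non-numeric in both unwanted rows, which is why the single-column filter is enough there.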

Source: https://habr.com/ru/post/1694705/

