You can first find the string '?' in the columns, build a boolean mask from it, and then filter the rows with boolean indexing. If you need to convert the columns to float, use astype:
print(~((df['X'] == '?') | (df['Y'] == '?') | (df['Z'] == '?')))
0    False
1     True
2    False
3     True
4    False
dtype: bool

df1 = df[~((df['X'] == '?') | (df['Y'] == '?') | (df['Z'] == '?'))].astype(float)
print(df1)
   X  Y  Z
1  1  2  3
3  4  4  4

print(df1.dtypes)
X    float64
Y    float64
Z    float64
dtype: object
Or you can try:
df['X'] = pd.to_numeric(df['X'], errors='coerce')
df['Y'] = pd.to_numeric(df['Y'], errors='coerce')
df['Z'] = pd.to_numeric(df['Z'], errors='coerce')
print(df)
    X   Y   Z
0   0   1 NaN
1   1   2   3
2 NaN NaN   4
3   4   4   4
4 NaN   2   5

print((df['X'].notnull()) & (df['Y'].notnull()) & (df['Z'].notnull()))
0    False
1     True
2    False
3     True
4    False
dtype: bool

print(df[(df['X'].notnull()) & (df['Y'].notnull()) & (df['Z'].notnull())].astype(float))
   X  Y  Z
1  1  2  3
3  4  4  4
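The same coerce-then-filter idea can probably be written more compactly by applying to_numeric to every column at once and dropping the rows that end up with NaN. This is only a sketch: the sample DataFrame is an assumption reconstructed from the output shown above (values stored as strings, with '?' marking missing entries), and the names numeric and clean are illustrative.

```python
import pandas as pd

# Assumed sample data, reconstructed from the printed output above;
# '?' marks a missing value and everything is stored as a string.
df = pd.DataFrame({
    'X': ['0', '1', '?', '4', '?'],
    'Y': ['1', '2', '?', '4', '2'],
    'Z': ['?', '3', '4', '4', '5'],
})

# Convert every column at once; anything non-numeric ('?') becomes NaN.
numeric = df.apply(pd.to_numeric, errors='coerce')

# Keep only the rows that have no NaN in any column.
clean = numeric.dropna()
print(clean)
```

Note that dropna would also remove rows that were already NaN before the conversion, so this only matches the '?' filter if '?' is the only kind of bad value in the data.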
Better still, compare the whole DataFrame at once:
df = df[(df != '?').all(axis=1)]
Or:
df = df[~(df == '?').any(axis=1)]
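For reference, here is a small self-contained sketch that checks the two one-liners select the same rows. The sample DataFrame is an assumption (string values with '?' as the placeholder, reconstructed from the outputs in this answer), and keep_all, keep_any and df1 are just illustrative names.

```python
import pandas as pd

# Assumed sample data; '?' marks a missing value.
df = pd.DataFrame({
    'X': ['0', '1', '?', '4', '?'],
    'Y': ['1', '2', '?', '4', '2'],
    'Z': ['?', '3', '4', '4', '5'],
})

# Row is kept if every cell differs from '?'.
keep_all = (df != '?').all(axis=1)

# Equivalent: row is kept if no cell equals '?'.
keep_any = ~(df == '?').any(axis=1)

# Both masks should agree row by row.
same_mask = keep_all.equals(keep_any)

# Filter, then convert the remaining string values to float.
df1 = df[keep_all].astype(float)
print(df1)
```

Both forms avoid listing the columns by name, which should help when the DataFrame has many columns.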