The error you are seeing might be caused by the value(s) in the x column being strings:
    In [15]: df = pd.DataFrame({'x':['1.0692e+06']})
    In [16]: df['x'].astype('int')
    ValueError: invalid literal for long() with base 10: '1.0692e+06'
Ideally, the problem can be avoided by making sure that the values stored in the DataFrame are already numeric, not strings, when the DataFrame is built. How to do that depends, of course, on how you are building the DataFrame.
After the fact, the DataFrame could be fixed using applymap:
    import ast
    df = df.applymap(ast.literal_eval).astype('int')
but calling ast.literal_eval on each value in the DataFrame can be slow, which is why fixing the problem at the source is the best option.
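If the strings cannot be fixed at the source, a vectorized conversion may be cheaper than evaluating each value in Python. The following is a minimal sketch, not part of the original approach above: it swaps in pd.to_numeric for ast.literal_eval and assumes every value in the column is a valid numeric string. The sample data (including the extra '42' row) is made up for illustration.

```python
import pandas as pd

# Illustrative data: numeric values stored as strings (assumed for this sketch).
df = pd.DataFrame({'x': ['1.0692e+06', '42']})

# pd.to_numeric parses the whole column in one pass; the exponent
# notation means the intermediate result is a float column.
as_float = pd.to_numeric(df['x'])

# Casting float -> int64 truncates any fractional part, so only do this
# when the values are known to be whole numbers.
df['x'] = as_float.astype('int64')

print(df['x'].tolist())
```

Note that pd.to_numeric raises on a string it cannot parse at all, so this only helps when the values are numeric but stored in a format that int() rejects.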
Usually you could drop into a debugger when an exception is raised to inspect the problematic value.
However, in this case, the exception occurs inside the call to astype, which is a thin wrapper around C-compiled code. The compiled code does the looping over the values in df['x'], so the Python debugger is not helpful here: it will not let you inspect which value raised the exception from within the C-compiled code.
Many important parts of Pandas and NumPy are written in C, C++, Cython or Fortran, and the Python debugger will not take you inside those non-Python pieces of code, where the fast loops are handled.
So instead, I would fall back to a low-brow solution: iterate over the values in a Python loop and use try...except to catch the offending value(s):
    import pandas as pd

    df = pd.DataFrame({'x':['1.0692e+06']})
    for i, item in enumerate(df['x']):
        try:
            int(item)
        except ValueError:
            # Report the position and the raw value that int() rejected
            print('ERROR at index {}: {!r}'.format(i, item))

which prints

    ERROR at index 0: '1.0692e+06'
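The same loop can be wrapped in a small reusable helper that collects every failing value instead of printing it. This is only a sketch: find_unconvertible is a hypothetical name, not a pandas function, and the three-row sample column is invented to show one good value and two bad ones.

```python
import pandas as pd

def find_unconvertible(series, converter=int):
    """Return (index label, value) pairs for which converter(value) fails.

    Hypothetical helper for illustration; not part of pandas.
    """
    bad = []
    for label, item in series.items():
        try:
            converter(item)
        except (ValueError, TypeError):
            # Keep the index label so the row can be looked up with .loc
            bad.append((label, item))
    return bad

# Made-up sample: '7' converts with int(), the other two do not.
df = pd.DataFrame({'x': ['7', '1.0692e+06', 'abc']})
problems = find_unconvertible(df['x'])
print(problems)
```

Because it reports index labels rather than positions, the result can be passed straight to df.loc to inspect the full offending rows. It is still a pure Python loop, so it is best used as a diagnostic rather than as part of a hot path.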