Pandas column width truncation

I am reading large CSV files in pandas, some of them with string columns that are thousands of characters long. Is there a quick way to limit the width of a column, i.e. keep only the first 100 characters?

1 answer

If you can read the whole file into memory, you can use the str accessor for vectorized string operations:

>>> import pandas as pd
>>> df = pd.read_csv("toolong.csv")
>>> df
   a                       b  c
0  1  1256378916212378918293  2

[1 rows x 3 columns]
>>> df["b"] = df["b"].str[:10]
>>> df
   a           b  c
0  1  1256378916  2

[1 rows x 3 columns]
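
To apply the same slicing to the 100-character limit from the question, something along these lines might work. This is only a sketch: the helper name truncate_columns and the in-memory sample data are illustrative assumptions, not part of the original answer.

```python
import io

import pandas as pd


def truncate_columns(df, columns, width=100):
    """Return a copy of df with the named string columns cut to `width` characters.

    Hypothetical helper for illustration; the caller names the columns explicitly.
    """
    out = df.copy()
    for col in columns:
        # .str[:width] slices every value in the column in one vectorized call
        out[col] = out[col].str[:width]
    return out


# Sample data standing in for a real file: one 500-character value, one short one
csv_text = "a,b,c\n1," + "x" * 500 + ",2\n2,short,3\n"
df = pd.read_csv(io.StringIO(csv_text))

short = truncate_columns(df, ["b"], width=100)
print(short["b"].str.len().tolist())
```

Values that are already shorter than the limit are left unchanged, and the original frame is not modified because the helper works on a copy.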

Also note that you can get a Series of the string lengths using

>>> df["b"].str.len()
0    10
Name: b, dtype: int64
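
The lengths can also be used to find out which rows actually exceed the limit before truncating anything. A minimal sketch, assuming a small hand-built frame in place of the real file and a limit of 10 characters:

```python
import pandas as pd

# Hand-built sample; the None stands in for a missing value in the CSV
df = pd.DataFrame({"b": ["1256378916212378918293", "12345", None]})

lengths = df["b"].str.len()   # missing values give NaN rather than a length
too_long = lengths > 10       # NaN compares as False, so missing rows are skipped
longest = lengths.max()       # longest string in the column, ignoring NaN

# Truncate only the rows that need it
df.loc[too_long, "b"] = df.loc[too_long, "b"].str[:10]
```

Checking `longest` first can tell you whether truncation is needed at all for a given column.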

I was wondering whether truncating at read time with a converter would be better:

>>> pd.read_csv("toolong.csv", converters={"b": lambda x: x[:5]})
   a      b  c
0  1  12563  2

[1 rows x 3 columns]

but I really don’t know whether the converters are called row by row or after the fact on the whole column.
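
One way to check how a converter is invoked on your own installation is to record every call it receives. The sketch below does that with in-memory sample data; the function name first_five and the calls list are illustrative assumptions. As far as I know, pandas passes each cell of the column to the converter as a string, one value at a time, but it is worth confirming on the version you use.

```python
import io

import pandas as pd

csv_text = "a,b,c\n1,1256378916212378918293,2\n3,abcdefghijklmnop,4\n"

calls = []  # every argument the converter is called with


def first_five(value):
    """Hypothetical converter: log the raw cell, then keep its first 5 characters."""
    calls.append(value)
    return value[:5]


df = pd.read_csv(io.StringIO(csv_text), converters={"b": first_five})

print(len(calls))        # number of converter calls
print(df["b"].tolist())  # truncated values
```

If the number of calls matches the number of rows, the converter ran once per cell, which means a Python function call for every value in the column. Whether that is faster or slower than slicing with str after loading is something to measure on your own data.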


Source: https://habr.com/ru/post/1016138/

