You can use the .str accessor on string columns to access certain string methods. The accessor also supports slicing, so you can chop each value to its first five characters:
frame[zipcol] = frame[zipcol].str[:5]
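As a minimal, self-contained sketch of the line above: the frame contents and the column name "zip" are made up for illustration and are not from the original question. It assumes the column holds strings (possibly with missing values).

```python
import pandas as pd

# Hypothetical data: the column name and the values are assumptions for illustration.
frame = pd.DataFrame({"zip": ["12345-6789", "98765", "1234", None]})
zipcol = "zip"

# Slice every string in the column to at most its first five characters.
# Values shorter than five characters are left unchanged; missing values stay missing.
frame[zipcol] = frame[zipcol].str[:5]

print(frame)
```

Note that this only works on string data. If the zip codes are stored as integers, they would need to be converted first, for example with .astype(str).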
Based on a small example, slicing with .str is about 50 times faster than a line-by-line loop:
In [29]: s = pd.Series(['testtest']*10000)
In [30]: %timeit s.str[:5]
100 loops, best of 3: 3.06 ms per loop
In [31]: %timeit str_loop(s)
10 loops, best of 3: 164 ms per loop
where str_loop is defined as:
In [27]: def str_loop(s):
   .....:     for i in range(len(s)):
   .....:         s[i] = s[i][:5]
   .....:
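If you want to reproduce the comparison outside IPython, something like the following sketch should work. It is not the exact benchmark from above: this variant of str_loop works on a copy and uses .iloc so the input Series is not modified between runs, and the repeat count of 3 is an arbitrary choice. Timings will vary by machine and pandas version.

```python
import timeit

import pandas as pd


def str_loop(s):
    """Slice each element in a Python loop (variant that works on a copy)."""
    out = s.copy()
    for i in range(len(out)):
        # Positional access via .iloc, one element at a time.
        out.iloc[i] = out.iloc[i][:5]
    return out


s = pd.Series(["testtest"] * 10000)

# Time each approach over a few repetitions (the count is an arbitrary choice).
t_vectorized = timeit.timeit(lambda: s.str[:5], number=3)
t_loop = timeit.timeit(lambda: str_loop(s), number=3)

print(f"vectorized .str slice: {t_vectorized:.4f} s")
print(f"element-by-element loop: {t_loop:.4f} s")
```

Both approaches produce the same values; only the speed differs.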
joris source