Split the string in pandas and append it to the old data

Question

What I am doing seems simple, but I cannot figure it out.

I have a dataframe with data like

City    State ZIP
Ames    IA    50011-3617
Ankeny  IA    50021

I want to split the ZIP codes on '-' and keep only the first part, in a new dataframe that has the old data but only the shortened ZIP code. I tried the following.

data_short_zip = data
df = data['ZIP'].str.split('-').str[0]
data_short_zip.join(df)

Not only does this raise an error, it also seems needlessly convoluted. Is there an easy way to do this?

The output should look like

City    State ZIP
Ames    IA    50011
Ankeny  IA    50021

+4

python pandas

Jstuff Jul 19 '16 at 14:17

2 answers

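There are several ways to get the first 5 characters of data.ZIP. Each of the four expressions further down returns: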

0    50011
1    50021
Name: ZIP, dtype: object

data.ZIP.str.extract(r'^(\d{5})', expand=False)
data.ZIP.str[:5]
data.ZIP.str.split('-').str[0]
data.ZIP.str.split('-').str.get(0)
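
A minimal sketch that checks the four expressions above agree with each other. It assumes the two-row frame from the question is rebuilt by hand and that ZIP is stored as strings; the dictionary keys are illustrative labels only.

```python
import pandas as pd

# Sample frame rebuilt from the question (assumption: ZIP holds strings).
data = pd.DataFrame({
    "City": ["Ames", "Ankeny"],
    "State": ["IA", "IA"],
    "ZIP": ["50011-3617", "50021"],
})

# The four expressions listed above, keyed by an illustrative label.
candidates = {
    "extract": data.ZIP.str.extract(r"^(\d{5})", expand=False),
    "slice": data.ZIP.str[:5],
    "split_index": data.ZIP.str.split("-").str[0],
    "split_get": data.ZIP.str.split("-").str.get(0),
}

expected = ["50011", "50021"]
for name, result in candidates.items():
    # Each candidate is a Series holding the 5-digit prefix of every ZIP.
    assert result.tolist() == expected, name
    print(name, result.tolist())
```

Note that the slice assumes every ZIP starts with exactly five digits, while the split variants only assume that '-' separates the parts.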

The simplest of these is data.ZIP.str[:5].

To overwrite data.ZIP with the result:

data.ZIP = data.ZIP.str[:5]

Timing

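The methods were timed on the 2-row sample and on larger frames built from it.

2-row sample repeated 10000 times (20,000 rows):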

data = pd.concat([data for _ in range(10000)])

2-row sample repeated 100 times (200 rows):

data = pd.concat([data for _ in range(100)])
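
A rough sketch of how such a comparison could be reproduced with the standard timeit module. The repetition factor, the number of calls and the labels are assumptions chosen for illustration, and no particular ranking is claimed here; results depend on the pandas version and the machine.

```python
import timeit

import pandas as pd

# Sample frame rebuilt from the question (assumption: ZIP holds strings).
data = pd.DataFrame({
    "City": ["Ames", "Ankeny"],
    "State": ["IA", "IA"],
    "ZIP": ["50011-3617", "50021"],
})

# Repeat the 2-row sample 100 times (200 rows); raise the factor for larger runs.
big = pd.concat([data for _ in range(100)], ignore_index=True)

# The four candidate expressions, wrapped so each takes the ZIP Series.
methods = {
    "extract": lambda s: s.str.extract(r"^(\d{5})", expand=False),
    "slice": lambda s: s.str[:5],
    "split_index": lambda s: s.str.split("-").str[0],
    "split_get": lambda s: s.str.split("-").str.get(0),
}

# Best of 3 runs, each run making 20 calls of the expression.
timings = {
    name: min(timeit.repeat(lambda f=func: f(big.ZIP), number=20, repeat=3))
    for name, func in methods.items()
}

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:12s} {seconds:.4f} s for 20 calls")
```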

+2

piRSquared Jul 19 '16 at 14:46

Edchum · Accepted Answer · 2016-07-19T14:21:08+0000

You can use str.split to split on your separator and then apply str[0] to the result to return the first piece:

In [122]:
df['ZIP'] = df['ZIP'].str.split('-').str[0]
df

Out[122]:
     City State    ZIP
0    Ames    IA  50011
1  Ankeny    IA  50021
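
If the original frame should stay untouched, as the question asks, one option is DataFrame.assign, which returns a new frame. The sketch below assumes the question's names data and data_short_zip. For context, data_short_zip = data in the question only binds a second name to the same object, and join fails there because both sides have a column named ZIP and no suffix is given.

```python
import pandas as pd

# Sample frame rebuilt from the question (assumption: ZIP holds strings).
data = pd.DataFrame({
    "City": ["Ames", "Ankeny"],
    "State": ["IA", "IA"],
    "ZIP": ["50011-3617", "50021"],
})

# assign returns a new DataFrame; `data` itself is left unchanged.
data_short_zip = data.assign(ZIP=data["ZIP"].str.split("-").str[0])

print(data_short_zip)
```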
