Split the string in pandas and append it to the old data

What I am doing seems simple, but I cannot figure it out.

I have a dataframe with data like

City    State ZIP
Ames    IA    50011-3617
Ankeny  IA    50021

I want to split the ZIP codes on '-' and keep only the first part, in a new dataframe that has the old data but only the shortened ZIP code. I tried the following.

data_short_zip = data
df = data['ZIP'].str.split('-').str[0]
data_short_zip.join(df)

This not only raises an error, but also looks clumsy. Is there an easy way to do this?

The desired output looks like

City    State ZIP
Ames    IA    50011
Ankeny  IA    50021
2 answers

You can use str.split to split on your delimiter and then apply str[0] to the result to return the first piece:

In [122]:
df['ZIP'] = df['ZIP'].str.split('-').str[0]
df

Out[122]:
     City State    ZIP
0    Ames    IA  50011
1  Ankeny    IA  50021
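If the original frame should stay untouched, as the question asks, one option is DataFrame.assign, which returns a new frame instead of overwriting the column. This is only a minimal sketch: the sample frame is rebuilt by hand from the question's data, and data_short_zip is the name used in the question.

```python
import pandas as pd

# Sample data reconstructed from the question.
data = pd.DataFrame({
    "City": ["Ames", "Ankeny"],
    "State": ["IA", "IA"],
    "ZIP": ["50011-3617", "50021"],
})

# assign() returns a new DataFrame, so `data` keeps the full ZIP codes.
data_short_zip = data.assign(ZIP=data["ZIP"].str.split("-").str[0])

print(data_short_zip)
print(data)
```

Note that data_short_zip = data in the question does not copy anything; it only binds a second name to the same DataFrame.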

You want the first 5 characters of data.ZIP. Any of the expressions listed below returns them:

0    50011
1    50021
Name: ZIP, dtype: object

data.ZIP.str.extract(r'^(\d{5})', expand=False)
data.ZIP.str[:5]
data.ZIP.str.split('-').str[0]
data.ZIP.str.split('-').str.get(0)

The simplest of these is data.ZIP.str[:5] ;-)

To update the column, assign the result back to data.ZIP:

data.ZIP = data.ZIP.str[:5]
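A quick way to check that the four expressions agree on the sample data is sketched below. The Series is rebuilt by hand from the question, and the candidates dictionary and its labels are illustrative names, not part of the original answer.

```python
import pandas as pd

# ZIP column reconstructed from the question's sample data.
zips = pd.Series(["50011-3617", "50021"], name="ZIP")

# The four expressions from the answer, keyed by an illustrative label.
candidates = {
    "extract": zips.str.extract(r"^(\d{5})", expand=False),
    "slice": zips.str[:5],
    "split[0]": zips.str.split("-").str[0],
    "split.get(0)": zips.str.split("-").str.get(0),
}

for label, result in candidates.items():
    # Compare plain Python lists to sidestep dtype differences.
    print(label, result.tolist() == ["50011", "50021"])
```

The expressions only agree on well-formed input. For a value such as "5001", for example, the slice and split variants return "5001", while the extract variant returns a missing value because the pattern requires five digits.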


Timing

With the 2-row sample data:

[image: timing results]

With the 2-row sample repeated 10000 times (20 thousand rows):

data = pd.concat([data for _ in range(10000)])

[image: timing results]

Repeated a further 100 times (2 million rows):

data = pd.concat([data for _ in range(100)])

[image: timing results]
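To reproduce the comparison locally, a rough sketch along these lines may help. It uses the standard-library timeit module, which is an assumption, since the answer does not say which tool produced its timings. The names base, candidates and timings are illustrative. Absolute numbers depend on the machine and the pandas version, so none are quoted here.

```python
import timeit

import pandas as pd

base = pd.DataFrame({"ZIP": ["50011-3617", "50021"]})
# Repeat the 2-row sample 10000 times (20,000 rows), as in the answer.
data = pd.concat([base for _ in range(10000)], ignore_index=True)

# The four expressions from the answer, wrapped so timeit can call them.
candidates = {
    "extract": lambda: data.ZIP.str.extract(r"^(\d{5})", expand=False),
    "slice": lambda: data.ZIP.str[:5],
    "split[0]": lambda: data.ZIP.str.split("-").str[0],
    "split.get(0)": lambda: data.ZIP.str.split("-").str.get(0),
}

timings = {}
for label, func in candidates.items():
    # Best of 3 repeats, 5 calls each, stored as seconds per call.
    timings[label] = min(timeit.repeat(func, number=5, repeat=3)) / 5

# Print the fastest variant first.
for label, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{label:>12}: {seconds * 1e3:.2f} ms per call")
```

Changing range(10000) to a smaller or larger value shows how the ranking shifts with the number of rows.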

Source: https://habr.com/ru/post/1648389/