I have a pandas.DataFrame:
import numpy as np
import pandas as pd
df = pd.DataFrame({"x": ["hello there you can go home now", "why should she care", "please sort me appropriately"],
                   "y": [np.nan, "finally we were able to go home", "but what about meeeeeeeeeee"],
                   "z": ["", "alright we are going home now", "ok fine shut up already"]})
cols = ["x", "y", "z"]
I want to iteratively concatenate these columns, rather than writing something like:
df["concat"] = df["x"].str.cat(df["y"], sep = " ").str.cat(df["z"], sep = " ")
I know that three columns are trivial to join by hand, but I actually have 30. So I would like to do something like:
df["concat"] = df[cols[0]]
for i in range(1, len(cols)):
    df["concat"] = df["concat"].str.cat(df[cols[i]], sep = " ")
Right now, the initial line df["concat"] = df[cols[0]] works fine, but the NaN value at df.loc[0, "y"] ruins the concatenation: the entire first row ends up as NaN in df["concat"] because of this one null value. How can I get around this? Is there an option of pd.Series.str.cat that I need to specify?
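For reference, one possible way around this is the na_rep parameter of pd.Series.str.cat, which substitutes a replacement string for missing values instead of propagating NaN. The snippet below is only a sketch: it assumes that an empty string is an acceptable stand-in for NaN, and it passes the remaining columns as a DataFrame so that no loop is needed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": ["hello there you can go home now", "why should she care", "please sort me appropriately"],
                   "y": [np.nan, "finally we were able to go home", "but what about meeeeeeeeeee"],
                   "z": ["", "alright we are going home now", "ok fine shut up already"]})
cols = ["x", "y", "z"]

# Concatenate the first column with all remaining columns in one call.
# na_rep="" (an assumption: empty string is acceptable) replaces NaN
# so that a single missing value no longer turns the whole row into NaN.
df["concat"] = df[cols[0]].str.cat(df[cols[1:]], sep=" ", na_rep="")
```

Note that rows with missing or empty values still receive the separator, so they may contain extra spaces; a .str.strip() afterwards removes leading and trailing ones if that matters.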