Dumb it down, pandas: stop trying to be smart
I have a list (res) of single-column pandas data frames, each containing the same kind of numeric data, but each with a different column name. The row indices have no meaning. I want to put them into one very long single-column data frame.
When I do pd.concat(res), I get one column per input frame (and lots and lots of NaN cells). I tried various values for the parameters (*), but none of them do what I need.
Edit: Example data:
import pandas as pd

res = [
    pd.DataFrame({'A': [1, 2, 3]}),
    pd.DataFrame({'B': [9, 8, 7, 6, 5, 4]}),
    pd.DataFrame({'C': [100, 200, 300, 400]}),
]
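To be explicit about the goal, this is the output I am after, written out by hand (the names wanted and value are arbitrary, just for illustration):

# what I want: all 13 values from the three frames, stacked into one column
wanted = pd.DataFrame({'value': [1, 2, 3, 9, 8, 7, 6, 5, 4, 100, 200, 300, 400]})
print(wanted.shape)   # (13, 1)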
I have an ugly workaround: copy each data frame and give its column a common name:
newList = []
for r in res:
    r = r.copy()            # work on a copy so the originals keep their column names
    r.columns = ["same"]
    newList.append(r)
pd.concat(newList, ignore_index=True)
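For what it's worth, this does produce the shape I want (a quick check, continuing from the loop above with the example res):

out = pd.concat(newList, ignore_index=True)
print(out.shape)      # (13, 1): one column named "same", rows renumbered 0..12
print(out.head(4))    # 1, 2, 3, 9, ...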
Surely that isn't the best way to do this?
BTW, pandas: concatenate data frames with different column names is similar, but my question is even simpler, since I do not want the index to be preserved. (I also start with a list of N single-column frames, not a single N-column data frame.)
*: e.g. axis=0 is the default behaviour; axis=1 gives an error; join="inner" is just silly (I only get the index); ignore_index=True renumbers the index, but I still get many columns and many NaNs.
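In code, with the example res from above (results noted in comments; the axis=1 case is omitted here because it only failed on my real data):

pd.concat(res, axis=0)                  # 13 rows, but 3 columns, mostly NaN
pd.concat(res, axis=0, join="inner")    # no common columns, so only the index survives
pd.concat(res, ignore_index=True)       # index renumbered 0..12, still 3 columns of NaNs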
UPDATE for empty data frames
I had problems with all of these solutions when the data contained an empty data frame, for example:
res = [
    pd.DataFrame({'A': [1, 2, 3]}),
    pd.DataFrame({'B': [9, 8, 7, 6, 5, 4]}),
    pd.DataFrame({'C': []}),
    pd.DataFrame({'D': [100, 200, 300, 400]}),
]
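As far as I can tell, the culprit is the dtype: a column built from an empty list comes out as object rather than numeric (at least on my pandas version), and that drags the combined result to object dtype:

print(pd.DataFrame({'C': []}).dtypes)   # C is 'object' here, not float64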
The trick was to force the type by adding .astype('float64'). For instance:
import numpy as np

pd.Series(np.concatenate([df.values.ravel().astype('float64') for df in res]))
or
pd.concat(res, axis=0).astype('float64').stack().reset_index(drop=True)
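Both variants give me one float64 Series with the expected 13 values (a quick check, reusing the four-frame res from this update):

a = pd.Series(np.concatenate([df.values.ravel().astype('float64') for df in res]))
b = pd.concat(res, axis=0).astype('float64').stack().reset_index(drop=True)
print(a.equals(b))   # True on my install: same 13 float64 values, same order
                     # (relies on stack() dropping the NaN cells, its classic default)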