I will learn to use pandas to use it for some data analysis. The data is provided as a csv file with several columns, of which I need to use only 4 (date, time, o, c). I would like to create a new DataFrame that uses the DateTime64 number as an index, this number is created by merging the first two columns, applying pd.to_datetime to the joined row.
My bootloader code is working fine:
st = pd.read_csv("C:/Data/stockname.txt", names=["date","time","o","h","l","c","vol"])
The challenge is to convert the loaded DataFrame to a new one with the correct format. Below it works, but very slowly. Moreover, it simply creates a single column with the new datetime64 format and does not make it an index.
My code
st_new = pd.concat([pd.to_datetime(st.date + " " + st.time), (st.o + st.c) / 2, st.vol], axis = 1, ignore_index=True)
What would be a more pythonic way to merge two columns and apply the function to the result? How to make a new column a DataFrame index?
source share