I looked at the answer to this question: Parsing dates when YYYYMMDD and HH are in separate columns using pandas in Python, but it does not seem to work for me, which makes me think that I am doing something wrong.
I have data in CSV files that I am trying to read using the pandas read_csv function. Date and time are in two separate columns, but I want to combine them into a single "Datetime" column containing datetime objects. The CSV looks like this:
Note about the data
blank line
Site Id,Date,Time,WTEQ.I-1...
2069, 2008-01-19, 06:00, -99.9...
2069, 2008-01-19, 07:00, -99.9...
...
I am trying to read it using this line of code:
read_csv("2069_ALL_YEAR=2008.csv", skiprows=2, parse_dates={"Datetime" : [1,2]}, date_parser=True, na_values=["-99.9"])
However, when I write it back to CSV, it looks exactly the same (except that the -99.9 values are changed to NA, as I specified with the na_values argument): Date and Time are still in two separate columns. As far as I understand, this should create a new Datetime column, composed of columns 1 and 2 and parsed using date_parser. I also tried parse_dates = {"Datetime": ["Date", "Time"]}, parse_dates = [[1,2]] and parse_dates = [["Date", "Time"]]. I also tried date_parser = parse, where parse is defined as:
parse = lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M')
None of them made the slightest difference, which makes me suspect that there is a deeper problem. Any insight on what this could be?
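To make the expected result concrete, here is a minimal sketch of the output I am after, built by combining the two columns after reading rather than through parse_dates. The in-memory SAMPLE text is a hypothetical stand-in for my file (with the elided columns left out and two non-empty lines to skip), and the use of skipinitialspace to drop the space after each comma is an assumption, not something I have confirmed is needed for the real data.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: two lines to skip,
# then the column names and two data rows.
SAMPLE = """Note about the data
(second line to skip)
Site Id,Date,Time,WTEQ.I-1
2069, 2008-01-19, 06:00, -99.9
2069, 2008-01-19, 07:00, -99.9
"""

df = pd.read_csv(
    io.StringIO(SAMPLE),
    skiprows=2,
    skipinitialspace=True,  # assumption: strip the space after each comma
    na_values=["-99.9"],
)

# Join the Date and Time strings and parse them with an explicit format.
df["Datetime"] = pd.to_datetime(
    df["Date"] + " " + df["Time"], format="%Y-%m-%d %H:%M"
)

# Drop the original columns so only the combined column remains.
df = df.drop(columns=["Date", "Time"])

print(df)
```

This is the shape I would expect read_csv to produce directly: a single Datetime column with a datetime dtype and no separate Date and Time columns.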