I have a dataset:
            367235  419895  992194
1999-01-11       8       5       1
1999-03-23     NaN       4     NaN
1999-04-30     NaN     NaN       1
1999-06-02     NaN       9     NaN
1999-08-08       2     NaN     NaN
1999-08-12     NaN       3     NaN
1999-08-17     NaN     NaN      10
1999-10-22     NaN       3     NaN
1999-12-04     NaN     NaN       4
2000-03-04       2     NaN     NaN
2000-09-29       9     NaN     NaN
2000-09-30       9     NaN     NaN
When I plot it using plt.plot(df, '-o'), I get the following:

But I would like the points in each column to be connected by a line, like this:

I understand that matplotlib does not connect data points that are separated by NaN values. I have looked at all the options here for handling missing data, but every one of them would substantially distort the data in the DataFrame. This is because each value in the DataFrame represents an incident; if I replace the NaNs with scalar values or use interpolation, I get a bunch of points that are not actually in my dataset. Here is what the interpolation looks like:
df_wanted2 = df.apply(pd.Series.interpolate)

If I try to use dropna, I lose entire rows or columns from the DataFrame, and those rows hold valuable data.
Does anyone know a way to connect my points? I suspect that I need to extract the individual arrays from the DataFrame and plot them separately, as shown here, but that seems like a lot of work (and my actual DataFrame is much bigger). Does anyone have a solution?
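For reference, the per-column idea can be sketched roughly as below. This is only a sketch under assumptions: it rebuilds the sample data by hand, assumes the column labels are integers and the index is a DatetimeIndex, and writes to a hypothetical file name (incidents.png). It drops the NaNs from each column separately, so no rows are lost from the DataFrame and no interpolated points are added.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

nan = np.nan
# Sample data rebuilt by hand; integer column labels and a DatetimeIndex are assumptions.
df = pd.DataFrame(
    {
        367235: [8, nan, nan, nan, 2, nan, nan, nan, nan, 2, 9, 9],
        419895: [5, 4, nan, 9, nan, 3, nan, 3, nan, nan, nan, nan],
        992194: [1, nan, 1, nan, nan, nan, 10, nan, 4, nan, nan, nan],
    },
    index=pd.to_datetime([
        "1999-01-11", "1999-03-23", "1999-04-30", "1999-06-02",
        "1999-08-08", "1999-08-12", "1999-08-17", "1999-10-22",
        "1999-12-04", "2000-03-04", "2000-09-29", "2000-09-30",
    ]),
)

fig, ax = plt.subplots()
for name in df.columns:
    # Drop NaNs per column only, so each line joins that column's real points.
    series = df[name].dropna()
    ax.plot(series.index, series.values, "-o", label=str(name))

ax.legend()
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
fig.savefig("incidents.png")  # hypothetical output file; use plt.show() interactively
```

The loop works for any number of columns, so a larger DataFrame should not need extra code, only more lines on the same axes.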