Scatter plot on a large amount of data

Question

Say I have a large dataset (8,500,000 × 50), and I would like to plot X (the date) against Y (the measurement taken on that day).

So far I have only managed this:

data_X = data['date_local']
data_Y = data['arithmetic_mean']
data_Y = data_Y.round(1)
data_Y = data_Y.astype(int)
data_X = data_X.astype(int)
sns.regplot(data_X, data_Y, data=data)
plt.show()

According to similar questions I found on Stack Overflow, I can shuffle my data or take, for example, 1000 random values and plot those. But how do I do this so that each sampled X (the date a measurement was made) stays paired with its actual measurement Y?

python matplotlib pandas seaborn

dodo4545 Jul 13 '17 at 22:55

1 answer

Vinícius Aguiar · Accepted Answer · 2017-07-14T01:32:47+0000

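First, to answer your question:

Use pandas.DataFrame.sample to draw a random sample of rows from your dataframe, then pass the sample to regplot. A small example with random data: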

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import datetime
import numpy as np
import pandas as pd
import seaborn as sns

dates = pd.date_range('20080101', periods=10000, freq="D")
df = pd.DataFrame({"dates": dates, "data": np.random.randn(10000)})

dfSample = df.sample(1000) # This is the important line: it samples whole rows
xdataSample, ydataSample = dfSample["dates"], dfSample["data"]

sns.regplot(x=mdates.date2num(xdataSample.astype(datetime)), y=ydataSample) 
plt.show()

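regplot fits a regression, so it needs numeric values on the X-axis; that is why the dates are converted to numbers with date2num.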
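The question asks whether each sampled date still belongs to its own measurement. A minimal sketch of how this could be checked, assuming synthetic data and illustrative column names ("dates", "data") rather than the real dataset: because sample draws whole rows and keeps their index labels, the sampled pairs can be compared against the source rows.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (column names are illustrative).
rng = np.random.default_rng(0)
dates = pd.date_range("2008-01-01", periods=10_000, freq="D")
df = pd.DataFrame({"dates": dates, "data": rng.standard_normal(10_000)})

# Sample whole rows; random_state makes the draw reproducible.
sample = df.sample(n=1000, random_state=42)

# Each sampled row keeps its original index label, so the sampled
# (date, value) pairs can be looked up in the source dataframe.
pairs_intact = sample.equals(df.loc[sample.index])
print(pairs_intact)
```

Sampling the two columns separately (for example, calling sample on each Series with different seeds) would break this pairing, which is why the sample is taken from the dataframe before the columns are split.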

As an alternative, sns.jointplot has a kind parameter. From the docs:

kind : { "scatter" | "reg" | "resid" | "kde" | "hex" }, optional
Kind of plot to draw.

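With kind="kde" the result is similar to what matplotlib's hist2d produces: a density map built from the entire dataset instead of a sample. An example with random data: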

dates = pd.date_range('20080101', periods=10000, freq="D")
df = pd.DataFrame({"dates": dates, "data": np.random.randn(10000)})

xdata, ydata = df["dates"], df["data"]
sns.jointplot(x=mdates.date2num(xdata.astype(datetime)), y=ydata, kind="kde")

plt.show()
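For comparison, the hist2d approach mentioned above could look roughly like this. It is a sketch on synthetic data, not the asker's dataset; the bin counts and the output file name are arbitrary choices made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a window instead
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (column names are illustrative).
rng = np.random.default_rng(0)
dates = pd.date_range("2008-01-01", periods=10_000, freq="D")
df = pd.DataFrame({"dates": dates, "data": rng.standard_normal(10_000)})

# hist2d needs numeric coordinates, so convert the dates first.
x = mdates.date2num(df["dates"].to_numpy())
y = df["data"].to_numpy()

fig, ax = plt.subplots()
# Every point falls into one bin, so no sampling is needed.
counts, xedges, yedges, image = ax.hist2d(x, y, bins=(100, 50))
ax.xaxis_date()  # show the numeric X-axis as dates again
fig.colorbar(image, ax=ax, label="points per bin")
fig.autofmt_xdate()
fig.savefig("hist2d_dates.png")  # arbitrary output name
```

Binning scales with the number of bins rather than the number of points drawn, so it tends to stay readable for millions of rows where a plain scatter plot turns into a solid blob.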
