seasonal_decompose() requires freq, which is either taken from the DateTimeIndex meta-information, inferred via pandas.Index.inferred_freq, or supplied by the user as an int giving the number of periods per cycle, for example 12 for monthly data (that phrasing is from the docstring for seasonal_mean); the docstring for seasonal_decompose reads:
def seasonal_decompose(x, model="additive", filt=None, freq=None):
    """
    Parameters
    ----------
    x : array-like
        Time series
    model : str {"additive", "multiplicative"}
        Type of seasonal component. Abbreviations are accepted.
    filt : array-like
        The filter coefficients for filtering out the seasonal component.
        The default is a symmetric moving average.
    freq : int, optional
        Frequency of the series. Must be used if x is not a pandas
        object with a timeseries index.
    """
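As a quick check of the second option, pandas can infer a frequency from a regular DatetimeIndex even when .freq is not set explicitly; this is a minimal sketch with arbitrary example dates:

import pandas as pd

# Regular weekly index built from explicit timestamps: .freq is None,
# but pandas can still infer the spacing.
idx = pd.DatetimeIndex(['2015-01-04', '2015-01-11', '2015-01-18', '2015-01-25'])
print(idx.freq)            # None
print(idx.inferred_freq)   # 'W-SUN'

# Irregular index: nothing to infer, so seasonal_decompose would need freq=<int>.
gappy = pd.DatetimeIndex(['2015-01-04', '2015-01-18', '2015-01-25'])
print(gappy.inferred_freq) # None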
To illustrate - using random sample data:
import numpy as np
import pandas as pd
import statsmodels.api as sm
from datetime import datetime

length = 400
x = np.sin(np.arange(length)) * 10 + np.random.randn(length)
df = pd.DataFrame(data=x,
                  index=pd.date_range(start=datetime(2015, 1, 1),
                                      periods=length, freq='w'),
                  columns=['value'])

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 400 entries, 2015-01-04 to 2022-08-28
Freq: W-SUN

decomp = sm.tsa.seasonal_decompose(df)
data = pd.concat([df, decomp.trend, decomp.seasonal, decomp.resid], axis=1)
data.columns = ['series', 'trend', 'seasonal', 'resid']

Data columns (total 4 columns):
series      400 non-null float64
trend       348 non-null float64
seasonal    400 non-null float64
resid       348 non-null float64
dtypes: float64(4)
memory usage: 15.6 KB
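The 348 non-null values in trend and resid are expected: the default filter is a centered moving average of length freq (52 here, inferred from the weekly index), which loses about 26 values at each end of the series. A quick sanity check, assuming the data frame built above:

# Trend is a centered 52-period moving average, so roughly 26 values are
# lost at each end of the series; resid inherits the same NaNs.
print(data['trend'].isnull().sum())   # 52 (26 leading + 26 trailing)
print(data['resid'].isnull().sum())   # 52
print(len(data) - 52)                 # 348, matching the non-null counts above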
So far so good - now randomly dropping elements from the DatetimeIndex to create unevenly spaced data:
df = df.iloc[np.unique(np.random.randint(low=0, high=length, size=int(length * .8)))]

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 222 entries, 2015-01-11 to 2022-08-21
Data columns (total 1 columns):
value    222 non-null float64
dtypes: float64(1)
memory usage: 3.5 KB

df.index.freq
None
df.index.inferred_freq
None
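One way to get back to a regular grid before decomposing is to reindex to the original weekly frequency and interpolate the gaps; a sketch, where the 'W' alias and time-based interpolation are just one possible choice:

# Rebuild a regular weekly index and fill the dropped observations by
# interpolation; only one of several ways to regularize the series.
df_regular = df.asfreq('W').interpolate(method='time')
print(df_regular.index.freq)               # <Week: weekday=6>
print(df_regular['value'].isnull().sum())  # 0, the gaps have been filled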
Running seasonal_decompose on this data "works":
decomp = sm.tsa.seasonal_decompose(df, freq=52)
data = pd.concat([df, decomp.trend, decomp.seasonal, decomp.resid], axis=1)
data.columns = ['series', 'trend', 'seasonal', 'resid']

DatetimeIndex: 224 entries, 2015-01-04 to 2022-08-07
Data columns (total 4 columns):
series      224 non-null float64
trend       172 non-null float64
seasonal    224 non-null float64
resid       172 non-null float64
dtypes: float64(4)
memory usage: 8.8 KB
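One way to see why this is questionable: seasonal_decompose treats the rows purely positionally, so with weeks missing, 52 consecutive rows no longer span one calendar year. A quick check on the gapped index, assuming the df from the previous step:

# With rows dropped, the spacing between consecutive observations varies,
# so a window of 52 rows can cover much more than 52 weeks of calendar time.
spacing = df.index.to_series().diff().dropna()
print(spacing.value_counts())       # a mix of 7-day, 14-day, ... gaps
print(df.index[52] - df.index[0])   # noticeably more than 364 days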
The question is how useful the result is. Even leaving aside the data gaps, which complicate the interpretation of the seasonal patterns (see the .interpolate() example in the release notes), statsmodels qualifies this procedure as follows:
Notes