A guide to understanding the different kinds of seasonality and decomposing a time series into trend and seasonal components
Seasonality in time series data refers to a pattern that repeats at regular intervals. This differs from ordinary cyclical behaviour, such as the rise and fall of stock prices, which recurs but does not have a fixed period. Understanding the seasonality of your data can provide a lot of insight, and you can even use it as a baseline against which to compare your time series machine learning models.
Quick Reference: In this article I use information published by the Quebec Real Estate Agents Union. The association publishes monthly real estate statistics. For convenience, I have included the monthly median prices for the Province of Quebec and the Montreal Capital Region in the CSV file available here: https://drive.google.com/file/d/1SMrkZPAa0aAl-ZhnHLLFbmdgYmtXgpAb/view?usp=sharing
The fastest way to get an idea of whether your data is seasonal is to plot it. Let’s see what we get when we plot the median price of houses in Montreal on a monthly basis.
A sharp eye may already see from this plot that prices seem to dip around the new year and peak a few months before the end of summer. Let’s dive into this a little further by drawing a vertical line at January of each year.
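As a minimal sketch of this kind of plot, the snippet below marks every January with a dashed vertical line. The column layout of the CSV is not described here, so the example uses a synthetic monthly series as a stand-in; the values and the variable names (`prices`, `januaries`) are assumptions for illustration only, not the real Montreal data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the monthly median price series (NOT the real data).
idx = pd.date_range("2015-01-01", periods=60, freq="MS")
months = idx.month.to_numpy()
values = 300_000 + 1_500 * np.arange(60) + 8_000 * np.cos(2 * np.pi * (months - 5) / 12)
prices = pd.Series(values, index=idx, name="median_price")

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(prices.index, prices.to_numpy(), label="Median price")

# Draw one dashed vertical line at each January to make the yearly cycle visible.
januaries = prices.index[prices.index.month == 1]
for january in januaries:
    ax.axvline(january, color="gray", linestyle="--", linewidth=0.8)

ax.set_xlabel("Month")
ax.set_ylabel("Median price")
ax.legend()
fig.tight_layout()
```

With the real file, `prices` would instead be built from the CSV (for example with `pd.read_csv` and a parsed date column), and the rest of the plotting code would stay the same.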
It looks like there is definitely a recurring pattern here, and in this case the seasonality appears to have a period of one year. Next, we look at a tool we can use to examine seasonality and decompose a time series into its trend, seasonal, and residual components. Before we can do that, however, we need to understand the difference between additive and multiplicative seasonality.
There are two types of seasonality that can occur when analyzing time series data. To understand the difference between them, let us consider a standard time series with perfect seasonality, the cosine wave:
We can clearly see that the period of the wave is 20 and that the amplitude (the distance from the centerline to the top of a crest or the bottom of a trough) is 1 and remains constant.
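A wave like this can be generated in a few lines. The sketch below assumes a unit time step and 100 samples (both arbitrary choices) and checks the two properties just mentioned: a period of 20 and an amplitude of 1.

```python
import numpy as np

period = 20
t = np.arange(100)                      # time steps 0..99 (arbitrary length)
wave = np.cos(2 * np.pi * t / period)   # cosine wave with period 20

# Amplitude: half the distance between the highest crest and the lowest trough.
amplitude = (wave.max() - wave.min()) / 2

# Periodicity: shifting the wave by one full period reproduces it.
repeats = np.allclose(wave[:-period], wave[period:])
```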
Additive seasonality
It is quite rare for real time series to have constant crest and trough values. Instead, we typically see some sort of general trend, such as an increase or decrease over time. For example, in our house price chart, the median price tends to rise over time.
If the amplitude of our seasonality tends to remain the same, we have so-called additive seasonality. Below is an example of additive seasonality.
A great way to think about it is to imagine that we took the usual cosine wave and simply added a trend to it:
We can even think of our previous plain cosine wave as an additive model with a constant trend! We can model an additive time series with the following simple equation:
Y[t] = T[t] + S[t] + e[t]
Y[t]: Our time series function
T[t]: Trend (general tendency to move up or down)
S[t]: Seasonality (cyclical pattern occurring at regular intervals)
e[t]: Residual (random noise in the data that is not accounted for by the trend or seasonality)
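The additive equation can be sketched directly in code. The slope of the trend, the noise level, and the random seed below are arbitrary assumptions chosen only to illustrate the idea; the point is that the size of the seasonal swing is the same at the start and at the end of the series.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, illustrative only
t = np.arange(200)

trend = 0.05 * t                        # T[t]: steady upward drift (assumed slope)
seasonal = np.cos(2 * np.pi * t / 20)   # S[t]: cosine wave with period 20
residual = rng.normal(0, 0.1, t.size)   # e[t]: random noise (assumed scale)

y_additive = trend + seasonal + residual  # Y[t] = T[t] + S[t] + e[t]

# The seasonal swing does not depend on the level of the trend.
swing_first_cycle = seasonal[:20].max() - seasonal[:20].min()
swing_last_cycle = seasonal[-20:].max() - seasonal[-20:].min()
```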
The other type of seasonality you may encounter in your time series data is multiplicative. In this type, the amplitude of our seasonality increases or decreases along with the trend. Below is an example of multiplicative seasonality.
We can apply a similar mindset to the one we used for our additive model and imagine that we took the cosine wave, but instead of adding the trend, we multiplied by it (hence the name multiplicative seasonality):
We can model this with an equation similar to that of our additive model, simply swapping the additions for multiplications.
Y[t] = T[t] * S[t] * e[t]
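Here is a comparable sketch for the multiplicative case. In a multiplicative model the seasonal and residual factors fluctuate around 1 rather than around 0; the specific trend, the 20% seasonal swing, and the noise scale are assumptions made for illustration. The last lines also show a handy property: taking logarithms turns the multiplicative model into an additive one.

```python
import numpy as np

rng = np.random.default_rng(0)                   # fixed seed, illustrative only
t = np.arange(200)

trend = 1 + 0.05 * t                             # T[t]: positive, rising trend (assumed)
seasonal = 1 + 0.2 * np.cos(2 * np.pi * t / 20)  # S[t]: factor oscillating around 1
residual = 1 + rng.normal(0, 0.02, t.size)       # e[t]: noise factor close to 1

y_multiplicative = trend * seasonal * residual   # Y[t] = T[t] * S[t] * e[t]

# The seasonal swing now grows with the trend.
noise_free = trend * seasonal
swing_first_cycle = noise_free[:20].max() - noise_free[:20].min()
swing_last_cycle = noise_free[-20:].max() - noise_free[-20:].min()

# log(Y) = log(T) + log(S) + log(e): the log of the series is additive.
log_is_additive = np.allclose(
    np.log(y_multiplicative),
    np.log(trend) + np.log(seasonal) + np.log(residual),
)
```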
Now that we have a clear picture of the different models, let’s look at how we can decompose our real estate time series into its trend, seasonal, and residual components. We use the seasonal_decompose function from the statsmodels library.
The seasonal_decompose function requires that you select a model type for the seasonality (additive or multiplicative). We choose the multiplicative model because it seems that the amplitude of the cycles increases over time. This would make sense because a large factor in the price of housing is the interest rate on the loan, which is paid as a percentage of the price.
Ta-da! The trend, seasonal, and residual components are returned as pandas Series, so you can plot them by calling their plot() methods or perform additional analysis on them. One thing that can be helpful is to measure their correlation with external factors. For example, you can measure the correlation between the trend and the mortgage rate, or you can see if there is a strong correlation between the residual and the number of babies born in the city.
From our decomposition, we can see that the model picked up a difference of about 5% between the seasons. If you want to sell your house, you might want to list it in mid to late spring instead of late fall if you want to get top dollar!
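To illustrate the correlation idea, the sketch below compares a trend series with an external series using pandas' `Series.corr`. Both series are made-up placeholders (the `mortgage_rate` values in particular are hypothetical, not real rates); the missing values at the ends of the trend mimic what a decomposition returns, and `corr` simply ignores those pairs.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, illustrative only
idx = pd.date_range("2015-01-01", periods=60, freq="MS")

# Placeholder trend, with NaN at both ends as a moving-average trend would have.
trend = pd.Series(np.linspace(300_000, 400_000, 60), index=idx, name="trend")
trend.iloc[:6] = np.nan
trend.iloc[-6:] = np.nan

# Hypothetical external factor: a mortgage rate that falls as prices rise.
mortgage_rate = pd.Series(
    6.0 - 2.0 * np.linspace(0, 1, 60) + rng.normal(0, 0.1, 60),
    index=idx,
    name="mortgage_rate",
)

# Pearson correlation; pairs where either value is missing are excluded.
correlation = trend.corr(mortgage_rate)
```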
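One way to put a number on the gap between seasons is to compare the largest and smallest seasonal factors of a multiplicative decomposition. The twelve factors below are illustrative values generated from a cosine, not the factors obtained from the real data, and the name `seasonal_factors` is hypothetical.

```python
import numpy as np
import pandas as pd

# Illustrative monthly seasonal factors (index 1..12 = January..December).
months = np.arange(1, 13)
seasonal_factors = pd.Series(
    1 + 0.025 * np.cos(2 * np.pi * (months - 5) / 12),
    index=months,
)

# Relative difference between the strongest and the weakest month.
spread = seasonal_factors.max() / seasonal_factors.min() - 1

best_month = int(seasonal_factors.idxmax())   # month with the highest factor
worst_month = int(seasonal_factors.idxmin())  # month with the lowest factor
```

With these placeholder factors the spread comes out at roughly 5%, with the strongest month in spring and the weakest in late fall.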