Daily, Weekly and Yearly seasonalities.
A lot of signals have calendar-based seasonalities:
- city noise levels have a daily seasonality: it's busy when people are out and calm at night.
- sales have a weekly seasonality: people buy more on Wednesdays and Saturdays.
- and for yearly seasonality... you guessed it, seasons.
These seasonalities are hard to define with a simple mathematical function, but most of the time you have a rough idea of their shape. Let's see how you can create them with mockseries!
Let's look at the hourly energy consumption per household.
It's definitely cyclic: you consume less when you're sleeping, and there are peaks corresponding to breakfast, lunch and dinner: things that happen every day.
To generate a similar signal in mockseries, we just need to identify a few constraint points. A constraint is a time and its associated value.
An algorithm will generate a realistic curve that respects the constraints.
Times are expressed as timedeltas from the start of the period, so for DailySeasonality, timedeltas should be between 0h 0min 0sec and 23h 59min 59.999sec.
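To make the idea of constraint points concrete, here is a minimal standard-library sketch. It is not the mockseries algorithm: it uses plain linear interpolation as a stand-in, and the constraint values and the `periodic_value` helper are hypothetical.

```python
from datetime import timedelta

DAY = timedelta(days=1)

# Hypothetical constraint points: time since midnight -> relative consumption.
constraints = {
    timedelta(hours=3): 0.2,                # everyone is asleep
    timedelta(hours=7, minutes=30): 1.0,    # breakfast
    timedelta(hours=12, minutes=30): 1.3,   # lunch
    timedelta(hours=20): 1.6,               # dinner
}


def periodic_value(constraints, offset, period=DAY):
    """Linearly interpolate between constraint points, wrapping around the period."""
    for t in constraints:
        if not timedelta(0) <= t < period:
            raise ValueError(f"constraint time {t} is outside the period")
    points = sorted(constraints.items())
    # Copy the last and first points one period away so the curve is cyclic.
    first_t, first_v = points[0]
    last_t, last_v = points[-1]
    extended = [(last_t - period, last_v)] + points + [(first_t + period, first_v)]
    offset = offset % period
    for (t0, v0), (t1, v1) in zip(extended, extended[1:]):
        if t0 <= offset <= t1:
            fraction = (offset - t0) / (t1 - t0)
            return v0 + fraction * (v1 - v0)
    raise AssertionError("offset should always fall between two points")


# Halfway between dinner (20:00) and the 03:00 low of the next day.
late_evening = periodic_value(constraints, timedelta(hours=23, minutes=30))
```

The curve passes exactly through each constraint and repeats every 24 hours. A real generator would typically produce a smoother shape than these straight segments.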
That's it! WeeklySeasonality and YearlySeasonality work the same way: all three signals are subclasses of PeriodSeasonality. But don't leave just yet! We still have important features to explore.
# Weekly seasonality - Managing timezone
Let's imagine the weekly traffic of an American social network:
- the app has almost no users at night
- the app has many users around 7pm, except on Saturday and Sunday
Let's use WeeklySeasonality to get this:
Looks nice, but there's one problem. The social network is American, straight from Silicon Valley! All times were given with the PDT timezone in mind.
It's easy to account for this in mockseries: just pass the UTC offset of the timezone. For instance, PDT is UTC-7, so pass a `utc_offset` of `-7 hours`.
That's it! You can see the red curve corresponds to the PDT times.
Of course this also works with the other seasonalities presented in this tutorial.
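As a rough picture of what a UTC offset does to the constraints, the sketch below re-expresses local times as UTC times. The `to_utc_constraints` helper is hypothetical and only illustrates the arithmetic; it also assumes the week starts on Monday, which is not stated above.

```python
from datetime import timedelta

WEEK = timedelta(weeks=1)


def to_utc_constraints(local_constraints, utc_offset, period=WEEK):
    """Re-express constraint times given in local time as times in UTC."""
    # local = utc + utc_offset, so utc = local - utc_offset, wrapped into the period.
    return {(t - utc_offset) % period: v for t, v in local_constraints.items()}


# Hypothetical constraints: peaks at 7pm PDT on Monday (day 0) and Sunday (day 6).
local = {
    timedelta(days=0, hours=19): 1.0,
    timedelta(days=6, hours=19): 0.4,
}
utc = to_utc_constraints(local, utc_offset=timedelta(hours=-7))
```

Monday 7pm PDT lands on Tuesday 02:00 UTC, and Sunday 7pm PDT wraps around to Monday 02:00 UTC at the start of the next period.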
# Yearly seasonality - Normalizing
Let's have a look at the historical monthly average temperature in Croatia from the World Bank Group.
We have our pattern. It sounds reasonable to use the middle of a month to represent the average value of a month. For instance:
Not easy to compute mentally, right? Don't worry, middle-of-month timedeltas are available in the utils. Also, because we're lazy, we won't fill in all 12 months and will let the interpolation generate a realistic curve.
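For reference, a middle-of-month timedelta can be computed by hand as below. This is a hypothetical helper written with the standard library, not the one shipped in the utils, and it assumes a non-leap year by default.

```python
import calendar
from datetime import datetime, timedelta


def middle_of_month(month, year=2021):
    """Timedelta from the start of the year to the midpoint of the given month."""
    start_of_year = datetime(year, 1, 1)
    start_of_month = datetime(year, month, 1)
    days_in_month = calendar.monthrange(year, month)[1]
    return (start_of_month - start_of_year) + timedelta(days=days_in_month) / 2


mid_january = middle_of_month(1)  # 15 days and 12 hours into the year
mid_march = middle_of_month(3)    # 31 + 28 days, plus half of March
```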
Repeating on multiple years:
That's nice, but let's make this more realistic now.
Look at the temperature trend:
In 1980, the temperature averaged 10.4 degrees. It was 12.6 in 2020.
We can roughly estimate that Croatia's temperature follows a linear trend, rising by about 21% in 40 years.
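The arithmetic behind this estimate, using only the two figures quoted above, can be checked in a few lines. The per-year slope is an extra quantity derived here under the same linear assumption.

```python
start_year, start_temp = 1980, 10.4
end_year, end_temp = 2020, 12.6

# Relative increase over the whole span: roughly 0.21, i.e. about 21%.
relative_increase = (end_temp - start_temp) / start_temp

# Slope of the linear trend, in degrees per year.
slope_per_year = (end_temp - start_temp) / (end_year - start_year)

print(f"relative increase: {relative_increase:.3f}")
print(f"slope: {slope_per_year:.3f} degrees per year")
```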
Let's combine this with our seasonality! But how should we do this? Our YearlySeasonality is based on average temperatures between 1990 and 2020. We would have to somehow estimate the true values in 1990, and then estimate the values in 1980. It's feasible with our linear trend approximation, but it's error-prone.
Remember the two types of interactions?
A simple way to get around this is to use `normalize` in our YearlySeasonality. `normalize` transforms the constraints into multiplication factors for easy use in multiplicative interactions.
Let's check it out:
Same curve, different scale.
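One way to picture normalization is sketched below. It assumes that normalizing means dividing every constraint value by the mean of the values, so the factors average to 1; the library may use a different reference value. The temperatures are made up for illustration and are not the World Bank figures.

```python
from datetime import timedelta

# Hypothetical mid-month temperatures in degrees.
constraints = {
    timedelta(days=15.5): 1.0,     # mid-January
    timedelta(days=105.5): 11.0,   # mid-April
    timedelta(days=196.5): 21.0,   # mid-July
    timedelta(days=288.5): 11.0,   # mid-October
}


def normalize(constraints):
    """Assumed normalization: divide every value by the mean of all values."""
    mean = sum(constraints.values()) / len(constraints)
    return {t: v / mean for t, v in constraints.items()}


factors = normalize(constraints)
```

The shape of the curve is unchanged, but the values are now unitless factors that can scale any base value.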
Now it's easy to combine everything: we generate the base value, multiply it by our seasonality and add the trend. Also, let's add a bit of noise to make this more realistic.
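The combination described above can be sketched with plain Python. Everything here is a stand-in: `seasonal_factor` is a hypothetical sinusoid replacing the normalized YearlySeasonality, the base value and slope come from the 1980 and 2020 figures quoted earlier, and the noise level is arbitrary.

```python
import math
import random
from datetime import datetime, timedelta

YEAR = timedelta(days=365.25)


def seasonal_factor(ts):
    """Hypothetical multiplicative factor around 1.0, peaking in mid-July."""
    day_of_year = ts.timetuple().tm_yday
    return 1.0 + 0.9 * math.cos(2 * math.pi * (day_of_year - 196) / 365.25)


def trend(ts, start=datetime(1980, 1, 1), slope_per_year=0.055):
    """Linear trend in degrees, zero at the start date."""
    return slope_per_year * ((ts - start) / YEAR)


def temperature(ts, base=10.4, noise_std=0.5, rng=random):
    """Base value times seasonality, plus trend, plus Gaussian noise."""
    return base * seasonal_factor(ts) + trend(ts) + rng.gauss(0, noise_std)


random.seed(0)  # make the noisy sample reproducible
sample = temperature(datetime(2020, 7, 15))
```

Setting `noise_std=0` removes the noise, which makes it easy to check the seasonal and trend parts separately.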
Here you are: a simple temperature timeseries for Croatia.
To go further, try to implement some improvements:
- Add the temperature's daily seasonality.
- Make the YearlySeasonality multiplicative with the trend.
- Make the trend non-linear, exponential maybe.
Go to the next page to learn how to create switches and point-in-time events.
Go directly to the PeriodSeasonality's API Reference to see how to extend the PeriodSeasonality for your own custom period.