Chapter 20 Time Series

20.1 Univariate Time Series Analysis

Any metric that is measured over regular time intervals forms a time series.

Analysis of time series is commercially important because of industrial need and relevance, especially with respect to forecasting (demand, sales, supply, etc.).

A time series can be broken down into its components so as to systematically understand, analyze, model and forecast it.

In this chapter we will answer the following questions:

1. What are the components of a time series?

2. What is a stationary time series?

3. How to decompose it?

4. How to de-trend and de-seasonalize a time series?

5. What is autocorrelation?

20.2 Time series data

A time series is stored in a ts object in R:

• a list of numbers
• information about times those numbers were recorded.

ts(data, frequency, start)

Once you have read the time series data into R, the next step is to store the data in a time series object in R, so that you can use many functions for analyzing time series data. To store the data in a time series object, we use the ts() function in R.

For example, to store the data in the variable tsales as a time series object in R, we type:

# Creating a time series object in R
sales <- c(18, 33, 41,  7, 34, 35, 24, 25, 24, 21, 25, 20,
22, 31, 40, 29, 25, 21, 22, 54, 31, 25, 26, 35)
tsales <- ts(sales, start=c(2003, 1), frequency=12)
tsales
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2003  18  33  41   7  34  35  24  25  24  21  25  20
2004  22  31  40  29  25  21  22  54  31  25  26  35

Sometimes the time series data set that you have may have been collected at regular intervals that were less than one year, for example, monthly or quarterly. In this case, you can specify the number of times that data was collected per year by using the frequency parameter in the ts() function. For monthly time series data, you set frequency=12, while for quarterly time series data, you set frequency=4.

You can also specify the first year that the data was collected, and the first interval in that year by using the start parameter in the ts() function. For example, if the first data point corresponds to the second quarter of 2003, you would set start=c(2003,2).

tsales <- ts(sales, start=c(2003, 2), frequency=4)
tsales
     Qtr1 Qtr2 Qtr3 Qtr4
2003        18   33   41
2004    7   34   35   24
2005   25   24   21   25
2006   20   22   31   40
2007   29   25   21   22
2008   54   31   25   26
2009   35               
start(tsales) 
[1] 2003    2
end(tsales)
[1] 2009    1
frequency(tsales)
[1] 4

20.3 Smoothing a Time Series

Problem

You have a noisy time series. You want to smooth the data to eliminate the noise.

Solution

The KernSmooth package contains functions for smoothing. Use the dpill function to select an initial bandwidth parameter, and then use the locpoly function to smooth the data:
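A minimal sketch of the two calls (the first few lines generate placeholder data so the snippet stands alone; substitute your own series):

```r
library(KernSmooth)

# Placeholder data standing in for your own series
t <- seq(0, 4 * pi, length.out = 500)
y <- sin(t) + rnorm(500, sd = 0.5)

gridsize <- length(y)
bw <- dpill(t, y, gridsize = gridsize)    # initial bandwidth estimate
smooth <- locpoly(t, y, bandwidth = bw,   # local polynomial smoothing
                  gridsize = gridsize)
```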

Here, t is the time variable and y is the time series.

Discussion

The KernSmooth package is a standard part of the R distribution. It includes the locpoly function, which constructs, around each data point, a polynomial that is fitted to the nearby data points. These are called local polynomials. The local polynomials are strung together to create a smoothed version of the original data series.

The algorithm requires a bandwidth parameter to control the degree of smoothing. A small bandwidth means less smoothing, in which case the result follows the original data more closely. A large bandwidth means more smoothing, so the result contains less noise. The tricky part is choosing just the right bandwidth: not too small, not too large.

Fortunately, KernSmooth also includes the function dpill for estimating the appropriate bandwidth, and it works quite well. We recommend that you start with the dpill value and then experiment with values above and below that starting point. There is no magic formula here. You need to decide what level of smoothing works best in your application.

The following is an example of smoothing. We’ll create some example data that is the sum of a simple sine wave and normally distributed “noise”:
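One way to build such example data, as a sketch:

```r
set.seed(4)                                    # reproducible noise
t <- seq(from = 0, to = 4 * pi, length.out = 1000)
y <- sin(t) + rnorm(1000, mean = 0, sd = 0.5)  # sine wave plus Gaussian noise
```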

Both dpill and locpoly require a grid size—in other words, the number of points for which a local polynomial is constructed. We often use a grid size equal to the number of data points, which yields a fine resolution. The resulting time series is very smooth. You might use a smaller grid size if you want a coarser resolution or if you have a very large dataset:
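For example, with a grid size equal to the number of data points (the example data is recreated here so the snippet stands alone):

```r
library(KernSmooth)

t <- seq(0, 4 * pi, length.out = 1000)   # example data as before
y <- sin(t) + rnorm(1000, sd = 0.5)

gridsize <- length(y)                    # fine resolution: one grid point per observation
bw <- dpill(t, y, gridsize = gridsize)   # bandwidth estimated for that grid
```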

The locpoly function performs the smoothing and returns a list. The y element of that list is the smoothed data:
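Putting the pieces together in a self-contained sketch, and plotting the smoothed series over the original:

```r
library(KernSmooth)

set.seed(4)
t <- seq(0, 4 * pi, length.out = 1000)
y <- sin(t) + rnorm(1000, sd = 0.5)

gridsize <- length(y)
bw <- dpill(t, y, gridsize = gridsize)
lp <- locpoly(t, y, bandwidth = bw, gridsize = gridsize)

plot(t, y, type = "l")       # original data: solid line
lines(lp$x, lp$y, lty = 2)   # smoothed data (lp$y): dashed line
```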

In Figure 20.1, the smoothed data is shown as a dashed line, while the solid line is our original example data. The figure demonstrates that locpoly did an excellent job of extracting the original sine wave.

See Also

The ksmooth, lowess, and HoltWinters functions in the base distribution can also perform smoothing. The expsmooth package implements exponential smoothing.
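For comparison, a sketch with base R's lowess(), whose smoother span f plays a role analogous to the bandwidth:

```r
set.seed(4)
t <- seq(0, 4 * pi, length.out = 1000)
y <- sin(t) + rnorm(1000, sd = 0.5)

fit <- lowess(t, y, f = 0.1)   # f = fraction of points used for each local fit
plot(t, y, type = "l")
lines(fit$x, fit$y, lty = 2)   # smoothed curve as dashed line
```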

20.4 TS plots

Once you have read a time series into R, the next step is usually to make a plot of the time series data, which you can do with the plot.ts() function in R.

plot.ts(tsales)

We can also take a look at some more detailed time series aspects like seasonal plots, e.g., for the antidiabetic drug sales data:

ggseasonplot(a10, year.labels=TRUE, year.labels.left=TRUE) +
ylab("$million") + ggtitle("Seasonal plot: antidiabetic drug sales")

Seasonal polar plots:

ggseasonplot(a10, polar=TRUE) + ylab("$ million") +
ggtitle("Seasonal polar plot: antidiabetic drug sales")

Seasonal subseries plots:

• Data for each season collected together in time plot as separate time series.
• Enables the underlying seasonal pattern to be seen clearly, and changes in seasonality over time to be visualized.
• In R: ggsubseriesplot()
ggsubseriesplot(a10) + ylab("$million") +
ggtitle("Subseries plot: antidiabetic drug sales")

Lagged scatterplots:

• Each graph shows $$y_t$$ plotted against $$y_{t-k}$$ for different values of $$k$$.
• The autocorrelations are the correlations associated with these scatterplots.

beer <- window(ausbeer, start=1992)
gglagplot(beer)

Here the colours indicate the quarter of the variable on the vertical axis. The lines connect points in chronological order. The window() function used here is very useful when extracting a portion of a time series. In this case, we have extracted the data from ausbeer, beginning in 1992.

20.5 Time series components

Trend pattern exists when there is a long-term increase or decrease in the data.

Seasonal pattern exists when a series is influenced by seasonal factors (e.g., the quarter of the year, the month, or day of the week).

Cyclic pattern exists when data exhibit rises and falls that are not of a fixed period (usually of at least 2 years' duration).

Additive decomposition: $$y_t = S_t + T_t + R_t.$$

Multiplicative decomposition: $$y_t = S_t \times T_t \times R_t.$$

• Additive model appropriate if magnitude of seasonal fluctuations does not vary with level.
• If seasonal fluctuations are proportional to level of series, then multiplicative model appropriate.
• Multiplicative decomposition more prevalent with economic series.
• Alternative: use a Box-Cox transformation, and then use additive decomposition.
• Logs turn a multiplicative relationship into an additive relationship:
$$y_t = S_t \times T_t \times R_t \quad\Rightarrow\quad \log y_t = \log S_t + \log T_t + \log R_t.$$

20.5.1 TS patterns

Differences between seasonal and cyclic patterns:

• seasonal pattern constant length; cyclic pattern variable length
• average length of cycle longer than length of seasonal pattern
• magnitude of cycle more variable than magnitude of seasonal pattern

20.6 Moving averages

• The first step in a classical decomposition is to use a moving average method to estimate the trend-cycle.
• A moving average of order $$m$$ can be written as
$$\hat{T}_t = \frac{1}{m}\sum_{j=-k}^k y_{t+j}$$
where $$m=2k+1$$. We call this an $$m$$-MA.
• The estimate of the trend-cycle at time $$t$$ is obtained by averaging values of the time series within $$k$$ periods of $$t$$. Observations that are nearby in time are also likely to be close in value.

20.6.1 MA's of MA's (moving averages of moving averages)

• $$m$$ can be even, but the moving average would no longer be symmetric.
• It is possible to apply a moving average to a moving average. One reason for doing this is to make an even-order moving average symmetric.
• When a 2-MA follows a moving average of an even order (such as 4), it is called a centred moving average of order 4, written as a $$2\times 4$$-MA:
$$\hat{T}_t = \frac{1}{2}\left[\frac{1}{4}(y_{t-2}+y_{t-1}+y_t+y_{t+1})+\frac{1}{4}(y_{t-1}+y_{t}+y_{t+1}+y_{t+2})\right]$$
• It is now a weighted average of observations that is symmetric.

20.6.2 Trend-cycle with seasonal data

• The most common use of centred moving averages is for estimating the trend-cycle from seasonal data.
• For quarterly data, we can use a $$2\times 4$$-MA:
$$\hat{T}_t = \frac{1}{8}y_{t-2}+\frac{1}{4}y_{t-1}+\frac{1}{4}y_t +\frac{1}{4}y_{t+1}+\frac{1}{8}y_{t+2}$$
• Consequently, the seasonal variation will be averaged out and the resulting values of $$\hat{T}_t$$ will have little or no seasonal variation remaining.
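The 2×4-MA weights above can be applied directly with base R's stats::filter(); a sketch using the built-in quarterly UKgas series as stand-in data:

```r
# 2x4-MA weights: 1/8, 1/4, 1/4, 1/4, 1/8 (symmetric, sum to 1)
w <- c(1, 2, 2, 2, 1) / 8
trend <- stats::filter(UKgas, filter = w, sides = 2)  # centred moving average
```

As with any centred MA of this width, the first two and last two values of the estimate are missing.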
• For monthly data, we can use a $$2\times 12$$-MA to estimate the trend-cycle.
• For weekly data, we can use a $$7$$-MA.

20.7 Decomposing Non-Seasonal Data

Decomposing a time series means separating it into its constituent components, which are usually a trend component and an irregular component and, if it is a seasonal time series, a seasonal component.

A non-seasonal time series consists of a trend component and an irregular component. Decomposing the time series involves trying to separate the time series into these components, that is, estimating the trend component and the irregular component.

To estimate the trend component of a non-seasonal time series that can be described using an additive model, it is common to use a smoothing method, such as calculating the simple moving average of the time series.

The SMA() function in the TTR R package can be used to smooth time series data using a simple moving average. To use this function, we first need to install the TTR R package. When calling SMA(), you need to specify the order (span) of the simple moving average, using the parameter n. For example, to calculate a simple moving average of order 3, we set n=3 in the SMA() function:

moving1 <- SMA(tsales, n=3)
plot.ts(moving1)

There still appears to be quite a lot of random fluctuation in the time series smoothed with a simple moving average of order 3. Thus, to estimate the trend component more accurately, we might want to try smoothing the data with a simple moving average of a higher order.

moving2 <- SMA(tsales, n=5)
plot.ts(moving2)

The data smoothed with a simple moving average of order 5 gives a clearer picture of the trend component.

20.8 Classical decomposition

A seasonal time series consists of a trend component, a seasonal component and an irregular component.
Decomposing the time series means separating the time series into these three components: that is, estimating these three components.

• There are two forms of classical decomposition: an additive decomposition and a multiplicative decomposition.
• Let $$m$$ be the seasonal period.

Additive decomposition: $$y_t = S_t + T_t + R_t.$$

Multiplicative decomposition: $$y_t = S_t \times T_t \times R_t.$$

20.8.1 Classical additive decomposition

• Step 1: If $$m$$ is an even number, compute the trend-cycle component $$\hat{T}_t$$ using a $$2 \times m$$-MA. If $$m$$ is an odd number, use an $$m$$-MA.
• Step 2: Calculate the detrended series: $$y_t-\hat{T}_t$$.
• Step 3: To estimate the seasonal component for each season, simply average the detrended values for that season.
• Step 4: The remainder component is calculated by subtracting the estimated seasonal and trend-cycle components.

A classical multiplicative decomposition is very similar, except that the subtractions are replaced by divisions.

• The trend-cycle estimate tends to over-smooth rapid rises and falls in the data.
• Classical decomposition methods assume that the seasonal component repeats from year to year. For many series, this is a reasonable assumption, but for some longer series it is not.
• The classical method is not robust to unusual values.

To estimate the trend component and seasonal component of a seasonal time series that can be described using an additive model, we can use the decompose() function in R. This function estimates the trend, seasonal, and irregular components of a time series that can be described using an additive model.

The function decompose() returns a list object as its result, where the estimates of the seasonal component, trend component and irregular component are stored in named elements of that list object, called seasonal, trend, and random respectively.

For example, to plot the time series of antidiabetic drug sales, we type:

autoplot(a10) + ylab("$ million") + xlab("Year") +
ggtitle("Antidiabetic drug sales")

We can see from this time series that there seems to be clear seasonal variation in the monthly sales: there is a large peak at the start of each year, followed by a trough.

Here the size of the seasonal fluctuations increases with the level of the series, which suggests that an additive model is not appropriate without first transforming the data.

Similarly, to plot the time series of the monthly sales for the souvenir shop at a beach resort, we type:

plot.ts(souvenirtimeseries)

In this case, it appears that an additive model is not appropriate for describing this time series, since the size of the seasonal fluctuations and random fluctuations seem to increase with the level of the time series.

Thus, we may need to transform the time series in order to get a transformed time series that can be described using an additive model. For example, we can transform the time series by calculating the natural log of the original data:

logsouvenirtimeseries <- as.ts(log(souvenirtimeseries))
plot.ts(logsouvenirtimeseries)

Here we can see that the size of the seasonal fluctuations and random fluctuations in the log-transformed time series seem to be roughly constant over time, and do not depend on the level of the time series. Thus, the log-transformed time series can probably be described using an additive model:

decomposed1 <- decompose(logsouvenirtimeseries)
plot(decomposed1)

While classical decomposition is still widely used, it is not recommended, as there are now several much better methods.

Some of the problems with classical decomposition are summarized below:

The estimate of the trend-cycle is unavailable for the first few and last few observations. For example, if m=12, there is no trend-cycle estimate for the first six or the last six observations. Consequently, there is also no estimate of the remainder component for the same time periods.

The trend-cycle estimate tends to over-smooth rapid rises and falls in the data.

Classical decomposition methods assume that the seasonal component repeats from year to year. For many series, this is a reasonable assumption, but for some longer series it is not. For example, electricity demand patterns have changed over time as air conditioning has become more widespread. In many locations, the seasonal usage pattern from several decades ago had its maximum demand in winter (due to heating), while the current seasonal pattern has its maximum demand in summer (due to air conditioning). Classical decomposition methods are unable to capture these seasonal changes over time.

Occasionally, the values of the time series in a small number of periods may be particularly unusual. For example, the monthly air passenger traffic may be affected by an industrial dispute, making the traffic during the dispute different from usual. The classical method is not robust to these kinds of unusual values.

20.9 X11 decomposition

The X-11 method originated in the US Census Bureau and was further developed by Statistics Canada.

It is based on classical decomposition, but includes many extra steps and features in order to overcome the drawbacks of classical decomposition that were discussed in the previous section.

In particular, trend-cycle estimates are available for all observations including the end points, and the seasonal component is allowed to vary slowly over time. X-11 also handles trading day variation, holiday effects and the effects of known predictors.

There are methods for both additive and multiplicative decomposition. The process is entirely automatic and tends to be highly robust to outliers and level shifts in the time series.

fit <- seas(a10, x11="")
autoplot(fit) +
ggtitle("X11 decomposition of antidiabetic drug sales")

Advantages

• Relatively robust to outliers
• Completely automated choices for trend and seasonal changes
• Very widely tested on economic data over a long period of time.

Disadvantages

• No prediction/confidence intervals
• Ad hoc method with no underlying model
• Only developed for quarterly and monthly data

20.10 STL decomposition

STL is a versatile and robust method for decomposing time series. STL is an acronym for Seasonal and Trend decomposition using Loess, while Loess is a method for estimating nonlinear relationships.

STL has several advantages over the classical, SEATS and X11 decomposition methods:

• Unlike SEATS and X11, STL will handle any type of seasonality, not only monthly and quarterly data.

• The seasonal component is allowed to change over time, and the rate of change can be controlled by the user.

• The smoothness of the trend-cycle can also be controlled by the user.

• It can be robust to outliers (i.e., the user can specify a robust decomposition), so that occasional unusual observations will not affect the estimates of the trend-cycle and seasonal components. They will, however, affect the remainder component.

On the other hand, STL has some disadvantages. In particular, it does not handle trading day or calendar variation automatically, and it only provides facilities for additive decompositions.

tsales %>%
stl(t.window=13, s.window="periodic", robust=TRUE) %>%
autoplot()

The two main parameters to be chosen when using STL are the trend-cycle window (t.window) and the seasonal window (s.window). These control how rapidly the trend-cycle and seasonal components can change. Smaller values allow for more rapid changes. Both t.window and s.window should be odd numbers; t.window is the number of consecutive observations to be used when estimating the trend-cycle; s.window is the number of consecutive years to be used in estimating each value in the seasonal component. The user must specify s.window as there is no default. Setting it to be infinite is equivalent to forcing the seasonal component to be periodic (i.e., identical across years). Specifying t.window is optional, and a default value will be used if it is omitted.

The mstl() function provides a convenient automated STL decomposition using s.window=13, and t.window also chosen automatically. This usually gives a good balance between overfitting the seasonality and allowing it to slowly change over time. But, as with any automated procedure, the default settings will need adjusting for some time series.
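As a sketch (using the built-in AirPassengers series as example data; the forecast package is assumed to be installed):

```r
library(forecast)

fit <- mstl(AirPassengers)  # automated STL: s.window = 13, t.window chosen automatically
autoplot(fit)               # panels for the data, trend, seasonal and remainder components
```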

20.11 Seasonal Adjustments

If you have a seasonal time series that can be described using an additive model, you can seasonally adjust the time series by estimating the seasonal component, and subtracting the estimated seasonal component from the original time series. We can do this using the estimate of the seasonal component calculated by the decompose() function.

If the seasonal component ($$S_t$$) is removed from the original data, the resulting values are called the seasonally adjusted data. For an additive model, the seasonally adjusted data are given by $$y_t - S_t$$, and for multiplicative data, the seasonally adjusted values are obtained using $$y_t / S_t$$.

# Subtract St from yt to obtain seasonally adjusted time series:
decomposed1 <- decompose(logsouvenirtimeseries)
seasonallyadjusted <- logsouvenirtimeseries - decomposed1$seasonal
plot.ts(seasonallyadjusted)

20.19 Outliers and missing values

The tsoutliers() function in the forecast package identifies outliers in a time series and suggests sensible replacements for them. For example, applied to the gold series it flags a single observation:

tsoutliers(gold)
$index
[1] 770

$replacements
[1] 494.9

Another useful function is tsclean(), which identifies and replaces outliers, and also replaces missing values. Obviously this should be used with some caution, but it does allow us to use various models that are sensitive to outliers, e.g. ARIMA or ETS.

gold %>% tsclean() %>% ets() %>% forecast(h=50) %>% autoplot()

20.19.1 Anomaly detection

Anomalies are essentially the outliers in our data. If something happens regularly, it is not an anomaly but a trend. Things which happen once or twice and deviate from the usual behavior, whether continuously or with lags, are all anomalies. So it all boils down to the definition of outliers for our data.

R provides a lot of packages with different approaches to anomaly detection. We will use the anomalize package in R to understand the concept of anomalies using one such method. The anomalize package is a feature-rich package for performing anomaly detection. It is geared towards time series analysis, which is one of the biggest needs for understanding when anomalies occur.

We are going to use the tidyverse_cran_downloads data set that comes with anomalize. It is a tibbletime object (class tbl_time), which is the object structure that anomalize works with.

tidyverse_cran_downloads
# A time tibble: 6,375 × 3
# Index: date
# Groups: package [15]
   date       count package
   <date>     <dbl> <chr>
 1 2017-01-01   873 tidyr
 2 2017-01-02  1840 tidyr
 3 2017-01-03  2495 tidyr
 4 2017-01-04  2906 tidyr
 5 2017-01-05  2847 tidyr
 6 2017-01-06  2756 tidyr
 7 2017-01-07  1439 tidyr
 8 2017-01-08  1556 tidyr
 9 2017-01-09  3678 tidyr
10 2017-01-10  7086 tidyr
# … with 6,365 more rows

We can use the general workflow for anomaly detection, which involves three main functions:

• time_decompose(): Separates the time series into seasonal, trend, and remainder components.
• anomalize(): Applies anomaly detection methods to the remainder component.
• time_recompose(): Calculates limits that separate the normal data from the anomalies!
tidyverse_cran_downloads_anomalized <- tidyverse_cran_downloads %>%
time_decompose(count, merge = TRUE) %>%
anomalize(remainder) %>%
time_recompose()

tidyverse_cran_downloads_anomalized %>% glimpse()
Rows: 6,375
Columns: 12
Groups: package [15]
$ package       <chr> "broom", "broom", "broom",…
$ date          <date> 2017-01-01, 2017-01-02, 2…
$ count         <dbl> 1053, 1481, 1851, 1947, 19…
$ observed      <dbl> 1.053000e+03, 1.481000e+03…
$ season        <dbl> -1006.9759, 339.6028, 562.…
$ trend         <dbl> 1708.465, 1730.742, 1753.0…
$ remainder     <dbl> 351.510801, -589.344328, -…
$ remainder_l1  <dbl> -1724.778, -1724.778, -172…
$ remainder_l2  <dbl> 1704.371, 1704.371, 1704.3…
$ anomaly       <chr> "No", "No", "No", "No", "N…
$ recomposed_l1 <dbl> -1023.2887, 345.5664, 590.…
$ recomposed_l2 <dbl> 2405.860, 3774.715, 4019.9…

The functions presented above performed a time series decomposition on the count column using seasonal decomposition.

It created four columns:

• observed: The observed values (actuals)

• season: The seasonal or cyclic trend. The default for daily data is a weekly seasonality.

• trend: This is the long term trend. The default is a Loess smoother using spans of 3-months for daily data.

• remainder: This is what we want to analyze for outliers. It is simply the observed minus both the season and trend. Setting merge = TRUE keeps the original data with the newly created columns.

• anomalize(remainder): This performs anomaly detection on the remainder column. It creates three new columns:

• remainder_l1: The lower limit of the remainder

• remainder_l2: The upper limit of the remainder

• anomaly: Yes/No telling us whether or not the observation is an anomaly

• time_recompose(): This recomposes the season, trend and remainder_l1 and remainder_l2 columns into new limits that bound the observed values. The two new columns created are:

• recomposed_l1: The lower bound of outliers around the observed value

• recomposed_l2: The upper bound of outliers around the observed value

We can then visualize the anomalies using the plot_anomalies() function. It is maybe not very clear, because we can see all of the variables (packages downloaded from CRAN) and their detected anomalies:

tidyverse_cran_downloads_anomalized %>%
plot_anomalies(ncol = 3, alpha_dots = 0.25)

If you want to visualize what is happening, now is a good point to try out another plotting function, plot_anomaly_decomposition(). It only works on a single time series so we will need to select just one to review. The season is removing the weekly cyclic seasonality. The trend is smooth, which is desirable to remove the central tendency without overfitting. Finally, the remainder is analyzed for anomalies detecting the most significant outliers.

tidyverse_cran_downloads %>%

# Select a single time series
filter(package == "lubridate") %>%
ungroup() %>%

# Anomalize
time_decompose(count, method = "stl", frequency = "auto", trend = "auto") %>%
anomalize(remainder, method = "iqr", alpha = 0.05, max_anoms = 0.2) %>%

# Plot Anomaly Decomposition
plot_anomaly_decomposition() +
ggtitle("Lubridate Downloads: Anomaly Decomposition")

You can also adjust the parameter settings. The first settings you may wish to explore are related to time series decomposition: trend and seasonality. The second are related to anomaly detection: alpha and max_anoms. Read more about these adjustments in the anomalize package documentation.

There is also a very convenient package available on GitHub (so you will need to install it directly from GitHub) called AnomalyDetection; see its README for details. A tutorial for that library is available on DataCamp and YouTube.