Introduction
Any metric that is measured over regular time intervals forms a time series.
Time series analysis is commercially important because of industrial need and relevance, especially with respect to forecasting (demand, sales, supply, etc.).
Time series analysis has become a hot topic with the rise of quantitative finance and automated trading of securities. Many of the facilities described in this chapter were invented by practitioners and researchers in finance, securities trading, and portfolio management.
Before you start any time series analysis in R, a key decision is your choice of data representation (object class). This is especially critical in an object-oriented language such as R, because the choice affects more than how the data is stored; it also dictates which functions (methods) will be available for loading, processing, analyzing, printing, and plotting your data. When many people start using R, they simply store time series data in vectors. That seems natural. However, they quickly discover that none of the coolest analytics for time series analysis work with simple vectors. We’ve found that when users switch to an object class intended for time series data, the analysis gets easier, opening a gateway to valuable functions and analytics.
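As a minimal sketch of the difference, the example below stores the same made-up prices first as a plain vector and then as a zoo object. It assumes the zoo package is installed; the data and the variable names (prices, dates, z) are hypothetical.

```r
library(zoo)

# Hypothetical data: five daily closing prices
prices <- c(101.2, 102.5, 101.9, 103.4, 104.0)
dates <- as.Date("2024-01-01") + 0:4

# A plain numeric vector carries no time information
class(prices)

# A zoo object pairs each observation with its date
z <- zoo(prices, order.by = dates)
class(z)

# Time-aware methods are now available
recent <- window(z, start = as.Date("2024-01-03"))  # subset by date
changes <- diff(z)                                   # day-over-day changes
print(recent)
print(changes)
```

The vector can only be indexed by position, whereas the zoo object can be subset by date and keeps each result aligned with its timestamp.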
Note
The xts implementation is a superset of zoo, so xts can do everything that zoo can do. In this chapter, whenever a recipe works for a zoo object, you can safely assume (unless stated otherwise) that it also works for an xts object.
Other Representations
Other representations of time series data are available in the R universe, including:
fts package
irts from the tseries package
timeSeries package
ts (base distribution)
tsibble package, a tidyverse style package for time series
In fact, there is a whole toolkit, called tsbox, just for converting between representations.
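A conversion round trip might look like the sketch below. It assumes the tsbox and xts packages are installed and uses the AirPassengers dataset that ships with R; the variable names are hypothetical.

```r
library(tsbox)

# AirPassengers is a monthly ts object from the base distribution
x_ts <- AirPassengers

# Convert ts to xts, then back again
x_xts <- ts_xts(x_ts)
x_back <- ts_ts(x_xts)

class(x_xts)
class(x_back)
```

The tsbox converters share a common naming pattern (ts_ followed by the target class), so switching the target representation usually means changing only the function name.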
Two representations deserve special mention.
ts (base distribution)
The base distribution of R includes a time series class called ts. We don’t recommend this representation for general use because the implementation itself is too limited and restrictive.
However, the base distribution includes some important time series analytics that depend upon ts, such as the autocorrelation function (acf) and the cross-correlation function (ccf).
To use those base functions on xts data, use the as.ts function to “downshift” your data into the ts representation before calling the function.
For example, if x is an xts object, you can compute its autocorrelation like this:
acf(as.ts(x))
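For a fuller, self-contained sketch of the same idea, the example below builds a small xts object from simulated data and then downshifts it. It assumes the xts package is installed; the random-walk data and the choice of lag.max are arbitrary illustrations.

```r
library(xts)

# Hypothetical data: a simulated random walk over 100 consecutive days
set.seed(42)
dates <- as.Date("2024-01-01") + 0:99
x <- xts(cumsum(rnorm(100)), order.by = dates)

# Downshift to ts, then compute the autocorrelation without plotting
result <- acf(as.ts(x), lag.max = 10, plot = FALSE)

# Autocorrelations at lags 0 through 10; the lag-0 value is always 1
print(result$acf[, 1, 1])
```

Setting plot = FALSE returns the autocorrelation values for inspection instead of drawing the correlogram.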
tsibble package
The tsibble package is a recent extension to the tidyverse, designed specifically for working with time series data. We find it useful for cross-sectional data; that is, data for which the observations are grouped by date, and you want to perform analytics within dates more than across dates.
Date Versus Datetime
Every observation in a time series has an associated date or time. The object classes used in this chapter, zoo and xts, give you the choice of using either dates or datetimes for representing the data’s time component. You would use dates to represent daily data, of course, and also for weekly, monthly, or even annual data; in these cases, the date gives the day on which the observation occurred. You would use datetimes for intraday data, where both the date and time of observation are needed.
In describing this chapter’s recipes, we found it pretty cumbersome to keep saying “date or datetime.” So we simplified the prose by assuming that your data are daily and thus use whole dates. Please bear in mind, of course, that you are free and able to use timestamps below the resolution of a calendar date.
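The two choices can be sketched as follows, assuming the xts package is installed. The values and timestamps are made up for illustration, and the UTC time zone is an arbitrary assumption.

```r
library(xts)

# Daily data: index the observations with Date objects
daily <- xts(c(10, 11, 12),
             order.by = as.Date(c("2024-01-01", "2024-01-02", "2024-01-03")))

# Intraday data: index the observations with POSIXct datetimes
stamps <- as.POSIXct(c("2024-01-01 09:30:00",
                       "2024-01-01 09:31:00",
                       "2024-01-01 09:32:00"), tz = "UTC")
intraday <- xts(c(10.0, 10.2, 10.1), order.by = stamps)

# The class of the index shows which representation each object uses
class(index(daily))
class(index(intraday))
```

Only the class of the index differs; the rest of the object is handled the same way in both cases.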
See Also
R has many useful functions and packages for time series analysis. You’ll find pointers to them in the task view for Time Series Analysis.