Introduction

Any metric that is measured over regular time intervals forms a time series.

Time series analysis is commercially important because industry relies on it, especially for forecasting quantities such as demand, sales, and supply.

Time series analysis has become a hot topic with the rise of quantitative finance and automated trading of securities. Many of the facilities described in this chapter were invented by practitioners and researchers in finance, securities trading, and portfolio management.

Before you start any time series analysis in R, a key decision is your choice of data representation (object class). This is especially critical in an object-oriented language such as R, because the choice affects more than how the data is stored; it also dictates which functions (methods) will be available for loading, processing, analyzing, printing, and plotting your data. When many people start using R, they simply store time series data in vectors. That seems natural. However, they quickly discover that none of the coolest analytics for time series analysis work with simple vectors. We’ve found that when users switch to an object class intended for time series data, the analysis gets easier and a gateway opens to valuable functions and analytics.
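To make the contrast concrete, here is a minimal sketch that stores the same observations first in a bare vector and then in a time series object. It assumes the xts package is installed and uses xts only as one example of a time series class; the values and dates are made up for illustration.

```r
library(xts)

# Five days of made-up observations (illustrative values only)
values <- c(10.2, 10.5, 10.1, 10.8, 11.0)
dates  <- as.Date("2024-01-01") + 0:4

# A bare vector carries no information about when each value was observed
v <- values

# An xts object binds each value to its date
x <- xts(values, order.by = dates)

# Date-aware operations now work, such as selecting a range of dates
subset_x <- x["2024-01-02/2024-01-04"]
print(subset_x)
```

The vector v has no way to answer "what happened between January 2 and January 4?", whereas the xts object can be subset directly by date.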

Note

The xts implementation is a superset of zoo, so xts can do everything that zoo can do. In this chapter, whenever a recipe works for a zoo object, you can safely assume (unless stated otherwise) that it also works for an xts object.
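As a rough illustration of this relationship, the sketch below builds a small xts object and passes it to rollmean, a function from the zoo package. It assumes both packages are installed; the data values are made up.

```r
library(zoo)
library(xts)

# A small xts object with made-up values
dates <- as.Date("2024-01-01") + 0:4
x <- xts(c(1, 2, 4, 8, 16), order.by = dates)

# An xts object also inherits from the zoo class
is_zoo <- inherits(x, "zoo")
print(is_zoo)

# So a zoo function such as rollmean() accepts it directly
ma <- rollmean(x, k = 3)
print(ma)
```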

Other Representations

Other representations of time series data are available in the R universe, including:

  • fts package
  • irts from the tseries package
  • timeSeries package
  • ts (base distribution)
  • tsibble package, a tidyverse style package for time series

In fact, there is a whole toolkit, called tsbox, just for converting between representations.
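As a sketch of what such a conversion can look like, the following example round-trips a series between ts and xts. It assumes the tsbox and xts packages are installed and relies on the tsbox convention of naming converters ts_ plus the target class (here ts_xts and ts_ts); AirPassengers is a ts dataset that ships with R.

```r
library(tsbox)

# AirPassengers is a ts object included with base R
x_ts <- AirPassengers

# Convert to xts, then back to ts
x_xts  <- ts_xts(x_ts)
x_back <- ts_ts(x_xts)

print(class(x_xts))
print(class(x_back))
```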

Two representations deserve special mention.

ts (base distribution)

The base distribution of R includes a time series class called ts. We don’t recommend this representation for general use because the implementation itself is too limited and restrictive.

However, the base distribution includes some important time series analytics that depend upon ts, such as the autocorrelation function (acf) and the cross-correlation function (ccf). To use those base functions on xts data, use the as.ts function to “downshift” your data into the ts representation before calling the function. For example, if x is an xts object, you can compute its autocorrelation like this:

acf(as.ts(x))
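For a self-contained version of the same idea, the sketch below simulates a daily series, stores it as an xts object, and computes its autocorrelation. It assumes the xts package is installed; the random-walk data is purely illustrative, and plot = FALSE is used only so that the result is returned without drawing a chart.

```r
library(xts)

set.seed(42)  # make the simulated data reproducible

# One hundred consecutive days of simulated random-walk data
dates <- as.Date("2024-01-01") + 0:99
x <- xts(cumsum(rnorm(100)), order.by = dates)

# Downshift to ts, then compute the autocorrelation without plotting
result <- acf(as.ts(x), plot = FALSE)

# The first element is the autocorrelation at lag 0, which is always 1
print(result$acf[1])
```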

tsibble package

The tsibble package is a recent extension to the tidyverse, specifically designed for working with time series data within the tidyverse. We find it useful for cross-sectional data—that is, data for which the observations are grouped by date, and you want to perform analytics within dates more than across dates.

Date Versus Datetime

Every observation in a time series has an associated date or time. The object classes used in this chapter, zoo and xts, give you the choice of using either dates or datetimes for representing the data’s time component. You would use dates to represent daily data, of course, and also for weekly, monthly, or even annual data; in these cases, the date gives the day on which the observation occurred. You would use datetimes for intraday data, where both the date and time of observation are needed.
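The difference shows up in the class of the index. The sketch below, which assumes the xts package is installed and uses made-up prices, builds one series indexed by Date and another indexed by POSIXct, the datetime class in base R.

```r
library(xts)

# Daily data: index by Date
daily <- xts(c(101.5, 102.3, 101.9),
             order.by = as.Date(c("2024-01-02", "2024-01-03", "2024-01-04")))

# Intraday data: index by POSIXct, which records both date and time
intraday <- xts(c(101.5, 101.7, 101.6),
                order.by = as.POSIXct(c("2024-01-02 09:30:00",
                                        "2024-01-02 09:31:00",
                                        "2024-01-02 09:32:00"),
                                      tz = "UTC"))

# Inspect the class of each index
print(class(index(daily)))
print(class(index(intraday)))
```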

In describing this chapter’s recipes, we found it pretty cumbersome to keep saying “date or datetime.” So we simplified the prose by assuming that your data are daily and thus use whole dates. Please bear in mind, of course, that you are free and able to use timestamps below the resolution of a calendar date.

See Also

R has many useful functions and packages for time series analysis. You’ll find pointers to them in the CRAN task view for Time Series Analysis.