Vector Autoregression (VAR) Models
A vector autoregression (VAR) model is a multivariate time series model consisting of a system of n equations that express n distinct, stationary response variables as linear functions of lagged responses and other terms. VAR models are also characterized by their degree p; each equation in a VAR(p) model contains p lags of all variables in the system.
VAR models belong to a class of multivariate linear time series models called vector autoregression moving average (VARMA) models. Although Econometrics Toolbox™ provides functionality to conduct a comprehensive analysis of a VAR(p) model (from model estimation to forecasting and simulation), the toolbox provides limited support for other models in the VARMA class.
In general, multivariate linear time series models are well suited for:
- Modeling the movements of several stationary time series simultaneously.
- Measuring the delayed effects among the response variables in the system.
- Measuring the effects of exogenous series on variables in the system. For example, determine whether the presence of a recently imposed tariff significantly affects several econometric series.
- Generating simultaneous forecasts of the response variables.
Types of Stationary Multivariate Time Series Models
This table contains forms of multivariate linear time series models and describes their supported functionality in Econometrics Toolbox.
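| Model | Abbreviation | Equation | Supported Functionality |
|---|---|---|---|
| Vector autoregression | VAR(p) | $y_t = c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t$ | |
| Vector autoregression with a linear time trend | VAR(p) | $y_t = c + \delta t + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t$ | Represent the model by using a varm model object |
| Vector autoregression with exogenous series | VARX(p) | $y_t = c + \beta x_t + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t$ | Represent the model by using a varm model object |
| Vector moving average | VMA(q) | $y_t = c + \Theta_1 \varepsilon_{t-1} + \dots + \Theta_q \varepsilon_{t-q} + \varepsilon_t$ | |
| Vector autoregression moving average | VARMA(p, q) | $y_t = c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \Theta_1 \varepsilon_{t-1} + \dots + \Theta_q \varepsilon_{t-q} + \varepsilon_t$ | |
| Structural vector autoregression moving average | SVARMA(p, q) | $\Phi_0 y_t = c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \Theta_1 \varepsilon_{t-1} + \dots + \Theta_q \varepsilon_{t-q} + \Theta_0 \varepsilon_t$ | Same support as for VARMA models |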
The following variables appear in the equations:
- $y_t$ is the n-by-1 vector of distinct response time series variables at time t.
- $c$ is an n-by-1 vector of constant offsets in each equation.
- $\Phi_j$ is an n-by-n matrix of AR coefficients, where j = 1,...,p and $\Phi_p$ is not a matrix containing only zeros.
- $x_t$ is an m-by-1 vector of values corresponding to m exogenous variables or predictors. In addition to the lagged responses, exogenous variables are unmodeled inputs to the system. Each exogenous variable appears in all response equations by default.
- $\beta$ is an n-by-m matrix of regression coefficients. Row j contains the coefficients in the equation of response variable j, and column k contains the coefficients of exogenous variable k among all equations.
- $\delta$ is an n-by-1 vector of linear time-trend values.
- $\varepsilon_t$ is an n-by-1 vector of random Gaussian innovations, each with a mean of 0 and collectively an n-by-n covariance matrix $\Sigma$. For t ≠ s, $\varepsilon_t$ and $\varepsilon_s$ are independent.
- $\Theta_k$ is an n-by-n matrix of MA coefficients, where k = 1,...,q and $\Theta_q$ is not a matrix containing only zeros.
- $\Phi_0$ and $\Theta_0$ are the AR and MA structural coefficients, respectively.
Generally, the time series $y_t$ and $x_t$ are observable because you have data representing the series. The values of $c$, $\delta$, $\beta$, and the autoregressive matrices $\Phi_j$ are not always known. You typically want to fit these parameters to your data. See estimate for ways to estimate unknown parameters or to hold some of them fixed at specified values (set equality constraints) during estimation. The innovations $\varepsilon_t$ are not observable in data, but they can be observable in simulations.
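As a worked illustration of the notation, consider a VAR(1) model with n = 2 responses and no trend or exogenous terms. The scalar symbols $\phi_{ik}$ (the (i,k) element of $\Phi_1$), $c_i$, $y_{i,t}$, and $\varepsilon_{i,t}$ are introduced here for the example only. Written out equation by equation, $y_t = c + \Phi_1 y_{t-1} + \varepsilon_t$ becomes:

$$
\begin{aligned}
y_{1,t} &= c_1 + \phi_{11}\, y_{1,t-1} + \phi_{12}\, y_{2,t-1} + \varepsilon_{1,t} \\
y_{2,t} &= c_2 + \phi_{21}\, y_{1,t-1} + \phi_{22}\, y_{2,t-1} + \varepsilon_{2,t}
\end{aligned}
$$

Each equation contains one lag of every response in the system, so the off-diagonal coefficients $\phi_{12}$ and $\phi_{21}$ carry the delayed effects between the two series.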
Lag Operator Representation
In the preceding table, the models are represented in difference-equation notation. Lag operator notation is an equivalent and more succinct representation of the multivariate linear time series equations.
The lag operator $L$ reduces the time index by one unit: $L y_t = y_{t-1}$. The operator $L^j$ reduces the time index by j units: $L^j y_t = y_{t-j}$.
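In lag operator form, the equation for a SVARMAX(p, q) model is:

$$
\left(\Phi_0 - \sum_{j=1}^{p} \Phi_j L^j\right) y_t = c + \beta x_t + \left(\Theta_0 + \sum_{k=1}^{q} \Theta_k L^k\right) \varepsilon_t
$$

The equation is expressed more succinctly in this form:

$$
\Phi(L)\, y_t = c + \beta x_t + \Theta(L)\, \varepsilon_t
$$

where

$$
\Phi(L) = \Phi_0 - \sum_{j=1}^{p} \Phi_j L^j
$$

and

$$
\Theta(L) = \Theta_0 + \sum_{k=1}^{q} \Theta_k L^k
$$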
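The effect of the lag operator on data can be sketched in base MATLAB. This is an illustration only: it assumes the common layout in which row t of a matrix holds the transposed response vector at time t, and the variable names and values are hypothetical.

```matlab
% Hypothetical 6-by-2 data matrix: row t holds the response vector at time t
Y = (1:6)' * [1 10];

j = 2;                       % number of periods to lag

% Align each observation y_t with its j-th lag, L^j y_t = y_{t-j}.
% The first j time points have no lagged counterpart, so they are dropped.
Ycurrent = Y(j+1:end,:);     % y_t     for t = j+1,...,T
Ylagged  = Y(1:end-j,:);     % y_{t-j} for t = j+1,...,T
```

After alignment, row i of Ylagged is the observation that occurred j periods before row i of Ycurrent.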
Stable and Invertible Models
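A multivariate AR polynomial is stable if

$$
\det\left(I_n - \Phi_1 z - \Phi_2 z^2 - \dots - \Phi_p z^p\right) \neq 0 \quad \text{for } |z| \le 1.
$$

With all innovations equal to zero, this condition implies that the VAR process converges to its long-run mean, $(I_n - \Phi_1 - \dots - \Phi_p)^{-1} c$, as t approaches infinity (for more details, see [1], Ch. 2).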
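A multivariate MA polynomial is invertible if

$$
\det\left(I_n + \Theta_1 z + \Theta_2 z^2 + \dots + \Theta_q z^q\right) \neq 0 \quad \text{for } |z| \le 1.
$$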
This condition implies that the pure VAR representation of the VMA process is stable (for more details, see [1], Ch. 11).
A VARMA model is stable if its AR polynomial is stable. Similarly, a VARMA model is invertible if its MA polynomial is invertible.
Models with exogenous inputs (for example, VARMAX models) have no well-defined notion of stability or invertibility. An exogenous input can destabilize a model.
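One way to check the AR stability condition numerically is through the companion matrix of the AR polynomial: the determinant condition holds exactly when every eigenvalue of the companion matrix has modulus less than 1. The following is a minimal base-MATLAB sketch under that equivalence, not a toolbox routine; the coefficient values and variable names are hypothetical.

```matlab
% Hypothetical AR coefficient matrices of a 2-D VAR(2) model
Phi = {[0.5 0.1; 0.2 0.3], [0.1 0; 0 0.1]};
c = [0.5; -0.2];             % hypothetical constant vector

n = size(Phi{1},1);          % number of response series
p = numel(Phi);              % AR polynomial degree

% Companion matrix: AR matrices in the top block row,
% an identity block below that shifts the lags down by one period
F = [cell2mat(Phi); eye(n*(p-1)), zeros(n*(p-1),n)];

% Stable if all eigenvalues lie strictly inside the unit circle
maxModulus = max(abs(eig(F)));
isStableAR = maxModulus < 1;

% For a stable model, the limit of the process with zero innovations
PhiSum = zeros(n);
for j = 1:p
    PhiSum = PhiSum + Phi{j};
end
longRunMean = (eye(n) - PhiSum) \ c;
```

The same check applies to invertibility of an MA polynomial if the companion matrix is built from the negated MA matrices, because the MA polynomial carries plus signs where the AR polynomial carries minus signs.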
Models with Regression Component
Incorporate feedback from exogenous predictors, or study their linear associations with the response series, by including a regression component in a multivariate linear time series model. In order of increasing complexity, applications that use such models include:
- Modeling the effects of an intervention, in which case the exogenous series is an indicator variable.
- Modeling the contemporaneous linear associations between a subset of exogenous series and each response. Applications include CAPM analysis and studying the effects of prices of items on their demand. These applications are examples of seemingly unrelated regression (SUR). For more details, see Implement Seemingly Unrelated Regression and Estimate Capital Asset Pricing Model Using SUR.
- Modeling the linear associations between contemporaneous and lagged exogenous series and the response as part of a distributed lag model. Applications include determining how a change in monetary growth affects real gross domestic product (GDP) and gross national income (GNI).
- Any combination of SUR and the distributed lag model that includes the lagged effects of responses, also known as simultaneous equation models.
The general equation for a VARX(p) model is

$$
y_t = c + \beta x_t + \sum_{j=1}^{p} \Phi_j y_{t-j} + \varepsilon_t
$$

where
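- $x_t$ is an m-by-1 vector of observations from m exogenous variables at time t. The vector $x_t$ can contain lagged exogenous series.
- $\beta$ is an n-by-m matrix of regression coefficients. Row j of $\beta$ contains the regression coefficients in the equation of response series j for all exogenous variables. Column k of $\beta$ contains the regression coefficients among the response series equations for exogenous variable k.

With the regression component expanded, the system is:

$$
\begin{bmatrix} y_{1,t} \\ \vdots \\ y_{n,t} \end{bmatrix}
= c +
\begin{bmatrix} \beta_{11} & \cdots & \beta_{1m} \\ \vdots & \ddots & \vdots \\ \beta_{n1} & \cdots & \beta_{nm} \end{bmatrix}
\begin{bmatrix} x_{1,t} \\ \vdots \\ x_{m,t} \end{bmatrix}
+ \sum_{j=1}^{p} \Phi_j y_{t-j} + \varepsilon_t
$$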
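The regression component can be sketched in base MATLAB. The example below is illustrative only: it builds a hypothetical predictor matrix containing an intervention indicator, an exogenous series, and one lag of that series, then evaluates the regression term for each time point. All names and values are assumptions made for the example, and zeros in the coefficient matrix exclude a predictor from a response equation.

```matlab
T = 8;                               % number of time points
z = (1:T)';                          % hypothetical exogenous series
step = double((1:T)' >= 5);          % intervention indicator, switched on from t = 5
zLag1 = [NaN; z(1:end-1)];           % z_{t-1}; undefined at t = 1

% Predictor matrix: row t holds the transposed m-by-1 vector x_t (m = 3)
X = [step z zLag1];

% Hypothetical n-by-m coefficient matrix (n = 2 responses).
% Row j holds the coefficients in the equation of response j;
% a zero removes that predictor from the equation.
beta = [0.5 0.2 0; -1 0 0.3];

% Regression term beta*x_t for t = 2,...,T (t = 1 is dropped because of the lag);
% row i of the result is the transposed n-by-1 contribution at time i+1
contribution = X(2:end,:) * beta';
```

Here the first response depends on the indicator and the contemporaneous series, while the second depends on the indicator and the lagged series.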
VAR Model Workflow
This workflow describes how to analyze multivariate time series by using Econometrics Toolbox VAR model functionality. If you believe the response series are cointegrated, use VEC model functionality instead (see vecm).
1. Load, preprocess, and partition the data set. For more details, see Multivariate Time Series Data Formats.
2. Create a varm model object that characterizes a VAR model. A varm model object is a MATLAB® variable containing properties that describe the model, such as AR polynomial degree p, response dimensionality n, and coefficient values. varm must be able to infer n and p from your specifications; n and p are not estimable. You can update the lag structure of the AR polynomial after creating a VAR model, but you cannot change n. varm enables you to create these types of models:
   - Fully specified model in which all parameters, including coefficients and the innovations covariance matrix, are numeric values. Create this type of model when economic theory specifies the values of all parameters in the model, or you want to experiment with parameter settings. After creating a fully specified model, you can pass the model to all object functions except estimate.
   - Model template in which n and p are known values, but all coefficients and the innovations covariance matrix are unknown, estimable parameters. Properties corresponding to estimable parameters are composed of NaN values. Pass a model template and data to estimate to obtain an estimated (fully specified) VAR model. Then, you can pass the estimated model to any other object function.
   - Partially specified model template in which some parameters are known, and others are unknown and estimable. If you pass a partially specified model and data to estimate, MATLAB treats the known parameter values as equality constraints during optimization, and estimates the unknown values. A partially specified model is well suited to these tasks:
     - Remove lags from the model by setting the corresponding coefficients to zero.
     - Associate a subset of predictors to a response variable by setting to zero the regression coefficients of predictors you do not want in the response equation.

   For more details, see Create VAR Model.
3. For models with unknown, estimable parameters, fit the model to data. See Fitting Models to Data and estimate.
4. Find an appropriate AR polynomial degree by iterating steps 2 and 3. See Select Appropriate Lag Order.
5. Analyze the fitted model. This step can involve:
   - Determining whether response series Granger-cause other response series in the system (see gctest).
   - Calculating impulse responses, which are forecasts based on an assumed change in an input to a time series.
   - Forecasting from the VAR model by obtaining either minimum mean square error forecasts or Monte Carlo forecasts.
   - Comparing model forecasts to holdout data. For an example, see VAR Model Case Study.
Your application does not have to involve all the steps in this workflow, and you can iterate some of the steps. For example, you might not have any data, but want to simulate responses from a fully specified model.
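The workflow can be sketched end to end as follows. This is a minimal illustration, not a recommended analysis: it requires Econometrics Toolbox, assumes a release in which gctest accepts varm objects, and uses simulated data in place of a real data set. All parameter values and the variable names (TrueMdl, Y, Mdl, EstMdl, YF) are hypothetical.

```matlab
rng(1);  % fix the random seed so the simulated data are reproducible

% Fully specified 2-D VAR(1) model with hypothetical parameter values
c     = [0.5; -0.2];
Phi1  = [0.5 0.1; 0.2 0.3];
Sigma = [1 0.3; 0.3 1];
TrueMdl = varm('Constant',c,'AR',{Phi1},'Covariance',Sigma);

% Stand-in for step 1: simulate 500 observations from the known model
Y = simulate(TrueMdl,500);

% Step 2: model template with n = 2 and p = 1; estimable parameters are NaN
Mdl = varm(2,1);

% Step 3: fit the template to the data and display the estimates
EstMdl = estimate(Mdl,Y);
summarize(EstMdl)

% Step 5: Granger causality tests and 10-period-ahead MMSE forecasts,
% using the observed responses as the presample for forecasting
h  = gctest(EstMdl);
YF = forecast(EstMdl,10,Y);
```

Because the data come from a known model, comparing EstMdl.AR{1} with Phi1 gives a quick sanity check on the fit. With real data, step 4 would repeat the template creation and estimation for several values of p before analysis.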
References
[1] Lütkepohl, H. New Introduction to Multiple Time Series Analysis. Berlin: Springer, 2005.
See Also
Related Topics
- Multivariate Time Series Data Formats
- Vector Autoregression (VAR) Model Creation
- VAR Model Estimation
- Fit VAR Model to Simulated Data
- Fit VAR Model of CPI and Unemployment Rate
- Estimate Capital Asset Pricing Model Using SUR
- VAR Model Forecasting, Simulation, and Analysis
- Forecast VAR Model
- Forecast VAR Model Using Monte Carlo Simulation
- Simulate Responses of Estimated VARX Model
- VAR Model Case Study