VARMAX models

This is a brief introduction notebook to VARMAX models in Statsmodels. The VARMAX model is generically specified as: $$ y_t = \nu + A_1 y_{t-1} + \dots + A_p y_{t-p} + B x_t + \epsilon_t + M_1 \epsilon_{t-1} + \dots M_q \epsilon_{t-q} $$

where $y_t$ is a $\text{k_endog} \times 1$ vector.

In [1]:
%matplotlib inline
In [2]:
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
In [3]:
dta = sm.datasets.webuse('lutkepohl2', 'http://www.stata-press.com/data/r12/')
dta.index = dta.qtr
endog = dta.ix['1960-04-01':'1978-10-01', ['dln_inv', 'dln_inc', 'dln_consump']]
---------------------------------------------------------------------------
ConnectionRefusedError                    Traceback (most recent call last)
/usr/lib/python3.5/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
   1253             try:
-> 1254                 h.request(req.get_method(), req.selector, req.data, headers)
   1255             except OSError as err: # timeout error

/usr/lib/python3.5/http/client.py in request(self, method, url, body, headers)
   1106         """Send a complete request to the server."""
-> 1107         self._send_request(method, url, body, headers)
   1108 

/usr/lib/python3.5/http/client.py in _send_request(self, method, url, body, headers)
   1151             body = _encode(body, 'body')
-> 1152         self.endheaders(body)
   1153 

/usr/lib/python3.5/http/client.py in endheaders(self, message_body)
   1102             raise CannotSendHeader()
-> 1103         self._send_output(message_body)
   1104 

/usr/lib/python3.5/http/client.py in _send_output(self, message_body)
    933 
--> 934         self.send(msg)
    935         if message_body is not None:

/usr/lib/python3.5/http/client.py in send(self, data)
    876             if self.auto_open:
--> 877                 self.connect()
    878             else:

/usr/lib/python3.5/http/client.py in connect(self)
    848         self.sock = self._create_connection(
--> 849             (self.host,self.port), self.timeout, self.source_address)
    850         self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

/usr/lib/python3.5/socket.py in create_connection(address, timeout, source_address)
    711     if err is not None:
--> 712         raise err
    713     else:

/usr/lib/python3.5/socket.py in create_connection(address, timeout, source_address)
    702                 sock.bind(source_address)
--> 703             sock.connect(sa)
    704             return sock

ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

URLError                                  Traceback (most recent call last)
<ipython-input-3-298f302f03ba> in <module>()
----> 1 dta = sm.datasets.webuse('lutkepohl2', 'http://www.stata-press.com/data/r12/')
      2 dta.index = dta.qtr
      3 endog = dta.ix['1960-04-01':'1978-10-01', ['dln_inv', 'dln_inc', 'dln_consump']]

/home/lernstick_debian9/lernstick/backports/stretch/statsmodels-0.8.0/.pybuild/pythonX.Y_3.5/build/statsmodels/datasets/utils.py in webuse(data, baseurl, as_df)
     47 
     48     url = urljoin(baseurl, data+'.dta')
---> 49     dta = urlopen(url)
     50     dta = BytesIO(dta.read())  # make it truly file-like
     51     if as_df:  # could make this faster if we don't process dta twice?

/usr/lib/python3.5/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    161     else:
    162         opener = _opener
--> 163     return opener.open(url, data, timeout)
    164 
    165 def install_opener(opener):

/usr/lib/python3.5/urllib/request.py in open(self, fullurl, data, timeout)
    464             req = meth(req)
    465 
--> 466         response = self._open(req, data)
    467 
    468         # post-process response

/usr/lib/python3.5/urllib/request.py in _open(self, req, data)
    482         protocol = req.type
    483         result = self._call_chain(self.handle_open, protocol, protocol +
--> 484                                   '_open', req)
    485         if result:
    486             return result

/usr/lib/python3.5/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    442         for handler in handlers:
    443             func = getattr(handler, meth_name)
--> 444             result = func(*args)
    445             if result is not None:
    446                 return result

/usr/lib/python3.5/urllib/request.py in http_open(self, req)
   1280 
   1281     def http_open(self, req):
-> 1282         return self.do_open(http.client.HTTPConnection, req)
   1283 
   1284     http_request = AbstractHTTPHandler.do_request_

/usr/lib/python3.5/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
   1254                 h.request(req.get_method(), req.selector, req.data, headers)
   1255             except OSError as err: # timeout error
-> 1256                 raise URLError(err)
   1257             r = h.getresponse()
   1258         except:

URLError: <urlopen error [Errno 111] Connection refused>

Model specification

The VARMAX class in Statsmodels allows estimation of VAR, VMA, and VARMA models (through the order argument), optionally with a constant term (via the trend argument). Exogenous regressors may also be included (as usual in Statsmodels, by the exog argument), and in this way a time trend may be added. Finally, the class allows measurement error (via the measurement_error argument) and allows specifying either a diagonal or unstructured innovation covariance matrix (via the error_cov_type argument).

Example 1: VAR

Below is a simple VARX(2) model in two endogenous variables and an exogenous series, but no constant term. Notice that we needed to allow for more iterations than the default (which is maxiter=50) in order for the likelihood estimation to converge. This is not unusual in VAR models which have to estimate a large number of parameters, often on a relatively small number of time series: this model, for example, estimates 27 parameters off of 75 observations of 3 variables.

In [4]:
exog = endog['dln_consump']
mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(2,0), trend='nc', exog=exog)
res = mod.fit(maxiter=1000)
print(res.summary())
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-4-40229e5606f7> in <module>()
----> 1 exog = endog['dln_consump']
      2 mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(2,0), trend='nc', exog=exog)
      3 res = mod.fit(maxiter=1000)
      4 print(res.summary())

NameError: name 'endog' is not defined

From the estimated VAR model, we can plot the impulse response functions of the endogenous variables.

In [5]:
ax = res.impulse_responses(10, orthogonalized=True).plot(figsize=(13,3))
ax.set(xlabel='t', title='Responses to a shock to `dln_inv`');
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-5-f1f4897b0cc2> in <module>()
----> 1 ax = res.impulse_responses(10, orthogonalized=True).plot(figsize=(13,3))
      2 ax.set(xlabel='t', title='Responses to a shock to `dln_inv`');

NameError: name 'res' is not defined

Example 2: VMA

A vector moving average model can also be formulated. Below we show a VMA(2) on the same data, but where the innovations to the process are uncorrelated. In this example we leave out the exogenous regressor but now include the constant term.

In [6]:
mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(0,2), error_cov_type='diagonal')
res = mod.fit(maxiter=1000)
print(res.summary())
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-6-2c3b63c31868> in <module>()
----> 1 mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(0,2), error_cov_type='diagonal')
      2 res = mod.fit(maxiter=1000)
      3 print(res.summary())

NameError: name 'endog' is not defined

Caution: VARMA(p,q) specifications

Although the model allows estimating VARMA(p,q) specifications, these models are not identified without additional restrictions on the representation matrices, which are not built-in. For this reason, it is recommended that the user proceed with error (and indeed a warning is issued when these models are specified). Nonetheless, they may in some circumstances provide useful information.

In [7]:
mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(1,1))
res = mod.fit(maxiter=1000)
print(res.summary())
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-7-bf0ad24b127a> in <module>()
----> 1 mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(1,1))
      2 res = mod.fit(maxiter=1000)
      3 print(res.summary())

NameError: name 'endog' is not defined