Cointegration & Error Correction Models¶
Some Basics¶
Linear Combination¶
For example, $2x+3y$ is a first-degree (linear) combination of the variables $x$ and $y$.
$$ Z = \ell_{1}X_{1} + \ell_{2}X_{2} + \dots + \ell_{k}X_{k} $$ where $\ell_{i}\in \mathbf{R}$ and $X_{1},X_{2},\dots X_{k}$ are variables.
ADL - Autoregressive Distributed Lag Model¶
AI Generated
The Autoregressive Distributed Lag (ADL) Model is a single-equation time-series model used to describe how a dependent variable ($Y_t$) is influenced by its own past values and the current and past values of one or more independent variables ($X_t$).
The ADL Model Structure¶
The general form of an ADL model with $p$ lags of $Y$ and $q$ lags of a single $X$ variable is denoted as ADL($p$, $q$):
$$Y_t = \alpha_0 + \sum_{i=1}^{p} \gamma_i Y_{t-i} + \sum_{j=0}^{q} \delta_j X_{t-j} + u_t$$
Where:
- $Y_t$ is the dependent variable.
- $Y_{t-i}$ terms (the $\gamma$ coefficients) are the Autoregressive (AR) component, capturing the influence of the dependent variable's own past.
- $X_{t-j}$ terms (the $\delta$ coefficients) are the Distributed Lag (DL) component, capturing the influence of the independent variable's current and past values.
- $\delta_0$ captures the impact multiplier (the instantaneous effect of a change in $X_t$ on $Y_t$).
- $u_t$ is the error term (usually assumed to be white noise).
The ADL model is a dynamic model because the effects of changes in $X_t$ are "distributed" over future periods. It is also the model from which the Error Correction Model (ECM) is derived through algebraic reparameterization.
Linear combination of Integrated Variables¶
Let $\{ x_{t} \}$ and $\{ y_{t} \}$ be two time-series variables (i.e. $m=2$ in the VAR setting).
$$ \begin{align*} \{ x_{t} \} & \sim I(1) \\ \{ y_{t} \} & \sim I(1) \\ \end{align*} $$
Let $Z_{t}$ be the linear combination of $x_{t}$ and $y_{t}$ such that,
$$ Z_t = \alpha x_{t} + \beta y_{t} $$
where $\alpha$ and $\beta$ are constants.
If $Z_{t} \sim I(0)$ then we say that $x_{t}$ and $y_{t}$ are cointegrated.
Originally introduced by Granger, and extended by Granger and Weiss and by Engle & Granger.
In levels, both $x$ and $y$ are non-stationary. But if a linear combination of the two is stationary in levels, then we say that $x$ and $y$ are cointegrated, i.e. $C.I.(1,1)$.
- Why $I(1)$? Almost all economic variables are $I(1)$.
$$ \left.\begin{matrix} I(d) \\ I(d) \end{matrix} \right\} \to C.I.(d,b) \iff Z_{t} \sim I(d-b) $$ where $b \gt 0$.
- What if the orders of integration are different for $X$ and $Y$?
    - Cointegration can still be handled, but not in the usual manner
    - It is mathematically more complex
    - This is cointegration where the linear combination is integrated at a lower order.
ARDL approach¶
- Now allow for more than two variables
Suppose,
$$ Y_{t} = \{ Y_{1t}\ Y_{2t}\ \dots\ Y_{nt} \} \sim I(d) $$ $$ Z = \beta_{0}+\beta_{1} Y_{1t} + \beta_{2}Y_{2t} +\dots+\beta_{n} Y_{nt} \sim I(d-b) $$
is a linear combination of the variables in $Y_{t}$.
- Then we say the system is cointegrated of order $(d,b)$, i.e. $C.I.(d,b)$, with $n$ variables at time $t$.
- Here $\beta = (\beta_{0},\beta_{1},\dots,\beta_{n})$ is the cointegrating vector
If there exists a cointegrating vector $\beta$ whose linear combination with $y_{t}$ follows $I(d-b)$, where $y_{t}$ is $I(d)$ and $b\gt 0$, then the variables in the vector $y_{t}$ are cointegrated.
- If $\beta_{i}=0$, then $y_{i}$ does not enter the cointegrating relationship
- There can be multiple combinations of cointegration equations.
Cointegration and common trends¶
Macro-economic Example¶
"Permanent Income Hypothesis" given by Friedman (1957)¶
$$ C(t) = C_{P}(t) + C_{T}(t) $$
where,
- $C_{T}:$ Transitory consumption, i.e. short term consumption
- $C_{P}:$ Permanent consumption, i.e. steady consumption
$$ C(t) = \beta Y_{P}(t) + C_{T}(t) $$ where,
- $Y_{P}$ is permanent income
Then, $C_{P}(t) = \beta Y_{P}(t)$ i.e., consumption is proportional to income.
- $\beta$ is the Marginal Propensity to Consume (MPC): mathematically it could be any number, but economically $0 \lt \beta \lt 1$.
Thus, we can interpret the equation as:
- $C(t)$ and $\beta Y_{P}(t)$ are in long-run equilibrium or have a long-run relationship
- $C_{T}(t)$ are short-term fluctuations
Purchasing Power Parity¶
The exchange rate $e(t)$ can be explained as
$$ e(t) = p(t) - p^*(t) + q(t) $$ where,
NOTE: Just for your understanding, no need to explain this.
| Variable | Description | Meaning |
|---|---|---|
| $p(t)$ | Log of domestic price level | The natural logarithm of the price index (e.g. the Consumer Price Index, CPI) for a basket of goods in the domestic (home) country. |
| $p^*(t)$ | Log of foreign price level | The natural logarithm of the price index for the same basket of goods in the foreign ("world") country. |
| $e(t)$ | Log of nominal exchange rate | The natural logarithm of the nominal exchange rate (defined as domestic currency units per one unit of foreign currency, e.g. $\ln(\text{INR}/\text{USD})$). |
| $q(t)$ | Deviation / error term | The real exchange rate, i.e. the deviation from PPP equilibrium. This is the term being estimated in the regression. |
- Thus, there is long run equilibrium between $e(t)$ and $p(t)- p^*(t)$, the PPP $\to$ Cointegration!
Money Supply¶
$$ M_{0} = C_{t} + r_{t} +roi $$
CAPM¶
$$ R_{i} = \beta_{i} R_{m} + \epsilon_{i} $$
Testing for Cointegration: Engle-Granger methodology¶
- Engle-Granger Test
    - Simplest of all
    - The oldest cointegration test
- Cointegrating Regression Durbin-Watson (CRDW) Test
- Johansen's Cointegration test
    - The most recent
    - A bit more complex
    - Allows for multiple variables
    - Based on MLE estimation
    - Full Information Maximum Likelihood (FIML)
Engle Granger¶
- Works for only two variables (particular case)
- Based on OLS (and its assumptions)
NOTE: Johansen's test is better than this, as it is based on ML and doesn't require the OLS assumptions to be met.
Steps¶
- Write the cointegrating regression and run it by OLS $$ Y_{t} = \beta_{0} + \beta_{1} X_{t} + \epsilon_{t} $$
to obtain the estimated coefficients and fitted values
$$ \hat{Y}_{t} = \hat{\beta}_{0} +\hat{\beta}_{1}X_{t} $$
and thus obtain the fitted residuals (i.e. errors)
$$ \hat{\epsilon}_{t} = Y_{t} - \hat{Y}_{t} $$
- Perform DF test on the errors
- If they are stationary, cointegration exists! (i.e. the residuals are $I(0)$)
- If errors are non-stationary, cointegration doesn't exist.
If the residuals fluctuate around zero without trending, i.e. $\hat{\epsilon}_{t} \sim I(0)$ $\implies$ Cointegration
However, know that limitations exist.
- OLS cannot generate efficient estimates in general. (Efficient = smallest possible variance)
- The assumptions can be violated
- Estimation challenges in hypothesis testing: the usual apparatus of null and alternative hypotheses ($H_{0}$, $H_{1}$), the $t\text{-statistic} = \dfrac{\hat{\beta}- E(\hat{\beta})}{SE(\hat{\beta})}$ computed from OLS estimates, the standard critical values, and the resulting inferences may not be reliable when the regressors are non-stationary.
Owing to these limitations, most people prefer Johansen's test, which is mathematically more robust.
Johansen Cointegration Model¶
Consider a $n\text{-}$variable VAR model with $p$ lags
$$ \begin{equation} y_{t} = \pi_{1}y_{t-1}+ \pi_{2}y_{t-2} +\dots+ \pi_{p}y_{t-p} + \mathcal{U}_{t} \end{equation} $$
Under VAR, $y_{t}$ is an $n\times 1$ vector of variables, each following $I(1)$ (for simplicity)
- $\pi_{1},\dots, \pi_{p}$ are $n\times n$ coefficient matrices
- $\mathcal{U}_{t}$ is an $n\times1$ error term, also known as the innovation (in the multivariate case)
Reparametrizing (1): subtracting $y_{t-1}$ from both sides and rearranging gives
$$ \Delta y_{t} = \Gamma_{1}\Delta y_{t-1} + \Gamma_{2} \Delta y_{t-2}+\dots+\Gamma_{p-1} \Delta y_{t-p+1} - \Pi y_{t-p} + \mathcal{U}_{t} $$
Defining the Gammas:
- $\Gamma_{1} = \pi_{1} - I$
- $\Gamma_{i} = \left(\sum_{k=1}^{i}\pi_{k}\right) - I$ for $2 \leq i \leq p-1$
- $\Pi = I - (\pi_{1}+\pi_{2}+\dots+\pi_{p})$
The matrix $\Pi$ is called the impact matrix; its rank determines the extent to which the system is cointegrated, i.e. the number of cointegrating vectors.
Using MLE and assumptions
$$ \mathcal{U}_{t} \sim N(\mathbf{0}, \Sigma) $$
where,
- $\mathbf{0}$ is a zero vector
- $\Sigma$ is the variance covariance matrix.
Trace Test¶
$$ \begin{align} H_{0} & : r \text{ cointegrating vectors} \\ H_{1} & : n \text{ cointegrating vectors} \end{align} $$ where,
- $r \lt n$
- $H_{0}$ $\implies$ some vectors are not cointegrating
- $H_{1}$ $\implies$ all vectors are cointegrating
The test statistic is given by
$$ \mathcal{J}_{Trace} = -T \sum_{i=r+1}^n \ln (1-\hat{\lambda}_{i}) $$
Max Eigen Values Test¶
$$ \begin{align} H_{0} & : r \text{ cointegrating vectors} \\ H_{1} & : (r+1) \text{ cointegrating vectors} \end{align} $$
$$ \mathcal{J}_{max} = - T \ln(1- \hat{\lambda}_{r+1}) $$
where,
- $T$ is the sample size
- $\hat{\lambda}_{r+1}$ is the $(r+1)\text{-th}$ largest eigenvalue (a squared canonical correlation, i.e. correlation between two sets of variables)
Then continue with the standard procedure
- Get critical values for [[#Trace Test]] and [[#Max Eigen Values Test]]
- Compare with calculated values
- Accept or Reject
Limitations¶
- If the variables are not of the same order of integration, we cannot use these tests. $\implies$ Go for the ARDL (bounds test) approach.
- In the case of shocks (Intervention analysis), these tests are not robust.
- Large sample size is required
- It is sensitive to lag-length selection.
Cointegration and Error Correction¶
Cointegration establishes a long-run equilibrium. In practice the exact point of equilibrium cannot be pinpointed; the equilibrium point is dynamic (ever-changing).
Thus, the equilibrium relationship is dynamic in nature.
Take money demand ($M_{t}$) and the price level ($P_{t}$), for example
$$ M_{t} = \beta_{0} +\beta_{1}P_{t} + \epsilon_{t} $$
- $\epsilon_{t}$ is the theoretical error.
- $\hat{\epsilon}_{t} = M_{t} - \hat{\beta}_{0} - \hat{\beta}_{1}P_{t}$ is the estimated error.
If the errors are found stationary using the DF or ADF test, then cointegration exists.
There exists a dynamic relationship between the two.
How do we obtain more accurate forecasts of $M_{t}$, knowing the two are cointegrated?
- We study the deviations between the two
- We study the time taken to return to equilibrium: what is the speed of adjustment?
We can also observe that
- the larger the deviation, the larger the correction pushing the system back toward equilibrium
- and vice-versa
Granger Representation Theorem¶
Theorem: If $X_{t}$ and $Y_{t}$ are cointegrated, then there exists a dynamic (short-run) error-correction representation of their relationship.
This phenomenon is called the Error Correction Mechanism (ECM), which is an extension of cointegration (it can also be called the equilibrium correction mechanism). So,
- ECM $\to$ The short-run relationship
- C.I. $\to$ The long-run relationship
- If $X_{t}$ and $Y_{t}$ are not cointegrated $\implies$ ECM doesn't exist (naturally)
- If multiple variables are involved in the Johansen's Cointegration test, then ECM is called VECM (Vector Error Correction Mechanism)
Two ways to represent short-run equilibrium
- Error terms
- Lag terms
Take a cointegrating regression, which shows the long-run relationship,
$$ Y_{t} = \beta_{0}+\beta_{1} X_{t} + \epsilon_{t} $$
Subtracting $Y_{t-1}$ from both sides and rearranging into difference-equation form eventually gives
$$ \Delta Y_{t} = \underbrace{\alpha_{0} +\alpha_{1} \Delta X_{t}}_{\text{Short-run Dynamics}} + \underbrace{\gamma_{0} (Y_{t-1} - \beta_{0} - \beta_{1}X_{t-1})}_{\text{Error Correction Term}} + u_{t} $$
This is useful for more accurate predictions which involves both the long-run and short-run equilibrium.
Another form, using the lagged fitted residual $\hat{u}_{t-1}$ from the cointegrating regression:
$$ \Delta Y_{t} = \alpha_{0} + \alpha_{1}\Delta X_{t} + \gamma_{0}(\hat{u}_{t-1}) + u_{t} $$
Where, $$ \gamma_{0} \to\begin{cases} \text{+ve} & \text{moving away from equilibrium} \\ \text{-ve} & \text{pushing back to equilibrium when deviating} \end{cases} $$
$\gamma_{0}$ represents the speed of adjustment back to the equilibrium level from the disequilibrium position.
Granger Causality Test¶
It is a statistical hypothesis test used to determine if one time series is useful in forecasting another.
Note that "causality" in this context refers to the predictive ability based on past values.
A variable $X$ is said to Granger-cause another variable $Y$
- if past values of $X$ provide statistically significant information to help forecast $Y$
- beyond the information provided by past values of $Y$ alone.
Types of Granger Causality¶
- Unidirectional Causality (One-way)
- $X\to Y$
- The reverse is not true i.e., $Y \not \to X$
- Bidirectional Causality (Two-way)
- $X \leftrightarrow Y$
- Both variables are simultaneously influencing each other
The Hypothesis:
$$ \begin{matrix} H_{0,1}: & X \text{ doesn't Granger-cause } Y & H_{a,1}: X\text{ Granger-causes }Y\\ H_{0,2}: & Y \text{ doesn't Granger-cause } X & H_{a,2}: Y\text{ Granger-causes }X\\ \end{matrix} $$
- If both get rejected, then there is bidirectional causality.
Method (Using $F\text{-test}$)¶
- Ensure the time-series data, $X$ and $Y$ are stationary.
- If non-stationary, difference them to make them stationary
- Or use VECM, if they are cointegrated
- For $Y$, run two regressions
- Unrestricted (Full) Model: $Y_{t} = \alpha +\sum_{i=1}^p \beta_{i}Y_{t-i}+\sum_{i=1}^p \delta_{i} X_{t-i}+\epsilon_{t}$
- Restricted Model: $Y_{t} = \alpha +\sum_{i=1}^p \beta_{i}Y_{t-i}+\epsilon_{t}$
- Test for $X \to Y$
- The test checks whether the lagged $X$ terms (the distributed-lag part) significantly improve the model's fit
- The $F\text{-statistic}$ compares the SSR of the restricted model to the SSR of the unrestricted model.
- $F\text{-statistic}$ is large ($p\text{-value} \lt 0.05)$ $\implies$ Reject null hypothesis that all $\delta_{i}=0$
- Repeat for $Y\to X$