Forecasting Basics

In time series analysis, there is a crucial distinction between estimation, prediction, and forecasting:

* Estimation: Finding unknown model parameters from the data (e.g., estimating $\hat{\phi}_{1}$).
* Prediction: Calculating the value of the random process within the sample period using the estimated parameters (e.g., the fitted value $\hat{Y}_{t} = \hat{c} + \hat{\phi}_{1}y_{t-1}$, which contains no error term).
* Forecasting: Determining a future value of the random process that has not yet been observed. This uses the fitted model to project values outside the sample period.

Workflow Example:
$Y_{t} = \phi Y_{t-1} + a_{t}$ $\rightarrow$ Estimation of $\hat{\phi}$ $\rightarrow$ Prediction of $\hat{Y}_{t} = \hat{\phi}Y_{t-1}$ $\rightarrow$ Forecasting of $\hat{Y}_{n+1}$.
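
The three steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data are simulated from a zero-mean AR(1) with $\phi = 0.6$, $\hat{\phi}$ is obtained by least squares without an intercept, and all function names are hypothetical helpers introduced here.

```python
import random

def simulate_ar1(phi, n, sigma=1.0, seed=0):
    """Simulate Y_t = phi * Y_{t-1} + a_t with Gaussian shocks, starting from Y_0 = 0."""
    rng = random.Random(seed)
    y, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, sigma)
        y.append(prev)
    return y

def estimate_phi(y):
    """Estimation: least-squares estimate of phi in a zero-mean AR(1)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den

def predict_in_sample(y, phi_hat):
    """Prediction: fitted values phi_hat * y_{t-1} for t = 2, ..., n."""
    return [phi_hat * y[t - 1] for t in range(1, len(y))]

def forecast_one_step(y, phi_hat):
    """Forecasting: value projected for the unobserved time n + 1."""
    return phi_hat * y[-1]

y = simulate_ar1(phi=0.6, n=500)          # illustrative data, not from the text
phi_hat = estimate_phi(y)
fitted = predict_in_sample(y, phi_hat)
y_next = forecast_one_step(y, phi_hat)
print(f"phi_hat = {phi_hat:.3f}, forecast of Y_(n+1) = {y_next:.3f}")
```

The fitted values use only observations inside the sample, while the forecast applies the same estimated coefficient to the last observation to step outside it.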

1. ARMA Model Forecasting

Minimum Mean Squared Error (MSE) Forecasts

We use observed data $\{y_{1}, y_{2}, \dots, y_{n}\}$ to forecast unobserved values $\{y_{n+1}, y_{n+2}, \dots\}$.
* Forecast Origin ($n$): The last observed time point.
* $l$-step Ahead Forecast ($\hat{Y}_{n}(l)$): The forecast for time $n+l$ obtained using the minimum MSE criterion.
* Conditional Expectation: The forecast is calculated as the expected value of the future point given the known history:
$$\hat{Y}_{n}(l) = E(Y_{n+l} \mid Y_{n}, Y_{n-1}, \dots, Y_{1})$$
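
A brief sketch of why the conditional expectation is the minimum MSE forecast, writing $H_{n} = \{Y_{n}, Y_{n-1}, \dots, Y_{1}\}$ for the history (a shorthand introduced here for convenience) and $g$ for any candidate forecast that depends only on $H_{n}$:
$$E\big[(Y_{n+l} - g)^{2} \mid H_{n}\big] = V(Y_{n+l} \mid H_{n}) + \big(E(Y_{n+l} \mid H_{n}) - g\big)^{2}$$
The first term does not depend on $g$, and the second term is smallest (zero) when $g = E(Y_{n+l} \mid H_{n}) = \hat{Y}_{n}(l)$.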

Notation Change

In these derivations, we use $a_{t}$ to represent random shocks (errors) to distinguish them from the forecast error, which is denoted by $e_{n}(l)$.

Random Shock Form

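By expressing the ARMA model in its random shock form (an infinite moving average in the shocks, obtained from the backshift operator polynomials), we can write the future value as:
$$Y_{n+l} = \theta_{0} + \psi(B)a_{n+l} = \theta_{0} + \sum_{i=0}^{\infty} \psi_{i}a_{n+l-i}, \qquad \psi_{0} = 1$$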

When taking the conditional expectation of the shocks $a_{n+j}$ given the history up to time $n$, the following rules apply:
* $j \leq 0$: The shock occurred in the past or present, so it is known and its conditional expectation is $a_{n+j}$ itself.
* $j > 0$: The shock is a future shock; its conditional expectation is $0$.
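
The two rules translate directly into a forecast routine: keep the terms with known shocks and drop the terms with future shocks. The sketch below assumes the $\psi$ weights and the past shocks are already available, and truncates the infinite sum at the shocks supplied. The function name and the numbers are hypothetical, with AR(1) weights $\psi_{i} = \phi^{i}$ used as the example.

```python
def forecast_from_shocks(theta0, psi, shocks, l):
    """l-step forecast: theta0 + sum over i >= l of psi[i] * a_{n+l-i}.

    psi[0] must equal 1; shocks = [a_1, ..., a_n] are the known shocks.
    Future shocks a_{n+1}, ..., a_{n+l} are replaced by their expectation, 0,
    so the sum starts at i = l. The sum is truncated once the available
    psi weights or shocks run out.
    """
    n = len(shocks)
    total = theta0
    for i in range(l, len(psi)):
        t = n + l - i              # time index of the shock paired with psi[i]
        if t < 1:
            break                  # no shocks observed before time 1
        total += psi[i] * shocks[t - 1]
    return total

phi = 0.5                                  # illustrative AR(1) coefficient
psi = [phi ** i for i in range(50)]        # AR(1): psi_i = phi^i
shocks = [0.2, -0.1, 0.4]                  # hypothetical a_1, a_2, a_3

f1 = forecast_from_shocks(0.0, psi, shocks, l=1)
f2 = forecast_from_shocks(0.0, psi, shocks, l=2)
print(f1, f2)
```

For this AR(1) example the 2-step forecast equals $\phi$ times the 1-step forecast, as expected from the recursive form of the model.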

2. Forecast Error Analysis

The forecast error $e_{n}(l)$ is the difference between the actual future value and our forecast:
$$e_{n}(l) = Y_{n+l} - \hat{Y}_{n}(l) = \sum_{i=0}^{l-1} \psi_{i}a_{n+l-i}$$
For any lead time $l > 0$, the expected forecast error $E(e_{n}(l))$ is $0$.
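
The same expression gives the forecast error variance for any lead time. As a sketch, assuming the shocks $a_{t}$ are uncorrelated with common variance $\sigma^{2}_{a}$ (the usual white noise assumption) and $\psi_{0} = 1$, the cross terms vanish and:
$$V(e_{n}(l)) = V\Big(\sum_{i=0}^{l-1} \psi_{i}a_{n+l-i}\Big) = \sigma^{2}_{a}\sum_{i=0}^{l-1} \psi_{i}^{2}$$
Setting $l = 1$ and $l = 2$ recovers the two special cases worked out in the following subsections.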

1-step Ahead Forecast

* Actual: $Y_{n+1} = \theta_{0} + a_{n+1} + \psi_{1}a_{n} + \psi_{2}a_{n-1} + \dots$
* Forecast: $\hat{Y}_{n}(1) = \theta_{0} + \psi_{1}a_{n} + \psi_{2}a_{n-1} + \dots$
* Error: $e_{n}(1) = Y_{n+1} - \hat{Y}_{n}(1) = a_{n+1}$
* Error Variance: $V(e_{n}(1)) = \sigma^{2}_{a}$

2-step Ahead Forecast

* Actual: $Y_{n+2} = \theta_{0} + a_{n+2} + \psi_{1}a_{n+1} + \psi_{2}a_{n} + \dots$
* Forecast: $\hat{Y}_{n}(2) = \theta_{0} + \psi_{2}a_{n} + \psi_{3}a_{n-1} + \dots$
* Error: $e_{n}(2) = Y_{n+2} - \hat{Y}_{n}(2) = a_{n+2} + \psi_{1}a_{n+1}$
* Error Variance: $V(e_{n}(2)) = \sigma^{2}_{a}(1 + \psi_{1}^{2})$
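
The two variance results can be checked numerically. The sketch below assumes an AR(1) with $\phi = 0.5$ and $\sigma^{2}_{a} = 1$ (illustrative values, not from the text), for which $\psi_{i} = \phi^{i}$. It computes the variances from the $\psi$ weights and compares the 2-step case with a small Monte Carlo simulation of $a_{n+2} + \psi_{1}a_{n+1}$. The helper name is hypothetical.

```python
import random

def forecast_error_variance(psi, sigma2, l):
    """sigma2 * (psi_0^2 + ... + psi_{l-1}^2), with psi[0] == 1."""
    return sigma2 * sum(w * w for w in psi[:l])

phi, sigma2 = 0.5, 1.0                     # illustrative values
psi = [phi ** i for i in range(10)]        # AR(1): psi_i = phi^i

var1 = forecast_error_variance(psi, sigma2, l=1)   # sigma2
var2 = forecast_error_variance(psi, sigma2, l=2)   # sigma2 * (1 + psi_1^2)

# Monte Carlo check of the 2-step error e_n(2) = a_{n+2} + psi_1 * a_{n+1}
rng = random.Random(42)
sd = sigma2 ** 0.5
errors = [rng.gauss(0.0, sd) + psi[1] * rng.gauss(0.0, sd) for _ in range(20000)]
sim_var2 = sum(e * e for e in errors) / len(errors)   # the error mean is 0

print(var1, var2, round(sim_var2, 3))
```

Here `var1` is $\sigma^{2}_{a} = 1$ and `var2` is $\sigma^{2}_{a}(1 + \psi_{1}^{2}) = 1.25$; the simulated value should land close to the latter, up to sampling noise.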