Forecasting Basics
In time series analysis, there is a crucial distinction between estimation, prediction, and forecasting:
- Estimation: Focuses on finding unknown model parameters (e.g., estimating \(\hat{\phi}_{1}\) from data).
- Prediction: Computing fitted values of the process within the sample period from the estimated parameters (e.g., \(\hat{Y}_{t} = \hat{c} + \hat{\phi}_{1}y_{t-1}\)). The fitted value contains no error term, since the unobserved shock is replaced by its expected value of zero.
- Forecasting: Determining a future value of the process that has not yet been observed in the sample. This uses the fitted model to project values outside the sample period.
Workflow Example:
\(Y_{t} = \phi Y_{t-1} + a_{t}\) \(\rightarrow\) Estimation of \(\hat{\phi}\) \(\rightarrow\) Prediction of \(\hat{Y}_{t} = \hat{\phi}Y_{t-1}\) \(\rightarrow\) Forecasting of \(\hat{Y}_{n+1}\).
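The workflow above can be sketched in a few lines of NumPy. This is a minimal illustration, not a definitive implementation: the series is simulated, `phi_true` and the sample size are arbitrary choices, and \(\hat{\phi}\) is obtained by simple least squares rather than by any particular estimation method assumed by these notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a zero-mean AR(1): Y_t = phi * Y_{t-1} + a_t
# (phi_true and n are illustrative assumptions)
phi_true, n = 0.6, 500
a = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + a[t]

# Estimation: least-squares estimate of phi from the pairs (y_{t-1}, y_t)
phi_hat = np.dot(y[:-1], y[1:]) / np.dot(y[:-1], y[:-1])

# Prediction: in-sample fitted values for t = 2, ..., n
y_fitted = phi_hat * y[:-1]

# Forecasting: one step beyond the sample, from forecast origin n
y_forecast = phi_hat * y[-1]

print(phi_hat, y_forecast)
```

Only the last step leaves the sample: `y_fitted` is compared against values that were observed, while `y_forecast` targets \(Y_{n+1}\), which was not.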
1. ARMA Model Forecasting
Minimum Mean Squared Error (MSE) Forecasts
We use observed data \(\{y_{1}, y_{2}, \dots, y_{n}\}\) to forecast unobserved values \(\{y_{n+1}, y_{n+2}, \dots\}\).
* Forecast Origin (\(n\)): The last observed time point.
* \(l\)-step Ahead Forecast (\(\hat{Y}_{n}(l)\)): The forecast for time \(n+l\) obtained using the minimum MSE criterion.
* Conditional Expectation: The forecast is calculated as the expected value of the future point given the known history:
\[\hat{Y}_{n}(l) = E(Y_{n+l} \mid Y_{n}, Y_{n-1}, \dots, Y_{1})\]
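A short sketch of why the conditional expectation is the minimum MSE forecast. Here \(H_{n}\) is shorthand for the history \(Y_{n}, Y_{n-1}, \dots, Y_{1}\) (a symbol introduced only for this illustration), and \(g\) is any candidate forecast that depends only on \(H_{n}\):

\[E\big[(Y_{n+l} - g)^{2} \mid H_{n}\big] = V(Y_{n+l} \mid H_{n}) + \big(E(Y_{n+l} \mid H_{n}) - g\big)^{2}\]

The first term does not depend on \(g\), and the second term is zero exactly when \(g = E(Y_{n+l} \mid H_{n})\), so no other forecast based on the same history has a smaller mean squared error.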
Notation Change
In these derivations, we use \(a_{t}\) to represent random shocks (errors) to distinguish them from the forecast error, which is denoted by \(e_{t}\).
Random Shock Form
By expressing the ARMA model in its random shock form (using the backshift operator polynomials), we can write the future value as:
\[Y_{n+l} = \theta_{0} + \psi(B)a_{n+l} = \theta_{0} + \sum_{i=0}^{\infty} \psi_{i}a_{n+l-i}, \qquad \psi_{0} = 1\]
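When taking the conditional expectation of a shock \(a_{n+j}\) given the data up to the forecast origin \(n\), the following rules apply:
* \(j \leq 0\): The shock occurred in the past or present, so it is treated as known and its conditional expectation is its realized value \(a_{n+j}\).
* \(j > 0\): The shock lies in the future, so its conditional expectation is \(0\).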
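The two rules above can be turned into a small computation. The sketch below is illustrative only: the function names `psi_weights` and `forecast_from_shocks` are hypothetical, the sign convention \(\phi(B) = 1 - \phi_{1}B - \dots\) and \(\theta(B) = 1 - \theta_{1}B - \dots\) is an assumption (flip the sign of `theta` if the model is written with plus signs), and the infinite sum is truncated at the number of weights supplied.

```python
import numpy as np

def psi_weights(phi, theta, n_weights):
    """Return psi_0, ..., psi_{n_weights-1} for phi(B) Y_t = theta(B) a_t.

    Assumes phi(B) = 1 - phi_1 B - ... and theta(B) = 1 - theta_1 B - ...
    """
    psi = np.zeros(n_weights)
    psi[0] = 1.0
    for j in range(1, n_weights):
        # AR recursion: sum of phi_i * psi_{j-i} over available lags
        ar_part = sum(phi[i - 1] * psi[j - i]
                      for i in range(1, min(j, len(phi)) + 1))
        # MA contribution: theta_j, or zero beyond the MA order
        ma_part = theta[j - 1] if j <= len(theta) else 0.0
        psi[j] = ar_part - ma_part
    return psi

def forecast_from_shocks(psi, shocks, lead, theta0=0.0):
    """Truncated l-step forecast from past shocks.

    shocks = [a_n, a_{n-1}, ...], most recent first. Future shocks are
    replaced by their expected value 0, so only psi_lead, psi_{lead+1}, ...
    contribute; psi_{lead+k} multiplies a_{n-k}.
    """
    k_max = min(len(shocks), len(psi) - lead)
    return theta0 + sum(psi[lead + k] * shocks[k] for k in range(k_max))

# Illustrative ARMA(1,1) with phi_1 = 0.5, theta_1 = 0.3 (made-up values)
psi = psi_weights([0.5], [0.3], 4)
one_step = forecast_from_shocks(psi, [1.0, 2.0], lead=1)
print(psi, one_step)
```

For this ARMA(1,1) the weights follow \(\psi_{j} = (\phi_{1} - \theta_{1})\phi_{1}^{j-1}\) for \(j \geq 1\), which gives a quick way to check the recursion by hand.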
2. Forecast Error Analysis
The forecast error \(e_{n}(l)\) is the difference between the actual future value and our forecast:
\[e_{n}(l) = Y_{n+l} - \hat{Y}_{n}(l) = \sum_{i=0}^{l-1} \psi_{i}a_{n+l-i}\]
For any lead time \(l > 0\), every shock in this sum is a future shock with mean zero, so the expected forecast error is \(E(e_{n}(l)) = 0\); that is, the forecast is unbiased.
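If the shocks are uncorrelated with common variance \(\sigma^{2}_{a}\), the sum above also implies \(V(e_{n}(l)) = \sigma^{2}_{a}\sum_{i=0}^{l-1}\psi_{i}^{2}\) with \(\psi_{0} = 1\). A minimal sketch of that calculation follows; the function name `forecast_error_variance` and the example weights are hypothetical.

```python
import numpy as np

def forecast_error_variance(psi, lead, sigma2_a=1.0):
    """Variance of the lead-step forecast error.

    psi      : sequence [psi_0, psi_1, ...] with psi_0 = 1
    lead     : forecast lead time l (positive integer)
    sigma2_a : variance of the random shocks a_t
    """
    psi = np.asarray(psi, dtype=float)
    if lead < 1 or lead > len(psi):
        raise ValueError("lead must be between 1 and len(psi)")
    # Only the first `lead` psi-weights enter: psi_0, ..., psi_{lead-1}
    return sigma2_a * float(np.sum(psi[:lead] ** 2))

# Illustrative weights (made up for the example)
psi = [1.0, 0.2, 0.1]
print(forecast_error_variance(psi, 1))  # sigma2_a * 1
print(forecast_error_variance(psi, 2))  # sigma2_a * (1 + psi_1**2)
```

Because each extra step adds a non-negative term \(\psi_{l-1}^{2}\), the variance never decreases as the lead time grows.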
1-step Ahead Forecast
- Actual: \(Y_{n+1} = \theta_{0} + a_{n+1} + \psi_{1}a_{n} + \psi_{2}a_{n-1} + \dots\)
- Forecast: \(\hat{Y}_{n}(1) = \theta_{0} + \psi_{1}a_{n} + \psi_{2}a_{n-1} + \dots\)
- Error: \(e_{n}(1) = Y_{n+1} - \hat{Y}_{n}(1) = a_{n+1}\)
- Error Variance: \(V(e_{n}(1)) = \sigma^{2}_{a}\)
2-step Ahead Forecast
- Actual: \(Y_{n+2} = \theta_{0} + a_{n+2} + \psi_{1}a_{n+1} + \psi_{2}a_{n} + \dots\)
- Forecast: \(\hat{Y}_{n}(2) = \theta_{0} + \psi_{2}a_{n} + \psi_{3}a_{n-1} + \dots\)
- Error: \(e_{n}(2) = Y_{n+2} - \hat{Y}_{n}(2) = a_{n+2} + \psi_{1}a_{n+1}\)
- Error Variance: \(V(e_{n}(2)) = \sigma^{2}_{a}(1 + \psi_{1}^{2})\)
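The 1-step and 2-step results can be checked numerically. The sketch below is a Monte Carlo illustration for a zero-mean AR(1) with known \(\phi\), for which \(\psi_{1} = \phi\); the parameter values, the number of replications, and the seed are arbitrary assumptions, and parameter estimation error is ignored.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative values, not taken from the text
phi, sigma_a, n_rep = 0.5, 1.0, 200_000

# Value at the forecast origin, drawn from the stationary AR(1) distribution
y_n = rng.normal(scale=sigma_a / np.sqrt(1 - phi**2), size=n_rep)

# Two future shocks per replication
a1 = rng.normal(scale=sigma_a, size=n_rep)  # a_{n+1}
a2 = rng.normal(scale=sigma_a, size=n_rep)  # a_{n+2}

# Actual future values
y_n1 = phi * y_n + a1
y_n2 = phi * y_n1 + a2

# Forecast errors using the minimum MSE forecasts phi*y_n and phi^2*y_n
e1 = y_n1 - phi * y_n      # equals a_{n+1}
e2 = y_n2 - phi**2 * y_n   # equals a_{n+2} + phi * a_{n+1}

print(e1.mean(), e1.var())  # close to 0 and sigma_a^2
print(e2.mean(), e2.var())  # close to 0 and sigma_a^2 * (1 + phi^2)
```

With these settings the theoretical variances are \(\sigma^{2}_{a} = 1\) and \(\sigma^{2}_{a}(1 + \psi_{1}^{2}) = 1.25\); the simulated values should sit near them up to sampling noise.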