Non-stationary Time Series

Real-world time series usually exhibit trend, seasonality, or cyclical behavior, so their statistical properties (such as the mean) change over time and the series is non-stationary. To analyze such data with stationary models, we first transform it into a stationary series using the operators below.

1. Mathematical Operators

Two primary operators are used to manipulate and stabilize time series data:

Backshift Operator ($B$): Shifts an observation back by one time unit, $BY_{t} = Y_{t-1}$. Applying it $d$ times shifts the observation back by $d$ time units:
$$B^dY_{t} = Y_{t-d}$$
Differencing Operator ($\nabla$): A technique popularized by Box & Jenkins to remove non-stationary components.
- First Difference: $\nabla Y_{t} = Y_{t} - Y_{t-1} = (1-B)Y_{t}$
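- Second Difference: $\nabla^2 Y_{t} = \nabla(\nabla Y_{t}) = (1-B)^{2}Y_{t} = Y_{t} - 2Y_{t-1} + Y_{t-2}$
- Lag $d$ Difference: $\nabla_{d}Y_{t} = Y_{t} - Y_{t-d} = (1-B^{d})Y_{t}$ (used primarily for removing seasonality). Note that $\nabla_{d}$ is not the same as the $d$-th difference $\nabla^{d} = (1-B)^{d}$.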
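The operators above map directly onto array slicing. The following is a minimal sketch using NumPy; the helper names `backshift` and `lag_difference` are illustrative choices, not part of any library, and the toy series is arbitrary.

```python
import numpy as np

def backshift(y, d=1):
    """Return B^d y: position t holds y[t-d]; the first d entries are NaN (undefined)."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    y = np.asarray(y, dtype=float)
    out = np.full_like(y, np.nan)
    out[d:] = y[:-d]
    return out

def lag_difference(y, d=1):
    """Return (1 - B^d) y = y[t] - y[t-d], dropping the first d undefined entries."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    y = np.asarray(y, dtype=float)
    return y[d:] - y[:-d]

y = np.array([1.0, 4.0, 9.0, 16.0, 25.0, 36.0])  # toy series: squares of 1..6

first = lag_difference(y, 1)       # first difference:  [3, 5, 7, 9, 11]
second = lag_difference(first, 1)  # second difference: [2, 2, 2, 2]
lag3 = lag_difference(y, 3)        # lag-3 difference:  [15, 21, 27]

print(first, second, lag3)
```

For lag 1, `np.diff(y, n=k)` computes the same $k$-th difference; the explicit slicing is shown only to make the correspondence with $(1-B^{d})$ visible.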

2. Eliminating Trends through Differencing

Example 1: Linear Trend

Consider a model with a linear trend, $Y_{t} = bt + S_{t}$, where $S_{t}$ is a stationary process. The series is non-stationary because its mean depends on $t$ through the term $bt$. Applying a first difference:
$$\nabla Y_{t} = (bt + S_{t}) - (b(t-1) + S_{t-1}) = b + S_{t} - S_{t-1}$$
The differenced series is stationary: the term in $t$ has cancelled, leaving the constant $b$ plus the difference of a stationary process.
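As a quick numerical check of this example, the sketch below simulates $Y_{t} = bt + S_{t}$ and compares means before and after differencing. The slope, series length, seed, and the choice of i.i.d. Gaussian noise as the stationary process $S_{t}$ are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
n, b = 500, 0.3                  # illustrative length and slope, not from the text
t = np.arange(n)
S = rng.normal(0.0, 1.0, n)      # stand-in stationary process: i.i.d. Gaussian noise
Y = b * t + S

dY = np.diff(Y)                  # first difference: b + S_t - S_{t-1}

# Compare the mean over the first and second half of each series.
half = n // 2
y_first, y_second = Y[:half].mean(), Y[half:].mean()
d_first, d_second = dY[:half].mean(), dY[half:].mean()

print("mean of Y, first vs second half: ", y_first, y_second)
print("mean of dY, first vs second half:", d_first, d_second)
```

The mean of `Y` drifts between the two halves, while the mean of `dY` stays close to `b` in both.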

Example 2: Quadratic Trend

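For a stronger trend such as $Y_{t} = bt^{2} + S_{t}$, a single difference is insufficient: $\nabla Y_{t} = b(2t-1) + S_{t} - S_{t-1}$ still contains a term that is linear in $t$. We therefore apply the second difference operator:
$$W_{t} = \nabla^{2}Y_{t} = Y_{t} - 2Y_{t-1} + Y_{t-2}$$
Substituting the model, the trend terms contribute $b\,[t^{2} - 2(t-1)^{2} + (t-2)^{2}] = 2b$, so:
$$W_{t} = 2b + S_{t} - 2S_{t-1} + S_{t-2}$$
The quadratic growth is removed, and $W_{t}$ is stationary whenever $S_{t}$ is.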
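The cancellation can be verified numerically on the deterministic part alone. This is a small sketch with an arbitrary coefficient `b`; the noise term is omitted so that the result is exact.

```python
import numpy as np

b = 0.5                      # illustrative coefficient, not from the text
t = np.arange(10)
trend = b * t**2             # deterministic quadratic trend

d1 = np.diff(trend)          # first difference: b(2t - 1), still linear in t
d2 = np.diff(trend, n=2)     # second difference: constant 2b

print(d1)
print(d2)
```

Here `d1` keeps increasing with `t`, whereas every entry of `d2` equals `2 * b`.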

3. Random Walk as a Non-Stationary Process

The Random Walk model is $Y_{t} = Y_{t-1} + e_{t}$. If viewed as an $AR(1)$ process:
$$Y_{t} = c + \phi_{1} Y_{t-1} + e_{t}$$
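Here, $c=0$ and $\phi_{1} = 1$. An $AR(1)$ process is stationary only if $|\phi_{1}| < 1$. Because $\phi_{1} = 1$ (a unit root), the random walk is non-stationary: starting from $Y_{0} = 0$ with white-noise shocks of variance $\sigma_{e}^{2}$, we get $\mathrm{Var}(Y_{t}) = t\sigma_{e}^{2}$, which grows with $t$. Its first difference, $\nabla Y_{t} = e_{t}$, is stationary.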
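A simulation makes the growing variance of a random walk visible. The sketch below assumes Gaussian white noise with unit variance and a start at $Y_{0} = 0$; the number of paths, the horizon, and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)                      # arbitrary seed
n_paths, n_steps = 5000, 400                        # illustrative sizes
e = rng.normal(0.0, 1.0, size=(n_paths, n_steps))   # white noise, variance 1

# Each row is one random walk: Y_t = Y_{t-1} + e_t with Y_0 = 0.
Y = np.cumsum(e, axis=1)

var_100 = Y[:, 99].var()    # variance across paths after 100 steps
var_400 = Y[:, 399].var()   # variance across paths after 400 steps
print(var_100, var_400)

# Differencing each path recovers the stationary shocks e_t.
dY = np.diff(Y, axis=1)
```

Under these assumptions the variance across paths is roughly proportional to the number of steps, so it has no fixed value as a stationary process would require.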

4. Seasonal Models and Decomposition

In the classical decomposition model, a series is expressed as:
$$Y_{t} = m_{t} + s_{t} + S_{t}$$
* $m_{t}:$ Trend component.
* $s_{t}:$ Seasonal component with period $d$.
* $S_{t}:$ Stationary process (noise).

Transformation Rules

Lag $d$ difference ($\nabla_d$): Removes seasonality of period $d$.
Lag 1 difference ($\nabla$): Removes the trend. Applied $k$ times, it removes a polynomial trend of degree $k$, as in the linear ($k=1$) and quadratic ($k=2$) examples above.
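The two rules can be combined. The following sketch builds a noise-free series with a trend and a seasonal component, then applies a lag-$d$ difference followed by a lag-1 difference. The period of 12, the sine-shaped seasonal pattern, and the trend coefficients are assumptions chosen for illustration.

```python
import numpy as np

d = 12                                     # assumed seasonal period (e.g. monthly data)
t = np.arange(10 * d)
s = 3.0 * np.sin(2 * np.pi * t / d)        # seasonal component with period d

# Case 1: linear trend plus seasonality (noise omitted so results are exact).
Y_lin = 0.5 * t + s
seas_lin = Y_lin[d:] - Y_lin[:-d]          # lag-d difference: constant 0.5 * d

# Case 2: quadratic trend plus seasonality.
Y_quad = 0.01 * t**2 + s
seas_quad = Y_quad[d:] - Y_quad[:-d]       # seasonality gone, linear trend remains
both = np.diff(seas_quad)                  # extra lag-1 difference: constant 0.01 * 2 * d

# A lag-1 difference alone does not remove the seasonal pattern.
lag1_only = np.diff(Y_lin)

print(seas_lin[:3], both[:3], lag1_only.std())
```

In the linear case the lag-$d$ difference happens to remove the trend as well, because $m_{t} - m_{t-d}$ is constant; in the quadratic case a further lag-1 difference is needed.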

5. The $ARIMA(p,d,q)$ Process

The AutoRegressive Integrated Moving Average (ARIMA) model is used when a series becomes stationary after being differenced $d$ times.

Definition:
If $W_{t} = \nabla^d Y_{t} = (1-B)^d Y_{t}$ is a stationary $ARMA(p,q)$ process, then the original series $Y_t$ is an $ARIMA(p,d,q)$ process.

p: Order of the Autoregressive part.
d: Order of differencing, i.e. the number of times $\nabla$ is applied (not the seasonal period, which Section 4 also denotes by $d$).
q: Order of the Moving Average part.

Example: An $ARIMA(1,1,1)$ process with $\phi_{1} = 0.7$ and $\theta_{1} = 0.2$ is one whose first difference $W_{t} = \nabla Y_{t}$ is a stationary $ARMA(1,1)$ process with autoregressive coefficient $0.7$ and moving-average coefficient $0.2$.
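To illustrate the definition, the sketch below simulates this $ARIMA(1,1,1)$ example by first generating the $ARMA(1,1)$ series $W_{t}$ and then integrating it once. It assumes the sign convention $W_{t} = \phi_{1}W_{t-1} + e_{t} + \theta_{1}e_{t-1}$ (some texts write the moving-average term with a minus sign), Gaussian white noise, and zero starting values; the series length and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)       # arbitrary seed
n, phi, theta = 300, 0.7, 0.2        # coefficients from the example; n is illustrative
e = rng.normal(0.0, 1.0, n)          # assumed Gaussian white noise

# ARMA(1,1) for the differenced series, under the assumed sign convention
# W_t = phi * W_{t-1} + e_t + theta * e_{t-1}, started at W_0 = 0.
W = np.zeros(n)
for k in range(1, n):
    W[k] = phi * W[k - 1] + e[k] + theta * e[k - 1]

# Integrate once (d = 1): Y_t = Y_{t-1} + W_t, so Y is the cumulative sum of W.
Y = np.cumsum(W)

# Differencing Y once gives back the ARMA(1,1) series W_1, ..., W_{n-1}.
recovered = np.diff(Y)
print(np.allclose(recovered, W[1:]))
```

The point of the sketch is the relationship between the two series: `Y` wanders like a non-stationary process, while `np.diff(Y)` is exactly the stationary $ARMA(1,1)$ series it was built from.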