
L59 Neural Networks for TS

  • Artificial Neural Networks = simple mathematical models of the brain

Network Architecture

  • Network of neurons organized in layers
  • Bottom layer: Predictors (inputs)
  • Intermediate layers: "hidden neurons"
  • Top layer: forecasts
  • Loosely mimics the way the brain solves problems

The "Perceptron"

  • Network of nodes sending signals
  • "Feed forward" network (one direction)
  • A neuron is connected to \(n\) other neurons and thus receives \(n\) inputs (\(x_{1},x_{2},\dots,x_{n}\))

Weights

  • Do you want to go to work?
    • Weather? 0 or 1
    • Weekday? 0 or 1
  • We cannot tell the neural network these conditions; it learns them for itself
  • It does so by applying weights
  • A higher weight \(\implies\) the input is more important
  • It computes the weights from the data (toy sketch below)
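A toy sketch of this idea: the weight values below are invented for illustration (a real network would learn them from data), but they show how a larger weight makes "weekday" dominate the decision.

```python
# Toy "go to work?" decision: weights encode how important each input is.
# The weight values are made up for illustration; a real network learns them.

def decide(weather, weekday):
    w_weather, w_weekday = 2.0, 4.0  # weekday matters more than weather
    bias = -3.0                      # threshold the weighted sum must beat
    total = w_weather * weather + w_weekday * weekday + bias
    return 1 if total > 0 else 0     # step activation: go (1) or stay home (0)

print(decide(weather=1, weekday=0))  # 0: nice weather alone is not enough
print(decide(weather=0, weekday=1))  # 1: it is a weekday, so go
```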

Training in Perceptron

  • For all inputs \(i\) (see the sketch after this list)
\[ W(i) \leftarrow W(i) + a \times g'(\text{weighted sum of inputs}) \times (T-A) \times P(i) \]
  • \(g'\) → derivative of the activation function
  • \(a\) → learning rate
  • \(P\) → Input vector
  • \(T\) → Correct Output
  • \(A\) → Output given by perceptron
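A minimal runnable version of this update rule, assuming \(g\) is the sigmoid (so \(g'(z) = g(z)(1-g(z))\)); the activation choice and the training example are assumptions made for the sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(W, P, T, a=0.1):
    """One application of W(i) <- W(i) + a * g'(sum) * (T - A) * P(i),
    done for all weights at once via NumPy."""
    z = np.dot(W, P)           # weighted sum of the inputs
    A = sigmoid(z)             # output given by the perceptron
    g_prime = A * (1.0 - A)    # derivative of the sigmoid at z
    return W + a * g_prime * (T - A) * P

W = np.zeros(2)
for _ in range(1000):          # repeated passes over one training example
    W = train_step(W, P=np.array([0.0, 1.0]), T=1.0)
print(W)   # the weight on the active input grows so the output approaches T
```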

Activation Function

  • Mathematical functions that determine the output of a NN layer given its inputs
  • Their nonlinearity is the secret ingredient that enables the network to learn complex patterns in the data

Types of Activation Functions

  • Sigmoid: \(f(x) = \dfrac{1}{1+e^{-x}}\), used for binary classification
  • Tanh: \(f(x) = \dfrac{e^{x}-e^{-x}}{e^{x}+e^{-x}}\), used in hidden layers where the output needs to be zero-centered
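Both functions are a few lines in Python; the sample values just demonstrate the output ranges.

```python
import numpy as np

def sigmoid(x):
    # squashes any real number into (0, 1): suits binary classification
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # zero-centered output in (-1, 1): suits hidden layers
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # approx. [0.119 0.5   0.881]
print(tanh(x))     # approx. [-0.964 0.    0.964]
```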

No Hidden Layers?

  • Then the model reduces to a linear regression, as written out below.
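To see why, note that with no hidden layer (and a linear activation, an assumption for this illustration) the forecast is just the weighted linear combination of the inputs, which is exactly the form of a linear regression:

\[ \hat{y} = b + w_{1}x_{1} + w_{2}x_{2} + \dots + w_{n}x_{n} \]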


Extension

  • Multilayer feed-forward
  • The inputs to each node are combined using a weighted linear combination (written out after this list).


  • The more hidden layers, the more complex the neural network
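Written out (the notation here is mine, not from the notes): hidden node \(j\) first combines its inputs linearly and then applies an activation \(g\), e.g. the sigmoid,

\[ z_{j} = b_{j} + \sum_{i=1}^{n} w_{i,j}\,x_{i}, \qquad \text{output}_{j} = g(z_{j}) \]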

Neural Network Autoregression (NNAR)

  • Lagged values of the TS are used as inputs
  • We only consider
    • feed-forward networks
    • a single hidden layer
  • \(NNAR(p,k)\)
    • \(p\) lagged inputs
    • \(k\) nodes in the hidden layer
  • \(NNAR(p,0)\) is equivalent to \(AR(p)\)
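One way to sketch an \(NNAR(p,k)\) fit in Python is with scikit-learn's MLPRegressor, a feed-forward net with (here) one hidden layer; the sine series is a placeholder and the hyperparameters are illustrative. (In R, the forecast package's nnetar() fits this model directly.)

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def nnar_fit(y, p, k):
    """Rough NNAR(p, k): p lagged values as inputs, k hidden nodes."""
    X = np.column_stack([y[i : len(y) - p + i] for i in range(p)])  # y[t-p..t-1]
    target = y[p:]                                                  # y[t]
    model = MLPRegressor(hidden_layer_sizes=(k,), max_iter=2000, random_state=0)
    model.fit(X, target)
    return model

y = np.sin(np.linspace(0, 20, 200))          # placeholder time series
model = nnar_fit(y, p=4, k=3)                # NNAR(4, 3)
print(model.predict(y[-4:].reshape(1, -1)))  # one-step-ahead forecast
```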

Batch, Iterations & Epoch

  • Big dataset: we cannot pass everything through the NN at once
  • So we divide the dataset into "batches"
  • Batch size is a hyperparameter
  • Iterations is the number of batches needed to complete one epoch (worked out below)
  • One Epoch = the ENTIRE dataset is passed forward and backward through the neural network only ONCE
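The arithmetic is simple; the dataset size and batch size below are made-up numbers:

```python
import math

n_samples = 2000   # example dataset size
batch_size = 100   # hyperparameter

iterations_per_epoch = math.ceil(n_samples / batch_size)
print(iterations_per_epoch)  # 20 iterations (batches) to complete one epoch
```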

Use of More Than One Epoch

  • Use gradient descent to update weights
  • As the # of epochs increases,
    • the weights in the NN keep changing
    • the model moves from Underfitting to Optimal to Overfitting (sketched below)
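A schematic of the epoch/batch loop with gradient descent; the linear-model gradient here stands in for backpropagation, and the data are random placeholders.

```python
import numpy as np

def compute_gradient(w, Xb, yb):
    # stand-in for backpropagation: mean-squared-error gradient of a linear model
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 3)), rng.normal(size=200)
w, lr, batch_size = np.zeros(3), 0.01, 50

for epoch in range(100):                        # each epoch sees the WHOLE dataset once
    for start in range(0, len(y), batch_size):  # one iteration per batch
        Xb, yb = X[start:start + batch_size], y[start:start + batch_size]
        w -= lr * compute_gradient(w, Xb, yb)   # gradient descent weight update
# too few epochs -> underfitting; far too many -> overfitting
```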