Time Series and Forecasting

Anisha Mohanty
Apr 29, 2021
Updated: Apr 30, 2021

Time series forecasting uses a model to predict future values of a series based on its previously observed values. In the classical statistical treatment of time series data, predicting values beyond the observed period is called extrapolation.

Forecasting Process

Define Goal
Get data
Explore and visualize series
Pre-process data
Partition series
Apply forecasting methods
Evaluate and compare performance
Implement forecast/systems
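
The partition, apply, and evaluate steps can be sketched in a few lines of plain Python. The series values, the 8/4 split, and the two baseline methods (naive and training mean) are illustrative assumptions, not something prescribed by the process itself.

```python
# Hypothetical monthly values, used only for illustration.
series = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]

# Partition the series: earlier periods for training, later ones for validation.
n_train = 8
train, valid = series[:n_train], series[n_train:]

# Two simple forecasting methods to compare.
naive_forecast = [train[-1]] * len(valid)               # repeat the last training value
mean_forecast = [sum(train) / len(train)] * len(valid)  # repeat the training mean

def mean_absolute_error(actual, forecast):
    """Average size of the forecast errors, ignoring their sign."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

# Evaluate and compare performance on the validation period only.
naive_mae = mean_absolute_error(valid, naive_forecast)
mean_mae = mean_absolute_error(valid, mean_forecast)
print(f"naive MAE: {naive_mae:.2f}, mean MAE: {mean_mae:.2f}")
```

Keeping the validation period out of model fitting is what makes the comparison between methods fair.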

Goal Definition

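Defining the goal includes stating the purpose and type of the forecast, the cost of forecast errors, and the data that will be available in the future. The two types of analysis are: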

Descriptive Analysis
Predictive Analysis

Descriptive Analysis

Determine components and relations
Models with explanation
Retrospective in nature

Predictive Analysis

Forecast future values
High accuracy

Basic Notations

t: time period
yₜ: value of the series at time t (actual value)
Fₜ: forecast value for time t
Fₜ₊ₖ: k-step-ahead forecast
eₜ: forecast error for time t (inaccuracy of the forecast), eₜ = yₜ - Fₜ
k: forecast horizon (the larger the horizon, the greater the uncertainty and the lower the accuracy)
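
To make the notation concrete, here is a small sketch that computes k-step-ahead forecasts and their errors. The series values and the choice of a naive forecast (Fₜ₊ₖ = yₜ for every k) are assumptions made only for illustration.

```python
# Hypothetical series y_1 ... y_6 with a steady upward trend (illustrative values).
y = [10, 12, 14, 16, 18, 20]

t = 3            # forecast origin: the last period treated as observed
y_t = y[t - 1]   # y_t (Python lists are 0-indexed)

# Assumed method: a naive forecast, F_{t+k} = y_t for every horizon k.
errors = {}
for k in range(1, len(y) - t + 1):
    F_t_plus_k = y_t                  # k-step-ahead forecast
    actual = y[t + k - 1]             # y_{t+k}
    errors[k] = actual - F_t_plus_k   # e_{t+k} = y_{t+k} - F_{t+k}

print(errors)  # {1: 2, 2: 4, 3: 6}
```

In this example the error grows with k because the naive forecast ignores the trend, which matches the remark that larger horizons come with lower accuracy.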

Some Important Concepts

White Noise

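White noise is a sequence of uncorrelated random values with a constant mean and variance. If a series is white noise, its future values cannot be forecast from its past. The series we want to forecast should therefore not be white noise, but the forecast errors should be: if the errors still contain a pattern, the model has missed something predictable.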
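
One informal way to check for white noise is to look at the autocorrelation at a few lags, which should be close to zero. The sketch below generates Gaussian noise with the standard library and computes the sample autocorrelation by hand; the helper name `autocorrelation`, the seed, and the chosen lags are assumptions for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Gaussian white noise: independent draws with mean 0 and constant variance.
noise = [random.gauss(0, 1) for _ in range(2000)]

def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i - lag] - mean) for i in range(lag, n))
    return cov / var

# For white noise, every one of these should be near zero.
for lag in (1, 2, 12):
    print(lag, round(autocorrelation(noise, lag), 3))
```

The same check can be applied to forecast errors: autocorrelations far from zero suggest there is still structure left to model.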

Random walk/Drunken walk

A random walk is like taking steps in random directions: each value equals the previous value plus a random step, so the next value is always near the current one. In this case we can use naive forecasting, which takes the most recent observed value as the forecast for the next period.
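
A short simulation shows why the naive forecast suits a random walk. This is a sketch under assumed settings (Gaussian steps, 200 periods, a fixed seed), not a recipe for real data.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

# Random walk: each value is the previous value plus a random step.
steps = [random.gauss(0, 1) for _ in range(200)]
walk = [0.0]
for step in steps:
    walk.append(walk[-1] + step)

# Naive forecast: the forecast for each period is the value observed one period earlier.
forecasts = walk[:-1]
actuals = walk[1:]
errors = [a - f for a, f in zip(actuals, forecasts)]

# For a random walk the naive forecast errors are just the random steps.
max_gap = max(abs(e - s) for e, s in zip(errors, steps))
print(f"largest difference between errors and steps: {max_gap:.2e}")
```

Since the errors are the random steps themselves, no other method can systematically beat the naive forecast on a pure random walk.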

Decomposing Time Series in Python

Additive Model

y(t) = Level + Trend + Seasonality + Noise

Multiplicative Model

y(t) = Level * Trend * Seasonality * Noise
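
The practical difference between the two models is how the seasonal swing behaves as the series grows. The sketch below builds one series of each kind from assumed components (level 100, a linear trend, a cycle of length 4, and no noise) and measures the swing within each cycle.

```python
period = 4
level = 100

additive_season = [-20, 0, 20, 0]             # added to the series, in series units
multiplicative_season = [0.8, 1.0, 1.2, 1.0]  # scales the series, unitless factors

additive, multiplicative = [], []
for t in range(12):
    # Additive: Level + Trend + Seasonality (noise left out for clarity)
    additive.append(level + 10 * t + additive_season[t % period])
    # Multiplicative: Level * Trend * Seasonality (noise left out for clarity)
    multiplicative.append(level * (1 + 0.1 * t) * multiplicative_season[t % period])

def swing_per_cycle(y, period):
    """Range (max - min) of the series within each full seasonal cycle."""
    return [max(y[i:i + period]) - min(y[i:i + period])
            for i in range(0, len(y), period)]

print(swing_per_cycle(additive, period))        # same swing in every cycle
print(swing_per_cycle(multiplicative, period))  # swing grows with the level
```

A seasonal swing that stays roughly constant points to the additive model, while a swing that widens as the series rises points to the multiplicative one.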

from statsmodels.tsa.seasonal import seasonal_decompose

# miles_decomp_df is assumed to be an already loaded DataFrame with 'Month' and 'MilesMM' columns
miles_decomp_df.head()

Month        MilesMM
1963-01-01   6827
1963-02-01   6178
1963-03-01   7084
1963-04-01   8162
1963-05-01   8462

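# Use the month as the index so the components are plotted against time
miles_decomp_df.index = miles_decomp_df['Month']

# period=12 states the yearly cycle of monthly data explicitly
# (it cannot be inferred if the index holds plain strings)
result = seasonal_decompose(miles_decomp_df['MilesMM'], model='additive', period=12)
result.plot()

result2 = seasonal_decompose(miles_decomp_df['MilesMM'], model='multiplicative', period=12)
result2.plot()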
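
To see what an additive decomposition does conceptually, here is a simplified moving-average version in plain Python. It is a sketch of the classical technique, not the statsmodels implementation; the function names and the synthetic series (linear trend plus a cycle of length 4, no noise) are assumptions for illustration.

```python
def centered_moving_average(y, period):
    """Trend estimate; None where the window does not fit at the edges."""
    half = period // 2
    trend = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: half weight on the two end points keeps the window centered.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend

def additive_decompose(y, period):
    """Split y into trend, seasonal, and residual parts (y = trend + seasonal + resid)."""
    trend = centered_moving_average(y, period)
    detrended = [yt - tr if tr is not None else None for yt, tr in zip(y, trend)]
    # Average the detrended values at each position in the cycle.
    means = []
    for pos in range(period):
        vals = [d for i, d in enumerate(detrended) if d is not None and i % period == pos]
        means.append(sum(vals) / len(vals))
    # Center the seasonal effects so they sum to zero over one cycle.
    offset = sum(means) / period
    seasonal = [means[i % period] - offset for i in range(len(y))]
    resid = [yt - tr - s if tr is not None else None
             for yt, tr, s in zip(y, trend, seasonal)]
    return trend, seasonal, resid

period = 4
pattern = [-20, 0, 20, 0]
y = [100 + 10 * t + pattern[t % period] for t in range(24)]

trend, seasonal, resid = additive_decompose(y, period)
print(seasonal[:period])  # recovered seasonal pattern
print(trend[2:6])         # recovered trend away from the edges
```

Because the synthetic series has no noise, the recovered seasonal pattern matches the one used to build it and the residuals are zero wherever the trend is defined.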

Differencing

Differencing is a simple and popular method for removing trend and seasonality from a series. It means taking the difference between two values of the series that are a fixed number of periods apart, most often two consecutive values.

Lag 1 differencing: yₜ - yₜ₋₁

Lag k differencing: yₜ - yₜ₋ₖ

One round of lag 1 differencing removes a linear trend. For quadratic and exponential trends, another round of lag 1 differencing on the differenced series is often needed (for an exponential trend, taking logarithms first is a common alternative, since it turns the trend into a linear one). To remove a 12-month seasonal pattern from monthly sales data, we can apply lag 12 differencing. If the data has both trend and seasonality, we have to difference twice: once to remove the trend and again to remove the seasonality.
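
These rules are easy to verify on small synthetic series. The helper `difference` and the series below are assumptions for illustration; a cycle of length 4 stands in for the 12-month cycle of monthly data.

```python
def difference(y, lag=1):
    """Lag-k differencing: y_t - y_{t-k}. The result is `lag` values shorter."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

# Linear trend: one round of lag 1 differencing leaves a constant.
linear = [5 + 3 * t for t in range(8)]
print(difference(linear))        # [3, 3, 3, 3, 3, 3, 3]

# Quadratic trend: a second round is needed before the result is constant.
quadratic = [t * t for t in range(8)]
once = difference(quadratic)     # [1, 3, 5, 7, 9, 11, 13]
print(difference(once))          # [2, 2, 2, 2, 2, 2]

# Linear trend plus a repeating pattern of length 4.
pattern = [-20, 0, 20, 0]
seasonal = [100 + 10 * t + pattern[t % 4] for t in range(16)]
detrended = difference(seasonal, lag=1)   # trend gone, repeating pattern remains
print(difference(detrended, lag=4))       # all zeros: seasonality removed as well
```

Each round of differencing shortens the series by the lag, which is why the pandas output further down starts with NaN values.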

Differencing in Python

Month        MilesMM
1963-01-01   6827
1963-02-01   6178
1963-03-01   7084
1963-04-01   8162
1963-05-01   8462

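# Previous month's value, kept only to make the difference easy to check
miles_df['lag1'] = miles_df['MilesMM'].shift(1)
# Lag 1 differencing: current value minus the previous value
miles_df['MilesMM_diff_1'] = miles_df['MilesMM'].diff(periods=1)
miles_df.head()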

Month        MilesMM   lag1     MilesMM_diff_1
1963-01-01   6827      NaN      NaN
1963-02-01   6178      6827.0   -649.0
1963-03-01   7084      6178.0   906.0
1963-04-01   8162      7084.0   1078.0
1963-05-01   8462      8162.0   300.0

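miles_df.index = miles_df['Month']

# Decomposition of the original series
result_a = seasonal_decompose(miles_df['MilesMM'], model='additive', period=12)
result_a.plot()

# Decomposition of the differenced series; the first row is skipped because its difference is NaN
result_b = seasonal_decompose(miles_df['MilesMM_diff_1'].iloc[1:], model='additive', period=12)
result_b.plot()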

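# Original series
miles_df['MilesMM'].plot()

# After lag 1 differencing (trend removed)
miles_df['MilesMM_diff_1'].plot()

# Lag 12 differencing of the already differenced series removes the yearly seasonality
miles_df['MilesMM_diff_12'] = miles_df['MilesMM_diff_1'].diff(periods=12)
miles_df['MilesMM_diff_12'].plot()

miles_df.head()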

Month        MilesMM   lag1     MilesMM_diff_1   MilesMM_diff_12
1963-01-01   6827      NaN      NaN              NaN
1963-02-01   6178      6827.0   -649.0           NaN
1963-03-01   7084      6178.0   906.0            NaN
1963-04-01   8162      7084.0   1078.0           NaN
1963-05-01   8462      8162.0   300.0            NaN
