Time Series and Forecasting
Updated: Apr 30, 2021
Time series forecasting uses a model to predict future values based on previously observed values. In the classical statistical treatment of time series data, making predictions about the future is called extrapolation.
Explore and visualize series
Apply forecasting methods
Evaluate and compare performance
Includes the purpose and type of the forecast, the cost of forecast errors, and the data that will be available in the future.
Determine components and relations
Models with explanation
Retrospective in nature
Forecast future values
t: time period
yₜ: value of the series at time t (actual value)
Fₜ: forecast value for time t
Fₜ₊ₖ: k-step-ahead forecast
eₜ: forecast error for time t (inaccuracy of the forecast), eₜ = yₜ - Fₜ
k: forecast horizon (the larger the horizon, the greater the uncertainty and the lower the accuracy)
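The notation above can be turned into a few lines of Python. This is a minimal sketch: the actual values are the first five MilesMM observations shown later in this post, while the forecast values and the helper names (forecast_errors, mean_absolute_error, root_mean_squared_error) are hypothetical and exist only for illustration.

```python
import math

# Actual values y_t (first five MilesMM observations from this post).
actual = [6827, 6178, 7084, 8162, 8462]
# Hypothetical forecasts F_t, made up purely for illustration.
forecast = [6800, 6300, 7000, 8000, 8500]


def forecast_errors(actual_values, forecast_values):
    """Return e_t = y_t - F_t for each time period."""
    return [y - f for y, f in zip(actual_values, forecast_values)]


def mean_absolute_error(errors):
    """Average size of the errors, ignoring their sign."""
    return sum(abs(e) for e in errors) / len(errors)


def root_mean_squared_error(errors):
    """Square root of the average squared error; penalizes large errors more."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


errors = forecast_errors(actual, forecast)
mae = mean_absolute_error(errors)
rmse = root_mean_squared_error(errors)
print(errors, mae, rmse)
```

Summary measures such as MAE and RMSE are one common way to compare the performance of different forecasting methods on the same series.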
Some Important Concepts
White noise
White noise is a sequence of uncorrelated random numbers. If a series is white noise, we cannot forecast its future values. The series itself should therefore not be white noise, but the forecast errors should be.
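To make the idea concrete, the sketch below simulates white noise and checks that consecutive values are essentially uncorrelated. It assumes Gaussian noise and a fixed seed, which are illustrative choices; the helper lag_autocorrelation is a hypothetical name, not a library function.

```python
import numpy as np

# Assumption: Gaussian white noise with mean 0 and standard deviation 1.
rng = np.random.default_rng(seed=0)
white_noise = rng.normal(loc=0.0, scale=1.0, size=5000)


def lag_autocorrelation(values, lag):
    """Sample autocorrelation of a series with itself shifted by `lag` steps."""
    centered = np.asarray(values, dtype=float)
    centered = centered - centered.mean()
    return float(np.dot(centered[:-lag], centered[lag:]) / np.dot(centered, centered))


# For white noise the autocorrelation at any non-zero lag should be close to 0,
# so knowing the previous value does not help predict the next one.
r1 = lag_autocorrelation(white_noise, lag=1)
r12 = lag_autocorrelation(white_noise, lag=12)
print(r1, r12)
```

The same check can be applied to the errors of a fitted model: if their autocorrelations are clearly non-zero, the model has left some predictable structure unexplained.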
Random walk / Drunkard's walk
A random walk is like taking steps in random directions: each step is unpredictable, but the next value will be close to the previous one. In this case we can use naive forecasting, which takes the most recent observed value as the forecast for the next value.
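A small simulation shows why the naive forecast suits a random walk. The sketch below assumes Gaussian steps and a fixed seed, both chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Each step is white noise; the walk is the running sum of the steps,
# so y_t = y_(t-1) + step_t.
steps = rng.normal(loc=0.0, scale=1.0, size=200)
walk = np.cumsum(steps)

# Naive forecast: F_t = y_(t-1), the most recent observed value.
naive_forecast = walk[:-1]
actual = walk[1:]

# Forecast errors e_t = y_t - F_t.
errors = actual - naive_forecast

# The errors are exactly the random steps, i.e. white noise.
print(np.allclose(errors, steps[1:]))
```

Because the errors of the naive forecast are the white-noise steps themselves, no other method can systematically do better on a pure random walk.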
Decomposing Time Series in Python
Additive model: y(t) = Level + Trend + Seasonality + Noise
Multiplicative model: y(t) = Level * Trend * Seasonality * Noise
from statsmodels.tsa.seasonal import seasonal_decompose

miles_decomp_df.head()

# Index by month (assumes 'Month' holds datetime values so the period can be inferred)
miles_decomp_df.index = miles_decomp_df['Month']

# Additive decomposition
result = seasonal_decompose(miles_decomp_df['MilesMM'], model='additive')
result.plot()

# Multiplicative decomposition
result2 = seasonal_decompose(miles_decomp_df['MilesMM'], model='multiplicative')
result2.plot()
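To see roughly what an additive decomposition does, the sketch below builds a synthetic monthly series from a level, a linear trend and a yearly seasonal pattern, then estimates the trend with a centered 12-month moving average. The series, its parameters and the variable names are assumptions made for illustration; this is a hand-rolled approximation of the classical approach, not a reproduction of the statsmodels internals.

```python
import numpy as np

# Synthetic additive series (noise left out so the result is easy to verify).
n_months = 48
t = np.arange(n_months)
level = 100.0
trend = 2.0 * t
seasonality = 10.0 * np.sin(2 * np.pi * t / 12)
series = level + trend + seasonality

# Centered moving average over one full year. With an even period of 12,
# the window spans 13 points and the two end points get half weight,
# so every calendar month contributes equally.
weights = np.r_[0.5, np.ones(11), 0.5] / 12.0
trend_estimate = np.convolve(series, weights, mode="valid")

# The estimate is only defined where the full window fits,
# which drops 6 months at each end.
center = slice(6, n_months - 6)

# Subtracting the estimated level and trend leaves the seasonal part.
detrended = series[center] - trend_estimate
print(np.allclose(trend_estimate, level + trend[center]))
print(np.allclose(detrended, seasonality[center]))
```

For a multiplicative series, the same idea applies with division in place of subtraction, or with the additive steps applied to the logarithm of the series.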
Differencing is a simple and popular method for removing trend and seasonality from a series. It means taking the difference between two values of the series, for example two consecutive values.
Lag-1 differencing: yₜ - yₜ₋₁
Lag-k differencing: yₜ - yₜ₋ₖ
A single round of lag-1 differencing removes a linear trend. Quadratic and exponential trends usually need another round of lag-1 differencing applied to the differenced series. To remove a yearly seasonal pattern from monthly data, such as monthly sales, apply lag-12 differencing. If the data has both trend and seasonality, difference twice: once to remove the trend and once to remove the seasonality.
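These claims are easy to check on noise-free toy series. The sketch below uses a hypothetical helper named difference and made-up coefficients; it covers the linear, quadratic and seasonal cases.

```python
import numpy as np


def difference(values, lag=1):
    """Lag-k differencing: y_t - y_(t-k). The result is `lag` values shorter."""
    values = np.asarray(values, dtype=float)
    return values[lag:] - values[:-lag]


t = np.arange(24, dtype=float)

# Linear trend: one round of lag-1 differencing leaves a constant (the slope).
linear = 5.0 + 3.0 * t
linear_d1 = difference(linear)

# Quadratic trend: the first round leaves a linear trend,
# the second round leaves a constant.
quadratic = 5.0 + 3.0 * t + 0.5 * t**2
quadratic_d1 = difference(quadratic)
quadratic_d2 = difference(quadratic_d1)

# Pattern that repeats every 12 steps: lag-12 differencing removes it.
seasonal = np.tile(np.arange(12, dtype=float), 2)
seasonal_d12 = difference(seasonal, lag=12)

print(linear_d1[:3], quadratic_d2[:3], seasonal_d12[:3])
```

With real data the differenced series will not be exactly constant or zero, but it should no longer show the trend or the seasonal pattern.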
Differencing in Python
miles_df['lag1'] = miles_df['MilesMM'].shift(1)
miles_df['MilesMM_diff_1'] = miles_df['MilesMM'].diff(periods=1)
miles_df.head()
Month       MilesMM  lag1    MilesMM_diff_1
1963-01-01  6827     NaN     NaN
1963-02-01  6178     6827.0  -649.0
1963-03-01  7084     6178.0  906.0
1963-04-01  8162     7084.0  1078.0
1963-05-01  8462     8162.0  300.0
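The table shows that the differenced column is simply the series minus its lag-1 copy. The sketch below reproduces this with the five values shown in the table; the monthly index is rebuilt with pd.date_range, and the variable names are illustrative.

```python
import pandas as pd

# The five MilesMM values from the table, indexed by month start.
miles = pd.Series(
    [6827, 6178, 7084, 8162, 8462],
    index=pd.date_range("1963-01-01", periods=5, freq="MS"),
    name="MilesMM",
)

# shift(1) moves every value one row down, so each row sees the previous month.
lag1 = miles.shift(1)

# Subtracting the shifted copy by hand...
manual_diff = miles - lag1

# ...gives the same result as the built-in diff.
builtin_diff = miles.diff(periods=1)

print(pd.DataFrame({"MilesMM": miles, "lag1": lag1, "MilesMM_diff_1": builtin_diff}))
```

The first row is NaN in both cases because there is no earlier month to subtract.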
# Decompose the original series
miles_df.index = miles_df['Month']
result_a = seasonal_decompose(miles_df['MilesMM'], model='additive')
result_a.plot()

# Decompose the lag-1 differenced series, skipping the leading NaN
result_b = seasonal_decompose(miles_df['MilesMM_diff_1'].iloc[1:], model='additive')
result_b.plot()
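# Apply lag-12 differencing to the lag-1 differenced series to remove the yearly seasonality
miles_df['MilesMM_diff_12'] = miles_df['MilesMM_diff_1'].diff(periods=12)
miles_df['MilesMM_diff_12'].plot()
miles_df.head()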
Month       MilesMM  lag1    MilesMM_diff_1  MilesMM_diff_12
1963-01-01  6827     NaN     NaN             NaN
1963-02-01  6178     6827.0  -649.0          NaN
1963-03-01  7084     6178.0  906.0           NaN
1963-04-01  8162     7084.0  1078.0          NaN
1963-05-01  8462     8162.0  300.0           NaN
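The MilesMM_diff_12 column is NaN in these first rows because lag-1 differencing loses one value and lag-12 differencing loses twelve more. The sketch below illustrates this on a synthetic monthly series with a linear trend and a yearly pattern; the series and its parameters are assumptions made for illustration, not the MilesMM data.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: level + linear trend + yearly seasonality, no noise.
index = pd.date_range("1963-01-01", periods=36, freq="MS")
t = np.arange(36)
values = 100.0 + 2.0 * t + 10.0 * np.sin(2 * np.pi * t / 12)
series = pd.Series(values, index=index)

# Round 1: lag-1 differencing removes the linear trend.
diff_1 = series.diff(periods=1)

# Round 2: lag-12 differencing of the result removes the yearly seasonality.
diff_1_12 = diff_1.diff(periods=12)

# 1 value is lost in round 1 and 12 more in round 2.
n_missing = int(diff_1_12.isna().sum())
remaining = diff_1_12.dropna()

print(n_missing, len(remaining), float(remaining.abs().max()))
```

With no noise in the series, everything that remains after the two rounds is numerically zero, which shows that trend and seasonality have both been removed.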