In the previous sections, we explored Autoregressive (AR) models, which leverage past values, and Moving Average (MA) models, which use past errors. We also saw how ARMA models combine these two ideas. However, a significant assumption behind AR, MA, and ARMA models is that the underlying time series must be stationary: its statistical properties, such as the mean and variance, should not change over time.
What happens when we encounter real-world data, which frequently exhibits trends or other forms of non-stationarity? Applying ARMA models directly to such data leads to unreliable results. This is where the 'I' in ARIMA comes into play.
ARIMA stands for Autoregressive Integrated Moving Average. The "Integrated" component addresses the non-stationarity issue. Recall from Chapter 2, "Time Series Decomposition and Stationarity," that differencing is a common technique used to transform a non-stationary series into a stationary one. We achieve this by computing the difference between consecutive observations. Sometimes, we might need to apply differencing more than once if the first difference doesn't result in a stationary series.
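Here is a minimal sketch of what differencing looks like in code, assuming a hypothetical trended series built with numpy and pandas; the Augmented Dickey-Fuller test from statsmodels is one common way to check whether the differenced result looks stationary.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

# Hypothetical data: a linear trend plus noise, so the mean changes over time
rng = np.random.default_rng(42)
series = pd.Series(0.5 * np.arange(200) + rng.normal(scale=2.0, size=200))

# First difference: y'_t = y_t - y_{t-1}
first_diff = series.diff().dropna()

# A second difference is only needed if the first difference is still non-stationary
second_diff = first_diff.diff().dropna()

# Augmented Dickey-Fuller test: a small p-value (e.g. below 0.05) suggests stationarity
adf_stat, p_value, *rest = adfuller(first_diff)
print(f"ADF p-value after one difference: {p_value:.4f}")
```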
The term "Integrated" signifies that the modeling process incorporates this differencing step. It reflects the idea that the original non-stationary series can be recovered by reversing the differencing, essentially a summation or integration (in a discrete sense) of the stationary differenced series.
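To make the "reverse of differencing" idea concrete, the short sketch below (again on a hypothetical series) shows that cumulatively summing the first differences, plus the initial observation, rebuilds the original levels.

```python
import numpy as np
import pandas as pd

# Hypothetical non-stationary series: a random walk with drift
rng = np.random.default_rng(0)
series = pd.Series(np.cumsum(rng.normal(loc=0.3, size=100)))

diffed = series.diff().dropna()              # the stationary differences
rebuilt = series.iloc[0] + diffed.cumsum()   # cumulative summation "integrates" them back

# The rebuilt values match the original from the second observation onward
print(np.allclose(rebuilt, series.iloc[1:]))  # True
```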
An ARIMA model is characterized by three parameters, written (p, d, q):

- p: the order of the autoregressive (AR) component, as in AR and ARMA models, but determined from the differenced series.
- d: the degree of differencing, that is, the number of times the series is differenced to make it stationary.
- q: the order of the moving average (MA) component, as in MA and ARMA models, applied to the differenced series.

So, an ARIMA(p, d, q) model effectively applies an ARMA(p, q) model to the time series after it has been differenced d times.
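As an illustrative sketch of this equivalence, the snippet below simulates an integrated ARMA(1, 1) series (the data and orders are assumptions chosen for demonstration) and fits it two ways with statsmodels; the two sets of AR and MA estimates should come out very close, though not necessarily identical, since the two routes treat the initial observations slightly differently.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import arma_generate_sample

# Hypothetical data: simulate a stationary ARMA(1, 1) process, then cumulatively
# sum it so that the observed series is integrated of order one (d = 1)
np.random.seed(1)
stationary_part = arma_generate_sample(ar=[1, -0.6], ma=[1, 0.3], nsample=500)
series = pd.Series(stationary_part).cumsum()

# Route 1: let ARIMA difference internally via d = 1
fit_arima = ARIMA(series, order=(1, 1, 1)).fit()

# Route 2: difference manually, then fit the ARMA(1, 1) part with d = 0 and no constant
fit_arma = ARIMA(series.diff().dropna(), order=(1, 0, 1), trend="n").fit()

# The AR and MA coefficient estimates from the two routes should be very close
print(fit_arima.params[["ar.L1", "ma.L1"]])
print(fit_arma.params[["ar.L1", "ma.L1"]])
```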
Diagram: differencing (d) makes the data stationary, allowing identification of the ARMA(p, q) structure, which together form the ARIMA(p, d, q) model applied to the original data.
If a time series is already stationary, we set d=0, and the ARIMA(p, 0, q) model simplifies directly to an ARMA(p, q) model. The power of ARIMA lies in its ability to handle a broader class of time series, specifically those that become stationary after one or more differencing steps.
When using libraries like statsmodels in Python, you typically specify the (p, d, q) order and provide the original, non-stationary time series. The library handles the differencing internally as part of model fitting, so you don't usually need to difference the data manually before passing it to the ARIMA class, although understanding the differencing step (d) is still important for selecting the correct model order.
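For example, a minimal fitting sketch might look like the following; the series is a hypothetical random walk and the order (1, 1, 1) is an arbitrary illustrative choice rather than a recommendation.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical non-stationary series: a random walk
rng = np.random.default_rng(7)
series = pd.Series(np.cumsum(rng.normal(size=250)))

# Pass the ORIGINAL series; d = 1 tells the model to difference once internally.
model = ARIMA(series, order=(1, 1, 1))
result = model.fit()

print(result.summary())          # coefficient estimates, AIC/BIC, residual diagnostics
print(result.forecast(steps=5))  # forecasts returned on the original, undifferenced scale
```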
In the upcoming sections, we'll discuss strategies for choosing the appropriate values for p, d, and q, how to fit the model using Python, and how to evaluate its performance.