What are ACF and PACF Plots in Time Series Analysis?

Ilyas Ahmed
5 min read · May 28, 2023

In time series analysis, ACF and PACF plots are two of the most widely used tools for identifying the underlying structure of a time series. The ACF plot shows the correlation of a time series with lagged copies of itself, while the PACF plot shows the correlation at each lag after removing the effects of the shorter, intermediate lags.

Autocorrelation Function (ACF)

The ACF plot is a graphical representation of the correlation of a time series with itself at different lags. The correlation coefficient is a measure of how closely two variables are linearly related. A correlation coefficient of 1 indicates a perfect positive relationship, while a correlation coefficient of -1 indicates a perfect negative relationship. A correlation coefficient of 0 indicates no linear relationship between the two variables.
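To make the definition concrete, here is a minimal sketch of how the autocorrelation at each lag can be computed by hand with NumPy. The function name `sample_acf` and the toy series are illustrative assumptions, not part of any library; the estimator shown divides the lagged cross-products by the total sum of squares, which is one common convention.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    denom = np.sum(centered ** 2)  # lag-0 sum of squares
    return np.array([
        # Correlate the series with a copy of itself shifted by `lag` steps
        np.sum(centered[lag:] * centered[:len(x) - lag]) / denom
        for lag in range(max_lag + 1)
    ])

# Toy series used only for illustration
series = [1.0, 2.0, 3.0, 4.0, 5.0]
acf_values = sample_acf(series, max_lag=2)
print(acf_values)  # lag 0 is always 1; here lag 1 is 0.4 and lag 2 is -0.1
```

Each value in `acf_values` is what the ACF plot draws as one spike.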

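The ACF plot can be used to identify the order of an MA model. The order of an MA model, q, is the number of lagged error terms that are included in the model. For an MA(q) process, the ACF plot shows significant spikes up to lag q and then cuts off to roughly zero. For an AR process, by contrast, the ACF decays gradually rather than cutting off.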

Partial Autocorrelation Function (PACF)

The PACF plot is a graphical representation of the correlation of a time series with itself at different lags, after removing the effects of the intermediate lags. The PACF plot can be used to identify the order of an AR model. The order of an AR model, p, is the number of lagged values that are included in the model. For an AR(p) process, the PACF plot shows significant spikes up to lag p and then cuts off to roughly zero. For an MA process, by contrast, the PACF decays gradually rather than cutting off.
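One way to see what "removing the effects of the intermediate lags" means is to compute the partial autocorrelation at lag k as the last coefficient of a regression of the series on its first k lags. The sketch below does this with NumPy on a simulated AR(1) series; the helper name `pacf_by_regression`, the coefficient 0.6, and the seed are assumptions chosen for illustration, and library implementations may use other estimation methods.

```python
import numpy as np

def pacf_by_regression(x, max_lag):
    """Partial autocorrelations via successive least-squares fits (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    values = [1.0]  # lag 0 by convention
    for k in range(1, max_lag + 1):
        # Columns are x_{t-1}, ..., x_{t-k}; the target is x_t
        lagged = np.column_stack([x[k - j:len(x) - j] for j in range(1, k + 1)])
        target = x[k:]
        coeffs, *_ = np.linalg.lstsq(lagged, target, rcond=None)
        values.append(coeffs[-1])  # coefficient on the longest lag
    return np.array(values)

# Simulate an AR(1) series: y_t = 0.6 * y_{t-1} + noise
rng = np.random.default_rng(0)
n = 2000
noise = rng.normal(size=n)
series = np.zeros(n)
for t in range(1, n):
    series[t] = 0.6 * series[t - 1] + noise[t]

pacf_values = pacf_by_regression(series, max_lag=3)
print(pacf_values)
```

For this series the lag-1 value should land near 0.6, while the values at lags 2 and 3 should be close to zero.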

AR, MA and ARMA Models

AR Models

An autoregressive (AR) model is a type of time series model that uses the past values of a time series to predict future values. AR models are a popular choice for forecasting time series data because they are relatively simple to understand and implement.

An AR model is typically written as:

y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + ... + \phi_p y_{t-p} + \epsilon_t

where:

  • y_t is the value of the time series at time t
  • c is the intercept term
  • \phi_1, \phi_2, ..., \phi_p are the model coefficients
  • \epsilon_t is the error term

The model coefficients, \phi_1, \phi_2, ..., \phi_p, can be estimated in several ways, such as least squares or maximum likelihood. Once the coefficients are estimated, the model can be used to predict future values of the time series. How accurate those predictions are depends on the characteristics of the data, in particular on whether the series is stationary.
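As a rough illustration of least squares estimation, the sketch below simulates an AR(2) series with known coefficients, recovers them with `np.linalg.lstsq`, and produces a one-step-ahead forecast. The true values (c = 1.0, \phi_1 = 0.5, \phi_2 = -0.3), the sample size, and the seed are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 3000
c_true, phi_true = 1.0, np.array([0.5, -0.3])  # illustrative values

# Simulate y_t = c + phi_1 * y_{t-1} + phi_2 * y_{t-2} + eps_t
series = np.zeros(n)
noise = rng.normal(size=n)
for t in range(2, n):
    series[t] = c_true + phi_true[0] * series[t - 1] + phi_true[1] * series[t - 2] + noise[t]

# Build the regression: intercept column plus one column per lag
p = 2
target = series[p:]
design = np.column_stack(
    [np.ones(n - p)] + [series[p - j:n - j] for j in range(1, p + 1)]
)
coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
c_hat, phi_hat = coeffs[0], coeffs[1:]

# One-step-ahead forecast from the last p observations (most recent first)
forecast = c_hat + phi_hat @ series[-1:-p - 1:-1]
print(c_hat, phi_hat, forecast)
```

With a few thousand observations the estimates should sit close to the values used in the simulation.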

Here are some of the advantages of using AR models:

  • They are relatively simple to understand and implement.
  • They can be used to forecast a wide variety of time series data.
  • They are relatively efficient, meaning that they require a relatively small amount of data to train.

Here are some of the disadvantages of using AR models:

  • They can be sensitive to outliers.
  • They may not be accurate for time series data that is not stationary.
  • They may not be able to capture long-term trends.

MA Models

A moving average (MA) model is a type of time series model that uses the past errors of a time series to predict future values. MA models are a popular choice for forecasting time series data because they are relatively simple to understand and implement.

An MA model is typically written as:

y_t = c + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + ... + \theta_q \epsilon_{t-q} + \epsilon_t

where:

  • y_t is the value of the time series at time t
  • c is the intercept term
  • \theta_1, \theta_2, ..., \theta_q are the model coefficients
  • \epsilon_t is the error term

The model coefficients, \theta_1, \theta_2, ..., \theta_q, cannot be estimated by ordinary least squares directly, because the past errors are not observed. In practice they are estimated iteratively, for example by maximum likelihood or nonlinear least squares. Once the model coefficients are estimated, the model can be used to predict future values of the time series. The accuracy of an MA model will depend on the characteristics of the time series data.
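To get a feel for how an MA model behaves, the following sketch simulates an MA(1) series and checks its autocorrelation. For an MA(1) process the lag-1 autocorrelation is \theta_1 / (1 + \theta_1^2) and the autocorrelation at higher lags is zero. The coefficient 0.8, the seed, and the helper name `autocorr` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
theta = 0.8  # illustrative MA(1) coefficient
c = 0.0

# y_t = c + theta * eps_{t-1} + eps_t
eps = rng.normal(size=n + 1)
series = c + theta * eps[:-1] + eps[1:]

def autocorr(x, lag):
    """Sample autocorrelation at a single lag (hypothetical helper)."""
    centered = x - x.mean()
    return np.sum(centered[lag:] * centered[:len(x) - lag]) / np.sum(centered ** 2)

rho1 = autocorr(series, 1)
rho2 = autocorr(series, 2)
theoretical_rho1 = theta / (1 + theta ** 2)
print(rho1, theoretical_rho1, rho2)
```

The sample value `rho1` should be close to `theoretical_rho1`, and `rho2` should be close to zero, which is the cut-off pattern an ACF plot shows for an MA(1) series.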

Here are some of the advantages of using MA models:

  • They are relatively simple to understand and implement.
  • They can be used to forecast a wide variety of time series data.
  • They are relatively efficient, meaning that they require a relatively small amount of data to train.

Here are some of the disadvantages of using MA models:

  • They can be sensitive to outliers.
  • They may not be accurate for time series data that is not stationary.
  • They may not be able to capture long-term trends.

MA models are often used in combination with AR models to create ARMA models. ARMA models are more complex than AR or MA models, but they can be more accurate for certain types of time series data.

ARMA Models

ARMA models are a combination of AR and MA models. The order of an ARMA model is written as the pair (p, q), where p is the order of the AR part and q is the order of the MA part. An ARMA model uses both the past values and the past errors of a time series to predict future values. ARMA models are a popular choice for forecasting time series data because they remain fairly simple to understand and implement, and they can be more accurate than pure AR or MA models for certain types of time series data.
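Combining the AR and MA equations given earlier, an ARMA(p, q) model can be written as follows, using the same symbols as before:

```latex
y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p}
        + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q} + \epsilon_t
```

Setting q = 0 gives back the AR(p) model, and setting p = 0 gives back the MA(q) model.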

Interpreting ACF and PACF Plots

The ACF and PACF plots can be used to identify the underlying structure of a time series. The following are some general guidelines for interpreting ACF and PACF plots:

  • If the PACF plot cuts off after lag p while the ACF plot decays gradually, then an AR(p) model may be appropriate.
  • If the ACF plot cuts off after lag q while the PACF plot decays gradually, then an MA(q) model may be appropriate.
  • If both the ACF and PACF plots decay gradually without a clear cut-off, then an ARMA model may be appropriate.

It is important to note that the ACF and PACF plots are just a starting point for identifying the underlying structure of a time series. The final model should be chosen based on a combination of the ACF and PACF plots, as well as other factors such as the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC).
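The information criteria mentioned above can be sketched for AR models using only NumPy. The example below fits AR(1) through AR(4) by least squares to a simulated AR(2) series and computes AIC and BIC from the residual variance, using the Gaussian form n * log(RSS / n) plus a penalty (constants dropped). The helper name `fit_ar_criteria` and the simulation settings are assumptions; statistical packages may define the criteria with additional constant terms.

```python
import numpy as np

def fit_ar_criteria(x, order, max_order):
    """Least-squares AR(order) fit; returns (AIC, BIC) up to a constant (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # Drop the first max_order points so every order is fitted on the same sample
    target = x[max_order:]
    columns = [np.ones(len(target))]
    columns += [x[max_order - j:len(x) - j] for j in range(1, order + 1)]
    design = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ coeffs
    n_obs = len(target)
    n_params = order + 1  # intercept plus AR coefficients
    fit_term = n_obs * np.log(np.mean(residuals ** 2))
    return fit_term + 2 * n_params, fit_term + n_params * np.log(n_obs)

# Simulated AR(2) series with illustrative coefficients
rng = np.random.default_rng(3)
n = 3000
series = np.zeros(n)
noise = rng.normal(size=n)
for t in range(2, n):
    series[t] = 0.5 * series[t - 1] - 0.3 * series[t - 2] + noise[t]

max_order = 4
bic = {p: fit_ar_criteria(series, p, max_order)[1] for p in range(1, max_order + 1)}
best_order = min(bic, key=bic.get)
print(bic, best_order)
```

Lower values are better. Here the AR(1) fit should score clearly worse than AR(2), because it leaves out a lag that the data actually depends on.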

Python Code to Display ACF and PACF Plots

In the code below, a sample data.csv file is loaded and the ACF and PACF plots are displayed using the functions acf and pacf from statsmodels.tsa.stattools. The file is expected to contain a Date column and a Value column.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import acf, pacf

# Load the data (expects a 'Date' column and a 'Value' column)
data = pd.read_csv('data.csv', index_col='Date')

# Approximate 95% significance bound for a white-noise series
bound = 1.96 / np.sqrt(len(data))

# Plot the ACF
plt.subplot(211)
plt.plot(acf(data['Value']))
plt.axhline(0, color='black')
plt.axhline(bound, color='red', linestyle='dashed')
plt.axhline(-bound, color='red', linestyle='dashed')
plt.title('ACF')

# Plot the PACF
plt.subplot(212)
plt.plot(pacf(data['Value']))
plt.axhline(0, color='black')
plt.axhline(bound, color='red', linestyle='dashed')
plt.axhline(-bound, color='red', linestyle='dashed')
plt.title('PACF')

# Keep the two titles from overlapping the neighbouring axes
plt.tight_layout()
plt.show()

Conclusion

The ACF and PACF plots are two of the most widely used tools for identifying the underlying structure of a time series. The ACF plot shows the correlation of a time series with itself at different lags, while the PACF plot shows that correlation after removing the effects of the intermediate lags. The PACF plot helps identify the order of an AR model, the ACF plot helps identify the order of an MA model, and the two together can suggest when an ARMA model is needed.

Ilyas Ahmed

Data Scientist at Wipro Arabia Ltd. Experienced in ML, NLP and Comp. Vision. Sharing what I know :)