ARMA/ARIMA/SARIMA Models

1 ARIMA Modeling

In this part, I first use the weekly sales of type A stores from the Walmart dataset introduced in the Data Visualization and EDA tab.

1.1 Check the stationarity

This is a visualization of this time series.

Now review the ACF, PACF, and Augmented Dickey-Fuller test from the EDA tab.

Code
ggAcf(temp.ts,150)+ggtitle("ACF graph of the weekly sales")

Code
ggPacf(temp.ts,150)+ggtitle("PACF graph of the weekly sales")

According to the ACF and PACF plots, this time series appears reasonably stationary, apart from a pronounced spike in the ACF at lag 52.

Code
adf.test(temp.ts)

    Augmented Dickey-Fuller Test

data:  temp.ts
Dickey-Fuller = -5.3171, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary

As for the ADF test, since the p-value is less than 0.05, we have enough evidence to reject the null hypothesis and conclude that this time series is stationary.
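
As a complementary check, we can also run a KPSS test, whose null hypothesis is stationarity (the reverse of the ADF null). Below is a minimal sketch, assuming the tseries package that provides adf.test is already loaded.

Code
# KPSS test from the tseries package: the null hypothesis here is
# (level) stationarity, so a large p-value points to the same conclusion
# as rejecting the ADF null.
kpss.test(temp.ts)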

1.2 Determine the parameter

Now use the ACF and PACF again to determine the parameters p and q of the ARIMA model.

Code
ggAcf(temp.ts,150)+ggtitle("ACF graph of the weekly sales")

From this ACF plot, the MA order q is likely to be 1, 4, or 5 (significant spikes in the ACF suggest candidate MA orders).

Code
ggPacf(temp.ts,150)+ggtitle("PACF graph of the weekly sales")

From this PACF plot, the AR order p is likely to be 1, 2, or 5 (significant spikes in the PACF suggest candidate AR orders). Since we have not differenced the series, d is 0, so this is effectively an ARMA model.
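
As a rough cross-check of the lags read off the plots, the sample ACF and PACF values can be compared against the approximate 95% significance bound of ±1.96/√n. This is a minimal sketch, assuming temp.ts is the weekly-sales series.

Code
# List the lags whose sample ACF / PACF exceed the approximate 95% bound.
n <- length(temp.ts)
bound <- 1.96 / sqrt(n)

acf_vals  <- acf(temp.ts,  lag.max = 20, plot = FALSE)$acf[-1]  # drop lag 0
pacf_vals <- pacf(temp.ts, lag.max = 20, plot = FALSE)$acf

which(abs(acf_vals)  > bound)   # candidate MA orders q
which(abs(pacf_vals) > bound)   # candidate AR orders p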

1.3 Fit and choose the model

In this part, I fit ARIMA models for every combination of the candidate orders above (p in {1, 2, 5} and q in {1, 4, 5}) and use AIC, BIC, and AICc to choose the best model. Here is the result.

Code
i=1
ls=matrix(rep(NA,6*9),nrow=9)   # one row per candidate model: p, d, q, AIC, BIC, AICc

for (p in c(1,2,5))
{
  for(q in c(1,4,5))
  {
    model<- Arima(temp.ts, order=c(p,0,q),include.drift = TRUE)
    ls[i,]= c(p,0,q,model$aic,model$bic,model$aicc)
    i=i+1
  }
}

temp= as.data.frame(ls)
names(temp)= c("p","d","q","AIC","BIC","AICc")

knitr::kable(temp)
p d q AIC BIC AICc
1 0 1 2599.059 2613.873 2599.497
1 0 4 2575.516 2599.219 2576.591
1 0 5 2566.104 2592.770 2567.458
2 0 1 2595.558 2613.335 2596.176
2 0 4 2576.097 2602.763 2577.451
2 0 5 2568.081 2597.710 2569.748
5 0 1 2581.966 2608.631 2583.319
5 0 4 2580.545 2616.099 2582.945
5 0 5 2573.915 2612.432 2576.737
  p d q      AIC     BIC     AICc
3 1 0 5 2566.104 2592.77 2567.458

We can see that the model with parameters (p,d,q)=(1,0,5) has minimum AIC, BIC and AICc. Therefore, I choose this model as the best model.
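
The single-row table above is simply the entry with the smallest information criteria; something along these lines reproduces it (a sketch, assuming temp is the data frame built in the loop above).

Code
# Pick the row with the smallest AICc (here it also minimizes AIC and BIC).
temp[which.min(temp$AICc), ]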

Here is the summary of the ARIMA(1,0,5) model.

Code
fit <- Arima(temp.ts, order=c(1,0,5),include.drift = TRUE)
summary(fit)
Series: temp.ts 
ARIMA(1,0,5) with drift 

Coefficients:
         ar1      ma1      ma2      ma3     ma4      ma5   intercept   drift
      0.7632  -0.4402  -0.1257  -0.2495  0.3608  -0.5453  19932.3528  2.9485
s.e.  0.0722   0.0896   0.0762   0.0684  0.0858   0.0835    139.7463  1.8740

sigma^2 = 3291926:  log likelihood = -1274.05
AIC=2566.1   AICc=2567.46   BIC=2592.77

Training set error measures:
                    ME     RMSE     MAE        MPE     MAPE     MASE
Training set -12.95413 1762.885 1152.09 -0.6588532 5.480607 1.960847
                     ACF1
Training set -0.005082284

The equation of this ARIMA model, writing \(\mu = 19932.3528\) for the fitted intercept, \(\delta = 2.9485\) for the weekly drift, and \(\tilde{X}_t = X_t - \mu - \delta t\) for the de-trended series:

\[ \tilde{X}_t - 0.7632\,\tilde{X}_{t-1} = \omega_t - 0.4402\,\omega_{t-1} - 0.1257\,\omega_{t-2} - 0.2495\,\omega_{t-3} + 0.3608\,\omega_{t-4} - 0.5453\,\omega_{t-5} \]
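
For reference, the fitted parameters can be read off the model object directly, along with the constant that would appear if the AR part were written with an explicit constant instead of the de-trended form above. A small sketch, assuming fit is the ARIMA(1,0,5) object.

Code
# All fitted parameters (ar1, ma1..ma5, intercept, drift).
coef(fit)

# If the AR(1) part is written with an explicit constant instead of
# mean-centering, that constant equals (1 - phi) * mu.
phi <- coef(fit)["ar1"]
mu  <- coef(fit)["intercept"]
unname((1 - phi) * mu)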

1.4 Model diagnostic

Code
model_output <- capture.output(sarima(temp.ts, 1,0,5))

Code
checkresiduals(fit)


    Ljung-Box test

data:  Residuals from ARIMA(1,0,5) with drift
Q* = 6.8906, df = 23, p-value = 0.9995

Model df: 6.   Total lags used: 29

According to the Ljung-Box test, the residuals show no significant autocorrelation, which suggests the model fits this time series well. However, the Q-Q plot shows that a small fraction of the residuals deviate from the normal distribution.
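
To look at the departure from normality more closely, one can plot the residuals against normal quantiles and run a Shapiro-Wilk test, whose null hypothesis is that the residuals are normally distributed. A sketch, assuming fit is the fitted ARIMA(1,0,5) object.

Code
# Direct normality check on the residuals of the fitted model.
res <- residuals(fit)
qqnorm(res)
qqline(res, col = "red")
shapiro.test(as.numeric(res))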

1.5 Auto.arima fitting

Now try the auto.arima function to fit an ARIMA model to this time series:

Code
auto.arima(temp.ts,seasonal = FALSE)
Series: temp.ts 
ARIMA(2,0,2) with non-zero mean 

Coefficients:
          ar1      ar2     ma1     ma2       mean
      -0.9938  -0.3999  1.4020  0.9055  20102.877
s.e.   0.1172   0.1118  0.0735  0.0719    219.415

sigma^2 = 3746572:  log likelihood = -1283.58
AIC=2579.16   AICc=2579.77   BIC=2596.93

auto.arima chooses (p,d,q) = (2,0,2), which differs from the model we selected in the last part. Judging by AIC and BIC, this model is not as good as ARIMA(1,0,5) for our time series. A likely reason is that a seasonal component is present in the data, which misleads auto.arima when it is restricted to a non-seasonal ARIMA; the default stepwise search can also settle on a suboptimal model. In fact, a SARIMA model is likely to fit these data better than a plain ARIMA, and I will discuss this later.
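
One way to check how much of the disagreement comes from auto.arima's default stepwise search, rather than from the seasonality itself, is to rerun it with an exhaustive search. A sketch using standard auto.arima arguments:

Code
# Turn off the stepwise shortcut and the likelihood approximation so that
# auto.arima evaluates the non-seasonal (p, q) grid more thoroughly.
auto.arima(temp.ts, seasonal = FALSE, stepwise = FALSE, approximation = FALSE)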

1.6 Forecasting

Code
fit %>% forecast(h=12) %>% autoplot()

This plot shows the forecast for the next three months (12 weeks), and it looks reasonable.

Code
fit %>% forecast(h=52) %>% autoplot()

This plot shows the forecast for the next full year. From it, we can clearly see that this ARIMA model does not capture the seasonal fluctuations.

1.7 Benchmark comparison

In this part, I compare the ARIMA model with several benchmark methods, namely the mean, naive, seasonal naive (snaive), and drift methods, using MAE and RMSE.

Use the first 120 observations as the training set and the last 23 observations as the test set, then refit the ARIMA model on the training set.

Code
train = ts(A_sales$avg[1:120], start=decimal_date(as.Date("2010-02-05")), frequency = 365.25/7)
test=temp.ts[121:143]
train_fit=Arima(train, order=c(1,0,5),include.drift = TRUE) 
summary(train_fit)
Series: train 
ARIMA(1,0,5) with drift 

Coefficients:
         ar1      ma1      ma2      ma3     ma4      ma5   intercept   drift
      0.7227  -0.3615  -0.1676  -0.2783  0.4447  -0.6373  19833.9447  4.7891
s.e.  0.0846   0.1024   0.0771   0.0738  0.1058   0.0899    175.1524  2.7894

sigma^2 = 3720135:  log likelihood = -1077.18
AIC=2172.36   AICc=2174   BIC=2197.45

Training set error measures:
                   ME     RMSE      MAE        MPE     MAPE     MASE
Training set 5.740623 1863.364 1262.993 -0.6045651 5.981074 1.939874
                    ACF1
Training set -0.02798821
Code
pred=forecast(train_fit,23)
res=pred$mean-test

pred_mean=meanf(train, h=23)
pred_naive=naive(train, h=23)
pred_snaive=snaive(train, h=23)
pred_drift=rwf(train, h=23, drift=TRUE)   # drift benchmark fit on the training set, like the other benchmarks

res_mean=pred_mean$mean-test
res_naive=pred_naive$mean-test
res_snaive=pred_snaive$mean-test
res_drift=pred_drift$mean-test

mae_fit=mean(abs(res))
mae_mean=mean(abs(res_mean))
mae_naive=mean(abs(res_naive))
mae_snaive=mean(abs(res_snaive))
mae_drift=mean(abs(res_drift))

rmse_fit=sqrt(mean(res**2))
rmse_mean=sqrt(mean(res_mean**2))
rmse_naive=sqrt(mean(res_naive**2))
rmse_snaive=sqrt(mean(res_snaive**2))
rmse_drift=sqrt(mean(res_drift**2))

temp=data.frame(
  methods=c("ARIMA", "mean", "naive", "snaive", "drift"),
  MAE=c(mae_fit, mae_mean, mae_naive, mae_snaive, mae_drift),
  RMSE=c(rmse_fit, rmse_mean, rmse_naive, rmse_snaive, rmse_drift)
)
knitr::kable(temp)
methods MAE RMSE
ARIMA 778.2579 965.2251
mean 655.3043 819.5058
naive 657.3680 827.9891
snaive 706.8828 952.0468
drift 887.8810 1063.6860

From this result, we can see that the ARIMA model trained on the training set does not outperform the benchmarks on the test set: the mean, naive, and snaive methods all achieve lower MAE and RMSE here.
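
As a cross-check, the forecast package's accuracy() function reports the same test-set metrics directly when the held-out observations are passed as its second argument. A sketch, assuming the forecast objects from the chunk above are still in scope.

Code
# Training- and test-set error measures in one call; the "Test set" row
# corresponds to the MAE/RMSE values computed manually above.
accuracy(pred, test)[, c("MAE", "RMSE")]
accuracy(pred_mean, test)[, c("MAE", "RMSE")]
accuracy(pred_snaive, test)[, c("MAE", "RMSE")]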

The following graph displays the forecasts of the ARIMA model and the benchmark methods for the next half year.

Code
autoplot(temp.ts) +
  autolayer(meanf(temp.ts, h=26),
            series="Mean", PI=FALSE) +
  autolayer(naive(temp.ts, h=26),
            series="Naïve", PI=FALSE) +
  autolayer(snaive(temp.ts, h=26),
            series="SNaïve", PI=FALSE)+
  autolayer(rwf(temp.ts, h=26, drift=TRUE),
            series="Drift", PI=FALSE)+
  autolayer(forecast(fit,26), 
            series="fit",PI=FALSE) +
  guides(colour=guide_legend(title="Forecast"))

There are several possible reasons why this ARIMA model does not beat the benchmark methods. First, the training set is relatively small, spanning barely two years of data. Second, the series has a yearly seasonal component, and a plain ARIMA model cannot capture the underlying seasonal pattern well. In further analysis, a SARIMA model can be used to improve performance.

2 SARIMA Modeling

Now, let’s go back to review this ACF plot again.

Code
ggAcf(temp.ts,150)+ggtitle("ACF graph of the weekly sales")

In this plot, a clearly significant spike appears at lag 52, which is exactly the number of weeks in a year. This suggests strong seasonal effects in the series, so it is appropriate to fit a SARIMA model, which takes the seasonal component into account, rather than the plain ARIMA model fitted in the section above.

Additionally, to analyze Walmart's sales more comprehensively, I will also consider the weekly sales of type B and type C stores later.

2.1 Differencing and seasonal differencing

Apply first order differencing to the time series and plot:

Code
temp.ts %>% diff() %>% ggtsdisplay()

The ACF and PACF of the differenced series suggest it is reasonably stationary.

Now try to apply a seasonal differencing where the seasonal lag is 52.

Code
temp.ts %>% diff(lag=52) %>% ggtsdisplay()

The ACF and PACF after the seasonal difference indicate much better stationarity than first-order differencing alone. Thus, for the parameters of the SARIMA model, I will consider:
d=0,1
D=1
p=0,1
P=1,2
q=0,1
Q=1,2
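
A common rough numerical check of the same point: if the seasonal pattern dominates, the series should vary less after a seasonal difference than after a plain first difference. A sketch, assuming temp.ts is the weekly-sales series.

Code
# Standard deviation of the series under the candidate differencing schemes.
sd(diff(temp.ts))                    # first difference only
sd(diff(temp.ts, lag = 52))          # seasonal difference at lag 52
sd(diff(diff(temp.ts, lag = 52)))    # seasonal difference followed by a first difference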

2.2 Fit and choose the model

Code
i=1
ls=matrix(rep(NA,9*12),nrow=12)   # rows hold p, d, q, P, D, Q and the three criteria for each successfully fitted model

for (p in 0:1)
{
  for(q in 0:1)
  {
    for(P in 1:2)
    {
      for(Q in 1:2)
        {
        for(d in 0:1){
          model<- try(Arima(temp.ts,order=c(p,d,q),seasonal=c(P,1,Q)),silent = TRUE)
          if('try-error' %in% class(model)){
            next
            }else{
            ls[i,]= c(p,d,q,P,1,Q,model$aic,model$bic,model$aicc)
            i=i+1
          }
        }
      }
    }
  }
}

temp= as.data.frame(ls)
names(temp)= c("p","d","q","P","D","Q","AIC","BIC","AICc")

#temp
knitr::kable(temp)
p d q P D Q AIC BIC AICc
0 0 0 1 1 1 1480.785 1488.317 1481.061
0 1 0 1 1 1 1506.566 1514.065 1506.845
0 0 0 1 1 2 1482.785 1492.828 1483.250
0 1 0 1 1 2 1508.566 1518.565 1509.036
0 0 1 1 1 1 1480.143 1490.186 1480.608
0 1 1 1 1 1 1463.849 1473.848 1464.320
0 0 1 1 1 2 1482.143 1494.697 1482.849
0 1 1 1 1 2 1465.849 1478.348 1466.563
1 1 0 1 1 1 1491.692 1501.691 1492.163
1 1 0 1 1 2 1493.692 1506.191 1494.406
1 0 1 1 1 1 1482.677 1495.232 1483.383
1 0 1 1 1 2 1484.514 1499.579 1485.514
  p d q P D Q      AIC      BIC    AICc
6 0 1 1 1 1 1 1463.849 1473.848 1464.32

We can see that the model with parameters (p,d,q)(P,D,Q) = (0,1,1)(1,1,1) has the minimum AIC, BIC, and AICc. Therefore, I choose this model as the best model.

Here is the summary of the SARIMA(0,1,1)(1,1,1)[52] model.

Code
fit=Arima(temp.ts,order=c(0,1,1),seasonal=c(1,1,1))
summary(fit)
Series: temp.ts 
ARIMA(0,1,1)(1,1,1)[52] 

Coefficients:
          ma1      sar1      sma1
      -0.9291   -0.3302   -0.0861
s.e.   0.0434  118.5891  141.9681

sigma^2 = 566412:  log likelihood = -727.92
AIC=1463.85   AICc=1464.32   BIC=1473.85

Training set error measures:
                   ME     RMSE      MAE       MPE     MAPE      MASE      ACF1
Training set 68.27586 587.0266 335.8448 0.3596146 1.642288 0.5716048 0.0539026

Model diagnostic:

Code
checkresiduals(fit)


    Ljung-Box test

data:  Residuals from ARIMA(0,1,1)(1,1,1)[52]
Q* = 18.305, df = 26, p-value = 0.8644

Model df: 3.   Total lags used: 29

According to the Ljung-Box test, the residuals show no significant autocorrelation, which suggests the SARIMA model fits this time series well.

2.3 Auto.arima fitting

Now try the auto.arima function to fit a SARIMA model to this time series:

Code
auto.arima(temp.ts,seasonal = TRUE)
Series: temp.ts 
ARIMA(0,1,2)(0,1,0)[52] 

Coefficients:
          ma1      ma2
      -0.7677  -0.1576
s.e.   0.1062   0.1043

sigma^2 = 647091:  log likelihood = -729.61
AIC=1465.21   AICc=1465.49   BIC=1472.71

The auto.arima function selects a SARIMA(0,1,2)(0,1,0)[52] model, which differs from the SARIMA(0,1,1)(1,1,1)[52] we chose above. Its AIC and AICc are larger than those of our chosen model, while its BIC is slightly smaller because it has fewer parameters. In fact, both models perform reasonably well and their AIC, BIC, and AICc values are very close. The auto.arima result here makes much more sense than in the ARIMA section earlier on this page.

2.4 Forecasting

Code
fit %>% forecast(h=52) %>% autoplot()

This is the forecast for the next full year with our SARIMA model. The model clearly captures the seasonal effect.

2.5 Benchmark comparison

As in the ARIMA section, I compare the SARIMA model with the mean, naive, snaive, and drift benchmark methods using MAE and RMSE.

Use the first 120 observations as the training set and the last 23 observations as the test set, then refit the SARIMA model on the training set.

Code
train = ts(A_sales$avg[1:120], start=decimal_date(as.Date("2010-02-05")), frequency = 365.25/7)
test=temp.ts[121:143]
train_fit=Arima(train,order=c(0,1,1),seasonal=c(1,1,1))
summary(train_fit)
Series: train 
ARIMA(0,1,1)(1,1,1)[52] 

Coefficients:
          ma1     sar1     sma1
      -0.9149  -0.3009  -0.0224
s.e.   0.0512      NaN      NaN

sigma^2 = 724956:  log likelihood = -549.3
AIC=1106.6   AICc=1107.24   BIC=1115.41

Training set error measures:
                   ME     RMSE      MAE      MPE     MAPE      MASE      ACF1
Training set 83.15758 621.8062 338.2122 0.443131 1.647231 0.5194715 0.1017468
Code
pred=forecast(train_fit,23)
res=pred$mean-test

pred_mean=meanf(train, h=23)
pred_naive=naive(train, h=23)
pred_snaive=snaive(train, h=23)
pred_drift=rwf(train, h=23, drift=TRUE)   # drift benchmark fit on the training set, like the other benchmarks

res_mean=pred_mean$mean-test
res_naive=pred_naive$mean-test
res_snaive=pred_snaive$mean-test
res_drift=pred_drift$mean-test

mae_fit=mean(abs(res))
mae_mean=mean(abs(res_mean))
mae_naive=mean(abs(res_naive))
mae_snaive=mean(abs(res_snaive))
mae_drift=mean(abs(res_drift))

rmse_fit=sqrt(mean(res**2))
rmse_mean=sqrt(mean(res_mean**2))
rmse_naive=sqrt(mean(res_naive**2))
rmse_snaive=sqrt(mean(res_snaive**2))
rmse_drift=sqrt(mean(res_drift**2))

temp=data.frame(
  methods=c("SARIMA", "mean", "naive", "snaive", "drift"),
  MAE=c(mae_fit, mae_mean, mae_naive, mae_snaive, mae_drift),
  RMSE=c(rmse_fit, rmse_mean, rmse_naive, rmse_snaive, rmse_drift)
)
knitr::kable(temp)
methods MAE RMSE
SARIMA 386.7546 482.1954
mean 655.3043 819.5058
naive 657.3680 827.9891
snaive 706.8828 952.0468
drift 887.8810 1063.6860

According to this result table, SARIMA clearly outperforms all the benchmark methods: both its MAE and RMSE are far lower than those of any benchmark. This indicates that the SARIMA model captures the patterns in the data well.

The following graph displays the forecasts of the SARIMA model and the benchmark methods for the next half year.

Code
autoplot(temp.ts) +
  autolayer(meanf(temp.ts, h=26),
            series="Mean", PI=FALSE) +
  autolayer(naive(temp.ts, h=26),
            series="Naïve", PI=FALSE) +
  autolayer(snaive(temp.ts, h=26),
            series="SNaïve", PI=FALSE)+
  autolayer(rwf(temp.ts, h=26, drift=TRUE),
            series="Drift", PI=FALSE)+
  autolayer(forecast(fit,26), 
            series="fit",PI=FALSE) +
  guides(colour=guide_legend(title="Forecast"))

2.6 Cross-validation

In this part, I use time-series cross-validation (tsCV) with 1-step-ahead forecasts and with forecasts up to 52 steps ahead (the seasonal period of this data), and compare the resulting RMSE of our chosen SARIMA model and the auto.arima model.

Code
set.seed(1001)

farima1 <- function(x, h){forecast(Arima(x, order=c(0,1,1),seasonal = c(1,1,1)), h=h)}
farima2 <- function(x, h){forecast(Arima(x, order=c(0,1,2),seasonal = c(0,1,0)), h=h)}

e1 <- tsCV(temp.ts, farima1, h=1)
e2 <- tsCV(temp.ts, farima2, h=1)
RMSE1=sqrt(mean(e1^2, na.rm=TRUE))
RMSE2=sqrt(mean(e2^2, na.rm=TRUE))

e3 <- tsCV(temp.ts, farima1, h=52)
e4 <- tsCV(temp.ts, farima2, h=52)
RMSE3=sqrt(mean(e3^2, na.rm=TRUE))
RMSE4=sqrt(mean(e4^2, na.rm=TRUE))
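
Note that when h > 1, tsCV() returns a matrix of errors with one column per forecast horizon, so RMSE3 and RMSE4 above pool the errors over horizons 1 to 52. If the strictly 52-step-ahead error is wanted instead, it can be pulled from the last column; a sketch using the objects computed above.

Code
# Column 52 contains the errors of forecasts made exactly 52 weeks ahead.
RMSE3_h52 <- sqrt(mean(e3[, 52]^2, na.rm = TRUE))
RMSE4_h52 <- sqrt(mean(e4[, 52]^2, na.rm = TRUE))
c(chosen = RMSE3_h52, auto = RMSE4_h52)
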
Code
temp=data.frame(
  Models=c("Chosen model", "Auto model"),
  step_1=c(RMSE1, RMSE2),
  step_52=c(RMSE3, RMSE4)
)
knitr::kable(temp)
Models step_1 step_52
Chosen model 829.5337 713.0525
Auto model 860.5003 926.7938

This RMSE table shows that our chosen SARIMA(0,1,1)(1,1,1)[52] model has a smaller RMSE for both the 1-step and the 52-step cross-validation, which means it performs better than the model selected by auto.arima.

2.7 Type B and C stores

In this part, I fit SARIMA models to the weekly sales of type B and type C stores. First, let's have a quick look at these two time series.
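
The plotting code is not shown here; a minimal sketch that produces the two views, assuming temp.ts.b and temp.ts.c are the type B and type C weekly-sales ts objects used in the model-fitting code below.

Code
# Quick visual comparison of the two series modeled in this part.
autoplot(temp.ts.b) + ggtitle("Weekly sales of type B stores")
autoplot(temp.ts.c) + ggtitle("Weekly sales of type C stores")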

We can notice that the sales pattern of type B stores is very similar to that of type A stores, while the weekly sales of type C stores look quite different.

Now, fit different SARIMA models for the type B and type C stores and choose the best models based on AIC, BIC, and AICc.

Code
i=1
ls=matrix(rep(NA,9*32),nrow=32) 

for (p in 0:1)
{
  for(q in 0:1)
  {
    for(P in 0:1)
    {
      for(Q in 0:1)
        {
        for(d in 0:1){
          model<- try(Arima(temp.ts.b,order=c(p,d,q),seasonal=c(P,1,Q)),silent = TRUE)
          if('try-error' %in% class(model)){
            next
            }else{
            ls[i,]= c(p,d,q,P,1,Q,model$aic,model$bic,model$aicc)
            i=i+1
          }
        }
      }
    }
  }
}

temp= as.data.frame(ls)
names(temp)= c("p","d","q","P","D","Q","AIC","BIC","AICc")
   p d q P D Q      AIC      BIC     AICc
14 0 1 1 1 1 0 1402.548 1410.047 1402.827

For type B stores, the best model, with minimum AIC, BIC, and AICc, is SARIMA(0,1,1)(1,1,0)[52]. Its orders are almost the same as those chosen for type A stores, except for Q.

Code
fit=Arima(temp.ts.b,order=c(0,1,1),seasonal=c(1,1,0))
fit %>% forecast(h=52) %>% autoplot()

The forecast above shows that the model captures the seasonal effect well.

Code
i=1
ls=matrix(rep(NA,9*32),nrow=32) 

for (p in 0:1)
{
  for(q in 0:1)
  {
    for(P in 0:1)
    {
      for(Q in 0:1)
        {
        for(d in 0:1){
          model<- try(Arima(temp.ts.c,order=c(p,d,q),seasonal=c(P,1,Q)),silent = TRUE)
          if('try-error' %in% class(model)){
            next
            }else{
            ls[i,]= c(p,d,q,P,1,Q,model$aic,model$bic,model$aicc)
            i=i+1
          }
        }
      }
    }
  }
}

temp= as.data.frame(ls)
names(temp)= c("p","d","q","P","D","Q","AIC","BIC","AICc")
   p d q P D Q      AIC      BIC     AICc
14 0 1 1 1 1 0 1259.452 1266.952 1259.731

As for type C stores, the selected orders are exactly the same as for type B stores: the optimal model is again SARIMA(0,1,1)(1,1,0)[52].

Code
fit=Arima(temp.ts.c,order=c(0,1,1),seasonal=c(1,1,0))
fit %>% forecast(h=52) %>% autoplot()

In conclusion, although type A, B, and C stores differ in size, the SARIMA models fitted to their sales have similar orders and the same seasonal period. Moreover, even though the weekly-sales patterns of type B and C stores look quite different, the final models selected for these two store types by AIC, BIC, and AICc have exactly the same orders, which suggests that the underlying dynamics and seasonal components are actually similar.