Complete Exercise 6 at the end of Chapter 6 in the textbook. Then, write a paper that reconstructs a time series forecast. Cite at least one outside source to answer the last part of the assignment, in which you are asked to compare forecasts from stlf() with those from snaive(), using a test set comparing the last two years of data. Which is better? Also, what errors discussed in this module, if any, could occur when creating this forecast?
Your paper should be written in narrative form in APA style.
Exercise 6
We will use the bricksq data (Australian quarterly clay brick production. 1956–1994) for this exercise.
a. Use an STL decomposition to calculate the trend-cycle and seasonal indices. (Experiment with having fixed or changing seasonality.)
b. Compute and plot the seasonally adjusted data.
c. Use a naïve method to produce forecasts of the seasonally adjusted data.
d. Use stlf() to reseasonalise the results, giving forecasts for the original data.
e. Do the residuals look uncorrelated?
f. Repeat with a robust STL decomposition. Does it make much difference?
g. Compare forecasts from stlf() with those from snaive(), using a test set comprising the last 2 years of data. Which is better?
Exercise answers below:
We will use the bricksq data (Australian quarterly clay brick production. 1956-1994) for this exercise. 6a.Use an STL decomposition to calculate the trend-cycle and seasonal indices. (Experiment with having fixed or changing seasonality.)
bricksq %>%
stl(t.window=30, s.window="periodic", robust=TRUE) %>%
autoplot()
bricksq %>%
stl(t.window=30, s.window=7, robust=TRUE) %>%
autoplot()
bricksq %>%
stl(t.window=30, s.window=50, robust=TRUE) %>%
autoplot()
6b.Compute and plot the seasonally adjusted data.
bricksq %>%
stl(t.window=30, s.window="periodic", robust=FALSE) %>% seasadj() %>%
autoplot()
6c.Use a naïve method to produce forecasts of the seasonally adjusted data.
ft_stl <- stl(bricksq, t.window=30, s.window="periodic", robust=FALSE) ft_stl %>% seasadj() %>% naive() %>% autoplot() + ylab("New orders index") +
ggtitle("Naive forecasts of seasonally adjusted data")
6d.Use stlf() to reseasonalise the results, giving forecasts for the original data.
ft_stlf <- stlf(bricksq, method='naive')
forecast(ft_stlf,h=8)
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
1994 Q4 465.7338 440.1878 491.2798 426.6645 504.8030
1995 Q1 424.6739 388.5464 460.8014 369.4217 479.9261
1995 Q2 483.9926 439.7456 528.2395 416.3227 551.6624
1995 Q3 494.0000 442.9080 545.0920 415.8616 572.1384
1995 Q4 465.7338 408.6112 522.8563 378.3723 553.0952
1996 Q1 424.6739 362.0993 487.2485 328.9742 520.3736
1996 Q2 483.9926 416.4042 551.5809 380.6251 587.3600
1996 Q3 494.0000 421.7450 566.2550 383.4956 604.5044
6e.Do the residuals look uncorrelated?
checkresiduals(ft_stlf)
Warning in checkresiduals(ft_stlf): The fitted degrees of freedom is based
on the model used for the seasonally adjusted data.
Ljung-Box test
data: Residuals from STL + Random walk
Q* = 40.829, df = 8, p-value = 2.244e-06
Model df: 0. Total lags used: 8
at the quater of 4 and 8, there does show slight correlation. Howevere, this is acceptable for a naive forecast.
6f.Repeat with a robust STL decomposition. Does it make much difference?
ft_stl2 <- stl(bricksq, t.window=30, s.window="periodic", robust=TRUE) ft_stl2 %>% seasadj() %>% naive() %>% autoplot() + ylab("New orders index") +
ggtitle("Naive forecasts of seasonally adjusted data,r")
forecast(ft_stl2, h=8)
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
1994 Q4 470.9904 436.6159 505.3649 418.4191 523.5617
1995 Q1 437.2186 389.5246 484.9127 364.2768 510.1605
1995 Q2 479.5628 421.5014 537.6241 390.7656 568.3600
1995 Q3 493.6721 426.8080 560.5362 391.4123 595.9319
1995 Q4 470.9904 396.3326 545.6482 356.8111 585.1696
1996 Q1 437.2186 355.4869 518.9504 312.2208 562.2165
1996 Q2 479.5628 391.3036 567.8219 344.5820 614.5435
1996 Q3 493.6721 399.3184 588.0258 349.3706 637.9736
robust or not does not significantly influence the result.
6g.Compare forecasts from stlf() with those from snaive(), using a test set comprising the last 2 years of data. Which is better?
bricksq
Qtr1 Qtr2 Qtr3 Qtr4
1956 189 204 208 197
1957 187 214 227 223
1958 199 229 249 234
1959 208 253 267 255
1960 242 268 290 277
1961 241 253 265 236
1962 229 265 275 258
1963 231 263 308 313
1964 293 328 349 340
1965 309 349 366 340
1966 302 350 362 337
1967 326 358 359 357
1968 341 380 404 409
1969 383 417 454 428
1970 386 428 434 417
1971 385 433 453 436
1972 399 461 476 477
1973 452 461 534 516
1974 478 526 518 417
1975 340 437 459 449
1976 424 501 540 533
1977 457 513 522 478
1978 421 487 470 482
1979 458 526 573 563
1980 513 551 589 564
1981 519 581 581 578
1982 500 560 512 412
1983 303 409 420 413
1984 400 469 482 484
1985 447 507 533 503
1986 443 503 505 443
1987 415 485 495 458
1988 427 519 555 539
1989 511 572 570 526
1990 472 524 497 460
1991 373 436 424 430
1992 387 413 451 420
1993 394 462 476 443
1994 421 472 494
bricksq.train<-window(bricksq, end=c(1992,3))
bricksq.test<-window(bricksq, start=c(1992,4), end=c(1994,3))
bricksq.train
Qtr1 Qtr2 Qtr3 Qtr4
1956 189 204 208 197
1957 187 214 227 223
1958 199 229 249 234
1959 208 253 267 255
1960 242 268 290 277
1961 241 253 265 236
1962 229 265 275 258
1963 231 263 308 313
1964 293 328 349 340
1965 309 349 366 340
1966 302 350 362 337
1967 326 358 359 357
1968 341 380 404 409
1969 383 417 454 428
1970 386 428 434 417
1971 385 433 453 436
1972 399 461 476 477
1973 452 461 534 516
1974 478 526 518 417
1975 340 437 459 449
1976 424 501 540 533
1977 457 513 522 478
1978 421 487 470 482
1979 458 526 573 563
1980 513 551 589 564
1981 519 581 581 578
1982 500 560 512 412
1983 303 409 420 413
1984 400 469 482 484
1985 447 507 533 503
1986 443 503 505 443
1987 415 485 495 458
1988 427 519 555 539
1989 511 572 570 526
1990 472 524 497 460
1991 373 436 424 430
1992 387 413 451
bricksq.test
Qtr1 Qtr2 Qtr3 Qtr4
1992 420
1993 394 462 476 443
1994 421 472 494
bricksq_naive<-snaive(bricksq.train)
bricksq_stlf<-stlf(bricksq.train)
bricksq.fc_naive<-forecast(bricksq_naive,h=8)
bricksq.fc_stlf<-forecast(bricksq_stlf,h=8)
autoplot(bricksq, series = "Original data") +
geom_line(size = 1) +
autolayer(bricksq.fc_naive, PI = FALSE, size = 1,
series = "stlf") +
autolayer(bricksq.fc_stlf, PI = FALSE, size = 1,
series = "snaive") +
scale_color_manual(values = c("gray50", "blue", "red"),
breaks = c("Original data", "stlf", "snaive")) +
scale_x_continuous(limits = c(1990, 1994.5)) +
guides(colour = guide_legend(title = "Data")) +
ggtitle("Forecast from stlf and snaive functions") +
annotate(
"rect",
xmin=1992.75,xmax=1994.5,ymin=-Inf,ymax=Inf,
fill="lightgreen",alpha = 0.3
)