Calculating the trend-cycle and seasonal indices.

Complete Exercise 6 at the end of Chapter 6 in the textbook. Then, write a paper that reconstructs a time series forecast. Cite at least one outside source to answer the last part of the assignment, in which you are asked to compare forecasts from stlf() with those from snaive(), using a test set comparing the last two years of data. Which is better? Also, what errors discussed in this module, if any, could occur when creating this forecast?
Your paper should be written in narrative form in APA style.
Exercise 6
We will use the bricksq data (Australian quarterly clay brick production. 1956–1994) for this exercise.
a. Use an STL decomposition to calculate the trend-cycle and seasonal indices. (Experiment with having fixed or changing seasonality.)
b. Compute and plot the seasonally adjusted data.
c. Use a naïve method to produce forecasts of the seasonally adjusted data.
d. Use stlf() to reseasonalise the results, giving forecasts for the original data.
e. Do the residuals look uncorrelated?
f. Repeat with a robust STL decomposition. Does it make much difference?
g. Compare forecasts from stlf() with those from snaive(), using a test set comprising the last 2 years of data. Which is better?
Exercise answers below:
We will use the bricksq data (Australian quarterly clay brick production. 1956-1994) for this exercise. 6a.Use an STL decomposition to calculate the trend-cycle and seasonal indices. (Experiment with having fixed or changing seasonality.)
bricksq %>%
stl(t.window=30, s.window="periodic", robust=TRUE) %>%
autoplot()

bricksq %>%
stl(t.window=30, s.window=7, robust=TRUE) %>%
autoplot()

bricksq %>%
stl(t.window=30, s.window=50, robust=TRUE) %>%
autoplot()

6b.Compute and plot the seasonally adjusted data.
bricksq %>%
stl(t.window=30, s.window="periodic", robust=FALSE) %>% seasadj() %>%
autoplot()

6c.Use a naïve method to produce forecasts of the seasonally adjusted data.
ft_stl <- stl(bricksq, t.window=30, s.window="periodic", robust=FALSE) ft_stl %>% seasadj() %>% naive() %>% autoplot() + ylab("New orders index") +
ggtitle("Naive forecasts of seasonally adjusted data")

6d.Use stlf() to reseasonalise the results, giving forecasts for the original data.
ft_stlf <- stlf(bricksq, method='naive')
forecast(ft_stlf,h=8)

Point Forecast Lo 80 Hi 80 Lo 95 Hi 95

1994 Q4 465.7338 440.1878 491.2798 426.6645 504.8030

1995 Q1 424.6739 388.5464 460.8014 369.4217 479.9261

1995 Q2 483.9926 439.7456 528.2395 416.3227 551.6624

1995 Q3 494.0000 442.9080 545.0920 415.8616 572.1384

1995 Q4 465.7338 408.6112 522.8563 378.3723 553.0952

1996 Q1 424.6739 362.0993 487.2485 328.9742 520.3736

1996 Q2 483.9926 416.4042 551.5809 380.6251 587.3600

1996 Q3 494.0000 421.7450 566.2550 383.4956 604.5044

6e.Do the residuals look uncorrelated?
checkresiduals(ft_stlf)

Warning in checkresiduals(ft_stlf): The fitted degrees of freedom is based

on the model used for the seasonally adjusted data.

Ljung-Box test

data: Residuals from STL + Random walk

Q* = 40.829, df = 8, p-value = 2.244e-06

Model df: 0. Total lags used: 8

at the quater of 4 and 8, there does show slight correlation. Howevere, this is acceptable for a naive forecast.
6f.Repeat with a robust STL decomposition. Does it make much difference?
ft_stl2 <- stl(bricksq, t.window=30, s.window="periodic", robust=TRUE) ft_stl2 %>% seasadj() %>% naive() %>% autoplot() + ylab("New orders index") +
ggtitle("Naive forecasts of seasonally adjusted data,r")

forecast(ft_stl2, h=8)

Point Forecast Lo 80 Hi 80 Lo 95 Hi 95

1994 Q4 470.9904 436.6159 505.3649 418.4191 523.5617

1995 Q1 437.2186 389.5246 484.9127 364.2768 510.1605

1995 Q2 479.5628 421.5014 537.6241 390.7656 568.3600

1995 Q3 493.6721 426.8080 560.5362 391.4123 595.9319

1995 Q4 470.9904 396.3326 545.6482 356.8111 585.1696

1996 Q1 437.2186 355.4869 518.9504 312.2208 562.2165

1996 Q2 479.5628 391.3036 567.8219 344.5820 614.5435

1996 Q3 493.6721 399.3184 588.0258 349.3706 637.9736

robust or not does not significantly influence the result.
6g.Compare forecasts from stlf() with those from snaive(), using a test set comprising the last 2 years of data. Which is better?
bricksq

Qtr1 Qtr2 Qtr3 Qtr4

1956 189 204 208 197

1957 187 214 227 223

1958 199 229 249 234

1959 208 253 267 255

1960 242 268 290 277

1961 241 253 265 236

1962 229 265 275 258

1963 231 263 308 313

1964 293 328 349 340

1965 309 349 366 340

1966 302 350 362 337

1967 326 358 359 357

1968 341 380 404 409

1969 383 417 454 428

1970 386 428 434 417

1971 385 433 453 436

1972 399 461 476 477

1973 452 461 534 516

1974 478 526 518 417

1975 340 437 459 449

1976 424 501 540 533

1977 457 513 522 478

1978 421 487 470 482

1979 458 526 573 563

1980 513 551 589 564

1981 519 581 581 578

1982 500 560 512 412

1983 303 409 420 413

1984 400 469 482 484

1985 447 507 533 503

1986 443 503 505 443

1987 415 485 495 458

1988 427 519 555 539

1989 511 572 570 526

1990 472 524 497 460

1991 373 436 424 430

1992 387 413 451 420

1993 394 462 476 443

1994 421 472 494

bricksq.train<-window(bricksq, end=c(1992,3))
bricksq.test<-window(bricksq, start=c(1992,4), end=c(1994,3))
bricksq.train

Qtr1 Qtr2 Qtr3 Qtr4

1956 189 204 208 197

1957 187 214 227 223

1958 199 229 249 234

1959 208 253 267 255

1960 242 268 290 277

1961 241 253 265 236

1962 229 265 275 258

1963 231 263 308 313

1964 293 328 349 340

1965 309 349 366 340

1966 302 350 362 337

1967 326 358 359 357

1968 341 380 404 409

1969 383 417 454 428

1970 386 428 434 417

1971 385 433 453 436

1972 399 461 476 477

1973 452 461 534 516

1974 478 526 518 417

1975 340 437 459 449

1976 424 501 540 533

1977 457 513 522 478

1978 421 487 470 482

1979 458 526 573 563

1980 513 551 589 564

1981 519 581 581 578

1982 500 560 512 412

1983 303 409 420 413

1984 400 469 482 484

1985 447 507 533 503

1986 443 503 505 443

1987 415 485 495 458

1988 427 519 555 539

1989 511 572 570 526

1990 472 524 497 460

1991 373 436 424 430

1992 387 413 451

bricksq.test

Qtr1 Qtr2 Qtr3 Qtr4

1992 420

1993 394 462 476 443

1994 421 472 494

bricksq_naive<-snaive(bricksq.train)
bricksq_stlf<-stlf(bricksq.train)
bricksq.fc_naive<-forecast(bricksq_naive,h=8)
bricksq.fc_stlf<-forecast(bricksq_stlf,h=8)

autoplot(bricksq, series = "Original data") +
geom_line(size = 1) +
autolayer(bricksq.fc_naive, PI = FALSE, size = 1,
series = "stlf") +
autolayer(bricksq.fc_stlf, PI = FALSE, size = 1,
series = "snaive") +
scale_color_manual(values = c("gray50", "blue", "red"),
breaks = c("Original data", "stlf", "snaive")) +
scale_x_continuous(limits = c(1990, 1994.5)) +
guides(colour = guide_legend(title = "Data")) +
ggtitle("Forecast from stlf and snaive functions") +
annotate(
"rect",
xmin=1992.75,xmax=1994.5,ymin=-Inf,ymax=Inf,
fill="lightgreen",alpha = 0.3
)

Scale for 'x' is already present. Adding another scale for 'x', which

will replace the existing scale.

Warning: Removed 136 rows containing missing values (geom_path).