Components of time series
# Simulate a purely irregular (white noise) series and plot it
u=rnorm(100)
plot.ts(u)
Exponential smoothing is typically used when the data contains only an irregular component and for short-term predictions. In the moving average method of prediction, equal weights are assigned to the last \(k\) observations. During forecasting, however, the most recent values usually carry the most information, so it is sensible to give higher weight to recent values and lower weight to older observations. Simple Exponential Smoothing (SES) methods apply exponentially decreasing weights as observations become older.
Let \(Y_t\) represent the observed time series, and \(F_t\) represent the forecast at time \(t\). Thus, the forecast for time \(t\) using the SES method is given by: \(\begin{eqnarray} F_t&=&\alpha Y_{t-1}+(1-\alpha)F_{t-1}\\ &=& F_{t-1}+\alpha\left(Y_{t-1}-F_{t-1}\right) \end{eqnarray}\)
where \(\alpha\) is a constant lying in the interval \((0,1)\). Thus, the predicted value is the sum of the last predicted value and a correction factor, which adjusts for the difference between the last observed and predicted value. This self-adjusting process automatically corrects for forecasting errors.
Now, let’s explore why it is called exponential smoothing. Substituting the forecast equation into itself repeatedly shows that the weights attached to past observations decrease exponentially with their age, which is what makes the method emphasize recent observations. \(\begin{eqnarray} F_t&=&\alpha Y_{t-1}+(1-\alpha)F_{t-1}= \alpha Y_{t-1}+(1-\alpha)\left[\alpha Y_{t-2}+(1-\alpha)F_{t-2}\right]\\ &=& \alpha Y_{t-1}+(1-\alpha)\alpha Y_{t-2}+(1-\alpha)^2F_{t-2}=\alpha Y_{t-1}+(1-\alpha)\alpha Y_{t-2}+(1-\alpha)^2\alpha Y_{t-3}+(1-\alpha)^3F_{t-3}\\&=& \alpha Y_{t-1}+(1-\alpha)\alpha Y_{t-2}+(1-\alpha)^2\alpha Y_{t-3}+(1-\alpha)^3 \alpha Y_{t-4}+\cdots+(1-\alpha)^{t-1}F_1 \end{eqnarray}\)
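To see the decay concretely, here is a short sketch (the value \(\alpha=0.3\) is an arbitrary choice for illustration) computing the weight \(\alpha(1-\alpha)^{k-1}\) attached to the observation \(k\) periods back:

```r
# Weight attached to Y_{t-k} in the expanded SES forecast
alpha <- 0.3
k <- 1:10
w <- alpha * (1 - alpha)^(k - 1)
round(w, 4)
# Together with the weight (1 - alpha)^10 carried by the initial forecast F_1,
# the weights sum to exactly 1
sum(w) + (1 - alpha)^10
```

The weights form a geometric sequence, which is why larger values of \(\alpha\) discount old observations more aggressively.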
Now, in order to find an appropriate value of \(\alpha\), we use statistical accuracy measures. If \(Y_t\) is the actual observation for time period \(t\) and \(F_t\) is the forecast for the same period, then the error is defined as \(e_t = Y_t - F_t\). To account for the scale of the data, we also define the relative or percentage error: \[\begin{equation} PE_t=\dfrac{Y_t-F_t}{Y_t}\times 100 \end{equation}\]
The commonly used statistical accuracy measures are the Mean Absolute Error (MAE), the Mean Squared Error (MSE) and the Mean Absolute Percentage Error (MAPE).
We will choose \(\alpha\) in such a way that these accuracy measures are minimized.
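These three measures can be wrapped in a small helper function (a sketch; the name `accuracy_measures` is our own, not a standard routine):

```r
# Compute MAE, MSE and MAPE for observed values y and forecasts f
accuracy_measures <- function(y, f) {
  e  <- y - f            # errors e_t = Y_t - F_t
  pe <- e / y * 100      # percentage errors PE_t
  c(MAE  = mean(abs(e)),
    MSE  = mean(e^2),
    MAPE = mean(abs(pe)))
}
accuracy_measures(y = c(200, 135, 195), f = c(190, 150, 180))
```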
One practical issue with SES is which value to use as \(F_1\) when computing \(F_2\). The SES formula gives \(F_2=\alpha Y_1+(1-\alpha)F_1\). Since \(F_1\) is not known, we can use the first observed value as the first forecast, i.e. \(F_1 = Y_1\).
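The full recursion, including the initialisation \(F_1 = Y_1\), can be sketched as a small function (the name `ses_forecast` is our own):

```r
# One-step-ahead simple exponential smoothing forecasts, with F1 = Y1
ses_forecast <- function(y, alpha) {
  f <- numeric(length(y))
  f[1] <- y[1]                                   # F1 = Y1
  for (t in 2:length(y)) {
    # F_t = alpha * Y_{t-1} + (1 - alpha) * F_{t-1}
    f[t] <- alpha * y[t - 1] + (1 - alpha) * f[t - 1]
  }
  f
}
ses_forecast(c(200, 135, 195, 197.5), alpha = 0.2)
```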
Now we use R to forecast the value for December. As an illustration, we will compute the predicted values for all months starting from February using \(\alpha=0.1,0.2,0.5,0.7,0.9\).
data=read.csv('/home/stephy/Documents/class/Time Series/electricCar.csv')
head(data,5)
## Month TimePeriod ObservedValue
## 1 Jan 1 200.0
## 2 Feb 2 135.0
## 3 Mar 3 195.0
## 4 Apr 4 197.5
## 5 May 5 310.0
alpha=c(0.1,0.2,0.5,0.7,0.9)
for (j in 1:length(alpha)){
  # Add a forecast column named after the current alpha
  column=colnames(data)
  NewTitle=paste0('Predict',alpha[j])
  data[1,3+j]=data[1,3]              # initialise F1 = Y1
  colnames(data)=c(column,NewTitle)
  for (i in seq(2,12)){
    # F_t = alpha * Y_{t-1} + (1 - alpha) * F_{t-1}
    data[i,3+j]=alpha[j]*data[i-1,3]+(1-alpha[j])*data[i-1,3+j]
  }
}
data[12,]
## Month TimePeriod ObservedValue Predict0.1 Predict0.2 Predict0.5 Predict0.7
## 12 Dec 12 NA 205.5561 213.2169 233.9795 240.469
## Predict0.9
## 12 238.5878
Now we shall consider the statistical accuracy measures for the various values of \(\alpha\).
mae=c()
mse=c()
mape=c()
for (j in 1:length(alpha)){
  ae=0
  se=0
  ape=0
  # Accumulate errors over the ten months (Feb to Nov) with known observations
  for (i in 2:11){
    ae=ae+abs(data[i,3]-data[i,3+j])
    se=se+(data[i,3]-data[i,3+j])^2
    ape=ape+abs((data[i,3]-data[i,3+j])/data[i,3]*100)
  }
  mae=c(mae,ae/10)
  mse=c(mse,se/10)
  mape=c(mape,ape/10)
}
alpha
## [1] 0.1 0.2 0.5 0.7 0.9
mae
## [1] 47.75841 50.98881 56.93652 60.40820 61.31826
mse
## [1] 3438.332 3695.791 4347.237 4678.959 5039.368
mape
## [1] 24.58362 26.17299 29.20315 30.65223 30.81294
Now we will plot MAE, MSE and MAPE against \(\alpha\) values in \((0,1)\). As these quantities vary over quite different ranges, it is not wise to plot them on a single axis; hence we use subplots to plot these values.
mae=c()
mse=c()
mape=c()
alpha=seq(0,1,.001)
data[1,4]=data[1,3]        # initialise F1 = Y1 in the working column
for (j in alpha){
  ae=0
  se=0
  ape=0
  for (i in 2:11){
    # Recompute the SES forecasts in column 4 for the current alpha
    data[i,4]=j*data[i-1,3]+(1-j)*data[i-1,4]
    ae=ae+abs(data[i,3]-data[i,4])
    se=se+(data[i,3]-data[i,4])^2
    ape=ape+abs((data[i,3]-data[i,4])/data[i,3]*100)
  }
  mae=c(mae,ae/10)
  mse=c(mse,se/10)
  mape=c(mape,ape/10)
}
par(mfrow = c(1, 3))
plot(alpha,mae,type = "l", col = "green",xlab = expression(alpha), lwd=1,ylab = "MAE",main = expression(paste("MAE against various values of ", alpha)))
plot(alpha,mse,type = "l", col = "green",xlab = expression(alpha), lwd=1,ylab = "MSE",main = expression(paste("MSE against various values of ", alpha)))
plot(alpha,mape,type = "l", col = "green",xlab = expression(alpha), lwd=1,ylab = "MAPE",main = expression(paste("MAPE against various values of ", alpha)))
Here we can see that the accuracy measures increase with \(\alpha\), so the smallest value of \(\alpha\) minimizes them. A minimizing \(\alpha\) at (or near) zero suggests that the observations behave like independent random fluctuations around a constant level. When many future values must be forecast, exponential smoothing uses the one-step-ahead forecast for every horizon, i.e. \(F_{t+h}=F_{t+1},\;h=2,3,4,\cdots\)
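For example, if the last one-step-ahead SES forecast were 205.56 (a hypothetical value), every longer horizon would receive that same number:

```r
# SES produces a flat forecast function: F_{t+h} = F_{t+1} for all h
F_next <- 205.56                 # hypothetical one-step-ahead forecast
rep(F_next, 5)                   # forecasts for horizons h = 1, ..., 5
```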
Example 2: We shall consider predicting the rainfall in London using the annual rainfall data available at https://robjhyndman.com/tsdldata/hurst/precip1.dat
We first do exponential smoothing manually, using the \(\alpha\) which minimizes the MSE.
#rain <- scan("http://robjhyndman.com/tsdldata/hurst/precip1.dat",skip=1)
rain=read.table("/home/stephy/Documents/class/Time Series/raindata")
mse=c()
rain[1,2]=rain[1,1]              # initialise F1 = Y1
for (i in seq(0,1,0.001)){
  se=c()
  for (j in seq(2,dim(rain)[1])){
    # SES recursion: column 1 holds Y_t, column 2 holds F_t
    rain[j,2]=i*rain[j-1,1]+(1-i)*rain[j-1,2]
    se=c(se,(rain[j,2]-rain[j,1])^2)
  }
  mse=c(mse,sum(se)/dim(rain)[1])    # mean squared error for this alpha
}
Now we obtain the value of \(\alpha\) which minimizes the MSE, to three decimal places, as 0.024.
seq(0,1,0.001)[which.min(mse)]
## [1] 0.024
Now we will find the minimum of MSE.
min(mse)
## [1] 18.28856
rain=read.table("/home/stephy/Documents/class/Time Series/raindata")
rain[1,2]=rain[1,1]
# Objective for optimize(): the sum of squared one-step errors
# (its minimiser coincides with that of the MSE, since n is fixed)
mse<-function(alpha){
  for (j in seq(2,dim(rain)[1])){
    rain[j,2]=alpha*rain[j-1,1]+(1-alpha)*rain[j-1,2]
  }
  sum((rain[,1]-rain[,2])^2)
}
result=optimize(mse,lower = 0,upper = 1)
result$minimum
## [1] 0.02412151
Now we will compute the sum of squared errors (SSE) at this optimal value of \(\alpha\).
rain[1,2]=rain[1,1]
for (j in seq(2,dim(rain)[1])){
rain[j,2]=result$minimum*rain[j-1,1]+(1-result$minimum)*rain[j-1,2]
}
sse=sum((rain[,1]-rain[,2])^2)
sse
## [1] 1828.855
#rain <- scan("http://robjhyndman.com/tsdldata/hurst/precip1.dat",skip=1)
rain=read.table("/home/stephy/Documents/class/Time Series/raindata")
rainseries <- ts(rain,start=c(1813))
plot.ts(rainseries)
The plot indicates that the rainfall remains fairly constant over time,
with consistent random fluctuations. Therefore, an additive model can be
assumed for this data.
rainseriesforecasts <- HoltWinters(rainseries, beta=FALSE, gamma=FALSE)
rainseriesforecasts$alpha
## [1] 0.02412151
rainseriesforecasts$SSE
## [1] 1828.855
window(rainseriesforecasts$fitted, end = 1823)   # first ten fitted values
## Time Series:
## Start = 1814
## End = 1823
## Frequency = 1
##          xhat    level
## 1814 23.56000 23.56000
## 1815 23.62054 23.62054
## 1816 23.57808 23.57808
## 1817 23.76290 23.76290
## 1818 23.76017 23.76017
## 1819 23.76306 23.76306
## 1820 23.82691 23.82691
## 1821 23.79900 23.79900
## 1822 23.98935 23.98935
## 1823 23.98623 23.98623
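Forecasts from the fitted object can be obtained with the stats package’s predict method. A sketch on simulated data standing in for the rainfall series (the data file above is local), showing that the SES forecast function is flat:

```r
# Fit SES via HoltWinters on a simulated stand-in series and forecast ahead;
# with beta = FALSE and gamma = FALSE the forecast function is flat
set.seed(1)
sim <- ts(rnorm(100, mean = 24, sd = 4), start = 1813)
fit <- HoltWinters(sim, beta = FALSE, gamma = FALSE)
predict(fit, n.ahead = 8)        # eight identical h-step-ahead forecasts
```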
Holt (1957) extended the simple exponential smoothing method to account for time series data with a trend. This method introduces two smoothing constants, \(\alpha\) and \(\beta\), both of which lie between zero and one. In Holt’s linear method, the model is structured similarly to a linear trend line. Here, \(L_t\) represents the level of the series at time \(t\), and \(b_t\) denotes the estimated slope at time \(t\). The method is governed by the following three equations: \(\begin{eqnarray} L_t&=&\alpha Y_t+(1-\alpha)(L_{t-1}+b_{t-1})\\b_t &=& \beta(L_t-L_{t-1})+(1-\beta)b_{t-1}\\F_{t+m}&=& L_t+b_tm \end{eqnarray}\)
In this approach, the trend is smoothed in the same way SES smooths the level. The slope \(b_t\) is a weighted average of the previous slope \(b_{t-1}\) and the latest estimate of the slope, namely the difference between consecutive levels, \(L_t-L_{t-1}\). Likewise, the level \(L_t\) is a weighted average of the current observation \(Y_t\) and the one-step forecast from the previous period, \(L_{t-1}+b_{t-1}\).
This method allows for more accurate forecasting by adjusting both the level and the trend of the time series over time. Unlike simple exponential smoothing, whose forecast function is flat, the forecast function here is trending: the \(m\)-step-ahead forecast equals the last estimated level \(L_t\) plus \(m\) times the last estimated trend \(b_t\). Holt’s method is also called double exponential smoothing by various authors.
In order to start the estimation process in Holt’s linear exponential smoothing, we require two initial estimates—one for the first smoothed value \(L_1\) and one for the trend \(b_1\). Though there are different initialisation methods, we use \(L_1 = Y_1\) and \(b_1 = Y_2 − Y_1\).
Now we shall consider the application of Holt’s linear exponential smoothing to the inventory demand data.
inventory=read.csv('/home/stephy/Documents/class/Time Series/Inventory.csv')
head(inventory,5)
## Period Demand
## 1 1 143
## 2 2 152
## 3 3 161
## 4 4 139
## 5 5 137
dim(inventory)[1]
## [1] 24
# Column 3 holds the level L_t, column 4 the slope b_t and column 5 the forecast F_t
inventory[1,3]=inventory[1,2]                       # L1 = Y1
inventory[1,4]=inventory[2,2]-inventory[1,2]        # b1 = Y2 - Y1
alpha=0.501
beta=0.072
for (i in seq(2,dim(inventory)[1])){
  inventory[i,3]=alpha*inventory[i,2]+(1-alpha)*(inventory[i-1,3]+inventory[i-1,4])
  inventory[i,4]=beta*(inventory[i,3]-inventory[i-1,3])+(1-beta)*inventory[i-1,4]
  inventory[i,5]=inventory[i-1,3]+inventory[i-1,4]  # F_i = L_{i-1} + b_{i-1}
}
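The same recursion can be packaged as a reusable function (a sketch; `holt_forecast` is our own name, not a library routine), returning the one-step-ahead forecasts \(F_t = L_{t-1}+b_{t-1}\):

```r
# Holt's linear exponential smoothing: one-step-ahead forecasts
# Initialisation follows the text: L1 = Y1, b1 = Y2 - Y1
holt_forecast <- function(y, alpha, beta) {
  n <- length(y)
  L <- numeric(n); b <- numeric(n); f <- rep(NA_real_, n)
  L[1] <- y[1]
  b[1] <- y[2] - y[1]
  for (t in 2:n) {
    f[t] <- L[t - 1] + b[t - 1]                              # F_t
    L[t] <- alpha * y[t] + (1 - alpha) * (L[t - 1] + b[t - 1])
    b[t] <- beta * (L[t] - L[t - 1]) + (1 - beta) * b[t - 1]
  }
  f
}
holt_forecast(c(143, 152, 161, 139, 137), alpha = 0.501, beta = 0.072)
```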
Simple Exponential Smoothing (SES) methods are suitable when a time series lacks trend or seasonality. On the other hand, Holt’s linear trend method is more appropriate for data that shows a trend. When dealing with data that exhibits both trend and seasonality, the Holt-Winters method is recommended. This method includes three equations: one for the level, one for the trend, and one for seasonality. Compared to Holt’s linear trend method, the Holt-Winters method adds an equation to account for the seasonality in the data. Depending on whether the seasonality is additive or multiplicative, the Holt-Winters method offers two variations of its formula.
The basic equations for Holt-Winters’ multiplicative method are \(\begin{eqnarray} Level &: & L_t&=&\alpha \dfrac{Y_t}{S_{t-s}}+(1-\alpha)(L_{t-1}+b_{t-1})\\Trend&:& b_t &=& \beta(L_t-L_{t-1})+(1-\beta)b_{t-1}\\ Seasonal&:& S_t&=& \gamma \dfrac{Y_t}{L_t}+(1-\gamma)S_{t-s}\\ Forecast&:& F_{t+m}&=&\left(L_t+b_tm\right)S_{t-s+m} \end{eqnarray}\)
where \(s\) is the length of seasonality (e.g., number of months or quarters in a year), \(L_t\) represents the level of the series, \(b_t\) denotes the trend, \(S_t\) is the seasonal component, and \(F_{t+m}\) is the forecast for \(m\) periods ahead.
The Holt-Winters equations when the data follows an additive model are \(\begin{eqnarray} Level &: & L_t&=&\alpha \left(Y_t-S_{t-s}\right)+(1-\alpha)(L_{t-1}+b_{t-1})\\Trend&:& b_t &=& \beta(L_t-L_{t-1})+(1-\beta)b_{t-1}\\ Seasonal&:& S_t&=& \gamma \left(Y_t-L_t\right) +(1-\gamma)S_{t-s}\\ Forecast&:& F_{t+m}&=&L_t+b_tm+ S_{t-s+m} \end{eqnarray}\)
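Both variants are available through the same `HoltWinters` function in R. A sketch on the built-in `AirPassengers` data, whose seasonal swings grow with the level and therefore suggest the multiplicative form:

```r
# Multiplicative Holt-Winters fit on monthly airline passenger counts
fit_mult <- HoltWinters(AirPassengers, seasonal = "multiplicative")
c(alpha = fit_mult$alpha, beta = fit_mult$beta, gamma = fit_mult$gamma)
# Forecast the next 12 months from the fitted model
predict(fit_mult, n.ahead = 12)
```

Passing `seasonal = "additive"` instead fits the additive equations above.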