The memochange package can be used for two things: Checking for a break in persistence and checking for a change in mean. This vignette presents the functions related to a break in persistence. This includes BP_estim, cusum_test, LBI_test, LKSN_test, MR_test, ratio_test, and pb_sim. Before considering the usage of these functions, a brief literature review elaborates on their connection.
The degree of memory is an important determinant of the characteristics of a time series. For an \(I(0)\), or short-memory, process (e.g., AR(1) or ARMA(1,1)), the impact of shocks is short-lived and dies out quickly. On the other hand, for an \(I(1)\), or difference-stationary, process such as the random walk, shocks persist infinitely. Thus, any change in a variable will have an impact on all future realizations. For an \(I(d)\), or long-memory, process with \(0<d<1\), shocks neither die out quickly nor persist infinitely, but have a hyperbolically decaying impact. In this case, the current value of a variable depends on past shocks, but the less so the further these shocks are past.
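To make the three decay patterns concrete, the following base R sketch computes the remaining impact of a unit shock after a given number of periods. The AR(1) coefficient of 0.5 and the value d = 0.4 are assumptions chosen purely for illustration, and the helper frac_weights is a hypothetical name, not a memochange function.

```r
# Impact of a unit shock after k = 0, ..., n_lags periods for an I(d) process,
# i.e. the coefficients of (1 - L)^(-d): psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k.
# frac_weights is an illustrative helper, not part of memochange.
frac_weights <- function(d, n_lags) {
  k <- seq_len(n_lags)
  c(1, cumprod((k - 1 + d) / k))
}

n_lags <- 50
irf_ar1  <- 0.5^(0:n_lags)             # I(0): geometric decay (AR(1), coefficient 0.5 assumed)
irf_long <- frac_weights(0.4, n_lags)  # I(d) with d = 0.4 assumed: hyperbolic decay
irf_rw   <- frac_weights(1, n_lags)    # I(1): the shock never dies out

# Remaining impact of the shock at selected horizons
horizons <- c(1, 5, 10, 50)
print(round(cbind(lag          = horizons,
                  short_memory = irf_ar1[horizons + 1],
                  long_memory  = irf_long[horizons + 1],
                  random_walk  = irf_rw[horizons + 1]), 4))
```

The short-memory column shrinks towards zero within a few lags, the long-memory column declines slowly but steadily, and the random-walk column stays at one.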
There are plenty of procedures to determine the memory of a series (see Robinson (1995), Shimotsu (2010), among others). However, there is also the possibility that a series exhibits a structural change in memory, often referred to as a change in persistence. Starting with Kim (2000), various procedures have been proposed to detect these changes and consistently estimate the change point. Busetti and Taylor (2004) and Leybourne and Taylor (2004) suggest approaches for testing the null of constant \(I(0)\) behavior of the series against the alternative that a change from either \(I(0)\) to \(I(1)\) or \(I(1)\) to \(I(0)\) occurred. However, both approaches show serious distortions if neither the null nor the alternative is true, e.g., if the series is constant \(I(1)\). In this case the procedures by Leybourne et al. (2003) and Leybourne, Taylor, and Kim (2007) can be applied as they have the same alternative, but assume constant \(I(1)\) behavior under the null. Again, the procedures exhibit distortions when neither the null nor the alternative is true. To remedy this issue, Harvey, Leybourne, and Taylor (2006) suggest an approach that entails the same critical values for constant \(I(0)\) and constant \(I(1)\) behavior. Consequently, it accommodates both constant \(I(0)\) and constant \(I(1)\) behavior under the null.
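A series of the kind these tests are designed to detect can be simulated in a few lines of base R. The sketch below is illustrative only: the sample size, break location, and seed are arbitrary assumptions, and it deliberately does not rely on the package's own simulation function pb_sim.

```r
# Simulate a change in persistence from I(0) to I(1) at a known break point.
set.seed(123)
n <- 200          # sample size (assumed)
break_at <- 100   # last observation of the I(0) regime (assumed)
eps <- rnorm(n)

y <- numeric(n)
y[1:break_at] <- eps[1:break_at]                                    # I(0): white noise
y[(break_at + 1):n] <- y[break_at] + cumsum(eps[(break_at + 1):n])  # I(1): random walk

# The levels are typically far more variable in the I(1) regime
print(c(var_I0 = var(y[1:break_at]), var_I1 = var(y[(break_at + 1):n])))
```

Passing such a simulated vector to the tests discussed below is a simple way to get a feeling for their output before turning to real data.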
While this earlier work focussed on the \(I(0)/I(1)\) framework, more recent approaches are able to detect changes from \(I(d_1)\) to \(I(d_2)\) where \(d_1\) and \(d_2\) are allowed to be non-integers. Sibbertsen and Kruse (2009) extend the approach of Leybourne, Taylor, and Kim (2007) such that the testing procedure consistently detects changes from \(0 \leq d_1<1/2\) to \(1/2<d_2<3/2\) and vice versa. Under the null the test assumes constant \(I(d)\) behavior with \(0 \leq d <3/2\). The approach suggested by Martins and Rodrigues (2014) is even able to identify changes from \(-1/2<d_1<2\) to \(-1/2<d_2<2\) with \(d_1 \neq d_2\). Here, under the null the test assumes constant \(I(d)\) behavior with \(-1/2<d<2\).
Examples of series that potentially exhibit breaks in persistence are macroeconomic and financial time series such as inflation rates, trading volume, interest rates, and volatilities. For these series it is therefore strongly recommended to investigate the possibility of a break in persistence before modeling and forecasting them.
The memochange package contains all procedures mentioned above to identify whether a time series exhibits a break in persistence. Additionally, several estimators are implemented which consistently estimate the point at which the series exhibits a break in persistence and the order of integration in the two regimes. We will now show the usage of the implemented procedures by investigating the price of crude oil.
First, we download the monthly price series from the FRED database.
oil=data.table::fread("https://fred.stlouisfed.org/graph/fredgraph.csv?bgcolor=%23e1e9f0&chart_type=line&drp=0&fo=open%20sans&graph_bgcolor=%23ffffff&height=450&mode=fred&recession_bars=on&txtcolor=%23444444&ts=12&tts=12&width=1168&nt=0&thu=0&trc=0&show_legend=yes&show_axis_titles=yes&show_tooltip=yes&id=MCOILWTICO&scale=left&cosd=1986-01-01&coed=2019-08-01&line_color=%234572a7&link_values=false&line_style=solid&mark_type=none&mw=3&lw=2&ost=-99999&oet=99999&mma=0&fml=a&fq=Monthly&fam=avg&fgst=lin&fgsnd=2009-06-01&line_index=1&transformation=lin&vintage_date=2019-09-23&revision_date=2019-09-23&nd=1986-01-01")
To get a first visual impression, we plot the series.
oil=as.data.frame(oil)
oil$DATE=zoo::as.Date(oil$DATE)
oil_xts=xts::xts(oil[,-1],order.by = oil$DATE)
zoo::plot.zoo(oil_xts, xlab="", ylab="Price", main="Crude Oil Price: West Texas Intermediate")
From the plot we observe that the series seems to be more variable in its second part from the year 2000 onwards. This is a first indication that a change in persistence has occurred. We can test this hypothesis using the functions cusum_test (Leybourne, Taylor, and Kim (2007), Sibbertsen and Kruse (2009)), LBI_test (Busetti and Taylor (2004)), LKSN_test (Leybourne et al. (2003)), MR_test (Martins and Rodrigues (2014)), and ratio_test (Busetti and Taylor (2004), Leybourne and Taylor (2004), Harvey, Leybourne, and Taylor (2006)). In this vignette we use the ratio and MR test since these are the empirically most often applied ones. The functionality of the other tests is similar. They all require a univariate numeric vector x as an input variable and yield a matrix of test statistic and critical values as an output variable.
As a starting point the default version of the ratio test is applied.
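library(memochange)
# The tests expect a plain numeric vector: take the price column of the download
x <- as.numeric(oil[,2])
ratio_test(x)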
#>                                        90%    95%    99% Teststatistic
#> Against change from I(0) to I(1)    3.5148 4.6096 7.5536    225.943543
#> Against change from I(1) to I(0)    3.5588 4.6144 7.5304      1.170217
#> Against change in unknown direction 4.6144 5.7948 9.0840    225.943543
This yields a matrix that gives test statistic and critical values for the null of constant \(I(0)\) against a change from \(I(0)\) to \(I(1)\) or vice versa. Furthermore, the statistics for a change in an unknown direction are included as well. Their critical values account for the multiple testing problem that arises because two tests are performed. The results suggest that a change from \(I(0)\) to \(I(1)\) has occurred somewhere in the series since the test statistic (225.94) exceeds the critical value at the one percent level (7.55). This value is also significant when accounting for the multiple testing problem (critical value 9.08). Consequently, the default version of the ratio test suggests a break in persistence.
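The decision rule just described can be spelled out in code. The sketch below does not call memochange; it rebuilds the matrix from the values printed above and assumes that the null is rejected whenever the test statistic exceeds the critical value of the respective level.

```r
# Values copied from the printed ratio_test output; the object name res is arbitrary.
res <- matrix(c(3.5148, 4.6096, 7.5536, 225.943543,
                3.5588, 4.6144, 7.5304,   1.170217,
                4.6144, 5.7948, 9.0840, 225.943543),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("Against change from I(0) to I(1)",
                                "Against change from I(1) to I(0)",
                                "Against change in unknown direction"),
                              c("90%", "95%", "99%", "Teststatistic")))

# TRUE where the statistic of a row exceeds the critical value of that row and level
reject <- res[, c("90%", "95%", "99%")] < res[, "Teststatistic"]
print(reject)
```

Since all tests in the package are described as returning a matrix of this shape, the same comparison should carry over to their output, with the caveat that the rejection region may differ between statistics.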
We can modify this default version by choosing the arguments trend, tau, statistic, type, m, z, simu, and M (see the help page of the ratio test for details). The plot does not indicate a linear trend so that it seems unreasonable to change the trend argument. Also, the plot suggests that the break is rather in the middle of the series than at the beginning or the end so that changing tau seems unnecessary as well. The type of test statistic calculated can be easily changed using the statistic argument. However, simulation results indicate that the mean, max, and exp statistics deliver qualitatively similar results.
More important is the type of test performed. The default version considers the approach by Busetti and Taylor (2004). In case of a constant \(I(1)\) process this test often spuriously identifies a break in persistence. Harvey, Leybourne, and Taylor (2006) account for this issue by adjusting the test statistic such that its critical values are the same under constant \(I(0)\) and constant \(I(1)\). We can calculate their test statistic by setting type="HLT". For this purpose, we need to state the number of polynomials z used in their test statistic. The default value is 9 as suggested by Harvey, Leybourne, and Taylor (2006). Choosing another value is only sensible for very large data sets (more than 10000 observations) where the test statistic cannot be calculated due to computational singularity. In this case decreasing z can allow the test statistic to be calculated. This invalidates the critical values so that we would have to simulate them by setting simu=1. However, as our data set is rather small we can stick with the default of z=9.
ratio_test(x, type="HLT")
#>                                        90%    95%    99% Teststatistic 90%
#> Against change from I(0) to I(1)    3.5148 4.6096 7.5536        58.9078204
#> Against change from I(1) to I(0)    3.5588 4.6144 7.5304         0.3085495
#> Against change in unknown direction 4.6144 5.7948 9.0840        44.2171379
#>                                     Teststatistic 95% Teststatistic 99%
#> Against change from I(0) to I(1)           43.4772689        25.3369507
#> Against change from I(1) to I(0)            0.2290113         0.1290305
#> Against change in unknown direction        34.1367566        20.0058559
Again, the test results suggest that there is a break from \(I(0)\) to \(I(1)\). Consequently, it is not a constant \(I(1)\) process that led to a spurious rejection of the test by Busetti and Taylor (2004).
Another test for a change in persistence is that by Martins and Rodrigues (2014). This is more general as it is not restricted to the \(I(0)/I(1)\) framework, but can identify changes from \(I(d_1)\) to \(I(d_2)\) with \(d_1 \neq d_2\) and \(-1/2<d_1,d_2<2\). The default version is applied by
MR_test(x)
#>                                          90%      95%      99% Teststatistic
#> Against increase in memory          4.270666 5.395201 8.233674      16.21494
#> Against decrease in memory          4.060476 5.087265 7.719128       2.14912
#> Against change in unknown direction 5.065695 6.217554 9.136441      16.21494
Again, the function returns a matrix consisting of test statistic and critical values. Here, the alternative of the test is an increase or a decrease in memory, respectively. In line with the results of the ratio test, the approach by Martins and Rodrigues (2014) suggests that the series exhibits an increase in memory, i.e. that the memory of the series increases from \(d_1\) to \(d_2\) with \(d_1<d_2\) at some point in time. Again, this also holds if we consider the critical values that account for the multiple testing problem.
Similar to the ratio test and all other tests against a change in persistence in the memochange package, the MR test also has the arguments trend, tau, simu, and M. Furthermore, we can again choose the type of test statistic. This time we can decide whether to use the squared t-statistic or the standard t-statistic.
MR_test(x, statistic="standard")
#>                                           90%       95%       99% Teststatistic
#> Against increase in memory          -1.637306 -1.920434 -2.504862     -2.880545
#> Against decrease in memory          -1.651586 -1.951420 -2.514165     -1.277410
#> Against change in unknown direction -1.933137 -2.203370 -2.722017     -2.880545
As for the ratio test, changing the type of statistic has a rather small effect on the empirical performance of the test.
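With the standard t-statistic the critical values are negative, which suggests that the test rejects in the left tail, i.e. for statistics smaller than the critical value. The following sketch makes this reading explicit using the values printed above; the left-tailed rule is an assumption inferred from the sign of the critical values and should be checked against the help page of MR_test.

```r
# Values copied from the printed MR_test(x, statistic="standard") output.
res_std <- matrix(c(-1.637306, -1.920434, -2.504862, -2.880545,
                    -1.651586, -1.951420, -2.514165, -1.277410,
                    -1.933137, -2.203370, -2.722017, -2.880545),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("Against increase in memory",
                                    "Against decrease in memory",
                                    "Against change in unknown direction"),
                                  c("90%", "95%", "99%", "Teststatistic")))

# Assumed left-tailed rule: reject if the statistic lies below the critical value
reject_std <- res_std[, c("90%", "95%", "99%")] > res_std[, "Teststatistic"]
print(reject_std)
```

Under this reading the standard t-statistic leads to the same decisions as the squared version: rejection against an increase in memory and in an unknown direction at all three levels, and no rejection against a decrease.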
If we believe that the underlying process exhibits additional short-run components, we can account for these by setting serial=TRUE.
MR_test(x, serial=TRUE)
#> Registered S3 method overwritten by 'quantmod':
#> method from
#> as.zoo.data.frame zoo
#> Registered S3 methods overwritten by 'forecast':
#> method from
#> fitted.fracdiff fracdiff
#> residuals.fracdiff fracdiff
#>                                          90%      95%      99% Teststatistic
#> Against increase in memory          4.270666 5.395201 8.233674     10.727202
#> Against decrease in memory          4.060476 5.087265 7.719128      6.758906
#> Against change in unknown direction 5.065695 6.217554 9.136441     10.727202
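While the test statistics change, the main conclusion remains the same: the statistic against an increase in memory (10.73) still exceeds its 99% critical value, also when accounting for the multiple testing problem. Note, however, that the statistic against a decrease in memory (6.76) now exceeds its 95% critical value as well.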
All tests indicate that the oil price series exhibits an increase in memory over time. To correctly model and forecast the series, the exact location of the break is important. This can be estimated by the BP_estim function. It is important for the function that the direction of the change is correctly specified. In our case, an increase in memory has occurred so that we set direction="01".
BP_estim(x, direction="01")
#> $Breakpoint
#> [1] 151
#>
#> $d_1
#> [1] 0.8127501
#>
#> $sd_1
#> [1] 0.08574929
#>
#> $d_2
#> [1] 1.088039
#>
#> $sd_2
#> [1] 0.07142857
This yields a list stating the location of the break (observation 151), semiparametric estimates of the order of integration in the two regimes (0.81 and 1.09) as well as the standard deviations of these estimates (0.09 and 0.07).
Consequently, the function indicates that there is a break in persistence in July 1998. This means that from the beginning of the sample until June 1998 the series is integrated with an order of 0.81 and from July 1998 on the order of integration increased to 1.09.
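The mapping from the estimated break point to a calendar date can be checked with base R. The sketch assumes that the sample starts in January 1986 (the cosd parameter of the download URL) and contains no missing months.

```r
# Monthly dates starting at the first observation of the sample (1986-01-01, from the URL)
break_obs <- 151
dates <- seq(as.Date("1986-01-01"), by = "month", length.out = break_obs)

format(dates[break_obs], "%Y-%m")      # first month of the second regime
format(dates[break_obs - 1], "%Y-%m")  # last month of the first regime
```

With the data loaded, indexing the DATE column of oil at the estimated break point should give the same month and avoids the assumption about missing values.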
As before, the function allows for various types of break point estimators. Instead of the default estimator of Busetti and Taylor (2004), one can also rely on the estimator of Leybourne, Kim, and Taylor (2007) by setting type="LKT". This estimator relies on estimates of the long-run variance. Therefore, m must also be chosen, which determines how many covariances are used when estimating the long-run variance. Leybourne, Kim, and Taylor (2007) suggest m=0.
BP_estim(x, direction="01", type="LKT", m=0)
#> $Breakpoint
#> [1] 148
#>
#> $d_1
#> [1] 0.7660609
#>
#> $sd_1
#> [1] 0.08703883
#>
#> $d_2
#> [1] 1.067404
#>
#> $sd_2
#> [1] 0.07142857
This yields a similar result with the break point (observation 148) lying in the year 1998 and d increasing from approximately 0.77 to approximately 1.07.
All other arguments of the function (trend, tau, serial) were already discussed above except for d_estim and d_bw. These two arguments determine which estimator and bandwidth are used to estimate the order of integration in the two regimes. Concerning the estimator, the GPH (Geweke and Porter-Hudak (1983)) and the exact local Whittle estimator (Shimotsu and Phillips (2005)) can be selected. Although the exact local Whittle estimator has a lower variance, the GPH estimator is still often considered in empirical applications due to its simplicity. In our example the two estimators yield the same break point and similar estimates of the order of integration.
BP_estim(x, direction="01", d_estim="GPH")
#> $Breakpoint
#> [1] 151
#>
#> $d_1
#> [1] 0.855238
#>
#> $sd_1
#> [1] 0.129834
#>
#> $d_2
#> [1] 1.034389
#>
#> $sd_2
#> [1] 0.1468516
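The simplicity of the GPH estimator is easiest to see in code: it is the slope of a regression of the log-periodogram on a function of the lowest Fourier frequencies. The following base R sketch is a textbook version, not the implementation used by BP_estim; the function name gph_sketch and the rule for the number of frequencies are assumptions made for illustration.

```r
# Textbook log-periodogram (GPH) regression; gph_sketch is a hypothetical helper.
gph_sketch <- function(x, bw = 0.7) {
  n <- length(x)
  m <- floor(1 + n^bw)                 # number of Fourier frequencies (assumed rule)
  lambda <- 2 * pi * seq_len(m) / n
  # Periodogram at the first m Fourier frequencies (element 1 of fft() is frequency zero)
  pgram <- (Mod(fft(x - mean(x)))^2 / (2 * pi * n))[seq_len(m) + 1]
  regressor <- -log(4 * sin(lambda / 2)^2)
  unname(coef(lm(log(pgram) ~ regressor))[2])  # the slope is the estimate of d
}

# Sanity check on simulated data: roughly 0 for white noise, roughly 1 for a random walk
set.seed(42)
d_white_noise <- gph_sketch(rnorm(2000))
d_random_walk <- gph_sketch(cumsum(rnorm(2000)))
print(c(white_noise = d_white_noise, random_walk = d_random_walk))
```

The estimates are subject to sampling error, so they will only be close to, not equal to, the true orders of integration.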
The d_bw argument determines how many frequencies are used for estimation. Larger values imply a lower variance of the estimates, but also bias the estimator if the underlying process possesses short-run dynamics. Usually a value between 0.5 and 0.8 is considered.
BP_estim(x, direction="01", d_bw=0.75)
#> $Breakpoint
#> [1] 151
#>
#> $d_1
#> [1] 0.9146951
#>
#> $sd_1
#> [1] 0.07624929
#>
#> $d_2
#> [1] 1.173524
#>
#> $sd_2
#> [1] 0.0625
BP_estim(x, direction="01", d_bw=0.65)
#> $Breakpoint
#> [1] 151
#>
#> $d_1
#> [1] 0.5803242
#>
#> $sd_1
#> [1] 0.09805807
#>
#> $d_2
#> [1] 0.9353325
#>
#> $sd_2
#> [1] 0.08219949
In our setup, it can be seen that increasing d_bw to 0.75 does not severely change the estimated order of integration in the two regimes. Decreasing d_bw to 0.65, however, leads to smaller estimates of \(d\).
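The variance side of this trade-off can be read off the standard deviations printed above. They are consistent with the usual asymptotic standard error of local Whittle type estimators, 1/(2*sqrt(m)), where m = floor(1 + n^d_bw) frequencies are used in a regime of n observations. Both the rule for m and the default d_bw = 0.7 are inferred from the output here, not taken from the package documentation; the regime lengths assume 404 monthly observations (January 1986 to August 2019) and the break at observation 151.

```r
# Number of frequencies implied by a bandwidth exponent (rule inferred from the output)
n_freq <- function(n, d_bw) floor(1 + n^d_bw)
# Asymptotic standard error of a local Whittle type estimate based on m frequencies
elw_se <- function(n, d_bw) 1 / (2 * sqrt(n_freq(n, d_bw)))

n1 <- 150   # observations before the estimated break at observation 151
n2 <- 254   # remaining observations, assuming 404 monthly values in total
d_bw <- c(0.65, 0.70, 0.75)

bw_table <- data.frame(d_bw = d_bw,
                       m_1 = n_freq(n1, d_bw), sd_1 = elw_se(n1, d_bw),
                       m_2 = n_freq(n2, d_bw), sd_2 = elw_se(n2, d_bw))
print(bw_table)
```

Each step up in d_bw adds frequencies and lowers the standard error, which is the variance gain that has to be weighed against the potential bias from short-run dynamics.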
Busetti, Fabio, and AM Robert Taylor. 2004. “Tests of Stationarity Against a Change in Persistence.” Journal of Econometrics 123 (1): 33–66. https://doi.org/10.1016/j.jeconom.2003.10.028.
Harvey, David I, Stephen Leybourne, and AM Robert Taylor. 2006. “Modified Tests for a Change in Persistence.” Journal of Econometrics 134 (2): 441–69. https://doi.org/10.1016/j.jeconom.2005.07.002.
Kim, Jae-Young. 2000. “Detection of Change in Persistence of a Linear Time Series.” Journal of Econometrics 95 (1): 97–116. https://doi.org/10.1016/S0304-4076(99)00031-7.
Leybourne, Stephen, Tae-Hwan Kim, Vanessa Smith, and Paul Newbold. 2003. “Tests for a Change in Persistence Against the Null of Difference-Stationarity.” The Econometrics Journal 6 (2): 291–311. https://doi.org/10.1111/1368-423X.t01-1-00110.
Leybourne, Stephen, and AM Robert Taylor. 2004. “On Tests for Changes in Persistence.” Economics Letters 84 (1): 107–15. https://doi.org/10.1016/j.econlet.2003.12.015.
Leybourne, Stephen, Robert Taylor, and Tae-Hwan Kim. 2007. “CUSUM of Squares-Based Tests for a Change in Persistence.” Journal of Time Series Analysis 28 (3): 408–33. https://doi.org/10.1111/j.1467-9892.2006.00517.x.
Martins, Luis F, and Paulo MM Rodrigues. 2014. “Testing for Persistence Change in Fractionally Integrated Models: An Application to World Inflation Rates.” Computational Statistics & Data Analysis 76: 502–22. https://doi.org/10.1016/j.csda.2012.07.021.
Robinson, Peter M. 1995. “Gaussian Semiparametric Estimation of Long Range Dependence.” The Annals of Statistics 23 (5): 1630–61. https://doi.org/10.1214/aos/1176324317.
Shimotsu, Katsumi. 2010. “Exact Local Whittle Estimation of Fractional Integration with Unknown Mean and Time Trend.” Econometric Theory 26 (2): 501–40. https://doi.org/10.1017/S0266466609100075.
Sibbertsen, Philipp, and Robinson Kruse. 2009. “Testing for a Break in Persistence Under Long-Range Dependencies.” Journal of Time Series Analysis 30 (3): 263–85. https://doi.org/10.1111/j.1467-9892.2009.00611.x.