Autoregressive distributed lag (ARDL) models are an integral part of estimating scientific processes over time. However, as we extend their usefulness by adding richness in dynamic specifications (through multiple lags of variables, either in levels or differences, or lags of the dependent variable), we begin to challenge our ability to draw meaningful inferences from coefficients alone. Variables may appear in multiple time periods, in multiple forms, and might even be filtered through lagged values of the dependent variable. Coefficients tell us about the immediate effect of some variable but have little to say about the long-run effect.
There is a better solution. Instead of performing intense operations to develop a closed-form or algebraic solution, we can rely on the power of computing to simulate the over-time response in the dependent variable of some model, given a corresponding change in an \(x\) variable. We often call these “changes” in \(x\) variables, and the associated response in the dependent variable \(y\), a “counterfactual response” (a simulated response to a “shock” we control).
dynamac helps simulate these counterfactuals. More generally, though, it is built to make using and drawing inferences from single-equation ARDL models as easy as possible. Moreover, it helps users implement the useful cointegration test from Pearson, Shin, and Smith (2001): the ARDL-bounds testing procedure.
We illustrate the usefulness of these functions through example. After a brief discussion of the ARDL model in the general sense, we estimate a collection of these models and demonstrate both the challenges of the ARDL model in the abstract and the solutions of dynamac in particular.
The ARDL model has a general form where \(y\), modeled in levels or differences, is a function of itself (in lagged levels or differences), up to \(k\) variables \(x\), either in contemporaneous (same period, or appearing at time \(t\)) levels, lagged levels, contemporaneous differences, or lagged differences. Conventionally, the number of lags of the dependent variable in levels is counted by \(p\), the number of lags of the dependent variable in differences is counted by \(m\), the number of lags of the independent variables in levels is counted by \(l\), and the number of lags of the independent variables in differences is counted by \(q\) (note: \(l\) and \(q\) can begin at 0, i.e, contemporaneous effects appearing at time \(t\)).
The number of time periods of each variable can, theoretically, be quite large. Or, put differently, \(p\), \(m\), \(l\), and \(q\), especially \(l\) and \(q\), might be hard to account for. Analysts without strong theory about the nature of the effects often begin with restricting all but the contemporaneous and first lag of each series, or an ARDL model of the nature
\[y_t = \alpha_0 + \phi_1 y_{t-1} + \theta_{1,0} x_{1,t} + \theta_{1,1} x_{1,t-1} + \cdots + \theta_{k,0}x_{k,t} + \theta_{k,1}x_{k,t-1}+ \beta *T + \epsilon_t\] where \(\alpha_0\) is a constant and \(\beta *T\) is a trend term. (The exact nature of this equation depends on whether \(y\) is stationary or integrated, as well as if differences or lagged differences are entered into the model. But we will get to this below.)
It’s useful at this point to stop and think: if I have multiple lags of a variable \(x\), and they are filtered through a lagged dependent variable \(y_{t-1}\), it might be difficult to get a sense of the “total” effect of \(x\) on \(y\). This becomes more and more difficult as \(x\) increases in lagged levels \(l\) and potentially also lagged differences \(q\). Our coefficient estimates, while estimated, are no longer as useful in direct interpretation. Put differently, the very flexibility of the ARDL model also undoes its usefulness in interpretation! So we might seek an alternative way of interpreting these models. dynamac provides a unified way of estimating ARDL models and interpreting their effects. It also provides a way of implementing a popular test for cointegration. We uncover these methods below.
dynamac includes two datasets. We will use the Wright (2017) dataset on income inequality. We can look at the first few rows of this dataset
library(dynamac)
head(ineq)
#> year mood urate concern demcontrol incshare10 csentiment incshare01
#> 1 1966 60.793 3.8 0.465 1 0.3198 97.0 0.0837
#> 2 1967 61.332 3.8 0.475 1 0.3205 94.7 0.0843
#> 3 1968 58.163 3.6 0.490 1 0.3198 94.2 0.0835
#> 4 1969 54.380 3.5 0.507 0 0.3182 90.3 0.0802
#> 5 1970 60.555 4.9 0.519 0 0.3151 75.8 0.0780
#> 6 1971 64.077 5.9 0.531 0 0.3175 80.6 0.0779
concern
is public concern about income inequality; incshare10
is the income share of the top ten percent; urate
is the unemployment rate. Wright (2017) argues that concern about income inequality grows as inequality itself worsens and economic conditions deteriorate. A simple model, then, is that concern is a function of past values of itself, the level of income share held by the top ten percent, changing levels of income share held by the top ten percent, as well changing levels of unemployment in the short term.
\[\Delta Concern_t = \alpha_0 + \phi_1 Concern_{t-1} + \theta_1 Income Top 10_{t-1} + \beta_1 \Delta Income Top 10_t + \beta_2 \Delta Unemployment_t + \epsilon_t\]
where the residuals are white noise. Let’s develop the corresponding ARDL model using dynamac.
Step 1 in any time series analysis is a visual inspection of the series coupled with formal tests of stationarity: whether the series has a constant mean, variance, and covariance over time (so that it reverts back to mean level), or if the series instead violates any of these three conditions. We advocate for using the urca package for this (Pfaff, Zivot, and Stigler 2016), which includes a variety of tools and tests. Simply plotting the series reveals the following:
None of the three series looks especially well-behaved. concern
rose quickly and has been moving sluggishly since. incshare10
has only grown over time, which cannot be mean-reverting (like a stationary series). urate
experiences steep peaks and valleys with significant interludes in between. All three series look integrated. But we should be more discerning by using empirical tests, also included in urca. So we can execute the Augmented Dickey-Fuller (ADF) test, Phillips-Perron test, DF-GLS test, and KPSS test on each series. On concern
this looks like
summary(ur.df(ineq$concern, type = c("none"), lags = 1))
#>
#> ###############################################
#> # Augmented Dickey-Fuller Test Unit Root Test #
#> ###############################################
#>
#> Test regression none
#>
#>
#> Call:
#> lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.029626 -0.006624 0.000511 0.011123 0.033842
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> z.lag.1 0.002474 0.003597 0.688 0.495
#> z.diff.lag -0.131523 0.148125 -0.888 0.379
#>
#> Residual standard error: 0.01422 on 45 degrees of freedom
#> Multiple R-squared: 0.0243, Adjusted R-squared: -0.01907
#> F-statistic: 0.5603 on 2 and 45 DF, p-value: 0.575
#>
#>
#> Value of test-statistic is: 0.6878
#>
#> Critical values for test statistics:
#> 1pct 5pct 10pct
#> tau1 -2.62 -1.95 -1.61
summary(ur.pp(ineq$concern, type = c("Z-tau"), model = c("constant"), use.lag = 1))
#>
#> ##################################
#> # Phillips-Perron Unit Root Test #
#> ##################################
#>
#> Test regression with intercept
#>
#>
#> Call:
#> lm(formula = y ~ y.l1)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.031791 -0.006041 0.000738 0.006481 0.031282
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 0.09997 0.02946 3.393 0.00143 **
#> y.l1 0.83010 0.05085 16.324 < 2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.01274 on 46 degrees of freedom
#> Multiple R-squared: 0.8528, Adjusted R-squared: 0.8496
#> F-statistic: 266.5 on 1 and 46 DF, p-value: < 2.2e-16
#>
#>
#> Value of test-statistic, type: Z-tau is: -3.4373
#>
#> aux. Z statistics
#> Z-tau-mu 3.496
#>
#> Critical values for Z statistics:
#> 1pct 5pct 10pct
#> critical values -3.571174 -2.92277 -2.599003
summary(ur.ers(ineq$concern, type = c("DF-GLS"), model = c("constant"), lag.max = 1))
#>
#> ###############################################
#> # Elliot, Rothenberg and Stock Unit Root Test #
#> ###############################################
#>
#> Test of type DF-GLS
#> detrending of series with intercept
#>
#>
#> Call:
#> lm(formula = dfgls.form, data = data.dfgls)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.027024 -0.003352 0.003511 0.013156 0.036332
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> yd.lag -0.02951 0.03305 -0.893 0.377
#> yd.diff.lag1 -0.10824 0.14675 -0.738 0.465
#>
#> Residual standard error: 0.01417 on 45 degrees of freedom
#> Multiple R-squared: 0.0312, Adjusted R-squared: -0.01186
#> F-statistic: 0.7247 on 2 and 45 DF, p-value: 0.4901
#>
#>
#> Value of test-statistic is: -0.8928
#>
#> Critical values of DF-GLS are:
#> 1pct 5pct 10pct
#> critical values -2.61 -1.95 -1.62
summary(ur.kpss(ineq$concern, type = c("mu"), use.lag = 1))
#>
#> #######################
#> # KPSS Unit Root Test #
#> #######################
#>
#> Test is of type: mu with 1 lags.
#>
#> Value of test-statistic is: 0.6415
#>
#> Critical value for a significance level of:
#> 10pct 5pct 2.5pct 1pct
#> critical values 0.347 0.463 0.574 0.739
The ADF test, Phillips-Perron test, and DF-GLS test all have null hypotheses of a unit root, all of which we fail to reject. The KPSS test has a null hypothesis of stationarity which we do reject. We have compelling evidence, then, of integration (I[1]) in the series concern
. We can check whether differencing is enough to make the series stationary by executing the same tests
summary(ur.df(diff(ineq$concern), type = c("none"), lags = 1))
summary(ur.pp(diff(ineq$concern), type = c("Z-tau"), model = c("constant"), use.lag = 1))
summary(ur.ers(diff(ineq$concern), type = c("DF-GLS"), model = c("constant"), lag.max = 1))
summary(ur.kpss(diff(ineq$concern), type = c("mu"), use.lag = 1))
Each of the above tests, when run, indicated that the differenced series of concern
is stationary. Having gathered the basic information about the nature of the history of our variables, we might be itching to estimate some preliminary models. To this point, R hasn’t offered much in the way of improving beyond the basic lm()
framework for using regression to estimate time series models in a consistent syntax. We elaborate on this problem and illustrate our solution, dynardl
.
dynardl
ARDL models are flexible, but their flexibility often results in variables of different lengths due to differencing and lagging. For instance, consider our simple model from above where incshare10
appears as a first lag and a first difference, urate
appears as a first difference, there is a lagged dependent variable, and concern
is the dependent variable in differences. We can introduce a lag through the built-in lshift
function in dynamac. The syntax is just lshift(variable, num.lags)
, where num.lags
is the number of periods for the variable to be lagged. We can also difference a series through dshift
. The syntax is just dshift(variable)
for a first difference. For instance,
head(ineq$incshare10)
#> [1] 0.3198 0.3205 0.3198 0.3182 0.3151 0.3175
head(lshift(ineq$incshare10, 1))
#> [1] NA 0.3198 0.3205 0.3198 0.3182 0.3151
head(dshift(ineq$incshare10))
#> [1] NA 0.0007 -0.0007 -0.0016 -0.0031 0.0024
So the syntax for the simple model described would be easy to write. But notice the problem
summary(lm(diff(ineq$concern) ~ lshift(ineq$concern, 1) + lshift(ineq$incshare10, 1) + dshift(ineq$incshare10) + dshift(ineq$urate)))
#> Error in model.frame.default(formula = diff(ineq$concern) ~ lshift(ineq$concern, : variable lengths differ (found for 'lshift(ineq$concern, 1)')
Introducing the lag changed the variable lengths, and we probably don’t want to have to think about time series operation in our model specification, anyway. Other software has unified this operation by introducing a common syntax, like d. for differencing and l. for lagging, or even l.d. for lag differences (what program could that be?). In R, though, it remains more of a nuisance than it should be. The dynardl
function helps to remedy this challenge by encouraging the user only to think about model structure in exactly the way outlined above: which variables \(x\) get lags \(l\) and lagged differences \(q\), etc. The function expects a basic formula of all the variables in the model, the data set (data
), if there is one, then lagged levels (lags
) and lagged differences (lagdiffs
) as a list and differences (diffs
) and contemporaneous levels (levels
) as a vector. The sole exception is if the user wants to run the model in differences, s/he needs to specify ec = TRUE
(in an effort to force us to think critically about the error-correction form of the model). For instance, our above model now becomes (ignoring the simulate
option for now)
dynardl(concern ~ incshare10 + urate, data = ineq,
lags = list("concern" = 1, "incshare10" = 1),
diffs = c("incshare10", "urate"),
ec = TRUE, simulate = FALSE)
Purely hypothetically, if we wanted more lagged levels of concern
, we would just change 1
to c(1:5)
for lags at \(t-1\) to \(t-5\), or any other number of lags. If we wanted to include the first-difference of urate
lagged at periods \(t-1\) and \(t-2\), we would now run
dynardl(concern ~ incshare10 + urate, data = ineq,
lags = list("concern" = 1, "incshare10" = 1),
diffs = c("incshare10", "urate"),
lagdiffs = list("urate" = c(1, 2)),
ec = TRUE, simulate = FALSE)
If the variable can appear in multiple periods (lags
and lagdiffs
), it must be specified as a list. If it can only appear contemporaneously (levels
and diffs
), it must be specified as a vector. (The alternative was to specify levels
through a lag
at time 0: this did not seem practical.) We can also add or suppress a constant from the model by specifying constant = TRUE/FALSE
, and we can do the same with a linear trend term with trend = TRUE/FALSE
(by default, constants are included and trends are not).
The option ec
specifies the nature of the dependent variable. If it is possibly error-correcting (ec = TRUE
), it is run in differences, or period-over-period changes. If it is not (ec = FALSE
), the dependent variable is run in levels.
At this point, dynardl
is just a unifying way of thinking about time series models for estimation. Yet it is going to offer two other important advantages: interpreting the effects of our variables, and executing a powerful test for cointegration. We will start with the latter. Remember testing for I(1) processes earlier: each test indicated that concern
was I(1). Running the same tests for each of incshare10
and urate
suggests that all three series are I(1). This might cause us concern about cointegration: the special relationship between I(1) series where the series are in a long-run relationship, even if they move apart in the short term. Cointegrating series are difficult to uncover. Traditional methods, like the Engle and Granger (1987) two-step method or likelihood-based approaches (Johansen 1995) too often conclude cointegration when it does not exist, at least in smaller sample sizes (Philips 2018). An alternative test (Pesaran, Shin, and Smith 2001), which we refer to as the ARDL-bounds procedure, performs much better in small samples (\(t < 80\)), but it is more cumbersome to implement. The package dynamac is meant to resolve that deficiency by implementing critical value testing procedure for the user. Following Philips (2018), it requires estimating the relationship in error-correction form, obtaining statistics of interest, and then testing them via the function pssbounds
.
pssbounds
The ARDL-bounds procedure begins with two preliminaries. First, we must ensure that the regressors are not of order I(2) or more and that any seasonal components have been removed. We demonstrated this above when we found that the first-difference of incshare10
and urate
were both stationary. In addition, there were no seasonal components, looking at the series (although more formal diagnostics are probably warranted). Second, we must ensure that the dependent variable is integrated of order I(1). And again, above, we demonstrated that incshare10
is integrated.
The next step in ARDL-bounds is to estimate the error-correction form of the model. Two points are key: the independent variables that are potentially I(1) must be entered in levels, and the resulting residuals of the error-correction ARDL model must be white noise (random with no residual autocorrelation). Here, that means that we run our dependent variable in differences, we include the lagged levels of the dependent variable, we include the levels of the potentially cointegrating variable, incshare10
, and the short-run effects of urate
through differences. Returning to our model above, and using dynardl
, this is now straightforward:
res1 <- dynardl(concern ~ incshare10 + urate, data = ineq,
lags = list("concern" = 1, "incshare10" = 1),
diffs = c("incshare10", "urate"),
ec = TRUE, simulate = FALSE)
#> [1] "Error correction (EC) specified; dependent variable to be run in differences."
summary(res1)
#>
#> Call:
#> lm(formula = as.formula(paste(paste(dvnamelist), "~", paste(colnames(IVs),
#> collapse = "+"), collapse = " ")))
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.025848 -0.005293 0.000692 0.006589 0.031563
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 0.122043 0.027967 4.364 7.87e-05 ***
#> l.1.concern -0.167655 0.048701 -3.443 0.0013 **
#> d.1.incshare10 0.800585 0.296620 2.699 0.0099 **
#> d.1.urate 0.001118 0.001699 0.658 0.5138
#> l.1.incshare10 -0.068028 0.031834 -2.137 0.0383 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.01169 on 43 degrees of freedom
#> (1 observation deleted due to missingness)
#> Multiple R-squared: 0.3671, Adjusted R-squared: 0.3083
#> F-statistic: 6.236 on 4 and 43 DF, p-value: 0.0004755
Next we need to ensure that the residuals from this model are white noise. To help with this, we also introduce dynardl.auto.correlated
. This expects a dynardl
model and implements a few white-noise-residual tests with readable output that reminds of the null hypotheses. Here, we just run
dynardl.auto.correlated(res1)
#>
#> ------------------------------------------------------
#> Breusch-Godfrey LM Test
#> Test statistic: 3.704
#> p-value: 0.054
#> H_0: no autocorrelation up to AR 1
#>
#> ------------------------------------------------------
#> Shapiro-Wilk Test for Normality
#> Test statistic: 0.965
#> p-value: 0.158
#> H_0: residuals are distributed normal
#>
#> ------------------------------------------------------
#> Log-likelihood: 148.094
#> AIC: -284.187
#> BIC: -272.96
#> Note: AIC and BIC calculated with k = 5 on T = 48 observations.
#>
#> ------------------------------------------------------
#> Breusch-Godfrey test indicates we reject the null hypothesis of no autocorrelation at p < 0.10.
#> Add lags to remove autocorrelation before running dynardl simulations.
At a reasonable level of significance (\(p < 0.10\)), the BG test indicates that we reject the null hypothesis of no autocorrelation in the residuals of the model in res
. Philips (2018) indicates that the next step is to add lagged first-differences to our model. To add a lagged difference of \(y\), we would run
res2 <- dynardl(concern ~ incshare10 + urate, data = ineq,
lags = list("concern" = 1, "incshare10" = 1),
diffs = c("incshare10", "urate"),
lagdiffs = list("concern" = 1),
ec = TRUE, simulate = FALSE)
#> [1] "Error correction (EC) specified; dependent variable to be run in differences."
summary(res2)
#>
#> Call:
#> lm(formula = as.formula(paste(paste(dvnamelist), "~", paste(colnames(IVs),
#> collapse = "+"), collapse = " ")))
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.027161 -0.006046 0.001101 0.007727 0.026620
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 0.145837 0.031144 4.683 3.09e-05 ***
#> l.1.concern -0.195132 0.052971 -3.684 0.000665 ***
#> ld.1.concern -0.195748 0.131225 -1.492 0.143434
#> d.1.incshare10 0.679022 0.302936 2.241 0.030471 *
#> d.1.urate 0.001068 0.001665 0.641 0.524878
#> l.1.incshare10 -0.085775 0.032804 -2.615 0.012434 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.01144 on 41 degrees of freedom
#> (2 observations deleted due to missingness)
#> Multiple R-squared: 0.4174, Adjusted R-squared: 0.3464
#> F-statistic: 5.875 on 5 and 41 DF, p-value: 0.0003507
and then again test for autocorrelation.
dynardl.auto.correlated(res2)
#>
#> ------------------------------------------------------
#> Breusch-Godfrey LM Test
#> Test statistic: 0.453
#> p-value: 0.501
#> H_0: no autocorrelation up to AR 1
#>
#> ------------------------------------------------------
#> Shapiro-Wilk Test for Normality
#> Test statistic: 0.986
#> p-value: 0.857
#> H_0: residuals are distributed normal
#>
#> ------------------------------------------------------
#> Log-likelihood: 146.636
#> AIC: -279.271
#> BIC: -266.32
#> Note: AIC and BIC calculated with k = 6 on T = 47 observations.
#>
#> ------------------------------------------------------
length(res2$model$residuals)
#> [1] 47
We can now be much more confident that there is no more autocorrelation in the residuals. At this point, we can execute the ARDL-bounds test procedure. We need a few pieces of information: the length of the time series, the \(t\)-statistic on the lagged dependent variable in the ARDL error-correction model, the “case” of the regression (a combination of whether the intercept, if any, and trend, if any, were restricted (Pesaran, Shin, and Smith 2001)), and the number of regressors \(k\) appearing in first lags (NOT including the lagged dependent variable). We also need the number of observations in the model time series and the \(F\)-statistic on the restriction that the first lags of each of the variables in the model (including the lagged dependent variable) are jointly equal to zero.
From our regression, we know the \(t\)-statistic on the lagged dependent variable is -3.684 (just looking at the output). Additionally, we estimated a model with an unrestricted intercept and no trend, which happens to be case 3 (which we would know by referencing Pesaran, Shin, and Smith (2001)). There are \(k = 1\) variables in first lags (incshare10
, not including the LDV), and the number of obs
in the model is 47 periods. We now only need the test of the restriction that the first lags are equal to zero. We can calculate this by hand through coefficient testing. If we have all of the coefficients, we just need to compare them to the set that are lagged levels:
coef(res2$model)
#> (Intercept) l.1.concern ld.1.concern d.1.incshare10 d.1.urate
#> 0.145837457 -0.195131693 -0.195747775 0.679022362 0.001067503
#> l.1.incshare10
#> -0.085775490
# The second coefficient is the LDV, and the last coefficient is also a first lag. So they're both restricted
B <- coef(res2$model)
V <- vcov(res2$model)
# tag restrictions on LDV and l.1.incshare10
R <- matrix(c(0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1), byrow = T, nrow = 2)
k.plus1 <- sum(R)
# Restriction is that it is equal to 0
q <- 0
fstat <- (1/k.plus1)*t(R%*%B-q)%*%solve(R%*%V%*%t(R))%*%(R%*%B-q)
fstat
#> [,1]
#> [1,] 12.20351
To test for cointegration, we need special critical values for the F-test statistic. For this, we’re ready for pssbounds
pssbounds(obs = 47, fstat = 12.20351, tstat = -3.684, case = 3, k = 1)
#>
#> PESARAN, SHIN AND SMITH (2001) COINTEGRATION TEST
#>
#> Observations: 47
#> Number of Lagged Regressors (not including LDV) (k): 1
#> Case: 3 (Unrestricted intercept; no trend)
#>
#> ------------------------------------------------------
#> - F-test -
#> ------------------------------------------------------
#> <------- I(0) ------------ I(1) ----->
#> 10% critical value 4.19 4.94
#> 5% critical value 5.22 6.07
#> 1% critical value 7.56 8.69
#>
#>
#> F-statistic = 12.204
#> ------------------------------------------------------
#> - t-test -
#> ------------------------------------------------------
#> <------- I(0) ------------ I(1) ----->
#> 10% critical value -2.57 -2.91
#> 5% critical value -2.86 -3.22
#> 1% critical value -3.43 -3.82
#>
#>
#> t statistic = -3.684
#> ------------------------------------------------------
#>
#> t-statistic note: Small-sample critical values not provided for Case III. Asymptotic critical values used.
Finally, a payoff. The \(t\)-statistic and \(F\)-statistic are situated between special sample critical values. Depending on the level of significance that we preselected, these values offer a number of different conclusions. If the \(F\)-statistic exceeds the upper I(1) critical value, we may conclude cointegration. Thus, we can be confident that we have a well-specified model. If the F-statistic falls below the I(0) critical value, Pesaran, Shin, and Smith (2001) note that this indicates that all regressors are in fact stationary. Thus, cointegration cannot exist. If the F-statistic falls between the lower I(0) and upper I(1) critical values, the results are inconclusive. This means that cointegration may exist, but further testing and re-specification of the model is needed. Users that end up with these results are encouraged to see Philips (2018), who provides a strategy for dealing with this.
In our model here, since the value of the \(F\)-statistic exceeds the critical value at the upper I(1) bound of the test at the 5% level, we may conclude that Income Top 10 (the variable in the model in levels) and the dependent variable, Concern, are in a cointegrating relationship. This is furthered by the \(t\)-statistic of -3.684 falling below the 5% critical value for I(1). Thus, we can be confident both in our model specification and the unique, long-run relationship that exists between the two variables.
Calculating all of those values by hand was unnecessarily involved, especially since we know most of the values are saved post-estimation anyway. We’re further motivated not to have to reference the tables in Pesaran, Shin, and Smith (2001) for the case of the regression every time we run pssbounds
. Thus, our preferred way of estimating the ARDL-bounds test is just by passing the error-correction model to pssbounds
. In other words, the test is equivalent by just running
pssbounds(res2)
#>
#> PESARAN, SHIN AND SMITH (2001) COINTEGRATION TEST
#>
#> Observations: 47
#> Number of Lagged Regressors (not including LDV) (k): 1
#> Case: 3 (Unrestricted intercept; no trend)
#>
#> ------------------------------------------------------
#> - F-test -
#> ------------------------------------------------------
#> <------- I(0) ------------ I(1) ----->
#> 10% critical value 4.19 4.94
#> 5% critical value 5.22 6.07
#> 1% critical value 7.56 8.69
#>
#>
#> F-statistic = 12.204
#> ------------------------------------------------------
#> - t-test -
#> ------------------------------------------------------
#> <------- I(0) ------------ I(1) ----->
#> 10% critical value -2.57 -2.91
#> 5% critical value -2.86 -3.22
#> 1% critical value -3.43 -3.82
#>
#>
#> t statistic = -3.684
#> ------------------------------------------------------
#>
#> t-statistic note: Small-sample critical values not provided for Case III. Asymptotic critical values used.
Pesaran, Shin, and Smith (2001) also provide for testing that the intercept is also restricted, which they refer to as case 2. If we wanted to add this restriction to the test, we would use restriction = TRUE
in pssbounds
.
pssbounds(res2, restriction = TRUE)
#>
#> PESARAN, SHIN AND SMITH (2001) COINTEGRATION TEST
#>
#> Observations: 47
#> Number of Lagged Regressors (not including LDV) (k): 1
#> Case: 2 (Intercept included in F-stat restriction; no trend)
#>
#> ------------------------------------------------------
#> - F-test -
#> ------------------------------------------------------
#> <------- I(0) ------------ I(1) ----->
#> 10% critical value 3.18 3.65
#> 5% critical value 3.86 4.44
#> 1% critical value 5.50 6.24
#>
#>
#> F-statistic = 8.137
#>
#> ------------------------------------------------------
#>
#> t-statistic note: Critical values do not currently exist for Case II.
Of course, users are encouraged to read Pesaran, Shin, and Smith (2001) and Philips (2018) for the implications of this test. We merely note that cases 2 and 4 accessible through restriction = TRUE
. We also note that, for most users most of the time, case 3 will be the most common case (an unrestricted intercept and no trend).
Any dynardl
object run in an error-correction format is available for pssbounds
post-estimation. This, of course, is not meant to imply that all dynardl
models are meant to be tested for cointegration. Stationary dependent variables cannot be cointegrated with other variables. As such, we would want to run those models in levels, with EC = FALSE
. Like always, we point users to the importance of pre-regression testing of the nature of our variables, most specifically whether or not they are integrated. dynardl
will attempt to be as helpful as possible, but in the end, it is up to the user to know which model is appropriate and to ensure that the model is specified correctly.
We had another motivation. dynardl
+ pssbounds
are a powerful combination for estimating and testing for cointegration in a unified framework. But once we have tested for cointegration, we still want inferences about the nature of the effect of some \(x\) variable on the dependent variable. And these inferences become much more difficult to obtain as our models increase in complexity. For instance, the very lagged difference of \(y\) that we introduced to help purge autocorrelation from the model we just estimated made interpreting the coefficients from that model much less straightforward. In the next section, we elaborate on the ability of dynardl
to provide these inferences through the process we outlined earlier: counterfactual simulation of a response to a shock in some \(x\) variable. These inferences do not require anything beyond what we have already covered: a dynardl
model and a theoretically interesting \(x\) variable.
dynardl
ARDL models are useful because they are flexible, but their flexibility undermines our ability to make sense of the coefficients estimated in any given model. An alternative approach to interpretation is to use the coefficients from an estimated model to simulate meaningful responses in the dependent variable to counterfactual changes in an independent variable \(x\) (that we control), allowing the change to filter through the various forms of the \(x\) variable in the model, as well as different forms of the \(y\) variable (like differences and lagged levels) that might be included.
dynamac handles this simulated interpretation through a function we have already seen: dynardl
. All we need to do is specify that simulate = TRUE
, and we will produce simulated responses: we can observe how a change to a variable in a dynardl
model produces a corresponding change in the dependent variable. Other arguments are required, but only one has no default: we need to know which \(x\) variable to simulate a shock to (the shockvar
). This “shock” means that, at a time specified by the user, the value of an \(x\) variable will move to some level. If the variable is in levels or lagged levels, this means that its new value becomes the pre-shock average plus whatever the shock value is. If the variable is in differences or lagged differences, the shock lasts for one period (as a permanent change in a differenced variable would imply that it is changing every period!).
dynardl
has reasonable defaults for all of the other required parameters: simulations default to a range
of 20 periods, lines are drawn at roughly sig
= 95% confidence, the shockvar
is shocked by a standard deviation (the shockval
), we use 1,000 sims
to simulate the average response, we don’t force the other \(x\) variables to any values (we allow them to take their means, except for differenced variables, which we set to be zero: assuming that, period-over-period, there is no change), we allow for 20 periods of burnin
, and we create predicted values of the dependent variable, rather than expected values. All of these options can be controlled by the user, but we’ll return to that in a moment.
The simulation process is fairly straightforward. dynardl
uses the coefficients from the model. It draws a number (specifically, sims
) of values of the coefficients \(\hat\beta\) from the estimated regression model. The distribution is assumed to be multivariate normal with mean of \(\hat\beta\) and variance-covariance of the estimated model. Uncertainty is incorporated by simulating \(\hat\sigma\) \(^{2}\) as a scaled draw from the chi-squared distribution. These fit together by using values of the independent variables (usually their means: see the preceding paragraph) \(X\), multiplying by \(\hat\beta\) to get predicted values of the dependent variable \(y\), and then using \(\hat \sigma\) \(^{2}\) to introduce uncertainty in the predicted values. If you want to exert more direct control over the values of any \(x\) variables used, you can forceset
them to any other value you like. This is not limited to means or integers; if you have any substantively interesting value you’d like to hold a variable at, you are free to specify whichever value you like.
But what we’re really interested in is the effect of some variable \(x\). In the world of dynardl
, this is our shockvar
. We specify a variable from the model, tell dynardl
how much we want it to theoretically change by (in shockval
, defaulting to a standard deviation of the shockvar
) and when we want it to change (time
), and then observe the change in \(y\).
Let’s go back to our earlier model. Remember we ran
res2 <- dynardl(concern ~ incshare10 + urate, data = ineq,
lags = list("concern" = 1, "incshare10" = 1),
diffs = c("incshare10", "urate"),
lagdiffs = list("concern" = 1),
ec = TRUE, simulate = FALSE)
summary(res2)
Here, we set simulate = FALSE
because we were just illustrating estimation without simulations (given that simulations take a few seconds to estimate, we may only want to simulate changes when we have a final model, free of autocorrelation and the other things we tested for). Now, we’ll turn simulate = TRUE
and specify an \(x\) variable to observe the response of. We’ll observe the changes resulting from incshare10
, given its lagged level demonstrated a significant effect. In other words, our shockvar = "incshare10"
. By default, dynardl
will shock it by its standard deviation, but we can observe any change we like with shockval
. As with any other time in which stochastic values are involved, we should set a seed to ensure our results are directly replicable.
set.seed(020990)
res2 <- dynardl(concern ~ incshare10 + urate, data = ineq,
lags = list("concern" = 1, "incshare10" = 1),
diffs = c("incshare10", "urate"),
lagdiffs = list("concern" = 1),
ec = TRUE, simulate = TRUE,
shockvar = "incshare10")
#> [1] "Error correction (EC) specified; dependent variable to be run in differences."
#> [1] "incshare10 shocked by one standard deviation of incshare10 by default."
#> [1] "dynardl estimating ..."
#>
|
| | 0%
|
|=== | 5%
|
|===== | 8%
|
|====== | 10%
|
|======== | 12%
|
|========== | 15%
|
|=========== | 18%
|
|============= | 20%
|
|=============== | 22%
|
|================ | 25%
|
|================== | 28%
|
|==================== | 30%
|
|===================== | 32%
|
|======================= | 35%
|
|======================== | 38%
|
|========================== | 40%
|
|============================ | 42%
|
|============================= | 45%
|
|=============================== | 48%
|
|================================ | 50%
|
|================================== | 52%
|
|==================================== | 55%
|
|===================================== | 58%
|
|======================================= | 60%
|
|========================================= | 62%
|
|========================================== | 65%
|
|============================================ | 68%
|
|============================================== | 70%
|
|=============================================== | 72%
|
|================================================= | 75%
|
|================================================== | 78%
|
|==================================================== | 80%
|
|====================================================== | 82%
|
|======================================================= | 85%
|
|========================================================= | 88%
|
|========================================================== | 90%
|
|============================================================ | 92%
|
|============================================================== | 95%
|
|=============================================================== | 98%
|
|=================================================================| 100%
summary(res2$model)
#>
#> Call:
#> lm(formula = as.formula(paste(paste(dvnamelist), "~", paste(colnames(IVs),
#> collapse = "+"), collapse = " ")))
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.027161 -0.006046 0.001101 0.007727 0.026620
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 0.145837 0.031144 4.683 3.09e-05 ***
#> l.1.concern -0.195132 0.052971 -3.684 0.000665 ***
#> ld.1.concern -0.195748 0.131225 -1.492 0.143434
#> d.1.incshare10 0.679022 0.302936 2.241 0.030471 *
#> l.1.incshare10 -0.085775 0.032804 -2.615 0.012434 *
#> d.1.urate 0.001068 0.001665 0.641 0.524878
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.01144 on 41 degrees of freedom
#> (2 observations deleted due to missingness)
#> Multiple R-squared: 0.4174, Adjusted R-squared: 0.3464
#> F-statistic: 5.875 on 5 and 41 DF, p-value: 0.0003507
It looks somewhat goofy in the vignette, as dynardl
displays a progress bar to inform you of how many simulations have been completed. But the payoff is in res2$simulation
. We get to observe the effect of a standard-deviation shock in incshare10
in the dependent variable. This response is best visualized in a plot. To see this plot, the model can be saved to an object, and plots can be created with dynardl.simulation.plot
.
Here, the shock has an immediate effect that does not dissipate for a long time. So long, in fact, that we might want to lengthen the range
of the simulation.
set.seed(020990)
res3 <- dynardl(concern ~ incshare10 + urate, data = ineq,
lags = list("concern" = 1, "incshare10" = 1),
diffs = c("incshare10", "urate"),
lagdiffs = list("concern" = 1),
ec = TRUE, simulate = TRUE, range = 30,
shockvar = "incshare10")
#> [1] "Error correction (EC) specified; dependent variable to be run in differences."
#> [1] "incshare10 shocked by one standard deviation of incshare10 by default."
#> [1] "dynardl estimating ..."
#>
|
| | 0%
|
|=== | 4%
|
|==== | 6%
|
|===== | 8%
|
|====== | 10%
|
|======== | 12%
|
|========= | 14%
|
|========== | 16%
|
|============ | 18%
|
|============= | 20%
|
|============== | 22%
|
|================ | 24%
|
|================= | 26%
|
|================== | 28%
|
|==================== | 30%
|
|===================== | 32%
|
|====================== | 34%
|
|======================= | 36%
|
|========================= | 38%
|
|========================== | 40%
|
|=========================== | 42%
|
|============================= | 44%
|
|============================== | 46%
|
|=============================== | 48%
|
|================================ | 50%
|
|================================== | 52%
|
|=================================== | 54%
|
|==================================== | 56%
|
|====================================== | 58%
|
|======================================= | 60%
|
|======================================== | 62%
|
|========================================== | 64%
|
|=========================================== | 66%
|
|============================================ | 68%
|
|============================================== | 70%
|
|=============================================== | 72%
|
|================================================ | 74%
|
|================================================= | 76%
|
|=================================================== | 78%
|
|==================================================== | 80%
|
|===================================================== | 82%
|
|======================================================= | 84%
|
|======================================================== | 86%
|
|========================================================= | 88%
|
|========================================================== | 90%
|
|============================================================ | 92%
|
|============================================================= | 94%
|
|============================================================== | 96%
|
|================================================================ | 98%
|
|=================================================================| 100%
dynardl.simulation.plot(res3, type = "area", response = "levels")
If instead of an area plot we desired a spike plot:
In res3
there are actually three sets of output. The first is the model, the second is the information for pssbounds
, and the last is for plotting. We mention this so that users can use their usual TeX shortcuts for creating tables, customizing plots, or testing using pssbounds later.
res3$model
#>
#> Call:
#> lm(formula = as.formula(paste(paste(dvnamelist), "~", paste(colnames(IVs),
#> collapse = "+"), collapse = " ")))
#>
#> Coefficients:
#> (Intercept) l.1.concern ld.1.concern d.1.incshare10
#> 0.145837 -0.195132 -0.195748 0.679022
#> l.1.incshare10 d.1.urate
#> -0.085775 0.001068
res3$pssbounds
#> NULL
res3$simulation
#> time central ll95 ll90 ll75 ul75 ul90
#> 1 1 0.5792020 0.5559682 0.5592160 0.5651339 0.5925600 0.5991881
#> 2 2 0.5787013 0.5554403 0.5600252 0.5654825 0.5917471 0.5974948
#> 3 3 0.5789356 0.5570463 0.5608863 0.5668301 0.5918044 0.5975220
#> 4 4 0.5787997 0.5551703 0.5592707 0.5650761 0.5922909 0.5979111
#> 5 5 0.5793091 0.5542530 0.5602907 0.5658583 0.5931276 0.5988996
#> 6 6 0.5795942 0.5571192 0.5600127 0.5667194 0.5924350 0.5988118
#> 7 7 0.5798832 0.5581873 0.5605118 0.5657941 0.5944158 0.5999712
#> 8 8 0.5802234 0.5593644 0.5619827 0.5672503 0.5935926 0.5994727
#> 9 9 0.5799968 0.5577335 0.5610041 0.5665941 0.5933536 0.6005407
#> 10 10 0.6167511 0.5797056 0.5859896 0.5951996 0.6386637 0.6465144
#> 11 11 0.5973816 0.5713986 0.5756398 0.5816251 0.6119134 0.6175554
#> 12 12 0.5928346 0.5692578 0.5727756 0.5786376 0.6069353 0.6135546
#> 13 13 0.5856054 0.5618984 0.5655589 0.5715046 0.5995451 0.6052888
#> 14 14 0.5809483 0.5572153 0.5613369 0.5670413 0.5950571 0.6012514
#> 15 15 0.5763318 0.5541748 0.5574485 0.5625207 0.5899999 0.5960573
#> 16 16 0.5727459 0.5492940 0.5532698 0.5591775 0.5864000 0.5930059
#> 17 17 0.5703495 0.5464895 0.5504217 0.5564761 0.5840667 0.5899451
#> 18 18 0.5672469 0.5449277 0.5480448 0.5539188 0.5807144 0.5869591
#> 19 19 0.5653370 0.5419017 0.5455841 0.5514246 0.5786327 0.5847831
#> 20 20 0.5636972 0.5388558 0.5430262 0.5495522 0.5778304 0.5833093
#> 21 21 0.5625315 0.5387461 0.5422292 0.5486370 0.5763966 0.5820638
#> 22 22 0.5615651 0.5397573 0.5425750 0.5477181 0.5743292 0.5819248
#> 23 23 0.5602949 0.5357544 0.5400916 0.5464939 0.5742114 0.5800424
#> 24 24 0.5592019 0.5368418 0.5398974 0.5463206 0.5729166 0.5789579
#> 25 25 0.5584347 0.5346550 0.5389714 0.5445627 0.5725257 0.5790650
#> 26 26 0.5578433 0.5336365 0.5377534 0.5429992 0.5720966 0.5782946
#> 27 27 0.5579985 0.5348062 0.5376151 0.5437441 0.5717968 0.5774593
#> 28 28 0.5566465 0.5326561 0.5369266 0.5424172 0.5703003 0.5769466
#> 29 29 0.5569417 0.5340579 0.5369856 0.5440453 0.5700653 0.5755345
#> 30 30 0.5565073 0.5328118 0.5361860 0.5422267 0.5701398 0.5762442
#> ul95 d.central d.ll95 d.ll90 d.ll75
#> 1 0.6018618 0.0004254538 -0.0228083716 -0.019560503 -0.01364261
#> 2 0.6014550 -0.0005007145 -0.0237617318 -0.019176795 -0.01371949
#> 3 0.6000707 0.0002343734 -0.0216549368 -0.017814920 -0.01187115
#> 4 0.6012698 -0.0001358976 -0.0237653150 -0.019664922 -0.01385957
#> 5 0.6030645 0.0005093204 -0.0245467083 -0.018509030 -0.01294141
#> 6 0.6033832 0.0002850980 -0.0221898171 -0.019296377 -0.01258964
#> 7 0.6036243 0.0002889919 -0.0214068575 -0.019082324 -0.01380002
#> 8 0.6040383 0.0003402761 -0.0205187529 -0.017900491 -0.01263282
#> 9 0.6045266 -0.0002265877 -0.0224899460 -0.019219351 -0.01362931
#> 10 0.6532313 0.0367542274 -0.0002912449 0.005992723 0.01520280
#> 11 0.6226215 -0.0193695051 -0.0453524455 -0.041111309 -0.03512593
#> 12 0.6161177 -0.0045469415 -0.0281237411 -0.024605934 -0.01874393
#> 13 0.6104257 -0.0072292034 -0.0309362178 -0.027275693 -0.02133002
#> 14 0.6051003 -0.0046571172 -0.0283901378 -0.024268563 -0.01856416
#> 15 0.5998357 -0.0046164894 -0.0267735305 -0.023499795 -0.01842756
#> 16 0.5968420 -0.0035858975 -0.0270377883 -0.023061974 -0.01715427
#> 17 0.5941536 -0.0023964171 -0.0262563905 -0.022324247 -0.01626981
#> 18 0.5911982 -0.0031025991 -0.0254217833 -0.022304670 -0.01643069
#> 19 0.5902170 -0.0019098826 -0.0253452050 -0.021662811 -0.01582229
#> 20 0.5869392 -0.0016398588 -0.0264812467 -0.022310866 -0.01578478
#> 21 0.5874475 -0.0011656724 -0.0249511070 -0.021467954 -0.01506013
#> 22 0.5855887 -0.0009664072 -0.0227742211 -0.019956522 -0.01481340
#> 23 0.5847322 -0.0012701799 -0.0258106578 -0.021473496 -0.01507121
#> 24 0.5824619 -0.0010929582 -0.0234530774 -0.020397472 -0.01397435
#> 25 0.5822461 -0.0007672501 -0.0245469469 -0.020230557 -0.01463928
#> 26 0.5832261 -0.0005913811 -0.0247981816 -0.020681296 -0.01543551
#> 27 0.5817657 0.0001551995 -0.0230371348 -0.020228196 -0.01409920
#> 28 0.5808367 -0.0013519643 -0.0253424119 -0.021071924 -0.01558132
#> 29 0.5788088 0.0002951263 -0.0225886396 -0.019660904 -0.01260129
#> 30 0.5801271 -0.0004343467 -0.0241298890 -0.020755687 -0.01471502
#> d.ul75 d.ul90 d.ul95 shocktime
#> 1 0.013783426 0.020411585 0.023085233 10
#> 2 0.012545130 0.018292793 0.022253016 10
#> 3 0.013103084 0.018820762 0.021369451 10
#> 4 0.013355292 0.018975487 0.022334112 10
#> 5 0.014327884 0.020099900 0.024264724 10
#> 6 0.013125935 0.019502703 0.024074178 10
#> 7 0.014821609 0.020376991 0.024030172 10
#> 8 0.013709454 0.019589539 0.024155147 10
#> 9 0.013130146 0.020317277 0.024303148 10
#> 10 0.058666890 0.066517512 0.073234459 10
#> 11 -0.004837701 0.000804376 0.005870427 10
#> 12 0.009553755 0.016173082 0.018736182 10
#> 13 0.006710462 0.012454184 0.017591054 10
#> 14 0.009451657 0.015646006 0.019494915 10
#> 15 0.009051613 0.015108969 0.018887356 10
#> 16 0.010068223 0.016674052 0.020510154 10
#> 17 0.011320804 0.017199230 0.021407646 10
#> 18 0.010364867 0.016609624 0.020848743 10
#> 19 0.011385780 0.017536248 0.022970073 10
#> 20 0.012493344 0.017972330 0.021602221 10
#> 21 0.012699410 0.018366642 0.023750308 10
#> 22 0.011797668 0.019393270 0.023057248 10
#> 23 0.012646361 0.018477312 0.023167130 10
#> 24 0.012621721 0.018663032 0.022166954 10
#> 25 0.013323770 0.019863015 0.023044154 10
#> 26 0.013661891 0.019859865 0.024791362 10
#> 27 0.013953479 0.019616014 0.023922348 10
#> 28 0.012301787 0.018948054 0.022838160 10
#> 29 0.013418715 0.018887972 0.022162296 10
#> 30 0.013198078 0.019302571 0.023185472 10
Only models where ec = TRUE
will have $pssbounds
information, as only those models can possibly be testing a cointegrating relationship. We suspect that no one will need them by hand, as you can just pass the whole object via pssbounds(res3)
to get the same results.
Other options might be of interest. In order to smooth our confidence intervals (note: this does NOT make us more confident), we can increase the number of simulations. Additionally, the plotting function allows the user to produce plot in grayscale (for publications). Consider the following (notice the sims
argument of dynardl
and the new arguments under dynardl.simulation.plot
:
set.seed(020990)
res4 <- dynardl(concern ~ incshare10 + urate, data = ineq,
lags = list("concern" = 1, "incshare10" = 1),
diffs = c("incshare10", "urate"),
lagdiffs = list("concern" = 1),
ec = TRUE, simulate = TRUE, range = 30, sims = 10000,
shockvar = "incshare10")
#> [1] "Error correction (EC) specified; dependent variable to be run in differences."
#> [1] "incshare10 shocked by one standard deviation of incshare10 by default."
#> [1] "dynardl estimating ..."
#>
|
| | 0%
|
|=== | 4%
|
|==== | 6%
|
|===== | 8%
|
|====== | 10%
|
|======== | 12%
|
|========= | 14%
|
|========== | 16%
|
|============ | 18%
|
|============= | 20%
|
|============== | 22%
|
|================ | 24%
|
|================= | 26%
|
|================== | 28%
|
|==================== | 30%
|
|===================== | 32%
|
|====================== | 34%
|
|======================= | 36%
|
|========================= | 38%
|
|========================== | 40%
|
|=========================== | 42%
|
|============================= | 44%
|
|============================== | 46%
|
|=============================== | 48%
|
|================================ | 50%
|
|================================== | 52%
|
|=================================== | 54%
|
|==================================== | 56%
|
|====================================== | 58%
|
|======================================= | 60%
|
|======================================== | 62%
|
|========================================== | 64%
|
|=========================================== | 66%
|
|============================================ | 68%
|
|============================================== | 70%
|
|=============================================== | 72%
|
|================================================ | 74%
|
|================================================= | 76%
|
|=================================================== | 78%
|
|==================================================== | 80%
|
|===================================================== | 82%
|
|======================================================= | 84%
|
|======================================================== | 86%
|
|========================================================= | 88%
|
|========================================================== | 90%
|
|============================================================ | 92%
|
|============================================================= | 94%
|
|============================================================== | 96%
|
|================================================================ | 98%
|
|=================================================================| 100%
dynardl.simulation.plot(res4, type = "spike", response = "levels", bw = TRUE)
The full extent of these options is addressed in the documentation.
There is some question as to whether or not quantities of interest from these types of stochastic simulations are best summarized by the means or the medians of the resulting distributions. In most cases, the difference is likely to be minimal. In cases of skew, though, the median might serve significantly better than the mean (described in Rainey (2017)). Here, we can do that by setting qoi = median
.
set.seed(020990)
res5 <- dynardl(concern ~ incshare10 + urate, data = ineq,
lags = list("concern" = 1, "incshare10" = 1),
diffs = c("incshare10", "urate"),
lagdiffs = list("concern" = 1),
ec = TRUE, simulate = TRUE, range = 30,
shockvar = "incshare10", qoi = "median")
#> [1] "Error correction (EC) specified; dependent variable to be run in differences."
#> [1] "incshare10 shocked by one standard deviation of incshare10 by default."
#> [1] "dynardl estimating ..."
#>
|
| | 0%
|
|=== | 4%
|
|==== | 6%
|
|===== | 8%
|
|====== | 10%
|
|======== | 12%
|
|========= | 14%
|
|========== | 16%
|
|============ | 18%
|
|============= | 20%
|
|============== | 22%
|
|================ | 24%
|
|================= | 26%
|
|================== | 28%
|
|==================== | 30%
|
|===================== | 32%
|
|====================== | 34%
|
|======================= | 36%
|
|========================= | 38%
|
|========================== | 40%
|
|=========================== | 42%
|
|============================= | 44%
|
|============================== | 46%
|
|=============================== | 48%
|
|================================ | 50%
|
|================================== | 52%
|
|=================================== | 54%
|
|==================================== | 56%
|
|====================================== | 58%
|
|======================================= | 60%
|
|======================================== | 62%
|
|========================================== | 64%
|
|=========================================== | 66%
|
|============================================ | 68%
|
|============================================== | 70%
|
|=============================================== | 72%
|
|================================================ | 74%
|
|================================================= | 76%
|
|=================================================== | 78%
|
|==================================================== | 80%
|
|===================================================== | 82%
|
|======================================================= | 84%
|
|======================================================== | 86%
|
|========================================================= | 88%
|
|========================================================== | 90%
|
|============================================================ | 92%
|
|============================================================= | 94%
|
|============================================================== | 96%
|
|================================================================ | 98%
|
|=================================================================| 100%
dynardl.simulation.plot(res5, type = "area", response = "levels")
There are a variety of quantities that are plottable from the simulations, outside of just the response of the dependent variable. dynardl.simulation.plot
includes options for plotting the levels
of the dependent variable, the levels but demeaned (levels.from.mean
) of the dependent variable, and the period-over-period diffs
of the changes in the simulated values. You can get a sense of how the shock to the independent variable is decaying through time by observing the differences in each period as an absolute value (how much is the dependent variable adjusting) through shock.effect.decay.
You can also see the cumulative.diffs
of those changes and the absolute value of the changes cumulative.abs.diffs
. For the final two options, fullsims = TRUE
must be specified in the dynardl
simulation, which reports the full raw simulated values as a part of the dynardl
object. These values are used to draw confidence intervals over the simulated changes.
In addition, dynardl.simulation.plot
will expect a value for when it should regard the changes in the dependent variable as noise, rather than real changes. The default tol
is to disregard changes that are less than 0.1% of the mean of the dependent variable. Alternatively, we can specify a last.period
in which we believe the dependent variable is responding. dynardl.simulation.plot
also allows you to pass whatever normal plotting arguments you would use in the normal ...
way.
set.seed(020990)
res6 <- dynardl(concern ~ incshare10 + urate, data = ineq,
lags = list("concern" = 1, "incshare10" = 1),
diffs = c("incshare10", "urate"),
lagdiffs = list("concern" = 1),
ec = TRUE, simulate = TRUE, range = 30,
shockvar = "incshare10", fullsims = TRUE)
#> [1] "Error correction (EC) specified; dependent variable to be run in differences."
#> [1] "incshare10 shocked by one standard deviation of incshare10 by default."
#> [1] "dynardl estimating ..."
#>
|
| | 0%
|
|=== | 4%
|
|==== | 6%
|
|===== | 8%
|
|====== | 10%
|
|======== | 12%
|
|========= | 14%
|
|========== | 16%
|
|============ | 18%
|
|============= | 20%
|
|============== | 22%
|
|================ | 24%
|
|================= | 26%
|
|================== | 28%
|
|==================== | 30%
|
|===================== | 32%
|
|====================== | 34%
|
|======================= | 36%
|
|========================= | 38%
|
|========================== | 40%
|
|=========================== | 42%
|
|============================= | 44%
|
|============================== | 46%
|
|=============================== | 48%
|
|================================ | 50%
|
|================================== | 52%
|
|=================================== | 54%
|
|==================================== | 56%
|
|====================================== | 58%
|
|======================================= | 60%
|
|======================================== | 62%
|
|========================================== | 64%
|
|=========================================== | 66%
|
|============================================ | 68%
|
|============================================== | 70%
|
|=============================================== | 72%
|
|================================================ | 74%
|
|================================================= | 76%
|
|=================================================== | 78%
|
|==================================================== | 80%
|
|===================================================== | 82%
|
|======================================================= | 84%
|
|======================================================== | 86%
|
|========================================================= | 88%
|
|========================================================== | 90%
|
|============================================================ | 92%
|
|============================================================= | 94%
|
|============================================================== | 96%
|
|================================================================ | 98%
|
|=================================================================| 100%
par(mfrow = c(2, 3))
dynardl.simulation.plot(res6, type = "area", response = "levels")
dynardl.simulation.plot(res6, type = "area", response = "levels.from.mean")
dynardl.simulation.plot(res6, type = "area", response = "diffs")
dynardl.simulation.plot(res6, type = "area", response = "shock.effect.decay")
dynardl.simulation.plot(res6, type = "area", response = "cumulative.diffs", axes = F)
dynardl.simulation.plot(res6, type = "area", response = "cumulative.abs.diffs")
#> Warning in dynardl.simulation.plot(res6, type = "area", response =
#> "cumulative.abs.diffs"): Cumulative absolute effects assumed to be noise
#> (by tolerance) at t = 13.
If we don’t want to see the equilibriating behavior before the shock, we can draw the plot starting in a different period through start.period.
If you’re in an exploratory phase of model building and prefer not to have to run the plots separately, you are free to combine all of them using dynardl.all.plots
.
dynardl
) about data and modelingdynamac is meant to be a unifying package for estimating and interpreting times series ARDL models. It is not, however, a data manager. It assumes that your data are ordered, that the progression between time series makes sense (i.e. there is a consistent unit of time separating the ordered observations), that there are not problematic missing values, and that all the other usual pre-estimation data testing has been performed by the user. Users should know and explore their datasets well before passing those data to any statistical software.
Nor will dynamac stop you from running “bad idea” models. Users should be careful about the order of integration in their variables, whether seasonal unit roots are present, if variables have nuanced lagged-difference structures, and so on. We offer functions (like dynardl.auto.correlated
) to help users make these decisions on their path to a final model. But care must be taken at every step.
Engle, Robert F, and Clive WJ Granger. 1987. “Co-Integration and Error Correction: Representation, Estimation, and Testing.” Econometrica 55 (2). JSTOR: 251–76.
Johansen, Soren. 1995. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford University Press.
Pesaran, M Hashem, Yongcheol Shin, and Richard J Smith. 2001. “Bounds Testing Approaches to the Analysis of Level Relationships.” Journal of Applied Econometrics 16 (3). Wiley Online Library: 289–326. https://doi.org/10.1002/jae.616.
Pfaff, Bernhard, Eric Zivot, and Matthieu Stigler. 2016. Urca: Unit Root and Cointegration Tests for Time Series Data. https://CRAN.R-project.org/package=urca.
Philips, Andrew Q. 2018. “Have Your Cake and Eat It Too? Cointegration and Dynamic Inference from Autoregressive Distributed Lag Models.” American Journal of Political Science 62 (1): 230–44. https://doi.org/10.1111/ajps.12318.
Rainey, Carlisle. 2017. “Transformation-Induced Bias: Unbiased Coefficients Do Not Imply Unbiased Quantities of Interest.” Political Analysis 25 (3). Cambridge University Press: 402–9. https://doi.org/10.1017/pan.2017.11.
Wright, Graham. 2017. “The Political Implications of American Concerns About Economic Inequality.” Political Behavior. Springer, 1–23. https://doi.org/10.1007/s11109-017-9399-3.