The Michel’s calibration strategy (Calibration_Michel()
function) is the calibration algorithm proposed in airGR. However, other optimization methods can be used in combination with airGR. We show here how to use different R packages to perform parameter estimation.
In this vignette, we use the GR4J model to illustrate the different optimization strategies. In particular, we assume that the R global environment contains input climate data, observed discharge and functions from the Get Started vignette, as shown below. Please note that the calibration period is defined in the CreateRunOptions()
function .
example("Calibration_Michel")
In order for the RunModel_*()
functions to run faster during the parameter estimation process, it is recommended that the outputs contain only the simulated flows (see the Outputs_Sim
argument in the CreateRunOptions()
help page).
<- airGR::CreateRunOptions(FUN_MOD = RunModel_GR4J, InputsModel = InputsModel,
RunOptions IndPeriod_Run = Ind_Run,
Outputs_Sim = "Qsim")
Regarding the different optimization strategies presented here, we refer to each package for in-depth information about the description of the methods used.
Please note that this vignette is only for illustration purposes and does not provide any guidance about which optimization strategies is recommended for the family of the GR models.
Parameter estimation can be performed by defining a function that takes a parameter set as input and returns the value of the performance criterion. There are two important steps: the transformation of parameters to real space and the computation of the value of the performance criterion. Here we choose to minimize the root mean square error.
The change of the repository from the “real” parameter space to a “transformed” space ensures homogeneity of displacement in the different dimensions of the parameter space during the step-by-step procedure of the calibration algorithm of the model.
<- function(ParamOptim) {
OptimGR4J ## Transformation of the parameter set to real space
<- airGR::TransfoParam_GR4J(ParamIn = ParamOptim,
RawParamOptim Direction = "TR")
## Simulation given a parameter set
<- airGR::RunModel_GR4J(InputsModel = InputsModel,
OutputsModel RunOptions = RunOptions,
Param = RawParamOptim)
## Computation of the value of the performance criteria
<- airGR::ErrorCrit_RMSE(InputsCrit = InputsCrit,
OutputsCrit OutputsModel = OutputsModel,
verbose = FALSE)
return(OutputsCrit$CritValue)
}
In addition, we need to define the lower and upper bounds of the four GR4J parameters in the transformed parameter space:
<- rep(-9.99, times = 4)
lowerGR4J <- rep(+9.99, times = 4) upperGR4J
We start with a local optimization strategy by using the PORT routines (using the nlminb()
of the stats
package) and by setting a starting point in the transformed parameter space:
<- c(4.1, 3.9, -0.9, -8.7)
startGR4J <- stats::nlminb(start = startGR4J,
optPORT objective = OptimGR4J,
lower = lowerGR4J, upper = upperGR4J,
control = list(trace = 1))
The RMSE value reaches a local minimum value after 35 iterations.
We can also try a multi-start approach to test the consistency of the local optimization. Here we use the same grid used for the filtering step of the Michel’s calibration strategy (Calibration_Michel()
function). For each starting point, a local optimization is performed.
<- TransfoParam_GR4J(ParamIn = CalibOptions$StartParamDistrib,
startGR4JDistrib Direction = "RT")
<- expand.grid(data.frame(startGR4JDistrib))
startGR4J <- function(x) {
optPORT_ <- stats::nlminb(start = x,
opt objective = OptimGR4J,
lower = lowerGR4J, upper = upperGR4J,
control = list(trace = 1))
}<- apply(startGR4J, MARGIN = 1, FUN = optPORT_) listOptPORT
We can then extract the best parameter sets and the value of the performance criteria:
<- t(sapply(listOptPORT, function(x) x$par))
parPORT <- sapply(listOptPORT, function(x) x$objective)
objPORT <- data.frame(parPORT, RMSE = objPORT) resPORT
As can be seen below, the optimum performance criterion values (column objective) can differ from the global optimum value in many cases, resulting in various parameter sets.
summary(resPORT)
## X1 X2 X3 X4
## Min. :5.548 Min. :0.1240 Min. :-0.038062 Min. :-8.242667
## 1st Qu.:5.548 1st Qu.:0.1243 1st Qu.:-0.003741 1st Qu.:-8.242667
## Median :5.548 Median :0.8866 Median : 4.478766 Median :-8.242667
## Mean :5.682 Mean :0.6325 Mean : 2.977265 Mean :-5.820890
## 3rd Qu.:5.946 3rd Qu.:0.8866 3rd Qu.: 4.478766 3rd Qu.:-1.965413
## Max. :5.953 Max. :0.8866 Max. : 4.478767 Max. : 0.009845
## RMSE
## Min. :0.7864
## 1st Qu.:0.7864
## Median :0.7864
## Mean :0.9220
## 3rd Qu.:1.1630
## Max. :1.2235
The existence of several local minima illustrates the importance of defining an appropriate starting point or of using a multi-start strategy or a global optimization strategy.
Global optimization is most often used when facing a complex response surface, with multiple local mimina. Here we use the following R implementation of some popular strategies:
<- DEoptim::DEoptim(fn = OptimGR4J,
optDE lower = lowerGR4J, upper = upperGR4J,
control = DEoptim::DEoptim.control(NP = 40, trace = 10))
<- hydroPSO::hydroPSO(fn = OptimGR4J,
optPSO lower = lowerGR4J, upper = upperGR4J,
control = list(write2disk = FALSE, verbose = FALSE))
<- Rmalschains::malschains(fn = OptimGR4J,
optMALS lower = lowerGR4J, upper = upperGR4J,
maxEvals = 2000)
As it can be seen in the table below, the four additional optimization strategies tested lead to very close optima.
## Algo X1 X2 X3 X4
## 1 airGR 257.238 1.012 88.235 2.208
## 2 PORT 256.840 1.007 88.126 2.205
## 3 DE 256.840 1.007 88.126 2.205
## 4 PSO 256.799 1.007 88.147 2.205
## 5 MA-LS 256.820 1.007 88.116 2.205
Multiobjective optimization is used to explore possible trade-offs between different performances criteria. Here we use the following R implementation of an efficient strategy:
Motivated by using the rainfall-runoff model for low flow simulation, we explore the trade-offs between the KGE values obtained without any data transformation and with the inverse transformation.
First, the OptimGR4J function previously used is modified to return two values.
<- InputsCrit
InputsCrit_inv $transfo <- "inv"
InputsCrit_inv
<- function(i) {
MOptimGR4J if (algo == "caRamel") {
<- x[i, ]
ParamOptim
}## Transformation of the parameter set to real space
<- airGR::TransfoParam_GR4J(ParamIn = ParamOptim,
RawParamOptim Direction = "TR")
## Simulation given a parameter set
<- airGR::RunModel_GR4J(InputsModel = InputsModel,
OutputsModel RunOptions = RunOptions,
Param = RawParamOptim)
## Computation of the value of the performance criteria
<- airGR::ErrorCrit_KGE(InputsCrit = InputsCrit,
OutputsCrit1 OutputsModel = OutputsModel,
verbose = FALSE)
## Computation of the value of the performance criteria
<- airGR::ErrorCrit_KGE(InputsCrit = InputsCrit_inv,
OutputsCrit2 OutputsModel = OutputsModel,
verbose = FALSE)
return(c(OutputsCrit1$CritValue, OutputsCrit2$CritValue))
}
caRamel is a multiobjective evolutionary algorithm combining the MEAS algorithm and the NGSA-II algorithm.
<- "caRamel"
algo <- caRamel::caRamel(nobj = 2,
optMO nvar = 4,
minmax = rep(TRUE, 2),
bounds = matrix(c(lowerGR4J, upperGR4J), ncol = 2),
func = MOptimGR4J,
popsize = 100,
archsize = 100,
maxrun = 15000,
prec = rep(1.e-3, 2),
carallel = FALSE,
graph = FALSE)
The algorithm returns parameter sets that describe the pareto front, illustrating the trade-off between overall good performance and good performance for low flow.
ggplot() +
geom_point(aes(optMO$objectives[, 1], optMO$objectives[, 2])) +
coord_equal(xlim = c(0.4, 0.9), ylim = c(0.4, 0.9)) +
xlab("KGE") + ylab("KGE [1/Q]") +
theme_bw()
The parameter sets can be viewed in the parameter space, illustrating different populations.
<- apply(optMO$parameters, MARGIN = 1, FUN = function(x) {
param_optMO ::TransfoParam_GR4J(x, Direction = "TR")
airGR
})::ggpairs(data.frame(t(param_optMO)), diag = NULL) + theme_bw() GGally
$Outputs_Sim <- "Qsim"
RunOptions<- apply(optMO$parameters, MARGIN = 1, FUN = function(x) {
run_optMO ::RunModel_GR4J(InputsModel = InputsModel,
airGRRunOptions = RunOptions,
Param = x)
$Qsim)
}<- data.frame(run_optMO)
run_optMO
ggplot() +
geom_line(aes(x = as.POSIXct(InputsModel$DatesR[Ind_Run]),
y = run_optMO$X1)) +
geom_line(aes(x = as.POSIXct(InputsModel$DatesR[Ind_Run]),
y = run_optMO$X54),
colour = "darkred") +
scale_x_datetime(limits = c(as.POSIXct("1998-01-01"), NA)) +
ylab("Discharge [mm/d]") + xlab("Date") +
scale_y_sqrt() +
theme_bw()