This vignette illustrates how the SPLICE package can be used to generate the case estimates of incurred losses of individual claims.
SPLICE (Synthetic Paid Loss and Incurred Cost Experience) is built on an existing simulator of paid claim experience called SynthETIC, which offers flexible modelling of occurrence, notification, and the timing and magnitude of individual partial payments (see the package documentation and vignette for a detailed example of how to use the package to simulate paid claims experience).
SPLICE enables the modelling of incurred loss estimates via three modules: major revisions, minor revisions, and the development of case estimates.
library(SPLICE)
#> Loading required package: SynthETIC
set.seed(20201006)
ref_claim <- return_parameters()[1] # 200,000
time_unit <- return_parameters()[2] # 0.25
For the definition and functionality of ref_claim and time_unit, we refer to the documentation of SynthETIC.
For this demo, we will start with the paid losses simulated by the example implementation of SynthETIC:
test_claims <- SynthETIC::test_claims_object
and simulate the case estimates of incurred losses of the 3624 individual claims included in the test_claims object above.
Sections 1-3 introduce the three modelling steps in detail and include extensive examples on how to replace the default implementation with sampling distributions deemed appropriate by the individual users of the program. For those who prefer to stick to the default assumptions, the following code is all that is required to generate the full incurred history:
# major revisions
major <- claim_majRev_freq(test_claims)
major <- claim_majRev_time(test_claims, major)
major <- claim_majRev_size(major)

# minor revisions
minor <- claim_minRev_freq(test_claims)
minor <- claim_minRev_time(test_claims, minor)
minor <- claim_minRev_size(test_claims, major, minor)

# development of case estimates
test <- claim_history(test_claims, major, minor)
test_inflated <- claim_history(
  test_claims, major, minor,
  base_inflation_vector = rep((1 + 0.02)^(1/4) - 1, times = 80))

# transactional data
test_incurred_dataset_noInf <- generate_incurred_dataset(test_claims, test)
test_incurred_dataset_inflated <- generate_incurred_dataset(test_claims, test_inflated)

# incurred cumulative triangles
incurred_inflated <- output_incurred(test_inflated, incremental = FALSE)
This section introduces a suite of functions that works together to simulate, in sequential order, the (1) frequency, (2) time, and (3) size of major revisions of incurred loss, for each of the claims occurring in each of the occurrence periods.
In particular, claim_majRev_freq() sets up the structure of the major revisions: a nested list such that the jth component of the ith sub-list is a list of information on major revisions of the jth claim of occurrence period i. The “unit list” (i.e. the smallest, innermost sub-list) contains the following components:
Name | Description |
---|---|
majRev_freq | Number of major revisions of incurred loss; see claim_majRev_freq() |
majRev_time | Time of major revisions (from claim notification); see claim_majRev_time() |
majRev_factor | Major revision multiplier of incurred loss; see claim_majRev_size() |
majRev_atP | An indicator, 1 if the last major revision occurs at the time of the last major payment (i.e. second last payment), 0 otherwise; see claim_majRev_time() |
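To make the nesting concrete, the following is a minimal base-R sketch of the list structure described above. The object major_toy, the helper unit() and all values are made up for illustration; in practice the list is created by claim_majRev_freq().

```r
# Hypothetical toy object mimicking the nested structure:
# major_toy[[i]][[j]] holds the major revisions of claim j in occurrence period i.
unit <- function(freq) {
  list(majRev_freq = freq, majRev_time = NA, majRev_factor = NA, majRev_atP = NA)
}
major_toy <- list(
  list(unit(3), unit(1)),  # occurrence period 1: two claims
  list(unit(2))            # occurrence period 2: one claim
)

# Number of major revisions of the 2nd claim in occurrence period 1
major_toy[[1]][[2]]$majRev_freq

# Collect the revision counts of all claims into one vector
freqs <- unlist(lapply(major_toy, function(period) {
  sapply(period, function(claim) claim$majRev_freq)
}))
freqs
```

The same double-bracket indexing is used throughout this vignette to inspect individual claims.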
claim_majRev_freq() generates the number of major revisions associated with a particular claim, from a user-defined random generation function. Users are free to choose any distribution (through the argument rfun), whether it be a pre-defined distribution in R, or more advanced ones from packages, or a proper user-defined function, to better match their own claim experience.
Let \(K\) represent the number of major revisions associated with a particular claim. The notification of a claim is considered as a major revision, so all claims have at least 1 major revision (\(K \ge 1\)).
One possible sampling distribution for this is the zero-truncated Poisson distribution from the actuar package.
SPLICE by default assumes the (removable) dependence of frequency of major revisions on claim size, which means that the user can specify the lambda parameter in actuar::rztpois as a paramfun (parameter function) of claim_size (and possibly more, see Example 1.1.2).
## paramfun input
# lambda as a function of claim size
no_majRev_param <- function(claim_size) {
  majRevNo_mean <- pmax(1, log(claim_size / 15000) - 2)
  c(lambda = majRevNo_mean)
}

## implementation and output
major_test <- claim_majRev_freq(
  test_claims, rfun = actuar::rztpois, paramfun = no_majRev_param)
# show the distribution of number of major revisions
table(unlist(major_test))
#>
#> 1 2 3 4 5 6 7
#> 1958 1084 418 124 32 5 3
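The example above relies on actuar::rztpois for sampling. For readers who want to see what a zero-truncated Poisson draw involves, the following is a base-R sketch using inverse-CDF sampling restricted to values of at least 1. The function name rztpois_sketch is hypothetical and is not part of SPLICE or actuar; the value of lambda is illustrative.

```r
# Hypothetical base-R stand-in for a zero-truncated Poisson sampler.
# Draw u uniformly on (P(X = 0), 1) and invert the Poisson CDF,
# so that every draw is at least 1.
rztpois_sketch <- function(n, lambda) {
  u <- stats::runif(n, min = stats::dpois(0, lambda), max = 1)
  stats::qpois(u, lambda)
}

set.seed(1)
lambda <- 1.5
k <- rztpois_sketch(1e5, lambda)

min(k)                       # smallest simulated number of major revisions
mean(k)                      # sample mean
lambda / (1 - exp(-lambda))  # theoretical mean of the zero-truncated Poisson
```

Because the truncation removes zero, the mean is larger than lambda itself, which is worth keeping in mind when choosing a paramfun.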
Like SynthETIC, users of SPLICE are able to add further dependencies in their simulation. This is illustrated in the example below.
Suppose we would like to add the additional dependence of claim_majRev_freq (number of major revisions) on the number of partial payments of the claim, which is not natively included in the SPLICE default setting. For example, let’s consider the following parameter function:
## paramfun input
# an extended parameter function
majRevNo_param <- function(claim_size, no_payment) {
  majRevNo_mean <- pmax(0, log(claim_size / 1500000)) + no_payment / 10
  c(lambda = majRevNo_mean)
}
As this parameter function is dependent on no_payment, it should not come as a surprise that we need to supply the number of partial payments when calling claim_majRev_freq(). We need to make sure that the argument names are matched exactly (no_payment in this example) and that the input is specified as a vector of simulated quantities (not a list).
## implementation and output
no_payments_vect <- unlist(test_claims$no_payments_list)
# sample the frequency of major revisions from zero-truncated Poisson
# with parameters above
major_test <- claim_majRev_freq(
  test_claims, rfun = actuar::rztpois, paramfun = majRevNo_param,
  no_payment = no_payments_vect)
# show the distribution of number of major revisions
table(unlist(major_test))
#>
#> 1 2 3 4 5 6 7
#> 2818 621 146 28 7 3 1
The default claim_majRev_freq() assumes that no additional major revisions will occur for claims of size smaller than or equal to a claim_size_benchmark. For claims above this threshold, a maximum of 3 major revisions can occur and the larger the claim size, the more likely there will be more major revisions.
There is no need to specify a sampling distribution if the user is happy with the default specification. This example is mainly to demonstrate how the default function works, and at the same time, to provide an example that one can modify to input a random sampling distribution of their choosing.
## input
# package default function for frequency of major revisions
dflt.majRev_freq_function <- function(
  n, claim_size, claim_size_benchmark = 0.075 * ref_claim) {

  # construct the range indicator
  test <- (claim_size > claim_size_benchmark)

  # if claim_size <= claim_size_benchmark
  # "small" claims assumed to have no major revisions except at notification
  no_majRev <- rep(1, n)
  # if claim_size is above the benchmark
  # probability of 2 major revisions, increases with claim size
  Pr2 <- 0.1 + 0.3 *
    min(1, (claim_size[test] - 0.075 * ref_claim)/(0.925 * ref_claim))
  # probability of 3 major revisions, increases with claim size
  Pr3 <- 0.5 *
    min(1, max(0, claim_size[test] - 0.25 * ref_claim)/(0.75 * ref_claim))
  # probability of 1 major revision i.e. only one at claim notification
  Pr1 <- 1 - Pr2 - Pr3
  no_majRev[test] <- sample(
    c(1, 2, 3), size = sum(test), replace = TRUE, prob = c(Pr1, Pr2, Pr3))
  no_majRev
}
Since the random function directly takes claim_size as an input, no additional parameterisation is required (unlike in Examples 1 and 2, where we first need a paramfun that turns the claim_size into the lambda parameter required in a zero-truncated Poisson distribution). Here we can simply run claim_majRev_freq() without inputting a paramfun.
## implementation and output
# simulate the number of major revisions
major <- claim_majRev_freq(
  claims = test_claims,
  rfun = dflt.majRev_freq_function
)
# show the distribution of number of major revisions
table(unlist(major))
#>
#> 1 2 3
#> 1877 309 1438
# view the major revision history of the first claim in the 1st occurrence period
# note that the time and size of the major revisions are yet to be generated
major[[1]][[1]]
#> $majRev_freq
#> [1] 3
#>
#> $majRev_time
#> [1] NA
#>
#> $majRev_factor
#> [1] NA
#>
#> $majRev_atP
#> [1] NA
Note that SPLICE by default assumes the (removable) dependence of frequency of major revisions on claim size, hence there is no need to supply any additional arguments to claim_majRev_freq(), unlike in Example 1.1.2.
If one would like to keep the structure of the default sampling function but modify the benchmark value, they may do so via e.g.
major_test <- claim_majRev_freq(
  claims = test_claims,
  claim_size_benchmark = 30000
)
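To get a feel for how the default probabilities and the benchmark interact, the sketch below evaluates the three probabilities for a single claim. The helper name majRev_freq_probs is hypothetical, ref_claim = 200000 is taken from the set-up comment at the start of this vignette, and the claim sizes are illustrative.

```r
# Sketch of the revision-count probabilities used by the default function,
# evaluated for one claim at a time (hypothetical helper, not part of SPLICE).
majRev_freq_probs <- function(claim_size, ref_claim = 200000,
                              claim_size_benchmark = 0.075 * ref_claim) {
  if (claim_size <= claim_size_benchmark) {
    # small claims: only the revision at notification
    return(c(Pr1 = 1, Pr2 = 0, Pr3 = 0))
  }
  Pr2 <- 0.1 + 0.3 * min(1, (claim_size - 0.075 * ref_claim) / (0.925 * ref_claim))
  Pr3 <- 0.5 * min(1, max(0, claim_size - 0.25 * ref_claim) / (0.75 * ref_claim))
  c(Pr1 = 1 - Pr2 - Pr3, Pr2 = Pr2, Pr3 = Pr3)
}

majRev_freq_probs(10000)   # below the default benchmark of 0.075 * ref_claim
majRev_freq_probs(50000)   # above the benchmark, not above 0.25 * ref_claim
majRev_freq_probs(200000)  # a claim of reference size

# Raising the benchmark moves more claims into the "notification only" group
majRev_freq_probs(20000, claim_size_benchmark = 30000)
```

Note that a third major revision only becomes possible once the claim size exceeds 0.25 * ref_claim.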
claim_majRev_time() generates the epochs of the major revisions (time measured from claim notification). It has a very similar structure to claim_majRev_freq(), allowing users to input a sampling distribution via rfun and a parameter function which relates the parameter(s) of the distribution to selected claim characteristics.
Let \(\tau_k\) represent the epoch of the \(k\)th major revision (time measured from claim notification), \(k = 1, \dots, K\). As the notification of a claim is considered a major revision itself, we have \(\tau_1 = 0\) for all claims.
One simplistic option is to use a modified version of the uniform distribution (modified such that the first major revision always occurs at time 0 i.e. at claim notification).
majRev_time_paramfun in the example below specifies the min and max parameters for an individual claim as a function of setldel (settlement delay). Note that SPLICE by default assumes the (removable) dependence of timing of major revisions on claim size, settlement delay, and the partial payment times, so there is no need to supply any additional arguments to claim_majRev_time(). Users who wish to add further dependencies to the simulator can refer to Example 1.1.2.
## input
majRev_time_rfun <- function(n, min, max) {
  # n = number of major revisions of an individual claim
  majRev_time <- vector(length = n)
  majRev_time[1] <- 0 # first major revision at notification
  if (n > 1) {
    majRev_time[2:n] <- sort(stats::runif(n - 1, min, max))
  }

  return(majRev_time)
}
majRev_time_paramfun <- function(setldel, ...) {
  # setldel = settlement delay
  c(min = setldel/3, max = setldel)
}

## implementation and output
major_test <- claim_majRev_time(
  test_claims, major, rfun = majRev_time_rfun, paramfun = majRev_time_paramfun
)
major_test[[1]][[1]]
#> $majRev_freq
#> [1] 3
#>
#> $majRev_time
#> [1] 0.00000 13.01127 14.13173
#>
#> $majRev_factor
#> [1] NA
#>
#> $majRev_atP
#> [1] 0
The default implementation captures more of the complexity of the real-life claim process. It assumes that with a positive probability, the last major revision for a claim may coincide with the second last partial payment (which is usually the major settlement payment). In such cases, majRev_atP would be set to 1, indicating that there is a major revision simultaneous with the penultimate payment.
The epochs of the remaining major revisions are sampled from triangular distributions with maximum density at the earlier part of the claim’s lifetime.
## package default function for time of major revisions
dflt.majRev_time_function <- function(
  # n = number of major revisions
  # setldel = settlement delay
  # penultimate_delay = time from claim notification to second last payment
  n, claim_size, setldel, penultimate_delay) {

  majRev_time <- rep(NA, times = n)

  # first revision at notification
  majRev_time[1] <- 0
  if (n > 1) {
    # if the claim has multiple major revisions
    # the probability of having the last revision exactly at the second last partial payment
    p <- 0.2 *
      min(1, max(0, (claim_size - ref_claim) / (14 * ref_claim)))
    at_second_last_pmt <- sample(c(0, 1), size = 1, replace = TRUE, prob = c(1-p, p))

    # does the last revision occur at the second last partial payment?
    if (at_second_last_pmt == 0) {
      # -> no revision at second last payment
      majRev_time[2:n] <- sort(rtri(n - 1, min = setldel/3, max = setldel, mode = setldel/3))
    } else {
      # -> yes, revision at second last payment
      majRev_time[n] <- penultimate_delay
      if (n > 2) {
        majRev_time[2:(n-1)] <- sort(
          rtri(n - 2, min = majRev_time[n]/3, max = majRev_time[n], mode = majRev_time[n]/3))
      }
    }
  }
  majRev_time
}
Note that rtri is a function to generate random numbers from a triangular distribution that is included as part of the SPLICE package.
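The default function always calls rtri with mode equal to min, which puts the maximum density at the start of the interval. As an illustration of what such a draw looks like, here is a base-R sketch of that special case using inverse-CDF sampling. The helper rtri_mode_at_min is hypothetical (it is not the SPLICE implementation of rtri), and the settlement delay is an illustrative value.

```r
# Hypothetical helper: triangular sampler for the special case mode = min.
# With mode = min the CDF is F(x) = 1 - ((max - x) / (max - min))^2,
# so x = max - (max - min) * sqrt(1 - u) for u uniform on (0, 1).
rtri_mode_at_min <- function(n, min, max) {
  u <- stats::runif(n)
  max - (max - min) * sqrt(1 - u)
}

set.seed(1)
setldel <- 15  # illustrative settlement delay
x <- rtri_mode_at_min(1e5, min = setldel / 3, max = setldel)

range(x)                                   # draws stay within [setldel/3, setldel]
mean(x)                                    # sample mean
setldel / 3 + (setldel - setldel / 3) / 3  # theoretical mean: (min + max + mode) / 3
```

The mean sits one third of the way into the interval, reflecting that revisions are more likely earlier in the claim's life.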
claim_size and setldel are both directly accessible claim characteristics, but we need paramfun to take care of the computation of penultimate_delay as a function of the partial payment delays that we can access.
dflt.majRev_time_paramfun <- function(payment_delays, ...) {
  c(penultimate_delay = sum(payment_delays[1:length(payment_delays) - 1]),
    ...)
}
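The index 1:length(payment_delays) - 1 deserves a note. In R the colon operator binds more tightly than subtraction, so the expression evaluates as (1:n) - 1, i.e. 0, 1, ..., n - 1; the index 0 is silently dropped, which leaves the first n - 1 delays. The sketch below, using made-up delays, checks that this matches the more explicit head(x, -1).

```r
# Made-up partial payment delays for one claim (illustrative values only)
payment_delays <- c(3.5, 2.9, 4.1, 0.4)

idx <- 1:length(payment_delays) - 1  # evaluates as (1:4) - 1
idx

# Sum of all delays except the last, i.e. the time from notification
# to the second last payment
penultimate_a <- sum(payment_delays[idx])
penultimate_b <- sum(head(payment_delays, -1))
c(penultimate_a, penultimate_b)
```

Both forms give the same result; the second may be easier to read when adapting the parameter function.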
## implementation and output
major <- claim_majRev_time(
  claims = test_claims,
  majRev_list = major, # we will update the previous major list
  rfun = dflt.majRev_time_function,
  paramfun = dflt.majRev_time_paramfun
)
# view the major revision history of the first claim in the 1st occurrence period
# observe that we have now updated the time of major revisions
major[[1]][[1]]
#> $majRev_freq
#> [1] 3
#>
#> $majRev_time
#> [1] 0.000000 7.727115 14.203130
#>
#> $majRev_factor
#> [1] NA
#>
#> $majRev_atP
#> [1] 0
The above sampling distribution has been included as the default. There is no need to reproduce the above code if the user is happy with this default distribution. A simple equivalent to the above code is just
major <- claim_majRev_time(claims = test_claims, majRev_list = major)
This example is here only to demonstrate how the default function operates.
claim_majRev_size() generates the sizes of the major revisions. The major revision multipliers apply to the incurred loss estimates, that is, a revision multiplier of 2.54 means that at the time of the major revision the incurred loss increases by a factor of 2.54. We highlight this because, in the case of minor revisions, the multipliers instead apply to outstanding claim amounts; see claim_minRev_size().
The reason for this differentiation is that major revisions represent a total change of perspective on ultimate incurred cost, whereas minor revisions respond more to matters of detail, causing the case estimator to apply a revision factor to the estimate of outstanding payments.
Suppose that we believe the major revision multipliers follow a gamma distribution with parameters dependent on the size of the claim. Then we can set up the simulation in the following way:
## input
majRev_size_rfun <- function(n, shape, rate) {
  # n = number of major revisions of an individual claim
  majRev_size <- vector(length = n)
  majRev_size[1] <- 1 # first major revision at notification
  if (n > 1) {
    majRev_size[2:n] <- stats::rgamma(n - 1, shape, rate)
  }

  majRev_size
}

majRev_size_paramfun <- function(claim_size) {
  shape <- max(log(claim_size / 5000), 1)
  rate <- 10 / shape
  c(shape = shape, rate = rate)
}
The default implementation of claim_majRev_size() assumes no further dependencies on claim characteristics. Hence we need to supply claim_size as an additional argument when running claim_majRev_size() with the above set up.
## implementation and output
claim_size_vect <- unlist(test_claims$claim_size_list)
major_test <- claim_majRev_size(
  majRev_list = major,
  rfun = majRev_size_rfun,
  paramfun = majRev_size_paramfun,
  claim_size = claim_size_vect
)
# view the major revision history of the first claim in the 1st occurrence period
# observe that we have now updated the size of major revisions
major_test[[1]][[1]]
#> $majRev_freq
#> [1] 3
#>
#> $majRev_time
#> [1] 0.000000 7.727115 14.203130
#>
#> $majRev_factor
#> [1] 1.0000000 0.9597788 2.0640283
#>
#> $majRev_atP
#> [1] 0
The default implementation samples the major revision multipliers from lognormal distributions:
## input
# package default function for sizes of major revisions
dflt.majRev_size_function <- function(n) {
  majRev_factor <- rep(NA, times = n)
  # set revision size = 1 for first revision (i.e. the one at notification)
  majRev_factor[1] <- 1
  if (n > 1) {
    # if the claim has multiple major revisions
    majRev_factor[2] <- stats::rlnorm(n = 1, meanlog = 1.8, sdlog = 0.2)
    if (n > 2) {
      # the last revision factor depends on what happened at the second major revision
      mu <- 1 + 0.07 * (6 - majRev_factor[2])
      majRev_factor[3] <- stats::rlnorm(n = 1, meanlog = mu, sdlog = 0.1)
    }
  }

  majRev_factor
}
## implementation and output
major <- claim_majRev_size(
  majRev_list = major,
  rfun = dflt.majRev_size_function
)
# view the major revision history of the first claim in the 1st occurrence period
# observe that we have now updated the size of major revisions
major[[1]][[1]]
#> $majRev_freq
#> [1] 3
#>
#> $majRev_time
#> [1] 0.000000 7.727115 14.203130
#>
#> $majRev_factor
#> [1] 1.000000 5.165349 2.619366
#>
#> $majRev_atP
#> [1] 0
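For this particular claim record, we observe 3 major revisions: the first at notification (time 0, multiplier 1), the second at time 7.727115 with a multiplier of 5.165349, and the third at time 14.203130 with a multiplier of 2.619366 (times measured from claim notification).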
Compared to the major revisions, the simulation of minor revisions may require slightly more complicated input specification, as we need to separate the case of minor revisions that occur simultaneously with a partial payment (minRev_atP) and the ones that do not.
Similar to the case of major revisions, the suite of functions under this heading run in sequential order to simulate the (1) frequency, (2) time, and (3) size of minor revisions of outstanding claim payments, for each of the claims occurring in each of the occurrence periods. In particular, claim_minRev_freq() sets up the structure of the minor revisions: a nested list such that the jth component of the ith sub-list is a list of information on minor revisions of the jth claim of occurrence period i. The “unit list” contains the following components:
Name | Description |
---|---|
minRev_atP | A logical vector indicating whether there is a minor revision at each partial payment; see claim_minRev_freq() |
minRev_freq_atP (minRev_freq_notatP) | Number of minor revisions that occur (or do not occur) simultaneously with a partial payment. minRev_freq_atP is numerically equal to the sum of minRev_atP |
minRev_time_atP (minRev_time_notatP) | Time of minor revisions that occur (or do not occur) simultaneously with a partial payment (time measured from claim notification); see claim_minRev_time() |
minRev_factor_atP (minRev_factor_notatP) | Minor revision multiplier of outstanding claim payments for revisions at partial payments and at any other times, respectively; see claim_minRev_size() |
Minor revisions may occur simultaneously with a partial payment, or at any other time:
For revisions at a partial payment, the probability of a minor revision at each partial payment is set by prob_atP (defaults to 1/2);
For revisions at any other time, users can supply an rfun_notatP for simulation and a paramfun_notatP to input the parameters for the sampling distribution, much in the same way as the analogous case of major revisions. The default implementation assumes a geometric distribution with mean = min(3, setldel / 4) and is illustrated below.
## input
# package default function for frequency of minor revisions NOT at partial payments
dflt.minRev_freq_notatP_function <- function(n, setldel) {
  # setldel = settlement delay
  minRev_freq_notatP <- stats::rgeom(n, prob = 1 / (min(3, setldel/4) + 1))
  minRev_freq_notatP
}

## implementation and output
minor <- claim_minRev_freq(
  test_claims,
  prob_atP = 0.5,
  rfun_notatP = dflt.minRev_freq_notatP_function)
# view the minor revision history of the 10th claim in the 1st occurrence period
minor[[1]][[10]]
#> $minRev_atP
#> [1] 1 1 0 0 1
#>
#> $minRev_freq_atP
#> [1] 3
#>
#> $minRev_freq_notatP
#> [1] 1
#>
#> $minRev_time_atP
#> [1] NA
#>
#> $minRev_time_notatP
#> [1] NA
#>
#> $minRev_factor_atP
#> [1] NA
#>
#> $minRev_factor_notatP
#> [1] NA
An equivalent way of setting up the same structure using paramfun:
minRev_freq_notatP_paramfun <- function(setldel) {
  c(prob = 1 / (min(3, setldel/4) + 1))
}

minor <- claim_minRev_freq(
  test_claims,
  prob_atP = 0.5,
  rfun_notatP = stats::rgeom,
  paramfun_notatP = minRev_freq_notatP_paramfun)
Again the above example is only for illustrative purposes and users can run the default without explicitly spelling out the sampling distributions as above:
minor <- claim_minRev_freq(claims = test_claims)
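The choice prob = 1 / (min(3, setldel/4) + 1) in the default follows from the mean of the geometric distribution as parameterised by stats::rgeom (number of failures before the first success), which is (1 - prob) / prob. A quick base-R check, using an illustrative settlement delay:

```r
setldel <- 10                       # illustrative settlement delay
target_mean <- min(3, setldel / 4)  # intended mean number of minor revisions
prob <- 1 / (target_mean + 1)

# Analytic mean of the geometric distribution used by rgeom
analytic_mean <- (1 - prob) / prob

set.seed(1)
sample_mean <- mean(stats::rgeom(1e5, prob))

c(target = target_mean, analytic = analytic_mean, simulated = sample_mean)
```

By the same argument, replacing + 1 with + 2 in the denominator raises the mean by one.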
Suppose we believe that there should be no minor revisions at partial payments (prob_atP = 0) and that the number of minor revisions should follow a geometric distribution but with a higher mean. SPLICE can easily account for these assumptions through the following code.
minRev_freq_notatP_paramfun <- function(setldel) {
  c(prob = 1 / (min(3, setldel/4) + 2))
}

minor_test <- claim_minRev_freq(
  test_claims,
  prob_atP = 0,
  rfun_notatP = stats::rgeom,
  paramfun_notatP = minRev_freq_notatP_paramfun)
minor_test[[1]][[10]]
#> $minRev_atP
#> [1] 0 0 0 0 0
#>
#> $minRev_freq_atP
#> [1] 0
#>
#> $minRev_freq_notatP
#> [1] 0
#>
#> $minRev_time_atP
#> [1] NA
#>
#> $minRev_time_notatP
#> [1] NA
#>
#> $minRev_factor_atP
#> [1] NA
#>
#> $minRev_factor_notatP
#> [1] NA
claim_minRev_time() generates the epochs of the minor revisions (time measured from claim notification). Note that there is no need to specify a random sampling function for minor revisions that occur simultaneously with a partial payment because the revision times simply coincide with the epochs of the relevant partial payments.
For revisions outside of the partial payments, users are free to input a sampling distribution via rfun_notatP and a parameter function paramfun_notatP which relates the parameter(s) of the distribution to selected claim characteristics.
By default we assume that the epochs of the minor revision can be sampled from a uniform distribution:
## input
# package default function for time of minor revisions that do not coincide with a payment
dflt.minRev_time_notatP <- function(n, setldel) {
  sort(stats::runif(n, min = setldel/6, max = setldel))
}

## implementation and output
minor <- claim_minRev_time(
  claims = test_claims,
  minRev_list = minor, # we will update the previous minor list
  rfun_notatP = dflt.minRev_time_notatP
)
# view the minor revision history of the 10th claim in the 1st occurrence period
# observe that we have now updated the time of minor revisions
minor[[1]][[10]]
#> $minRev_atP
#> [1] 1 1 0 0 1
#>
#> $minRev_freq_atP
#> [1] 3
#>
#> $minRev_freq_notatP
#> [1] 1
#>
#> $minRev_time_atP
#> [1] 0.7911041 2.8190765 6.7787507
#>
#> $minRev_time_notatP
#> [1] 5.835193
#>
#> $minRev_factor_atP
#> [1] NA
#>
#> $minRev_factor_notatP
#> [1] NA
Let’s consider an alternative example where we believe the epochs of minor revisions better follow a triangular distribution (see ?triangular from SPLICE). This can be set up as follows:
## input
minRev_time_notatP_rfun <- function(n, setldel) {
  # n = number of minor revisions
  # setldel = settlement delay
  sort(rtri(n, min = setldel/6, max = setldel, mode = setldel/6))
}

## implementation and output
minor_test <- claim_minRev_time(
  claims = test_claims,
  minRev_list = minor, # we will update the previous minor list
  rfun_notatP = minRev_time_notatP_rfun
)
# view the minor revision history of the 10th claim in the 1st occurrence period
# observe that we have now updated the time of minor revisions
minor_test[[1]][[10]]
#> $minRev_atP
#> [1] 1 1 0 0 1
#>
#> $minRev_freq_atP
#> [1] 3
#>
#> $minRev_freq_notatP
#> [1] 1
#>
#> $minRev_time_atP
#> [1] 0.7911041 2.8190765 6.7787507
#>
#> $minRev_time_notatP
#> [1] 3.702022
#>
#> $minRev_factor_atP
#> [1] NA
#>
#> $minRev_factor_notatP
#> [1] NA
claim_minRev_size() generates the sizes of the minor revisions. Unlike the major revision multipliers which apply to the incurred loss estimates, the minor revision multipliers apply to the case estimate of outstanding claim payments i.e. a revision multiplier of 2.54 means that at the time of the minor revision the outstanding claim payments increase by a factor of 2.54. The reason for making this differentiation is briefly explained in the discussion of claim_majRev_size().
SPLICE assumes a common sampling distribution for minor revisions that occur at partial payments and those that occur at any other times. But users may provide separate parameter functions (paramfun_atP and paramfun_notatP) for the two cases.
In the default setting, we incorporate sampling dependence on the delay from notification to settlement, the delay from notification to the subject minor revisions, and the history of major revisions (in particular, the time of the second major revision).
Let \(\tau\) denote the delay from notification to the epoch of the minor revision, and \(w\) the settlement delay. Then the revision multiplier is sampled from a lognormal distribution with the following parameters:
for \(\tau \le w/3\), meanlog = 0.15 and sdlog = 0.05 if preceded by a 2nd major revision, sdlog = 0.1 otherwise;
for \(w/3 < \tau \le 2w/3\), meanlog = 0 and sdlog = 0.05 if preceded by a 2nd major revision, sdlog = 0.1 otherwise;
for \(\tau > 2w/3\), meanlog = -0.1 and sdlog = 0.05 if preceded by a 2nd major revision, sdlog = 0.1 otherwise.
Note that minor revisions tend to be upward in the early part of a claim’s life, and downward in the latter part.
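The direction of this drift can be read off the mean of each lognormal distribution, exp(meanlog + sdlog^2 / 2). The following base-R sketch tabulates the mean multiplier by stage of the claim's life; the helper lnorm_mean and the stage labels are introduced here for illustration only.

```r
# Mean of a lognormal distribution: exp(meanlog + sdlog^2 / 2)
lnorm_mean <- function(meanlog, sdlog) exp(meanlog + sdlog^2 / 2)

# meanlog by stage of the claim's life (labels are illustrative)
meanlog <- c(early = 0.15, middle = 0, late = -0.1)

# sdlog = 0.05 if preceded by a 2nd major revision, 0.1 otherwise
mean_after_2nd_majRev <- lnorm_mean(meanlog, sdlog = 0.05)
mean_otherwise <- lnorm_mean(meanlog, sdlog = 0.1)

rbind(mean_after_2nd_majRev, mean_otherwise)
```

Mean multipliers above 1 correspond to upward revisions of the outstanding payments, and values below 1 to downward revisions.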
## input
# package default function for the size of minor revisions
dflt.minRev_size <- function(
  # n = number of minor revisions
  # minRev_time = epochs of the minor revisions (from claim notification)
  # majRev_time_2nd = epoch of 2nd major revision (from claim notification)
  # setldel = settlement delay
  n, minRev_time, majRev_time_2nd, setldel) {

  k <- length(minRev_time)
  minRev_factor <- vector(length = k)

  if (k >= 1) {
    for (i in 1:k) {
      curr <- minRev_time[i]
      if (curr <= setldel/3) {
        meanlog <- 0.15
      } else if (curr <= (2/3) * setldel) {
        meanlog <- 0
      } else {
        meanlog <- -0.1
      }
      sdlog <- ifelse(curr > majRev_time_2nd, 0.05, 0.1)
      minRev_factor[i] <- stats::rlnorm(n = 1, meanlog, sdlog)
    }
  }

  minRev_factor
}
While setldel (settlement delay) is a directly accessible claim characteristic, we need paramfun to take care of the extraction and computation of majRev_time_2nd and minRev_time as a function of the revision lists that we can access.
# parameter function for minor revision at payments
minRev_size_param_atP <- function(major, minor, setldel) {
  list(minRev_time = minor$minRev_time_atP,
       majRev_time_2nd = ifelse(
         # so it always holds minRev_time < majRev_time_2nd
         is.na(major$majRev_time[2]), setldel + 1, major$majRev_time[2]),
       setldel = setldel)
}

# parameter function for minor revisions NOT at payments
minRev_size_param_notatP <- function(major, minor, setldel) {
  list(minRev_time = minor$minRev_time_notatP,
       majRev_time_2nd = ifelse(
         # so it always holds minRev_time < majRev_time_2nd
         is.na(major$majRev_time[2]), setldel + 1, major$majRev_time[2]),
       setldel = setldel)
}
## implementation and output
minor <- claim_minRev_size(
  claims = test_claims,
  majRev_list = major,
  minRev_list = minor,
  rfun = dflt.minRev_size,
  paramfun_atP = minRev_size_param_atP,
  paramfun_notatP = minRev_size_param_notatP
)
# view the minor revision history of the 10th claim in the 1st occurrence period
# observe that we have now updated the size of minor revisions
minor[[1]][[10]]
#> $minRev_atP
#> [1] 1 1 0 0 1
#>
#> $minRev_freq_atP
#> [1] 3
#>
#> $minRev_freq_notatP
#> [1] 1
#>
#> $minRev_time_atP
#> [1] 0.7911041 2.8190765 6.7787507
#>
#> $minRev_time_notatP
#> [1] 5.835193
#>
#> $minRev_factor_atP
#> [1] 0.9613874 0.9873257 0.9366064
#>
#> $minRev_factor_notatP
#> [1] 0.9107412
For this particular claim record, we observe 3 minor revisions that coincide with a payment and 1 minor revision outside of the partial payment times.
For illustrative purposes, let’s now assume that the minor revision multipliers should be sampled from a uniform distribution.
## input
paramfun_atP <- function(claim_size, ...) {
  c(min = pmin(1, pmax(log(claim_size / 15000), 0.5)),
    max = pmin(1, pmax(log(claim_size / 15000), 0.5)) + 1)
}
paramfun_notatP <- paramfun_atP

## implementation and output
claim_size_vect <- unlist(test_claims$claim_size_list)
minor_test <- claim_minRev_size(
  test_claims, major, minor,
  rfun = stats::runif, paramfun_atP, paramfun_notatP,
  claim_size = claim_size_vect)
minor_test[[1]][[10]]
#> $minRev_atP
#> [1] 1 1 0 0 1
#>
#> $minRev_freq_atP
#> [1] 3
#>
#> $minRev_freq_notatP
#> [1] 1
#>
#> $minRev_time_atP
#> [1] 0.7911041 2.8190765 6.7787507
#>
#> $minRev_time_notatP
#> [1] 5.835193
#>
#> $minRev_factor_atP
#> [1] 0.8357490 1.3070304 0.8162695
#>
#> $minRev_factor_notatP
#> [1] 1.487637
This section requires no additional input specification from the program user (except the quarterly inflation rates, which should match what was used in SynthETIC::claim_payment_inflation() when generating the inflated amount of partial payments) and simply consolidates the partial payments and the incurred revisions generated above, subject to some additional revision constraints (see ?claim_history for details). The end product is a full transactional history of the case estimates of the individual claims over their lifetimes.
We can choose to exclude (default) or include adjustment for inflation:
# exclude inflation (by default)
result <- claim_history(test_claims, major, minor)
# include inflation
result_inflated <- claim_history(
  test_claims, major, minor,
  base_inflation_vector = rep((1 + 0.02)^(1/4) - 1, times = 80))
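The expression (1 + 0.02)^(1/4) - 1 converts a 2% annual inflation rate into the equivalent quarterly rate, repeated for 80 quarters. The base-R sketch below checks that compounding four quarters recovers the annual rate; the variable names are chosen here for illustration.

```r
quarterly_rate <- (1 + 0.02)^(1/4) - 1
base_inflation_vector <- rep(quarterly_rate, times = 80)  # 80 quarters

# Compounding four quarters recovers the 2% annual rate
annual_rate <- prod(1 + base_inflation_vector[1:4]) - 1
annual_rate

# Cumulative inflation index over the full 80 quarters (20 years)
index <- cumprod(1 + base_inflation_vector)
c(after_80_quarters = index[80], after_20_years = 1.02^20)
```

A vector of varying quarterly rates could be checked in the same way before being passed to claim_history().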
Observe how the results differ between the case estimates with/without inflation:
data <- generate_incurred_dataset(test_claims_object, result)
str(data)
#> 'data.frame': 32384 obs. of 9 variables:
#> $ claim_no : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ claim_size: num 785871 785871 785871 785871 785871 ...
#> $ txn_time : num 0.689 3.949 4.198 6.327 7.096 ...
#> $ txn_delay : num 0 3.26 3.51 5.64 6.41 ...
#> $ txn_type : chr "Ma" "Mi" "P" "Mi" ...
#> $ incurred : num 71767 76242 76242 78597 78597 ...
#> $ OCL : num 71767 76242 51137 53493 27316 ...
#> $ cumpaid : num 0.00 6.55e-11 2.51e+04 2.51e+04 5.13e+04 ...
#> $ multiplier: num 1 1.06 NA 1.05 NA ...
head(data, n = 9)
#> claim_no claim_size txn_time txn_delay txn_type incurred OCL
#> 1 1 785870.8 0.6889986 0.000000 Ma 71766.93 71766.93
#> 2 1 785870.8 3.9493565 3.260358 Mi 76241.74 76241.74
#> 3 1 785870.8 4.1975938 3.508595 P 76241.74 51136.97
#> 4 1 785870.8 6.3271693 5.638171 Mi 78597.46 53492.68
#> 5 1 785870.8 7.0960120 6.407013 P 78597.46 27316.06
#> 6 1 785870.8 8.4161141 7.727115 Ma 405983.27 354701.87
#> 7 1 785870.8 11.1576971 10.468698 PMi 422014.90 344400.32
#> 8 1 785870.8 14.4457620 13.756763 PMi 379360.08 275404.40
#> 9 1 785870.8 14.8921284 14.203130 Ma 993682.82 889727.14
#> cumpaid multiplier
#> 1 0.000000e+00 1.0000000
#> 2 6.548362e-11 1.0623521
#> 3 2.510478e+04 NA
#> 4 2.510478e+04 1.0460667
#> 5 5.128140e+04 NA
#> 6 5.128140e+04 5.1653486
#> 7 7.761458e+04 1.0451975
#> 8 1.039557e+05 0.8761476
#> 9 1.039557e+05 2.6193658
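The first few rows above can be reproduced by hand, which illustrates how the two kinds of multiplier act: in these rows a minor revision (Mi) scales OCL, a major revision (Ma) scales incurred, a payment (P) leaves incurred unchanged, and incurred equals OCL plus cumpaid throughout. The sketch below uses values copied from the printed output, so agreement is only up to rounding; the variable names are made up for illustration.

```r
# Values copied from rows 1-6 of the printed output above
ocl_1 <- 71766.93                     # row 1 (Ma at notification): OCL = incurred

# Row 2 (Mi, multiplier 1.0623521): applies to OCL; cumpaid is effectively zero
ocl_2 <- ocl_1 * 1.0623521
incurred_2 <- ocl_2

# Row 3 (P): cumpaid rises to 25104.78, incurred is unchanged
cumpaid_3 <- 25104.78
ocl_3 <- incurred_2 - cumpaid_3

# Row 4 (Mi, multiplier 1.0460667): applies to OCL
ocl_4 <- ocl_3 * 1.0460667
incurred_4 <- ocl_4 + cumpaid_3

# Row 6 (Ma, multiplier 5.1653486): applies to incurred (78597.46 at row 5)
incurred_6 <- 78597.46 * 5.1653486
ocl_6 <- incurred_6 - 51281.40        # cumpaid at row 5

computed <- c(ocl_2 = ocl_2, ocl_3 = ocl_3, ocl_4 = ocl_4,
              incurred_4 = incurred_4, incurred_6 = incurred_6, ocl_6 = ocl_6)
printed <- c(ocl_2 = 76241.74, ocl_3 = 51136.97, ocl_4 = 53492.68,
             incurred_4 = 78597.46, incurred_6 = 405983.27, ocl_6 = 354701.87)

# Differences between hand calculation and printed output (rounding only)
round(computed - printed, 2)
```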
data_inflated <- generate_incurred_dataset(test_claims_object, result_inflated)
str(data_inflated)
#> 'data.frame': 32384 obs. of 9 variables:
#> $ claim_no : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ claim_size: num 785871 785871 785871 785871 785871 ...
#> $ txn_time : num 0.689 3.949 4.198 6.327 7.096 ...
#> $ txn_delay : num 0 3.26 3.51 5.64 6.41 ...
#> $ txn_type : chr "Ma" "Mi" "P" "Mi" ...
#> $ incurred : num 71514 77210 77210 80528 80528 ...
#> $ OCL : num 71514 77210 51578 54896 27784 ...
#> $ cumpaid : num 0.00 2.18e-11 2.56e+04 2.56e+04 5.27e+04 ...
#> $ multiplier: num 1 1.06 NA 1.05 NA ...
head(data_inflated, n = 9)
#> claim_no claim_size txn_time txn_delay txn_type incurred OCL
#> 1 1 785870.8 0.6889986 0.000000 Ma 71514.42 71514.42
#> 2 1 785870.8 3.9493565 3.260358 Mi 77209.73 77209.73
#> 3 1 785870.8 4.1975938 3.508595 P 77209.73 51577.79
#> 4 1 785870.8 6.3271693 5.638171 Mi 80528.15 54896.21
#> 5 1 785870.8 7.0960120 6.407013 P 80528.15 27783.67
#> 6 1 785870.8 8.4161141 7.727115 Ma 420279.94 367535.46
#> 7 1 785870.8 11.1576971 10.468698 PMi 442861.81 362288.63
#> 8 1 785870.8 14.4457620 13.756763 PMi 404523.03 295655.95
#> 9 1 785870.8 14.8921284 14.203130 Ma 1061937.89 953070.80
#> cumpaid multiplier
#> 1 0.000000e+00 1.0000000
#> 2 2.182787e-11 1.0623521
#> 3 2.563194e+04 NA
#> 4 2.563194e+04 1.0460667
#> 5 5.274448e+04 NA
#> 6 5.274448e+04 5.1653486
#> 7 8.057318e+04 1.0451975
#> 8 1.088671e+05 0.8761476
#> 9 1.088671e+05 2.6193658
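In the rows printed above, `txn_delay` appears to be measured from the first recorded transaction of each claim. This is our reading of the output rather than a documented definition. The sketch below reconstructs the column with a grouped minimum in base R, using the transaction times of claim 1 typed in from the output. The names `txns`, `first_txn`, `delay_check` and `delay_gap` are ours.

```r
# Transaction times and delays of claim 1, copied from the printed output
txns <- data.frame(
  claim_no  = rep(1L, 9),
  txn_time  = c(0.6889986, 3.9493565, 4.1975938, 6.3271693, 7.0960120,
                8.4161141, 11.1576971, 14.4457620, 14.8921284),
  txn_delay = c(0.000000, 3.260358, 3.508595, 5.638171, 6.407013,
                7.727115, 10.468698, 13.756763, 14.203130)
)

# Time of the first transaction within each claim (grouped minimum)
first_txn <- ave(txns$txn_time, txns$claim_no, FUN = min)

# Assumed definition: delay = transaction time - first transaction time
txns$delay_check <- txns$txn_time - first_txn

# Differences should be at the level of print rounding only
delay_gap <- max(abs(txns$delay_check - txns$txn_delay))
delay_gap < 1e-5
```

The same `ave()` call works unchanged on a dataset with many claims, because the minimum is taken within each `claim_no`.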
Note that the above data and data_inflated datasets are included as part of the package as test_incurred_dataset_noInf and test_incurred_dataset_inflated:
str(test_incurred_dataset_noInf)
#> 'data.frame': 31250 obs. of 9 variables:
#> $ claim_no : int 1 1 1 1 1 1 1 1 1 2 ...
#> $ claim_size: num 785871 785871 785871 785871 785871 ...
#> $ txn_time : num 0.689 4.198 7.096 8.554 11.158 ...
#> $ txn_delay : num 0 3.51 6.41 7.86 10.47 ...
#> $ txn_type : chr "Ma" "P" "P" "Ma" ...
#> $ incurred : num 64033 64033 64033 339643 345196 ...
#> $ OCL : num 64033 38928 12752 288361 267581 ...
#> $ cumpaid : num 0 25105 51281 51281 77615 ...
#> $ multiplier: num 1 NA NA 5.3 1.02 ...
str(test_incurred_dataset_inflated)
#> 'data.frame': 31250 obs. of 9 variables:
#> $ claim_no : int 1 1 1 1 1 1 1 1 1 2 ...
#> $ claim_size: num 785871 785871 785871 785871 785871 ...
#> $ txn_time : num 0.689 4.198 7.096 8.554 11.158 ...
#> $ txn_delay : num 0 3.51 6.41 7.86 10.47 ...
#> $ txn_type : chr "Ma" "P" "P" "Ma" ...
#> $ incurred : num 63905 63905 63905 352421 362839 ...
#> $ OCL : num 63905 38273 11160 299677 282266 ...
#> $ cumpaid : num 0 25632 52744 52744 80573 ...
#> $ multiplier: num 1 NA NA 5.3 1.02 ...
SPLICE also provides an option to produce the incurred triangles aggregated by accident and development periods:
square_inc <- output_incurred(result)
square_cum <- output_incurred(result, incremental = FALSE)
square_inflated_inc <- output_incurred(result_inflated)
square_inflated_cum <- output_incurred(result_inflated, incremental = FALSE)
yearly_inc <- output_incurred(result, aggregate_level = 4)
yearly_cum <- output_incurred(result, aggregate_level = 4, incremental = FALSE)
yearly_cum
#> DP1 DP2 DP3 DP4 DP5 DP6 DP7 DP8
#> AP1 21846312 37758611 49627375 55794329 58767892 62366936 62902546 62486029
#> AP2 17989801 37594432 48227626 50816309 55323643 55641507 55534306 57396402
#> AP3 18299898 38378501 46407079 53043975 55787454 57065817 57598917 57953386
#> AP4 17313124 34289765 44952334 51129516 53734088 54468306 55962256 56407511
#> AP5 20511783 42211689 54327913 60431132 62285147 61654702 62794043 63144648
#> AP6 21436730 39448518 47811984 53766037 57768597 58088457 58712026 58376682
#> AP7 20186643 35172249 44256497 48936876 54586154 55642298 54895552 55403589
#> AP8 13077553 29871334 42877233 45884245 47782393 49904558 49836704 49896893
#> AP9 19941842 41091858 50342713 56461572 56659111 55898789 56698061 57343630
#> AP10 19402526 40737328 52180283 57487056 56801187 55745586 55940701 56276123
#> DP9 DP10
#> AP1 61739214 61995953
#> AP2 59477258 59229413
#> AP3 57966421 57805433
#> AP4 56206775 55752961
#> AP5 62825138 62572956
#> AP6 58348050 58322971
#> AP7 55341593 55297816
#> AP8 49803452 49725363
#> AP9 57979184 57712377
#> AP10 56238590 56093866
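For readers who want to move between the two layouts by hand, the following base R sketch treats the incremental triangle as the period-to-period change in the cumulative one. That relationship is our assumption here, not a statement about the internals of `output_incurred()`. The small matrix reuses the first three development periods of AP1 and AP2 from the output above, and the names `cum`, `inc` and `cum_again` are ours.

```r
# Cumulative incurred for AP1-AP2, DP1-DP3, copied from the output above
cum <- matrix(c(21846312, 37758611, 49627375,
                17989801, 37594432, 48227626),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("AP1", "AP2"), c("DP1", "DP2", "DP3")))

# Incremental = first column, then row-wise differences
inc <- cbind(cum[, 1, drop = FALSE], t(apply(cum, 1, diff)))
colnames(inc) <- colnames(cum)

# Cumulating the increments row by row recovers the original matrix
cum_again <- t(apply(inc, 1, cumsum))
isTRUE(all.equal(cum, cum_again, check.attributes = FALSE))
```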
# apply standard actuarial reserving techniques using the `ChainLadder` package
# selected <- attr(ChainLadder::ata(yearly_cum), "vwtd")
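The "vwtd" attribute referred to in the comment above holds the volume-weighted age-to-age factors. As a minimal sketch of what such a factor is, the code below computes it in base R as the ratio of column sums over the accident periods observed in both development periods. The toy triangle and the names `tri` and `vwtd` are ours, and the code is an illustration of the standard formula rather than the `ChainLadder` implementation.

```r
# Toy cumulative triangle; NA marks cells not yet observed
tri <- matrix(c(100, 150, 165,
                110, 170,  NA,
                120,  NA,  NA),
              nrow = 3, byrow = TRUE)
J <- ncol(tri)

# Volume-weighted factor from DP j to DP j + 1:
# sum of column j + 1 over sum of column j, using rows seen in both
vwtd <- sapply(seq_len(J - 1), function(j) {
  seen <- !is.na(tri[, j]) & !is.na(tri[, j + 1])
  sum(tri[seen, j + 1]) / sum(tri[seen, j])
})
vwtd
```

Here the first factor is (150 + 170) / (100 + 110) and the second is 165 / 150.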
We can also set future = FALSE to hide the future triangle and perform a chain-ladder analysis using the ChainLadder package:
# output the past cumulative triangle
cumtri <- output_incurred(result, aggregate_level = 4,
                          incremental = FALSE, future = FALSE)
# calculate the age to age factors
selected <- attr(ChainLadder::ata(cumtri), "vwtd")
# complete the triangle
CL_prediction <- cumtri
J <- nrow(cumtri)
for (i in 2:J) {
  for (j in (J - i + 2):J) {
    CL_prediction[i, j] <- CL_prediction[i, j - 1] * selected[j - 1]
  }
}
CL_prediction
#> DP1 DP2 DP3 DP4 DP5 DP6 DP7 DP8
#> AP1 21846312 37758611 49627375 55794329 58767892 62366936 62902546 62486029
#> AP2 17989801 37594432 48227626 50816309 55323643 55641507 55534306 57396402
#> AP3 18299898 38378501 46407079 53043975 55787454 57065817 57598917 57953386
#> AP4 17313124 34289765 44952334 51129516 53734088 54468306 55962256 56534496
#> AP5 20511783 42211689 54327913 60431132 62285147 61654702 62314234 62951426
#> AP6 21436730 39448518 47811984 53766037 57768597 58839322 59468737 60076832
#> AP7 20186643 35172249 44256497 48936876 51750611 52709794 53273640 53818388
#> AP8 13077553 29871334 42877233 47771337 50518056 51454394 52004811 52536584
#> AP9 19941842 41091858 52770452 58793790 62174269 63326652 64004069 64658540
#> AP10 19402526 38192007 49046443 54644714 57786634 58857693 59487304 60095590
#> DP9 DP10
#> AP1 61739214 61995953
#> AP2 59477258 59724590
#> AP3 58598287 58841964
#> AP4 57163608 57401319
#> AP5 63651944 63916636
#> AP6 60745363 60997968
#> AP7 54417275 54643565
#> AP8 53121207 53342108
#> AP9 65378055 65649926
#> AP10 60764328 61017013
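Since the simulator also produces the full square, the chain-ladder projection can be compared with the simulated incurred at DP10. The sketch below does this with the DP10 columns typed in from the two outputs above (`yearly_cum` and `CL_prediction`). The names `simulated`, `chain_ladder` and `rel_error` are ours, and the comparison is meant as an illustration only, not as a validation of either method.

```r
ap <- paste0("AP", 1:10)

# DP10 of the simulated square (yearly_cum), copied from the output above
simulated <- setNames(
  c(61995953, 59229413, 57805433, 55752961, 62572956,
    58322971, 55297816, 49725363, 57712377, 56093866), ap)

# DP10 of the chain-ladder completion (CL_prediction)
chain_ladder <- setNames(
  c(61995953, 59724590, 58841964, 57401319, 63916636,
    60997968, 54643565, 53342108, 65649926, 61017013), ap)

# Relative error of the projection by accident period, in percent
rel_error <- chain_ladder / simulated - 1
round(100 * rel_error, 1)

# Relative error of the projection in aggregate
sum(chain_ladder) / sum(simulated) - 1
```

AP1 is fully observed at DP10, so its error is zero by construction; the later accident periods rely on more projected development periods.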