Individual claim simulation falls into two basic categories: 1) wait time and 2) link ratio. An example of the first may be found in Stanard, and an example of the second in Guszcza.
Claim simulation occurs once we have a data frame of policies. For each row in this data frame, we will simulate zero or more claims and zero or more claim transactions.
library(imaginator)
set.seed(12345)
tbl_policy <- policies_simulate(2, 2001:2005)
We’ll begin with non-stochastic wait times and claim frequencies.
tbl_claim_transaction <- claims_by_wait_time(
  tbl_policy,
  claim_frequency = 2,
  payment_frequency = 3,
  occurrence_wait = 10,
  report_wait = 5,
  pay_wait = 5,
  pay_severity = 50)
Here we have assumed that each policy will generate 2 claims and each claim will produce 3 payments. Because we have 10 policies (2 policyholders over 5 policy years), this means we have 60 claim payments. Here they are for the first policy:
claim_id | occurrence_date | report_date | payment_date | payment_amount |
---|---|---|---|---|
1 | 2001-06-02 | 2001-06-07 | 2001-06-12 | 50 |
1 | 2001-06-02 | 2001-06-07 | 2001-06-17 | 50 |
1 | 2001-06-02 | 2001-06-07 | 2001-06-22 | 50 |
2 | 2001-06-02 | 2001-06-07 | 2001-06-12 | 50 |
2 | 2001-06-02 | 2001-06-07 | 2001-06-17 | 50 |
2 | 2001-06-02 | 2001-06-07 | 2001-06-22 | 50 |
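The dates in this table follow from simple date arithmetic, which can be sketched in base R. This is only an illustration and not the package's implementation. It assumes the first policy is effective on 2001-05-23 (the date shown for it in the link ratio tables) and that each payment follows the previous one by `pay_wait` days; the object names are hypothetical.

```r
# Illustrative sketch only: re-creates the wait-time arithmetic for one claim.
# Assumption: the first policy is effective on 2001-05-23.
effective_date    <- as.Date("2001-05-23")
occurrence_wait   <- 10
report_wait       <- 5
pay_wait          <- 5
payment_frequency <- 3

occurrence_date <- effective_date + occurrence_wait
report_date     <- occurrence_date + report_wait
# Each payment is assumed to come pay_wait days after the previous event.
payment_date    <- report_date + pay_wait * seq_len(payment_frequency)

one_claim <- data.frame(
  occurrence_date = occurrence_date,
  report_date     = report_date,
  payment_date    = payment_date,
  payment_amount  = 50
)
print(one_claim)
```

Under these assumptions the three payment dates are 2001-06-12, 2001-06-17 and 2001-06-22, matching the rows for a single claim above.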
Let’s do that again with some random amounts. We’ll keep the claim frequency fixed so that we can compare to the output above.
library(distributions3)
tbl_claim_transaction <- claims_by_wait_time(
  tbl_policy,
  claim_frequency = 2,
  payment_frequency = Poisson(2),
  occurrence_wait = Poisson(10),
  report_wait = Poisson(5),
  pay_wait = Poisson(5),
  pay_severity = LogNormal(log(50), 0.5 * log(50)))
claim_id | occurrence_date | report_date | payment_date | payment_amount |
---|---|---|---|---|
1 | 2001-05-27 | 2001-06-03 | 2001-06-07 | 8.660829 |
1 | 2001-05-27 | 2001-06-03 | 2001-06-13 | 273.995097 |
1 | 2001-05-27 | 2001-06-03 | 2001-06-23 | 134.503747 |
1 | 2001-05-27 | 2001-06-03 | 2001-06-29 | 95.663423 |
2 | 2001-06-03 | 2001-06-06 | 2001-06-11 | 1503.396967 |
2 | 2001-06-03 | 2001-06-06 | 2001-06-19 | 49.134663 |
Note that the transaction data is denormalized. The policy and claim information fields are repeated.
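To make the random version concrete, here is a base R sketch of one claim with Poisson waits and lognormal payment amounts. It is an illustration under assumptions, not the package's code: it uses `rpois()` and `rlnorm()` directly, assumes payment waits accumulate from the report date, and uses a hypothetical effective date and seed, so its values will not match the table above.

```r
# Illustrative sketch only: one claim with random waits and severities.
set.seed(1)                                  # arbitrary seed for this sketch
effective_date <- as.Date("2001-05-23")      # assumed effective date

n_pay           <- rpois(1, lambda = 2)      # number of payments, may be zero
occurrence_date <- effective_date + rpois(1, lambda = 10)
report_date     <- occurrence_date + rpois(1, lambda = 5)
# Waits between payments are assumed to accumulate from the report date.
payment_date    <- report_date + cumsum(rpois(n_pay, lambda = 5))
payment_amount  <- rlnorm(n_pay, meanlog = log(50), sdlog = 0.5 * log(50))

# Claim-level fields are repeated on every payment row (denormalized).
one_claim <- data.frame(
  occurrence_date = rep(occurrence_date, n_pay),
  report_date     = rep(report_date, n_pay),
  payment_date    = payment_date,
  payment_amount  = payment_amount
)
print(one_claim)
```

Repeating the claim-level dates on each payment row mirrors the denormalized layout described above.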
The link ratio approach is basically chain ladder applied to individual claims. First, we need to generate a random number of claims by development lag. This is effectively a triangle of “IBNYR”, or Incurred But Not Yet Reported, claims. With that in place, we can then develop the claims with (probably) randomized link ratios.
As usual, we’ll start with fixed values and then display a randomized example.
set.seed(12345)
tbl_policy <- policies_simulate(2, 2001:2005)

# Number of claims first reported at each of the four lags
lstFreq <- list(
    4
  , 3
  , 2
  , 1
)

# Use the same severity at every lag
lstSev <- list(
  250
)
lstSev[1:4] <- lstSev[1]

tbl_ibnyr_fixed <- claims_by_first_report(
    tbl_policy
  , frequency = lstFreq
  , payment_severity = lstSev
  , lags = 1:4)
Because we’re using fixed values for the frequencies, we’ll have 10 claims per policy (4 + 3 + 2 + 1 across the four lags).
library(dplyr)  # filter(), arrange() and %>%
library(knitr)  # kable()

tbl_ibnyr_fixed %>%
  filter(policyholder_id == 1) %>%
  filter(policy_effective_date == min(policy_effective_date)) %>%
  kable()
policyholder_id | policy_effective_date | claim_id | lag | payment_amount |
---|---|---|---|---|
1 | 2001-05-23 | 1 | 1 | 250 |
1 | 2001-05-23 | 2 | 1 | 250 |
1 | 2001-05-23 | 3 | 1 | 250 |
1 | 2001-05-23 | 4 | 1 | 250 |
1 | 2001-05-23 | 41 | 2 | 250 |
1 | 2001-05-23 | 42 | 2 | 250 |
1 | 2001-05-23 | 43 | 2 | 250 |
1 | 2001-05-23 | 71 | 3 | 250 |
1 | 2001-05-23 | 72 | 3 | 250 |
1 | 2001-05-23 | 91 | 4 | 250 |
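The claim counts and the gaps in `claim_id` can be reproduced with a small base R sketch. This is not the package's implementation. It assumes claims are numbered lag by lag across all 10 policies, which is consistent with the table above but is only an inference; the object names are hypothetical.

```r
# Illustrative sketch only: fixed IBNYR counts for 10 policies over 4 lags.
n_policies <- 10
freq       <- c(4, 3, 2, 1)   # claims first reported at lags 1 to 4

# One row per claim, built lag by lag so that ids run through all
# policies at lag 1 before moving on to lag 2 (an assumption).
tbl_sketch <- do.call(rbind, lapply(seq_along(freq), function(l) {
  data.frame(
    policy = rep(seq_len(n_policies), each = freq[l]),
    lag    = l
  )
}))
tbl_sketch$claim_id       <- seq_len(nrow(tbl_sketch))
tbl_sketch$payment_amount <- 250

first_policy <- tbl_sketch[tbl_sketch$policy == 1, ]
print(first_policy)
```

Under this numbering, the first policy holds claim ids 1 to 4, 41 to 43, 71 to 72 and 91, for 10 claims in total.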
Let’s try that again with some randomness:
lstFreq <- list(
    Poisson(4)
  , Poisson(3)
  , Poisson(2)
  , Poisson(1)
)

lstSev <- list(
  LogNormal(log_mu = log(10000), log_sigma = .5 * log(10000))
)
lstSev[1:4] <- lstSev[1]

tbl_ibnyr_random <- claims_by_first_report(
    tbl_policy
  , frequency = lstFreq
  , payment_severity = lstSev
  , lags = 1:4)
We see that in this case, the first policy does not have 10 claims.
tbl_ibnyr_random %>%
  filter(policyholder_id == 1) %>%
  filter(policy_effective_date == min(policy_effective_date)) %>%
  kable()
policyholder_id | policy_effective_date | claim_id | lag | payment_amount |
---|---|---|---|---|
1 | 2001-05-23 | 1 | 1 | 5854.6965 |
1 | 2001-05-23 | 58 | 3 | 238.1875 |
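A base R sketch shows why the claim count is no longer fixed: each lag draws its own Poisson count, so the total for a policy varies. This illustrates the idea only and is not the package's code. It uses `rpois()` and `rlnorm()` with an arbitrary seed, so its values will not match the table above.

```r
# Illustrative sketch only: random IBNYR counts and severities for one policy.
set.seed(1)                      # arbitrary seed for this sketch
lambdas  <- c(4, 3, 2, 1)        # expected claims first reported at lags 1 to 4
n_claims <- rpois(length(lambdas), lambda = lambdas)

ibnyr_sketch <- data.frame(
  lag            = rep(seq_along(lambdas), times = n_claims),
  payment_amount = rlnorm(sum(n_claims),
                          meanlog = log(10000),
                          sdlog   = 0.5 * log(10000))
)
print(n_claims)       # claims per lag; any lag can be zero
print(ibnyr_sketch)
```

The expected total is still 10 claims per policy, but any single policy can have more or fewer, and some lags may have none at all.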
We can now develop the claims in the IBNYR triangle. Again we’ll start with fixed link ratios.
fixedLinks <- list(2, 1.5, 1.25)

tbl_claims_fixed <- claims_by_link_ratio(
  tbl_ibnyr_fixed,
  links = fixedLinks,
  lags = 1:4)
tbl_claims_fixed %>%
  filter(policyholder_id == 1) %>%
  filter(
    policy_effective_date == min(policy_effective_date),
    claim_id %in% c(1, 41)) %>%
  arrange(claim_id, lag) %>%
  kable()
policyholder_id | policy_effective_date | claim_id | lag | payment_amount |
---|---|---|---|---|
1 | 2001-05-23 | 1 | 1 | 250 |
1 | 2001-05-23 | 1 | 2 | 500 |
1 | 2001-05-23 | 1 | 3 | 1000 |
1 | 2001-05-23 | 1 | 4 | 2000 |
1 | 2001-05-23 | 41 | 2 | 250 |
1 | 2001-05-23 | 41 | 3 | 500 |
1 | 2001-05-23 | 41 | 4 | 1000 |
Note that the second claim (claim_id 41) was unknown as of lag 1, so its first row appears at lag 2.
We can make things a bit more complicated by introducing variable link ratios:
normalLinks <- list(
  Normal(2, 1),
  Normal(1.5, .5),
  Normal(1.25, .5))

tbl_claims_random <- claims_by_link_ratio(
  tbl_ibnyr_random,
  links = normalLinks,
  lags = 1:4)
tbl_claims_random %>%
  filter(policyholder_id == 1) %>%
  filter(
    policy_effective_date == min(policy_effective_date),
    claim_id %in% c(1, 41)) %>%
  arrange(claim_id, lag) %>%
  kable()
policyholder_id | policy_effective_date | claim_id | lag | payment_amount |
---|---|---|---|---|
1 | 2001-05-23 | 1 | 1 | 5854.697 |
1 | 2001-05-23 | 1 | 2 | 24472.645 |
1 | 2001-05-23 | 1 | 3 | 47256.255 |
1 | 2001-05-23 | 1 | 4 | 91251.010 |
Note that the link ratios apply to individual claims only. IBNYR This means that it’s possible for individual claim development