simfam
: simulate and model family pedigrees with structured foundersThe simfam
package is for constructing large random families—with population studies in mind—with realistic constraints including avoidance of close relative pairings but otherwise biased for closest pairs in a 1-dimensional geography. There is also code to draw genotypes across the pedigree starting from genotypes for the founders. Our model allows for the founders to be related or structured—which arises in practice when there are different ancestries among founders—and given known parameters for these founders (including known kinship matrices and ancestry proportions) we provide code to calculate the true kinship and expected admixture proportions of all descendants.
This vignette focuses on two examples. One is small, necessary to actually visualize the pedigree. The second is larger, where the pedigree cannot be fully visualized but whose population parameters are more realistic. This vignette draws from other general packages to visualize the data (popkin
and kinship2
), as well as to construct/simulate structured founders (bnpsd
, which is based on the admixture model).
Let’s start with a 3-generation pedigree with 15 individuals per generation. (The code can also handle variable numbers of individuals per generation, which is not demonstrated in this vignette for simplicity).
library(simfam)
# data dimensions
15 # number of individuals per generation
n <- 3 # number of generations
G <-
# the result changes in every run unless we set the seed
set.seed(1)
# draw the random pedigree with those properties
sim_pedigree( n, G )
data <-# save these objects separately
data$fam
fam <- data$ids
ids <- data$kinship_local kinship_local_G <-
The function returns a list with two objects. The first element in the list is the most important: fam
is a table that describes the pedigree in a standard notation in genetics research: the plink FAM format. Every row corresponds to one individual, and columns name individuals (id
), their parents (pat
and mat
, founders have parents set to NA
), and their sex
(1
=male, 2
=female). The FAM table format also requires a family ID (fam
) and a phenotype (pheno
) column, both of which take on dummy values in the output of sim_pedigree
:
# the top of the table
head( fam )
#> # A tibble: 6 × 6
#> fam id pat mat sex pheno
#> <chr> <chr> <chr> <chr> <int> <dbl>
#> 1 fam1 1-1 <NA> <NA> 1 0
#> 2 fam1 1-2 <NA> <NA> 2 0
#> 3 fam1 1-3 <NA> <NA> 1 0
#> 4 fam1 1-4 <NA> <NA> 1 0
#> 5 fam1 1-5 <NA> <NA> 2 0
#> 6 fam1 1-6 <NA> <NA> 1 0
# and the bottom
tail( fam )
#> # A tibble: 6 × 6
#> fam id pat mat sex pheno
#> <chr> <chr> <chr> <chr> <int> <dbl>
#> 1 fam1 3-10 2-7 2-10 2 0
#> 2 fam1 3-11 2-7 2-10 1 0
#> 3 fam1 3-12 2-7 2-10 2 0
#> 4 fam1 3-13 2-14 2-11 2 0
#> 5 fam1 3-14 2-14 2-11 2 0
#> 6 fam1 3-15 2-14 2-11 2 0
The IDs (columns id
, pat
, mat
) are formatted as g-i
, where the first number is the generation and the second number is an index within the generation (1 to n
). Every individual belongs to a generation in the simulation (founders are the first generation), and only individuals in the same generation may be paired. Sex is drawn at random for each individual prior to pairing. Every individual has a unique index within the generation, that places them in a 1D geography—an important detail since pairings are strongly biased to be between closest individuals. Founders are ordered as they are constructed, and children are ordered according to the mean index of their parents. Pairings occur randomly and iteratively: while there are available male-female pairs, a male is drawn at random from the available males and he is paired with the closest unpaired female that is not a close relative (by default they must be less related than second cousins, i.e. have a pedigree kinship coefficient of less than 1/4^3
).
The second element in the return list of sim_pedigree
is ids
, the list of IDs separated by generation. This is very handy as often we want to copy the names of the founders (ids[[ 1 ]]
) to other objects, or want to filter data to contain the last generation only (ids[[ G ]]
).
The third element in the return list of sim_pedigree
is the local (i.e., pedigree-based) kinship matrix of the individuals. This data can be recalculated from the fam
table alone, so it is redundant, but it is calculated because the pairing algorithm requires it to avoid close relatives, so it is returned mostly because it’s there already. By default, only individuals in the last generation are included in the matrix that is returned, so this is a n * n
(here 15 x 15) matrix:
kinship_local_G#> 3-1 3-2 3-3 3-4 3-5 3-6 3-7 3-8 3-9 3-10
#> 3-1 0.5000 0.2500 0.2500 0.1250 0.1250 0.1250 0.0625 0.0625 0.0625 0.0625
#> 3-2 0.2500 0.5000 0.2500 0.1250 0.1250 0.1250 0.0625 0.0625 0.0625 0.0625
#> 3-3 0.2500 0.2500 0.5000 0.1250 0.1250 0.1250 0.0625 0.0625 0.0625 0.0625
#> 3-4 0.1250 0.1250 0.1250 0.5000 0.2500 0.2500 0.0625 0.0625 0.0625 0.0625
#> 3-5 0.1250 0.1250 0.1250 0.2500 0.5000 0.2500 0.0625 0.0625 0.0625 0.0625
#> 3-6 0.1250 0.1250 0.1250 0.2500 0.2500 0.5000 0.0625 0.0625 0.0625 0.0625
#> 3-7 0.0625 0.0625 0.0625 0.0625 0.0625 0.0625 0.5000 0.2500 0.1250 0.1250
#> 3-8 0.0625 0.0625 0.0625 0.0625 0.0625 0.0625 0.2500 0.5000 0.1250 0.1250
#> 3-9 0.0625 0.0625 0.0625 0.0625 0.0625 0.0625 0.1250 0.1250 0.5000 0.2500
#> 3-10 0.0625 0.0625 0.0625 0.0625 0.0625 0.0625 0.1250 0.1250 0.2500 0.5000
#> 3-11 0.0625 0.0625 0.0625 0.0625 0.0625 0.0625 0.1250 0.1250 0.2500 0.2500
#> 3-12 0.0625 0.0625 0.0625 0.0625 0.0625 0.0625 0.1250 0.1250 0.2500 0.2500
#> 3-13 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0625 0.0625 0.0625 0.0625
#> 3-14 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0625 0.0625 0.0625 0.0625
#> 3-15 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0625 0.0625 0.0625 0.0625
#> 3-11 3-12 3-13 3-14 3-15
#> 3-1 0.0625 0.0625 0.0000 0.0000 0.0000
#> 3-2 0.0625 0.0625 0.0000 0.0000 0.0000
#> 3-3 0.0625 0.0625 0.0000 0.0000 0.0000
#> 3-4 0.0625 0.0625 0.0000 0.0000 0.0000
#> 3-5 0.0625 0.0625 0.0000 0.0000 0.0000
#> 3-6 0.0625 0.0625 0.0000 0.0000 0.0000
#> 3-7 0.1250 0.1250 0.0625 0.0625 0.0625
#> 3-8 0.1250 0.1250 0.0625 0.0625 0.0625
#> 3-9 0.2500 0.2500 0.0625 0.0625 0.0625
#> 3-10 0.2500 0.2500 0.0625 0.0625 0.0625
#> 3-11 0.5000 0.2500 0.0625 0.0625 0.0625
#> 3-12 0.2500 0.5000 0.0625 0.0625 0.0625
#> 3-13 0.0625 0.0625 0.5000 0.2500 0.2500
#> 3-14 0.0625 0.0625 0.2500 0.5000 0.2500
#> 3-15 0.0625 0.0625 0.2500 0.2500 0.5000
This kinship matrix is later visualized more conveniently as a heatmap using the popkin
package. If you’re interested, a list of local kinship matrices (one per generation) is returned in place of this if sim_pedigree
is called with the full = TRUE
option.
We use the excellent kinship2
package to visualize our randomly-constructed pedigree:
# load the package
library(kinship2)
#> Loading required package: Matrix
#> Loading required package: quadprog
# Convert the `fam` table into a `pedigree` object required by the `kinship2` package.
pedigree(
obj <-id = fam$id,
dadid = fam$pat,
momid = fam$mat,
sex = fam$sex,
affected = fam$pheno
)
# plot the pedigree
plot( obj, cex = 0.7 )
#> Did not plot the following people: 1-3 1-4 1-6 1-7 1-11 1-13 1-15
In these kinds of pedigree plots, node shapes represent sex (box=male, circle=female), and the dashed curved lines (if any) represent individuals copied in two places (once next to its parents, the second time next to its mate) to simplify the plot (the two nodes connected by a dashed line have the same individual ID). It is immediate clear that only individuals with opposite sex were paired, and no close relatives were paired (i.e. no siblings or first or second cousins).
You will notice several things from this random pedigree. One is that some individuals do not get paired in each generation, which happens because there are too many constraints: the number of males and females is random so they may be unequal, and close-relative avoidance further limits pairings. However, this problem is exacerbated in this toy example because the population is small; in a large population most people are paired (see further below). Here, founders that weren’t paired are not plotted (see message below figure). Pairings are always between people of the same generation (first number in ID), and usually but not always between people with close indexes (second number in ID).
When there are pairings, sim_pedigree
enforces a minimum number of children per family (default 1). Each generation has exactly the desired number of total individuals (15 per generation in this example), which is achieved by randomly drawing family sizes from a modified Poisson distribution (the number of children in excess of the minimum is drawn from Poisson with target mean per family minus minimum size) followed by smaller random adjustments in the direction of the discrepancy until the target population size is met.
If we are only interested in the last generation, then individuals from previous generations that were unpaired are irrelevant, since for example they don’t determine the genotypes of the last generation. So for simplicity or efficiency, we may want to remove those individuals without descendants. We do this with the function prune_fam
!
# replace original with pruned FAM table
# containing only ancestors of last generation
prune_fam( fam, ids[[ G ]] )
fam <-# inspect
fam#> # A tibble: 33 × 6
#> fam id pat mat sex pheno
#> <chr> <chr> <chr> <chr> <int> <dbl>
#> 1 fam1 1-1 <NA> <NA> 1 0
#> 2 fam1 1-2 <NA> <NA> 2 0
#> 3 fam1 1-5 <NA> <NA> 2 0
#> 4 fam1 1-8 <NA> <NA> 1 0
#> 5 fam1 1-9 <NA> <NA> 2 0
#> 6 fam1 1-10 <NA> <NA> 2 0
#> 7 fam1 1-12 <NA> <NA> 1 0
#> 8 fam1 1-14 <NA> <NA> 1 0
#> 9 fam1 2-2 1-1 1-2 2 0
#> 10 fam1 2-3 1-1 1-2 2 0
#> # … with 23 more rows
# and plot pedigree
pedigree(
obj <-id = fam$id,
dadid = fam$pat,
momid = fam$mat,
sex = fam$sex,
affected = fam$pheno
)plot( obj, cex = 0.7 )
Inspection of the top of the table shows that founders that were originally unpaired are now removed (the plot also doesn’t report people not plotted). The plot further shows that many individuals in the second generation (childless uncles and aunts of the last generation) were also removed, but nobody in the third generation was removed (as desired).
We will draw genotypes in two ways to reflect the traditional (unrelated founders) and the intended use of simfam
: structured founders. Let’s do unrelated founders first. Note that, to keep this vignette runtime low, the number of loci is much lower than any real dataset, so data will look artificially noisier here.
# number of loci to draw
5000
m <-# draw uniform ancestral allele frequencies
runif( m )
p_anc <-# draw independent (unstructured) genotypes for founders (includes founders without descendants)
1 <- matrix(
X_rbinom( n * m, 2, p_anc ),
nrow = m,
ncol = n
)# `simfam` requires names for the founders on `X_1`, to validate identities and align
colnames( X_1 ) <- ids[[ 1 ]]
Before moving on, it’s important to understand that these founders have a kinship matrix, albeit a trivial one, with zero kinship (unrelated) between individuals and a self-kinship of 1/2 (expected for outbred individuals). We can verify this directly using a kinship estimator applied to the genotypes we just drew. We use popkin
, which is the only unbiased kinship estimator. popkin
also provides heatmap visualization of kinship matrices via plot_popkin
.
library(popkin)
# estimate kinship
1 <- popkin( X_1 )
kinship_est_# true/expected kinship
1 <- diag( n ) / 2
kinship_# assign same IDs as `X_1` and `fam`; `simfam` generally requires these to validate/align
colnames( kinship_1 ) <- colnames( X_1 )
rownames( kinship_1 ) <- colnames( X_1 )
# plot them together for comparison
plot_popkin(
list( kinship_1, kinship_est_1 ),
titles = c('Expected', 'Estimated'),
names = TRUE,
leg_width = 0.2,
mar = c(3, 2)
)
Now we draw genotypes across the pedigree using geno_fam
! Since we have genotypes of the founders, we can simulate the random inheritance of alleles to every descendant, which produces the correct pedigree-based kinship/covariance structure at every locus. However, this code draws loci independently from each other (there’s no linkage disequilibrium). We also construct the total expected kinship across this pedigree with a similar function, kinship_fam
.
geno_fam( X_1, fam )
X <-# expected kinship of entire pruned pedigree, starting from true kinship of founders.
# Called "local" kinship because it ignores the ancestral population structure (will be reused)
# NOTE: agrees with `kinship2::kinship( fam )`, but only because founders are unstructured
kinship_fam( kinship_1, fam ) kinship_local <-
We can view and validate the resulting data in several ways. The more complete validation is to estimate kinship for all X
(all 3 generations) and see that it agrees with kinship_fam
:
# estimate kinship
popkin( X )
kinship_local_est <-# plot them together for comparison
plot_popkin(
list( kinship_local, kinship_local_est ),
titles = c('Expected', 'Estimated'),
names = TRUE,
names_cex = 0.8,
leg_width = 0.2,
mar = c(3, 2)
)
Above you can see that only founders in the pruned FAM table were kept in the output. However, in a real dataset previous generations might not be available (especially if G
were much larger than 3), so we might only have the genotype data of a subset of more recent relatives. Here we subset to keep the last generation, and compare not only to kinship_fam
, but also to the original kinship_local_G
that was calculated by sim_pedigree
to avoid pairing close relatives.
# indexes (really logicals) marking individuals in last generation,
# to subset data (which is aligned with `fam`)
fam$id %in% ids[[ G ]]
indexes_G <-# estimate kinship
popkin( X )
kinship_local_est <-# plot them together for comparison
plot_popkin(
list(
kinship_local[ indexes_G, indexes_G ],
kinship_local_est[ indexes_G, indexes_G ],
kinship_local_G
),titles = c('Expected', 'Estimated', 'Local'),
names = TRUE,
mar = c(3, 2)
)
Now we simulate data from the much more interesting case, where founders have differing mixed ancestries! This is much more like real human data. For this we use another external package, bnpsd
, to construct the admixture parameters of the founders, which also includes calculating their true population kinship matrix.
library(bnpsd)
# additional admixture parameters:
# number of ancestries
3
K <-# define population structure
# FST values for k=3 subpopulations
c(0.2, 0.4, 0.6)
inbr_subpops <-# admixture proportions from 1D geography (founders)
1 <- admix_prop_1d_linear( n, K, sigma = 1 )
admix_proportions_# add founder names to `admix_proportions_1` (will propagate to other vars of interest)
rownames( admix_proportions_1 ) <- ids[[ 1 ]]
# also name subpopulations simply as S1, S2, S3
colnames( admix_proportions_1 ) <- paste0( 'S', 1:K )
# calculate true kinship matrix of the admixed founders
# NOTE: overwrites `kinship` for unstructured founders
1 <- coanc_to_kinship( coanc_admix( admix_proportions_1, inbr_subpops ) )
kinship_
# draw genotypes from admixture model
# NOTE: overwrites `X_1` for unstructured founders
1 <- draw_all_admix( admix_proportions_1, inbr_subpops, m )$X X_
We can again verify the population kinship of the founders, which is no longer full of zeroes but rather takes on continuous values that depend on the admixture proportions of each pair of founders:
# estimate kinship (NOTE: overwrites `kinship_est_1` for unstructured founders)
1 <- popkin( X_1 )
kinship_est_# plot them together for comparison
plot_popkin(
list( kinship_1, kinship_est_1 ),
titles = c('Expected', 'Estimated'),
names = TRUE,
leg_width = 0.2,
mar = c(3, 2)
)
Note that, unlike the corresponding demonstration in the bnpsd
package, here the diagonal plots self-kinship values instead of inbreeding coefficients, as kinship matrices are the central focus of this simfam
package.
We draw genotypes across the pedigree the same way as before:
geno_fam( X_1, fam )
X <-# expected kinship of entire pruned pedigree, starting from true kinship of founders
# NOTE: DOESN'T agree with `kinship2::kinship( fam )` anymore because founders are structured!
kinship_fam( kinship_1, fam ) kinship_total <-
We again compare the expected kinship matrix to the estimate from genotypes:
# estimate kinship
popkin( X )
kinship_total_est <-# plot them together for comparison
plot_popkin(
list( kinship_total, kinship_total_est ),
titles = c('Expected', 'Estimated'),
names = TRUE,
names_cex = 0.8,
leg_width = 0.2,
mar = c(3, 2)
)
Lastly, we again subset individuals in the last generation only and compare their total to their local kinship, which are related but different things when a population is structured. Although undoubtedly the total kinship (panels A-B below) capture much of the same correlation patterns as the local kinship (panel C), the total kinship captures additional covariance due to the structure of the founders that the local kinship ignores (the local kinship uses only the pedigree information and ignores the ancestry differences of the founders).
plot_popkin(
list(
kinship_total[ indexes_G, indexes_G ],
kinship_total_est[ indexes_G, indexes_G ],
kinship_local_G
),titles = c('Total Expected', 'Total Estimated', 'Local'),
names = TRUE,
mar = c(3, 2)
)
In this admixture model the structure comes from fractional shared ancestry due to admixture as well as the relatedness between ancestries. We can look at these relatedness components in isolation from the pedigree and see that their contribution to the total kinship matrix is nearly additive (will explain more precisely shortly)! First propagate the admixture proportions across the pedigree, under the simple model that the ancestry of a child is the average of the ancestry of its parents:
# admixture proportions of pruned family
admix_fam( admix_proportions_1, fam )
admix_proportions <-
# visualize as bar plot
# for nice colors
library(RColorBrewer)
# colors for independent subpopulations
brewer.pal( K, "Paired" )
col_subpops <-# shrink default margins
par(mar = c(4, 4, 0.5, 0) + 0.2)
par_orig <-barplot(
t( admix_proportions ),
col = col_subpops,
legend.text = TRUE,
args.legend = list( cex = 0.7, bty = 'n' ),
border = NA,
space = 0,
ylab = 'Admixture prop.',
las = 2
)mtext('Individuals', side = 1, line = 3)
par( par_orig ) # reset `par`
The above figure shows that admixture proportions in the first generation vary a lot: for example, 1-1 has a large proportion of S1 ancestry, while 1-14 has mostly S3 instead. In contrast, in the third generation ancestry is much more averaged out, with fewer extremes. This effect will be weaker in larger populations, an example we go into shortly.
This code calculates the kinship expected from the admixture structure only, which will be visualized shortly.
# this is the admixture-only relatedness
coanc_to_kinship( coanc_admix( admix_proportions, inbr_subpops ) ) kinship_admix <-
We also demonstrate a good approximation of the total kinship that treats the structural (here admixture) and pedigree relatedness (in this case treating founders as unstructured) as independent effects, which roughly says that the kinship components satisfy (pseudocode):
# These quantities are independent
(1 - total) = (1 - struct) * (1 - family).
# solved for total, reveals near additivity:
total = struct + family - struct * family.
So when both the structural and family kinship values are much smaller than 1, their sum is a good match of the total kinship. Only when these values approach 1 does it become necessary to subtract their product, which is necessary since all of these values are probabilities and the total kinship cannot exceed 1. The same applies to inbreeding coefficient, but if we start from kinship matrices we need to transform the diagonal values back and forth (since, in pseudocode, kinship[i,i] = ( 1 + inbreeding[i] ) / 2
). We plot all of the matrices and verify that the approximation is very good, so the conceptual partitioning into structural and family effects is justified:
# this is the approximation of the total kinship:
# NOTE: a bit complicated because diagonal has to be transformed.
# Approximation operates on inbreeding coefficients, not self-kinship,
# so `popkin::inbr_diag` converts self-kinship to inbreeding,
# `bnpsd::coanc_to_kinship` converts inbreeding back to self-kinship.
# Off-diagonal values are unmodified by both functions.
coanc_to_kinship(
kinship_total_approx <-inbr_diag( kinship_local ) +
inbr_diag( kinship_admix ) -
inbr_diag( kinship_admix ) * inbr_diag( kinship_local )
)
# plot all model kinship matrices (no estimates from genotypes this time)
plot_popkin(
list( kinship_admix, kinship_local, kinship_total, kinship_total_approx ),
titles = c('Admixture only', 'Family only', 'Total', 'Total approx.'),
names = TRUE,
names_cex = 0.8,
leg_width = 0.2,
layout_rows = 2,
mar = c(3, 2)
)
Here we will simulate a much larger population size and a pedigree with a larger number of generations, to produce data that may better resemble real human data. We will make no attempt of visualize pedigrees or show individual labels. The whole code, redundant with our earlier analysis, is presented with minor explanations. Note, however, use of the newly-introduced functions geno_last_gen
, kinship_last_gen
, and admix_last_gen
to directly construct data for the last generation only, which also saves memory compared to the more general geno_fam
, kinship_fam
, and admix_fam
shown earlier.
library(simfam)
# data dimensions
500 # number of individuals per generation
n <- 10 # number of generations
G <- 3 # number of ancestries
K <- 5000 # number of loci
m <-
# draw the random pedigree with those properties.
sim_pedigree( n, G )
data <- data$fam
fam <- data$ids
ids <-# use this local kinship calculation in plot
data$kinship_local
kinship_local_G <-
# prune pedigree to speed up simulations/etc
prune_fam( fam, ids[[ G ]] )
fam <-
### admixture model
# define population structure
# FST values for k=3 subpopulations
c(0.2, 0.4, 0.6)
inbr_subpops <-# admixture proportions from 1D geography (founders)
1 <- admix_prop_1d_linear( n, K, sigma = 1 )
admix_proportions_# add founder names to `admix_proportions_1` (will propagate to other vars of interest)
rownames( admix_proportions_1 ) <- ids[[ 1 ]]
# also name subpopulations simply as S1, S2, S3
colnames( admix_proportions_1 ) <- paste0( 'S', 1:K )
# draw genotypes from admixture model
# admixed founders
1 <- draw_all_admix( admix_proportions_1, inbr_subpops, m )$X
X_# draw genotypes, returning last generation only
geno_last_gen( X_1, fam, ids )
X_G <-# total kinship estimated from genotypes
popkin( X_G )
kinship_total_G_est <-
# calculate true total kinship matrix
1 <- coanc_to_kinship( coanc_admix( admix_proportions_1, inbr_subpops ) )
kinship_total_ kinship_last_gen( kinship_total_1, fam, ids )
kinship_total_G <-
# kinship due to admixture only
admix_last_gen( admix_proportions_1, fam, ids )
admix_proportions_G <- coanc_to_kinship( coanc_admix( admix_proportions_G, inbr_subpops ) )
kinship_admix_G <-
# total kinship approximation from components
coanc_to_kinship(
kinship_total_approx_G <-inbr_diag( kinship_local_G ) +
inbr_diag( kinship_admix_G ) -
inbr_diag( kinship_admix_G ) * inbr_diag( kinship_local_G )
)
The first important validation is that the total structure (estimated from genotypes) agrees with the theoretical calculations based on the pedigree and the known founder total kinship. For simplicity we visualize the last generation only:
plot_popkin(
list( kinship_total_G, kinship_total_G_est ),
titles = c('Expected', 'Estimated'),
leg_width = 0.2,
mar = c(3, 2)
)
The next demonstration is that the component partitioning approximation works again (limited to last generation):
# plot all model kinship matrices (no estimates from genotypes this time)
plot_popkin(
list( kinship_admix_G, kinship_local_G, kinship_total_G, kinship_total_approx_G ),
titles = c('Admixture only', 'Family only', 'Total', 'Total approx.'),
leg_width = 0.2,
layout_rows = 2,
mar = c(3, 2)
)
Lastly, let’s determine how many individuals in each generation were ancestors of individuals in the last generation. The bar plot below shows that, in this large simulation, most individuals in the first generation (founders), and in every subsequent generation, had descendants in the last generation.
# the IDs in `fam` (the pruned FAM table) contain the individuals of interest
vector('numeric', G)
frac_per_gen <-for ( g in 1 : G ) {
# this is the desired fraction
sum( fam$id %in% ids[[ g ]] ) / n
frac_per_gen[g] <-
}# visualize as a barplot
barplot(
frac_per_gen,names.arg = 1 : G,
xlab = 'Generation',
ylab = 'Frac. individuals with descendants'
)# add a line marking the maximum fraction of individuals per generation
abline( h = 1, lty = 2, col = 'red' )