This vignette demonstrates how to use the latest version of HierDpart to do landscape genetic analysis. This package is designed for population geneticists and community geneticists to solve the gap between community ecology and population genetics - how to partition diversity across metrics and spatial scales, from genetic diversity to species diversity using an integrated method. The method, known as the unifying framework, has been explained in recent paper (Gaggiotti, O. E. et al, 2018, Evol Appl, 11:1176-1193). This package was developed to implement the hierarchical decomposition analysis using not only the common genetic metrics, such as He, Fst, allelic richness, but also new genetic metrics from the unifying framework.
Here, this vignette will briefly show how to do analyses using this package.
HierDpart can read genepop files both in haploid and diploid which are quite common in genetics.
##Install HierDpart
#Install from CRAN
#install.packages("HierDpart")
## or you can get the latest version of HierDpart from github
#library(devtools)
#install_github("xinghuq/HierDpart")
library("HierDpart")
#> Registered S3 method overwritten by 'spdep':
#> method from
#> plot.mst ape
#> Registered S3 method overwritten by 'GGally':
#> method from
#> +.gg ggplot2
#> Registered S3 method overwritten by 'pegas':
#> method from
#> print.amova ade4
# example genepop file
f <- system.file('extdata',package='HierDpart')
infile <- file.path(f, "Island.gen")
##Function to calculate genetic diversity profiles (q=0,1,2) qD is the function calculating genetic diversity profile (qD, q=0,1,2), the true diversities. If we want to calculate the allelic diversity profiles, we can do the following command line:
qD(infile,q=0,ncode=3)
#> pop1 pop2 pop3 pop4 pop5 pop6 pop7 pop8 pop9 pop10 pop11 pop12 pop13
#> locus1 2 3 2 3 3 2 2 5 5 4 5 3 3
#> locus2 9 9 8 7 7 7 8 4 5 5 6 7 6
#> locus3 8 9 8 7 8 8 8 8 9 9 7 6 6
#> locus4 8 8 8 7 5 6 7 8 10 9 8 5 6
#> locus5 7 7 9 6 7 5 10 6 9 6 7 7 7
#> locus6 4 6 5 6 6 3 5 4 6 6 6 5 4
#> locus7 6 5 6 6 8 7 4 6 7 4 5 4 4
#> locus8 9 8 9 9 9 8 7 8 6 9 7 5 5
#> locus9 5 6 8 6 8 6 6 6 6 7 6 5 4
#> locus10 9 8 7 7 8 6 6 6 6 8 7 7 6
#> pop14 pop15 pop16
#> locus1 6 3 4
#> locus2 5 5 5
#> locus3 7 7 7
#> locus4 9 9 10
#> locus5 8 9 8
#> locus6 5 6 5
#> locus7 5 4 5
#> locus8 3 6 4
#> locus9 8 8 6
#> locus10 6 7 8
You can choose which diversity you want to get by defining the value of q.
##Plot diversity profiles If we want to plot the diversity profiles, we can do:
##Hierarchical diversity and differentiation decomposition For geneticists studying landscape genetics, they may want to partition the metapopulation into different subpopulation, thus having several hierarchical groups. Now I am showing how to decompose genetic diversity into different hierarchical levels.
Here it is easy to define or delimit your structure (hierarchy), type ?HierAr will show you how to do that. Here, “nreg” is the number of aggregate ( or population) in your metapopulation, and “r” is the number of subaggregate (or subpopulation) in each aggregate.
HAr=HierAr(infile, nreg=4, r=c(7,4,2,3), ncode=3)
print(HAr)
#> $Ar_pop
#> Arpop 1 Arpop 2 Arpop 3 Arpop 4 Arpop 5 Arpop 6 Arpop 7 Arpop 8 Arpop 9
#> locus1 2 3 2 3 3 2 2 5 5
#> locus2 9 9 8 7 7 7 8 4 5
#> locus3 8 9 8 7 8 8 8 8 9
#> locus4 8 8 8 7 5 6 7 8 10
#> locus5 7 7 9 6 7 5 10 6 9
#> locus6 4 6 5 6 6 3 5 4 6
#> locus7 6 5 6 6 8 7 4 6 7
#> locus8 9 8 9 9 9 8 7 8 6
#> locus9 5 6 8 6 8 6 6 6 6
#> locus10 9 8 7 7 8 6 6 6 6
#> Arpop 10 Arpop 11 Arpop 12 Arpop 13 Arpop 14 Arpop 15 Arpop 16
#> locus1 4 5 3 3 6 3 4
#> locus2 5 6 7 6 5 5 5
#> locus3 9 7 6 6 7 7 7
#> locus4 9 8 5 6 9 9 10
#> locus5 6 7 7 7 8 9 8
#> locus6 6 6 5 4 5 6 5
#> locus7 4 5 4 4 5 4 5
#> locus8 9 7 5 5 3 6 4
#> locus9 7 6 5 4 8 8 6
#> locus10 8 7 7 6 6 7 8
#>
#> $Hierareco
#> ecosystem
#> locus1 6
#> locus2 11
#> locus3 13
#> locus4 11
#> locus5 14
#> locus6 7
#> locus7 10
#> locus8 13
#> locus9 13
#> locus10 11
#>
#> $Ar_reg
#> Ar_region 1 Ar_region 2 Ar_region 3 Ar_region 4
#> pop 1 9.2 8.7 6 7.3
#>
#> $Hierar_loc
#> Ar_region 1 Ar_region 2 Ar_region 3 Ar_region 4
#> locus1 4 6 3 6
#> locus2 10 8 8 5
#> locus3 12 11 6 7
#> locus4 9 10 7 10
#> locus5 13 10 8 10
#> locus6 6 7 5 6
#> locus7 8 7 6 5
#> locus8 10 12 5 6
#> locus9 11 8 5 10
#> locus10 9 8 7 8
#>
#> $Ar_ovell
#> Art Arr Arp
#> ecosystem 10.9 7.8 6.3375
HiD=HierD(infile, nreg=4, r=c(7,4,2,3), ncode=3)
print(HiD)
#> D_gamma D_alpha.2 D_alpha.1 D_beta.2 D_beta.1 Differentiation.2
#> Locus 1 3.193142 2.725683 2.547081 1.171501 1.070120 0.1234638
#> Locus 2 8.127991 5.570462 4.756040 1.459123 1.171240 0.2947130
#> Locus 3 9.182470 7.039348 5.851804 1.304449 1.202936 0.2073097
#> Locus 4 8.384311 6.830226 5.819550 1.227531 1.173669 0.1599041
#> Locus 5 8.815656 6.014476 5.031749 1.465740 1.195305 0.2982420
#> Locus 6 4.466209 3.846453 3.517153 1.161124 1.093627 0.1165235
#> Locus 7 6.466375 4.320795 3.849778 1.496570 1.122349 0.3144786
#> Locus 8 8.193804 5.671722 4.805933 1.444677 1.180150 0.2869518
#> Locus 9 9.424112 5.279522 4.494791 1.785031 1.174587 0.4519619
#> Locus 10 8.429114 5.828401 5.117923 1.446214 1.138821 0.2877814
#> Differentiation.1
#> Locus 1 0.04546744
#> Locus 2 0.10604372
#> Locus 3 0.12395858
#> Locus 4 0.10743385
#> Locus 5 0.11968905
#> Locus 6 0.06004503
#> Locus 7 0.07743759
#> Locus 8 0.11112839
#> Locus 9 0.10795825
#> Locus 10 0.08721245
The result returns a list of hierarchical metrics, the first five metrics are diversities. It explains in the below section.
He=HierHe(infile, nreg=4, r=c(7,4,2,3), ncode=3)
print(He)
#> $HierHe_perloc
#> Ht Hr 1 Hr 2 Hr 3 Hr 4 Hp
#> Locus 1 0.5937 0.4725 0.7302 0.4131 0.6587 0.5529
#> Locus 2 0.8549 0.8508 0.6745 0.7738 0.7518 0.7665
#> Locus 3 0.8781 0.8251 0.8675 0.7884 0.8342 0.8173
#> Locus 4 0.8595 0.8164 0.8765 0.7348 0.8702 0.8149
#> Locus 5 0.8488 0.7230 0.7936 0.7808 0.8541 0.7557
#> Locus 6 0.7036 0.6288 0.7947 0.6910 0.5559 0.6546
#> Locus 7 0.8052 0.7524 0.7951 0.3291 0.6892 0.6873
#> Locus 8 0.8410 0.8593 0.8014 0.6530 0.5448 0.7424
#> Locus 9 0.8758 0.8058 0.7774 0.5692 0.7589 0.7355
#> Locus 10 0.8592 0.7907 0.7548 0.8044 0.8200 0.7754
#>
#> $HierHr
#> Ho Hs Ht Dst Htp Dstp Fst Fstp Fis Dest
#> 1 0.7450 0.7320 0.7525 0.0205 0.7559 0.0239 0.0272 0.0316 -0.0178 0.0892
#> 2 0.7512 0.7726 0.7866 0.0139 0.7912 0.0186 0.0177 0.0235 0.0277 0.0817
#> 3 0.6750 0.6521 0.6537 0.0016 0.6554 0.0033 0.0025 0.0050 -0.0351 0.0094
#> 4 0.7183 0.7217 0.7338 0.0121 0.7398 0.0181 0.0164 0.0245 0.0047 0.0650
#>
#> $Hrperloc
#> [,1] [,2] [,3] [,4]
#> [1,] 0.4725 0.7302 0.4131 0.6587
#> [2,] 0.8508 0.6745 0.7738 0.7518
#> [3,] 0.8251 0.8675 0.7884 0.8342
#> [4,] 0.8164 0.8765 0.7348 0.8702
#> [5,] 0.7230 0.7936 0.7808 0.8541
#> [6,] 0.6288 0.7947 0.6910 0.5559
#> [7,] 0.7524 0.7951 0.3291 0.6892
#> [8,] 0.8593 0.8014 0.6530 0.5448
#> [9,] 0.8058 0.7774 0.5692 0.7589
#> [10,] 0.7907 0.7548 0.8044 0.8200
#>
#> $HieHr_overall
#> Ht Hr Hp
#> [1,] 0.812 0.73165 0.7302
Besides the hierarchical diversity, this package also calculates hierarchical genetic differentiation measured by Fst and Delta D. The Delta D here, which corresponds to q=1, measures allelic differentiation. Now we are showing how to calculate the traditional hierarchical Fst and new hierarchical Delta D.
For Fst:
Hier_Fst=HierFst(infile, nreg=4, r=c(7,4,2,3), ncode=3)
print(Hier_Fst)
#> region pop Ind
#> Total 0.1077832 0.13154297 0.128481750
#> region 0.0000000 0.02663001 0.023198983
#> pop 0.0000000 0.00000000 -0.003524894
This gives you the Fst among subaggregate within in aggregate. See ?HierFst for more details.
For DeltaD:
Hier_D=HierD(infile, nreg=4, r=c(7,4,2,3), ncode=3)
print(Hier_D)
#> D_gamma D_alpha.2 D_alpha.1 D_beta.2 D_beta.1 Differentiation.2
#> Locus 1 3.193142 2.725683 2.547081 1.171501 1.070120 0.1234638
#> Locus 2 8.127991 5.570462 4.756040 1.459123 1.171240 0.2947130
#> Locus 3 9.182470 7.039348 5.851804 1.304449 1.202936 0.2073097
#> Locus 4 8.384311 6.830226 5.819550 1.227531 1.173669 0.1599041
#> Locus 5 8.815656 6.014476 5.031749 1.465740 1.195305 0.2982420
#> Locus 6 4.466209 3.846453 3.517153 1.161124 1.093627 0.1165235
#> Locus 7 6.466375 4.320795 3.849778 1.496570 1.122349 0.3144786
#> Locus 8 8.193804 5.671722 4.805933 1.444677 1.180150 0.2869518
#> Locus 9 9.424112 5.279522 4.494791 1.785031 1.174587 0.4519619
#> Locus 10 8.429114 5.828401 5.117923 1.446214 1.138821 0.2877814
#> Differentiation.1
#> Locus 1 0.04546744
#> Locus 2 0.10604372
#> Locus 3 0.12395858
#> Locus 4 0.10743385
#> Locus 5 0.11968905
#> Locus 6 0.06004503
#> Locus 7 0.07743759
#> Locus 8 0.11112839
#> Locus 9 0.10795825
#> Locus 10 0.08721245
Here we get alpha, beta, gamma diversity. D alpha1, D alpha2 and Dgamma correspond to number of equivalent alleles at population level, number of equivalent alleles at regional level, and total number of equivalent alleles at ecosystem. Beta.1 is the number of equivalent subaggregate (population or subpopulation depends on the hierarchy), and beta.2 is the number of equivalent aggregate (eg. regions). Differentiation.2 represents genetic differentiation between aggragates, Differentiation.1 represents genetic differentiation between subaggregates.
Untill now you may know how to implement the hierarchical diversity analysis. Furthermore, this package also includes several functions that can plot correlation between geographic distances and genetic differentiations, (see COR_Fstd, and COR_DeltaDd), as well as Delta D matrix and Delta D across loci.
’Gaggiotti, O. E., Chao, A., Peres-Neto, P., Chiu, C. H., Edwards, C., Fortin, M. J., … & Selkoe, K. A. (2018). Diversity from genes to ecosystems: A unifying framework to study variation across biological metrics and scales. Evolutionary Applications.
’Weir, B.S. (1996) Genetic Data Analysis II. Sinauer Associates.
’Yang, R.C. (1998). Estimating hierarchical F-statistics. Evolution 52 (4):950-956