Since version 3.1.0 of geomorph, we have introduced the function
gm.prcomp
, and related utility functions
(summary()
and plot()
), for performing
principal components analyses on Procrustes shape variables for a set of
aligned specimens. This function now includes several different types of
analytical options and, combined with other visualization tools
available in geomorph, provides tools for exploring variation in shape
space. This vignette illustrates the analytical, plotting, and
visualization capacities of gm.prcomp
and other related
functions.
NOTE: The combination of the functions and
procedures illustrated here are meant to substitute and extend the
options previously available through the functions
plotTangentSpace
and plotGMPhyloMorphoSpace
,
which are no longer maintained and have thus been deprecated.
Throughout, we will be using shape data of several Plethodon species as an example, so let´s first load and superimpose those.
library(geomorph)
## Loading required package: RRPP
## Loading required package: rgl
## Loading required package: Matrix
data(plethspecies)
<- gpagen(plethspecies$land, print.progress = F) Y.gpa
One first option is to perform a “traditional” PCA, i.e. based on
OLS-centering and projection of the data, very much like what is
performed in the basic R function prcomp
. Note that this
also corresponds to the analytical part of the old (now deprecated)
geomorph function plotTangentSpace
.
<- gm.prcomp(Y.gpa$coords)
PCA summary(PCA)
##
## Ordination type: Principal Component Analysis
## Centering by OLS mean
## Orthogonal projection of OLS residuals
## Number of observations: 9
## Number of vectors 8
##
## Importance of Components:
## Comp1 Comp2 Comp3 Comp4
## Eigenvalues 0.0002720474 0.0001120524 0.0001084758 0.0000568924
## Proportion of Variance 0.4564029477 0.1879858086 0.1819855633 0.0954461044
## Cumulative Proportion 0.4564029477 0.6443887563 0.8263743196 0.9218204240
## Comp5 Comp6 Comp7 Comp8
## Eigenvalues 0.0000264508 1.260516e-05 5.959622e-06 1.584785e-06
## Proportion of Variance 0.0443754550 2.114717e-02 9.998220e-03 2.658730e-03
## Cumulative Proportion 0.9661958790 9.873431e-01 9.973413e-01 1.000000e+00
plot(PCA, main = "PCA")
Note that, by contrast to older functions, gm.prcomp
provides a much higher flexibility of plotting options, by allowing to
directly pass arguments to the plot()
R-base function,
e.g.:
plot(PCA, main = "PCA", pch = 22, bg = "green", cex = 1.5, cex.lab = 1.5, font.lab = 2)
One then has several solutions for exploring shape variation across
PC space and visualizing shape patterns. First, the user may choose to
manually produce deformation grids to compare the shapes corresponding
to the extremes of a chosen PC axis using plotRefToTarget
,
i.e.:
<- mshape(Y.gpa$coords)
msh plotRefToTarget(PCA$shapes$shapes.comp1$min, msh)
plotRefToTarget(msh, PCA$shapes$shapes.comp1$max)
or b) by comparing the two extremes to each other:
plotRefToTarget(PCA$shapes$shapes.comp1$min, PCA$shapes$shapes.comp1$max, method = "vector", mag = 2)
Of course here one can use all the plotting options available in
plotRefToTarget
. Please see the help file of that function
for details.
Alternatively, one may use the interactive function
picknplot.shape
to explore shape variation across the PC
scatterplot produced above. This allows one to click on any point in the
above scatterplot (or any scatterplot produced by geomorph) to obtain a
deformation grid illustrating the shape that corresponds to the selected
point, as compared to the global mean shape. The function then
interactively prompts the user to export the grid as an image file, if
desired, and to continue by picking another point. To explore the
function, run the code below and then click at any point inside the PCA
scatterplot:
picknplot.shape(plot(PCA), method = "vector")
Again, options of the function plotRefToTarget
are
directly available and can be passed to picknplot.shape as additional
arguments.
One may also want to project a phylogeny (if dealing with species-level observations), and estimated ancestral states into the ordination plot produced before, to obtain what is commonly referred to as a “phylomorphospace” plot (Klingenberg & Ekau 1996, Biol. J. Linn. Soc. 59: 143 - 177; Rohlf 2002, Geometric morphometrics and phylogeny. Pp. 175-193 in N. MacLeod, and P. L. Forey, eds. Morphology, shape and phylogeny. Francis & Taylor, London; Sidlauskas 2008, Evolution 62: 3135 - 3156). This can be easily done by providing a phylogenetic tree during the analysis (which is however NOT used in the analytical step in this case, other than for estimating ancestral states), and then indicating that the phylogeny should also be plotted during the plotting step. For the Plethodon example data, we may project the phylogeny into the previous ordination plot as such:
<- gm.prcomp(Y.gpa$coords, phy = plethspecies$phy)
PCA.w.phylo summary(PCA.w.phylo)
##
## Ordination type: Principal Component Analysis
## Centering by OLS mean
## Orthogonal projection of OLS residuals
## Number of observations: 9
## Number of vectors 8
##
## Importance of Components:
## Comp1 Comp2 Comp3 Comp4
## Eigenvalues 0.0002720474 0.0001120524 0.0001084758 0.0000568924
## Proportion of Variance 0.4564029477 0.1879858086 0.1819855633 0.0954461044
## Cumulative Proportion 0.4564029477 0.6443887563 0.8263743196 0.9218204240
## Comp5 Comp6 Comp7 Comp8
## Eigenvalues 0.0000264508 1.260516e-05 5.959622e-06 1.584785e-06
## Proportion of Variance 0.0443754550 2.114717e-02 9.998220e-03 2.658730e-03
## Cumulative Proportion 0.9661958790 9.873431e-01 9.973413e-01 1.000000e+00
##
##
## Dispersion (variance) of points, after projection:
## Comp1 Comp2 Comp3
## Tips Dispersion 2.720474e-04 1.120524e-04 1.084758e-04
## Proportion Tips Dispersion 4.564029e-01 1.879858e-01 1.819856e-01
## Cumulative Tips Dispersion 4.564029e-01 6.443888e-01 8.263743e-01
## Ancestors Dispersion 3.073342e-05 2.320038e-05 2.039005e-05
## Proportion Ancestors Dispersion 3.854327e-01 2.909597e-01 2.557149e-01
## Cumulative Ancestors Dispersion 3.854327e-01 6.763923e-01 9.321072e-01
## Comp4 Comp5 Comp6
## Tips Dispersion 5.689240e-05 2.645080e-05 1.260516e-05
## Proportion Tips Dispersion 9.544610e-02 4.437545e-02 2.114717e-02
## Cumulative Tips Dispersion 9.218204e-01 9.661959e-01 9.873431e-01
## Ancestors Dispersion 3.171758e-06 1.748508e-06 3.945777e-07
## Proportion Ancestors Dispersion 3.977752e-02 2.192831e-02 4.948462e-03
## Cumulative Ancestors Dispersion 9.718847e-01 9.938130e-01 9.987615e-01
## Comp7 Comp8
## Tips Dispersion 5.959622e-06 1.584785e-06
## Proportion Tips Dispersion 9.998220e-03 2.658730e-03
## Cumulative Tips Dispersion 9.973413e-01 1.000000e+00
## Ancestors Dispersion 8.354868e-08 1.520561e-08
## Proportion Ancestors Dispersion 1.047797e-03 1.906960e-04
## Cumulative Ancestors Dispersion 9.998093e-01 1.000000e+00
plot(PCA.w.phylo, phylo = TRUE, main = "PCA.w.phylo")
Note that the summary statistics obtained for this analysis are identical to those from the previous one. This is because here the phylogeny is merely used for plotting, and is NOT considered during the analytical procedures.
Again, all plotting arguments can be directly manipulated by the
user. Please see the help file of plot.gm.prcomp
for
details.
Here, the phylogeny IS considered during the analytical step of the ordination, as the principal components analysis is in this case calculated based on GLS-centering and projection of the data. For details on the analytical part of this method, see Revell 2009, Evolution 63: 3258 - 3268; Polly et al 2013, Hystrix 24: 33 - 41; Collyer & Adams, submitted.
For the Plethodon example data, this analysis would be implemented and plotted as follows (fir with untransformed residual projection and second with transformed residual projection):
# Phylo PCA without projecting untransformed residuals
<- gm.prcomp(Y.gpa$coords, phy = plethspecies$phy, GLS = TRUE)
phylo.PCA summary(phylo.PCA)
##
## Ordination type: Principal Component Analysis
## Centering by GLS mean
## Oblique projection of GLS-centered residuals
## Number of observations: 9
## Number of vectors 8
##
## Importance of Components:
## Comp1 Comp2 Comp3 Comp4
## Eigenvalues 2.761405e-05 9.322868e-06 7.889847e-06 6.605120e-06
## Proportion of Variance 4.855705e-01 1.639351e-01 1.387365e-01 1.161457e-01
## Cumulative Proportion 4.855705e-01 6.495056e-01 7.882421e-01 9.043878e-01
## Comp5 Comp6 Comp7 Comp8
## Eigenvalues 3.017886e-06 1.203779e-06 9.895110e-07 2.262231e-07
## Proportion of Variance 5.306707e-02 2.116747e-02 1.739975e-02 3.977950e-03
## Cumulative Proportion 9.574548e-01 9.786223e-01 9.960221e-01 1.000000e+00
##
##
## Dispersion (variance) of points, after projection:
## Comp1 Comp2 Comp3
## Tips Dispersion 0.0002498092 8.395185e-05 1.057114e-04
## Proportion Tips Dispersion 0.4190949497 1.408427e-01 1.773479e-01
## Cumulative Tips Dispersion 0.4190949497 5.599376e-01 7.372855e-01
## Ancestors Dispersion 0.0000169982 9.862682e-06 2.194379e-05
## Proportion Ancestors Dispersion 0.2131770701 1.236895e-01 2.752006e-01
## Cumulative Ancestors Dispersion 0.2131770701 3.368665e-01 6.120671e-01
## Comp4 Comp5 Comp6
## Tips Dispersion 0.0001027383 2.765308e-05 1.675548e-05
## Proportion Tips Dispersion 0.1723598707 4.639247e-02 2.811000e-02
## Cumulative Tips Dispersion 0.9096453329 9.560378e-01 9.841478e-01
## Ancestors Dispersion 0.0000272722 1.114653e-06 1.772175e-06
## Proportion Ancestors Dispersion 0.3420249711 1.397905e-02 2.222513e-02
## Cumulative Ancestors Dispersion 0.9540920945 9.680711e-01 9.902963e-01
## Comp7 Comp8
## Tips Dispersion 7.598972e-06 1.850019e-06
## Proportion Tips Dispersion 1.274849e-02 3.103703e-03
## Cumulative Tips Dispersion 9.968963e-01 1.000000e+00
## Ancestors Dispersion 7.068511e-07 6.689930e-08
## Proportion Ancestors Dispersion 8.864732e-03 8.389947e-04
## Cumulative Ancestors Dispersion 9.991610e-01 1.000000e+00
plot(phylo.PCA, phylo = TRUE, main = "phylo PCA")
# Phylo PCA without projecting transformed residuals
<- gm.prcomp(Y.gpa$coords, phy = plethspecies$phy, GLS = TRUE, transform = TRUE)
phylo.tPCA summary(phylo.tPCA)
##
## Ordination type: Principal Component Analysis
## Centering by GLS mean
## GLS residuals transformed for orthogonal projection
## Number of observations: 9
## Number of vectors 8
##
## Importance of Components:
## Comp1 Comp2 Comp3 Comp4
## Eigenvalues 2.761405e-05 9.322868e-06 7.889847e-06 6.605120e-06
## Proportion of Variance 4.855705e-01 1.639351e-01 1.387365e-01 1.161457e-01
## Cumulative Proportion 4.855705e-01 6.495056e-01 7.882421e-01 9.043878e-01
## Comp5 Comp6 Comp7 Comp8
## Eigenvalues 3.017886e-06 1.203779e-06 9.895110e-07 2.262231e-07
## Proportion of Variance 5.306707e-02 2.116747e-02 1.739975e-02 3.977950e-03
## Cumulative Proportion 9.574548e-01 9.786223e-01 9.960221e-01 1.000000e+00
##
##
## Dispersion (variance) of points, after projection:
## Comp1 Comp2 Comp3
## Tips Dispersion 2.740155e-05 8.385080e-06 7.889441e-06
## Proportion Tips Dispersion 4.933393e-01 1.509655e-01 1.420420e-01
## Cumulative Tips Dispersion 4.933393e-01 6.443048e-01 7.863468e-01
## Ancestors Dispersion 1.604482e-06 8.942699e-07 1.578102e-06
## Proportion Ancestors Dispersion 3.034061e-01 1.691056e-01 2.984176e-01
## Cumulative Ancestors Dispersion 3.034061e-01 4.725117e-01 7.709293e-01
## Comp4 Comp5 Comp6
## Tips Dispersion 6.590826e-06 2.909294e-06 1.175603e-06
## Proportion Tips Dispersion 1.186617e-01 5.237912e-02 2.116563e-02
## Cumulative Tips Dispersion 9.050085e-01 9.573876e-01 9.785532e-01
## Ancestors Dispersion 1.007063e-06 6.196138e-08 1.114791e-07
## Proportion Ancestors Dispersion 1.904346e-01 1.171684e-02 2.108060e-02
## Cumulative Ancestors Dispersion 9.613640e-01 9.730808e-01 9.941614e-01
## Comp7 Comp8
## Tips Dispersion 9.878523e-07 2.033663e-07
## Proportion Tips Dispersion 1.778536e-02 3.661419e-03
## Cumulative Tips Dispersion 9.963386e-01 1.000000e+00
## Ancestors Dispersion 2.711996e-08 3.755919e-09
## Proportion Ancestors Dispersion 5.128360e-03 7.102409e-04
## Cumulative Ancestors Dispersion 9.992898e-01 1.000000e+00
plot(phylo.tPCA, phylo = TRUE, main = "phylo PCA with transformed projection")
par(mfrow = c(1, 1))
Again, one can use the geomorph functions
plotRefToTarget
or picknplot.shape
to
visualize variation on the produced ordination plot, as described for
the case of the simple PCA above.
This recently introduced method (Collyer & Adams, submitted) provides an ordination that aligns phenotypic data with phylogenetic signal, by maximizing variation in directions that describe phylogenetic signal, while simultaneously preserving the the Euclidean distances among observations in the data space. PaCA provides a projection that shows the most phylogenetic signal in the first few components, irrespective of other signals in the data. By comparing PCA, phyloPCA and PaCA results, one may glean the relative importance of phylogenetic and other (ecological) signals in the data.
For the Plethodon example data, this analysis would be implemented and plotted as follows:
<- gm.prcomp(Y.gpa$coords, phy = plethspecies$phy, align.to.phy = TRUE)
PaCA summary(PaCA)
##
## Ordination type: Alignment to an alternative matrix
## Alignment matrix: phy
## Centering by OLS mean
## OLS residuals
## Alignment to phy means residual projection is not orthogonal.
## Number of observations: 9
## Number of vectors 8
## Warning in rbind(deparse.level, ...): number of columns of result, 8, is not a
## multiple of vector length 9 of arg 2
## Warning in rbind(deparse.level, ...): number of columns of result, 8, is not a
## multiple of vector length 9 of arg 3
## Importance of Components:
## Comp1 Comp2 Comp3 Comp4
## Singular Value 0.0003931629 0.0001155198 5.091258e-05 2.660786e-05
## Proportion of Covariance 0.6595936040 0.1938028696 8.541399e-02 4.463894e-02
## Cumulative Proportion 0.6595936040 0.8533964737 9.388105e-01 9.834494e-01
## RV by Component 0.0993211764 0.0291827102 1.286158e-02 6.721702e-03
## Cumulative RV 0.0993211764 0.1285038867 1.413655e-01 1.480872e-01
## Comp5 Comp6 Comp7 Comp8
## Singular Value 6.598675e-06 2.351469e-06 7.108871e-07 2.042572e-07
## Proportion of Covariance 1.107033e-02 3.944966e-03 1.192627e-03 3.426742e-04
## Cumulative Proportion 9.945197e-01 9.984647e-01 9.996573e-01 1.000000e+00
## RV by Component 1.666963e-03 5.940304e-04 1.795850e-04 5.159965e-05
## Cumulative RV 1.497541e-01 1.503482e-01 1.505278e-01 1.505794e-01
##
##
## Dispersion (variance) of points, after projection:
## Comp1 Comp2 Comp3
## Tips Dispersion 2.301865e-04 1.124810e-04 1.133102e-04
## Proportion Tips Dispersion 3.861747e-01 1.887049e-01 1.900960e-01
## Cumulative Tips Dispersion 3.861747e-01 5.748795e-01 7.649755e-01
## Ancestors Dispersion 4.069158e-05 2.437196e-05 8.067595e-06
## Proportion Ancestors Dispersion 5.103196e-01 3.056526e-01 1.011770e-01
## Cumulative Ancestors Dispersion 5.103196e-01 8.159722e-01 9.171492e-01
## Comp4 Comp5 Comp6
## Tips Dispersion 7.526878e-05 4.046474e-05 1.571235e-05
## Proportion Tips Dispersion 1.262754e-01 6.788606e-02 2.635999e-02
## Cumulative Tips Dispersion 8.912509e-01 9.591370e-01 9.854970e-01
## Ancestors Dispersion 6.244464e-06 3.539330e-07 3.508799e-09
## Proportion Ancestors Dispersion 7.831282e-02 4.438730e-03 4.400441e-05
## Cumulative Ancestors Dispersion 9.954620e-01 9.999008e-01 9.999448e-01
## Comp7 Comp8
## Tips Dispersion 6.396433e-06 2.248360e-06
## Proportion Tips Dispersion 1.073104e-02 3.771984e-03
## Cumulative Tips Dispersion 9.962280e-01 1.000000e+00
## Ancestors Dispersion 2.594059e-09 1.810230e-09
## Proportion Ancestors Dispersion 3.253251e-05 2.270238e-05
## Cumulative Ancestors Dispersion 9.999773e-01 1.000000e+00
plot(PaCA, phylo = TRUE, main = "PaCA")
Finally, substituting the corresponding option previously available
through plotGMPhyloMorphoSpace
, plot.gm.prcomp provides the
possibility of producing a 3D plot of any two PCA axes, with the
phylogenetic tree connecting the observations and time on the z-axis.
Again, different plotting parameters can be controlled to manipulate
plot aesthetics. Note, that in this case an rgl plotting device will
open for the 3D plot, but the corresponding biplot with the phylogeny
projected (option 2, above) will also be produced.
plot(PCA.w.phylo, time.plot = TRUE, pch = 22, bg = c(rep("red", 5), rep("green", 4)), cex = 2,
phylo.par = list(edge.color = "grey60", edge.width = 1.5, tip.txt.cex = 0.75,
node.labels = F, anc.states = F))