03: Day 3 Material

General Introduction

Today we will briefly introduce some geomorph functions for performing the analyses described yesterday. Note: You should be able to read in your data and perform the analyses from earlier this week!

IMPORTANT NOTE!!! With GLM and pairwise comparisons via RRPP, one’s world of GM analyses really opens up! Be sure to go through the help files in detail to see all of the myriad ways these components of the toolkit may be applied. Here is probably the place where one’s sophistication catapults when using the RRPP Permutational Paradigm we advocate.

1: GLM: ANOVA

Here is a simple example of ANOVA using a linear model:

data(pupfish) # GPA already performed

fit <- procD.lm(coords ~ Sex * Pop, 
                 data = pupfish, iter = 999)
anova(fit)

## 
## Analysis of Variance, using Residual Randomization
## Permutation procedure: Randomization of null model residuals 
## Number of permutations: 1000 
## Estimation method: Ordinary Least Squares 
## Sums of Squares and Cross-products: Type I 
## Effect sizes (Z) based on F distributions
## 
##           Df       SS        MS     Rsq      F      Z Pr(>F)    
## Sex        1 0.015780 0.0157802 0.28012 28.209 4.7773  0.001 ***
## Pop        1 0.009129 0.0091294 0.16206 16.320 4.7097  0.001 ***
## Sex:Pop    1 0.003453 0.0034532 0.06130  6.173 3.7015  0.001 ***
## Residuals 50 0.027970 0.0005594 0.49651                         
## Total     53 0.056333                                           
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Call: procD.lm(f1 = coords ~ Sex * Pop, iter = 999, data = pupfish)

fitm <- manova.update(fit, print.progress = FALSE)
summary(fitm, test = "Pillai")

## 
## Linear Model fit with lm.rrpp
## 
## Number of observations: 54
## Number of dependent variables: 112
## Data space dimensions: 53
## Residual covariance matrix rank: 50
## Sums of Squares and Cross-products: Type I
## Number of permutations: 1000 
## 
##            Df Rand      Pillai    Z          Pr(>Pillai)
## Sex         1 Residuals 0.9466190  4.9765825 0.001      
## Pop         1 Residuals 0.9805254  1.3801816 0.070      
## Sex:Pop     1 Residuals 0.9270745 -0.3379822 0.706      
## Full.Model  3 Residuals 2.8339784  9.6895441 0.001      
## Residuals  50

Pairwise Comparisons: SIMPLE EXAMPLE

group <- interaction(pupfish$Pop, pupfish$Sex)
PW <- pairwise(fit, groups = group)
summary(PW)

## 
## Pairwise comparisons
## 
## Groups: Marsh.F Sinkhole.F Marsh.M Sinkhole.M 
## 
## RRPP: 1000 permutations
## 
## LS means:
## Vectors hidden (use show.vectors = TRUE to view)
## 
## Pairwise distances between means, plus statistics
##                                d  UCL (95%)          Z Pr > d
## Marsh.F:Sinkhole.F    0.03302552 0.03312516  1.6124629  0.055
## Marsh.F:Marsh.M       0.04611590 0.04297378  2.3159417  0.007
## Marsh.F:Sinkhole.M    0.03881514 0.04633821 -0.5421145  0.699
## Sinkhole.F:Marsh.M    0.04605211 0.05506197 -0.2523753  0.597
## Sinkhole.F:Sinkhole.M 0.02568508 0.04364031 -2.2111968  0.993
## Marsh.M:Sinkhole.M    0.02802087 0.03343901  0.1854026  0.420

2: PCMS

Here is a simple example of some common phylogenetic comparative analyses.

**IMPORTANT: Recall that one must have a strict 1:1 match between the data and the phylogeny. The first bit of this code is ‘pre-processing’ to ensure that match (the function `treedata` is one way to accomplish this, as shown below).

library(geiger)

## Loading required package: ape

## Loading required package: phytools

## Loading required package: maps

plethtree <- read.tree('Data/plethtree.tre')

plethtree <- read.tree('Data/plethtree.tre')
plethland <- readland.tps('Data/PlethodonLand.tps',specID = "ID",
                          warnmsg = FALSE)

## 
## No curves detected; all points appear to be fixed landmarks.

gps <- read.csv('Data/PlethGps.csv', header=TRUE, row.names=1)
Y.gpa <- gpagen(plethland, print.progress = FALSE)
M <- mshape(Y.gpa$coords)
size <- Y.gpa$Csize

shape <- Y.gpa$coords
shape.test <- treedata(phy = plethtree, data = two.d.array(shape), warnings = TRUE)
  #no warnings. Everything matches in this case

data.matched <- treedata(phy = plethtree, data = gps, warnings=FALSE)
elev <- as.factor(data.matched$data); names(elev) <- row.names(data.matched$data)

gdf <- geomorph.data.frame(shape = shape, size = size, elev = elev, plethtree = plethtree)

links <- matrix(c(4,3,2,1,1,6,7,8,9,10,1,1,11,5,5,4,2,3,7,8,9,10,11,9,10,1),
                ncol=2, byrow=FALSE)
plot(ladderize(plethtree),edge.width=3)
axisPhylo(1)

Phylogenetic Generalized Least Squares (PLGS): Linear Models

One example:

pgls.reg <- procD.pgls(f1 = shape~size, phy=plethtree, data=gdf, print.progress = FALSE)
summary(pgls.reg)

## 
## Analysis of Variance, using Residual Randomization
## Permutation procedure: Randomization of null model residuals 
## Number of permutations: 1000 
## Estimation method: Generalized Least-Squares (via OLS projection) 
## Sums of Squares and Cross-products: Type I 
## Effect sizes (Z) based on F distributions
## 
##           Df        SS         MS     Rsq      F      Z Pr(>F)  
## size       1 0.0006581 0.00065811 0.07046 3.0323 1.9503  0.028 *
## Residuals 40 0.0086814 0.00021704 0.92954                       
## Total     41 0.0093395                                          
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Call: procD.lm(f1 = shape ~ size, iter = iter, seed = seed, RRPP = TRUE,  
##     SS.type = SS.type, effect.type = effect.type, int.first = int.first,  
##     Cov = Cov, data = data, print.progress = print.progress)

allom.plot <- plot(pgls.reg, type = "regression", predictor = gdf$size,
                   reg.type ="RegScore", pch=19, cex=1.5, xlab = "size") # make sure to have a predictor 
fit.line <- lm(allom.plot$RegScore ~ gdf$size)
abline(fit.line, col = "red")

Phylogenetic Signal

PS.shape <- physignal(A=shape,phy=plethtree,iter=999, print.progress = FALSE)
summary(PS.shape)

## 
## Call:
## physignal(A = shape, phy = plethtree, iter = 999, print.progress = FALSE) 
## 
## 
## 
## Observed Phylogenetic Signal (K): 0.392
## 
## P-value: 0.001
## 
## Based on 1000 random permutations
## 
##  Use physignal.z to estimate effect size.

plot(PS.shape)

Phylogenetic Ordination

Phylomorphospace

PCA <- gm.prcomp(shape, phy=plethtree)
plot(PCA,phylo = TRUE, pch=21, bg=gdf$elev, cex=2, phylo.par = list(tip.labels = FALSE, node.labels = FALSE) )
legend("topleft", pch=21, pt.bg = unique(gdf$elev), legend = levels(gdf$elev))

Phylogenetic PCA (pPCA): With GLS-centered residuals

This does not remove phylogenetic signal.

pPCA <- gm.prcomp(shape, phy=plethtree, GLS = TRUE, transform = FALSE)
plot(pPCA,phylo = TRUE, pch=21, bg=gdf$elev, cex=2, phylo.par = list(tip.labels = FALSE, node.labels = FALSE) )
legend("topleft", pch=21, pt.bg = unique(gdf$elev), legend = levels(gdf$elev))

Phylogenetic PCA (pPCA): With GLS-transformed residuals

This REMOVES phylogenetic signal.

pPCA2 <- gm.prcomp(shape, phy=plethtree, GLS = TRUE, transform = TRUE)
plot(pPCA2,phylo = TRUE, pch=21, bg=gdf$elev, cex=2, phylo.par = list(tip.labels = FALSE, node.labels = FALSE) )
legend("topleft", pch=21, pt.bg = unique(gdf$elev), legend = levels(gdf$elev))

Phylogenetically-Aligned Components Analysis (PACA)

PACA <- gm.prcomp(shape, phy=plethtree, align.to.phy = TRUE)
plot(PACA, phylo = TRUE, pch=21, bg=gdf$elev, cex=2, phylo.par = list(tip.labels = FALSE, node.labels = FALSE) )
legend("topleft", pch=21, pt.bg = unique(gdf$elev), legend = levels(gdf$elev))

Comparing Evolutionary Rate Models

ER<-compare.evol.rates(A=gdf$shape, phy=plethtree,gp=gdf$elev,iter=999, method = 'permutation',print.progress = FALSE)
summary(ER)

## 
## Call:
## 
## 
## Observed Rate Ratio: 1.0133
## 
## P-value: 0.964
## 
## Effect Size: -1.5892
## 
## Based on 1000 random permutations
## 
## The rate for group High is 1.00093880316058e-05  
## 
## The rate for group Low is 1.01425751777744e-05

plot(ER)

EMR <- compare.multi.evol.rates(A=gdf$shape, phy=plethtree, gp=c(rep(1,5),rep(2,6)), print.progress = FALSE)
summary(EMR)

## 
## Call:
## 
## 
## Observed Rate Ratio: 1.0826
## 
## P-value: 0.976
## 
## Effect Size: -1.938
## 
## Based on 1000 random permutations
## 
## The rate for group 1 is 9.67170500545254e-06  
## 
## The rate for group 2 is 1.04710160170649e-05

plot(EMR)

3: Analysis of Symmetry

Here is a simple example of an analysis of symmetry:

Matching Symmetry

data(mosquito)
Y.gpa <- gpagen(mosquito$wingshape, print.progress = FALSE)
plot(Y.gpa)

mosquito.sym <- bilat.symmetry(A = Y.gpa, ind = mosquito$ind, side=mosquito$side, 
                           object.sym = FALSE, print.progress = FALSE)
summary(mosquito.sym)

## 
## Call:
## bilat.symmetry(A = Y.gpa, ind = mosquito$ind, side = mosquito$side,  
##     object.sym = FALSE, print.progress = FALSE) 
## 
## 
## Symmetry (data) type: Matching 
## 
## Type I (Sequential) Sums of Squares and Cross-products
## Randomized Residual Permutation Procedure Used
## 1000 Permutations
## 
## Shape ANOVA
##          Df       SS        MS     Rsq      F       Z Pr(>F)    
## ind       9 0.104888 0.0116542 0.45533 2.7646  4.7037  0.001 ***
## side      1 0.003221 0.0032209 0.01398 0.7641 -0.3558  0.635    
## ind:side 29 0.122249 0.0042155 0.53069                          
## Total    39 0.230358                                            
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Centroid Size ANOVA
##          Df         SS         MS     Rsq      F        Z Pr(>F)
## ind       9 4.1496e-09 4.6107e-10 0.18555 0.7484 -0.42684  0.665
## side      1 3.4740e-10 3.4738e-10 0.01553 0.5638  0.15784  0.471
## ind:side 29 1.7867e-08 6.1609e-10 0.79891                       
## Total    39 2.2364e-08

Object Symmetry

data('lizards')
Y.gpa <- gpagen(lizards$coords, print.progress = FALSE)
plot(Y.gpa)

lizard.sym <- bilat.symmetry(A = Y.gpa, ind = lizards$ind, replicate = lizards$rep,
                           object.sym = TRUE, land.pairs = lizards$lm.pairs, print.progress = FALSE)
summary(lizard.sym)

## 
## Call:
## bilat.symmetry(A = Y.gpa, ind = lizards$ind, replicate = lizards$rep,  
##     object.sym = TRUE, land.pairs = lizards$lm.pairs, print.progress = FALSE) 
## 
## 
## 
## Symmetry (data) type: Object 
## 
## Type I (Sequential) Sums of Squares and Cross-products
## Randomized Residual Permutation Procedure Used
## 1000 Permutations
## 
## Shape ANOVA
##                     Df       SS        MS     Rsq       F       Z Pr(>F)    
## ind                 48 0.236788 0.0049331 0.83194  7.3721 -0.1011  0.536    
## side                 1 0.009432 0.0094317 0.03314 14.0951  3.7540  0.001 ***
## ind:side            48 0.032119 0.0006692 0.11285 10.4367 19.7078  0.001 ***
## ind:side:replicate  98 0.006283 0.0000641 0.02208                           
## Total              195 0.284622                                             
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(lizard.sym, warpgrids = TRUE)

4: Measurement Error

Note that the function, measurement.error is a general function in the RRPP package. It assumes data are in the form of a matrix. One could use the function, gm.measurement.error, in geomorph, which assumes data are in the form of a 3D array, as shape data usually are. The functions do the same analyses. The example below uses measurement.error but the help file for gm.measurement.error shows how the same example could be run with shape data in a 3D array.

Here is a simple example of the analysis of measurement error:

data(fishy)
ME2 <- measurement.error(
  Y = "coords",
  subjects = "subj",
  replicates = "reps",
  groups = "groups",
  data = fishy)

anova(ME2)

## 
## Analysis of Variance, using Residual Randomization
## Permutation procedure: Randomization of null model residuals 
## Number of permutations: 1000 
## Estimation method: Ordinary Least Squares 
## Sums of Squares and Cross-products: Type Within-subject II 
## Effect sizes (Z) based on SNR distributions
## 
##                       Df      SS        MS     Rsq EtaSq.ME    SNR       Z
## Subjects              59 1.18779 0.0201320 0.95788          39.534 17.7458
## Systematic ME          1 0.00303 0.0030348 0.00245  0.08002  0.101  4.6930
## Systematic ME:Groups   2 0.00485 0.0024231 0.00391  0.12778  0.161  6.5764
## Random ME             57 0.03004 0.0005271 0.02423  0.79220               
## Total                119 1.24002                                          
##                      Pr(>SNR)    
## Subjects                0.001 ***
## Systematic ME           0.001 ***
## Systematic ME:Groups    0.001 ***
## Random ME                        
## Total                            
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Call: Y ~ subjects + groups * replicates

ICCstats(ME2, subjects = "Subjects", 
  with_in = "Systematic ME", groups = "groups")

##                       Df      SS        MS     Rsq EtaSq.ME    SNR       Z
## Subjects              59 1.18779 0.0201320 0.95788          39.534 17.7458
## Systematic ME          1 0.00303 0.0030348 0.00245  0.08002  0.101  4.6930
## Systematic ME:Groups   2 0.00485 0.0024231 0.00391  0.12778  0.161  6.5764
## Random ME             57 0.03004 0.0005271 0.02423  0.79220               
## Total                119 1.24002                                          
##                      Pr(>SNR)    
## Subjects                0.001 ***
## Systematic ME           0.001 ***
## Systematic ME:Groups    0.001 ***
## Random ME                        
## Total                            
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## 
## ICC stats based on dispersion of values
## 
##       ICC     ICC_a     ICC_c 
## 0.9441794 0.9444302 0.9483017

P <- plot(ME2)

focusMEonSubjects(P, subjects = 18:20, shadow = TRUE)

5: Things to Explore on Your Own

Remember Days 1 - 3
Perform phylogenetic ANOVA, phylogenetic PLS, and other PCMs.
Visualize shapes from everything!
Plot summaries from these analyses to accompany the statistical findings
Explore function options and output (there are many!)