class: center, middle, inverse, title-slide .title[ # 14: Recent GMM Advances and Prospectus ] .author[ ### ] --- ### Week in Review: Overarching Goal + Our goals for the week were: + 1: Learn how to quantify anatomical shapes using points, curves, and surfaces + 2: Generate a set of shape variables from these data + 3: Statistically evaluate hypotheses of shape variation and covariation using robust statistical methods .pull-left[ + Accomplishing these goals required use of the **Procrustes Paradigm** <img src="LectureData/01.Intro/GenGMProtocol.png" width="90%" style="display: block; margin: auto;" /> ] .pull-right[ + This, then, **must** be coupled with **RRPP and effect size (Z-score) evaluation** for shape analysis: `$$z =\frac{\log(F) - \mu_{\log(F)}}{\sigma_{\log(F)}}$$` ] --- ### Recall the RRPP Procedure `\(^1\)` + For virtually all shape-based hypotheses, we used permutation procedures (RRPP). This is embodied as: .pull-left[ 1: Fit `\(\small\mathbf{X}_{R}\)` for each `\(\small\mathbf{X}_{F}\)`; Estimate `\(\small\hat{\mathbf{Z}}_{R}\)` and `\(\small\mathbf{E}_{R}\)` 2: Permute, `\(\small\mathbf{E}_{R}\)`: obtain pseudo-values as: `\(\small\mathbf{\mathcal{Z}} = \mathbf{\hat{Z}}_{R} + \mathbf{E}_{R}\)` 3: Fit `\(\small\mathbf{X}_{F}\)` using `\(\small\mathbf{\mathcal{Z}}\)`: obtain coefficients and summary statistics 4: Calculate `\(\small{F}\)`-value (or other test statistic) in every random permutation (observed case counts as one permutation) 5: For `\(\small{n}\)` permutations, `\(\small{P} = \frac{n(F_{random} \geq F_{obs})}{n}\)` ] .pull-right[ 6: Calculate *effect size* as a standard deviate of the observed value in a normalized distribution of random values (helps for comparing effects within and between models); i.e.: `$$\small{z} = \frac{ \log\left( F\right) - \mu_{\log\left(F\right)} } { \sigma_{\log\left(F\right)} }$$` where `\(\small\mu_{\log\left(F\right)}\)` and `\(\sigma_{\log\left(F\right)}\)` are the expected value and standard deviation from the sampling distribution, respectively. ] .footnote[ 1: Collyer et al. *Heredity.* (2015); Adams & Collyer. *Evolution.* (2016); Adams & Collyer. *Evolution.* (2018) 2: Important! RRPP is *not* constrained by *n:p* ratios and in fact is quite useful when `\(\small{n}\ll{p}\)` (displays high power in such cases) ] --- ### Procrustes Paradigm and RRPP: Present Applications What we learned this week: Procrustes + RRPP with effect sizes ( `\(Z\)`-scores) provides the tools to evaluate an inordinately large breadth of biological hypotheses related to shape variation. A partial list includes: + 1: General linear model (GLM) questions: + Do groups differ in shape (manova)? + Is there covariation between shape and a continuous variable (regression)? + Is there variation in shape across multiple effects (factorial models)? + Is there an association between shape & `\(\small\mathbf{X}\)` while accounting for phylogeny (PGLS models)? + 2: 'Advanced' GLM questions: + Which groups differ in shape; which slopes differ in terms of shape covariation (pairwise comparisons)? + Do pairs of groups differ in their magnitude or direction of shape change (trajectory analysis)? + Are there differences in the degree of shape disparity (dispersion) among groups? --- ### Procrustes Paradigm and RRPP: Present Applications (Cont.) + 3: Covariation questions: + Does shape covary with another set of variables (partial least squares - PLS)? + Is the degree of covariation greater in one dataset than another (z-score comparisons)? + 4: Phylogenetic questions: + What is the degree of phylogenetic signal in shape? + Does shape covary with other variables while accounting for phylogenetic relatedness (PGLS and P-PLS)? + Do rates of phenotypic shape evolution differ? + 5: GM-specific questions: + Is there shape asymmetry? Is it directional or fluctuating? + For repeated shapes, is measurement error a concern? + Is there modular signal in my data? Integrated signal? + Is the degree of modularity (or integration) greater in one dataset relative to another? + Are there allometric patterns of shape variation? ###### NOTE: A huge advantage of what we learned this week is the visualization of patterns of shape variation (i.e., statistical plots), combined with visualizations of shape deformations (predicted values, group means, etc.) --- ### What's Left to Do? + Given the breadth of topics we covered, one might reasonably ask "In terms of theory, is there anything left to do?" + Can empiricists expect additional novel approaches to be developed going forward, that help biologists address new and important questions? -- + We argue the answer to this question is **YES**!! + Here are some recent '*Up and Coming*' methods from Dean and Mike, as well as our musings on other possible areas of future work --- ### Current 1: Dimensions of Phylogenetic Signal + The degree to which phenotypic similarity associates with phylogenetic relatedness <img src="LectureData/14.prospectus/PhySigConcept.png" width="40%" style="display: block; margin: auto;" /> + For multivariate data, we have `\(K_{mult}=\frac{\mathbf{D}_{\mathbf{Z},E(\mathbf{Z})}^{T}\mathbf{D}_{\mathbf{Z},E(\mathbf{Z})}}{\mathbf{PD}_{\tilde{\mathbf{Z}},0}^{T}\mathbf{PD}_{\tilde{\mathbf{Z}},0}}/\frac{tr(\mathbf\Omega)-N(\mathbf{1}^{T}\mathbf{\Omega1})^{-1}}{N-1}\)` --- ### Current 1: Dimensions of Phylogenetic Signal + 100s of GMM (and other) studies have utilized `\(K_{mult}\)`: .pull-left[ <img src="LectureData/14.prospectus/PhySigWeakHist.png" width="90%" style="display: block; margin: auto;" /> + Such patterns often interpreted as "Significant, but weak phylogenetic signal" ] -- .pull-right[ <img src="LectureData/14.prospectus/PhySigSimul.png" width="90%" style="display: block; margin: auto;" /> + But phylogenetic signal could be "concentrated" in a few dimensions ] + How do we evaluate dimensions of phylogenetic signal in multivariate data? --- ### Current 1: Dimensions of Phylogenetic Signal: Computations + Re-examine `\(K\)` and decompose it into component dimensions$^1$ + Numerator of `\(K_{mult}\)` is ratio of covariance matrices: `$$\frac{tr\left((\mathbf{Z}-E(\mathbf{Z}))^T(\mathbf{Z}-E(\mathbf{Z})\right)}{tr\left((\mathbf{Z}-E(\mathbf{Z}))^T\mathbf\Omega^{-1}(\mathbf{Z}-E(\mathbf{Z}))\right)} = \frac{tr(SSCP_0)}{tr(SSCP_{\Omega})}$$` -- So rather than finding the trace of each component separately, summarize them as: `$$K= \frac{SSCP_0}{SSCP_{\Omega}} = SSCP_{\Omega}^{-1}SSCP_0$$` This is a *relative eigenanalysis* of `\(SSCP_0\)` relative to `\(SSCP_{\Omega}\)` .footnote[1. Mitteroecker, Collyer, Adams. (2024: In Press). *Syst.Biol.*] --- ### Current 1: Dimensions of Phylogenetic Signal: Computations + Perform *relative eigenanalysis* of `\(SSCP_0\)` relative to `\(SSCP_{\Omega}\)` `$$K= \frac{SSCP_0}{SSCP_{\Omega}} = SSCP_{\Omega}^{-1}SSCP_0$$` + Next, decompose as: `\(K=\mathbf{E}\boldsymbol{\Lambda}\mathbf{E}^T\)` + Diagonal elements of `\(\boldsymbol{\Lambda}\)` are eigenvalues + Summarize eigenvalues: `\(tr(K)=\sum\delta_i\)` and `\(det(K)=\prod\delta_i\)` + Evaluate statistically via permutation + Note: `\(H_0\)` is that `\(SSCP_0\)` and `\(SSCP_{\Omega}\)` are proportional, so: `\(H_0 = \delta_1 = \delta_2 = \delta_3...\)` .footnote[1. Mitteroecker, Collyer, Adams. (2024: In Press). *Syst.Biol.*] --- ### Current 1: Dimensions of Phylogenetic Signal: Example I + Papionins display strong and concentrated phylogenetic signal <img src="LectureData/14.prospectus/PhySigHomin.png" width="60%" style="display: block; margin: auto;" /> .footnote[1. Mitteroecker, Collyer, Adams. (2024: In Press). *Syst.Biol.*] --- ### Current 1: Dimensions of Phylogenetic Signal: Example II + Crocodylians display strong and concentrated phylogenetic signal <img src="LectureData/14.prospectus/PhySigCroc.png" width="60%" style="display: block; margin: auto;" /> .footnote[1. Mitteroecker, Collyer, Adams. (2024: In Press). *Syst.Biol.*] --- ### Current 2: Phylogenetic Comparison of Microevolutionary Patterns .pull-left[ Contemporary timescales: evaluate trends within species (microevolution) <img src="LectureData/14.prospectus/Micro.png" width="95%" /> Evaluated via standard (OLS) linear models ] -- .pull-right[ Macroevolutionary timescales: evaluate trends across the phylogeny <img src="LectureData/14.prospectus/Macro.png" width="95%" /> Evaluated via phylogenetic comparative methods ] -- + When comparing intraspecific trends across species, we ***should*** account for phylogenetic nonindependence + How can we do that? --- ### Current 2: PCMs and Multiple Individuals PCMs use 1 individual per species (e.g., species means) - Assumes `\(\sigma^2_{within.spp} << \sigma^2_{among.spp}\)` -- Using species means in presence of `\(\sigma^2_{within.spp}\)` is problematic - Can `\(\uparrow\)` bias in parameter estimates - Can `\(\uparrow\)` type I error, `\(\downarrow\)` power, and `\(\uparrow\)` model misspecification -- Some PCMs can include intraspecific variation (Ives et al. 2007; Felsenstein 2008), but - `\(\sigma^2_{within.spp}\)` is unstructured sampling error around the mean - In other words, assume within-species variation is **random** (i.e., measurement error) What about within-species variation that is not random? How can we account for it (or compare it across species)? --- ### Current 2: Extended Phylogenetic Least Squares (E-PGLS) `\(^1\)` + E-PGLS provide a bridge between micro- and macroevolutionary patterns + Can evaluate cross-species trends of trait covariation (univariate and multivariate) based on multiple individuals per species + Can compare within-species patterns across species while conditioning on the phylogeny + E-PGLS combines: + Hierarchical linear model + Expanded phylogenetic covariance matrix + Novel permutation procedure .footnote[1. Adams and Collyer (In Review)] --- ### Current 2: E-PGLS: Procedure + The E-PGLS model: `\(\mathbf{Z} = \mathbf{X}_S \hat{\mathbf{B}}_S + \mathbf{X}_X \hat{\mathbf{B}}_X + \mathbf{X}_{C_{X}}\hat{\mathbf{B}}_{C_{X}} + \mathbf{E}\)` + Phenotypic data, `\(\mathbf{Z}\)`, is described by a partitioned model, `\(\mathbf{X} = \mathbf{X}_S \vert \mathbf{X}_X \vert \mathbf{X}_{C_X}\)`, with residual error, `\(\mathbf{E}\)`, distributed as `\(\mathbf{E} \sim \mathcal{MN}(0,\boldsymbol{\Omega}_{n})\)` + `\(\boldsymbol{\Omega}_n\)` is an expanded phylogenetic covariance matrix -- .pull-left[ `\(\boldsymbol{\Omega}_n = \mathbf{X}_S\boldsymbol{\Omega}\mathbf{X}_S^T\)` <img src="LectureData/14.prospectus/ExpandedOmegaMatrix.png" width="90%" /> ] -- .pull-right[ Type III SS for `Species` effect, and type II SS for other effects (to hold interspecific trends constant) Special RRPP procedures utilized (RRPP restricted to within species to evaluate other model effects: similar to ME methods) Adams and Collyer (In Review) ] --- ### Current 2: E-PGLS: Simulations .pull-left[ <img src="LectureData/14.prospectus/Fig1.png" width="90%" /> Method emulates existing approaches when within-species variation is random ] -- .pull-right[ <img src="LectureData/14.prospectus/Fig2a.png" width="65%" /><img src="LectureData/14.prospectus/Fig2b.png" width="65%" /> Method has appropriate type I error, power and empirical sampling distributions ] .footnote[1. Adams and Collyer (In Review)] --- ### Current 2: E-PGLS: Example (pupfish SShD) <img src="LectureData/14.prospectus/Fig3.png" width="80%" /> .footnote[1. Adams and Collyer (In Review)] --- ### Future 1: An Improved Procrustes Algorithm + Recall the workhorse of geometric morphometrics: the Generalized Procrustes Analysis (superimposition) + Translate all specimens to a common location + Scale all specimens to unit centroid size + Optimally rotate all specimens to minimize LS deviations <img src="14-Prospectus_files/figure-html/unnamed-chunk-13-1.png" width="35%" style="display: block; margin: auto;" /> --- ### Future 1: An Improved Procrustes Algorithm .pull-left[ + Mathematically, GPA assumes that all landmarks are 'iid' (independent, and identically distributed) in terms of their error distribution + This implies that the values for each landmark of each specimen are derived from a circular normal distribution, which is independent landmark by landmark + Such an assumption is unrealistic ] -- .pull-right[ + For example, recall the 'Pinocchio effect' <img src="LectureData/03.superimposition/GRF-Example.png" width="80%" style="display: block; margin: auto;" /> ] -- + Here, some landmarks display greater variation than do others (and GRF was proposed to 'account' for this, via an ad-hoc procedure) + Also, consider the possibility that some landmarks are correlated with others (i.e., landmark changes are correlated) + Neither of these conditions is considered by GPA, as it is OLS (=unweighted) --- ### Future 1: Towards A Weighted Procrustes Algorithm + To this point (and for the past 25 years), we have conducted superimposition using GPA, which is an unweighted alignment + That is, one translates, rotates, and scales, using mean values + As implemented (and as described this week), these steps assume that all landmarks contribute equally during the alignment. + This is tantamount to including a `\(\mathbf{I}_{pk}\)` identity matrix within the GPA algorithm -- + One could consider a 'weighted GPA' where the translation, rotation, and scale are performed relative to a landmark-coordinate covariance matrix (much like one performs OLS = unweighted regression versus PGLS = phylogenetically-weighted regression) + The algebra for this is well-established statistically, but the 'weighting' covariance matrix in this case is highly singular, causing mathematical difficulties --- ### Future 1: Towards A Weighted Procrustes Algorithm .pull-left[ + Some approximations have been proposed (e.g., Theobald and Wuttke 2006) <img src="LectureData/14.prospectus/WGPATheobald.png" width="80%" style="display: block; margin: auto;" /> ] .pull-right[ + Here (with simplifying assumptions), one can align protein structures allowing for different variation among landmarks, but with equal variation within (i.e., using a partitioned `\(\hat{\mathbf\Sigma}=\mathbf\Sigma_p\otimes\Sigma_k\)`) + We are currently developing a general algorithm for weighted GPA that relaxes all of these assumptions, and where the 'standard' GPA approach would be a special case algorithm ] --- ### Future 2: Integration and Modularity: A Spatial Perspective + The past decade has seen a wonderful resurgence of interest in modularity and integration + This has lead to the development of numerous approaches for *quantifying* such patterns -- + Recent work (Bookstein 2016) highlighted the need to explore integration across spatial scales + This begs the question: should other (all?) patterns of integration and modularity among sub-units be evaluated relative to some 'null' expectation beyond that of the empirical sampling distribution derived from permuting landmarks into modules? -- + We contend that this is an important issue for future investigation + Our perspective is to consider the **spatial** proximity of landmarks in developing a null model for integration and modularity tests --- ### Future 2: Spatial Integration and Modularity .pull-left[ + In ecology, data obtained from different geographic localities is expected to covary due to their *spatial proximity* + In other words, geographically proximate locations are expected to be more similar, and covary more highly, than are geographically less proximate locations <img src="LectureData/14.prospectus/SpatialCov.png" width="50%" style="display: block; margin: auto;" /> ] -- .pull-right[ + For morphometrics, **anatomical proximity** is akin to geographic proximity in spatial statistics + We contend that null models of integration and modularity should be **conditioned** on anatomical proximity; analogously to spatial proximity issues <img src="LectureData/14.prospectus/Mouse.png" width="65%" style="display: block; margin: auto;" /> + We are presently developing integration and modularity methods that evaluate patterns relative to this anatomical proximity ] --- ### Future 2: Spatial Integration and Modularity .pull-left[ + There are additional challenges with integration and modularity + **Visualizing modularity** .center[ <img src="LectureData/14.prospectus/modulemap.png" width="30%" /> **Module Maps** ] ] .pull.right[ .center[ <img src="LectureData/14.prospectus/mod_covs.png" width="30%" /><img src="LectureData/14.prospectus/Burns_etal_eigen.png" width="30%" /> **Module Eigenanalysis** ]] ###### Burns et al. (2023) *Evolution*. --- ### Future 2: Spatial Integration and Modularity .pull-left[ + There are additional challenges with integration and modularity + Visualizing modularity + **Alternative modular hypotheses** .remark-code[It is possible to simulate many partitionings of the same covariance matrix into *K* modules. Performing **module eigenanalysis** allows a distribution of modular strength to be made. One can evaluate the strength of modularity in this distribution. ]] .pull-right[ .center[ <img src="LectureData/14.prospectus/mod_techniques.png" width="90%" /> **K-module analysis** ]] --- ### Future 3: Multivariate Model Comparison + The goal is to evaluate two or more models with different sets of parameters to determine if one is 'better' based on criteria quantifying the *fit* of the data to those models. + Procedurally, model fit is often obtained using likelihood: `$$\mathcal{L}(\hat{\beta}|\textbf{y}) = \frac{e^{-\frac{1}{2}( \textbf{y} - \mathbf{X}\hat\beta)^T\hat{\boldsymbol{\Sigma}}^{-1} ( \textbf{y} - \mathbf{X}\hat\beta)}}{\sqrt{2\pi^{np}|\hat{\boldsymbol{\Sigma}}|}}$$` + Likelihood ratio tests are then performed to compare models `$$-2\lambda_{LR}= -2*\frac{\mathcal{L}(\hat{\beta}_0|\textbf{y})}{\mathcal{L}(\hat{\beta}_1|\textbf{y})} \sim\chi^2$$` -- + LRT-based hypothesis testing is what we've used throughout this workshop to evaluate linear models + Possible shortcomings include: + Model overfitting (improper application can lead one to infer that an overly complicated model is best) + LRT is commonly not used for non-nested models (though it can be: Vuong 1989; Lewis et al. 2011) --- ### Future 3: Multivariate Model Comparison + Models are also compared using indexing measures of penalized likelihood (e.g., AIC). `$$AIC = -2\mathcal{L}(\hat{\beta}|\textbf{y}) + 2[pk+0.5p(p+1)]$$` + Lower AIC values represent preferred models, with a cut-off typically stated as `\(\small\Delta_{AIC} > 4\)` (for *univariate* data) -- + AIC (and similar) measures represent a 'balance' between model fit and the number of parameters used + They are related to Kullback–Leibler information theory, and are a sort of 'distance' + Note that for multivariate data, the parameter penalty can become quite 'heavy') --- ### Model Comparison via RRPP-Based Effect Sizes + One can combine the strengths of LRT and AIC via RRPP-based effect sizes + For each candidate model, `\(\mathcal{L}(\hat{\beta}|\textbf{y})\)` can be obtained + RRPP may be used to generate an empirical sampling distribution of `\(\mathcal{L}(\hat{\beta}|\textbf{y})\)` + An effect size, `\(Z\)`-score, may be calculated for each model -- + Larger `\(Z\)`-scores imply a stronger strength of signal for that model + This is akin to a 'distance' of the candidate model relative to an intercept model: `\(Z \sim 1\)` + The model with the largest `\(Z\)`-score is the best model + NOTE: one can also perform two-sample comparisons of effect sizes (sensu Adams and Collyer 2016; 2019) to compare models --- ### Future 3: Multivariate Model Comparison: Example + Here is a simple (univariate: snake head size: CS) example using AIC: .pull-left[ ``` r f.0 <- lm.rrpp(hs ~ 1, data = rdf) f.svl <- lm.rrpp(hs ~ svl, data = rdf) f.sex <- lm.rrpp(hs ~ sex, data = rdf) f.reg <- lm.rrpp(hs ~ region, data = rdf) f.svl.reg <- lm.rrpp(hs ~ svl + region, data = rdf, SS.type = "II") f.svl.by.reg <- lm.rrpp(hs ~ svl * region, data = rdf, SS.type = "II") ``` ] .pull-right[ ``` r model.comparison(f.0, f.svl, f.sex, f.reg, f.svl.reg, f.svl.by.reg, type = "logLik") ``` ``` ## logLik residual.pc.no penalty AIC ## 1 -709.5568 1 4 1423.114 ## svl -705.0201 1 6 1416.040 ## sex -709.5201 1 6 1425.040 ## region -706.2789 1 8 1420.558 ## svl + region -701.5571 1 10 1413.114 ## svl + region + svl:region -700.2760 1 14 1414.552 ``` ] + Based on AIC, the best model is `\(\small{Size}={Sex}+{Region}\)` --- ### Future 3: Multivariate Model Comparison: Example + Using likelihood ratio tests (anova), this model is also preferred .scrollable[ ``` r anova(f.svl.by.reg) ``` ``` ## ## Analysis of Variance, using Residual Randomization ## Permutation procedure: Randomization of null model residuals ## Number of permutations: 1000 ## Estimation method: Ordinary Least Squares ## Sums of Squares and Cross-products: Type II ## Effect sizes (Z) based on F distributions ## ## Df SS MS Rsq F Z Pr(>F) ## svl 1 286392 286392 0.07945 9.5450 2.48962 0.003 ** ## region 2 207559 103779 0.05758 3.4588 1.75000 0.040 * ## svl:region 2 73442 36721 0.02038 1.2238 0.52435 0.314 ## Residuals 101 3030455 30005 0.84074 ## Total 106 3604504 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Call: lm.rrpp(f1 = hs ~ svl * region, SS.type = "II", data = rdf) ``` ] --- ### Future 3: Multivariate Model Comparison: Example + And now let's look at the effect sizes of the models: ``` r model.comparison(f.0, f.svl, f.sex, f.reg, f.svl.reg, f.svl.by.reg, type = "Z") ``` ``` ## Z ## 1 -0.2679764 ## svl 2.3502504 ## sex -0.8901494 ## region 1.7070038 ## svl + region 2.8300695 ## svl + region + svl:region 2.6081491 ``` + Again in this case, the model with the strongest signal is `\(\small{Size}={Sex}+{Region}\)` + **Clearly, there is considerable work left to do to verify this new theoretical procedure, but the initial concept of using `\(Z\)`-scores from RRPP for model comparison is promising!** *Stay tuned...*