Statistical analyses
To investigate how the pairwise log response ratio of the mean population parameters was affected by geographic and macroclimatic distance between populations, system type (island versus mainland) and taxonomy, we fitted Bayesian phylogenetic mixed models using the MCMCglmm package (Hadfield 2010).
We ran two general models, one for phenotypic variability and one for genetic diversity. For both, the model structure was: log ratio (phenotypic trait or genetic diversity) ~ system type (mainland vs. island) + log10(geographic distance) + kingdom (plant vs. animal) + macroclimatic distance + system type : log10(geographic distance) + system type : macroclimatic distance. The models included phylogeny, study ID, and the response variable type (e.g. size, heterozygosity; 16 levels for genetic diversity and 7 levels for phenotypic variability, see Fig. S2.2 b, c) as random intercepts.
Our models accounted for two issues: pseudoreplication arising from pairwise comparisons between populations, and the phylogenetic structure of the data. If a population is represented in more than one pairwise comparison, using the full set of pairwise combinations for any group of populations results in pseudoreplication. To avoid this, we drew random pairwise comparisons between populations without replacement, creating datasets in which each population is represented only once. For example, for a system with three island populations, each dataset would include only one pairwise comparison, so that no population is represented more than once. To capture the full set of possible pairwise comparisons, we created 100 pairwise datasets, and each was then used to independently test our hypotheses. To account for the shared evolutionary history of species, phylogeny was included in the MCMCglmm model as a random effect (Hadfield 2010). Rather than using a single phylogenetic tree and assuming no error in its topology or branch lengths, we created a distribution of 100 phylogenies from various sources that incorporated the uncertainty associated with building phylogenetic trees (Supporting material S4).
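The pairing procedure described above can be illustrated with a minimal Python sketch. This is not the authors' code (the analyses were run with MCMCglmm); the function names, the grouping of populations by system, and the example population labels are all hypothetical, and the sketch assumes that comparisons are drawn within each system and that an unpaired population is simply left out of that replicate.

```python
import random

def random_disjoint_pairs(populations, rng):
    """Shuffle the populations and pair neighbours, so each appears at most once."""
    shuffled = list(populations)
    rng.shuffle(shuffled)
    # With an odd number of populations, the last one stays unpaired in this replicate.
    return [(shuffled[i], shuffled[i + 1]) for i in range(0, len(shuffled) - 1, 2)]

def build_replicates(systems, n_replicates=100, seed=0):
    """Create n_replicates datasets of non-overlapping pairwise comparisons."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_replicates):
        dataset = []
        for system_id, populations in systems.items():
            for pop_a, pop_b in random_disjoint_pairs(populations, rng):
                dataset.append((system_id, pop_a, pop_b))
        replicates.append(dataset)
    return replicates

# Hypothetical example: one system with three populations, one with four.
systems = {
    "island_system_A": ["pop1", "pop2", "pop3"],
    "mainland_system_B": ["pop4", "pop5", "pop6", "pop7"],
}
replicates = build_replicates(systems)
```

In this example, the three-population system contributes one comparison per dataset and the four-population system contributes two, while different replicates sample different pairings.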
To account for both pairwise pseudoreplication and phylogenetic uncertainty, we ran 100 MCMCglmm models using the mulTree package (Guillerme and Healy 2014), with each run associated with an independent pairwise dataset and a randomly drawn phylogeny. Because posterior samples from separate MCMC runs can be combined, we built the coefficient distributions by pooling the posterior distribution of each coefficient across all runs.
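Pooling posterior samples across runs amounts to concatenating the draws for each coefficient and summarising the combined sample. The sketch below illustrates this with simulated draws; it is not the authors' code, and the helper names, the number of draws per run, and the simulated values are assumptions made purely for illustration. The interval shown is a simple quantile-based interval, which may differ from the summary used in the study.

```python
import random
import statistics

def pool_posteriors(runs):
    """Concatenate posterior draws of one coefficient across all model runs."""
    pooled = []
    for draws in runs:
        pooled.extend(draws)
    return pooled

def quantile_interval(draws, level=0.95):
    """Equal-tailed interval taken from the sorted draws."""
    ordered = sorted(draws)
    n = len(ordered)
    lower = ordered[int((1 - level) / 2 * (n - 1))]
    upper = ordered[int((1 + level) / 2 * (n - 1))]
    return lower, upper

# Simulated stand-in for 100 runs of 1000 posterior draws each (hypothetical values).
rng = random.Random(1)
runs = []
for _ in range(100):
    centre = 0.2 + rng.uniform(-0.05, 0.05)  # each run differs slightly
    runs.append([rng.gauss(centre, 0.1) for _ in range(1000)])

pooled = pool_posteriors(runs)
pooled_mean = statistics.mean(pooled)
lower, upper = quantile_interval(pooled)
```

Because every run contributes the same number of draws here, each pairwise dataset and phylogeny carries equal weight in the pooled distribution.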
The general phenotypic variability model included 43 species (7 plants, 36 animals) and the general genetic diversity model included 71 species (30 plants, 41 animals). Because the number of populations studied differed among species, each replicate model included a different number of pairwise measures between populations, ranging from approximately 1610 to 1640 for the genetic diversity model and from approximately 1070 to 1190 for the phenotypic variability model.
To assess the robustness of our results, we ran a series of additional models for both the phenotypic variability and neutral genetic diversity datasets, each exploring a different limitation of our data:
i) Zero values are common in measures of genetic diversity and biological phenotypes (e.g. lack of polymorphic loci in a population), and log ratios cannot be calculated if either value is zero; these values were therefore dropped from the main models (and from the results presented in the main text). To test the effect of zero values on our analyses, we ran separate “zero-adjusted” general models for both phenotypic variability and neutral genetic diversity, in which we added 10% of the mean of the respective variable to all individual measurements.
ii) To explore the extent to which the general models were influenced by the response variables most frequently represented in the database, we ran separate models on the two most commonly measured variables in the dataset: body (or body part) size and heterozygosity. These models were fitted following the same method as the general models but with one random term (the response variable type) removed.
iii) Finally, as macroclimatic and geographic distances were correlated (Fig. S6.3), we also ran each of the main models with either macroclimatic distance or geographic distance excluded.
The models built this way could not accommodate non-neutral genetic differentiation between populations, because the unit of observation in our study was the population. Likewise, the models did not incorporate existing models of population variability developed specifically for island systems, such as those based on island size or the distance of islands from the mainland; these predictors have no straightforward equivalent in mainland systems and fell beyond the scope of this analysis.
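The zero adjustment used in the zero-adjusted models described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names and example values are hypothetical, and the sketch assumes that the mean is taken over all measurements of a given variable and that the log ratio is a natural logarithm of the ratio of two population values.

```python
import math

def zero_adjust(values, fraction=0.10):
    """Add a fraction (here 10%) of the variable's mean to every measurement."""
    offset = fraction * (sum(values) / len(values))
    return [v + offset for v in values]

def log_ratio(value_a, value_b):
    """Log response ratio between two populations (natural log assumed)."""
    return math.log(value_a / value_b)

# Hypothetical heterozygosity values for four populations; the first is zero.
heterozygosity = [0.0, 0.12, 0.30, 0.18]
adjusted = zero_adjust(heterozygosity)

# Undefined before adjustment (division by zero), defined afterwards.
ratio_after_adjustment = log_ratio(adjusted[1], adjusted[0])
```

The adjustment shifts every measurement by the same amount, so comparisons that previously involved a zero become computable, at the cost of slightly shrinking all other log ratios towards zero.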
The structure of all models together with the number of species and corresponding pairwise population measurements is presented in Supporting material S5.