Statistical analyses
To investigate how the pairwise log response ratio of the mean
population parameters was affected by geographic and macroclimatic
distance between populations, system type (island versus mainland) and
taxonomy, we fitted Bayesian phylogenetic mixed models using the
MCMCglmm package (Hadfield 2010).
We ran two general models, corresponding to phenotypic variability and
genetic diversity respectively. For these general models, the model
structure was: log_ratio(phenotypic trait or genetic diversity) ~
factor(mainland vs. island) + log10(geographic distance) +
Kingdom (plant vs. animal) + macroclimatic distance +
interaction (mainland vs. island) : log10(geographic distance) +
interaction (mainland vs. island) : macroclimatic distance.
The models included phylogeny, study
ID, and the response variable type (e.g. size, heterozygosity, totalling
16 levels for genetic diversity and 7 levels for phenotype variability,
see Fig. S2.2 b, c) as random intercepts. Our models accounted for
potential pseudoreplication issues associated with the process of
pairwise comparison across populations and the phylogenetic structure of
the data. If a population was represented in more than one pairwise
comparison, using the full set of pairwise combinations for any group of
populations would result in pseudoreplication. To avoid this, we used
random pairwise comparisons between populations, drawn without replacement, to
create datasets in which each population was represented only once. For
example, for comparisons in a system with three island populations, each
dataset would only include one pairwise comparison to avoid any given
population being represented more than once. To capture the full set of
possible pairwise comparisons, we created 100 pairwise datasets, and
each was then used to independently test our hypotheses. To ensure that
the results were not due to the evolutionary history of species,
phylogeny was included in the MCMCglmm model as a random effect
(Hadfield 2010). Rather than using one phylogenetic tree and assuming no
error in the tree structure or branch length, we created a distribution
of 100 phylogenies from various sources that incorporated the errors
associated with building phylogenetic trees (Supporting material S4). As
a result of accounting for pairwise pseudoreplication and phylogenetic
uncertainty, we ran 100 MCMCglmm models as described in the mulTree
package (Guillerme and Healy 2014), with each separate run associated
with an independent pairwise dataset and a random phylogeny. As the
posterior outputs of MCMC models are combinable, coefficient
distributions were created by amalgamating coefficient posterior
distributions from all runs.
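The resampling and pooling steps described above can be illustrated with a minimal Python sketch. It is not the authors' code (the analysis itself used MCMCglmm and mulTree in R). The function names, system labels and population IDs are hypothetical, and the sketch assumes that pairs are drawn within each system and that each model run returns a list of posterior draws per coefficient.

```python
import random
from collections import defaultdict


def draw_pairs(populations, rng):
    """Randomly pair populations within one system without replacement.

    Each population appears in at most one pair; with an odd number of
    populations, one is left unpaired in this draw.
    """
    shuffled = list(populations)
    rng.shuffle(shuffled)
    return [(shuffled[i], shuffled[i + 1])
            for i in range(0, len(shuffled) - 1, 2)]


def make_pairwise_datasets(systems, n_datasets=100, seed=0):
    """Build n_datasets independent random pairings across all systems.

    `systems` maps a (hypothetical) system label to its population IDs.
    """
    rng = random.Random(seed)
    return [
        {name: draw_pairs(pops, rng) for name, pops in systems.items()}
        for _ in range(n_datasets)
    ]


def pool_posteriors(runs):
    """Concatenate posterior draws per coefficient across model runs."""
    pooled = defaultdict(list)
    for run in runs:
        for coefficient, draws in run.items():
            pooled[coefficient].extend(draws)
    return dict(pooled)


# Made-up example: one system with three populations, one with four.
systems = {
    "island_system_A": ["A1", "A2", "A3"],
    "mainland_system_B": ["B1", "B2", "B3", "B4"],
}
datasets = make_pairwise_datasets(systems, n_datasets=100, seed=1)
```

In this sketch, the three-population system contributes a single pair to each dataset, matching the example given in the text. Each of the 100 datasets would then be fitted with one randomly drawn phylogeny before the posterior draws are pooled.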
The general phenotypic variability model included 43 species (7 plants,
36 animals) and the general genetic diversity model included 71 species
(30 plants, 41 animals). Because the number of populations studied
differed among species, the number of pairwise measures between
populations varied among replicate models, ranging from approximately
1610 to 1640 for the genetic diversity models and from approximately
1070 to 1190 for the phenotypic variability models.
To assess the robustness of our results, we ran a series of additional
models for both the phenotypic variability and neutral genetic diversity
datasets, each exploring different limiting aspects of our data:
i) As zero values are common in measures of genetic diversity and
biological phenotypes (e.g., lack of polymorphic loci in a population),
and log ratio values cannot be calculated if any values are zero, these
values were dropped from the main models (and from the results presented
in the main text). To test the effect of zero values on our analyses, we
ran separate, “zero-adjusted” general models for both phenotypic
variability and neutral genetic diversity, in which we added 10% of the
mean of the respective variable to all individual measurements.
ii) To explore the extent to which the general models were
influenced by the response variables more frequently represented in the
database, we ran separate models on the two most commonly measured
variables in the dataset: body (or body part) size and heterozygosity.
These models were fitted following the same method as the general models
but had one random term (the response variable type) removed. Finally,
as macroclimatic and geographical distances were correlated
(Fig. S6.3), we also ran each of the main models with either
macroclimatic distance or geographical distance excluded. Because the
unit of observation in our study was the population, the models built
this way could not accommodate non-neutral genetic differentiation
between populations. Likewise, the models did not incorporate existing
models of population variability developed specifically for island
systems, such as the effects of island size or distance of islands from
the mainland; these are difficult to apply to mainland systems and fell
beyond the scope of this analysis.
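The zero adjustment described in (i) can be illustrated with a short Python sketch. It makes two assumptions that this section does not state: that the log response ratio is the natural log of the ratio of two population values, and that the mean is taken over all measurements of a given variable. The function names and data values are hypothetical.

```python
import math


def log_response_ratio(a, b):
    """Natural-log ratio of two positive population values."""
    if a <= 0 or b <= 0:
        raise ValueError("log ratio is undefined for zero or negative values")
    return math.log(a / b)


def zero_adjust(values, fraction=0.10):
    """Add a fraction (default 10%) of the variable's mean to every value."""
    offset = fraction * sum(values) / len(values)
    return [v + offset for v in values]


# Made-up heterozygosity values; the first population has a true zero.
heterozygosity = [0.0, 0.2, 0.4]
adjusted = zero_adjust(heterozygosity)

# The ratio is undefined on the raw data but defined after adjustment.
ratio = log_response_ratio(adjusted[1], adjusted[0])
```

Without the adjustment, any pair that includes the first population would have to be dropped, as was done in the main models.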
The structure of all models together with the number of species and
corresponding pairwise population measurements is presented in
Supporting material S5.