Population genetics presents us with numerous conundrums, several of which have to do with how the same genomic pattern can be “reached” over evolutionary time through multiple alternative demographic or selective processes. I have discussed several of these issues before (here and here), wherein demography confounds selection or vice versa. Studies that estimate genetic diversity, differentiation, and/or effective population sizes thus need to pay attention to the effects of linked selection before making conclusive statements about the underlying evolutionary history. The issue at hand has to do with demographic inference in the presence of selective sweeps. As expected, selective sweeps (positive selection at a site, and the subsequent reduction in genomic diversity at linked neutral sites) cause a reduction in effective population size, a consequence also characteristic of population bottlenecks. Considering that the inference of effective population size is a key component of most genome-scale studies, Schrider et al. (2016), in a new manuscript, discuss these confounding effects of sweeps on the inference of effective population sizes using three popular model-based inference approaches: ABC, ∂a∂i, and PSMC. Their findings have important implications for how we study genomic diversity and differentiation.
Briefly, using coalescent simulations of 500 unlinked loci and 100 replicate genomes under each of four population histories (constant size, bottleneck, exponential growth, and bottleneck followed by exponential growth), they assess how linked selective sweeps of varying intensity, at increasing genetic distance from the sampled sites, affect genetic diversity (π), Tajima’s D, and the estimates from the three methods above. For inference using PSMC, the authors simulated 100 replicates of 15 Mb genomes under four scenarios: neutral, one recent sweep, three recurrent sweeps, and five sweeps.
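For readers less familiar with the two summary statistics mentioned above, here is a minimal sketch of how π (as the mean number of pairwise differences) and Tajima’s D can be computed from a small set of haplotypes. This is not the authors’ pipeline: the function names and the input format (equal-length strings of '0'/'1', one per sampled chromosome) are assumptions made for illustration, and the constants follow the standard Tajima (1989) definitions.

```python
import math
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences (pi) over all pairs of haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs)


def segregating_sites(haplotypes):
    """Number of alignment columns that carry more than one allele."""
    return sum(len(set(column)) > 1 for column in zip(*haplotypes))


def tajimas_d(haplotypes):
    """Tajima's D; returns NaN when there are no segregating sites."""
    n = len(haplotypes)
    s = segregating_sites(haplotypes)
    if n < 2 or s == 0:
        return math.nan
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between pi and Watterson's estimator, scaled by its standard deviation
    return (nucleotide_diversity(haplotypes) - s / a1) / math.sqrt(
        e1 * s + e2 * s * (s - 1)
    )


# Toy data (hypothetical): an excess of rare variants versus
# an excess of intermediate-frequency variants.
rare_variants = ["0000", "0000", "0000", "1111"]
intermediate_variants = ["0011", "0011", "1100", "1100"]

if __name__ == "__main__":
    for label, sample in (("rare", rare_variants),
                          ("intermediate", intermediate_variants)):
        print(label, nucleotide_diversity(sample), tajimas_d(sample))
```

An excess of rare variants, as expected around a recent sweep or after population growth, pulls D below zero, which is one reason the two processes are hard to tell apart from summary statistics alone.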
Under the bottleneck model, increasing the number of loci affected by sweeps upwardly biased estimates of effective population size from both ∂a∂i and ABC. Similarly, simulations under the population growth model showed a bias towards more recent and faster growth with both methods. Under the contraction-followed-by-growth model, the two methods were again biased, though in different ways. Inference using PSMC indicated that sweeps can considerably influence estimates of population size change, depending on the number of recurrent sweeps over evolutionary time; the variance of the estimates increased with the number of sweeps, thus “dramatically skew”-ing them. Note, however, that this is exactly what one would expect to see when using PSMC in the presence of sweeps: selective sweeps cause drastic reductions in effective population size, which can be confounded with true bottlenecks (see this interesting Twitter conversation over this debate).
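To make the confounding concrete, here is a rough sketch that inverts the standard neutral expectation for diploids, π ≈ 4Nₑμ per site, to get a naive effective population size from observed diversity. The function name, the mutation rate, and the two diversity values are all hypothetical choices for illustration, not numbers from Schrider et al. (2016).

```python
def naive_ne_from_pi(pi_per_site, mu_per_site_per_gen):
    """Invert the neutral diploid expectation pi = 4 * Ne * mu to estimate Ne."""
    if mu_per_site_per_gen <= 0:
        raise ValueError("mutation rate must be positive")
    return pi_per_site / (4.0 * mu_per_site_per_gen)


MU = 1.25e-8          # assumed per-site, per-generation mutation rate
BACKGROUND_PI = 1e-3  # hypothetical diversity far from any sweep
SWEPT_PI = 2e-4       # hypothetical diversity in a region linked to a sweep

if __name__ == "__main__":
    for label, pi in (("background", BACKGROUND_PI), ("swept", SWEPT_PI)):
        print(f"{label}: naive Ne = {naive_ne_from_pi(pi, MU):.0f}")
```

The population has not changed size in this toy example, yet the swept region alone would suggest a much smaller Nₑ, which is the same signal a genuine bottleneck would leave.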
Rightfully so, Schrider et al. (2016) caution scientists about the challenges in “simultaneous estimation of parameters related to natural selection and demographic history”.
Until an approach to obtain accurate estimates of demographic parameters in the face of natural selection is devised, population size histories inferred from population genetic datasets could remain significantly biased.
Schrider, Daniel, Alexander G. Shanku, and Andrew D. Kern. “Effects of linked selective sweeps on demographic inference and model selection.” bioRxiv (2016): 047019. DOI: http://dx.doi.org/10.1101/047019