Ryan N Gutenkunst
Work Summary
We learn history from the genomes of humans, tumors, and other species. Our studies reveal how evolution works at the molecular level, offering fundamental insight into how humans and pathogens adapt to challenges.
PMID: 19851460; PMCID: PMC2760211; Abstract:
Demographic models built from genetic data play important roles in illuminating prehistorical events and serving as null models in genome scans for selection. We introduce an inference method based on the joint frequency spectrum of genetic variants within and between populations. For candidate models we numerically compute the expected spectrum using a diffusion approximation to the one-locus, two-allele Wright-Fisher process, involving up to three simultaneous populations. Our approach is a composite likelihood scheme, since linkage between neutral loci alters the variance but not the expectation of the frequency spectrum. We thus use bootstraps incorporating linkage to estimate uncertainties for parameters and significance values for hypothesis tests. Our method can also incorporate selection on single sites, predicting the joint distribution of selected alleles among populations experiencing a bevy of evolutionary forces, including expansions, contractions, migrations, and admixture. We model human expansion out of Africa and the settlement of the New World, using 5 Mb of noncoding DNA resequenced in 68 individuals from 4 populations (YRI, CHB, CEU, and MXL) by the Environmental Genome Project. We infer divergence between West African and Eurasian populations 140 thousand years ago (95% confidence interval: 40-270 kya). This is earlier than other genetic studies, in part because we incorporate migration. We estimate the European (CEU) and East Asian (CHB) divergence time to be 23 kya (95% c.i.: 17-43 kya), long after archeological evidence places modern humans in Europe. Finally, we estimate divergence between East Asians (CHB) and Mexican-Americans (MXL) of 22 kya (95% c.i.: 16.3-26.9 kya), and our analysis yields no evidence for subsequent migration. 
Furthermore, combining our demographic model with a previously estimated distribution of selective effects among newly arising amino acid mutations accurately predicts the frequency spectrum of nonsynonymous variants across three continental populations (YRI, CHB, CEU).
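The composite likelihood idea in the abstract above can be illustrated with a small, single-population sketch. It is not the ∂a∂i API and does not solve the diffusion equation: it assumes the textbook result that the expected neutral spectrum at constant population size is theta / i, and it treats each spectrum entry as an independent Poisson count. The function names and the observed counts are hypothetical, chosen only for illustration.

```python
import math

def expected_neutral_sfs(n, theta):
    """Expected number of sites with derived-allele count i = 1..n-1
    in a sample of n chromosomes, under the standard neutral model
    at constant population size: theta / i."""
    return [theta / i for i in range(1, n)]

def poisson_composite_loglik(model, data):
    """Composite log-likelihood: each spectrum entry is treated as an
    independent Poisson count whose mean is the model expectation."""
    ll = 0.0
    for m, d in zip(model, data):
        ll += d * math.log(m) - m - math.lgamma(d + 1)
    return ll

def best_theta(data):
    """Maximum-likelihood theta for the theta / i model:
    total segregating sites divided by the harmonic number."""
    n = len(data) + 1
    harmonic = sum(1.0 / i for i in range(1, n))
    return sum(data) / harmonic

# Made-up spectrum for a sample of 6 chromosomes (illustration only).
observed = [52, 24, 19, 11, 10]
theta_hat = best_theta(observed)
model = expected_neutral_sfs(len(observed) + 1, theta_hat)
loglik = poisson_composite_loglik(model, observed)
print(f"theta_hat = {theta_hat:.2f}, composite log-likelihood = {loglik:.2f}")
```

Because linked sites are not independent, a likelihood of this form gives reasonable point estimates but understates uncertainty, which is why the abstract relies on bootstraps for confidence intervals.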
PMID: 23555251; PMCID: PMC3605146; Abstract:
Secondary bacterial infections are a leading cause of illness and death during epidemic and pandemic influenza. Experimental studies suggest a lethal synergism between influenza and certain bacteria, particularly Streptococcus pneumoniae, but the precise processes involved are unclear. To address the mechanisms and determine the influences of pathogen dose and strain on disease, we infected groups of mice with either the H1N1 subtype influenza A virus A/Puerto Rico/8/34 (PR8) or a version expressing the 1918 PB1-F2 protein (PR8-PB1-F2(1918)), followed seven days later with one of two S. pneumoniae strains, type 2 D39 or type 3 A66.1. We determined that, following bacterial infection, viral titers initially rebound and then decline slowly. Bacterial titers rapidly rise to high levels and remain elevated. We used a kinetic model to explore the coupled interactions and study the dominant controlling mechanisms. We hypothesize that viral titers rebound in the presence of bacteria due to enhanced viral release from infected cells, and that bacterial titers increase due to alveolar macrophage impairment. Dynamics are affected by initial bacterial dose but not by the expression of the influenza 1918 PB1-F2 protein. Our model provides a framework to investigate pathogen interaction during coinfections and to uncover dynamical differences based on inoculum size and strain. © 2013 Smith et al.
Characterizing patterns of genetic variation within and among human populations is important for understanding human evolutionary history and for careful design of medical genetic studies. Here, we analyze patterns of variation across 443,434 single nucleotide polymorphisms (SNPs) genotyped in 3845 individuals from four continental regions. This unique resource allows us to illuminate patterns of diversity in previously under-studied populations at the genome-wide scale including Latin America, South Asia, and Southern Europe. Key insights afforded by our analysis include quantifying the degree of admixture in a large collection of individuals from Guadalajara, Mexico; identifying language and geography as key determinants of population structure within India; and elucidating a north-south gradient in haplotype diversity within Europe. We also present a novel method for identifying long-range tracts of homozygosity indicative of recent common ancestry. Application of our approach suggests great variation within and among populations in the extent of homozygosity, suggesting both demographic history (such as population bottlenecks) and recent ancestry events (such as consanguinity) play an important role in patterning variation in large modern human populations.
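To make the notion of a long tract of homozygosity concrete, here is a deliberately naive sketch. It is not the method presented in the abstract above, whose details are not given here. It assumes genotypes are coded as alternate-allele counts (0, 1, or 2, with 1 meaning heterozygous) at sorted base-pair positions, and it reports only uninterrupted homozygous runs that span at least a chosen length. The function name, the coding scheme, and the example data are all hypothetical.

```python
def homozygous_tracts(genotypes, positions, min_length):
    """Return (start_bp, end_bp) for every maximal run of homozygous
    genotypes whose physical span is at least min_length base pairs.

    genotypes: alternate-allele counts per SNP (0 or 2 = homozygous,
               1 = heterozygous); an assumed coding for this sketch.
    positions: base-pair coordinate of each SNP, sorted ascending.
    """
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g != 1:
            if start is None:
                start = i          # a new homozygous run begins here
        elif start is not None:
            runs.append((start, i - 1))  # a heterozygote ends the run
            start = None
    if start is not None:
        runs.append((start, len(genotypes) - 1))
    return [
        (positions[s], positions[e])
        for s, e in runs
        if positions[e] - positions[s] >= min_length
    ]

# Made-up genotypes for one individual (illustration only).
genotypes = [0, 2, 0, 1, 0, 0, 2, 2, 1, 0]
positions = [100, 200, 300, 400, 500, 600, 700, 2000, 2100, 2200]
print(homozygous_tracts(genotypes, positions, min_length=1000))
```

A practical method would also need to tolerate genotyping errors and missing calls and to account for allele frequencies and SNP density, none of which this sketch attempts.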
Evolutionary biology often seeks to decipher the drivers of speciation, and much debate persists over the relative importance of isolation and gene flow in the formation of new species. Genetic studies of closely related species can assess whether gene flow was present during speciation, because signatures of past introgression often persist in the genome. We test hypotheses on which mechanisms of speciation drove diversity among three distinct lineages of desert tortoise in the genus Gopherus. These lineages offer a powerful system to study speciation, because different biogeographic patterns (physical vs. ecological segregation) are observed at opposing ends of their distributions. We use 82 samples collected from 38 sites, representing the entire species' distribution, and generate sequence data for mtDNA and four nuclear loci. A multilocus phylogenetic analysis in *BEAST estimates the species tree. RNA-seq data yield 20,126 synonymous variants from 7665 contigs from two individuals of each of the three lineages. Analyses of these data using the demographic inference package ∂a∂i serve to test the null hypothesis of no gene flow during divergence. The best-fit demographic model for the three taxa is concordant with the *BEAST species tree, and the ∂a∂i analysis does not indicate gene flow among any of the three lineages during their divergence. These analyses suggest that divergence among the lineages occurred in the absence of gene flow, and that in this scenario the genetic signature of ecological isolation (parapatric model) cannot be differentiated from that of geographic isolation (allopatric model).