Ryan N Gutenkunst

Associate Department Head, Molecular and Cellular Biology
Associate Professor, Applied BioSciences - GIDP
Associate Professor, Applied Mathematics - GIDP
Associate Professor, Cancer Biology -
Associate Professor, Ecology and Evolutionary Biology
Associate Professor, Genetics - GIDP
Associate Professor, Molecular and Cellular Biology
Associate Professor, Public Health
Associate Professor, Statistics-GIDP
Associate Professor, BIO5 Institute
Member of the Graduate Faculty
Director, Graduate Studies
Primary Department
Contact
(520) 626-0569

Work Summary

We learn history from the genomes of humans, tumors, and other species. Our studies reveal how evolution works at the molecular level, offering fundamental insight into how humans and pathogens adapt to challenges.

Research Interest

The Gutenkunst group studies the function and evolution of the complex molecular networks that comprise life. To do so, they integrate computational population genomics, bioinformatics, and molecular evolution. They focus on developing new computational methods to extract biological insight from genomic data and applying those methods to understand population history and natural selection.

Publications

Andrés, A. M., Hubisz, M. J., Indap, A., Torgerson, D. G., Degenhardt, J. D., Boyko, A. R., Gutenkunst, R. N., White, T. J., Green, E. D., Bustamante, C. D., Clark, A. G., & Nielsen, R. (2009). Targets of balancing selection in the human genome. Molecular Biology and Evolution, 26(12), 2755-2764.

PMID: 19713326; PMCID: PMC2782326; Abstract:

Balancing selection is potentially an important biological force for maintaining advantageous genetic diversity in populations, including variation that is responsible for long-term adaptation to the environment. By serving as a means to maintain genetic variation, it may be particularly relevant to maintaining phenotypic variation in natural populations. Nevertheless, its prevalence and specific targets in the human genome remain largely unknown. We have analyzed the patterns of diversity and divergence of 13,400 genes in two human populations using an unbiased single-nucleotide polymorphism data set, a genome-wide approach, and a method that incorporates demography in neutrality tests. We identified an unbiased catalog of genes with signatures of long-term balancing selection, which includes immunity genes as well as genes encoding keratins and membrane channels; the catalog also shows enrichment in functional categories involved in cellular structure. Patterns are mostly concordant in the two populations, with a small fraction of genes showing population-specific signatures of selection. Power considerations indicate that our findings represent a subset of all targets in the genome, suggesting that although balancing selection may not have an obvious impact on a large proportion of human genes, it is a key force affecting the evolution of a number of genes in humans.

Lynch, M., Gutenkunst, R., Ackerman, M., Spitze, K., Ye, Z., Maruki, T., & Jia, Z. (2017). Population Genomics of Daphnia pulex. Genetics, 206, 315.

Using data from 83 isolates from a single population, the population genomics of the microcrustacean Daphnia pulex are described and compared to current knowledge for the only other well-studied invertebrate, Drosophila melanogaster. These two species are quite similar with respect to effective population sizes and mutation rates, although some features of recombination appear to be different, with linkage disequilibrium being elevated at short (<100 bp) distances in D. melanogaster and at long distances in D. pulex. The study population adheres closely to the expectations under Hardy-Weinberg equilibrium, and reflects a past population history of no more than a two-fold range of variation in effective population size. Four-fold redundant silent sites and a restricted region of intronic sites appear to evolve in a nearly neutral fashion, providing a powerful tool for population-genetic analyses. Amino-acid replacement sites are predominantly under strong purifying selection, as are a large fraction of sites in UTRs and intergenic regions, but the majority of SNPs at such sites that rise to frequencies > 0.05 appear to evolve in a nearly neutral fashion. All forms of genomic sites (including replacement sites within codons, and intergenic and UTR regions) appear to be experiencing an ~2x higher level of selection scaled to the power of drift in D. melanogaster, but this may in part be a consequence of recent demographic changes. These results establish D. pulex as an excellent system for future work on the evolutionary genomics of natural populations.

Hsieh, P., Hallmark, B., Watkins, J. C., Karafet, T. C., Osipova, L. P., Gutenkunst, R. N., & Hammer, M. F. (2017). Exome sequencing provides evidence of polygenic adaptation to a fat-rich animal diet in indigenous Siberian populations. Molecular Biology and Evolution, 34, 2914.

Robinson, J. D., Coffman, A. J., Hickerson, M. J., & Gutenkunst, R. N. (2014). Sampling strategies for frequency spectrum-based population genomic inference. BMC Evolutionary Biology, 14(1), 254.

Background: The allele frequency spectrum (AFS) consists of counts of the number of single nucleotide polymorphism (SNP) loci with derived variants present at each given frequency in a sample. Multiple approaches have recently been developed for parameter estimation and calculation of model likelihoods based on the joint AFS from two or more populations. We conducted a simulation study of one of these approaches, implemented in the Python module ∂a∂i, to compare parameter estimation and model selection accuracy given different sample sizes under one- and two-population models. Results: Our simulations included a variety of demographic models and two parameterizations that differed in the timing of events (divergence or size change). Using a number of SNPs reasonably obtained through next-generation sequencing approaches (10,000 - 50,000), accurate parameter estimates and model selection were possible for models with more ancient demographic events, even given relatively small numbers of sampled individuals. However, for recent events, larger numbers of individuals were required to achieve accuracy and precision in parameter estimates similar to that seen for models with older divergence or population size changes. We quantify i) the uncertainty in model selection, using tools from information theory, and ii) the accuracy and precision of parameter estimates, using the root mean squared error, as a function of the timing of demographic events, sample sizes used in the analysis, and complexity of the simulated models. Conclusions: Here, we illustrate the utility of the genome-wide AFS for estimating demographic history and provide recommendations to guide sampling in population genomics studies that seek to draw inference from the AFS. Our results indicate that larger samples of individuals (and thus larger AFS) provide greater power for model selection and parameter estimation for more recent demographic events.
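
The definition of the AFS in the abstract above can be illustrated with a small sketch. This is not the ∂a∂i API; the function name `allele_frequency_spectrum`, its parameters, and the toy data are hypothetical and chosen only for illustration. It assumes a single population and that each SNP is summarized by how many sampled chromosomes carry the derived allele.

```python
from collections import Counter


def allele_frequency_spectrum(derived_counts, n_chromosomes):
    """Tally SNPs by derived-allele count (hypothetical helper, not from any library).

    derived_counts: one integer per SNP, the number of sampled chromosomes
        carrying the derived allele at that SNP.
    n_chromosomes: total number of chromosomes sampled.
    Returns a list of length n_chromosomes + 1 whose entry i is the number
    of SNPs at which the derived allele was seen exactly i times.
    """
    tally = Counter(derived_counts)
    out_of_range = [c for c in tally if not 0 <= c <= n_chromosomes]
    if out_of_range:
        raise ValueError(f"derived counts out of range: {sorted(out_of_range)}")
    return [tally.get(i, 0) for i in range(n_chromosomes + 1)]


# Toy data (invented for illustration): 8 SNPs scored in 6 chromosomes.
toy_counts = [1, 1, 2, 4, 1, 3, 2, 1]
afs = allele_frequency_spectrum(toy_counts, n_chromosomes=6)
print(afs)
```

In this toy sample, four of the eight SNPs are singletons (derived allele seen once). A larger sample of individuals yields a longer spectrum with more frequency classes, which is the sense in which the abstract refers to a "larger AFS".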

Altshuler, D. L., Durbin, R. M., Abecasis, G. R., Bentley, D. R., Chakravarti, A., Clark, A. G., Collins, F. S., M., F., Donnelly, P., Egholm, M., Flicek, P., Gabriel, S. B., Gibbs, R. A., Knoppers, B. M., Lander, E. S., Lehrach, H., Mardis, E. R., McVean, G. A., Nickerson, D. A., Peltonen, L., et al. (2010). A map of human genome variation from population-scale sequencing. Nature, 467(7319), 1061-1073.

PMID: 20981092; PMCID: PMC3042601; Abstract:

The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10^-8 per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research. © 2010 Macmillan Publishers Limited. All rights reserved.