Lingling An

Associate Professor, Agricultural-Biosystems Engineering
Associate Professor, Public Health
Associate Professor, Statistics-GIDP
Associate Professor, BIO5 Institute
Member of the General Faculty
Member of the Graduate Faculty
(520) 621-1248

Research Interest

Lingling An, PhD, conducts research at the interdisciplinary boundaries of several fields, including the statistical sciences, the biological and medical sciences, genomics, and genetics. Her statistical group's major research interests include the development and application of statistical and computational methods for the analysis of high-dimensional genomic/genetic, metagenomic/metatranscriptomic, and epigenomic data. The overarching vision is to develop rigorous, timely, and useful statistical and computational methodologies that help biologists and geneticists ask, answer, and disseminate biologically interesting questions in the quest to understand the ultimate function of DNA and gene networks.


An, L. (2017). Suspect Reduction for Culture Independent Microbial Source Tracking in Trace Evidence Analysis Using Community. Journal of Forensic Sciences.
An, L. (2017). FIS-PRC2 plays a dual role in regulation of type I MADS-box genes in early endosperm of Arabidopsis. Plant Physiology.
Sohn, M., Du, R., & An, L. (2015). A robust approach for identifying differentially abundant features in metagenomic samples. Bioinformatics.
Jiang, H., An, L., Lin, S. M., Feng, G., & Qiu, Y. (2012). A Statistical Framework for Accurate Taxonomic Assignment of Metagenomic Sequencing Reads. PLoS ONE, 7(10).

PMID: 23049702; PMCID: PMC3462201; Abstract:

The advent of next-generation sequencing technologies has greatly promoted the field of metagenomics which studies genetic material recovered directly from an environment. Characterization of genomic composition of a metagenomic sample is essential for understanding the structure of the microbial community. Multiple genomes contained in a metagenomic sample can be identified and quantitated through homology searches of sequence reads with known sequences catalogued in reference databases. Traditionally, reads with multiple genomic hits are assigned to non-specific or high ranks of the taxonomy tree, thereby impacting on accurate estimates of relative abundance of multiple genomes present in a sample. Instead of assigning reads one by one to the taxonomy tree as many existing methods do, we propose a statistical framework to model the identified candidate genomes to which sequence reads have hits. After obtaining the estimated proportion of reads generated by each genome, sequence reads are assigned to the candidate genomes and the taxonomy tree based on the estimated probability by taking into account both sequence alignment scores and estimated genome abundance. The proposed method is comprehensively tested on both simulated datasets and two real datasets. It assigns reads to the low taxonomic ranks very accurately. Our statistical approach of taxonomic assignment of metagenomic reads, TAMER, is implemented in R and available at © 2012 Jiang et al.
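
The two steps the abstract describes (estimating the proportion of reads generated by each candidate genome, then assigning each read using both its alignment evidence and the estimated abundance) can be illustrated with a generic mixture-model sketch. This is not the TAMER implementation, which is written in R; it is a minimal Python illustration that assumes a plain expectation-maximization (EM) scheme. The function names, and the `likelihood` matrix of alignment-derived scores, are hypothetical.

```python
def estimate_abundance(likelihood, n_iter=200, tol=1e-9):
    """Estimate genome proportions with a simple EM loop (illustrative only).

    likelihood[i][j] is a hypothetical alignment-based score for read i
    against candidate genome j, with 0 meaning "no hit". Every read is
    assumed to hit at least one candidate genome.
    """
    n_reads = len(likelihood)
    n_genomes = len(likelihood[0])
    pi = [1.0 / n_genomes] * n_genomes  # start from uniform abundance

    resp = []
    for _ in range(n_iter):
        # E-step: probability that each read came from each genome,
        # combining the alignment score with the current abundance.
        resp = []
        for row in likelihood:
            weights = [p * l for p, l in zip(pi, row)]
            total = sum(weights)
            if total == 0:
                raise ValueError("each read needs at least one genome hit")
            resp.append([w / total for w in weights])

        # M-step: abundance is the average probability mass per genome.
        new_pi = [sum(r[j] for r in resp) / n_reads for j in range(n_genomes)]
        change = max(abs(a - b) for a, b in zip(pi, new_pi))
        pi = new_pi
        if change < tol:
            break
    return pi, resp


def assign_reads(resp):
    """Assign each read to the genome with the highest estimated probability."""
    return [max(range(len(r)), key=r.__getitem__) for r in resp]


# Toy data: three reads hit only genome 0, one hits only genome 1,
# and the last read matches both genomes equally well.
toy = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
proportions, probabilities = estimate_abundance(toy)
assignments = assign_reads(probabilities)
```

In this toy case the ambiguous read is resolved toward the more abundant genome rather than pushed up to a non-specific taxonomic rank, which is the behavior the abstract motivates.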

Pookhao, N., Sohn, M., Jenkins, I., Du, R., Jiang, H., & An, L. (2014). A two-stage statistical procedure for feature selection and comparison in functional analysis of metagenomes. Bioinformatics.