Chengcheng Hu
Director, Biostatistics - Phoenix Campus
Professor, BIO5 Institute
Professor, Public Health
Professor, Statistics-GIDP
(520) 626-9308
Work Summary
Chengcheng Hu has worked on a broad range of areas including cancer, occupational health, HIV/AIDS, and aging. He has extensive experience in collaborative research and has conducted methodological research in the areas of survival analysis, longitudinal data, high-dimensional data, and measurement error. His current methodological interest, arising from studies of viral and human genetics and biomarkers, is to develop innovative methods to investigate the relationship between high-dimensional information and longitudinal outcomes or survival endpoints.
Research Interest
Chengcheng Hu, Ph.D., is an Associate Professor, Public Health and Director, Biostatistics, Phoenix campus at the Mel and Enid Zuckerman College of Public Health, University of Arizona. He is also Director of the Biometry Core on the Chemoprevention of Skin Cancer Project at the University of Arizona Cancer Center. Hu has worked on multiple federal grants in a broad range of areas including cancer, occupational health, HIV/AIDS, and aging. In addition to extensive experience in collaborative research, he has conducted methodological research in the areas of survival analysis, longitudinal data, high-dimensional data, and measurement error. His current methodological interest, arising from studies of viral and human genetics and biomarkers, is to develop innovative methods to investigate the relationship between high-dimensional information and longitudinal outcomes or survival endpoints. Hu joined the UA Mel and Enid Zuckerman College of Public Health in 2008. Prior to this, he was an assistant professor of Biostatistics at the Harvard School of Public Health from 2002 to 2008. While at Harvard, he also served as senior statistician in the Pediatric AIDS Clinical Trials Group (PACTG) and the International Maternal Pediatric Adolescent AIDS Clinical Trials Group (IMPAACT). Hu received his Ph.D. and M.S. in Biostatistics from the University of Washington and an M.A. in Mathematics from the Johns Hopkins University.

Publications

Vasquez, M. M., Hu, C., Roe, D. J., Chen, Z., Halonen, M., & Guerra, S. (2016). Least absolute shrinkage and selection operator type methods for the identification of serum biomarkers of overweight and obesity: simulation and application. BMC Medical Research Methodology, 16(1), 154.
BIO5 Collaborators
Zhao Chen, Stefano Guerra, Chengcheng Hu

The study of circulating biomarkers and their association with disease outcomes has become progressively complex due to advances in the measurement of these biomarkers through multiplex technologies. The Least Absolute Shrinkage and Selection Operator (LASSO) is a data analysis method that may be utilized for biomarker selection in these high-dimensional data. However, it is unclear which LASSO-type method is preferable when considering data scenarios that may be present in serum biomarker research, such as high correlation between biomarkers, weak associations with the outcome, and a sparse number of true signals. The goal of this study was to compare the LASSO to five LASSO-type methods given these scenarios.

Huang, S., Hu, C., Bell, M., Billheimer, D., Guerra, S., Roe, D., Vasquez, M., & Bedrick, E. (2018). Regularized Continuous-Time Markov Model via Elastic Net. Biometrics.
BIO5 Collaborators
Dean Billheimer, Stefano Guerra, Chengcheng Hu
Thomson, C. A., Jackson, R., Chou, Y., Hu, C., Ernst, K. C., Bea, J. W., Klimentidis, Y. C., & Chen, Z. (2017). Body mass index, waist circumference and mortality in a large multiethnic postmenopausal cohort - Results from the Women's Health Initiative. Journal of the American Geriatrics Society.
BIO5 Collaborators
Zhao Chen, Chengcheng Hu, Yann C Klimentidis
Muñoz-Rodríguez, J. L., Vrba, L., Futscher, B. W., Hu, C., Komenaka, I. K., Meza-Montenegro, M. M., Gutierrez-Millan, L. E., Daneri-Navarro, A., Thompson, P. A., & Martinez, M. E. (2015). Differentially expressed microRNAs in postpartum breast cancer in Hispanic women. PLoS ONE, 10(4), e0124340.
BIO5 Collaborators
Bernard W Futscher, Chengcheng Hu

The risk of breast cancer transiently increases immediately following pregnancy, peaking between 3 and 7 years. The biology that underlies this risk window and the effect on the natural history of the disease is unknown. MicroRNAs (miRNAs) are small non-coding RNAs that have been shown to be dysregulated in breast cancer. We conducted miRNA profiling of 56 tumors from a case series of multiparous Hispanic women and assessed the pattern of expression by time since last full-term pregnancy. A data-driven splitting analysis on the pattern of 355 miRNAs separated the case series into two groups: a) an early group representing women diagnosed with breast cancer ≤ 5.2 years postpartum (n = 12), and b) a late group representing women diagnosed with breast cancer ≥ 5.3 years postpartum (n = 44). We identified 15 miRNAs with significant differential expression between the early and late postpartum groups; 60% of these miRNAs are encoded on the X chromosome. Ten miRNAs had a two-fold or higher difference in expression, with miR-138, miR-660, miR-31, miR-135b, miR-17, miR-454, and miR-934 overexpressed in the early versus the late group, while miR-892a, miR-199a-5p, and miR-542-5p were underexpressed in the early versus the late postpartum group. The DNA methylation of three out of five tested miRNAs (miR-31, miR-135b, and miR-138) was lower in the early versus late postpartum group, and negatively correlated with miRNA expression. Here we show that miRNAs are differentially expressed and differentially methylated between tumors of the early versus late postpartum groups, suggesting that potential differences in epigenetic dysfunction may be operative in postpartum breast cancers.

Vasquez, M. M., Hu, C., Roe, D. J., Halonen, M., & Guerra, S. (2017). Measurement error correction in the least absolute shrinkage and selection operator model when validation data are available. Statistical Methods in Medical Research, 962280217734241.
BIO5 Collaborators
Stefano Guerra, Chengcheng Hu

Measurement of serum biomarkers by multiplex assays may be more variable than measurement by single biomarker assays. Measurement error in these data may bias parameter estimates in regression analysis, which could mask true associations of serum biomarkers with an outcome. The Least Absolute Shrinkage and Selection Operator (LASSO) can be used for variable selection in these high-dimensional data. Furthermore, when the distribution of measurement error is assumed to be known or estimated with replication data, a simple measurement error correction method can be applied to the LASSO method. However, in practice the distribution of the measurement error is unknown, and estimating it through replication is expensive, both in monetary cost and in the greater amount of sample required, which is often limited in quantity. We adapt an existing bias correction approach by estimating the measurement error using validation data in which a subset of serum biomarkers are re-measured on a random subset of the study sample. We evaluate this method using simulated data and data from the Tucson Epidemiological Study of Airway Obstructive Disease (TESAOD). We show that the bias in parameter estimation is reduced and variable selection is improved.
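
For readers unfamiliar with how the LASSO performs variable selection, the following is a minimal sketch of the plain LASSO fitted by cyclic coordinate descent. It is not the measurement-error-corrected estimator described in the abstract above, and it is not taken from the paper. The data are simulated, and every name here (`soft_threshold`, `lasso_coordinate_descent`, `simulate`, the penalty value `lam=0.2`, and the coefficient vector `true_beta`) is a hypothetical choice made for illustration.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows, each a list of p floats; y is a list of n floats.
    No intercept is fitted, so the columns of X and y are assumed to be
    (approximately) centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # residual = y - X @ beta, and beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Average product of column j with the partial residual
            # (the residual with the contribution of beta[j] added back).
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def simulate(n=200, seed=0):
    """Simulate five hypothetical biomarkers, two of which affect the outcome."""
    rng = random.Random(seed)
    true_beta = [2.0, 0.0, 0.0, -1.5, 0.0]  # assumed values, illustration only
    X = [[rng.gauss(0.0, 1.0) for _ in true_beta] for _ in range(n)]
    y = [
        sum(b * x for b, x in zip(true_beta, row)) + rng.gauss(0.0, 0.5)
        for row in X
    ]
    return X, y, true_beta


X, y, true_beta = simulate()
beta_hat = lasso_coordinate_descent(X, y, lam=0.2)
selected = [j for j, b in enumerate(beta_hat) if b != 0.0]
print("estimated coefficients:", [round(b, 3) for b in beta_hat])
print("selected biomarker indices:", selected)
```

The L1 penalty sets small coefficients exactly to zero, which is what makes the LASSO usable as a selection method. The same penalty also shrinks the retained coefficients toward zero. The sketch assumes the biomarkers are measured without error; the abstract above concerns the case where they are not.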