Hierarchical Bayes Modeling Of Mediation Through High-Dimensional -Omics Data
Dr. Thomas is Professor of Biostatistics in the Department of Preventive Medicine, and Verna R. Richter Chair in Cancer Research at the University of Southern California, Keck School of Medicine. He received his Ph.D. from McGill University in 1976, where he continued as a faculty member until his recruitment to USC in 1984. There he served as the Head of the Biostatistics Division until 2013 and co-directed the Southern California Environmental Health Sciences Center and the Cancer Epidemiology Program in the USC/Norris Comprehensive Cancer Center. His primary research interest has been in the development of statistical methods for environmental and genetic epidemiology, with numerous collaborations in both areas. On the environmental side, he has been particularly active in radiation carcinogenesis and air pollution health effects research, notably as one of the senior investigators on the Southern California Children’s Health Study and the Women’s Environmental Cancer and Radiation Exposure (WECARE) study and as a member of President Clinton’s Advisory Committee on Human Radiation Experiments. On the genetic side, he is a coinvestigator in the NCI’s Colon Cancer Family Registry, the Genetic Analysis Workshop, the ENDGAME consortium to develop methods for genome-wide association studies, and past President of the International Genetic Epidemiology Society. Dr. Thomas has numerous publications, including the textbooks Statistical Methods in Genetic Epidemiology (Oxford University Press, 2004) and Statistical Methods in Environmental Epidemiology (Oxford University Press, 2009). He currently directs a program project grant on “Statistical methods for integrative genomics in cancer.”
Various high-dimensional epigenetic, transcriptomic, proteomic, metabolomic, and other -omic data have become available to provide insight into the mediation of genetic and environmental influences on disease risk through the internal environment. For example, the “exposome” concept has been implemented using mass spectrometry metabolomic measurements to capture a broad spectrum of internal metabolites of exogenous exposures, but statistical methods for analyzing these and other -omic data are in their infancy. The “Meeting-in-the-Middle” principle aims to identify the subset of metabolites that are related to both exposure and disease. Here, we introduce a novel hierarchical Bayes framework for implementing this idea through simultaneous variable selection on exposure-metabolite and metabolite-disease associations, while incorporating external information such as the pathways in which the different metabolites are thought to act. The approach is validated by simulation and applied to data on hepatocellular carcinoma in relation to a panel of 125 metabolites and 7 established risk factors from a nested case-control study within the EPIC cohort. Fifteen of the metabolites yielded Bayes factors for mediation greater than 20 (“strong” evidence), the majority of these with multiple exposures. To explore this phenomenon further, we expanded the hierarchical model to include the pathways through which these metabolites act as prior covariates. The strongest associations with exposures were found for the class of lysophosphatidylcholines, and the strongest associations with disease for biogenic amines and acylcarnitines. These approaches could be extended to study mediation through multiple types of -omic data. Genetic Epidemiology 2016;40 (11): 619.
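As a rough illustration of how a Bayes factor for mediation might be read off a variable-selection sampler, the sketch below computes the posterior odds that a metabolite is selected in both the exposure-metabolite and metabolite-disease models, divided by the corresponding prior odds. This is not the method of the paper: the function name, the independent prior inclusion probabilities of 0.1, and the add-half smoothing are all assumptions made for illustration. Only the threshold of 20 for "strong" evidence comes from the abstract.

```python
def mediation_bayes_factor(exposure_draws, disease_draws,
                           prior_xm=0.1, prior_my=0.1):
    """Hypothetical helper: Bayes factor for joint selection of one metabolite.

    exposure_draws / disease_draws are 0/1 inclusion indicators from the
    same MCMC iterations (exposure-metabolite and metabolite-disease links).
    prior_xm / prior_my are assumed, independent prior inclusion probabilities.
    """
    if not exposure_draws or len(exposure_draws) != len(disease_draws):
        raise ValueError("indicator sequences must be non-empty and equal length")

    n = len(exposure_draws)
    # Count iterations in which both links are selected ("meeting in the middle").
    joint = sum(1 for xm, my in zip(exposure_draws, disease_draws) if xm and my)

    # Add-half smoothing (an assumption) keeps the odds finite when the
    # metabolite is selected in every iteration or in none.
    post = (joint + 0.5) / (n + 1.0)
    prior = prior_xm * prior_my  # assumes a priori independence of the two links

    posterior_odds = post / (1.0 - post)
    prior_odds = prior / (1.0 - prior)
    return posterior_odds / prior_odds


# Toy indicators: both links selected in 5 of 10 iterations.
bf = mediation_bayes_factor([1] * 5 + [0] * 5, [1] * 5 + [0] * 5)
strong = bf > 20  # threshold for "strong" evidence quoted in the abstract
```

Taking the product of the two prior probabilities treats the links as independent a priori; a hierarchical prior with pathway covariates, as described in the abstract, would replace that fixed product with pathway-specific inclusion probabilities.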