The "Etiome": Identification and Clustering of Human Disease Etiological Factors

BACKGROUND: Both genetic and environmental factors contribute to human diseases. Most common diseases are influenced by a large number of genetic and environmental factors, most of which individually have only a modest effect on the disease. Though genetic contributions are relatively well characterized for some monogenic diseases, there has been no systematic effort to curate the extensive list of environmental etiological factors.

RESULTS: From a comprehensive search of the MeSH annotation of MEDLINE articles, we identified 3,342 environmental etiological factors associated with 3,159 diseases. We also identified 1,100 genes associated with 1,034 complex diseases from the NIH Genetic Association Database (GAD), a database of genetic association studies. Both genetic and environmental etiological factors are available for 863 diseases. Integrating genetic and environmental factors results in the "etiome", which we define as the comprehensive compendium of disease etiology. Clustering of environmental factors may alert clinicians to the risks of added exposures, or to synergy among interventions that alter these factors. Clustering of both genetic and environmental etiological factors places genes in the context of environment in a quantitative manner.
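
As a rough illustration of the integration step, the sketch below merges disease-factor associations from two sources into one profile per disease and compares the profiles pairwise. It is only a minimal sketch under stated assumptions: the disease names, factor names, and helper functions are hypothetical, and Jaccard similarity is an assumed stand-in, since the abstract does not specify the similarity or clustering measure used.

```python
from itertools import combinations

# Hypothetical toy associations (illustrative only, not taken from MEDLINE or GAD).
environmental = [
    ("disease_A", "env:factor_1"),
    ("disease_A", "env:factor_2"),
    ("disease_B", "env:factor_2"),
    ("disease_C", "env:factor_3"),
]
genetic = [
    ("disease_A", "gene:G1"),
    ("disease_B", "gene:G1"),
    ("disease_B", "gene:G2"),
    ("disease_D", "gene:G3"),
]


def build_profiles(pairs):
    """Group (disease, factor) pairs into a dict of disease -> set of factors."""
    profiles = {}
    for disease, factor in pairs:
        profiles.setdefault(disease, set()).add(factor)
    return profiles


def etiome_profiles(env_pairs, gene_pairs):
    """Combine both factor types, keeping only diseases that have both."""
    env = build_profiles(env_pairs)
    gen = build_profiles(gene_pairs)
    shared = env.keys() & gen.keys()
    return {d: env[d] | gen[d] for d in sorted(shared)}


def jaccard(a, b):
    """Jaccard similarity of two sets (an assumed measure, not the paper's)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


profiles = etiome_profiles(environmental, genetic)
similarities = {
    (d1, d2): jaccard(profiles[d1], profiles[d2])
    for d1, d2 in combinations(profiles, 2)
}

for pair, score in similarities.items():
    print(pair, round(score, 2))
```

In this toy data only disease_A and disease_B have both factor types, mirroring how only a subset of diseases ends up with a combined profile. A pairwise similarity table of this kind could then be passed to any standard clustering routine.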

CONCLUSION: In this paper, we obtained a comprehensive list of associations between diseases and environmental factors using the MeSH annotation of MEDLINE articles. It serves as a summary of current knowledge of the associations between etiological factors and diseases. By combining the environmental etiological factors with genetic factors from GAD, we computed the "etiome" profile for 863 diseases. Comparing diseases across these profiles may have utility for clinical medicine, basic science research, and population-based science.