Predicting three-dimensional genome organization with chromatin states

YF Qi and B Zhang, PLOS COMPUTATIONAL BIOLOGY, 15, e1007024 (2019).

DOI: 10.1371/journal.pcbi.1007024

We introduce a computational model to simulate chromatin structure and dynamics. Starting from one-dimensional genomics and epigenomics data that are available for hundreds of cell types, this model enables de novo prediction of chromatin structures at five-kilo-base resolution. Simulated chromatin structures recapitulate known features of genome organization, including the formation of chromatin loops, topologically associating domains (TADs) and compartments, and are in quantitative agreement with chromosome conformation capture experiments and super-resolution microscopy measurements. Detailed characterization of the predicted structural ensemble reveals the dynamical flexibility of chromatin loops and the presence of cross-talk among neighboring TADs. Analysis of the model's energy function uncovers distinct mechanisms for chromatin folding at various length scales and suggests a need to go beyond simple A/B compartment types to predict specific contacts between regulatory elements using polymer simulations.

Author summary

Three-dimensional genome organization is expected to play crucial roles in regulating gene expression and establishing cell fate, and has inspired the development of numerous innovative experimental techniques for its characterization. Though significant progress has been made, it remains challenging to construct chromosome structures at high resolution. Following the maximum entropy approach pioneered by Zhang and Wolynes, we developed a predictive model and parameterized a force field to study chromatin structure and dynamics using genome-wide chromosome conformation capture data (Hi-C). Starting from one-dimensional sequence information that includes histone modification profiles and CTCF binding sites, this model predicts chromosome structure at 5 kb resolution, thus establishing a sequence-structure relationship for the genome.
A significant advantage of this model over comparable approaches is its ability to study long-range specific contacts between promoters and enhancers, in addition to building high-resolution structures for loops, TADs and compartments. Furthermore, the model is shown to be transferable across chromosomes and cell types, thus opening up the opportunity to carry out de novo prediction of genome organization for hundreds of cell types for which epigenomics data, but not Hi-C data, are available.
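
To illustrate how a simulated structural ensemble can be compared with Hi-C data, the sketch below turns a set of bead coordinates into a contact frequency map. It is a minimal illustration, not the authors' implementation: the function names `contact_map` and `genomic_bin`, the hard distance cutoff, and its default value are all assumptions made here, and the paper's actual contact definition may differ. Only the 5 kb bin size comes from the text above.

```python
import numpy as np


def contact_map(ensemble, cutoff=1.5):
    """Fraction of structures in which each pair of beads is in contact.

    ensemble: array-like of shape (n_structures, n_beads, 3), bead coordinates
              in arbitrary length units (one bead per genomic bin).
    cutoff:   assumed contact distance in the same units (hypothetical value).
    """
    ensemble = np.asarray(ensemble, dtype=float)
    # Pairwise displacement vectors for every structure, via broadcasting:
    # shape (n_structures, n_beads, n_beads, 3).
    diff = ensemble[:, :, None, :] - ensemble[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # A pair counts as a contact when closer than the cutoff; averaging the
    # boolean array over structures gives a contact frequency in [0, 1].
    return (dist < cutoff).mean(axis=0)


def genomic_bin(position_bp, resolution_bp=5000):
    """Map a genomic coordinate in base pairs to its bead index at 5 kb resolution."""
    return position_bp // resolution_bp


# Toy ensemble: two structures of three beads placed along the x axis.
toy_ensemble = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
]
toy_map = contact_map(toy_ensemble, cutoff=1.5)
```

In this toy case beads 0 and 1 are in contact in both structures, beads 1 and 2 in only one of them, and beads 0 and 2 in neither. A map built this way from a real ensemble could then be compared bin by bin with a normalized Hi-C matrix at the same resolution.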
