bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2021‒07‒18
33 papers selected by
Connor Rogerson
University of Cambridge, MRC Cancer Unit

  1. Nucleic Acids Res. 2021 Jul 13. pii: gkab553. [Epub ahead of print]
      Chromatin is a tightly packaged structure of DNA and protein within the nucleus of a cell. The arrangement of different protein complexes along the DNA modulates and is modulated by gene expression. Measuring the binding locations and occupancy levels of different transcription factors (TFs) and nucleosomes is therefore crucial to understanding gene regulation. Antibody-based methods for assaying chromatin occupancy are capable of identifying the binding sites of specific DNA binding factors, but only one factor at a time. In contrast, epigenomic accessibility data like MNase-seq, DNase-seq, and ATAC-seq provide insight into the chromatin landscape of all factors bound along the genome, but with little insight into the identities of those factors. Here, we present RoboCOP, a multivariate state space model that integrates chromatin accessibility data with nucleotide sequence to jointly compute genome-wide probabilistic scores of nucleosome and TF occupancy, for hundreds of different factors. We apply RoboCOP to MNase-seq and ATAC-seq data to elucidate the protein-binding landscape of nucleosomes and 150 TFs across the yeast genome, and show that our model makes better predictions than existing methods. We also compute a chromatin occupancy profile of the yeast genome under cadmium stress, revealing chromatin dynamics associated with transcriptional regulation.
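      To make the idea of probabilistic occupancy scores concrete, the following is a minimal sketch of posterior decoding in a toy hidden Markov model, one simple kind of state space model. It is not RoboCOP's model: the three states, the discretized fragment observations ("none", "short", "long") and every probability below are assumptions chosen for illustration only.

```python
# Toy posterior decoding with the forward-backward algorithm.
# All states, symbols and probabilities are hypothetical illustration values.

STATES = ("linker", "nucleosome", "tf")

START = {"linker": 0.6, "nucleosome": 0.3, "tf": 0.1}

TRANS = {
    "linker": {"linker": 0.7, "nucleosome": 0.2, "tf": 0.1},
    "nucleosome": {"linker": 0.2, "nucleosome": 0.8, "tf": 0.0},
    "tf": {"linker": 0.3, "nucleosome": 0.0, "tf": 0.7},
}

# Probability of seeing each fragment class at a position, given the state.
EMIT = {
    "linker": {"none": 0.8, "short": 0.1, "long": 0.1},
    "nucleosome": {"none": 0.1, "short": 0.1, "long": 0.8},
    "tf": {"none": 0.15, "short": 0.8, "long": 0.05},
}


def _normalize(dist):
    """Scale a dict of non-negative weights so that they sum to 1."""
    total = sum(dist.values())
    return {s: v / total for s, v in dist.items()}


def forward_backward(obs, states=STATES, start=START, trans=TRANS, emit=EMIT):
    """Return, for each position, the posterior P(state | all observations)."""
    n = len(obs)

    # Forward pass, rescaled at every step to avoid numerical underflow.
    fwd = []
    for t, symbol in enumerate(obs):
        cur = {}
        for s in states:
            if t == 0:
                prior = start[s]
            else:
                prior = sum(fwd[t - 1][r] * trans[r][s] for r in states)
            cur[s] = prior * emit[s][symbol]
        fwd.append(_normalize(cur))

    # Backward pass, also rescaled at every step.
    bwd = [None] * n
    bwd[n - 1] = {s: 1.0 for s in states}
    for t in range(n - 2, -1, -1):
        cur = {
            s: sum(trans[s][r] * emit[r][obs[t + 1]] * bwd[t + 1][r] for r in states)
            for s in states
        }
        bwd[t] = _normalize(cur)

    # The posterior is proportional to forward * backward at each position.
    return [_normalize({s: fwd[t][s] * bwd[t][s] for s in states}) for t in range(n)]


observations = ["none", "none", "long", "long", "long", "none", "short", "short"]
posteriors = forward_backward(observations)
```

      Each entry of `posteriors` is a per-position probability distribution over the three toy states, the same kind of output as a genome-wide occupancy score, only at a much smaller scale.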
  2. FASEB J. 2021 Aug;35(8): e21768
      Insulators are cis-regulatory elements that block enhancer activity and prevent heterochromatin spreading. The binding of CCCTC-binding factor (CTCF) protein is essential for insulators to function in a chromatin context. The β-globin locus, consisting of multiple genes and enhancers, is flanked by two insulators, 3'HS1 and HS5. However, it has been reported that the absence of these insulators did not affect β-globin transcription. To explain this unexpected finding, we deleted a CTCF motif at 3'HS1 or HS5 in the human β-globin locus and analyzed chromatin interactions around the locus. We found that a topologically associating domain (TAD) containing the β-globin locus is maintained by neighboring CTCF sites in the CTCF motif-deleted loci. Additional deletions of neighboring CTCF motifs disrupted the β-globin TAD, resulting in decreased β-globin transcription. Chromatin interactions of the β-globin enhancers with the gene promoter were weakened in the loci with multiple CTCF motifs deleted, even though the enhancers still have active chromatin features such as histone H3K27ac and histone H3 depletion. Genome-wide analysis using public CTCF ChIA-PET and ChIP-seq data showed that chromatin domains possessing multiple CTCF binding sites tend to contain super-enhancers like the β-globin enhancers. Taken together, our results show that multiple CTCF sites surrounding the β-globin locus cooperate with each other to maintain a TAD. The β-globin TAD appears to provide a compact spatial environment that enables enhancers to interact with the promoter.
    Keywords:  CTCF; TAD; chromatin interaction; enhancer; β-globin
  3. Nucleic Acids Res. 2021 Jul 09. pii: gkab598. [Epub ahead of print]
      Proper cell fate determination is largely orchestrated by complex gene regulatory networks centered around transcription factors. However, experimental elucidation of key transcription factors that drive cellular identity is currently often intractable. Here, we present ANANSE (ANalysis Algorithm for Networks Specified by Enhancers), a network-based method that exploits enhancer-encoded regulatory information to identify the key transcription factors in cell fate determination. As cell type-specific transcription factors predominantly bind to enhancers, we use regulatory networks based on enhancer properties to prioritize transcription factors. First, we predict genome-wide binding profiles of transcription factors in various cell types using enhancer activity and transcription factor binding motifs. Subsequently, applying these inferred binding profiles, we construct cell type-specific gene regulatory networks, and then predict key transcription factors controlling cell fate transitions using differential networks between cell types. This method outperforms existing approaches in correctly predicting major transcription factors previously identified to be sufficient for trans-differentiation. Finally, we apply ANANSE to define an atlas of key transcription factors in 18 normal human tissues. In conclusion, we present a ready-to-implement computational tool for efficient prediction of transcription factors in cell fate determination and to study transcription factor-mediated regulatory mechanisms. ANANSE is freely available at
  4. J Biol Chem. 2021 Jul 13. pii: S0021-9258(21)00747-X. [Epub ahead of print] 100947
      Transcription factors (TFs) harboring BTB (Broad-Complex, Tramtrack, and Bric a brac) domains play important roles in development and disease. These BTB domains are thought to recruit transcriptional modulators to target DNA regions. However, a systematic molecular understanding of the mechanism of action of this TF family is lacking. Here, we identify the zinc finger BTB-TF Zbtb2 from a genetic screen for regulators of exit from pluripotency, and demonstrate that its absence perturbs embryonic stem cell differentiation and the gene expression dynamics underlying peri-implantation development. We show that ZBTB2 binds the chromatin remodeler Ep400 to mediate downstream transcription. Independently, the BTB domain directly interacts with the chromatin remodeller NuRD and the histone chaperone HiRA. NuRD recruitment is a common feature of BTB-TFs, and based on phylogenetic analysis we propose that this is a conserved evolutionary property. Binding to UBN2, in contrast, is specific to ZBTB2 and requires a C-terminal extension of the BTB domain. Taken together, this study identifies a BTB-domain TF that recruits chromatin modifiers and a histone chaperone during a developmental cell state transition, and defines unique and shared molecular functions of the BTB-domain TF family.
    Keywords:  Ep400; HiRA; NuRD; ZBTB2; chromatin remodeling; embryonic stem cell; epigenetics; histone chaperone; transcription factor
  5. Nat Commun. 2021 Jul 16. 12(1): 4359
      Histone H3 lysine 9 (H3K9) methylation is a central epigenetic modification that defines heterochromatin from unicellular to multicellular organisms. In mammalian cells, H3K9 methylation can be catalyzed by at least six distinct SET domain enzymes: Suv39h1/Suv39h2, Eset1/Eset2 and G9a/Glp. We used mouse embryonic fibroblasts (MEFs) with a conditional mutation for Eset1 and introduced progressive deletions for the other SET domain genes by CRISPR/Cas9 technology. Compound mutant MEFs for all six SET domain lysine methyltransferase (KMT) genes lack all H3K9 methylation states, derepress nearly all families of repeat elements and display genomic instabilities. Strikingly, the 6KO H3K9 KMT MEF cells no longer maintain heterochromatin organization and have lost electron-dense heterochromatin. This is a compelling analysis of H3K9 methylation-deficient mammalian chromatin and reveals a definitive function for H3K9 methylation in protecting heterochromatin organization and genome integrity.
  6. Nat Commun. 2021 Jul 16. 12(1): 4344
      Poised enhancers (PEs) represent a genetically distinct set of distal regulatory elements that control the expression of major developmental genes. Before becoming activated in differentiating cells, PEs are already bookmarked in pluripotent cells with unique chromatin and topological features that could contribute to their privileged regulatory properties. However, since PEs were originally characterized in embryonic stem cells (ESC), it is currently unknown whether PEs are functionally conserved in vivo. Here, we show that the chromatin and 3D structural features of PEs are conserved among mouse pluripotent cells both in vitro and in vivo. We also uncovered that the interactions between PEs and their target genes are globally controlled by the combined action of Polycomb, Trithorax and architectural proteins. Moreover, distal regulatory sequences located close to developmental genes and displaying the typical genetic (i.e. CpG islands) and chromatin (i.e. high accessibility and H3K27me3 levels) features of PEs are commonly found across vertebrates. These putative PEs show high sequence conservation within specific vertebrate clades, with only a few being evolutionary conserved across all vertebrates. Lastly, by genetically disrupting PEs in mouse and chicken embryos, we demonstrate that these regulatory elements play essential roles during the induction of major developmental genes in vivo.
  7. Genes Dev. 2021 Jul 08.
      The establishment of cell fates involves alterations of transcription factor repertoires and repurposing of transcription factors by post-translational modifications. In embryonic stem cells (ESCs), the chromatin organizers SATB2 and SATB1 balance pluripotency and differentiation by activating and repressing pluripotency genes, respectively. Here, we show that conditional Satb2 gene inactivation weakens ESC pluripotency, and we identify SUMO2 modification of SATB2 by the E3 ligase ZFP451 as a potential driver of ESC differentiation. Mutations of two SUMO-acceptor lysines of Satb2 (Satb2 K→R) or knockout of Zfp451 impair the ability of ESCs to silence pluripotency genes and activate differentiation-associated genes in response to retinoic acid (RA) treatment. Notably, the forced expression of a SUMO2-SATB2 fusion protein in either Satb2 K→R or Zfp451 -/- ESCs rescues, in part, their impaired differentiation potential and enhances the down-regulation of Nanog. The differentiation defect of Satb2 K→R ESCs correlates with altered higher-order chromatin interactions relative to Satb2 wt ESCs. Upon RA treatment of Satb2 wt ESCs, SATB2 interacts with ZFP451 and the LSD1/CoREST complex and gains binding at differentiation genes, which is not observed in RA-treated Satb2 K→R cells. Thus, SATB2 SUMOylation may contribute to the rewiring of transcriptional networks and the chromatin interactome of ESCs in the transition of pluripotency to differentiation.
    Keywords:  ES cell; LSD1; SATB1; SATB2; SUMO2; ZFP451; differentiation; pluripotency
  8. Bioinformatics. 2021 07 12. 37(Suppl_1): i280-i288
      MOTIVATION: Mapping distal regulatory elements, such as enhancers, is a cornerstone for elucidating how genetic variations may influence diseases. Previous enhancer-prediction methods have used either unsupervised approaches or supervised methods with limited training data. Moreover, past approaches have implemented enhancer discovery as a binary classification problem without accurate boundary detection, producing low-resolution annotations with superfluous regions and reducing the statistical power for downstream analyses (e.g. causal variant mapping and functional validations). Here, we addressed these challenges via a two-step model called Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays (DECODE). First, we employed direct enhancer-activity readouts from novel functional characterization assays, such as STARR-seq, to train a deep neural network for accurate cell-type-specific enhancer prediction. Second, to improve the annotation resolution, we implemented a weakly supervised object detection framework for enhancer localization with precise boundary detection (to a 10 bp resolution) using Gradient-weighted Class Activation Mapping.
    RESULTS: Our DECODE binary classifier outperformed a state-of-the-art enhancer prediction method by 24% in transgenic mouse validation. Furthermore, the object detection framework can condense enhancer annotations to only 13% of their original size, and these compact annotations have significantly higher conservation scores and genome-wide association study variant enrichments than the original predictions. Overall, DECODE is an effective tool for enhancer classification and precise localization.
    AVAILABILITY AND IMPLEMENTATION: DECODE source code and pre-processing scripts are available at
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
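      The boundary-refinement step described above can be pictured with a small sketch. It assumes a per-base importance profile is already available (for example from a class activation map) and simply bins it at 10 bp and keeps the contiguous run of bins around the peak that stays above a fraction of the peak value. The function name, the `frac` cutoff and the procedure itself are hypothetical and are not taken from the DECODE source code.

```python
def refine_boundaries(scores, bin_size=10, frac=0.5):
    """Condense a region to the bins around its importance peak.

    scores   : per-base importance values for one candidate region
    bin_size : resolution of the refined boundaries, in bp
    frac     : keep neighboring bins whose mean is >= frac * peak bin mean
    Returns (start, end) offsets in bp, end exclusive.
    """
    # Average the per-base scores inside each bin.
    bins = []
    for i in range(0, len(scores), bin_size):
        chunk = scores[i:i + bin_size]
        bins.append(sum(chunk) / len(chunk))

    # Locate the strongest bin and derive the cutoff from it.
    peak = max(range(len(bins)), key=bins.__getitem__)
    cutoff = frac * bins[peak]

    # Extend left and right while neighboring bins stay above the cutoff.
    lo = peak
    while lo > 0 and bins[lo - 1] >= cutoff:
        lo -= 1
    hi = peak
    while hi < len(bins) - 1 and bins[hi + 1] >= cutoff:
        hi += 1

    return lo * bin_size, min((hi + 1) * bin_size, len(scores))


# A 100 bp candidate region whose signal sits in bases 30-49.
profile = [0.0] * 30 + [1.0] * 20 + [0.0] * 50
core = refine_boundaries(profile)
```

      In this toy profile the 100 bp candidate is condensed to the 20 bp stretch that carries the signal, which mirrors the reported shrinking of annotations in spirit only.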
  9. Cell Rep. 2021 Jul 13. pii: S2211-1247(21)00755-5. [Epub ahead of print]36(2): 109357
      Neuronal activity-induced enhancers drive gene activation. We demonstrate that BRG1, the core subunit of SWI/SNF-like BAF ATP-dependent chromatin remodeling complexes, regulates neuronal activity-induced enhancers. Upon stimulation, BRG1 is recruited to enhancers in an H3K27Ac-dependent manner. BRG1 regulates enhancer basal activities and inducibility by affecting cohesin binding, enhancer-promoter looping, RNA polymerase II recruitment, and enhancer RNA expression. We identify a serine phosphorylation site in BRG1 that is induced by neuronal stimulations and is sensitive to CaMKII inhibition. BRG1 phosphorylation affects its interaction with several transcription co-factors, including the NuRD repressor complex and cohesin, possibly modulating BRG1-mediated transcription outcomes. Using mice with knockin mutations, we show that non-phosphorylatable BRG1 fails to efficiently induce activity-dependent genes, whereas phosphomimic BRG1 increases enhancer activity and inducibility. These mutant mice display anxiety-like phenotypes and altered responses to stress. Therefore, we reveal a mechanism connecting neuronal signaling to enhancer activities through BRG1 phosphorylation.
    Keywords:  BRG1/SMARCA4; CaMKII; NuRD; PFI-3; chromatin remodeling; enhancer; enhancer-promoter looping; immediate early genes; neuronal activity-regulated genes; phosphorylation
  10. Mol Cell. 2021 Jul 12. pii: S1097-2765(21)00499-8. [Epub ahead of print]
      The super elongation complex (SEC) contains the positive transcription elongation factor b (P-TEFb) and the subcomplex ELL2-EAF1, which stimulates RNA polymerase II (RNA Pol II) elongation. Here, we report the cryoelectron microscopy (cryo-EM) structure of ELL2-EAF1 bound to a RNA Pol II elongation complex at 2.8 Å resolution. The ELL2-EAF1 dimerization module directly binds the RNA Pol II lobe domain, explaining how SEC delivers P-TEFb to RNA Pol II. The same site on the lobe also binds the initiation factor TFIIF, consistent with SEC binding only after the transition from transcription initiation to elongation. Structure-guided functional analysis shows that the stimulation of RNA elongation requires the dimerization module and the ELL2 linker that tethers the module to the RNA Pol II protrusion. Our results show that SEC stimulates elongation allosterically and indicate that this stimulation involves stabilization of a closed conformation of the RNA Pol II active center cleft.
    Keywords:  EAF1; ELL2; P-TEFb; RNA Pol II lobe and protrusion; RNA polymerase II; TFIIF; stimulation of transcription elongation; super elongation complex; transcription
  11. Bioinformatics. 2021 07 12. 37(Suppl_1): i317-i326
      MOTIVATION: Single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) provides new opportunities to dissect epigenomic heterogeneity and elucidate transcriptional regulatory mechanisms. However, computational modeling of scATAC-seq data is challenging due to its high dimension, extreme sparsity, complex dependencies and high sensitivity to confounding factors from various sources.
    RESULTS: Here, we propose a new deep generative model framework, named SAILER, for analyzing scATAC-seq data. SAILER aims to learn a low-dimensional nonlinear latent representation of each cell that defines its intrinsic chromatin state, invariant to extrinsic confounding factors like read depth and batch effects. SAILER adopts the conventional encoder-decoder framework to learn the latent representation but imposes additional constraints to ensure the independence of the learned representations from the confounding factors. Experimental results on both simulated and real scATAC-seq datasets demonstrate that SAILER learns better and biologically more meaningful representations of cells than other methods. Its noise-free cell embeddings bring in significant benefits in downstream analyses: clustering and imputation based on SAILER result in 6.9% and 18.5% improvements over existing methods, respectively. Moreover, because no matrix factorization is involved, SAILER can easily scale to process millions of cells. We implemented SAILER into a software package, freely available to all for large-scale scATAC-seq data analysis.
    AVAILABILITY AND IMPLEMENTATION: The software is publicly available at
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
  12. Bioinformatics. 2021 Jul 09. pii: btab507. [Epub ahead of print]
      MOTIVATION: Genome-wide profiling of transcription factor binding and chromatin states is a widely-used approach for mechanistic understanding of gene regulation. Recent technology development has enabled such profiling at single-cell resolution. However, an end-to-end computational pipeline for analyzing such data is still lacking.
    RESULTS: Here, we have developed a flexible pipeline for analysis and visualization of single-cell CUT&Tag and CUT&RUN data, which provides functions for sequence alignment, quality control, dimensionality reduction, cell clustering, data aggregation, and visualization. Furthermore, it is also seamlessly integrated with the functions in original CUT&RUNTools for population-level analyses. As such, this provides a valuable toolbox for the community.
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
  13. Bio Protoc. 2021 Jun 05. 11(11): e4043
      We previously introduced Cleavage Under Targets & Tagmentation (CUT&Tag), an epigenomic profiling method in which antibody tethering of the Tn5 transposase to a chromatin epitope of interest maps specific chromatin features in small samples and single cells. With CUT&Tag, intact cells or nuclei are permeabilized, followed by successive addition of a primary antibody, a secondary antibody, and a chimeric Protein A-Transposase fusion protein that binds to the antibody. Addition of Mg++ activates the transposase and inserts sequencing adapters into adjacent DNA in situ. We have since adapted CUT&Tag to also map chromatin accessibility by simply modifying the transposase activation conditions when using histone H3K4me2, H3K4me3, or Serine-5-phosphorylated RNA Polymerase II antibodies. Using these antibodies, we redirect the tagmentation of accessible DNA sites to produce chromatin accessibility maps with exceptionally high signal-to-noise and resolution. All steps from nuclei to amplified sequencing-ready libraries are performed in single PCR tubes using non-toxic reagents and inexpensive equipment, making our simplified strategy for simultaneous chromatin profiling and accessibility mapping suitable for the lab, home workbench, or classroom.
    Keywords:  CUT&Tag; Chromatin accessibility; Epigenomic profiling; Histone modifications; RNA polymerase II
  14. Genome Res. 2021 Jul 14. pii: gr.275269.121. [Epub ahead of print]
      Archived formalin-fixed paraffin-embedded (FFPE) samples are the global standard format for preservation of the majority of biopsies in both basic research and translational cancer studies, and profiling chromatin accessibility in archived FFPE tissues is fundamental to understanding gene regulation. Accurate mapping of chromatin accessibility from FFPE specimens is challenging because of the high degree of DNA damage. Here, we first show that standard ATAC-seq can be applied to purified FFPE nuclei but yields lower library complexity and a smaller proportion of long DNA fragments. We then present FFPE-ATAC, the first highly sensitive method for decoding chromatin accessibility in FFPE tissues that combines Tn5-mediated transposition and T7 in vitro transcription. FFPE-ATAC generates high-quality chromatin accessibility profiles with 500 nuclei from a single FFPE tissue section, enables the dissection of chromatin profiles from regions of interest with the aid of hematoxylin and eosin (H&E) staining, and reveals disease-associated chromatin regulation from human colorectal cancer FFPE tissue archived for more than 10 years. In summary, the approach allows decoding of the chromatin states that regulate gene expression in archival FFPE tissues, thereby permitting investigators to better understand epigenetic regulation in cancer and precision medicine.
  15. Proc Natl Acad Sci U S A. 2021 Jul 20. pii: e2105137118. [Epub ahead of print]118(29):
      During embryonic development, hierarchical cascades of transcription factors interact with lineage-specific chromatin structures to control the sequential steps in the differentiation of specialized cell types. While examples of transcription factor cascades have been well documented, the mechanisms underlying developmental changes in accessibility of cell type-specific enhancers remain poorly understood. Here, we show that the transcriptional "master regulator" ATOH1 (which is necessary for the differentiation of two distinct mechanoreceptor cell types, hair cells in the inner ear and Merkel cells of the epidermis) is unable to access much of its target enhancer network in the progenitor populations of either cell type when it first appears, imposing a block to further differentiation. This block is overcome by a feed-forward mechanism in which ATOH1 first stimulates expression of POU4F3, which subsequently acts as a pioneer factor to provide access to closed ATOH1 enhancers, allowing hair cell and Merkel cell differentiation to proceed. Our analysis also indicates the presence of both shared and divergent ATOH1/POU4F3-dependent enhancer networks in hair cells and Merkel cells. These cells share a deep developmental lineage relationship, deriving from their common epidermal origin, and suggesting that this feed-forward mechanism preceded the evolutionary divergence of these very different mechanoreceptive cell types.
    Keywords:  ATOH1 and POU4F3; Merkel cells; feed-forward epigenetic transcriptional control; inner ear hair cells; mechanoreceptor evolution
  16. Nat Commun. 2021 07 15. 12(1): 4335
      Astrocytes have essential functions in brain homeostasis that are established late in differentiation, but the mechanisms underlying the functional maturation of astrocytes are not well understood. Here we identify extensive transcriptional changes that occur during murine astrocyte maturation in vivo that are accompanied by chromatin remodelling at enhancer elements. Investigating astrocyte maturation in a cell culture model revealed that in vitro-differentiated astrocytes lack expression of many mature astrocyte-specific genes, including genes for the transcription factors Rorb, Dbx2, Lhx2 and Fezf2. Forced expression of these factors in vitro induces distinct sets of mature astrocyte-specific transcripts. Culturing astrocytes in a three-dimensional matrix containing FGF2 induces expression of Rorb, Dbx2 and Lhx2 and improves astrocyte maturity based on transcriptional and chromatin profiles. Therefore, extrinsic signals orchestrate the expression of multiple intrinsic regulators, which in turn induce in a modular manner the transcriptional and chromatin changes underlying astrocyte maturation.
  17. Bioinformatics. 2021 Jul 13. pii: btab513. [Epub ahead of print]
      MOTIVATION: Identifying histone tail modifications using ChIP-seq is commonly used in time-series experiments in development and disease. These assays, however, cover specific time-points leaving intermediate or early stages with missing information. Although several machine learning methods were developed to predict histone marks, none exploited the dependence that exists in time-series experiments between data generated at specific time-points to extrapolate these findings to time-points where data cannot be generated for lack or scarcity of materials (i.e., early developmental stages).
    RESULTS: Here, we train a deep learning model named TempoMAGE, to predict the presence or absence of H3K27ac in open chromatin regions by integrating information from sequence, gene expression, chromatin accessibility and the estimated change in H3K27ac state from a reference time-point. We show that adding reference time-point information systematically improves the overall model's performance. Additionally, sequence signatures extracted from our method were exclusive to the training dataset indicating that our model learned data-specific features. As an application, TempoMAGE was able to predict the activity of enhancers from pre-validated in-vivo dataset highlighting its ability to be used for functional annotation of putative enhancers.
    AVAILABILITY: TempoMAGE is freely available through GitHub at
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
  18. Mol Metab. 2021 Jul 08. pii: S2212-8778(21)00136-8. [Epub ahead of print] 101291
      OBJECTIVE: Type II nuclear hormone receptors, including farnesoid X receptors (FXR), liver X receptors (LXR), and peroxisome proliferator-activated receptors (PPAR), which serve as drug targets for metabolic diseases, are permanently positioned in the nucleus and thought to be bound to DNA regardless of the ligand status. However, recent genome-wide location analysis showed that LXRα and PPARα binding in the liver is largely ligand-dependent. We hypothesized that pioneer factor Foxa2 evicts nucleosomes to enable ligand-dependent binding of type II nuclear receptors and performed genome-wide studies to test this hypothesis.
    METHODS: ATAC-Seq was used to profile chromatin accessibility, ChIP-Seq was performed to assess transcription factor (Foxa2, FXR, LXRα, and PPARα) binding, and RNA-Seq analysis determined differentially expressed genes in wildtype and Foxa2 mutants treated with a ligand (GW4064 for FXR, GW3965 and T09 for LXRα).
    RESULTS: We show that chromatin accessibility, FXR binding and LXRα occupancy, and ligand-responsive activation of gene expression by FXR and LXRα require Foxa2. Unexpectedly, Foxa2 occupancy is drastically increased when either receptor, FXR or LXRα, is bound by an agonist. In addition, co-immunoprecipitation experiments demonstrate that Foxa2 interacts with either receptor in a ligand-dependent manner, suggesting that Foxa2 and the receptor bind DNA as an interdependent complex during ligand activation. Furthermore, PPARα binding is induced in Foxa2 mutants treated with FXR and LXR ligands, leading to activation of PPARα targets.
    CONCLUSIONS: Our model requiring pioneering activity for ligand activation challenges the existing ligand-independent binding mechanism. We also demonstrate that Foxa2 is required to achieve activation of the proper receptor, one that binds the added ligand, by repressing the activity of a competing receptor.
    Keywords:  FXR; Foxa2; LXR; lipid metabolism; liver; nuclear receptor; pioneer factor
  19. Bioinformatics. 2021 07 12. 37(Suppl_1): i272-i279
      MOTIVATION: The high-throughput chromosome conformation capture (Hi-C) technique has enabled genome-wide mapping of chromatin interactions. However, high-resolution Hi-C data requires costly, deep sequencing; therefore, it has only been achieved for a limited number of cell types. Machine learning models based on neural networks have been developed as a remedy to this problem.
    RESULTS: In this work, we propose a novel method, EnHiC, for predicting high-resolution Hi-C matrices from low-resolution input data based on a generative adversarial network (GAN) framework. Inspired by non-negative matrix factorization, our model fully exploits the unique properties of Hi-C matrices and extracts rank-1 features from multi-scale low-resolution matrices to enhance the resolution. Using three human Hi-C datasets, we demonstrated that EnHiC accurately and reliably enhanced the resolution of Hi-C matrices and outperformed other GAN-based models. Moreover, EnHiC-predicted high-resolution matrices facilitated the accurate detection of topologically associated domains and fine-scale chromatin interactions.
    AVAILABILITY AND IMPLEMENTATION: EnHiC is publicly available at
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
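      As a small illustration of what a rank-1 feature of a contact matrix can mean, the sketch below approximates a symmetric, non-negative matrix M by the outer product of a single vector v with itself, using plain power iteration. This is a generic linear-algebra example, not EnHiC's network; the function name and the toy matrix are assumptions.

```python
import math


def rank1_factor(matrix, iters=200):
    """Return v such that the outer product of v with itself approximates matrix.

    matrix is assumed to be square, symmetric and non-negative, like a
    Hi-C contact map. Power iteration finds the dominant eigenvector, which
    is then scaled by the square root of the dominant eigenvalue.
    """
    n = len(matrix)
    v = [1.0] * n  # positive start keeps the result non-negative
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n  # an all-zero matrix has no rank-1 signal
        v = [x / norm for x in w]

    # Rayleigh quotient gives the eigenvalue for the unit-length vector v.
    eigenvalue = sum(v[i] * matrix[i][j] * v[j] for i in range(n) for j in range(n))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [scale * x for x in v]


# Toy contact matrix built from a known profile, so the factor is recoverable.
profile = [1.0, 2.0, 3.0]
contacts = [[a * b for b in profile] for a in profile]
recovered = rank1_factor(contacts)
```

      Because the toy matrix is exactly rank 1, `recovered` matches `profile` up to floating-point error; real contact maps would only be approximated by such a factor.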
  20. Nat Commun. 2021 07 12. 12(1): 4249
      5-Hydroxymethylcytosine (5hmC) is an important epigenetic mark that regulates gene expression. Charting the landscape of 5hmC in human tissues is fundamental to understanding its regulatory functions. Here, we systematically profiled the whole-genome 5hmC landscape at single-base resolution for 19 types of human tissues. We found that 5hmC preferentially decorates gene bodies and outperforms gene body 5mC in reflecting gene expression. Approximately one-third of 5hmC peaks are tissue-specific differentially-hydroxymethylated regions (tsDhMRs), which are deposited in regions that potentially regulate the expression of nearby tissue-specific functional genes. In addition, tsDhMRs are enriched with tissue-specific transcription factors and may rewire tissue-specific gene expression networks. Moreover, tsDhMRs are associated with single-nucleotide polymorphisms identified by genome-wide association studies and are linked to tissue-specific phenotypes and diseases. Collectively, our results show the tissue-specific 5hmC landscape of the human genome and demonstrate that 5hmC serves as a fundamental regulatory element affecting tissue-specific gene expression programs and functions.
  21. Brief Bioinform. 2021 Jul 16. pii: bbab273. [Epub ahead of print]
      Transcription factors (TFs) are essential proteins in regulating the spatiotemporal expression of genes. It is crucial to infer the potential transcription factor binding sites (TFBSs) with high resolution to promote biology and realize precision medicine. Recently, deep learning-based models have shown exemplary performance in the prediction of TFBSs at the base-pair level. However, previous models fail to integrate nucleotide position information and semantic information without noisy responses, so there is still room for improvement. Moreover, both the inner mechanism and the prediction results of these models are challenging to interpret. To this end, the Deep Attentive Encoder-Decoder Neural Network (D-AEDNet) is developed to identify the location of TF-DNA binding sites in DNA sequences. In particular, our model adopts a Skip Architecture to leverage the nucleotide position information in the encoder and removes noisy responses in the information fusion process with an Attention Gate. In addition, Transcription Factor Motif Discovery based on Sliding Window (TF-MoDSW), an approach that discovers TF-DNA binding motifs from the output of neural networks, is proposed to help interpret the biological meaning of the predicted results. On ChIP-exo datasets, experimental results show that D-AEDNet performs better than competing methods. Visualization analysis further shows that the Attention Gate improves the interpretability of our model. Finally, experiments on ChIP-seq datasets confirm that D-AEDNet learns TF-DNA binding motifs better than state-of-the-art methods and that TF-MoDSW can discover biological sequence motifs in TF-DNA interactions.
    Keywords:  Attention Gate; interpretability; motif discovery; transcription factor binding sites
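      The general idea of sliding-window motif discovery from per-base network output can be sketched as follows. For each sequence, a fixed-width window is slid along the predicted binding scores, the highest-scoring window is kept, and the selected subsequences are stacked into a position frequency matrix. This is a generic sketch under assumed inputs, not the published TF-MoDSW procedure; the function names, window width and toy data are hypothetical.

```python
def best_window(scores, width):
    """Return the start index of the width-long window with the largest score sum."""
    current = sum(scores[:width])
    best_start, best_sum = 0, current
    for start in range(1, len(scores) - width + 1):
        # Update the running sum: add the base entering, drop the base leaving.
        current += scores[start + width - 1] - scores[start - 1]
        if current > best_sum:
            best_start, best_sum = start, current
    return best_start


def position_frequency_matrix(seqs, score_tracks, width):
    """Stack the best-scoring window of each sequence into per-position base frequencies.

    seqs         : DNA strings over A, C, G, T, each at least `width` long
    score_tracks : per-base predicted binding scores, one list per sequence
    """
    counts = [{base: 0 for base in "ACGT"} for _ in range(width)]
    for seq, scores in zip(seqs, score_tracks):
        start = best_window(scores, width)
        for offset, base in enumerate(seq[start:start + width]):
            counts[offset][base] += 1
    total = len(seqs)
    return [{base: c / total for base, c in column.items()} for column in counts]


# Two toy sequences whose (made-up) scores both peak over the 3-mer TGC.
toy_seqs = ["AATGCAT", "TGCAAAA"]
toy_scores = [
    [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
]
pfm = position_frequency_matrix(toy_seqs, toy_scores, width=3)
```

      Each column of `pfm` gives base frequencies at one motif position; with real predictions the columns would be less clear-cut than in this toy case.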
  22. Genome Biol. 2021 Jul 12. 22(1): 206
      BACKGROUND: Metazoan cells only utilize a small subset of the potential DNA replication origins to duplicate the whole genome in each cell cycle. Origin choice is linked to cell growth, differentiation, and replication stress. Although various genetic and epigenetic signatures have been linked to the replication efficiency of origins, there is no consensus on how the selection of origins is determined.
    RESULTS: We apply dual-color stochastic optical reconstruction microscopy (STORM) super-resolution imaging to map the spatial distribution of origins within individual topologically associating domains (TADs). We find that multiple replication origins initiate separately at the spatial boundary of a TAD at the beginning of the S phase. Intriguingly, while both high-efficiency and low-efficiency origins are distributed homogeneously in the TAD during the G1 phase, high-efficiency origins relocate to the TAD periphery before the S phase. Origin relocalization is dependent on both transcription and CTCF-mediated chromatin structure. Further, we observe that the replication machinery protein PCNA forms immobile clusters around TADs at the G1/S transition, explaining why origins at the TAD periphery are preferentially fired.
    CONCLUSION: Our work reveals a new origin selection mechanism in which the replication efficiency of origins is determined by their physical distribution in the chromatin domain, which undergoes a transcription-dependent structural re-organization process. Our model explains the complex links between replication origin efficiency and many genetic and epigenetic signatures that mark active transcription. The coordination between DNA replication, transcription, and chromatin organization inside individual TADs also provides new insights into the biological functions of sub-domain chromatin structural dynamics.
    Keywords:  Chromatin structure; Replication origin; STORM; Super-resolution imaging; Topologically associating domain (TAD); Transcription
  23. Nucleic Acids Res. 2021 Jul 14. pii: gkab607. [Epub ahead of print]
      Recent studies demonstrate that histones are subjected to a series of short-chain fatty acid modifications that are known as histone acylations. However, the enzymes responsible for histone acylations in vivo are not well characterized. Here, we report that HBO1 is a versatile histone acyltransferase that catalyzes not only histone acetylation but also propionylation, butyrylation and crotonylation both in vivo and in vitro and does so in a JADE or BRPF family scaffold protein-dependent manner. We show that the minimal HBO1/BRPF2 complex can accommodate acetyl-CoA, propionyl-CoA, butyryl-CoA and crotonyl-CoA. Comparison of CBP and HBO1 reveals that they catalyze histone acylations at overlapping as well as distinct sites, with HBO1 being the key enzyme for H3K14 acylations. Genome-wide chromatin immunoprecipitation assay demonstrates that HBO1 is highly enriched at and contributes to bulk histone acylations on the transcriptional start sites of actively transcribed genes. HBO1 promoter intensity highly correlates with the level of promoter histone acylation, but has no significant correlation with level of transcription. We also show that HBO1 is associated with a subset of DNA replication origins. Collectively, our study establishes HBO1 as a versatile histone acyltransferase that links histone acylations to promoter acylations and selection of DNA replication origins.
  24. Nat Cell Biol. 2021 Jul;23(7): 704-717
      Haematopoietic stem cells (HSCs) are normally quiescent, but have evolved mechanisms to respond to stress. Here, we evaluate haematopoietic regeneration induced by chemotherapy. We detect robust chromatin reorganization followed by increased transcription of transposable elements (TEs) during early recovery. TE transcripts bind to and activate the innate immune receptor melanoma differentiation-associated protein 5 (MDA5) that generates an inflammatory response that is necessary for HSCs to exit quiescence. HSCs that lack MDA5 exhibit an impaired inflammatory response after chemotherapy and retain their quiescence, with consequent better long-term repopulation capacity. We show that the overexpression of ERV and LINE superfamily TE copies in wild-type HSCs, but not in Mda5-/- HSCs, results in their cycling. By contrast, after knockdown of LINE1 family copies, HSCs retain their quiescence. Our results show that TE transcripts act as ligands that activate MDA5 during haematopoietic regeneration, thereby enabling HSCs to mount an inflammatory response necessary for their exit from quiescence.
  25. Proc Natl Acad Sci U S A. 2021 Jul 20. pii: e2101671118. [Epub ahead of print]118(29):
      Lipids are present within the cell nucleus, where they engage with factors involved in gene regulation. Cholesterol associates with chromatin in vivo and stimulates nucleosome packing in vitro, but its effects on specific transcriptional responses are not clear. Here, we show that the lipidated Wilms tumor 1 (WT1) transcriptional corepressor, brain acid soluble protein 1 (BASP1), interacts with cholesterol in the cell nucleus through a conserved cholesterol interaction motif. We demonstrate that BASP1 directly recruits cholesterol to the promoter region of WT1 target genes. Mutation of BASP1 to ablate its interaction with cholesterol or the treatment of cells with drugs that block cholesterol biosynthesis inhibits the transcriptional repressor function of BASP1. We find that the BASP1-cholesterol interaction is required for BASP1-dependent chromatin remodeling and the direction of transcription programs that control cell differentiation. Our study uncovers a mechanism for gene-specific targeting of cholesterol where it is required to mediate transcriptional repression.
    Keywords:  BASP1; WT1; cholesterol; chromatin; transcription
  26. Stem Cell Reports. 2021 Jun 28. pii: S2213-6711(21)00315-5. [Epub ahead of print]
      Histone variants contribute to the complexity of the chromatin landscape and play an integral role in defining DNA domains and regulating gene expression. The histone H3 variant H3.3 is incorporated into genic elements independent of DNA replication by its chaperone HIRA. Here we demonstrate that Hira is required for the self-renewal of adult hematopoietic stem cells (HSCs) and to restrain erythroid differentiation. Deletion of Hira led to rapid depletion of HSCs while differentiated hematopoietic cells remained largely unaffected. Depletion of HSCs after Hira deletion was accompanied by increased expression of bivalent and erythroid genes, which was exacerbated upon cell division and paralleled increased erythroid differentiation. Assessing H3.3 occupancy identified a subset of polycomb-repressed chromatin in HSCs that depends on HIRA to maintain the inaccessible, H3.3-occupied state for gene repression. HIRA-dependent H3.3 incorporation thus defines distinct repressive chromatin that represses erythroid differentiation of HSCs.
    Keywords:  Hira; differentiation; epigenetics; erythropoiesis; hematopoiesis; hematopoietic stem cell; histone H3.3; polycomb
  27. Nat Commun. 2021 Jul 16. 12(1): 4369
      There is a strong demand for methods that can efficiently reconstruct valid super-resolution intact genome 3D structures from sparse and noisy single-cell Hi-C data. Here, we develop Single-Cell Chromosome Conformation Calculator (Si-C) within the Bayesian theory framework and apply this approach to reconstruct intact genome 3D structures from single-cell Hi-C data of eight G1-phase haploid mouse ES cells. The inferred 100-kb and 10-kb structures consistently reproduce the known conserved features of chromatin organization revealed by independent imaging experiments. The analysis of the 10-kb resolution 3D structures reveals cell-to-cell varying domain structures in individual cells and hyperfine structures in domains, such as loops. An average of 0.2 contact reads per divided bin is sufficient for Si-C to obtain reliable structures. The valid super-resolution structures constructed by Si-C demonstrate the potential for visualizing and investigating interactions between all chromatin loci at the genome scale in individual cells.
  28. Cell Rep. 2021 Jul 13. pii: S2211-1247(21)00723-3. [Epub ahead of print]36(2): 109347
      Proper lung function relies on the precise balance of specialized epithelial cells that coordinate to maintain homeostasis. Herein, we describe essential roles for the transcriptional regulators YAP/TAZ in maintaining lung epithelial homeostasis, reporting that conditional deletion of Yap and Wwtr1/Taz in the lung epithelium of adult mice results in severe defects, including alveolar disorganization and the development of airway mucin hypersecretion. Through in vivo lineage tracing and in vitro molecular experiments, we reveal that reduced YAP/TAZ activity promotes intrinsic goblet transdifferentiation of secretory airway epithelial cells. Global gene expression and chromatin immunoprecipitation sequencing (ChIP-seq) analyses suggest that YAP/TAZ act cooperatively with TEA domain (TEAD) transcription factors and the NuRD complex to suppress the goblet cell fate program, directly repressing the SPDEF gene. Collectively, our study identifies YAP/TAZ as critical factors in lung epithelial homeostasis and offers molecular insight into the mechanisms promoting goblet cell differentiation, which is a hallmark of many lung diseases.
    Keywords:  Hippo; Mucin; NuRD; Spdef; Taz; Tead; Yap; goblet cell; lung
  29. Nat Commun. 2021 Jul 16. 12(1): 4362
      Squamous cell carcinomas (SCCs) comprise one of the most common histologic types of human cancer. Transcriptional dysregulation of SCC cells is orchestrated by tumor protein p63 (TP63), a master transcription factor (TF) and a well-researched SCC-specific oncogene. In the present study, both Gene Set Enrichment Analysis (GSEA) of SCC patient samples and in vitro loss-of-function assays establish fatty-acid metabolism as a key pathway downstream of TP63. Further studies identify sterol regulatory element binding transcription factor 1 (SREBF1) as a central mediator linking TP63 with fatty-acid metabolism, which regulates the biosynthesis of fatty acids, sphingolipids (SL), and glycerophospholipids (GPL), as revealed by liquid chromatography tandem mass spectrometry (LC-MS/MS)-based lipidomics. Moreover, a feedback co-regulatory loop consisting of SREBF1/TP63/Kruppel like factor 5 (KLF5) is identified, which promotes overexpression of all three TFs in SCCs. Downstream of SREBF1, a non-canonical, SCC-specific function is elucidated: SREBF1 cooperates with TP63/KLF5 to regulate hundreds of cis-regulatory elements across the SCC epigenome, which converge on activating cancer-promoting pathways. Indeed, SREBF1 is essential for SCC viability and migration, and its overexpression is associated with poor survival in SCC patients. Taken together, these data shed light on mechanisms of transcriptional dysregulation in cancer, identify specific epigenetic regulators of lipid metabolism, and uncover SREBF1 as a potential therapeutic target and prognostic marker in SCC.
  30. Nat Genet. 2021 Jul 15.
      In mammalian embryos, proper zygotic genome activation (ZGA) underlies totipotent development. Double homeobox (DUX)-family factors participate in ZGA, and mouse Dux is required for forming cultured two-cell (2C)-like cells. Remarkably, in mouse embryonic stem cells, Dux is activated by the tumor suppressor p53, and Dux expression promotes differentiation into expanded-fate cell types. Long-read sequencing and assembly of the mouse Dux locus reveals its complex chromatin regulation including putative positive and negative feedback loops. We show that the p53-DUX/DUX4 regulatory axis is conserved in humans. Furthermore, we demonstrate that cells derived from patients with facioscapulohumeral muscular dystrophy (FSHD) activate human DUX4 during p53 signaling via a p53-binding site in a primate-specific subtelomeric long terminal repeat (LTR)10C element. In summary, our work shows that p53 activation convergently evolved to couple p53 to Dux/DUX4 activation in embryonic stem cells, embryos and cells from patients with FSHD, potentially uniting the developmental and disease regulation of DUX-family factors and identifying evidence-based therapeutic opportunities for FSHD.
  31. Nat Aging. 2021 May;1(5): 454-472
      Cellular senescence restrains the expansion of neoplastic cells through several layers of regulation. We report that the histone H3-specific demethylase KDM4 is expressed as human stromal cells undergo senescence. In clinical oncology, upregulated KDM4 and diminished H3K9/H3K36 methylation correlate with poorer survival of prostate cancer patients post-chemotherapy. Global chromatin accessibility mapping via ATAC-seq, and expression profiling through RNA-seq, reveal global changes of chromatin openness and spatiotemporal reprogramming of the transcriptomic landscape, which underlie the senescence-associated secretory phenotype (SASP). Selective targeting of KDM4 dampens the SASP of senescent stromal cells, promotes cancer cell apoptosis in the treatment-damaged tumor microenvironment (TME), and prolongs survival of experimental animals. Our study supports dynamic changes of H3K9/H3K36 methylation during senescence, identifies an unusually permissive chromatin state, and unmasks KDM4 as a key SASP modulator. KDM4 targeting presents a novel therapeutic avenue to manipulate cellular senescence and limit its contribution to age-related pathologies including cancer.
  32. Blood Cancer Discov. 2021 Jul;2(4): 370-387
      Lysine demethylase 5A (KDM5A) is a negative regulator of histone H3K4 trimethylation, a histone mark associated with active gene transcription. We identify that KDM5A interacts with the P-TEFb complex and cooperates with MYC to control MYC-targeted genes in multiple myeloma (MM) cells. We develop a cell-permeable and selective KDM5 inhibitor, JQKD82, that increases histone H3K4me3 but paradoxically inhibits downstream MYC-driven transcriptional output in vitro and in vivo. Using genetic ablation together with our inhibitor, we establish that KDM5A supports MYC target gene transcription independent of MYC itself, by supporting TFIIH (CDK7)- and P-TEFb (CDK9)-mediated phosphorylation of RNAPII. These data identify KDM5A as a unique vulnerability in MM functioning through regulation of MYC-target gene transcription, and establish JQKD82 as a tool compound to block KDM5A function as a potential therapeutic strategy for MM.
    Keywords:  Epigenetics; Histone modifications; JQKD82; KDM5 inhibitor; Multiple myeloma; Transcription factor
  33. Sci Adv. 2021 Jul;pii: eabg7444. [Epub ahead of print]7(29):
      Histone H3K27M is a driving mutation in diffuse intrinsic pontine glioma (DIPG), a deadly pediatric brain tumor. H3K27M reshapes the epigenome through a global inhibition of PRC2 catalytic activity and displacement of H3K27me2/3, promoting oncogenesis of DIPG. As a consequence, a histone modification H3K36me2, antagonistic to H3K27me2/3, is aberrantly elevated. Here, we investigate the role of H3K36me2 in H3K27M-DIPG by tackling its upstream catalyzing enzymes (writers) and downstream binding factors (readers). We determine that NSD1 and NSD2 are the key writers for H3K36me2. Loss of NSD1/2 in H3K27M-DIPG impedes cellular proliferation and tumorigenesis by disrupting tumor-promoting transcriptional programs. Further, we demonstrate that LEDGF and HDGF2 are the main readers mediating the protumorigenic effects downstream of NSD1/2-H3K36me2. Treatment with a chemically modified peptide mimicking endogenous H3K36me2 dislodges LEDGF/HDGF2 from chromatin and specifically inhibits the proliferation of H3K27M-DIPG. Our results indicate a functional pathway of NSD1/2-H3K36me2-LEDGF/HDGF2 as an acquired dependency in H3K27M-DIPG.