bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2021–02–21
thirty-one papers selected by
Connor Rogerson, University of Cambridge, MRC Cancer Unit



  1. Nucleic Acids Res. 2021 Jan 27. pii: gkab034. [Epub ahead of print]
      Maintenance of stem-cell identity requires proper regulation of enhancer activity. Both transcription factors OCT4/SOX2/NANOG and histone methyltransferase complexes MLL/SET1 were shown to regulate enhancer activity, but how they are regulated in embryonic stem cells (ESCs) requires further study. Here, we report a transcription factor BACH1, which directly interacts with OCT4/SOX2/NANOG (OSN) and MLL/SET1 methyltransferase complexes and maintains pluripotency in mouse ESCs (mESCs). BTB domain and bZIP domain of BACH1 are required for these interactions and pluripotency maintenance. Loss of BACH1 reduced the interaction between NANOG and MLL1/SET1 complexes, and decreased their occupancy on chromatin, and further decreased H3 lysine 4 trimethylation (H3K4me3) level on gene promoters and (super-) enhancers, leading to decreased enhancer activity and transcription activity, especially on stemness-related genes. Moreover, BACH1 recruited NANOG through chromatin looping and regulated remote NANOG binding, fine-tuning enhancer-promoter activity and gene expression. Collectively, these observations suggest that BACH1 maintains pluripotency in ESCs by recruiting NANOG and MLL/SET1 complexes to chromatin and maintaining the trimethylated state of H3K4 and enhancer-promoter activity, especially on stemness-related genes.
    DOI:  https://doi.org/10.1093/nar/gkab034
  2. Cell Rep. 2021 Jan 26. pii: S2211-1247(20)31668-5. [Epub ahead of print]34(4): 108679
      Cells in renewing tissues exhibit dramatic transcriptional changes as they differentiate. The contribution of chromatin looping to tissue renewal is incompletely understood. Enhancer-promoter interactions could be relatively stable as cells transition from progenitor to differentiated states; alternatively, chromatin looping could be as dynamic as the gene expression from their loci. The intestinal epithelium is the most rapidly renewing mammalian tissue. Proliferative cells in crypts of Lieberkühn sustain a stream of differentiated cells that are continually shed into the lumen. We apply chromosome conformation capture combined with chromatin immunoprecipitation (HiChIP) and sequencing to measure enhancer-promoter interactions in progenitor and differentiated cells of the intestinal epithelium. Despite dynamic gene regulation across the differentiation axis, we find that enhancer-promoter interactions are relatively stable. Functionally, we find HNF4 transcription factors are required for chromatin looping at target genes. Depletion of HNF4 disrupts local chromatin looping, histone modifications, and target gene expression. This study provides insights into transcriptional regulatory mechanisms governing homeostasis in renewing tissues.
    Keywords:  3D chromatin looping; HNF4 transcription factors; HiChIP; chylomicron; crypt-villus axis; enhancer-promoter interactions; intestinal epithelium; lipid; lipogenesis; transcriptional regulation
    DOI:  https://doi.org/10.1016/j.celrep.2020.108679
  3. Cell Death Dis. 2021 Feb 19. 12(2): 197
      Transcription factors (TFs) regulate the expression of target genes, inducing changes in cell morphology or activities needed for cell fate determination and differentiation. The BMP signaling pathway is widely regarded as one of the most important pathways in vertebrate skeletal biology, of which BMP2 is a potent inducer, governing the osteoblast differentiation of bone marrow stromal cells (BMSCs). However, the mechanism by which BMP2 initiates its downstream transcription factor cascade and determines the direction of differentiation remains largely unknown. In this study, we used RNA-seq, ATAC-seq, and animal models to characterize the BMP2-dependent gene regulatory network governing osteoblast lineage commitment. Sp7-Cre; Bmp2fx/fx mice (BMP2-cKO) were generated and exhibited decreased bone density and lower osteoblast number (n > 6). In vitro experiments showed that BMP2-cKO mouse bone marrow stromal cells (mBMSCs) had an impact on osteoblast differentiation and deficient cell proliferation. Osteogenic medium-induced mBMSCs from BMP2-cKO and control mice were subjected to RNA-seq and ATAC-seq analysis to reveal differentially expressed TFs, along with their target open chromatin regions. Combined with H3K27Ac CUT&Tag during osteoblast differentiation, we identified 2338 BMP2-dependent osteoblast-specific active enhancers. Motif enrichment assay revealed that over 80% of these elements were directly targeted by RUNX2, DLX5, MEF2C, OASIS, and KLF4. We deactivated Klf4 in the Sp7+ lineage to validate the role of KLF4 in osteoblast differentiation of mBMSCs. Compared to the wild-type, Sp7-Cre; Klf4fx/+ mice (KLF4-Het) were smaller in size and had abnormal incisors resembling BMP2-cKO mice. Additionally, KLF4-Het mice had fewer osteoblasts and decreased osteogenic ability. RNA-seq and ATAC-seq revealed that KLF4 mainly "co-bound" with RUNX2 to regulate downstream genes. Given the significant overlap between KLF4- and BMP2-dependent NFRs and enriched motifs, our findings outline a comprehensive BMP2-dependent gene regulatory network specifically governing osteoblast differentiation of the Sp7+ lineage, in which Klf4 is a novel transcription factor.
    DOI:  https://doi.org/10.1038/s41419-021-03480-7
  4. Mol Cell. 2021 Jan 21. pii: S1097-2765(20)30961-8. [Epub ahead of print]
      While the role of transcription factors and coactivators in controlling enhancer activity and chromatin structure linked to gene expression is well established, the involvement of corepressors is not. Using inflammatory macrophage activation as a model, we investigate here a corepressor complex containing GPS2 and SMRT both genome-wide and at the Ccl2 locus, encoding the chemokine CCL2 (MCP-1). We report that corepressors co-occupy candidate enhancers along with the coactivators CBP (H3K27 acetylase) and MED1 (mediator) but act antagonistically by repressing eRNA transcription-coupled H3K27 acetylation. Genome editing, transcriptional interference, and cistrome analysis reveal that apparently related enhancer and silencer elements control Ccl2 transcription in opposite ways. 4C-seq indicates that corepressor depletion or inflammatory signaling functions mechanistically similarly to trigger enhancer activation. In ob/ob mice, adipose tissue macrophage-selective depletion of the Ccl2 enhancer-transcribed eRNA reduces metaflammation. Thus, the identified corepressor-eRNA-chemokine pathway operates in vivo and suggests therapeutic opportunities by targeting eRNAs in immuno-metabolic diseases.
    Keywords:  GPS2; SMRT; chromatin remodeling; corepressor; eRNA; enhancer; epigenetics; inflammation; macrophage; silencer
    DOI:  https://doi.org/10.1016/j.molcel.2020.12.040
  5. Nat Genet. 2021 Feb 18.
      Rapid cellular responses to environmental stimuli are fundamental for development and maturation. Immediate early genes can be transcriptionally induced within minutes in response to a variety of signals. How their induction levels are regulated and their untimely activation by spurious signals prevented during development is poorly understood. We found that in developing sensory neurons, before perinatal sensory-activity-dependent induction, immediate early genes are embedded into a unique bipartite Polycomb chromatin signature, carrying active H3K27ac on promoters but repressive Ezh2-dependent H3K27me3 on gene bodies. This bipartite signature is widely present in developing cell types, including embryonic stem cells. Polycomb marking of gene bodies inhibits mRNA elongation, dampening productive transcription, while still allowing for fast stimulus-dependent mark removal and bipartite gene induction. We reveal a developmental epigenetic mechanism regulating the rapidity and amplitude of the transcriptional response to relevant stimuli, while preventing inappropriate activation of stimulus-response genes.
    DOI:  https://doi.org/10.1038/s41588-021-00789-z
  6. Genes Dev. 2021 Feb 18.
      mSWI/SNF or BAF chromatin regulatory complexes are dosage-sensitive regulators of human neural development frequently mutated in autism spectrum disorders and intellectual disability. Cell cycle exit and differentiation of neural stem/progenitor cells is accompanied by BAF subunit switching to generate neuron-specific nBAF complexes. We manipulated the timing of BAF subunit exchange in vivo and found that early loss of the npBAF subunit BAF53a stalls the cell cycle to disrupt neurogenesis. Loss of BAF53a results in decreased chromatin accessibility at specific neural transcription factor binding sites, including the pioneer factors SOX2 and ASCL1, due to Polycomb accumulation. This results in repression of cell cycle genes, thereby blocking cell cycle progression and differentiation. Cell cycle block upon Baf53a deletion could be rescued by premature expression of the nBAF subunit BAF53b but not by other major drivers of proliferation or differentiation. WNT, EGF, bFGF, SOX2, c-MYC, or PAX6 all fail to maintain proliferation in the absence of BAF53a, highlighting a novel mechanism underlying neural progenitor cell cycle exit in the continued presence of extrinsic proliferative cues.
    Keywords:  BAF complex; cell cycle exit; cell type-specific transcriptional networks; chromatin accessibility; cortical development; neurogenesis
    DOI:  https://doi.org/10.1101/gad.342345.120
  7. Genomics Proteomics Bioinformatics. 2021 Feb 10. pii: S1672-0229(21)00012-7. [Epub ahead of print]
      The establishment of a landscape of enhancers across human cells is crucial to deciphering the mechanism of gene regulation, cell differentiation, and disease development. High-throughput experimental approaches, though having successfully reported enhancers in typical cell lines, are still too costly and time consuming to perform systematic identification of enhancers specific to different cell lines. Existing computational methods, though capable of predicting regulatory elements purely relying on DNA sequences, lack the power of cell line-specific screening. Recent studies have suggested that chromatin accessibility of a DNA segment is closely related to its potential function in regulation, and thus may provide useful information in identifying regulatory elements. Motivated by the above understanding, we integrate DNA sequences and chromatin accessibility data to accurately predict enhancers in a cell line-specific manner. We proposed DeepCAPE, a deep convolutional neural network to predict enhancers via the integration of DNA sequences and DNase-seq data. Benefitting from the well-designed feature extraction mechanism and skip connection strategy, our model not only consistently outperforms existing methods in the imbalanced classification of cell line-specific enhancers against background sequences, but also has the ability to self-adapt to different sizes of datasets. Besides, with the adoption of auto-encoder, our model is capable of making cross cell-line predictions. We further visualize kernels of the first convolutional layer and show the match of identified sequence signatures and known motifs. We finally demonstrate the potential ability of our model to explain functional implications of putative disease-associated genetic variants and discriminate disease-related enhancers.
    Keywords:  Chromatin accessibility; Data integration; Disease-associated regulatory element; Enhancer prediction; Transcription factor binding motif
    DOI:  https://doi.org/10.1016/j.gpb.2019.04.006
  8. BMC Biol. 2021 Feb 15. 19(1): 30
       BACKGROUND: The concentrations of distinct types of RNA in cells result from a dynamic equilibrium between RNA synthesis and decay. Despite the critical importance of RNA decay rates, current approaches for measuring them are generally labor-intensive, limited in sensitivity, and/or disruptive to normal cellular processes. Here, we introduce a simple method for estimating relative RNA half-lives that is based on two standard and widely available high-throughput assays: Precision Run-On sequencing (PRO-seq) and RNA sequencing (RNA-seq).
    RESULTS: Our method treats PRO-seq as a measure of transcription rate and RNA-seq as a measure of RNA concentration, and estimates the rate of RNA decay required for a steady-state equilibrium. We show that this approach can be used to assay relative RNA half-lives genome-wide, with good accuracy and sensitivity for both coding and noncoding transcription units. Using a structural equation model (SEM), we test several features of transcription units, nearby DNA sequences, and nearby epigenomic marks for associations with RNA stability after controlling for their effects on transcription. We find that RNA splicing-related features are positively correlated with RNA stability, whereas features related to miRNA binding and DNA methylation are negatively correlated with RNA stability. Furthermore, we find that a measure based on U1 binding and polyadenylation sites distinguishes between unstable noncoding and stable coding transcripts but is not predictive of relative stability within the mRNA or lincRNA classes. We also identify several histone modifications that are associated with RNA stability.
    CONCLUSION: We introduce an approach for estimating the relative half-lives of individual RNAs. Together, our estimation method and systematic analysis shed light on the pervasive impacts of RNA stability on cellular RNA concentrations.
    Keywords:  Epigenomics; PRO-seq; RNA half-life; RNA splicing; Structural equation modeling
    DOI:  https://doi.org/10.1186/s12915-021-00949-x
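The steady-state reasoning in entry 8 can be illustrated with a minimal sketch. It assumes only what the abstract states: PRO-seq signal stands in for transcription rate, RNA-seq signal stands in for RNA concentration, and at equilibrium synthesis equals decay rate times concentration. The function names, the toy transcription units, and their values are hypothetical, and the result is a relative half-life in arbitrary units, not the authors' implementation.

```python
import math


def relative_decay_rate(pro_seq, rna_seq):
    """Decay rate implied by steady state: synthesis = decay_rate * concentration.

    pro_seq and rna_seq are normalized signal densities (hypothetical units),
    so the returned rate is meaningful only relative to other transcription units.
    """
    if pro_seq <= 0 or rna_seq <= 0:
        raise ValueError("both signals must be positive")
    return pro_seq / rna_seq


def relative_half_life(pro_seq, rna_seq):
    """Relative half-life, ln(2) / decay rate, in arbitrary units."""
    return math.log(2) / relative_decay_rate(pro_seq, rna_seq)


# Hypothetical transcription units: (PRO-seq density, RNA-seq density).
units = {"tu_a": (10.0, 40.0), "tu_b": (10.0, 5.0)}
half_lives = {name: relative_half_life(p, r) for name, (p, r) in units.items()}
```

In this toy input both units are transcribed at the same rate, so the unit with the higher RNA-seq signal comes out as the more stable one. Only the ratios between units are interpretable.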
  9. Genome Biol. 2021 Feb 14. 22(1): 61
      Single-cell and bulk genomics assays have complementary strengths and weaknesses, and alone neither strategy can fully capture regulatory elements across the diversity of cells in complex tissues. We present CellWalker, a method that integrates single-cell open chromatin (scATAC-seq) data with gene expression (RNA-seq) and other data types using a network model that simultaneously improves cell labeling in noisy scATAC-seq and annotates cell type-specific regulatory elements in bulk data. We demonstrate CellWalker's robustness to sparse annotations and noise using simulations and combined RNA-seq and ATAC-seq in individual cells. We then apply CellWalker to the developing brain. We identify cells transitioning between transcriptional states, resolve regulatory elements to cell types, and observe that autism and other neurological traits can be mapped to specific cell types through their regulatory elements.
    DOI:  https://doi.org/10.1186/s13059-021-02279-1
  10. Endocrinology. 2021 Feb 16. pii: bqab036. [Epub ahead of print]
      The hormone, prolactin, has been implicated in breast cancer pathogenesis and regulates chromatin engagement by the transcription factor, STAT5A. STAT5A is known to inducibly bind promoters and cis-regulatory elements genome wide, though the mechanisms by which it exerts specificity and regulation of target gene expression remain enigmatic. We previously identified HDAC6 and HMGN2 as cofactors that facilitate prolactin induced, STAT5A mediated gene expression. Here, multi-condition STAT5A, HDAC6, and HMGN2 ChIP-seq with parallel condition RNA-seq are utilized to reveal the cis-regulatory landscape and cofactor dynamics underlying prolactin stimulated gene expression in breast cancer. We find that prolactin regulated genes are significantly enriched for cis-regulatory elements bound by HDAC6 and HMGN2, and that inducible STAT5A binding at enhancers, rather than promoters, conveys specificity for prolactin regulated genes. The selective HDAC6 inhibitor, ACY-241, blocks prolactin induced STAT5A chromatin engagement at cis-regulatory elements as well as a significant proportion of prolactin stimulated gene expression. We identify functional pathways known to contribute to the development and/or progression of breast cancer that are activated by prolactin and inhibited by ACY-241. Additionally, we find that the DNA sequences underlying shared STAT5A and HDAC6 binding-sites at enhancers are differentially enriched for estrogen response elements (ESR1 and ESR2 motifs) relative to enhancers bound by STAT5A alone. Gene set enrichment analysis identifies significant overlap of ERα regulated genes with genes regulated by prolactin, particularly prolactin regulated genes with promoters or enhancers co-occupied by both STAT5A and HDAC6. Lastly, the therapeutic efficacy of ACY-241 is demonstrated in in vitro and in vivo breast cancer models, where we identify synergistic ACY-241 drug combinations and observe differential sensitivity of ER+ models relative to ER- models.
    Keywords:  ACY-241; Citarinostat; HDAC6; STAT5A; breast cancer; prolactin
    DOI:  https://doi.org/10.1210/endocr/bqab036
  11. Nat Commun. 2021 Feb 16. 12(1): 1046
      Three-dimensional chromatin looping interactions play an important role in constraining enhancer-promoter interactions and mediating transcriptional gene regulation. CTCF is thought to play a critical role in the formation of these loops, but the specificity of which CTCF binding events form loops and which do not is difficult to predict. Loops often have convergent CTCF binding site motif orientation, but this constraint alone is only weakly predictive of genome-wide interaction data. Here we present an easily interpretable and simple mathematical model of CTCF mediated loop formation which is consistent with Cohesin extrusion and can predict ChIA-PET CTCF looping interaction measurements with high accuracy. Competition between overlapping loops is a critical determinant of loop specificity. We show that this model is consistent with observed chromatin interaction frequency changes induced by CTCF binding site deletion, inversion, and mutation, and is also consistent with observed constraints on validated enhancer-promoter interactions.
    DOI:  https://doi.org/10.1038/s41467-021-21368-0
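The convergent-orientation constraint mentioned in entry 11 can be sketched as a simple filter. This is not the paper's loop-formation model, only the orientation-only baseline that the abstract describes as weakly predictive: a candidate loop pairs an upstream CTCF site whose motif is on the "+" strand with a downstream site whose motif is on the "-" strand. The site coordinates, the `max_span` parameter, and the function name are hypothetical.

```python
from itertools import combinations


def convergent_pairs(sites, max_span=None):
    """List candidate loops between convergently oriented CTCF sites.

    sites: iterable of (position, strand) tuples on one chromosome,
           with strand given as "+" or "-".
    max_span: optional upper bound on anchor separation (hypothetical filter).
    """
    ordered = sorted(sites)  # left-to-right by genomic position
    pairs = []
    for (left_pos, left_strand), (right_pos, right_strand) in combinations(ordered, 2):
        # Convergent: the left motif points right (+), the right motif points left (-).
        if left_strand == "+" and right_strand == "-":
            if max_span is None or right_pos - left_pos <= max_span:
                pairs.append((left_pos, right_pos))
    return pairs


# Hypothetical CTCF sites on a single chromosome.
sites = [(100, "+"), (500, "-"), (900, "+"), (1500, "-")]
all_loops = convergent_pairs(sites)
short_loops = convergent_pairs(sites, max_span=700)
```

The filter returns overlapping candidates such as (100, 500) and (100, 1500) with no way to rank them, which is the gap the abstract's competition between overlapping loops is meant to address.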
  12. Nat Genet. 2021 Feb 18.
      The arrangement (syntax) of transcription factor (TF) binding motifs is an important part of the cis-regulatory code, yet remains elusive. We introduce a deep learning model, BPNet, that uses DNA sequence to predict base-resolution chromatin immunoprecipitation (ChIP)-nexus binding profiles of pluripotency TFs. We develop interpretation tools to learn predictive motif representations and identify soft syntax rules for cooperative TF binding interactions. Strikingly, Nanog preferentially binds with helical periodicity, and TFs often cooperate in a directional manner, which we validate using clustered regularly interspaced short palindromic repeat (CRISPR)-induced point mutations. Our model represents a powerful general approach to uncover the motifs and syntax of cis-regulatory sequences in genomics data.
    DOI:  https://doi.org/10.1038/s41588-021-00782-6
  13. Nature. 2021 Jan 27.
      METTL3 (methyltransferase-like 3) mediates the N6-methyladenosine (m6A) methylation of mRNA, which affects the stability of mRNA and its translation into protein. METTL3 also binds chromatin, but the role of METTL3 and m6A methylation in chromatin is not fully understood. Here we show that METTL3 regulates mouse embryonic stem-cell heterochromatin, the integrity of which is critical for silencing retroviral elements and for mammalian development. METTL3 predominantly localizes to the intracisternal A particle (IAP)-type family of endogenous retroviruses. Knockout of Mettl3 impairs the deposition of multiple heterochromatin marks onto METTL3-targeted IAPs, and upregulates IAP transcription, suggesting that METTL3 is important for the integrity of IAP heterochromatin. We provide further evidence that RNA transcripts derived from METTL3-bound IAPs are associated with chromatin and are m6A-methylated. These m6A-marked transcripts are bound by the m6A reader YTHDC1, which interacts with METTL3 and in turn promotes the association of METTL3 with chromatin. METTL3 also interacts physically with the histone 3 lysine 9 (H3K9) tri-methyltransferase SETDB1 and its cofactor TRIM28, and is important for their localization to IAPs. Our findings demonstrate that METTL3-catalysed m6A modification of RNA is important for the integrity of IAP heterochromatin in mouse embryonic stem cells, revealing a mechanism of heterochromatin regulation in mammals.
    DOI:  https://doi.org/10.1038/s41586-021-03210-1
  14. Elife. 2021 Feb 16. pii: e64684. [Epub ahead of print]10
      The mature cerebellum controls motor skill precision and participates in other sophisticated brain functions that include learning, cognition, and speech. Different types of GABAergic and glutamatergic cerebellar neurons originate in temporal order from two progenitor niches, the ventricular zone and rhombic lip, which express the transcription factors Ptf1a and Atoh1, respectively. However, the molecular machinery required to specify the distinct neuronal types emanating from these progenitor zones is still unclear. Here, we uncover the transcription factor Olig3 as a major determinant in generating the earliest neuronal derivatives emanating from both progenitor zones in mice. In the rhombic lip, Olig3 regulates progenitor cell proliferation. In the ventricular zone, Olig3 safeguards Purkinje cell specification by curtailing the expression of Pax2, a transcription factor that suppresses the Purkinje cell differentiation program. Our work thus defines Olig3 as a key factor in early cerebellar development.
    Keywords:  Olig3; bHLH transcription factors; cerebellar development; cerebellar hypoplasia; developmental biology; mouse; neuron specification; neuronal fate change; neuroscience
    DOI:  https://doi.org/10.7554/eLife.64684
  15. Nat Methods. 2021 Feb 15.
      Genome-wide profiling of histone modifications can reveal not only the location and activity state of regulatory elements, but also the regulatory mechanisms involved in cell-type-specific gene expression during development and disease pathology. Conventional assays to profile histone modifications in bulk tissues lack single-cell resolution. Here we describe an ultra-high-throughput method, Paired-Tag, for joint profiling of histone modifications and transcriptome in single cells to produce cell-type-resolved maps of chromatin state and transcriptome in complex tissues. We used this method to profile five histone modifications jointly with transcriptome in the adult mouse frontal cortex and hippocampus. Integrative analysis of the resulting maps identified distinct groups of genes subject to divergent epigenetic regulatory mechanisms. Our single-cell multiomics approach enables comprehensive analysis of chromatin state and gene regulation in complex tissues and characterization of gene regulatory programs in the constituent cell types.
    DOI:  https://doi.org/10.1038/s41592-021-01060-3
  16. J Biol Chem. 2021 Feb 10. pii: S0021-9258(21)00185-X. [Epub ahead of print] 100413
      Proper expression of Homeobox A cluster genes (HoxA) is essential for embryonic stem cell (ESC) differentiation and individual development. However, mechanisms controlling precise spatiotemporal expression of HoxA during early ESC differentiation remain poorly understood. Herein, we identified a functional CTCF-binding element (CBE+47) closest to the 3'-end of HoxA within the same topologically associated domain (TAD) in ESC. CRISPR-Cas9-mediated deletion of CBE+47 significantly upregulated HoxA expression and enhanced early ESC differentiation induced by retinoic acid (RA) relative to wild-type cells. Mechanistic analysis by chromosome conformation capture assay (Capture-C) revealed that CBE+47 deletion decreased interactions between adjacent enhancers, enabling formation of a relatively loose enhancer-enhancer interaction complex (EEIC), which overall increased interactions between that EEIC and central regions of HoxA chromatin. These findings indicate that CBE+47 organizes chromatin interactions between its adjacent enhancers and HoxA. Furthermore, deletion of those adjacent enhancers synergistically inhibited HoxA activation, suggesting that these enhancers serve as an EEIC required for RA-induced HoxA activation. Collectively, these results provide new insight into RA-induced HoxA expression during early ESC differentiation and highlight precise regulatory roles of the CTCF-binding element in orchestrating high-order chromatin structure.
    Keywords:  CTCF-binding element; HoxA; differentiation; embryonic stem cells; enhancer; enhancer-enhancer interaction complex; long-range chromatin interaction
    DOI:  https://doi.org/10.1016/j.jbc.2021.100413
  17. Nature. 2021 Jan 27.
      Many sequence variants have been linked to complex human traits and diseases, but deciphering their biological functions remains challenging, as most of them reside in noncoding DNA. Here we have systematically assessed the binding of 270 human transcription factors to 95,886 noncoding variants in the human genome using an ultra-high-throughput multiplex protein-DNA binding assay, termed single-nucleotide polymorphism evaluation by systematic evolution of ligands by exponential enrichment (SNP-SELEX). The resulting 828 million measurements of transcription factor-DNA interactions enable estimation of the relative affinity of these transcription factors to each variant in vitro and evaluation of the current methods to predict the effects of noncoding variants on transcription factor binding. We show that the position weight matrices of most transcription factors lack sufficient predictive power, whereas the support vector machine combined with the gapped k-mer representation shows much improved performance, when assessed on results from independent SNP-SELEX experiments involving a new set of 61,020 sequence variants. We report highly predictive models for 94 human transcription factors and demonstrate their utility in genome-wide association studies and understanding of the molecular pathways involved in diverse human traits and diseases.
    DOI:  https://doi.org/10.1038/s41586-021-03211-0
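For readers unfamiliar with the position weight matrix (PWM) baseline that entry 17 evaluates, the following is a minimal sketch of how a PWM is commonly used to score the effect of a single-nucleotide variant: the best log-odds match in the sequence carrying the alternate allele minus the best match in the reference sequence. The three-position motif, the uniform background, the sequences, and all function names are hypothetical; only the forward strand is scanned, and this is not the scoring pipeline used in the study.

```python
import math

BASES = "ACGT"


def log_odds_pwm(prob_rows, background=0.25):
    """Convert per-position base probabilities into log2-odds scores."""
    return [{b: math.log2(row[b] / background) for b in BASES} for row in prob_rows]


def best_score(pwm, seq):
    """Highest-scoring forward-strand window of the sequence under the PWM."""
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the motif")
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def allele_delta(pwm, ref_seq, alt_seq):
    """Predicted change in binding score caused by the variant (alt minus ref)."""
    return best_score(pwm, alt_seq) - best_score(pwm, ref_seq)


# Hypothetical 3-bp motif that prefers A, then C, then G.
toy_probs = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
toy_pwm = log_odds_pwm(toy_probs)

# Hypothetical variant: C>T at the central position of the motif match.
delta = allele_delta(toy_pwm, ref_seq="TACGT", alt_seq="TATGT")
```

A negative `delta` predicts weaker binding to the alternate allele. Because a PWM treats positions independently, it cannot capture dependencies between bases, which is one commonly cited limitation of this kind of model.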
  18. EMBO Rep. 2021 Feb 19. e51989
      During X chromosome inactivation (XCI), in female placental mammals, gene silencing is initiated by the Xist long non-coding RNA. Xist accumulation at the X leads to enrichment of specific chromatin marks, including PRC2-dependent H3K27me3 and SETD8-dependent H4K20me1. However, the dynamics of this process in relation to Xist RNA accumulation remains unknown as is the involvement of H4K20me1 in initiating gene silencing. To follow XCI dynamics in living cells, we developed a genetically encoded, H3K27me3-specific intracellular antibody or H3K27me3-mintbody. By combining live-cell imaging of H3K27me3, H4K20me1, the X chromosome and Xist RNA, with ChIP-seq analysis we uncover concurrent accumulation of both marks during XCI, albeit with distinct genomic distributions. Furthermore, using a Xist B and C repeat mutant, which still shows gene silencing on the X but not H3K27me3 deposition, we also find a complete lack of H4K20me1 enrichment. This demonstrates that H4K20me1 is dispensable for the initiation of gene silencing, although it may have a role in the chromatin compaction that characterises facultative heterochromatin.
    Keywords:  H4K20me1; X inactivation; embryonic stem cells; heterochromatin; polycomb
    DOI:  https://doi.org/10.15252/embr.202051989
  19. Development. 2021 Feb 16. pii: dev.197202. [Epub ahead of print]
      The Evf2 long non-coding RNA directs Dlx5/6 ultraconserved enhancer(UCE)-intrachromosomal interactions, regulating genes across a 27Mb region on chr6 in mouse developing forebrain. Here, we show that Evf2 long-range gene repression occurs through multi-step mechanisms involving the transcription factor Sox2. Evf2 directly interacts with Sox2, antagonizing Sox2 activation of Dlx5/6UCE, and recruits Sox2 to the Dlx5/6eii shadow enhancer and key Dlx5/6UCE interaction sites. Sox2 directly interacts with Dlx1 and Smarca4, as part of the Evf2 ribonucleoprotein complex, forming spherical subnuclear domains (protein pools, PPs). Evf2 targets Sox2 PPs to one long-range repressed target gene (Rbm28), at the expense of another (Akr1b8). Evf2 and Sox2 shift Dlx5/6UCE interactions towards Rbm28, linking Evf2/Sox2 co-regulated topological control and gene repression. We propose a model that distinguishes Evf2 gene repression mechanisms at Rbm28 (Dlx5/6UCE position) and Akr1b8 (limited Sox2 availability). Genome-wide control of RNPs (Sox2, Dlx and Smarca4) shows that co-recruitment influences Sox2 DNA binding. Together, these data suggest that Evf2 organizes a Sox2 PP subnuclear domain, and through Sox2-RNP sequestration and recruitment, regulates chr6 long-range UCE targeting and activity with genome-wide consequences.
    Keywords:  Architectural proteins; Chromosome 3D structure; Enhancers; Epigenetics; Forebrain development; Gene repression; LncRNA; Ribonucleoprotein complex; Transcription factor binding
    DOI:  https://doi.org/10.1242/dev.197202
  20. Elife. 2021 Feb 15. pii: e64444. [Epub ahead of print]10
      The canonical Wnt pathway transcriptional co-activator β-catenin regulates self-renewal and differentiation of mammalian nephron progenitor cells (NPCs). We modulated β-catenin levels in NPC cultures using the GSK3 inhibitor CHIR99021 (CHIR) to examine opposing developmental actions of β-catenin. Low CHIR-mediated maintenance and expansion of NPCs is independent of direct engagement of TCF/LEF/β-catenin transcriptional complexes at low CHIR-dependent cell-cycle targets. In contrast, in high CHIR, TCF7/LEF1/β-catenin complexes replaced TCF7L1/TCF7L2 binding on enhancers of differentiation-promoting target genes. Chromosome conformation studies showed pre-established promoter-enhancer connections to these target genes in NPCs. High CHIR-associated de novo looping was observed in positive transcriptional feedback regulation to the canonical Wnt pathway. Thus, β-catenin's direct transcriptional role is restricted to the induction of NPCs where rising β-catenin levels switch inhibitory TCF7L1/TCF7L2 complexes to activating LEF1/TCF7 complexes at primed gene targets poised for rapid initiation of a nephrogenic program.
    Keywords:  developmental biology; mouse
    DOI:  https://doi.org/10.7554/eLife.64444
  21. Genome Biol. 2021 Feb 18. 22(1): 63
      The integration of single-cell RNA-sequencing datasets from multiple sources is critical for deciphering cell-to-cell heterogeneities and interactions in complex biological systems. We present a novel unsupervised batch effect removal framework, called iMAP, based on both deep autoencoders and generative adversarial networks. Compared with current methods, iMAP shows superior, robust, and scalable performance in terms of both reliably detecting the batch-specific cells and effectively mixing distributions of the batch-shared cell types. Applying iMAP to tumor microenvironment datasets from two platforms, Smart-seq2 and 10x Genomics, we find that iMAP can leverage the powers of both platforms to discover novel cell-cell interactions.
    Keywords:  Data integration; Deep learning; GAN; scRNA-seq
    DOI:  https://doi.org/10.1186/s13059-021-02280-8
  22. Proc Natl Acad Sci U S A. 2021 Feb 23. pii: e2019052118. [Epub ahead of print]118(8):
      Both gene repressor (Polycomb-dependent) and activator (Polycomb-independent) functions of the Polycomb protein enhancer of zeste homolog 2 (EZH2) are implicated in cancer progression. EZH2 protein can be phosphorylated at various residues, such as threonine 487 (T487), by CDK1 kinase, and such phosphorylation acts as a Polycomb repressive complex 2 (PRC2) suppression "code" to mediate the gene repressor-to-activator switch of EZH2 functions. Here we demonstrate that the histone reader protein ZMYND8 is overexpressed in human clear cell renal cell carcinoma (ccRCC). ZMYND8 binds to EZH2, and their interaction is largely enhanced by CDK1 phosphorylation of EZH2 at T487. ZMYND8 depletion not only enhances Polycomb-dependent function of EZH2 in hypoxia-exposed breast cancer cells or von Hippel-Lindau (VHL)-deficient ccRCC cells, but also suppresses the FOXM1 transcription program. We further show that ZMYND8 is required for EZH2-FOXM1 interaction and is important for FOXM1-dependent matrix metalloproteinase (MMP) gene expression and EZH2-mediated migration and invasion of VHL-deficient ccRCC cells. Our results identify a previously uncharacterized role of the chromatin reader ZMYND8 in recognizing the PRC2-inhibitory phosphorylation "code" essential for the Polycomb-dependent to -independent switch of EZH2 functions. They also reveal an oncogenic pathway driving cell migration and invasion in hypoxia-inducible factor-activated (hypoxia or VHL-deficient) cancer.
    Keywords:  EZH2; PRC2; ZMYND8; cancer; hypoxia
    DOI:  https://doi.org/10.1073/pnas.2019052118
  23. New Phytol. 2021 Feb 19.
      Chromatin modifications play important roles in plant adaptation to abiotic stresses, but the precise function of histone H3 lysine 36 (H3K36) methylation in drought tolerance remains poorly evaluated. Here, we report that SDG708, a specific H3K36 methyltransferase, functions as a positive regulator of drought tolerance in rice. SDG708 promoted abscisic acid (ABA) biosynthesis by directly targeting and activating the crucial ABA biosynthesis genes NINE-CIS-EPOXYCAROTENOID DIOXYGENASE 3 (OsNCED3) and NINE-CIS-EPOXYCAROTENOID DIOXYGENASE 5 (OsNCED5). Additionally, SDG708 induced hydrogen peroxide accumulation in the guard cells and promoted stomatal closure to reduce water loss. Overexpression of SDG708 concomitantly enhanced rice drought tolerance and increased grain yield under normal and drought stress conditions. Thus, SDG708 is potentially useful as an epigenetic regulator in breeding for grain yield improvement.
    Keywords:  H3K36 methylation; abscisic acid (ABA) synthesis; drought tolerance; grain yield; rice
    DOI:  https://doi.org/10.1111/nph.17290
  24. Nat Immunol. 2021 Feb 18.
      The transcription factor IRF8 is essential for the development of monocytes and dendritic cells (DCs), whereas it inhibits neutrophilic differentiation. It is unclear how Irf8 expression is regulated and how this single transcription factor supports the generation of both monocytes and DCs. Here, we identified a RUNX-CBFβ-driven enhancer 56 kb downstream of the Irf8 transcription start site. Deletion of this enhancer in vivo significantly decreased Irf8 expression throughout the myeloid lineage from the progenitor stages, thus resulting in loss of common DC progenitors and overproduction of Ly6C+ monocytes. We demonstrated that high, low or null expression of IRF8 in hematopoietic progenitor cells promotes differentiation toward type 1 conventional DCs, Ly6C+ monocytes or neutrophils, respectively, via epigenetic regulation of distinct sets of enhancers in cooperation with other transcription factors. Our results illustrate the mechanism through which IRF8 controls the lineage choice in a dose-dependent manner within the myeloid cell system.
    DOI:  https://doi.org/10.1038/s41590-021-00871-y
  25. Science. 2021 Feb 19. pii: eabc6405. [Epub ahead of print]371(6531):
      Genes with novel cellular functions may evolve through exon shuffling, which can assemble novel protein architectures. Here, we show that DNA transposons provide a recurrent supply of materials to assemble protein-coding genes through exon shuffling. We find that transposase domains have been captured, primarily via alternative splicing, to form fusion proteins at least 94 times independently over the course of ~350 million years of tetrapod evolution. We find an excess of transposase DNA binding domains fused to host regulatory domains, especially the Krüppel-associated box (KRAB) domain, and identify four independently evolved KRAB-transposase fusion proteins repressing gene expression in a sequence-specific fashion. The bat-specific KRABINER fusion protein binds its cognate transposons genome-wide and controls a network of genes and cis-regulatory elements. These results illustrate how a transcription factor and its binding sites can emerge.
    DOI:  https://doi.org/10.1126/science.abc6405
  26. Clin Epigenetics. 2021 Feb 17. 13(1): 37
       BACKGROUND: BRG1 (encoded by SMARCA4) is a catalytic component of the SWI/SNF chromatin remodelling complex, with key roles in modulating DNA accessibility. Dysregulation of BRG1 is observed, but functionally uncharacterised, in a wide range of malignancies. We have probed the functions of BRG1 on a background of prostate cancer to investigate how BRG1 controls gene expression programmes and cancer cell behaviour.
    RESULTS: Our investigation of SMARCA4 revealed that BRG1 is over-expressed in the majority of the 486 tumours from The Cancer Genome Atlas prostate cohort, as well as in a complementary panel of 21 prostate cell lines. Next, we utilised a temporal model of BRG1 depletion to investigate the molecular effects on global transcription programmes. Depleting BRG1 had no impact on alternative splicing and conferred only modest effect on global expression. However, of the transcriptional changes that occurred, most manifested as down-regulated expression. Deeper examination found the common thread linking down-regulated genes was involvement in proliferation, including several known to increase prostate cancer proliferation (KLK2, PCAT1 and VAV3). Interestingly, the promoters of genes driving proliferation were bound by BRG1 as well as the transcription factors, AR and FOXA1. We also noted that BRG1 depletion repressed genes involved in cell cycle progression and DNA replication, but intriguingly, these pathways operated independently of AR and FOXA1. In agreement with transcriptional changes, depleting BRG1 conferred G1 arrest.
    CONCLUSIONS: Our data have revealed that BRG1 promotes cell cycle progression and DNA replication, consistent with the increased cell proliferation associated with oncogenesis.
    Keywords:  BRG1; Cancer; Cell cycle; Chromatin remodelling; DNA replication; Gene expression; SMARCA4; Transcription
    DOI:  https://doi.org/10.1186/s13148-021-01023-7
  27. Nat Commun. 2021 Feb 16. 12(1): 1072
      In addition to nucleosomes, chromatin contains non-histone chromatin-associated proteins, of which the high-mobility group proteins are the most abundant. Chromatin-mediated regulation of transcription involves DNA methylation and histone modifications. However, the order of events and the precise function of high-mobility group proteins during transcription initiation remain unclear. Here we show that high-mobility group AT-hook 2 protein (HMGA2) induces DNA nicks at the transcription start site, which are required by the histone chaperone FACT complex to incorporate nucleosomes containing the histone variant H2A.X. Further, phosphorylation of H2A.X at S139 (γ-H2AX) is required for repair-mediated DNA demethylation and transcription activation. The relevance of these findings is demonstrated within the context of TGFB1 signaling and idiopathic pulmonary fibrosis, suggesting therapies against this lethal disease. Our data support the concept that chromatin opening during transcriptional initiation involves intermediates with DNA breaks that subsequently require DNA repair mechanisms to ensure genome integrity.
    DOI:  https://doi.org/10.1038/s41467-021-21227-y
  28. Front Genet. 2020. 11: 618478
      Assay for transposase-accessible chromatin using sequencing (ATAC-seq) is an efficient and precise method for revealing chromatin accessibility across the genome. Most of the current ATAC-seq tools follow chromatin immunoprecipitation sequencing (ChIP-seq) strategies that do not consider ATAC-seq-specific properties. To incorporate specific ATAC-seq quality control and the underlying biology of chromatin accessibility, we developed a bioinformatics software package named ATACgraph for analyzing and visualizing ATAC-seq data. ATACgraph profiles accessible chromatin regions and provides ATAC-seq-specific information including definitions of nucleosome-free regions (NFRs) and nucleosome-occupied regions. ATACgraph also allows identification of differentially accessible regions between two ATAC-seq datasets. ATACgraph integrates a Docker image with the Galaxy platform to provide an intuitive user experience via a graphical interface. Without tedious installation processes on a local machine or cloud, users can analyze data through activated websites using pre-designed workflows or customized pipelines composed of ATACgraph modules. Overall, ATACgraph is an effective ATAC-seq tool that lets biologists with minimal bioinformatics knowledge analyze chromatin accessibility. ATACgraph can be run on any ATAC-seq data and is not limited to specific genomes. As validation, we demonstrated ATACgraph on the human genome to showcase its functions for ATAC-seq interpretation. This software is publicly accessible and can be downloaded at https://github.com/RitataLU/ATACgraph.
    Keywords:  ATAC-seq; bioinformatics; chromatin accessibility; epigenomics; genomics; next-generation sequencing
    DOI:  https://doi.org/10.3389/fgene.2020.618478
  29. Nature. 2021 Feb 17.
      The repair of DNA double-strand breaks (DSBs) is essential for safeguarding genome integrity. When a DSB forms, the PI3K-related ATM kinase rapidly triggers the establishment of megabase-sized chromatin domains decorated with phosphorylated histone H2AX (γH2AX), which act as seeds for the formation of DNA-damage response foci. It is unclear how these foci are rapidly assembled to establish a 'repair-prone' environment within the nucleus. Topologically associating domains are a key feature of 3D genome organization that compartmentalize transcription and replication, but little is known about their contribution to DNA repair processes. Here we show that topologically associating domains are functional units of the DNA damage response, and are instrumental for the correct establishment of γH2AX-53BP1 chromatin domains in a manner that involves one-sided cohesin-mediated loop extrusion on both sides of the DSB. We propose a model in which H2AX-containing nucleosomes are rapidly phosphorylated as they actively pass by DSB-anchored cohesin. Our work highlights the importance of chromosome conformation in the maintenance of genome integrity and demonstrates the establishment of a chromatin modification by loop extrusion.
    DOI:  https://doi.org/10.1038/s41586-021-03193-z
  30. BMC Bioinformatics. 2021 Feb 15. 22(1): 69
       BACKGROUND: Chromatin immunoprecipitation followed by high throughput sequencing (ChIP-seq), initially introduced more than a decade ago, is widely used by the scientific community to detect protein/DNA binding and histone modifications across the genome. Every experiment is prone to noise and bias, and ChIP-seq experiments are no exception. To alleviate bias, the incorporation of control datasets in ChIP-seq analysis is an essential step. The controls are used to account for the background signal, while the remainder of the ChIP-seq signal captures true binding or histone modification. However, a recurrent issue is different types of bias in different ChIP-seq experiments. Depending on which controls are used, different aspects of ChIP-seq bias are better or worse accounted for, and peak calling can produce different results for the same ChIP-seq experiment. Consequently, generating "smart" controls, which model the non-signal effect for a specific ChIP-seq experiment, could enhance contrast and increase the reliability and reproducibility of the results.
    RESULT: We propose a peak calling algorithm, Weighted Analysis of ChIP-seq (WACS), which is an extension of the well-known peak caller MACS2. There are two main steps in WACS: First, weights are estimated for each control using non-negative least squares regression. The goal is to customize controls to model the noise distribution for each ChIP-seq experiment. This is then followed by peak calling. We demonstrate that WACS significantly outperforms MACS2 and AIControl, another recent algorithm for generating smart controls, in the detection of enriched regions along the genome, in terms of motif enrichment and reproducibility analyses.
    CONCLUSIONS: This ultimately improves our understanding of ChIP-seq controls and their biases, and shows that WACS results in a better approximation of the noise distribution in controls.
    Keywords:  Bias; ChIP-seq; Controls
    DOI:  https://doi.org/10.1186/s12859-020-03927-2
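      The control-weighting step described in the abstract above (non-negative least squares over several control tracks) can be illustrated with a toy sketch. This is not the WACS implementation: the function names `weighted_control` and `nnls_weights`, the binned coverage values, and the use of a simple projected-gradient loop in place of a dedicated NNLS solver are all assumptions made for illustration.

```python
def weighted_control(controls, weights):
    """Combine control coverage tracks into one track using per-control weights."""
    n_bins = len(controls[0])
    return [
        sum(w * track[i] for w, track in zip(weights, controls))
        for i in range(n_bins)
    ]


def nnls_weights(chip, controls, iterations=500):
    """Estimate weights w >= 0 that minimise ||chip - sum_j w[j] * controls[j]||^2.

    Uses projected gradient descent (an assumed stand-in for a proper
    NNLS routine): take a gradient step, then clip negative weights to zero.
    """
    # The sum of squared entries bounds the largest eigenvalue of C^T C,
    # so 1 / bound is a safe step size.
    bound = sum(v * v for track in controls for v in track) or 1.0
    weights = [0.0] * len(controls)
    for _ in range(iterations):
        fitted = weighted_control(controls, weights)
        residual = [f - y for f, y in zip(fitted, chip)]
        for j, track in enumerate(controls):
            gradient = sum(r * c for r, c in zip(residual, track))
            weights[j] = max(0.0, weights[j] - gradient / bound)
    return weights


# Hypothetical binned read counts: one ChIP track and two control tracks.
chip_track = [2.0, 4.0, 6.0, 8.0]
control_tracks = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]

weights = nnls_weights(chip_track, control_tracks)
background = weighted_control(control_tracks, weights)
print(weights, background)
```

      In this sketch the weighted background track would then be handed to a peak caller in place of a single pooled control. The non-negativity constraint keeps each control's contribution interpretable as a share of the background signal.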
  31. Epigenetics. 2021 Feb 17. 1-16
      Genome-wide association studies (GWAS) have identified SNPs linked with lung cancer risk. Our aim was to discover the genes, non-coding RNAs, and regulatory elements within GWAS-identified risk regions that are deregulated in non-small cell lung carcinoma (NSCLC) to identify novel, clinically targetable genes and mechanisms in carcinogenesis. A targeted bisulphite-sequencing approach was used to comprehensively investigate DNA methylation changes occurring within lung cancer risk regions in 17 NSCLC and adjacent normal tissue pairs. We report differences in differentially methylated regions between adenocarcinoma and squamous cell carcinoma. Among the minimal regions found to be differentially methylated in at least 50% of the patients, 7 candidates were replicated in 2 independent cohorts (n = 27 and n = 87) and the potential of 6 as methylation-dependent regulatory elements was confirmed by functional assays. This study contributes to understanding the pathways implicated in lung cancer initiation and progression, and provides new potential targets for cancer treatment.
    Keywords:  DMR; DNA methylation; Risk SNPs; bisulphite sequencing; enhancers; lung cancer
    DOI:  https://doi.org/10.1080/15592294.2021.1878723