bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2023‒05‒14
twenty-one papers selected by
Connor Rogerson
University of Cambridge

  1. Mol Cell. 2023 Apr 28. pii: S1097-2765(23)00255-1. [Epub ahead of print]
      At active human genes, the +1 nucleosome is located downstream of the RNA polymerase II (RNA Pol II) pre-initiation complex (PIC). However, at inactive genes, the +1 nucleosome is found further upstream, at a promoter-proximal location. Here, we establish a model system to show that a promoter-proximal +1 nucleosome can reduce RNA synthesis in vivo and in vitro, and we analyze its structural basis. We find that the PIC assembles normally when the edge of the +1 nucleosome is located 18 base pairs (bp) downstream of the transcription start site (TSS). However, when the nucleosome edge is located further upstream, only 10 bp downstream of the TSS, the PIC adopts an inhibited state. The transcription factor IIH (TFIIH) shows a closed conformation and its subunit XPB contacts DNA with only one of its two ATPase lobes, inconsistent with DNA opening. These results provide a mechanism for nucleosome-dependent regulation of transcription initiation.
    Keywords:  +1 nucleosome; RNA polymerase II; gene regulation; pre-initiation complex; promoter-proximal +1 nucleosome; transcription initiation; transcription reduction
  2. Nat Commun. 2023 May 11. 14(1): 2712
      Transcriptional regulation is commonly governed by alternative promoters. However, the regulatory architecture in alternative and reference promoters, and how they differ, remains elusive. In 100 CAGE-seq libraries from hepatocellular carcinoma patients, here we annotate 4083 alternative promoters in 2926 multi-promoter genes, which are largely undetected in normal livers. These genes are enriched in oncogenic processes and predominantly show association with overall survival. Alternative promoters are narrow nucleosome depleted regions, CpG island depleted, and enriched for tissue-specific transcription factors. Globally tumors lose DNA methylation. We show hierarchical retention of intragenic DNA methylation with CG-poor regions rapidly losing methylation, while CG-rich regions retain it, a process mediated by differential SETD2, H3K36me3, DNMT3B, and TET1 binding. This mechanism is validated in SETD2 knockdown cells and SETD2-mutated patients. Selective DNA methylation loss in CG-poor regions makes the chromatin accessible for alternative transcription. We show alternative promoters can control tumor transcriptomes and their regulatory architecture.
  3. Nat Genet. 2023 May 08.
      Although enhancers are central regulators of mammalian gene expression, the mechanisms underlying enhancer-promoter (E-P) interactions remain unclear. Chromosome conformation capture (3C) methods effectively capture large-scale three-dimensional (3D) genome structure but struggle to achieve the depth necessary to resolve fine-scale E-P interactions. Here, we develop Region Capture Micro-C (RCMC) by combining micrococcal nuclease (MNase)-based 3C with a tiling region-capture approach and generate the deepest 3D genome maps reported with only modest sequencing. By applying RCMC in mouse embryonic stem cells and reaching the genome-wide equivalent of ~317 billion unique contacts, RCMC reveals previously unresolvable patterns of highly nested and focal 3D interactions, which we term microcompartments. Microcompartments frequently connect enhancers and promoters, and although loss of loop extrusion and inhibition of transcription disrupts some microcompartments, most are largely unaffected. We therefore propose that many E-P interactions form through a compartmentalization mechanism, which may partially explain why acute cohesin depletion only modestly affects global gene expression.
  4. Mol Cell. 2023 Apr 28. pii: S1097-2765(23)00285-X. [Epub ahead of print]
      The HUSH complex recognizes and silences foreign DNA such as viruses, transposons, and transgenes without prior exposure to its targets. Here, we show that endogenous targets of the HUSH complex fall into two distinct classes based on the presence or absence of H3K9me3. These classes are further distinguished by their transposon content and differential response to the loss of HUSH. A de novo genomic rearrangement at the Sox2 locus induces a switch from H3K9me3-independent to H3K9me3-associated HUSH targeting, resulting in silencing. We further demonstrate that HUSH interacts with the termination factor WDR82 and, via its component MPP8, with nascent RNA. HUSH accumulates at sites of high RNAPII occupancy including long exons and transcription termination sites in a manner dependent on WDR82 and CPSF. Together, our results uncover the functional diversity of HUSH targets and show that this vertebrate-specific complex exploits evolutionarily ancient transcription termination machinery for co-transcriptional chromatin targeting and genome surveillance.
    Keywords:  CPSF; HUSH complex; MPP8; SETDB1; TASOR; WDR82; gene silencing; heterochromatin; transcription termination; transposable elements
  5. Mol Syst Biol. 2023 May 09. e11392
      Many genes are co-expressed and form genomic domains of coordinated gene activity. However, the regulatory determinants of domain co-activity remain unclear. Here, we leverage human individual variation in gene expression to characterize the co-regulatory processes underlying domain co-activity and systematically quantify their effect sizes. We employ transcriptional decomposition to extract from RNA expression data an expression component related to co-activity revealed by genomic positioning. This strategy reveals close to 1,500 co-activity domains, covering most expressed genes, of which the large majority are invariable across individuals. Focusing specifically on domains with high variability in co-activity reveals that contained genes have a higher sharing of eQTLs, a higher variability in enhancer interactions, and an enrichment of binding by variably expressed transcription factors, compared to genes within non-variable domains. Through careful quantification of the relative contributions of regulatory processes underlying co-activity, we find transcription factor expression levels to be the main determinant of gene co-activity. Our results indicate that distal trans effects contribute more than local genetic variation to individual variation in co-activity domains.
    Keywords:  co-activity domains; co-regulation; gene regulation; individual variation; transcriptional decomposition
  6. Bioinform Adv. 2023; 3(1): vbad055
      Summary: Transcription factors (TFs) are proteins that directly interpret the genome to regulate gene expression and determine cellular phenotypes. TF identification is a common first step in unraveling gene regulatory networks. We present CREPE, an R Shiny app to catalogue and annotate TFs. CREPE was benchmarked against curated human TF datasets. Next, we use CREPE to explore the TF repertoires of Heliconius erato and Heliconius melpomene butterflies. Availability and implementation: CREPE is available as a Shiny app package available at GitHub (
    Supplementary information: Supplementary data are available at Bioinformatics Advances online.
  7. Sci Rep. 2023 May 10. 13(1): 7589
      The onset of erythropoiesis is under strict developmental control, with direct and indirect inputs influencing its derivation from the hematopoietic stem cell. A major regulator of this transition is KLF1/EKLF, a zinc finger transcription factor that plays a global role in all aspects of erythropoiesis. Here, we have identified a short, conserved enhancer element in KLF1 intron 1 that is important for establishing optimal levels of KLF1 in mouse and human cells. Chromatin accessibility of this site exhibits cell-type specificity and is under developmental control during the differentiation of human CD34+ cells towards the erythroid lineage. This site binds GATA1, SMAD1, TAL1, and ETV6. In vivo editing of this region in cell lines and primary cells reduces KLF1 expression quantitatively. However, we find that, similar to observations seen in pedigrees of families with KLF1 mutations, downstream effects are variable, suggesting that the global architecture of the site is buffered towards keeping the KLF1 genetic region in an active state. We propose that modification of intron 1 in both alleles is not equivalent to complete loss of function of one allele.
  8. Genome Biol. 2023 May 08. 24(1): 108
      BACKGROUND: Genetic variation in regulatory sequences that alter transcription factor (TF) binding is a major cause of phenotypic diversity. Brassinosteroid is a growth hormone that has major effects on plant phenotypes. Genetic variation in brassinosteroid-responsive cis-elements likely contributes to trait variation. Pinpointing such regulatory variations and quantitative genomic analysis of the variation in TF-target binding, however, remains challenging. How variation in transcriptional targets of signaling pathways such as the brassinosteroid pathway contributes to phenotypic variation is an important question to be investigated with innovative approaches. RESULTS: Here, we use a hybrid allele-specific chromatin binding sequencing (HASCh-seq) approach and identify variations in target binding of the brassinosteroid-responsive TF ZmBZR1 in maize. HASCh-seq in the B73xMo17 F1s identifies thousands of target genes of ZmBZR1. Allele-specific ZmBZR1 binding (ASB) has been observed for 18.3% of target genes and is enriched in promoter and enhancer regions. About a quarter of the ASB sites correlate with sequence variation in BZR1-binding motifs and another quarter correlate with haplotype-specific DNA methylation, suggesting that both genetic and epigenetic variations contribute to the high level of variation in ZmBZR1 occupancy. Comparison with GWAS data shows linkage of hundreds of ASB loci to important yield and disease-related traits.
    CONCLUSION: Our study provides a robust method for analyzing genome-wide variations of TF occupancy and identifies genetic and epigenetic variations of the brassinosteroid response transcription network in maize.
    Keywords:  Allele-specific; Brassinosteroid; ChIP-seq; Functional variation; Regulatory network; Transcription factor
  9. Proc Natl Acad Sci U S A. 2023 May 16. 120(20): e2219699120
      Kidney organoids differentiated from pluripotent stem cells are powerful models of kidney development and disease but are characterized by cell immaturity and off-target cell fates. Comparing the cell-specific gene regulatory landscape during organoid differentiation with human adult kidney can serve to benchmark progress in differentiation at the epigenome and transcriptome level for individual organoid cell types. Using single-cell multiome and histone modification analysis, we report more broadly open chromatin in organoid cell types compared to the human adult kidney. We infer enhancer dynamics by cis-coaccessibility analysis and validate an enhancer driving transcription of HNF1B by CRISPR interference both in cultured proximal tubule cells and also during organoid differentiation. Our approach provides an experimental framework to judge the cell-specific maturation state of human kidney organoids and shows that kidney organoids can be used to validate individual gene regulatory networks that regulate differentiation.
    Keywords:  CRISPR interference; CUT&RUN; kidney organoid; scATAC-seq; scRNA-seq
  10. Proc Natl Acad Sci U S A. 2023 May 16. 120(20): e2218229120
      Castration-resistant prostate cancer (CRPC) poses a major clinical challenge, with the androgen receptor (AR) remaining a critical oncogenic player. Several lines of evidence indicate that AR induces a distinct transcriptional program after androgen deprivation in CRPCs. However, the mechanism triggering AR binding to a distinct set of genomic loci in CRPC and how it promotes CRPC development remain unclear. We demonstrate here that atypical ubiquitination of AR mediated by an E3 ubiquitin ligase TRAF4 plays an important role in this process. TRAF4 is highly expressed in CRPCs and promotes CRPC development. It mediates K27-linked ubiquitination at the C-terminal tail of AR and increases its association with the pioneer factor FOXA1. Consequently, AR binds to a distinct set of genomic loci enriched with FOXA1- and HOXB13-binding motifs to drive different transcriptional programs including an olfactory transduction pathway. Through the surprising upregulation of olfactory receptor gene transcription, TRAF4 increases intracellular cAMP levels and boosts E2F transcription factor activity to promote cell proliferation under androgen deprivation conditions. Altogether, these findings reveal a posttranslational mechanism driving AR-regulated transcriptional reprogramming to provide survival advantages for prostate cancer cells under castration conditions.
    Keywords:  CRPC; E2F; TRAF4; androgen receptor; ubiquitination
  11. BMC Genomics. 2023 May 11. 24(1): 253
      Cis-regulatory elements (CRE) are critical for coordinating gene expression programs that dictate cell-specific differentiation and homeostasis. Recently developed self-transcribing active regulatory region sequencing (STARR-Seq) has allowed for genome-wide annotation of functional CREs. Despite this, STARR-Seq assays are only employed in cell lines, in part, due to difficulties in delivering reporter constructs. Herein, we implemented and validated a STARR-Seq-based screen in human CD4+ T cells using a non-integrating lentiviral transduction system. Lenti-STARR-Seq is the first example of a genome-wide assay of CRE function in human primary cells, identifying thousands of functional enhancers and negative regulatory elements (NREs) in human CD4+ T cells. We find an unexpected difference in nucleosome organization between enhancers and NRE: enhancers are located between nucleosomes, whereas NRE are occupied by nucleosomes in their endogenous locations. We also describe chromatin modification, eRNA production, and transcription factor binding at both enhancers and NREs. Our findings support the idea of silencer repurposing as enhancers in alternate cell types. Collectively, these data suggest that Lenti-STARR-Seq is a successful approach for CRE screening in primary human cell types, and provides an atlas of functional CREs in human CD4+ T cells.
    Keywords:  CD4; Cis-Regulatory Elements; Enhancers; Negative Regulatory Elements; STARR-Seq; Silencers
  12. Genes Dev. 2023 Apr 25.
      The RNA polymerase II core promoter is the site of convergence of the signals that lead to the initiation of transcription. Here, we performed a comparative analysis of the downstream core promoter region (DPR) in Drosophila and humans by using machine learning. These studies revealed a distinct human-specific version of the DPR and led to the use of machine learning models for the identification of synthetic extreme DPR motifs with specificity for human transcription factors relative to Drosophila factors and vice versa. More generally, machine learning models could similarly be used to design synthetic DNA elements with customized functional properties.
    Keywords:  Drosophila; RNA polymerase II; core promoter; gene expression; transcription
  13. Proc Natl Acad Sci U S A. 2023 May 16. 120(20): e2210991120
      In 2021, the World Health Organization reclassified glioblastoma, the most common form of adult brain cancer, into isocitrate dehydrogenase (IDH)-wild-type glioblastomas and grade IV IDH mutant (G4 IDHm) astrocytomas. For both tumor types, intratumoral heterogeneity is a key contributor to therapeutic failure. To better define this heterogeneity, genome-wide chromatin accessibility and transcription profiles of clinical samples of glioblastomas and G4 IDHm astrocytomas were analyzed at single-cell resolution. These profiles afforded resolution of intratumoral genetic heterogeneity, including delineation of cell-to-cell variations in distinct cell states, focal gene amplifications, as well as extrachromosomal circular DNAs. Despite differences in IDH mutation status and significant intratumoral heterogeneity, the profiled tumor cells shared a common chromatin structure defined by open regions enriched for nuclear factor 1 transcription factors (NFIA and NFIB). Silencing of NFIA or NFIB suppressed in vitro and in vivo growths of patient-derived glioblastomas and G4 IDHm astrocytoma models. These findings suggest that despite distinct genotypes and cell states, glioblastoma/G4 astrocytoma cells share dependency on core transcriptional programs, yielding an attractive platform for addressing therapeutic challenges associated with intratumoral heterogeneity.
    Keywords:  amplicons; extrachromosomal DNA; glioblastoma; single cell
  14. Nucleic Acids Res. 2023 May 11. pii: gkad375. [Epub ahead of print]
      MiniPromoters, or compact promoters, are short DNA sequences that can drive expression in specific cells and tissues. While broadly useful, they are of high relevance to gene therapy due to their role in enabling precise control of where a therapeutic gene will be expressed. Here, we present OnTarget (, a webserver that streamlines the MiniPromoter design process. Users only need to specify a gene of interest or custom genomic coordinates on which to focus the identification of promoters and enhancers, and can also provide relevant cell-type-specific genomic evidence (e.g. accessible chromatin regions, histone modifications, etc.). OnTarget combines the provided data with internal data to identify candidate promoters and enhancers and design MiniPromoters. To illustrate the utility of OnTarget, we designed and characterized two MiniPromoters targeting different cell populations relevant to Parkinson Disease.
  15. Genome Res. 2023 May 08. pii: gr.277581.122. [Epub ahead of print]
      The mammalian suprachiasmatic nucleus (SCN), located in the ventral hypothalamus, synchronises and maintains daily cellular and physiological rhythms across the body, in accordance with environmental and visceral cues. Consequently, the systematic regulation of spatiotemporal gene transcription in the SCN is vital for daily timekeeping. So far, the regulatory elements assisting circadian gene transcription have only been studied in peripheral tissues, lacking the critical neuronal dimension intrinsic to the role of the SCN as central brain pacemaker. By using histone-ChIP-seq, we identified SCN-enriched gene regulatory elements that associated with temporal gene expression. Based on tissue-specific H3K27ac and H3K4me3 marks we successfully produced the first-ever SCN gene-regulatory map. We found that a large majority of SCN enhancers not only exhibit robust 24-hour rhythmic modulation in H3K27ac occupancy, peaking at distinct times-of-day, but also possess canonical E-box (CACGTG) motifs potentially influencing downstream cycling gene expression. To establish enhancer-gene relationships in the SCN, we conducted directional RNA-seq at six distinct times across day and night and studied the association between dynamically changing histone acetylation and gene transcript levels. About 35% of the cycling H3K27ac sites were found adjacent to rhythmic gene transcripts, often preceding the rise in mRNA levels. We also noted that enhancers encompass noncoding actively transcribing enhancer RNAs (eRNAs) in the SCN, which in turn oscillate, along with cyclic histone acetylation, and correlate with rhythmic gene transcription. Taken together, these findings shed light on genome-wide pretranscriptional regulation operative in the central clock that confers its precise and robust oscillation necessary to orchestrate daily timekeeping in mammals.
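      As a rough illustration of the motif observation in this abstract, the canonical E-box (CACGTG) can be located in an enhancer sequence with a simple string scan. This is a minimal sketch and not the authors' pipeline: the function name `find_eboxes` and the example sequence are hypothetical, and the input is assumed to be a plain DNA string. Because CACGTG is its own reverse complement, scanning one strand is enough.

```python
import re

# Canonical E-box named in the abstract. It is palindromic
# (reverse complement of CACGTG is CACGTG), so one strand suffices.
EBOX = "CACGTG"


def find_eboxes(seq: str) -> list[int]:
    """Return 0-based start positions of every CACGTG in seq (hypothetical helper)."""
    # Upper-case first so soft-masked (lower-case) sequence is also matched.
    return [m.start() for m in re.finditer(EBOX, seq.upper())]


if __name__ == "__main__":
    # Made-up sequence for illustration only.
    print(find_eboxes("ttCACGTGaaacacgtg"))
```

      A real analysis would more likely score position weight matrices against peak sequences; exact matching is used here only to keep the sketch short.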
  16. Nucleic Acids Res. 2023 May 09. pii: gkad354. [Epub ahead of print]
      In the current update, we added a feature for analysing changes in spatial distances between promoters and enhancers in chromatin 3D model ensembles. We updated our datasets by the novel in situ CTCF and RNAPII ChIA-PET chromatin loops obtained from the GM12878 cell line mapped to the GRCh38 genome assembly and extended the 1000 Genomes SVs dataset. To handle the new datasets, we applied GPU acceleration for the modelling engine, which gives a speed-up of 30× versus the previous versions. To improve visualisation and data analysis, we embedded the IGV tool for viewing ChIA-PET arcs with additional genes and SVs annotations. For 3D model visualisation, we added a new viewer: NGL, where we provided colouring by gene and enhancer location. The models are downloadable in mmcif and xyz format. The web server is hosted and performs calculations on DGX A100 GPU servers that provide optimal performance with multitasking. 3D-GNOME 3.0 web server provides unique insights into the topological mechanism of human variations at the population scale with high speed-up and is freely available at
  17. Nucleic Acids Res. 2023 May 09. pii: gkad361. [Epub ahead of print]
      Somatic stem cells contribute to normal tissue homeostasis, and their epigenomic features play an important role in regulating tissue identities or developing disease states. Enhancers are one of the key players controlling chromatin context-specific gene expression in a spatial and temporal manner while maintaining tissue homeostasis, and their dysregulation leads to tumorigenesis. Here, epigenomic and transcriptomic analyses reveal that forkhead box protein D2 (FOXD2) is a hub for the gene regulatory network exclusive to large intestinal stem cells, and its overexpression plays a significant role in colon cancer regression. FOXD2 is positioned at the closed chromatin and facilitates mixed-lineage leukemia protein-4 (MLL4/KMT2D) binding to deposit H3K4 monomethylation. De novo FOXD2-mediated chromatin interactions rewire the regulation of p53-responsive genes and induction of apoptosis. Taken together, our findings illustrate the novel mechanistic details of FOXD2 in suppressing colorectal cancer growth and suggest its function as a chromatin-tuning factor and a potential therapeutic target for colorectal cancer.
  18. Nucleic Acids Res. 2023 May 09. pii: gkad342. [Epub ahead of print]
      Iron metabolism is closely associated with the pathogenesis of obesity. However, the mechanism of the iron-dependent regulation of adipocyte differentiation remains unclear. Here, we show that iron is essential for rewriting of epigenetic marks during adipocyte differentiation. Iron supply through lysosome-mediated ferritinophagy was found to be crucial during the early stage of adipocyte differentiation, and iron deficiency during this period suppressed subsequent terminal differentiation. This was associated with demethylation of both repressive histone marks and DNA in the genomic regions of adipocyte differentiation-associated genes, including Pparg, which encodes PPARγ, the master regulator of adipocyte differentiation. In addition, we identified several epigenetic demethylases to be responsible for iron-dependent adipocyte differentiation, with the histone demethylase jumonji domain-containing 1A and the DNA demethylase ten-eleven translocation 2 as the major enzymes. The interrelationship between repressive histone marks and DNA methylation was indicated by an integrated genome-wide association analysis, and was also supported by the findings that both histone and DNA demethylation were suppressed by either the inhibition of lysosomal ferritin flux or the knockdown of iron chaperone poly(rC)-binding protein 2. In summary, epigenetic regulations through iron-dependent control of epigenetic enzyme activities play an important role in the organized gene expression mechanisms of adipogenesis.
  19. J Cell Sci. 2023 May 09. pii: jcs.261014. [Epub ahead of print]
      Sister chromatid cohesion is a multi-step process implemented throughout the cell cycle to ensure the correct transmission of chromosomes to daughter cells. While cohesion establishment and mitotic cohesion dissolution have been extensively explored, the regulation of cohesin loading is still poorly understood. Here, we report that the methyltransferase NSD3 is essential for mitotic sister chromatid cohesion before mitosis entry. NSD3 interacts with the cohesin loader complex kollerin (NIPBL/MAU2) and promotes the chromatin recruitment of MAU2 and cohesin at mitotic exit. We also show that NSD3 associates with chromatin in early anaphase, prior to the recruitment of MAU2 and RAD21, and dissociates from chromatin when prophase begins. Among the two NSD3 isoforms present in somatic cells, the long isoform is responsible for regulating kollerin and cohesin chromatin-loading, and its methyltransferase activity is required for efficient sister chromatid cohesion. Based on these observations, we propose that NSD3-dependent methylation contributes to sister chromatid cohesion by ensuring proper kollerin recruitment and thus cohesin loading.
    Keywords:  Cohesin; MAU2; Methylation; Mitosis; NIPBL; NSD3
  20. Genes Dev. 2023 May 10.
      A wide range of sequencing methods has been developed to assess nascent RNA transcription and resolve the single-nucleotide position of RNA polymerase genome-wide. These techniques are often burdened with high input material requirements and lengthy protocols. We leveraged the template-switching properties of thermostable group II intron reverse transcriptase (TGIRT) and developed Butt-seq (bulk analysis of nascent transcript termini sequencing), which can produce libraries from purified nascent RNA in 6 h and from as few as 10,000 cells, an improvement of at least 10-fold over existing techniques. Butt-seq shows that inhibition of the superelongation complex (SEC) causes promoter-proximal pausing to move upstream in a fashion correlated with subnucleosomal fragments. To address transcriptional regulation in a tissue, Butt-seq was used to measure the circadian regulation of transcription from fly heads. All the results indicate that Butt-seq is a simple and powerful technique to analyze transcription at a high level of resolution.
    Keywords:  RNA polymerase II pausing; circadian rhythms; nascent RNA; superelongation complex; transcriptional profiling; transcriptional regulation
  21. Nucleic Acids Res. 2023 May 11. pii: gkad393. [Epub ahead of print]
      Gene and protein set enrichment analysis is a critical step in the analysis of data collected from omics experiments. Enrichr is a popular gene set enrichment analysis web-server search engine that contains hundreds of thousands of annotated gene sets. While Enrichr has been useful in providing enrichment analysis with many gene set libraries from different categories, integrating enrichment results across libraries and domains of knowledge can further hypothesis generation. To this end, Enrichr-KG is a knowledge graph database and a web-server application that combines selected gene set libraries from Enrichr for integrative enrichment analysis and visualization. The enrichment results are presented as subgraphs made of nodes and links that connect genes to their enriched terms. In addition, users of Enrichr-KG can add gene-gene links, as well as predicted genes to the subgraphs. This graphical representation of cross-library results with enriched and predicted genes can illuminate hidden associations between genes and annotated enriched terms from across datasets and resources. Enrichr-KG currently serves 26 gene set libraries from different categories that include transcription, pathways, ontologies, diseases/drugs, and cell types. To demonstrate the utility of Enrichr-KG we provide several case studies. Enrichr-KG is freely available at: