bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2021‒12‒05
27 papers selected by
Connor Rogerson
University of Cambridge, MRC Cancer Unit

  1. Nat Commun. 2021 Dec 01. 12(1): 7002
      During embryogenesis, the genome shifts from transcriptionally quiescent to extensively active in a process known as Zygotic Genome Activation (ZGA). In Drosophila, the pioneer factor Zelda is known to be essential for the progression of development; still, it regulates the activation of only a small subset of genes at ZGA. However, thousands of genes do not require Zelda, suggesting that other mechanisms exist. By conducting GRO-seq, HiC and ChIP-seq in Drosophila embryos, we demonstrate that up to 65% of zygotically activated genes are enriched for the histone variant H2A.Z. H2A.Z enrichment precedes ZGA and RNA Polymerase II loading onto chromatin. In vivo knockdown of maternally contributed Domino, a histone chaperone and ATPase, reduces H2A.Z deposition at transcription start sites, causes global downregulation of housekeeping genes at ZGA, and compromises the establishment of the 3D chromatin structure. We infer that H2A.Z is essential for the de novo establishment of transcriptional programs during ZGA via chromatin reorganization.
  2. Nucleic Acids Res. 2021 Nov 24. pii: gkab1100. [Epub ahead of print]
      Embryonic stem cells (ESCs) can differentiate into any given cell type and therefore represent a versatile model to study the link between gene regulation and differentiation. To quantitatively assess the dynamics of enhancer activity during the early stages of murine ESC differentiation, we analyzed accessible genomic regions using STARR-seq, a massively parallel reporter assay. This resulted in a genome-wide quantitative map of active mESC enhancers, in pluripotency and during the early stages of differentiation. We find that only a minority of accessible regions is active and that such regions are enriched near promoters, characterized by specific chromatin marks, enriched for distinct sequence motifs, and modeling shows that active regions can be predicted from sequence alone. Regions that change their activity upon retinoic acid-induced differentiation are more prevalent at distal intergenic regions when compared to constitutively active enhancers. Further, analysis of differentially active enhancers verified the contribution of individual TF motifs toward activity and inducibility as well as their role in regulating endogenous genes. Notably, the activity of retinoic acid receptor alpha (RARα) occupied regions can either increase or decrease upon the addition of its ligand, retinoic acid, with the direction of the change correlating with spacing and orientation of the RARα consensus motif and the co-occurrence of additional sequence motifs. Together, our genome-wide enhancer activity map elucidates features associated with enhancer activity levels, identifies regulatory regions disregarded by computational prediction tools, and provides a resource for future studies into regulatory elements in mESCs.
  3. Nat Commun. 2021 Nov 30. 12(1): 6985
      Polycomb Repressive Complex 2 (PRC2) is crucial for the coordinated expression of genes during early embryonic development, catalyzing histone H3 lysine 27 trimethylation. Two distinct PRC2 complexes, PRC2.1 and PRC2.2, contain respectively MTF2 and JARID2 in embryonic stem cells (ESCs). In this study, we explored their roles in lineage specification and commitment, using single-cell transcriptomics and mouse embryoid bodies derived from Mtf2 and Jarid2 null ESCs. We observe that the loss of Mtf2 results in enhanced and faster differentiation towards cell fates from all germ layers, while the Jarid2 null cells are predominantly directed towards early differentiating precursors, with reduced efficiency towards mesendodermal lineages. These effects are caused by derepression of developmental regulators that are poised for activation in pluripotent cells and gain H3K4me3 at their promoters in the absence of PRC2 repression. Upon lineage commitment, the differentiation trajectories are relatively similar to those of wild-type cells. Together, our results uncover a major role for MTF2-containing PRC2.1 in balancing poised lineage-specific gene activation, whereas the contribution of JARID2-containing PRC2 is more selective in nature compared to MTF2. These data explain how PRC2 imposes thresholds for lineage choice during the exit of pluripotency.
  4. Cell. 2021 Nov 25. pii: S0092-8674(21)01327-1. [Epub ahead of print]
      Chromosome loops shift dynamically during development, homeostasis, and disease. CCCTC-binding factor (CTCF) is known to anchor loops and construct 3D genomes, but how anchor sites are selected is not yet understood. Here, we unveil Jpx RNA as a determinant of anchor selectivity. Jpx RNA targets thousands of genomic sites, preferentially binding promoters of active genes. Depleting Jpx RNA causes ectopic CTCF binding, massive shifts in chromosome looping, and downregulation of >700 Jpx target genes. Without Jpx, thousands of lost loops are replaced by de novo loops anchored by ectopic CTCF sites. Although Jpx controls CTCF binding on a genome-wide basis, it acts selectively at the subset of developmentally sensitive CTCF sites. Specifically, Jpx targets low-affinity CTCF motifs and displaces CTCF protein through competitive inhibition. We conclude that Jpx acts as a CTCF release factor and shapes the 3D genome by regulating anchor site usage.
    Keywords:  3D genome; CTCF; CTCF release factor; CTCF site selection; Jpx RNA; chromatin loop; chromosome conformation; gene activation; loop anchors; noncoding RNA
  5. Nucleic Acids Res. 2021 Nov 24. pii: gkab1105. [Epub ahead of print]
      The term 'super enhancers' (SE) has been widely used to describe stretches of closely localized enhancers that are occupied collectively by large numbers of transcription factors (TFs) and co-factors, and control the transcription of highly-expressed genes. Through integrated analysis of >600 DNase-seq, ChIP-seq, GRO-seq, STARR-seq, RNA-seq, Hi-C and ChIA-PET data in five human cancer cell lines, we identified a new class of autonomous SEs (aSEs) that are excluded from classic SE calls by the widely used Rank Ordering of Super-Enhancers (ROSE) method. TF footprint analysis revealed that compared to classic SEs and regular enhancers, aSEs are tightly bound by a dense array of master lineage TFs, which serve as anchors to recruit additional TFs and co-factors in trans. In addition, aSEs are preferentially enriched for Cohesins, which are likely involved in stabilizing long-distance interactions between aSEs and their distal target genes. Finally, we showed that aSEs can be reliably predicted using a single DNase-seq dataset, alone or combined with Mediator and/or P300 ChIP-seq data. Overall, our study demonstrates that aSEs represent a unique class of functionally important enhancer elements that distally regulate the transcription of highly expressed genes.
  6. Nucleic Acids Res. 2021 Nov 25. pii: gkab1093. [Epub ahead of print]
      DNA lesions impact on local transcription and the damage-induced transcriptional repression facilitates efficient DNA repair. However, how chromatin dynamics cooperates with these two events remained largely unknown. We here show that histone H2A acetylation at K118 is enriched in transcriptionally active regions. Under DNA damage, the RSF1 chromatin remodeling factor recruits HDAC1 to DSB sites. The RSF1-HDAC1 complex induces the deacetylation of H2A(X)-K118 and its deacetylation is indispensable for the ubiquitination of histone H2A at K119. Accordingly, the acetylation mimetic H2A-K118Q suppressed the H2A-K119ub level, perturbing the transcriptional repression at DNA lesions. Intriguingly, deacetylation of H2AX at K118 also licenses the propagation of γH2AX and recruitment of MDC1. Consequently, the H2AX-K118Q limits DNA repair. Together, the RSF1-HDAC1 complex controls the traffic of the DNA damage response and transcription simultaneously in transcriptionally active chromatins. The interplay between chromatin remodelers and histone modifiers highlights the importance of chromatin versatility in the maintenance of genome integrity.
  7. Cell Rep. 2021 Nov 30. pii: S2211-1247(21)01552-7. [Epub ahead of print]37(9): 110066
      The role of chromatin-associated RNAi components in the nucleus of mammalian cells and in particular in the context of developmental programs remains to be elucidated. Here, we investigate the function of nuclear Argonaute 1 (Ago1) in gene expression regulation during skeletal muscle differentiation. We show that Ago1 is required for activation of the myogenic program by supporting chromatin modification mediated by developmental enhancer activation. Mechanistically, we demonstrate that Ago1 directly controls global H3K27 acetylation (H3K27ac) by regulating enhancer RNA (eRNA)-CREB-binding protein (CBP) acetyltransferase interaction, a key step in enhancer-driven gene activation. In particular, we show that Ago1 is specifically required for myogenic differentiation 1 (MyoD) and downstream myogenic gene activation, whereas its depletion leads to failure of CBP acetyltransferase activation and blocking of the myogenic program. Our work establishes a role of the mammalian enhancer-associated RNAi component Ago1 in epigenome regulation and activation of developmental programs.
    Keywords:  CBP acetyltransferase; H3K27 acetylation; H3K27ac; MyoD expression; eRNAs; enhancer RNAs; myogenic differentiation; nuclear Ago1
  8. Nat Commun. 2021 Dec 02. 12(1): 7045
      Enhancer activation is essential for cell-type specific gene expression during cellular differentiation; however, how enhancers transition from a hypoacetylated "primed" state to a hyperacetylated-active state is incompletely understood. Here, we show that SET domain-containing 5 (SETD5) forms a complex with the NCoR-HDAC3 co-repressor that prevents histone acetylation of enhancers for two master adipogenic regulatory genes, Cebpa and Pparg, early during adipogenesis. The loss of SETD5 from the complex is followed by enhancer hyperacetylation. SETD5 protein levels were transiently increased and rapidly degraded prior to enhancer activation, providing a mechanism for the loss of SETD5 during the transition. We show that induction of the CDC20 co-activator of the ubiquitin ligase leads to APC/C mediated degradation of SETD5 during the transition and this operates as a molecular switch that facilitates adipogenesis.
  9. Elife. 2021 Dec 01. pii: e72676. [Epub ahead of print]10
      Flowering plants utilize small RNA molecules to guide DNA methyltransferases to genomic sequences. This RNA-directed DNA methylation (RdDM) pathway preferentially targets euchromatic transposable elements. However, RdDM is thought to be recruited by methylation of histone H3 at lysine 9 (H3K9me), a hallmark of heterochromatin. How RdDM is targeted to euchromatin despite an affinity for H3K9me is unclear. Here we show that loss of histone H1 enhances heterochromatic RdDM, preferentially at nucleosome linker DNA. Surprisingly, this does not require SHH1, the RdDM component that binds H3K9me. Furthermore, H3K9me is dispensable for RdDM, as is CG DNA methylation. Instead, we find that non-CG methylation is specifically associated with small RNA biogenesis, and without H1 small RNA production quantitatively expands to non-CG methylated loci. Our results demonstrate that H1 enforces the separation of euchromatic and heterochromatic DNA methylation pathways by excluding the small RNA-generating branch of RdDM from non-CG methylated heterochromatin.
    Keywords:  A. thaliana; chromosomes; gene expression; plant biology
  10. Mol Cell Oncol. 2021 ;8(5): 1984827
      We reported that the histone H3 lysine (K) 4 methyltransferase KMT2D, which was identified via an in vivo epigenome-focused RNA interference (RNAi) screen, serves as a potent tumor-suppressor in melanoma. KMT2D-deficient tumors show substantial reprogramming of key metabolic pathways including glycolysis via reduction of H3K4me1 (Histone H3K4 mono-methylation)-marked active enhancers, conferring sensitivity to inhibitors of glycolysis and the IGFR (Insulin Growth Factor Receptor) pathway.
    Keywords:  KMT2D; enhancer reprogramming; glycolysis; melanoma
  11. PLoS Genet. 2021 Dec;17(12): e1009250
      Epigenetic mechanisms are gatekeepers for the gene expression patterns that establish and maintain cellular identity in mammalian development, stem cells and adult homeostasis. Amongst many epigenetic marks, methylation of histone 3 lysine 4 (H3K4) is one of the most widely conserved and occupies a central position in gene expression. Mixed lineage leukemia 1 (MLL1/KMT2A) is the founding mammalian H3K4 methyltransferase. It was discovered as the causative mutation in early onset leukemia and subsequently found to be required for the establishment of definitive hematopoiesis and the maintenance of adult hematopoietic stem cells. Despite wide expression, the roles of MLL1 in non-hematopoietic tissues remain largely unexplored. To bypass hematopoietic lethality, we used bone marrow transplantation and conditional mutagenesis to discover that the most overt phenotype in adult Mll1-mutant mice is intestinal failure. MLL1 is expressed in intestinal stem cells (ISCs) and transit amplifying (TA) cells but not in the villus. Loss of MLL1 is accompanied by loss of ISCs and a differentiation bias towards the secretory lineage with increased numbers and enlargement of goblet cells. Expression profiling of sorted ISCs revealed that MLL1 is required to promote expression of several definitive intestinal transcription factors including Pitx1, Pitx2, Foxa1, Gata4, Zfp503 and Onecut2, as well as the H3K27me3 binder, Bahcc1. These results were recapitulated using conditional mutagenesis in intestinal organoids. The stem cell niche in the crypt includes ISCs in close association with Paneth cells. Loss of MLL1 from ISCs promoted transcriptional changes in Paneth cells involving metabolic and stress responses. Here we add ISCs to the MLL1 repertoire and observe that all known functions of MLL1 relate to the properties of somatic stem cells, thereby highlighting the suggestion that MLL1 is a master somatic stem cell regulator.
  12. Nat Commun. 2021 Dec 02. 12(1): 7033
      Comprehensive genomic studies have delineated key driver mutations linked to disease progression for most cancers. However, corresponding transcriptional changes remain largely elusive because of the bias associated with cross-study analysis. Here, we overcome these hurdles and generate a comprehensive prostate cancer transcriptome atlas that describes the roadmap to tumor progression in a qualitative and quantitative manner. Most cancers follow a uniform trajectory characterized by upregulation of polycomb-repressive-complex-2, G2-M checkpoints, and M2 macrophage polarization. Using patient-derived xenograft models, we functionally validate our observations and add single-cell resolution. Thereby, we show that tumor progression occurs through transcriptional adaption rather than a selection of pre-existing cancer cell clusters. Moreover, we determine at the single-cell level how inhibition of EZH2 - the top upregulated gene along the trajectory - reverts tumor progression and macrophage polarization. Finally, a user-friendly web-resource is provided enabling the investigation of dynamic transcriptional perturbations linked to disease progression.
  13. Genetics. 2021 Aug 26. pii: iyab108. [Epub ahead of print]219(1):
      Drosophila Heterochromatin Protein 1a (HP1a) is essential for heterochromatin formation and is involved in transcriptional silencing. However, certain loci require HP1a to be transcribed. One model posits that HP1a acts as a transcriptional silencer within euchromatin while acting as an activator within heterochromatin. However, HP1a has been observed as an activator of a set of euchromatic genes. Therefore, it is not clear whether, or how, chromatin context informs the function of HP1 proteins. To understand the role of HP1 proteins in transcription, we examined the genome-wide binding profile of HP1a as well as two other Drosophila HP1 family members, HP1B and HP1C, to determine whether coordinated binding of these proteins is associated with specific transcriptional outcomes. We found that HP1 proteins share many of their endogenous binding targets. These genes are marked by active histone modifications and are expressed at higher levels than nontarget genes in both heterochromatin and euchromatin. In addition, HP1 binding targets displayed increased RNA polymerase pausing compared with nontarget genes. Specifically, colocalization of HP1B and HP1C was associated with the highest levels of polymerase pausing and gene expression. Analysis of HP1 null mutants suggests these proteins coordinate activity at transcription start sites to regulate transcription. Depletion of HP1B or HP1C alters expression of protein-coding genes bound by HP1 family members. Our data broaden understanding of the mechanism of transcriptional activation by HP1a and highlight the need to consider particular protein-protein interactions, rather than broader chromatin context, to predict impacts of HP1 at transcription start sites.
    Keywords:  Heterochromatin Protein 1; chromatin; promoter proximal pausing; transcription
  14. Cell Cycle. 2021 Nov 29. 1-10
      In fission yeast, MBF-dependent transcription is required for cells to complete S phase. The MBF transcription factor is regulated through a complex feedback mechanism that involves the co-repressors Yox1 and Nrm1 that are loaded onto MBF at the end of S phase, while positive transactivation is achieved through the constitutive binding of the co-activator Rep2. Here we show that Rep2 is required to fully recruit the chromatin remodelers SWI/SNF and RSC to MBF-regulated promoters. On the contrary, Nrm1 and Yox1, when bound to the MBF complex, block the approximation of these chromatin remodelers to MBF-regulated promoters. We propose that SWI/SNF and RSC are recruited to MBF-regulated genes, and RSC together with SAGA complex are important to regulate the G1-to-S transcriptional wave. Mutants of these remodeler complexes are highly sensitive when cells are exposed to insults that challenge DNA synthesis.
    Keywords:  MBF; RSC complex; start
  15. Genome Biol. 2021 Nov 29. 22(1): 323
      We present recount3, a resource consisting of over 750,000 publicly available human and mouse RNA sequencing (RNA-seq) samples uniformly processed by our new Monorail analysis pipeline. To facilitate access to the data, we provide the recount3 and snapcount R/Bioconductor packages as well as complementary web resources. Using these tools, data can be downloaded as study-level summaries or queried for specific exon-exon junctions, genes, samples, or other features. Monorail can be used to process local and/or private data, allowing results to be directly compared to any study in recount3. Taken together, our tools help biologists maximize the utility of publicly available RNA-seq data, especially to improve their understanding of newly collected data. recount3 is available from .
  16. J Immunol. 2021 Dec 03. pii: ji2100923. [Epub ahead of print]
      Somatic hypermutation (SHM) drives the genetic diversity of Ig genes in activated B cells and supports the generation of Abs with increased affinity for Ag. SHM is targeted to Ig genes by their enhancers (diversification activators [DIVACs]), but how the enhancers mediate this activity is unknown. We show using chicken DT40 B cells that highly active DIVACs increase the phosphorylation of RNA polymerase II (Pol II) and Pol II occupancy in the mutating gene with little or no accompanying increase in elongation-competent Pol II or production of full-length transcripts, indicating accumulation of stalled Pol II. DIVAC has a similar effect in human Ramos Burkitt lymphoma cells. The DIVAC-induced stalling is weakly associated with an increase in the detection of ssDNA bubbles in the mutating target gene. We did not find evidence for antisense transcription, or that DIVAC functions by altering levels of H3K27ac or the histone variant H3.3 in the mutating gene. These findings argue for a connection between Pol II stalling and cis-acting targeting elements in the context of SHM and thus define a mechanistic basis for locus-specific targeting of SHM in the genome. Our results suggest that DIVAC elements render the target gene a suitable platform for AID-mediated mutation without a requirement for increasing transcriptional output.
  17. Nat Commun. 2021 Dec 01. 12(1): 7000
      At initiation of X chromosome inactivation (XCI), Xist is monoallelically upregulated from the future inactive X (Xi) chromosome, overcoming repression by its antisense transcript Tsix. Xist recruits various chromatin remodelers, amongst them SPEN, which are involved in silencing of X-linked genes in cis and establishment of the Xi. Here, we show that SPEN plays an important role in initiation of XCI. Spen null female mouse embryonic stem cells (ESCs) are defective in Xist upregulation upon differentiation. We find that Xist-mediated SPEN recruitment to the Xi chromosome happens very early in XCI, and that SPEN-mediated silencing of the Tsix promoter is required for Xist upregulation. Accordingly, failed Xist upregulation in Spen-/- ESCs can be rescued by concomitant removal of Tsix. These findings indicate that SPEN is not only required for the establishment of the Xi, but is also crucial in initiation of the XCI process.
  18. STAR Protoc. 2021 Dec 17. 2(4): 100972
      Single-cell multi-omics sequencing technology can infer cell heterogeneity and reveal relationships across molecular layers. Combining single-cell RNA sequencing, DNA methylation, and chromatin accessibility allows a multimodal understanding of cell function and epigenetic regulation within individual cells. Here, we offer a protocol to perform scChaRM-seq (single-cell chromatin accessibility, RNA barcoding, and DNA methylation sequencing), which has been applied to study de novo DNA methylation and its relationship with transcription and chromatin accessibility in single human oocytes. For complete details on the use and execution of this protocol, please refer to Yan et al. (2021).
    Keywords:  Gene Expression; Genomics; Molecular Biology; RNAseq; Sequencing; Single Cell
  19. NAR Genom Bioinform. 2021 Dec;3(4): lqab101
      As chromatin accessibility data from ATAC-seq experiments continues to expand, there is continuing need for standardized analysis pipelines. Here, we present PEPATAC, an ATAC-seq pipeline that is easily applied to ATAC-seq projects of any size, from one-off experiments to large-scale sequencing projects. PEPATAC leverages unique features of ATAC-seq data to optimize for speed and accuracy, and it provides several unique analytical approaches. Output includes convenient quality control plots, summary statistics, and a variety of generally useful data formats to set the groundwork for subsequent project-specific data analysis. Downstream analysis is simplified by a standard definition format, modularity of components, and metadata APIs in R and Python. It is restartable, fault-tolerant, and can be run on local hardware, using any cluster resource manager, or in provided Linux containers. We also demonstrate the advantage of aligning to the mitochondrial genome serially, which improves the accuracy of alignment statistics and quality control metrics. PEPATAC is a robust and portable first step for any ATAC-seq project. BSD2-licensed code and documentation are available at
  20. Nucleic Acids Res. 2021 Nov 30. pii: gkab1113. [Epub ahead of print]
      JASPAR is an open-access database containing manually curated, non-redundant transcription factor (TF) binding profiles for TFs across six taxonomic groups. In this 9th release, we expanded the CORE collection with 341 new profiles (148 for plants, 101 for vertebrates, 85 for urochordates, and 7 for insects), which corresponds to a 19% expansion over the previous release. We added 298 new profiles to the Unvalidated collection when no orthogonal evidence was found in the literature. All the profiles were clustered to provide familial binding profiles for each taxonomic group. Moreover, we revised the structural classification of DNA binding domains to consider plant-specific TFs. This release introduces word clouds to represent the scientific knowledge associated with each TF. We updated the genome tracks of TFBSs predicted with JASPAR profiles in eight organisms; the human and mouse TFBS predictions can be visualized as native tracks in the UCSC Genome Browser. Finally, we provide a new tool to perform JASPAR TFBS enrichment analysis in user-provided genomic regions. All the data is accessible through the JASPAR website, its associated RESTful API, the R/Bioconductor data package, and a new Python package, pyJASPAR, that facilitates serverless access to the data.
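      As a rough illustration of how a TF binding profile of the kind described above can be used to predict binding sites, the sketch below converts a position frequency matrix into log-odds weights and scans a sequence for the best-scoring window. The matrix, the pseudocount of 0.8, the uniform background of 0.25, and the function names pfm_to_pwm and best_hit are all illustrative assumptions; this is not JASPAR data and does not use the JASPAR API or pyJASPAR.

```python
import math

BASES = "ACGT"

def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Convert a position frequency matrix (counts) to log2-odds weights.

    pfm maps each base to a list of counts, one per motif position.
    The pseudocount and uniform background are illustrative choices.
    """
    length = len(pfm["A"])
    # Number of aligned sites, taken from the first column.
    n_sites = sum(pfm[base][0] for base in BASES)
    pwm = {}
    for base in BASES:
        pwm[base] = [
            math.log2(
                (pfm[base][i] + pseudocount * background)
                / (n_sites + pseudocount)
                / background
            )
            for i in range(length)
        ]
    return pwm

def best_hit(pwm, seq):
    """Return (start, score) of the highest-scoring window in seq.

    seq must be upper-case and contain only A, C, G and T.
    Returns None if seq is shorter than the motif.
    """
    length = len(pwm["A"])
    best = None
    for start in range(len(seq) - length + 1):
        window = seq[start:start + length]
        score = sum(pwm[base][i] for i, base in enumerate(window))
        if best is None or score > best[1]:
            best = (start, score)
    return best

# Hypothetical 4-bp profile built from 8 sites that all read ACGT.
toy_pfm = {
    "A": [8, 0, 0, 0],
    "C": [0, 8, 0, 0],
    "G": [0, 0, 8, 0],
    "T": [0, 0, 0, 8],
}
toy_pwm = pfm_to_pwm(toy_pfm)
hit = best_hit(toy_pwm, "TTACGTTT")
```

      Real scans would also score the reverse strand and apply a score threshold; both are omitted here to keep the sketch short.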
  21. Mol Cell. 2021 Nov 18. pii: S1097-2765(21)00952-7. [Epub ahead of print]
      The MYCN oncoprotein drives the development of numerous neuroendocrine and pediatric tumors. Here we show that MYCN interacts with the nuclear RNA exosome, a 3'-5' exoribonuclease complex, and recruits the exosome to its target genes. In the absence of the exosome, MYCN-directed elongation by RNA polymerase II (RNAPII) is slow and non-productive on a large group of cell-cycle-regulated genes. During the S phase of MYCN-driven tumor cells, the exosome is required to prevent the accumulation of stalled replication forks and of double-strand breaks close to the transcription start sites. Upon depletion of the exosome, activation of ATM causes recruitment of BRCA1, which stabilizes nuclear mRNA decapping complexes, leading to MYCN-dependent transcription termination. Disruption of mRNA decapping in turn activates ATR, indicating transcription-replication conflicts. We propose that exosome recruitment by MYCN maintains productive transcription elongation during S phase and prevents transcription-replication conflicts to maintain the rapid proliferation of neuroendocrine tumor cells.
    Keywords:  ATM; ATR; DCP1A; MYC; MYCN; Neuroblastoma; RNA Exosome; TFIIS; transcription-replication conflict.
  22. Nucleic Acids Res. 2021 Nov 25. pii: gkab1128. [Epub ahead of print]
      Molecular interactions are key drivers of biological function. Providing interaction resources to the research community is important since they allow functional interpretation and network-based analysis of molecular data. ConsensusPathDB is a meta-database combining interactions of diverse types from 31 public resources for humans, 16 for mice and 14 for yeasts. Using ConsensusPathDB, researchers commonly evaluate lists of genes, proteins and metabolites against sets of molecular interactions defined by pathways, Gene Ontology and network neighborhoods and retrieve complex molecular neighborhoods formed by heterogeneous interaction types. Furthermore, the integrated protein-protein interaction network is used as a basis for propagation methods. Here, we present the 2022 update of ConsensusPathDB, highlighting content growth, additional functionality and improved database stability. For example, the number of human molecular interactions increased to 859 848 connecting 200 499 unique physical entities such as genes/proteins, metabolites and drugs. Furthermore, we integrated regulatory datasets in the form of transcription factor-, microRNA- and enhancer-gene target interactions, thus providing novel functionality in the context of overrepresentation and enrichment analyses. We specifically emphasize the use of the integrated protein-protein interaction network as a scaffold for network inferences, present topological characteristics of the network and discuss strengths and shortcomings of such approaches.
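      The overrepresentation analysis mentioned above is commonly framed as a one-sided hypergeometric test: given a gene universe, a pathway, and a query list, how likely is an overlap at least as large as the one observed? The sketch below shows that calculation with the Python standard library only. The function name and the numbers are hypothetical, and the abstract does not state which test ConsensusPathDB itself applies, so this should be read as a generic example rather than the database's method.

```python
from math import comb

def overrepresentation_p(n_universe, n_in_set, n_query, n_overlap):
    """One-sided hypergeometric p-value, P(overlap >= n_overlap).

    n_universe: number of genes in the background universe
    n_in_set:   number of universe genes annotated to the pathway/set
    n_query:    number of genes in the user's query list
    n_overlap:  number of query genes that fall in the set
    """
    largest_possible = min(n_in_set, n_query)
    total = comb(n_universe, n_query)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_query - k)
        for k in range(n_overlap, largest_possible + 1)
    )
    return tail / total

# Hypothetical numbers: 10 genes in the universe, 5 in the pathway,
# a query of 5 genes of which all 5 hit the pathway.
p_all_hit = overrepresentation_p(10, 5, 5, 5)
# With an overlap threshold of 0 the tail covers every outcome.
p_any = overrepresentation_p(10, 5, 5, 0)
```

      When many sets are tested at once, the resulting p-values would normally be corrected for multiple testing, which this sketch leaves out.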
  23. Oncogene. 2021 Dec 03.
      The oncogenic potential of the latent transcription factor signal transducer and activator of transcription (STAT)3 in many human cancers, including lung cancer, has been largely attributed to its nuclear activity as a tyrosine-phosphorylated (pY705 site) transcription factor. By contrast, an alternate mitochondrial pool of serine phosphorylated (pS727 site) STAT3 has been shown to promote tumourigenesis by regulating metabolic processes, although this has been reported in only a restricted number of mutant RAS-addicted neoplasms. Therefore, the involvement of STAT3 serine phosphorylation in the pathogenesis of most cancer types, including mutant KRAS lung adenocarcinoma (LAC), is unknown. Here, we demonstrate that LAC is suppressed in oncogenic KrasG12D-driven mouse models engineered for pS727-STAT3 deficiency. The proliferative potential of the transformed KrasG12D lung epithelium, and mutant KRAS human LAC cells, was significantly reduced upon pS727-STAT3 deficiency. Notably, we uncover the multifaceted capacity of constitutive pS727-STAT3 to metabolically reprogramme LAC cells towards a hyper-proliferative state by regulating nuclear and mitochondrial (mt) gene transcription, the latter via the mtDNA transcription factor, TFAM. Collectively, our findings reveal an obligate requirement for the transcriptional activity of pS727-STAT3 in mutant KRAS-driven LAC with potential to guide future therapeutic targeting approaches.
  24. Nucleic Acids Res. 2021 Dec 01. pii: gkab1176. [Epub ahead of print]
      High levels of histone acetylation are associated with the regulatory elements of active genes, suggesting a link between acetylation and gene activation. We revisited this model, in the context of EGF-inducible gene expression and found that rather than a simple unifying model, there are two broad classes of genes; one in which high lysine acetylation activity is required for efficient gene activation, and a second group where the opposite occurs and high acetylation activity is inhibitory. We examined the latter class in more detail using EGR2 as a model gene and found that lysine acetylation levels are critical for several activation parameters, including the timing of expression onset, and overall amplitudes of the transcriptional response. In contrast, DUSP1 responds in the canonical manner and its transcriptional activity is promoted by acetylation. Single cell approaches demonstrate heterogenous activation kinetics of a given gene in response to EGF stimulation. Acetylation levels modify these heterogenous patterns and influence both allele activation frequencies and overall expression profile parameters. Our data therefore point to a complex interplay between acetylation equilibria and target gene induction where acetylation level thresholds are an important determinant of transcriptional induction dynamics that are sensed in a gene-specific manner.
  25. Nucleic Acids Res. 2021 Nov 25. pii: gkab1125. [Epub ahead of print]
      Here, we report that in T47D breast cancer cells 50 pM progestin is sufficient to activate cell cycle entry and the progesterone gene expression program. At this concentration, equivalent to the progesterone blood levels found around the menopause, progesterone receptor (PR) binds only to 2800 genomic sites, which are accessible to ATAC cleavage prior to hormone exposure. These highly accessible sites (HAs) are surrounded by well-organized nucleosomes and exhibit breast enhancer features, including estrogen receptor alpha (ERα), higher FOXA1 and BRD4 (bromodomain containing 4) occupancy. Although HAs are enriched in RAD21 and CTCF, PR binding is the driving force for the most robust interactions with hormone-regulated genes. HAs show higher frequency of 3D contacts among themselves than with other PR binding sites, indicating colocalization in similar compartments. Gene regulation via HAs is independent of classical coregulators and ATP-activated remodelers, relying mainly on MAP kinase activation that enables PR nuclear engagement. HAs are also preferentially occupied by PR and ERα in breast cancer xenografts derived from MCF-7 cells as well as from patients, indicating their potential usefulness as targets for therapeutic intervention.
  26. Nucleic Acids Res. 2021 Nov 26. pii: gkab1133. [Epub ahead of print]
      Mapping gene interactions within tissues/cell types plays a crucial role in understanding the genetic basis of human physiology and disease. Tissue functional gene networks (FGNs) are essential models for mapping complex gene interactions. We present TissueNexus, a database of 49 human tissue/cell line FGNs constructed by integrating heterogeneous genomic data. We adopted an advanced machine learning approach for data integration because Bayesian classifiers, which are the main approach used for constructing existing tissue gene networks, cannot capture the interaction and nonlinearity of genomic features well. A total of 1,341 RNA-seq datasets containing 52,087 samples were integrated for all of these networks. Because the tissue label for RNA-seq data may be annotated with different names or be missing, we performed intensive hand-curation to improve quality. We further developed a user-friendly database for network search, visualization, and functional analysis. We illustrate the application of TissueNexus in prioritizing disease genes. The database is publicly available at
  27. Nucleic Acids Res. 2021 Nov 29. pii: gkab1179. [Epub ahead of print]
      Methylation on CpG residues is one of the most important epigenetic modifications of nuclear DNA, regulating gene expression. Methylation of mitochondrial DNA (mtDNA) has been studied using whole genome bisulfite sequencing (WGBS), but recent evidence has uncovered technical issues which introduce a potential bias during methylation quantification. Here, we validate the technical concerns of WGBS, and develop and assess the accuracy of a new protocol for mtDNA nucleotide variant-specific methylation using single-molecule Oxford Nanopore Sequencing (ONS). Our approach circumvents confounders by enriching for full-length molecules over nuclear DNA. Variant calling analysis against showed that 99.5% of homoplasmic mtDNA variants can be reliably identified providing there is adequate sequencing depth. We show that some of the mtDNA methylation signal detected by ONS is due to sequence-specific false positives introduced by the technique. The residual signal was observed across several human primary and cancer cell lines and multiple human tissues, but was always below the error threshold modelled using negative controls. We conclude that there is no evidence for CpG methylation in human mtDNA, thus resolving previous controversies. Additionally, we developed a reliable protocol to study epigenetic modifications of mtDNA at single-molecule and single-base resolution, with potential applications beyond CpG methylation.