bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2021-12-26
twenty-six papers selected by
Connor Rogerson, University of Cambridge, MRC Cancer Unit



  1. Genome Biol. 2021 Dec 20. 22(1): 348
      Understanding the contributions of transcription factor DNA binding sites to transcriptional enhancers is a significant challenge. We developed Quantitative enhancer-FACS-Seq for highly parallel quantification of enhancer activities from a genomically integrated reporter in Drosophila melanogaster embryos. We investigate the contributions of the DNA binding motifs of four poorly characterized TFs to the activities of twelve embryonic mesodermal enhancers. We measure quantitative changes in enhancer activity and discover a range of epistatic interactions among the motifs, both synergistic and alleviating. We find that understanding the regulatory consequences of TF binding motifs requires that they be investigated in combination across enhancer contexts.
    Keywords:  Drosophila embryonic mesoderm; Enhancers; Epistasis; Reporter assay; Transcription factor binding sites
    DOI:  https://doi.org/10.1186/s13059-021-02574-x
  2. Mol Cell. 2021 Dec 14. pii: S1097-2765(21)01025-X. [Epub ahead of print]
      Developmental genes such as Xist, which initiates X chromosome inactivation, are controlled by complex cis-regulatory landscapes, which decode multiple signals to establish specific spatiotemporal expression patterns. Xist integrates information on X chromosome dosage and developmental stage to trigger X inactivation in the epiblast specifically in female embryos. Through a pooled CRISPR screen in differentiating mouse embryonic stem cells, we identify functional enhancer elements of Xist at the onset of random X inactivation. Chromatin profiling reveals that X-dosage controls the promoter-proximal region, while differentiation cues activate several distal enhancers. The strongest distal element lies in an enhancer cluster associated with a previously unannotated Xist-enhancing regulatory transcript, which we named Xert. Developmental cues and X-dosage are thus decoded by distinct regulatory regions, which cooperate to ensure female-specific Xist upregulation at the correct developmental time. With this study, we start to disentangle how multiple, functionally distinct regulatory elements interact to generate complex expression patterns in mammals.
    Keywords:  CRISPR screens; CRISPRi; X-chromosome inactivation; Xert; Xist; chromatin modifications; enhancers; epigenetics; lncRNA
    DOI:  https://doi.org/10.1016/j.molcel.2021.11.023
  3. Genome Res. 2021 Dec 21. pii: gr.275669.121. [Epub ahead of print]
      Nuclear organization and chromatin interactions are important for genome function, yet determining chromatin connections at high-resolution remains a major challenge. To address this, we developed Accessible Region Conformation Capture (ARC-C), which profiles interactions between regulatory elements genome-wide without a capture step. Applied to C. elegans, we identify ~15,000 significant interactions between regulatory elements at 500bp resolution. Of 105 TFs or chromatin regulators tested, we find that the binding sites of 60 are enriched for interacting with each other, making them candidates for mediating interactions. These include cohesin and condensin II. Applying ARC-C to a mutant of transcription factor BLMP-1 detected changes in interactions between its targets. ARC-C simultaneously profiles domain level architecture, and we observe that C. elegans chromatin domains defined by either active or repressive modifications form topologically associating domains (TADs) which interact with A/B (active/inactive) compartment-like structure. Furthermore, we discovered that inactive compartment interactions are dependent on H3K9 methylation. ARC-C is a powerful new tool to interrogate genome architecture and regulatory interactions at high resolution.
    DOI:  https://doi.org/10.1101/gr.275669.121
  4. Genome Res. 2021 Dec 20. pii: gr.275901.121. [Epub ahead of print]
      Gene expression is regulated through complex molecular interactions, involving cis-acting elements that can be situated far away from their target genes. Data on long-range contacts between promoters and regulatory elements is rapidly accumulating. However, it remains unclear how these regulatory relationships evolve and how they contribute to the establishment of robust gene expression profiles. Here, we address these questions by comparing genome-wide maps of promoter-centered chromatin contacts in mouse and human. We show that there is significant evolutionary conservation of cis-regulatory landscapes, indicating that selective pressures act to preserve not only regulatory element sequences but also their chromatin contacts with target genes. The extent of evolutionary conservation is remarkable for long-range promoter-enhancer contacts, illustrating how the structure of regulatory landscapes constrains large-scale genome evolution. We show that the evolution of cis-regulatory landscapes, measured in terms of distal element sequences, synteny or contacts with target genes, is significantly associated with gene expression evolution.
    DOI:  https://doi.org/10.1101/gr.275901.121
  5. Nucleic Acids Res. 2021 Dec 21. pii: gkab1215. [Epub ahead of print]
      Transcription and genome architecture are interdependent, but it is still unclear how nucleosomes in the chromatin fiber interact with nascent RNA, and what the relative nuclear distribution of these RNAs and elongating RNA polymerase II (RNAP II) is. Using super-resolution (SR) microscopy, we visualized the nascent transcriptome, in both nucleoplasm and nucleolus, with nanoscale resolution. We found that nascent RNAs organize in structures we termed RNA nanodomains, whose characteristics are independent of the number of transcripts produced over time. Dual-color SR imaging of nascent RNAs, together with elongating RNAP II and H2B, shows the physical relation between nucleosome clutches, RNAP II, and RNA nanodomains. The distance between nucleosome clutches and RNA nanodomains is larger than the distance measured between elongating RNAP II and RNA nanodomains. Elongating RNAP II stands between nascent RNAs and the small, transcriptionally active, nucleosome clutches. Moreover, RNA factories are small and largely formed by few RNAP II. Finally, we describe a novel approach to quantify the transcriptional activity at an individual gene locus. By measuring local nascent RNA accumulation upon transcriptional activation at single alleles, we confirm the measurements made at the global nuclear level.
    DOI:  https://doi.org/10.1093/nar/gkab1215
  6. Genome Res. 2021 Dec 21. pii: gr.276042.121. [Epub ahead of print]
      Chromosomal translocations are important drivers of hematological malignancies whereby proto-oncogenes are activated by juxtaposition with super-enhancers, often called enhancer hijacking. We analysed the epigenomic consequences of rearrangements between the super-enhancers of the immunoglobulin heavy locus (IGH) and proto-oncogene CCND1 that are common in B cell malignancies. By integrating BLUEPRINT epigenomic data with DNA breakpoint detection, we characterised the normal chromatin landscape of the human IGH locus and its dynamics after pathological genomic rearrangement. We detected an H3K4me3 broad domain (BD) within the IGH locus of healthy B cells that was absent in samples with IGH-CCND1 translocations. The appearance of H3K4me3-BD over CCND1 in the latter was associated with overexpression and extensive chromatin accessibility of its gene body. We observed similar cancer-specific H3K4me3-BDs associated with super-enhancer hijacking of other common oncogenes in B cell (MAF, MYC and FGFR3/NSD2) and in T-cell malignancies (LMO2, TLX3 and TAL1). Our analysis suggests that H3K4me3-BDs can be created by super-enhancers and supports the new concept of epigenomic translocation, where the relocation of H3K4me3-BDs from cell identity genes to oncogenes accompanies the translocation of super-enhancers.
    DOI:  https://doi.org/10.1101/gr.276042.121
  7. PLoS Genet. 2021 Dec 23. 17(12): e1009986
      TP53 and ARID1A are frequently mutated across cancer but rarely in the same primary tumor. Endometrial cancer has the highest TP53-ARID1A mutual exclusivity rate. However, the functional relationship between TP53 and ARID1A mutations in the endometrium has not been elucidated. We used genetically engineered mice and in vivo genomic approaches to discern both unique and overlapping roles of TP53 and ARID1A in the endometrium. TP53 loss with oncogenic PIK3CA H1047R in the endometrial epithelium results in features of endometrial hyperplasia, adenocarcinoma, and intraepithelial carcinoma. Mutant endometrial epithelial cells were transcriptome profiled and compared to control cells and ARID1A/PIK3CA mutant endometrium. In the context of either TP53 or ARID1A loss, PIK3CA mutant endometrium exhibited inflammatory pathway activation, but other gene expression programs differed based on TP53 or ARID1A status, such as epithelial-to-mesenchymal transition. Gene expression patterns observed in the genetic mouse models are reflective of human tumors with each respective genetic alteration. Consistent with TP53-ARID1A mutual exclusivity, the p53 pathway is activated following ARID1A loss in the endometrial epithelium, where ARID1A normally directly represses p53 pathway genes in vivo, including the stress-inducible transcription factor, ATF3. However, co-existing TP53-ARID1A mutations led to invasive adenocarcinoma associated with mutant ARID1A-driven ATF3 induction, reduced apoptosis, TP63+ squamous differentiation and invasion. These data suggest TP53 and ARID1A mutations drive shared and distinct tumorigenic programs in the endometrium and promote invasive endometrial cancer when existing simultaneously. Hence, TP53 and ARID1A mutations may co-occur in a subset of aggressive or metastatic endometrial cancers, with ARID1A loss promoting squamous differentiation and the acquisition of invasive properties.
    DOI:  https://doi.org/10.1371/journal.pgen.1009986
  8. Nucleic Acids Res. 2021 Dec 21. pii: gkab1250. [Epub ahead of print]
      Transcription factors (TFs) play a pivotal role in cell fate decision by coordinating gene expression programs. Although most TFs act at the DNA layer, few TFs bind RNA and modulate splicing. Yet, the mechanistic cues underlying TF activity in splicing remain elusive. Focusing on the Drosophila Hox TF Ultrabithorax (Ubx), our work sheds light on a novel layer of Ubx function at the RNA level. Transcriptome and genome-wide binding profiles in embryonic mesoderm and Drosophila cells indicate that Ubx regulates mRNA expression and splicing to promote distinct outcomes in defined cellular contexts. Our results demonstrate a new RNA-binding ability of Ubx. We find that the N51 amino acid of the DNA-binding Homeodomain is non-essential for RNA interaction in vitro, but is required for RNA interaction in vivo and Ubx splicing activity. Moreover, mutation of the N51 amino acid weakens the interaction between Ubx and active RNA Polymerase II (Pol II). Our results reveal that Ubx regulates elongation-coupled splicing, which could be coordinated by a dynamic interplay with active Pol II on chromatin. Overall, our work uncovered a novel role of the Hox TFs at the mRNA regulatory layer. This could be an essential function for other classes of TFs to control cell diversity.
    DOI:  https://doi.org/10.1093/nar/gkab1250
  9. Nucleic Acids Res. 2021 Dec 20. pii: gkab1233. [Epub ahead of print]
      Investigating chromatin interactions between regulatory regions such as enhancer and promoter elements is vital for understanding the regulation of gene expression. Compared to Hi-C and its variants, the emerging 3D mapping technologies focusing on enriched signals, such as TrAC-looping, reduce the sequencing cost and provide higher interaction resolution for cis-regulatory elements. A robust pipeline is needed for the comprehensive interpretation of these data, especially for loop-centric analysis. Therefore, we have developed a new versatile tool named cLoops2 for the full-stack analysis of these 3D chromatin interaction data. cLoops2 consists of core modules for peak-calling, loop-calling, differentially enriched loops calling and loops annotation. It also contains multiple modules for interaction resolution estimation, data similarity estimation, features quantification, feature aggregation analysis, and visualization. cLoops2 with documentation and example data are open source and freely available at GitHub: https://github.com/KejiZhaoLab/cLoops2.
    DOI:  https://doi.org/10.1093/nar/gkab1233
  10. Nucleic Acids Res. 2021 Dec 21. pii: gkab1137. [Epub ahead of print]
      Histone H3mm18 is a non-allelic H3 variant expressed in skeletal muscle and brain in mice. However, its function has remained enigmatic. We found that H3mm18 is incorporated into chromatin in cells with low efficiency, as compared to H3.3. We determined the structures of the nucleosome core particle (NCP) containing H3mm18 by cryo-electron microscopy, which revealed that the entry/exit DNA regions are drastically disordered in the H3mm18 NCP. Consistently, the H3mm18 NCP is substantially unstable in vitro. The forced expression of H3mm18 in mouse myoblast C2C12 cells markedly suppressed muscle differentiation. A transcriptome analysis revealed that the forced expression of H3mm18 affected the expression of multiple genes, and suppressed a group of genes involved in muscle development. These results suggest a novel gene expression regulation system in which the chromatin landscape is altered by the formation of unusual nucleosomes with a histone variant, H3mm18, and provide important insight into understanding transcription regulation by chromatin.
    DOI:  https://doi.org/10.1093/nar/gkab1137
  11. FASEB Bioadv. 2021 Dec;3(12): 1020-1033
      Epigenetic alterations of chromatin structure affect chromatin accessibility and collaborate with genetic alterations in the development of cancer. Lysine demethylase 4B (KDM4B) has been identified as a JmjC domain-containing epigenetic modifier that possesses histone demethylase activity. Although recent studies have demonstrated that KDM4B positively regulates the pathogenesis of multiple types of solid tumors, the tissue specificity and context dependency have not been fully elucidated. In this study, we investigated gene expression profiles established from clinical samples and found that KDM4B is elevated specifically in acute myeloid leukemia (AML) associated with chromosomal translocation 8;21 [t(8;21)], which results in a fusion of the AML1 and the eight-twenty-one (ETO) genes to generate a leukemia oncogene, AML1-ETO fusion transcription factor. Short hairpin RNA-mediated KDM4B silencing significantly reduced cell proliferation in t(8;21)-positive AML cell lines. Meanwhile, KDM4B silencing suppressed the expression of AML1-ETO-inducible genes, and consistently perturbed chromatin accessibility of AML1-ETO-binding sites involving altered active enhancer marks and functional cis-regulatory elements. Notably, transduction of murine KDM4B orthologue mutants followed by KDM4B silencing demonstrated a requirement of methylated-histone binding modules for a proliferative surge. To address the role of KDM4B in leukemia development, we further generated and analyzed Kdm4b conditional knockout mice. As a result, Kdm4b deficiency attenuated clonogenic potential mediated by AML1-ETO and delayed leukemia progression in vivo. Thus, our results highlight a tumor-promoting role of KDM4B in AML associated with t(8;21).
    Keywords:  acute myeloid leukemia; chromatin accessibility; gene expression analysis; gene targeting
    DOI:  https://doi.org/10.1096/fba.2021-00030
  12. Nucleic Acids Res. 2021 Dec 20. pii: gkab1235. [Epub ahead of print]
      Three-dimensional (3D) conformation of the chromatin is crucial to stringently regulate gene expression patterns and DNA replication in a cell-type specific manner. Hi-C is a key technique for measuring 3D chromatin interactions genome wide. Estimating and predicting the resolution of a library is an essential step in any Hi-C experimental design. Here, we present the mathematical concepts to estimate the resolution of a dataset and predict whether deeper sequencing would enhance the resolution. We have developed HiCRes, a docker pipeline, by applying these concepts to several Hi-C libraries.
    DOI:  https://doi.org/10.1093/nar/gkab1235
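      As an illustration of what "resolution of a Hi-C dataset" can mean in practice, the sketch below applies a coverage heuristic that is common in the Hi-C literature: take the smallest bin size at which at least 80% of bins receive at least 1,000 contacts. This is not the HiCRes method, whose mathematics the abstract does not describe; the function name `map_resolution`, its parameters, and the default thresholds are assumptions made for this example.

```python
def map_resolution(contacts, chrom_length, bin_sizes,
                   min_contacts=1000, min_fraction=0.8):
    """Return the smallest bin size meeting a coverage criterion, or None.

    contacts     : list of (pos1, pos2) contact pairs on one chromosome
                   (0-based positions, each < chrom_length)
    chrom_length : chromosome length in bp
    bin_sizes    : candidate bin sizes in bp
    The thresholds are illustrative defaults, not values taken from HiCRes.
    """
    for size in sorted(bin_sizes):
        n_bins = -(-chrom_length // size)  # ceiling division
        counts = [0] * n_bins
        for a, b in contacts:
            # Count a contact once per bin it touches.
            for i in {a // size, b // size}:
                counts[i] += 1
        covered = sum(1 for c in counts if c >= min_contacts)
        if covered / n_bins >= min_fraction:
            return size
    return None
```

      A call such as `map_resolution(pairs, 1_000_000, [1_000, 5_000, 10_000])` returns the finest candidate bin size that passes the criterion. Whether deeper sequencing would improve that value is the separate prediction problem the paper addresses.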
  13. Development. 2021 Dec 15. pii: dev200191. [Epub ahead of print]148(24):
      Zygotic genome activation (ZGA) represents the initiation of transcription following fertilisation. Despite its importance, we know little of the molecular events that initiate mammalian ZGA in vivo. Recent in vitro studies in mouse embryonic stem cells have revealed developmental pluripotency associated 2 and 4 (Dppa2/4) as key regulators of ZGA-associated transcription. However, their roles in initiating ZGA in vivo remain unexplored. We reveal that Dppa2/4 proteins are present in the nucleus at all stages of preimplantation development and associate with mitotic chromatin. We generated conditional single and double maternal knockout mouse models to deplete maternal stores of Dppa2/4. Importantly, Dppa2/4 maternal knockout mice were fertile when mated with wild-type males. Immunofluorescence and transcriptome analyses of two-cell embryos revealed that, although ZGA took place, there were subtle defects in embryos that lacked maternal Dppa2/4. Strikingly, heterozygous offspring that inherited the null allele maternally had higher preweaning lethality than those that inherited the null allele paternally. Together, our results show that although Dppa2/4 are dispensable for ZGA transcription, maternal stores have an important role in offspring survival, potentially via epigenetic priming of developmental genes.
    Keywords:  Dppa2; Dppa4; Embryo; Epigenetics; Maternal factors; Mouse; Zygotic genome activation
    DOI:  https://doi.org/10.1242/dev.200191
  14. Nucleic Acids Res. 2021 Dec 24. pii: gkab1271. [Epub ahead of print]
      The histone chaperone FACT (FAcilitates Chromatin Transcription) plays an essential role in transcription and DNA replication by its dual functions on nucleosome assembly to maintain chromatin integrity and nucleosome disassembly to destabilize nucleosome and facilitate its accessibility simultaneously. Mono-ubiquitination at Lysine 119 of H2A (ubH2A) has been suggested to repress transcription by preventing the recruitment of FACT at early elongation process. However, to date, how ubH2A directly affects FACT on nucleosome assembly and disassembly remains elusive. In this study, we demonstrated that the dual functions of FACT are differently regulated by ubH2A. The H2A ubiquitination does not affect FACT's chaperone function in nucleosome assembly and FACT can deposit ubH2A-H2B dimer on tetrasome to form intact nucleosome. However, ubH2A greatly restricts FACT binding on nucleosome and inhibits its activity of nucleosome disassembly. Interestingly, deubiquitination of ubH2A rescues the nucleosome disassembly function of FACT to activate gene transcription. Our findings provide mechanistic insights into how H2A ubiquitination affects FACT in breaking nucleosome and maintaining its integrity, which sheds light on the biological function of ubH2A and FACT's various activities under different chromatin states.
    DOI:  https://doi.org/10.1093/nar/gkab1271
  15. EMBO J. 2021 Dec 21. e109445
      Genetically diverse pluripotent stem cells display varied, heritable responses to differentiation cues. Here, we harnessed these disparities through derivation of mouse embryonic stem cells from the BXD genetic reference panel, along with C57BL/6J (B6) and DBA/2J (D2) parental strains, to identify loci regulating cell state transitions. Upon transition to formative pluripotency, B6 stem cells quickly dissolved naïve networks adopting gene expression modules indicative of neuroectoderm lineages, whereas D2 retained aspects of naïve pluripotency. Spontaneous formation of embryoid bodies identified divergent differentiation where B6 showed a propensity toward neuroectoderm and D2 toward definitive endoderm. Genetic mapping identified major trans-acting loci co-regulating chromatin accessibility and gene expression in both naïve and formative pluripotency. These loci distally modulated occupancy of pluripotency factors at hundreds of regulatory elements. One trans-acting locus on Chr 12 primarily impacted chromatin accessibility in embryonic stem cells, while in epiblast-like cells, the same locus subsequently influenced expression of genes enriched for neurogenesis, suggesting early chromatin priming. These results demonstrate genetically determined biases in lineage commitment and identify major regulators of the pluripotency epigenome.
    Keywords:  BXD mice; KRAB zinc-finger proteins; cell fate; epigenetic variability; mESCs
    DOI:  https://doi.org/10.15252/embj.2021109445
  16. Genome Biol. 2021 Dec 20. 22(1): 344
      Single-cell CRISPR screens are a promising biotechnology for mapping regulatory elements to target genes at genome-wide scale. However, technical factors like sequencing depth impact not only expression measurement but also perturbation detection, creating a confounding effect. We demonstrate on two single-cell CRISPR screens how these challenges cause calibration issues. We propose SCEPTRE: analysis of single-cell perturbation screens via conditional resampling, which infers associations between perturbations and expression by resampling the former according to a working model for perturbation detection probability in each cell. SCEPTRE demonstrates very good calibration and sensitivity on CRISPR screen data, yielding hundreds of new regulatory relationships supported by orthogonal biological evidence.
    DOI:  https://doi.org/10.1186/s13059-021-02545-2
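      The conditional resampling idea described in the abstract above can be sketched in a few lines. This is a minimal illustration of a conditional randomization test, not the SCEPTRE implementation: the per-cell detection probabilities are assumed to be supplied by some working model, a plain difference in means stands in for whatever test statistic the published method uses, and the function name and parameters are hypothetical.

```python
import random


def conditional_resampling_pvalue(perturbed, expression, detect_prob,
                                  n_resamples=999, seed=0):
    """Empirical two-sided p-value for a perturbation/expression association.

    perturbed   : list of 0/1, whether the perturbation was detected per cell
    expression  : list of floats, expression of the gene of interest per cell
    detect_prob : list of floats, assumed per-cell probability of detecting
                  the perturbation (from a working model, e.g. depth-based)
    """
    rng = random.Random(seed)

    def statistic(indicator):
        # Difference in mean expression, perturbed minus unperturbed cells.
        treated = [e for e, x in zip(expression, indicator) if x]
        control = [e for e, x in zip(expression, indicator) if not x]
        if not treated or not control:
            return 0.0
        return sum(treated) / len(treated) - sum(control) / len(control)

    observed = statistic(perturbed)
    hits = 0
    for _ in range(n_resamples):
        # Resample the perturbation indicator, not the expression values,
        # so technical factors captured by detect_prob are preserved.
        resampled = [1 if rng.random() < p else 0 for p in detect_prob]
        if abs(statistic(resampled)) >= abs(observed):
            hits += 1
    return (hits + 1) / (n_resamples + 1)
```

      Because the null distribution is built by redrawing the perturbation indicator cell by cell, a cell with high sequencing depth stays more likely to be called perturbed in every resample, which is how this kind of test accounts for the confounding the abstract describes.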
  17. Nat Immunol. 2022 Jan;23(1): 122-134
      T cell activation, a key early event in the adaptive immune response, is subject to elaborate transcriptional control. In the present study, we examined how the activities of eight major transcription factor (TF) families are integrated to shape the epigenome of naive and activated CD4 and CD8 T cells. By leveraging extensive polymorphisms in evolutionarily divergent mice, we identified the 'heavy lifters' positively influencing chromatin accessibility. Members of Ets, Runx and TCF/Lef TF families occupied the vast majority of accessible chromatin regions, acting as 'housekeepers', 'universal amplifiers' and 'placeholders', respectively, at sites that maintained or gained accessibility upon T cell activation. In addition, a small subset of strongly induced immune response genes displayed a noncanonical TF recruitment pattern. Our study provides a key resource and foundation for the understanding of transcriptional and epigenetic regulation in T cells and offers a new perspective on the hierarchical interactions between critical TFs.
    DOI:  https://doi.org/10.1038/s41590-021-01086-x
  18. Nat Methods. 2021 Dec 23.
      Single-cell atlases often include samples that span locations, laboratories and conditions, leading to complex, nested batch effects in data. Thus, joint analysis of atlas datasets requires reliable data integration. To guide integration method choice, we benchmarked 68 method and preprocessing combinations on 85 batches of gene expression, chromatin accessibility and simulation data from 23 publications, altogether representing >1.2 million cells distributed in 13 atlas-level integration tasks. We evaluated methods according to scalability, usability and their ability to remove batch effects while retaining biological variation using 14 evaluation metrics. We show that highly variable gene selection improves the performance of data integration methods, whereas scaling pushes methods to prioritize batch removal over conservation of biological variation. Overall, scANVI, Scanorama, scVI and scGen perform well, particularly on complex integration tasks, while single-cell ATAC-sequencing integration performance is strongly affected by choice of feature space. Our freely available Python module and benchmarking pipeline can identify optimal data integration methods for new data, benchmark new methods and improve method development.
    DOI:  https://doi.org/10.1038/s41592-021-01336-8
  19. Nucleic Acids Res. 2021 Dec 20. pii: gkab1242. [Epub ahead of print]
      CTCF is crucial to the organization of mammalian genomes into loop structures. According to recent studies, the transcription apparatus is compartmentalized and concentrated at super-enhancers to form phase-separated condensates and drive the expression of cell-identity genes. However, it remains unclear whether and how transcriptional condensates are coupled to higher-order chromatin organization. Here, we show that CTCF is essential for RNA polymerase II (Pol II)-mediated chromatin interactions, which occur as hyperconnected spatial clusters at super-enhancers. We also demonstrate that CTCF clustering, unlike Pol II clustering, is independent of liquid-liquid phase-separation and resistant to perturbation of transcription. Interestingly, clusters of Pol II, BRD4, and MED1 were found to dissolve upon CTCF depletion, but were reinstated upon restoration of CTCF, suggesting a potent instructive function for CTCF in the formation of transcriptional condensates. Overall, we provide evidence suggesting that CTCF-mediated chromatin looping acts as an architectural prerequisite for the assembly of phase-separated transcriptional condensates.
    DOI:  https://doi.org/10.1093/nar/gkab1242
  20. Elife. 2021 Dec 24. pii: e70535. [Epub ahead of print]10
      Liquid-liquid phase separation (LLPS) of intrinsically disordered regions (IDRs) in proteins can drive the formation of membraneless compartments in cells. Phase-separated structures enrich for specific partner proteins and exclude others. Previously, we showed that the IDRs of metazoan DNA replication initiators drive DNA-dependent phase separation in vitro and chromosome binding in vivo, and that initiator condensates selectively recruit replication-specific partner proteins (Parker et al., 2019). How initiator IDRs facilitate LLPS and maintain compositional specificity is unknown. Here, using D. melanogaster (Dm) Cdt1 as a model initiation factor, we show that phase separation results from a synergy between electrostatic DNA-bridging interactions and hydrophobic inter-IDR contacts. Both sets of interactions depend on sequence composition (but not sequence order), are resistant to 1,6-hexanediol, and do not depend on aromaticity. These findings demonstrate that distinct sets of interactions drive condensate formation and specificity across different phase-separating systems and advance efforts to predict IDR LLPS propensity and partner selection a priori.
    Keywords:  D. melanogaster; biochemistry; chemical biology
    DOI:  https://doi.org/10.7554/eLife.70535
  21. Genome Res. 2021 Dec 23. pii: gr.275723.121. [Epub ahead of print]
      Accurate transcription start site (TSS) annotations are essential for understanding transcriptional regulation and its role in human disease. Gene collections such as GENCODE contain annotations for tens of thousands of TSSs, but not all of these annotations are experimentally validated, nor do they contain information on cell type-specific usage. Therefore, we sought to generate a collection of experimentally validated TSSs by integrating RNA Annotation and Mapping of Promoters for the Analysis of Gene Expression (RAMPAGE) data from 115 cell and tissue types, which resulted in a collection of approximately 50 thousand representative RAMPAGE peaks. These peaks were primarily proximal to GENCODE-annotated TSSs and were concordant with other transcription assays. Because RAMPAGE uses paired-end reads, we were then able to connect peaks to transcripts by analyzing the genomic positions of the 3' ends of read mates. Using this paired-end information, we classified the vast majority (37 thousand) of our RAMPAGE peaks as verified TSSs, updating TSS annotations for 20% of GENCODE genes. We also found that these updated TSS annotations were supported by epigenomic and other transcriptomic datasets. To demonstrate the utility of this RAMPAGE rPeak collection, we intersected it with the NHGRI/EBI genome-wide association studies (GWAS) catalog and identified new candidate GWAS genes. Overall, our work demonstrates the importance of integrating experimental data to further refine TSS annotations and provides a valuable resource for the biological community.
    DOI:  https://doi.org/10.1101/gr.275723.121
  22. Nucleic Acids Res. 2021 Dec 24. pii: gkab1252. [Epub ahead of print]
      Multiple RNA polymerases (RNAPs) transcribing a gene have been known to exhibit collective group behavior, causing the transcription elongation rate to increase with the rate of transcription initiation. Such behavior has long been believed to be driven by a physical interaction or 'push' between closely spaced RNAPs. However, recent studies have posited that RNAPs separated by longer distances may cooperate by modifying the DNA segment under transcription. Here, we present a theoretical model incorporating the mechanical coupling between RNAP translocation and the DNA torsional response. Using stochastic simulations, we demonstrate DNA supercoiling-mediated long-range cooperation between co-transcribing RNAPs. We find that inhibiting transcription initiation can slow down the already recruited RNAPs, in agreement with recent experimental observations, and predict that the average transcription elongation rate varies non-monotonically with the rate of transcription initiation. We further show that while RNAPs transcribing neighboring genes oriented in tandem can cooperate, those transcribing genes in divergent or convergent orientations can act antagonistically, and that such behavior holds over a large range of intergenic separations. Our model makes testable predictions, revealing how the mechanical interplay between RNAPs and the DNA they transcribe can govern transcriptional dynamics.
    DOI:  https://doi.org/10.1093/nar/gkab1252
  23. Cell Stem Cell. 2021 Dec 17. pii: S1934-5909(21)00484-7. [Epub ahead of print]
      In human embryos, the initiation of transcription (embryonic genome activation [EGA]) occurs by the eight-cell stage, but its exact timing and profile are unclear. To address this, we profiled gene expression at depth in human metaphase II oocytes and bipronuclear (2PN) one-cell embryos. High-resolution single-cell RNA sequencing revealed previously inaccessible oocyte-to-embryo gene expression changes. This confirmed transcript depletion following fertilization (maternal RNA degradation) but also uncovered low-magnitude upregulation of hundreds of spliced transcripts. Gene expression analysis predicted embryonic processes including cell-cycle progression and chromosome maintenance as well as transcriptional activators that included cancer-associated gene regulators. Transcription was disrupted in abnormal monopronuclear (1PN) and tripronuclear (3PN) one-cell embryos. These findings indicate that human embryonic transcription initiates at the one-cell stage, sooner than previously thought. The pattern of gene upregulation promises to illuminate processes involved at the onset of human development, with implications for epigenetic inheritance, stem-cell-derived embryos, and cancer.
    Keywords:  embryonic genome activation (EGA); fertilization; human one-cell embryo; single-cell RNA-seq; totipotency; transcriptome; zygote
    DOI:  https://doi.org/10.1016/j.stem.2021.11.012
  24. Nucleic Acids Res. 2021 Dec 20. pii: gkab1230. [Epub ahead of print]
      Mitochondrial transcription factor A (TFAM) plays a critical role in mitochondrial transcription initiation and mitochondrial DNA (mtDNA) packaging. Both functions require DNA binding, but in one case TFAM must recognize a specific promoter sequence, while packaging requires coating of mtDNA by association with non-sequence-specific regions. The mechanisms by which TFAM achieves both sequence-specific and non-sequence-specific recognition have not yet been determined. Existing crystal structures of TFAM bound to DNA allowed us to identify two guanine-specific interactions that are established between TFAM and the bound DNA. These interactions are observed when TFAM is bound to both specific promoter sequences and non-sequence-specific DNA. These interactions are established with two guanine bases separated by 10 random nucleotides (GN10G). Our biochemical results demonstrate that the GN10G consensus is essential for transcriptional initiation and contributes to facilitating TFAM binding to DNA substrates. Furthermore, we report a crystal structure of TFAM in complex with a non-sequence-specific sequence containing a GN10G consensus. The structure reveals a unique arrangement in which TFAM bridges two DNA substrates while maintaining the GN10G interactions. We propose that the GN10G consensus is key to facilitating the interaction of TFAM with DNA.
    DOI:  https://doi.org/10.1093/nar/gkab1230
  25. Nature. 2021 Dec 22.
      The switch/sucrose non-fermentable (SWI/SNF) complex has a crucial role in chromatin remodelling and is altered in over 20% of cancers. Here we developed a proteolysis-targeting chimera (PROTAC) degrader of the SWI/SNF ATPase subunits, SMARCA2 and SMARCA4, called AU-15330. Androgen receptor (AR)+ forkhead box A1 (FOXA1)+ prostate cancer cells are exquisitely sensitive to dual SMARCA2 and SMARCA4 degradation relative to normal and other cancer cell lines. SWI/SNF ATPase degradation rapidly compacts cis-regulatory elements bound by transcription factors that drive prostate cancer cell proliferation, namely AR, FOXA1, ERG and MYC, which dislodges them from chromatin, disables their core enhancer circuitry, and abolishes the downstream oncogenic gene programs. SWI/SNF ATPase degradation also disrupts super-enhancer and promoter looping interactions that wire supra-physiologic expression of the AR, FOXA1 and MYC oncogenes themselves. AU-15330 induces potent inhibition of tumour growth in xenograft models of prostate cancer and synergizes with the AR antagonist enzalutamide, even inducing disease remission in castration-resistant prostate cancer (CRPC) models without toxicity. Thus, impeding SWI/SNF-mediated enhancer accessibility represents a promising therapeutic approach for enhancer-addicted cancers.
    DOI:  https://doi.org/10.1038/s41586-021-04246-z
  26. Nat Immunol. 2022 Jan;23(1): 99-108
      Enzymes of the TET family are methylcytosine dioxygenases that undergo frequent mutational or functional inactivation in human cancers. Recurrent loss-of-function mutations in TET proteins are frequent in human diffuse large B cell lymphoma (DLBCL). Here, we investigate the role of TET proteins in B cell homeostasis and development of B cell lymphomas with features of DLBCL. We show that deletion of Tet2 and Tet3 genes in mature B cells in mice perturbs B cell homeostasis and results in spontaneous development of germinal center (GC)-derived B cell lymphomas with increased G-quadruplexes and R-loops. At a genome-wide level, G-quadruplexes and R-loops were associated with increased DNA double-strand breaks (DSBs) at immunoglobulin switch regions. Deletion of the DNA methyltransferase DNMT1 in TET-deficient B cells prevented expansion of GC B cells, diminished the accumulation of G-quadruplexes and R-loops and delayed B lymphoma development, consistent with the opposing functions of DNMT and TET enzymes in DNA methylation and demethylation. Clustered regularly interspaced short palindromic repeats (CRISPR)-mediated depletion of nucleases and helicases that regulate G-quadruplexes and R-loops decreased the viability of TET-deficient B cells. Our studies suggest a molecular mechanism by which TET loss of function might predispose to the development of B cell malignancies.
    DOI:  https://doi.org/10.1038/s41590-021-01087-w