bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2023–11–19
24 papers selected by
Connor Rogerson, University of Cambridge



  1. Cell Rep. 2023 Nov 10. pii: S2211-1247(23)01440-7. [Epub ahead of print]42(11): 113428
      H3K9 methylation (H3K9me) marks transcriptionally silent genomic regions called heterochromatin. HP1 proteins are required to establish and maintain heterochromatin. HP1 proteins bind to H3K9me, recruit factors that promote heterochromatin formation, and oligomerize to form phase-separated condensates. We do not understand how these different HP1 properties are involved in establishing and maintaining transcriptional silencing. Here, we demonstrate that the S. pombe HP1 homolog, Swi6, can be completely bypassed to establish silencing at ectopic and endogenous loci when an H3K4 methyltransferase, Set1, and an H3K14 acetyltransferase, Mst2, are deleted. Deleting Set1 and Mst2 enhances Clr4 enzymatic activity, leading to higher H3K9me levels and spreading. In contrast, Swi6 and its capacity to oligomerize were indispensable during epigenetic maintenance. Our results demonstrate the role of HP1 proteins in regulating histone modification crosstalk during establishment and identify a genetically separable function in maintaining epigenetic memory.
    Keywords:  CP: Molecular biology; H3K9 methylation; chromatin compaction; condensates; epigenetics; euchromatin; heterochromatin; histones; inheritance; silencing; yeast
    DOI:  https://doi.org/10.1016/j.celrep.2023.113428
  2. Nucleic Acids Res. 2023 Nov 16. pii: gkad1069. [Epub ahead of print]
      The Cistrome Data Browser is a resource of ChIP-seq, ATAC-seq and DNase-seq data from humans and mice. It provides maps of the genome-wide locations of transcription factors, cofactors, chromatin remodelers, histone post-translational modifications and regions of chromatin accessible to endonuclease activity. Cistrome DB v3.0 contains approximately 45 000 human and 44 000 mouse samples with about 32 000 newly collected datasets compared to the previous release. The Cistrome DB v3.0 user interface is implemented as a single page application that unifies menu driven and data driven search functions and provides an embedded genome browser, which allows users to find and visualize data more effectively. Users can find informative chromatin profiles through keyword, menu, and data-driven search tools. Browser search functions can predict the regulators of query genes as well as the cell type and factor dependent functionality of potential cis-regulatory elements. Cistrome DB v3.0 expands the display of quality control statistics, incorporates sequence logos into motif enrichment displays and includes more expansive sample metadata. Cistrome DB v3.0 is available at http://db3.cistrome.org/browser.
    DOI:  https://doi.org/10.1093/nar/gkad1069
  3. Nucleic Acids Res. 2023 Nov 14. pii: gkad1059. [Epub ahead of print]
      JASPAR (https://jaspar.elixir.no/) is a widely-used open-access database presenting manually curated high-quality and non-redundant DNA-binding profiles for transcription factors (TFs) across taxa. In this 10th release and 20th-anniversary update, the CORE collection has expanded with 329 new profiles. We updated three existing profiles and provided orthogonal support for 72 profiles from the previous release's UNVALIDATED collection. Altogether, the JASPAR 2024 update provides a 20% increase in CORE profiles from the previous release. A trimming algorithm enhanced profiles by removing low information content flanking base pairs, which were likely uninformative (within the capacity of the PFM models) for TFBS predictions and modelling TF-DNA interactions. This release includes enhanced metadata, featuring a refined classification for plant TFs' structural DNA-binding domains. The new JASPAR collections prompt updates to the genomic tracks of predicted TF binding sites (TFBSs) in 8 organisms, with human and mouse tracks available as native tracks in the UCSC Genome browser. All data are available through the JASPAR web interface and programmatically through its API and the updated Bioconductor and pyJASPAR packages. Finally, a new TFBS extraction tool enables users to retrieve predicted JASPAR TFBSs intersecting their genomic regions of interest.
    DOI:  https://doi.org/10.1093/nar/gkad1059
  4. Nat Commun. 2023 Nov 16. 14(1): 7420
      Responses of cells to stimuli are increasingly discovered to involve the binding of sequence-specific transcription factors outside of known target genes. We wanted to determine to what extent the genome-wide binding and function of a transcription factor are shaped by the cell type versus the stimulus. To do so, we induced the Heat Shock Response pathway in two different cancer cell lines with two different stimuli and related the binding of its master regulator HSF1 to nascent RNA and chromatin accessibility. Here, we show that HSF1 binding patterns retain their identity between basal conditions and under different magnitudes of activation, so that common HSF1 binding is globally associated with distinct transcription outcomes. HSF1-induced increase in DNA accessibility was modest in scale, but occurred predominantly at remote genomic sites. Apart from regulating transcription at existing elements including promoters and enhancers, HSF1 binding amplified during responses to stimuli may engage inactive chromatin.
    DOI:  https://doi.org/10.1038/s41467-023-43157-7
  5. Nat Genet. 2023 Nov 13.
      The biological functions of noncoding RNA N6-methyladenosine (m6A) modification remain poorly understood. In the present study, we depict the landscape of super-enhancer RNA (seRNA) m6A modification in pancreatic ductal adenocarcinoma (PDAC) and reveal a regulatory axis of m6A seRNA, H3K4me3 modification, chromatin accessibility and oncogene transcription. We demonstrate the cofilin family protein CFL1, overexpressed in PDAC, as a METTL3 cofactor that helps seRNA m6A methylation formation. The increased seRNA m6As are recognized by the reader YTHDC2, which recruits H3K4 methyltransferase MLL1 to promote H3K4me3 modification cotranscriptionally. Super-enhancers with a high level of H3K4me3 augment chromatin accessibility and facilitate oncogene transcription. Collectively, these results shed light on a CFL1-METTL3-seRNA m6A-YTHDC2/MLL1 axis that plays a role in the epigenetic regulation of local chromatin state and gene expression, which strengthens our knowledge about the functions of super-enhancers and their transcripts.
    DOI:  https://doi.org/10.1038/s41588-023-01568-8
  6. Nucleic Acids Res. 2023 Nov 13. pii: gkad1016. [Epub ahead of print]
      Transcription factors (TFs), transcription co-factors (TcoFs) and their target genes perform essential functions in diseases and biological processes. KnockTF 2.0 (http://www.licpathway.net/KnockTF/index.html) aims to provide comprehensive gene expression profile datasets before/after T(co)F knockdown/knockout across multiple tissue/cell types of different species. Compared with KnockTF 1.0, KnockTF 2.0 has the following improvements: (i) Newly added T(co)F knockdown/knockout datasets in mice, Arabidopsis thaliana and Zea mays and also an expanded scale of datasets in humans. Currently, KnockTF 2.0 stores 1468 manually curated RNA-seq and microarray datasets associated with 612 TFs and 172 TcoFs disrupted by different knockdown/knockout techniques, which are 2.5 times larger than those of KnockTF 1.0. (ii) Newly added (epi)genetic annotations for T(co)F target genes in humans and mice, such as super-enhancers, common SNPs, methylation sites and chromatin interactions. (iii) Newly embedded and updated search and analysis tools, including T(co)F Enrichment (GSEA), Pathway Downstream Analysis and Search by Target Gene (BLAST). KnockTF 2.0 is a comprehensive update of KnockTF 1.0, which provides more T(co)F knockdown/knockout datasets and (epi)genetic annotations across multiple species than KnockTF 1.0. KnockTF 2.0 facilitates not only the identification of functional T(co)Fs and target genes but also the investigation of their roles in the physiological and pathological processes.
    DOI:  https://doi.org/10.1093/nar/gkad1016
  7. Nucleic Acids Res. 2023 Nov 13. pii: gkad1022. [Epub ahead of print]
      Stochastic origin activation gives rise to significant cell-to-cell variability in the pattern of genome replication. The molecular basis for heterogeneity in efficiency and timing of individual origins is a long-standing question. Here, we developed Methylation Accessibility of TArgeted Chromatin domain Sequencing (MATAC-Seq) to determine single-molecule chromatin accessibility of four specific genomic loci. MATAC-Seq relies on preferential modification of accessible DNA by methyltransferases combined with Nanopore-Sequencing for direct readout of methylated DNA-bases. Applying MATAC-Seq to selected early-efficient and late-inefficient yeast replication origins revealed large heterogeneity of chromatin states. Disruption of INO80 or ISW2 chromatin remodeling complexes leads to changes at individual nucleosomal positions that correlate with changes in their replication efficiency. We found a chromatin state with an accessible nucleosome-free region in combination with well-positioned +1 and +2 nucleosomes as a strong predictor for efficient origin activation. Thus, MATAC-Seq identifies the large spectrum of alternative chromatin states that co-exist on a given locus previously masked in population-based experiments and provides a mechanistic basis for origin activation heterogeneity during eukaryotic DNA replication. Consequently, our single-molecule chromatin accessibility assay will be ideal to define single-molecule heterogeneity across many fundamental biological processes such as transcription, replication, or DNA repair in vitro and ex vivo.
    DOI:  https://doi.org/10.1093/nar/gkad1022
  8. Sci Adv. 2023 Nov 15. 9(46): eadf3980
      Embryonic stem cells (ESCs) have transcriptionally permissive chromatin enriched for gene activation-associated histone modifications. A striking exception is DOT1L-mediated H3K79 dimethylation (H3K79me2) that is considered a positive regulator of transcription. We find that ESCs are depleted for H3K79me2 at shared locations of enrichment with somatic cells, which are highly and ubiquitously expressed housekeeping genes, and have lower RNA polymerase II (RNAPII) at the transcription start site (TSS) despite greater nascent transcription. Inhibiting DOT1L increases the efficiency of reprogramming of somatic to induced pluripotent stem cells, enables an ESC-like RNAPII pattern at the TSS, and functionally compensates for enforced RNAPII pausing. DOT1L inhibition increases H3K27 methylation and RNAPII elongation-enhancing histone acetylation without changing the expression of the causal histone-modifying enzymes. Only the maintenance of elevated histone acetylation is essential for enhanced reprogramming and occurs at loci that are depleted for H3K79me2. Thus, DOT1L inhibition promotes the hyperacetylation and hypertranscription pluripotent properties.
    DOI:  https://doi.org/10.1126/sciadv.adf3980
  9. Nat Commun. 2023 Nov 16. 14(1): 7435
      SND1 and MTDH are known to promote cancer and therapy resistance, but their mechanisms and interactions with other oncogenes remain unclear. Here, we show that oncoprotein ERG interacts with SND1/MTDH complex through SND1's Tudor domain. ERG, an ETS-domain transcription factor, is overexpressed in many prostate cancers. Knocking down SND1 in human prostate epithelial cells, especially those overexpressing ERG, negatively impacts cell proliferation. Transcriptional analysis shows substantial overlap in genes regulated by ERG and SND1. Mechanistically, we show that ERG promotes nuclear localization of SND1/MTDH. Forced nuclear localization of SND1 prominently increases its growth promoting function irrespective of ERG expression. In mice, prostate-specific Snd1 deletion reduces cancer growth and tumor burden in a prostate cancer model (PB-Cre/Ptenflox/flox/ERG mice). Moreover, we find a significant overlap between prostate transcriptional signatures of ERG and SND1. These findings highlight SND1's crucial role in prostate tumorigenesis, suggesting SND1 as a potential therapeutic target in prostate cancer.
    DOI:  https://doi.org/10.1038/s41467-023-43245-8
  10. Nucleic Acids Res. 2023 Nov 16. pii: gkad1077. [Epub ahead of print]
      We present a major update of the HOCOMOCO collection that provides DNA binding specificity patterns of 949 human transcription factors and 720 mouse orthologs. To make this release, we performed motif discovery in peak sets that originated from 14 183 ChIP-Seq experiments and reads from 2554 HT-SELEX experiments yielding more than 400 thousand candidate motifs. The candidate motifs were annotated according to their similarity to known motifs and the hierarchy of DNA-binding domains of the respective transcription factors. Next, the motifs underwent human expert curation to stratify distinct motif subtypes and remove non-informative patterns and common artifacts. Finally, the curated subset of 100 thousand motifs was supplied to the automated benchmarking to select the best-performing motifs for each transcription factor. The resulting HOCOMOCO v12 core collection contains 1443 verified position weight matrices, including distinct subtypes of DNA binding motifs for particular transcription factors. In addition to the core collection, HOCOMOCO v12 provides motif sets optimized for the recognition of binding sites in vivo and in vitro, and for annotation of regulatory sequence variants. HOCOMOCO is available at https://hocomoco12.autosome.org and https://hocomoco.autosome.org.
    DOI:  https://doi.org/10.1093/nar/gkad1077
  11. Nucleic Acids Res. 2023 Nov 13. pii: gkad1066. [Epub ahead of print]
      Chromatin remodeling is essential to allow full development of alternative gene expression programs in response to environmental changes. In fission yeast, oxidative stress triggers massive transcriptional changes including the activation of hundreds of genes, with the participation of histone modifying complexes and chromatin remodelers. DNA transcription is associated with alterations in DNA topology, and DNA topoisomerases facilitate elongation along gene bodies. Here, we test whether the DNA topoisomerase Top1 participates in the RNA polymerase II-dependent activation of the cellular response to oxidative stress. Cells lacking Top1 are resistant to H2O2 stress. The transcriptome of Δtop1 strain was not greatly affected in the absence of stress, but activation of the anti-stress gene expression program was more sustained than in wild-type cells. Top1 associated with stress open reading frames. While the nucleosomes of stress genes are partially and transiently evicted during stress, the chromatin configuration remains open for longer times in cells lacking Top1, facilitating RNA polymerase II progression. We propose that, by removing DNA tension arising from transcription, Top1 facilitates nucleosome reassembly and works in synergy with the chromatin remodeler Hrp1 as opposing forces to transcription and to Snf22/Hrp3 opening remodelers.
    DOI:  https://doi.org/10.1093/nar/gkad1066
  12. Nature. 2023 Nov 15.
      Mouse models are a critical tool for studying human diseases, particularly developmental disorders. However, conventional approaches for phenotyping may fail to detect subtle defects throughout the developing mouse. Here we set out to establish single-cell RNA sequencing of the whole embryo as a scalable platform for the systematic phenotyping of mouse genetic models. We applied combinatorial indexing-based single-cell RNA sequencing to profile 101 embryos of 22 mutant and 4 wild-type genotypes at embryonic day 13.5, altogether profiling more than 1.6 million nuclei. The 22 mutants represent a range of anticipated phenotypic severities, from established multisystem disorders to deletions of individual regulatory regions. We developed and applied several analytical frameworks for detecting differences in composition and/or gene expression across 52 cell types or trajectories. Some mutants exhibit changes in dozens of trajectories whereas others exhibit changes in only a few cell types. We also identify differences between widely used wild-type strains, compare phenotyping of gain- versus loss-of-function mutants and characterize deletions of topological associating domain boundaries. Notably, some changes are shared among mutants, suggesting that developmental pleiotropy might be 'decomposable' through further scaling of this approach. Overall, our findings show how single-cell profiling of whole embryos can enable the systematic molecular and cellular phenotypic characterization of mouse mutants with unprecedented breadth and resolution.
    DOI:  https://doi.org/10.1038/s41586-023-06548-w
  13. Nat Commun. 2023 Nov 16. 14(1): 7422
      Regeneration requires mechanisms for producing a wide array of cell types. Neoblasts are stem cells in the planarian Schmidtea mediterranea that undergo fate specification to produce over 125 adult cell types. Fate specification in neoblasts can be regulated through expression of fate-specific transcription factors. We utilize multiplexed error-robust fluorescence in situ hybridization (MERFISH) and whole-mount FISH to characterize fate choice distribution of stem cells within planarians. Fate choices are often made distant from target tissues and in a highly intermingled manner, with neighboring neoblasts frequently making divergent fate choices for tissues of different location and function. We propose that pattern formation is driven primarily by the migratory assortment of progenitors from mixed and spatially distributed fate-specified stem cells and that fate choice involves stem-cell intrinsic processes.
    DOI:  https://doi.org/10.1038/s41467-023-43267-2
  14. Mol Cell. 2023 Nov 16. pii: S1097-2765(23)00860-2. [Epub ahead of print]83(22): 4141-4157.e11
      Biomolecular condensates have emerged as a major organizational principle in the cell. However, the formation, maintenance, and dissolution of condensates are still poorly understood. Transcriptional machinery partitions into biomolecular condensates at key cell identity genes to activate these. Here, we report a specific perturbation of WNT-activated β-catenin condensates that disrupts oncogenic signaling. We use a live-cell condensate imaging method in human cancer cells to discover FOXO and TCF-derived peptides that specifically inhibit β-catenin condensate formation on DNA, perturb nuclear β-catenin condensates in cells, and inhibit β-catenin-driven transcriptional activation and colorectal cancer cell growth. We show that these peptides compete with homotypic intermolecular interactions that normally drive condensate formation. Using this framework, we derive short peptides that specifically perturb condensates and transcriptional activation of YAP and TAZ in the Hippo pathway. We propose a "monomer saturation" model in which short interacting peptides can be used to specifically inhibit condensate-associated transcription in disease.
    Keywords:  NMR; WNT-signaling; biomolecular condensates; signaling inhibition; transcriptional regulation
    DOI:  https://doi.org/10.1016/j.molcel.2023.10.023
  15. Cell Rep. 2023 Nov 16. pii: S2211-1247(23)01466-3. [Epub ahead of print]42(11): 113454
      Previous studies of the murine Ly49 and human KIR gene clusters implicated competing sense and antisense promoters in the control of variegated gene expression. In the current study, an examination of transcription factor genes defines an abundance of convergent and divergent sense/antisense promoter pairs, suggesting that competing promoters may control cell fate determination. Differentiation of CD34+ hematopoietic progenitors in vitro shows that cells with GATA1 antisense transcription have enhanced GATA2 transcription and a mast cell phenotype, whereas cells with GATA2 antisense transcription have increased GATA1 transcripts and an erythroblast phenotype. Detailed analyses of the AHR and RORC genes demonstrate the ability of competing promoters to act as binary switches and the association of antisense transcription with an immature/progenitor cell phenotype. These data indicate that alternative cell fates generated by promoter competition in lineage-determining transcription factors contribute to the programming of cell differentiation.
    Keywords:  AHR; CP: Molecular biology; CP: Stem cell research; GATA1; GATA2; RORC; antisense transcription; cell differentiation; divergent/convergent transcription; lineage determining; transcription factors
    DOI:  https://doi.org/10.1016/j.celrep.2023.113454
  16. Cancer Res. 2023 Nov 14.
      Ewing sarcoma is an aggressive cancer with a defective response to DNA damage leading to an enhanced sensitivity to genotoxic agents. Mechanistically, Ewing sarcoma is driven by the fusion transcription factor EWS-FLI1 which reprograms the tumor cell epigenome. The NuRD complex is an important regulator of chromatin function, controlling both gene expression and DNA damage repair, and has been associated with EWS-FLI1 activity. Here, a NuRD-focused CRISPR/Cas9 inactivation screen identified the helicase CHD4 as essential for Ewing sarcoma cell proliferation. CHD4 silencing induced tumor cell death by apoptosis and abolished colony formation. Although CHD4 and NuRD co-localized with EWS-FLI1 at enhancers and super-enhancers, CHD4 promoted Ewing sarcoma cell survival not by modulating EWS-FLI1 activity and its oncogenic gene expression program but by regulating chromatin structure. CHD4 depletion led to a global increase in DNA accessibility and induction of spontaneous DNA damage, resulting in an increased susceptibility to DNA damaging agents. CHD4 loss delayed tumor growth in vivo, increased overall survival, and combination with PARP inhibition by olaparib treatment further suppressed tumor growth. Collectively, these findings highlight the NuRD subunit CHD4 as a therapeutic target in Ewing sarcoma that can potentiate the anti-tumor activity of genotoxic agents.
    DOI:  https://doi.org/10.1158/0008-5472.CAN-22-3950
  17. Sci Rep. 2023 Nov 14. 13(1): 19885
      The dosage-dependent recruitment of RNA polymerase II (Pol II) at the promoters of genes related to neurodevelopment and stem cell maintenance is required for transcription by the fine-tuned expression of SET-domain-containing protein 5 (SETD5). Pol II O-GlcNAcylation by O-GlcNAc transferase (OGT) is critical for preinitiation complex formation and transcription cycling. SETD5 dysregulation has been linked to stem cell-like properties in some cancer types; however, the role of SETD5 in cancer cell stemness has not yet been determined. We here show that aberrant SETD5 overexpression induces stemness in colorectal cancer (CRC) cells. SETD5 overexpression causes the upregulation of PI3K-AKT pathway-related genes and cancer stem cell (CSC) markers such as CD133, Kruppel-like factor 4 (KLF4), and estrogen-related receptor beta (ESRRB), leading to the gain of stem cell-like phenotypes. Our findings also revealed a functional relationship between SETD5, OGT, and Pol II. OGT-catalyzed Pol II glycosylation depends on SETD5, and the SETD5-Pol II interaction weakens in OGT-depleted cells, suggesting a SETD5-OGT-Pol II interdependence. SETD5 deficiency reduces Pol II occupancy at PI3K-AKT pathway-related genes and CD133 promoters, suggesting a role for SETD5-mediated Pol II recruitment in gene regulation. Moreover, the SETD5 depletion nullified the SETD5-induced stemness of CRC cells and Pol II O-GlcNAcylation. These findings support the hypothesis that SETD5 mediates OGT-catalyzed O-GlcNAcylation of RNA Pol II, which is involved in cancer cell stemness gain via CSC marker gene upregulation.
    DOI:  https://doi.org/10.1038/s41598-023-46923-1
  18. Dev Cell. 2023 Nov 08. pii: S1534-5807(23)00555-5. [Epub ahead of print]
      Cardiomyocytes are highly metabolic cells responsible for generating the contractile force in the heart. During fetal development and regeneration, these cells actively divide but lose their proliferative activity in adulthood. The mechanisms that coordinate their metabolism and proliferation are not fully understood. Here, we study the role of the transcription factor NFYa in developing mouse hearts. Loss of NFYa alters cardiomyocyte composition, causing a decrease in immature regenerative cells and an increase in trabecular and mature cardiomyocytes, as identified by spatial and single-cell transcriptome analyses. NFYa-deleted cardiomyocytes exhibited reduced proliferation and impaired mitochondrial metabolism, leading to cardiac growth defects and embryonic death. NFYa, interacting with cofactor SP2, activates genes linking metabolism and proliferation at the transcription level. Our study identifies a nodal role of NFYa in regulating prenatal cardiac growth and a previously unrecognized transcriptional control mechanism of heart metabolism, highlighting the importance of mitochondrial metabolism during heart development and regeneration.
    Keywords:  cardiac metabolism; cardiomyocyte proliferation; heart development; nuclear transcription factor Y
    DOI:  https://doi.org/10.1016/j.devcel.2023.10.012
  19. Proc Natl Acad Sci U S A. 2023 Nov 21. 120(47): e2313835120
      The cyclic AMP response element (CRE) binding protein (CREB) is a transcription factor that contains a 280-residue N-terminal transactivation domain and a basic leucine zipper that mediates interaction with DNA. The transactivation domain comprises three subdomains, the glutamine-rich domains Q1 and Q2 and the kinase inducible activation domain (KID). NMR chemical shifts show that the isolated subdomains are intrinsically disordered but have a propensity to populate local elements of secondary structure. The Q1 and Q2 domains exhibit a propensity for formation of short β-hairpin motifs that function as binding sites for glutamine-rich sequences. These motifs mediate intramolecular interactions between the CREB Q1 and Q2 domains as well as intermolecular interactions with the glutamine-rich Q1 domain of the TATA-box binding protein associated factor 4 (TAF4) subunit of transcription factor IID (TFIID). Using small-angle X-ray scattering, NMR, and single-molecule Förster resonance energy transfer, we show that the Q1, Q2, and KID regions remain dynamically disordered in a full-length CREB transactivation domain (CREBTAD) construct. The CREBTAD polypeptide chain is largely extended although some compaction is evident in the KID and Q2 domains. Paramagnetic relaxation enhancement reveals transient long-range contacts both within and between the Q1 and Q2 domains while the intervening KID domain is largely devoid of intramolecular interactions. Phosphorylation results in expansion of the KID domain, presumably making it more accessible for binding the CBP/p300 transcriptional coactivators. Our study reveals the complex nature of the interactions within the intrinsically disordered transactivation domain of CREB and provides molecular-level insights into dynamic and transient interactions mediated by the glutamine-rich domains.
    Keywords:  NMR; TFIID; intrinsically disordered protein; single-molecule FRET; transcriptional activation
    DOI:  https://doi.org/10.1073/pnas.2313835120
  20. Nat Commun. 2023 Nov 15. 14(1): 7291
      Fusion-positive rhabdomyosarcoma (FP-RMS) driven by the expression of the PAX3-FOXO1 (P3F) fusion oncoprotein is an aggressive subtype of pediatric rhabdomyosarcoma. FP-RMS histologically resembles developing muscle yet occurs throughout the body in areas devoid of skeletal muscle highlighting that FP-RMS is not derived from an exclusively myogenic cell of origin. Here we demonstrate that P3F reprograms mouse and human endothelial progenitors to FP-RMS. We show that P3F expression in aP2-Cre expressing cells reprograms endothelial progenitors to functional myogenic stem cells capable of regenerating injured muscle fibers. Further, we describe a FP-RMS mouse model driven by P3F expression and Cdkn2a loss in endothelial cells. Additionally, we show that P3F expression in TP53-null human iPSCs blocks endothelial-directed differentiation and guides cells to become myogenic cells that form FP-RMS tumors in immunocompromised mice. Together these findings demonstrate that FP-RMS can originate from aberrant development of non-myogenic cells driven by P3F.
    DOI:  https://doi.org/10.1038/s41467-023-43044-1
  21. Nat Comput Sci. 2023 Jul;3(7): 644-657
      Resolving chromatin-remodeling-linked gene expression changes at cell-type resolution is important for understanding disease states. Here we describe MAGICAL (Multiome Accessibility Gene Integration Calling and Looping), a hierarchical Bayesian approach that leverages paired single-cell RNA sequencing and single-cell transposase-accessible chromatin sequencing from different conditions to map disease-associated transcription factors, chromatin sites, and genes as regulatory circuits. By simultaneously modeling signal variation across cells and conditions in both omics data types, MAGICAL achieved high accuracy on circuit inference. We applied MAGICAL to study Staphylococcus aureus sepsis from peripheral blood mononuclear single-cell data that we generated from subjects with bloodstream infection and uninfected controls. MAGICAL identified sepsis-associated regulatory circuits predominantly in CD14 monocytes, known to be activated by bacterial sepsis. We addressed the challenging problem of distinguishing host regulatory circuit responses to methicillin-resistant and methicillin-susceptible S. aureus infections. Although differential expression analysis failed to show predictive value, MAGICAL identified epigenetic circuit biomarkers that distinguished methicillin-resistant from methicillin-susceptible S. aureus infections.
    DOI:  https://doi.org/10.1038/s43588-023-00476-5
  22. Commun Biol. 2023 Nov 16. 6(1): 1138
      Oncogenic pathways that drive cancer progression reflect both genetic changes and epigenetic regulation. Here we stratified primary tumors from each of 24 TCGA adult cancer types based on the gene expression patterns of epigenetic factors (epifactors). The tumors for five cancer types (ACC, KIRC, LGG, LIHC, and LUAD) separated into two robust clusters that were better than grade or epithelial-to-mesenchymal transition in predicting clinical outcomes. The majority of epifactors that drove the clustering were also individually prognostic. A pan-cancer machine learning model deploying epifactor expression data for these five cancer types successfully separated the patients into poor and better outcome groups. Single-cell analysis of adult and pediatric tumors revealed that expression patterns associated with poor or worse outcomes were present in individual cells within tumors. Our study provides an epigenetic map of cancer types and lays a foundation for discovering pan-cancer targetable epifactors.
    DOI:  https://doi.org/10.1038/s42003-023-05459-w
  23. PLoS Genet. 2023 Nov 15. 19(11): e1010826
      engrailed (en) encodes a homeodomain transcription factor crucial for the proper development of Drosophila embryos and adults. Like many developmental transcription factors, en expression is regulated by many enhancers, some of overlapping function, that drive expression in spatially and temporally restricted patterns. The en embryonic enhancers are located in discrete DNA fragments that can function correctly in small reporter transgenes. In contrast, the en imaginal disc enhancers (IDEs) do not function correctly in small reporter transgenes. En is expressed in the posterior compartment of wing imaginal discs; in contrast, small IDE-reporter transgenes are expressed mainly in the anterior compartment. We found that En binds to the IDEs and suggest that it may directly repress IDE function and modulate En expression levels. We identified two en IDEs, O and S. Deletion of either of these IDEs from a 79kb HA-en rescue transgene (HAen79) caused a loss-of-function en phenotype when the HAen79 transgene was the sole source of En. In contrast, flies with a deletion of the same IDEs from an endogenous en gene had no phenotype, suggesting a resiliency not seen in the HAen79 rescue transgene. Inserting a gypsy insulator in HAen79 between en regulatory DNA and flanking sequences strengthened the activity of HAen79, giving better function in both the ON and OFF transcriptional states. Altogether our data suggest that the en IDEs stimulate expression in the entire imaginal disc, and that the ON/OFF state is set by epigenetic memory set by the embryonic enhancers. This epigenetic regulation is similar to that of the Ultrabithorax IDEs and we suggest that the activity of late-acting enhancers in other genes may be similarly regulated.
    DOI:  https://doi.org/10.1371/journal.pgen.1010826
  24. Nucleic Acids Res. 2023 Nov 16. pii: gkad1072. [Epub ahead of print]
      RegulonDB is a database that contains the most comprehensive corpus of knowledge of the regulation of transcription initiation of Escherichia coli K-12, including data from both classical molecular biology and high-throughput methodologies. Here, we describe biological advances since our last NAR paper of 2019. We explain the changes to satisfy FAIR requirements. We also present a full reconstruction of the RegulonDB computational infrastructure, which has significantly improved data storage, retrieval and accessibility and thus supports a more intuitive and user-friendly experience. The integration of graphical tools provides clear visual representations of genetic regulation data, facilitating data interpretation and knowledge integration. RegulonDB version 12.0 can be accessed at https://regulondb.ccg.unam.mx.
    DOI:  https://doi.org/10.1093/nar/gkad1072
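
Note on item 3 (JASPAR 2024): the abstract mentions a trimming algorithm that removes low-information-content flanking base pairs from profiles, without giving its details. The general idea can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not the JASPAR implementation: the function names, the 0.5-bit threshold, the uniform background, and the representation of a position frequency matrix (PFM) as a list of [A, C, G, T] count columns.

```python
import math


def information_content(column):
    """Information content (bits) of one PFM column of A, C, G, T counts.

    Assumes a uniform background, so the maximum is 2 bits.
    """
    total = sum(column)
    if total == 0:
        return 0.0  # an empty column carries no information
    ic = 2.0
    for count in column:
        p = count / total
        if p > 0:
            ic += p * math.log2(p)
    return ic


def trim_flanks(pfm, min_ic=0.5):
    """Drop leading and trailing columns whose information content is below min_ic.

    Interior low-information columns are kept; only the flanks are trimmed.
    The threshold is a hypothetical value chosen for this example.
    """
    ics = [information_content(col) for col in pfm]
    start = 0
    while start < len(pfm) and ics[start] < min_ic:
        start += 1
    end = len(pfm)
    while end > start and ics[end - 1] < min_ic:
        end -= 1
    return pfm[start:end]


# Toy profile: uniform (uninformative) flanks around a three-column core.
toy_pfm = [
    [25, 25, 25, 25],
    [100, 0, 0, 0],
    [25, 25, 25, 25],
    [0, 100, 0, 0],
    [25, 25, 25, 25],
]
trimmed = trim_flanks(toy_pfm)
```

In this toy case the two uniform flanking columns are removed and the uniform interior column is retained, since only the flanks are considered for trimming.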