bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2021‒04‒11
27 papers selected by
Connor Rogerson
University of Cambridge, MRC Cancer Unit


  1. Sci Adv. 2021 Apr. pii: eabe2261. [Epub ahead of print]7(15)
      Forkhead box protein A1 (FOXA1) is essential for androgen-dependent prostate cancer (PCa) growth. However, how FOXA1 levels are regulated remains elusive, and its therapeutic targeting has proven challenging. Here, we report FOXA1 as a nonhistone substrate of enhancer of zeste homolog 2 (EZH2), which methylates FOXA1 at lysine-295. This methylation is recognized by WD40 repeat protein BUB3, which subsequently recruits ubiquitin-specific protease 7 (USP7) to remove ubiquitination and enhance FOXA1 protein stability. They functionally converge in regulating cell cycle genes and promoting PCa growth. FOXA1 is a major therapeutic target of the inhibitors of EZH2 methyltransferase activities in PCa. FOXA1-driven PCa growth can be effectively mitigated by EZH2 enzymatic inhibitors, either alone or in combination with USP7 inhibitors. Together, our study reports EZH2-catalyzed methylation as a key mechanism underlying FOXA1 protein stability, which may be leveraged to enhance therapeutic targeting of PCa using enzymatic EZH2 inhibitors.
    DOI:  https://doi.org/10.1126/sciadv.abe2261
  2. Genome Res. 2021 Apr 08. pii: gr.260851.120. [Epub ahead of print]
      Genomic sequence variation within enhancers and promoters can have a significant impact on the cellular state and phenotype. However, sifting through the millions of candidate variants in a personal genome or a cancer genome, to identify those that impact cis-regulatory function, remains a major challenge. Interpretation of noncoding genome variation benefits from explainable artificial intelligence to predict and interpret the impact of a mutation on gene regulation. Here we generate phased whole genomes with matched chromatin accessibility, histone modifications, and gene expression for 10 melanoma cell lines. We find that training a specialized deep learning model, called DeepMEL2, on melanoma chromatin accessibility data can capture the various regulatory programs of the melanocytic and mesenchymal-like melanoma cell states. This model outperforms motif-based variant scoring, as well as more generic deep learning models. We detect hundreds to thousands of allele-specific chromatin accessibility variants (ASCAVs) in each melanoma genome, of which 15-20% can be explained by gains or losses of transcription factor binding sites. A considerable fraction of ASCAVs are caused by changes in AP-1 binding, as confirmed by matched ChIP-seq data to identify allele-specific binding of JUN and FOSL1. Finally, by augmenting the DeepMEL2 model with ChIP-seq data for GABPA, the TERT promoter mutation as well as additional ETS motif gains can be identified with high confidence. In conclusion, we present a new integrative genomics approach and a deep learning model to identify and interpret functional enhancer mutations with allelic imbalance of chromatin accessibility and gene expression.
    DOI:  https://doi.org/10.1101/gr.260851.120
  3. Cell Rep. 2021 Apr 06. pii: S2211-1247(21)00278-3. [Epub ahead of print]35(1): 108964
      Chromatin remodelers often show broad expression patterns in multiple cell types yet can elicit cell-specific effects in development and diseases. Arid1a binds DNA and regulates gene expression during tissue development and homeostasis. However, it is unclear how Arid1a achieves its functional specificity in regulating progenitor cells. Using the tooth root as a model, we show that loss of Arid1a impairs the differentiation-associated cell cycle arrest of tooth root progenitors through Hedgehog (Hh) signaling regulation, leading to shortened roots. Our data suggest that Plagl1, as a co-factor, endows Arid1a with its cell-type/spatial functional specificity. Furthermore, we show that loss of Arid1a leads to increased expression of Arid1b, which is also indispensable for odontoblast differentiation but is not involved in regulation of Hh signaling. This study expands our knowledge of the intricate interactions among chromatin remodelers, transcription factors, and signaling molecules during progenitor cell fate determination and lineage commitment.
    Keywords:  Arid1a; Hh signaling; Plagl1; cell cycle; stem/progenitor cells
    DOI:  https://doi.org/10.1016/j.celrep.2021.108964
  4. Nat Genet. 2021 Apr;53(4): 551-563
      Polycomb repressive complexes 1 and 2 (PRC1/2) maintain transcriptional silencing of developmental genes largely by catalyzing the formation of mono-ubiquitinated histone H2A at lysine 119 (H2AK119ub1) and trimethylated histone H3 at lysine 27 (H3K27me3), respectively. How Polycomb domains are reprogrammed during mammalian preimplantation development remains largely unclear. Here we show that, although H2AK119ub1 and H3K27me3 are highly colocalized in gametes, they undergo differential reprogramming dynamics following fertilization. H3K27me3 maintains thousands of maternally biased domains until the blastocyst stage, whereas maternally biased H2AK119ub1 distribution in zygotes is largely equalized at the two-cell stage. Notably, while maternal PRC2 depletion has a limited effect on global H2AK119ub1 in early embryos, it disrupts allelic H2AK119ub1 at H3K27me3 imprinting loci including Xist. By contrast, acute H2AK119ub1 depletion in zygotes does not affect H3K27me3 imprinting maintenance, at least by the four-cell stage. Importantly, loss of H2AK119ub1, but not H3K27me3, causes premature activation of developmental genes during zygotic genome activation (ZGA) and subsequent embryonic arrest. Thus, our study reveals distinct dynamics and functions of H3K27me3 and H2AK119ub1 in mouse preimplantation embryos.
    DOI:  https://doi.org/10.1038/s41588-021-00821-2
  5. Nucleic Acids Res. 2021 Apr 09. pii: gkab244. [Epub ahead of print]
      Lysine acetylation (Kac) is well known to occur in histones for chromatin function and epigenetic regulation. In addition to histones, Kac is also detected in a large number of proteins with diverse biological functions. However, the function and regulatory mechanism of Kac are unclear for most proteins. In this work, we studied the effects of mutations in rice genes encoding cytoplasm-localized histone deacetylases (HDAC) on the protein acetylome and found that the HDAC protein HDA714 was a major deacetylase of rice non-histone proteins, including many ribosomal proteins (r-proteins) and translation factors that were extensively acetylated. HDA714 loss-of-function mutations increased Kac levels but reduced the abundance of r-proteins. In vitro and in vivo experiments showed that HDA714 interacted with r-proteins and reduced their Kac. Substitutions of lysine by arginine (depleting Kac) in several r-proteins enhance, while mutations of lysine to glutamine (mimicking Kac) decrease, their stability in a transient expression system. Ribo-seq analysis revealed that the hda714 mutations resulted in increased ribosome stalling frequency. Collectively, the results uncover Kac as a functional posttranslational modification of r-proteins which is controlled by histone deacetylases, extending the role of Kac in gene expression to protein translational regulation.
    DOI:  https://doi.org/10.1093/nar/gkab244
  6. Nature. 2021 Apr 07.
      Genome-wide association studies (GWAS) have identified thousands of noncoding loci that are associated with human diseases and complex traits, each of which could reveal insights into the mechanisms of disease. Many of the underlying causal variants may affect enhancers, but we lack accurate maps of enhancers and their target genes to interpret such variants. We recently developed the activity-by-contact (ABC) model to predict which enhancers regulate which genes and validated the model using CRISPR perturbations in several cell types. Here we apply this ABC model to create enhancer-gene maps in 131 human cell types and tissues, and use these maps to interpret the functions of GWAS variants. Across 72 diseases and complex traits, ABC links 5,036 GWAS signals to 2,249 unique genes, including a class of 577 genes that appear to influence multiple phenotypes through variants in enhancers that act in different cell types. In inflammatory bowel disease (IBD), causal variants are enriched in predicted enhancers by more than 20-fold in particular cell types such as dendritic cells, and ABC achieves higher precision than other regulatory methods at connecting noncoding variants to target genes. These variant-to-function maps reveal an enhancer that contains an IBD risk variant and that regulates the expression of PPIF to alter the membrane potential of mitochondria in macrophages. Our study reveals principles of genome regulation, identifies genes that affect IBD and provides a resource and generalizable strategy to connect risk variants of common diseases to their molecular and cellular functions.
    DOI:  https://doi.org/10.1038/s41586-021-03446-x
  7. Nucleic Acids Res. 2021 Apr 06. pii: gkab180. [Epub ahead of print]
      The lysine specific demethylase 1 (LSD1) plays a pivotal role in cellular differentiation by regulating the expression of key developmental genes in concert with different coregulatory proteins. This process is impaired in different cancer types and incompletely understood. To comprehensively identify functional coregulators of LSD1, we established a novel tractable fluorescent reporter system to monitor LSD1 activity in living cells. Combining this reporter system with a state-of-the-art multiplexed RNAi screen, we identify the DEAD-box helicase 19A (DDX19A) as a novel coregulator and demonstrate that suppression of Ddx19a results in an increase of R-loops and reduced LSD1-mediated gene silencing. We further show that DDX19A binds to tri-methylated lysine 27 of histone 3 (H3K27me3) and it regulates gene expression through the removal of transcription promoting R-loops. Our results uncover a novel transcriptional regulatory cascade where the downregulation of genes is dependent on the LSD1 mediated demethylation of histone H3 lysine 4 (H3K4). This allows the polycomb repressive complex 2 (PRC2) to methylate H3K27, which serves as a binding site for DDX19A. Finally, the binding of DDX19A leads to the efficient removal of R-loops at active promoters, which further de-represses LSD1 and PRC2, establishing a positive feedback loop leading to a robust repression of the target gene.
    DOI:  https://doi.org/10.1093/nar/gkab180
  8. Mol Cell. 2021 Mar 26. pii: S1097-2765(21)00181-7. [Epub ahead of print]
      Long undecoded transcript isoforms (LUTIs) represent a class of non-canonical mRNAs that downregulate gene expression through the combined act of transcriptional and translational repression. While single gene studies revealed important aspects of LUTI-based repression, how these features affect gene regulation on a global scale is unknown. Using transcript leader and direct RNA sequencing, here, we identify 74 LUTI candidates that are specifically induced in meiotic prophase. Translational repression of these candidates appears to be ubiquitous and is dependent on upstream open reading frames. However, LUTI-based transcriptional repression is variable. In only 50% of the cases, LUTI transcription causes downregulation of the protein-coding transcript isoform. Higher LUTI expression, enrichment of histone 3 lysine 36 trimethylation, and changes in nucleosome position are the strongest predictors of LUTI-based transcriptional repression. We conclude that LUTIs downregulate gene expression in a manner that integrates translational repression, chromatin state changes, and the magnitude of LUTI expression.
    Keywords:  H3K36; LUTI; chromatin; differentiation; gene expression; isoform; meiosis; transcription factor; translation; uORF
    DOI:  https://doi.org/10.1016/j.molcel.2021.03.013
  9. Mol Cell. 2021 Mar 29. pii: S1097-2765(21)00174-X. [Epub ahead of print]
      Nuclear speckles are prominent nuclear bodies that contain proteins and RNA involved in gene expression. Although links between nuclear speckles and gene activation are emerging, the mechanisms regulating association of genes with speckles are unclear. We find that speckle association of p53 target genes is driven by the p53 transcription factor. Focusing on p21, a key p53 target, we demonstrate that speckle association boosts expression by elevating nascent RNA amounts. p53-regulated speckle association did not depend on p53 transactivation functions but required an intact proline-rich domain and direct DNA binding, providing mechanisms within p53 for regulating gene-speckle association. Beyond p21, a substantial subset of p53 targets have p53-regulated speckle association. Strikingly, speckle-associating p53 targets are more robustly activated and occupy a distinct niche of p53 biology compared with non-speckle-associating p53 targets. Together, our findings illuminate regulated speckle association as a mechanism used by a transcription factor to boost gene expression.
    Keywords:  chromosome architecture; gene activation; nuclear positioning; nuclear speckles; p21; p53; phase-separated nuclear bodies; transcription; transcription factor
    DOI:  https://doi.org/10.1016/j.molcel.2021.03.006
  10. Elife. 2021 Apr 09. pii: e63632. [Epub ahead of print]10
      Single-cell measurements of cellular characteristics have been instrumental in understanding the heterogeneous pathways that drive differentiation, cellular responses to signals, and human disease. Recent advances have allowed paired capture of protein abundance and transcriptomic state, but a lack of epigenetic information in these assays has left a missing link to gene regulation. Using the heterogeneous mixture of cells in human peripheral blood as a test case, we developed a novel scATAC-seq workflow that increases signal-to-noise and allows paired measurement of cell surface markers and chromatin accessibility: integrated cellular indexing of chromatin landscape and epitopes, called ICICLE-seq. We extended this approach using a droplet-based multiomics platform to develop a trimodal assay that simultaneously measures transcriptomics (scRNA-seq), epitopes, and chromatin accessibility (scATAC-seq) from thousands of single cells, which we term TEA-seq. Together, these multimodal single-cell assays provide a novel toolkit to identify type-specific gene regulation and expression grounded in phenotypically defined cell types.
    Keywords:  chromatin; epitopes; genetics; genomics; human; immunology; inflammation; multiomics; sequencing; transcription
    DOI:  https://doi.org/10.7554/eLife.63632
  11. Nat Cell Biol. 2021 Apr;23(4): 391-400
      Mobile transposable elements (TEs) not only participate in genome evolution but also threaten genome integrity. In healthy cells, TEs that encode all of the components that are necessary for their mobility are specifically silenced, yet the precise mechanism remains unknown. Here, we characterize the mechanism used by a conserved class of chromatin remodelers that prevent TE mobility. In the Arabidopsis chromatin remodeler DECREASE IN DNA METHYLATION 1 (DDM1), we identify two conserved binding domains for the histone variant H2A.W, which marks plant heterochromatin. DDM1 is necessary and sufficient for the deposition of H2A.W onto potentially mobile TEs, yet does not act on TE fragments or host protein-coding genes. DDM1-mediated H2A.W deposition changes the properties of chromatin, resulting in the silencing of TEs and, therefore, prevents their mobility. This distinct mechanism provides insights into the interplay between TEs and their host in the contexts of evolution and disease, and potentiates innovative strategies for targeted gene silencing.
    DOI:  https://doi.org/10.1038/s41556-021-00658-1
  12. Cancer Res. 2021 Feb 15. 81(4): 935-944
      p53 is a short-lived protein with low basal levels under normal homeostasis conditions. However, upon DNA damage, levels of p53 dramatically increase for its activation. Although robust stabilization of p53 serves as a "trademark" for DNA damage responses, the requirement for such dramatic protein stabilization in tumor suppression has not been well addressed. Here we generated a mutant p53KQ mouse where all the C-terminal domain lysine residues were mutated to glutamines (K to Q mutations at K367, K369, K370, K378, K379, K383, and K384) to mimic constitutive acetylation of the p53 C-terminus. Because of p53 activation, p53KQ/KQ mice were perinatal lethal, yet this lethality was averted in p53KQ/- mice, which displayed normal postnatal development. Nevertheless, p53KQ/- mice died prematurely due to anemia and hematopoiesis failure. Further analyses indicated that expression of the acetylation-mimicking p53 mutant in vivo induces activation of p53 targets in various tissues without obviously increasing p53 levels. In the well-established pancreatic ductal adenocarcinoma (PDAC) mouse model, expression of the acetylation-mimicking p53-mutant protein effectively suppressed K-Ras-induced PDAC development in the absence of robust p53 stabilization. Together, our results provide proof-of-principle evidence that p53-mediated transcriptional function and tumor suppression can be achieved independently of its robust stabilization and reveal an alternative approach to activate p53 function for therapeutic purposes. SIGNIFICANCE: Although robust p53 stabilization is critical for acute p53 responses such as DNA damage, this study underscores the important role of low basal p53 protein levels in p53 activation and tumor suppression.
    DOI:  https://doi.org/10.1158/0008-5472.CAN-20-1804
  13. Nat Commun. 2021 Apr 07. 12(1): 2098
      The transition from naive to primed pluripotency is accompanied by an extensive reorganisation of transcriptional and epigenetic programmes. However, the role of transcriptional enhancers and three-dimensional chromatin organisation in coordinating these developmental programmes remains incompletely understood. Here, we generate a high-resolution atlas of gene regulatory interactions, chromatin profiles and transcription factor occupancy in naive and primed human pluripotent stem cells, and develop a network-graph approach to examine the atlas at multiple spatial scales. We uncover highly connected promoter hubs that change substantially in interaction frequency and in transcriptional co-regulation between pluripotent states. Small hubs frequently merge to form larger networks in primed cells, often linked by newly-formed Polycomb-associated interactions. We identify widespread state-specific differences in enhancer activity and interactivity that correspond with an extensive reconfiguration of OCT4, SOX2 and NANOG binding and target gene expression. These findings provide multilayered insights into the chromatin-based gene regulatory control of human pluripotent states.
    DOI:  https://doi.org/10.1038/s41467-021-22201-4
  14. PLoS Genet. 2021 Apr 07. 17(4): e1009238
      ARID1A is a core DNA-binding subunit of the BAF chromatin remodeling complex, and is lost in up to 7% of all cancers. The frequency of ARID1A loss increases in certain cancer types, such as clear cell ovarian carcinoma where ARID1A protein is lost in about 50% of cases. While the impact of ARID1A loss on the function of the BAF chromatin remodeling complexes is likely to drive oncogenic gene expression programs in specific contexts, ARID1A also binds genome stability regulators such as ATR and TOP2. Here we show that ARID1A loss leads to DNA replication stress associated with R-loops and transcription-replication conflicts in human cells. These effects correlate with altered transcription and replication dynamics in ARID1A knockout cells and with reduced TOP2A binding at R-loop sites. Together, this work extends mechanisms of replication stress in ARID1A-deficient cells, with implications for targeting ARID1A-deficient cancers.
    DOI:  https://doi.org/10.1371/journal.pgen.1009238
  15. Nucleic Acids Res. 2021 Apr 09. pii: gkab249. [Epub ahead of print]
      Liquid-liquid phase separation (LLPS) contributes to the spatial and functional segregation of molecular processes within the cell nucleus. However, the role played by LLPS in chromatin folding in living cells remains unclear. Here, using stochastic optical reconstruction microscopy (STORM) and Hi-C techniques, we studied the effects of 1,6-hexanediol (1,6-HD)-mediated LLPS disruption/modulation on higher-order chromatin organization in living cells. We found that 1,6-HD treatment caused the enlargement of nucleosome clutches and their more uniform distribution in the nuclear space. At a megabase-scale, chromatin underwent moderate but irreversible perturbations that resulted in the partial mixing of A and B compartments. The removal of 1,6-HD from the culture medium did not allow chromatin to acquire initial configurations, and resulted in more compact repressed chromatin than in untreated cells. 1,6-HD treatment also weakened enhancer-promoter interactions and TAD insulation but did not considerably affect CTCF-dependent loops. Our results suggest that 1,6-HD-sensitive LLPS plays a limited role in chromatin spatial organization by constraining its folding patterns and facilitating compartmentalization at different levels.
    DOI:  https://doi.org/10.1093/nar/gkab249
  16. Nucleic Acids Res. 2021 Apr 06. pii: gkab210. [Epub ahead of print]
      Trimethylation of histone H3 lysine 27 (H3K27me3) is important for gene silencing and imprinting, (epi)genome organization and organismal development. In a prevalent model, the functional readout of H3K27me3 in mammalian cells is achieved through the H3K27me3-recognizing chromodomain harbored within the chromobox (CBX) component of canonical Polycomb repressive complex 1 (cPRC1), which induces chromatin compaction and gene repression. Here, we report that binding of H3K27me3 by a Bromo Adjacent Homology (BAH) domain harbored within BAH domain-containing protein 1 (BAHD1) is required for overall BAHD1 targeting to chromatin and for optimal repression of the H3K27me3-demarcated genes in mammalian cells. Disruption of direct interaction between the BAHD1 BAH domain and H3K27me3 by point mutagenesis leads to chromatin remodeling, notably, increased histone acetylation, at its Polycomb gene targets. In mice, an H3K27me3-interaction-defective mutation of the Bahd1 BAH domain causes marked embryonic lethality, showing a requirement of this pathway for normal development. Altogether, this work demonstrates an H3K27me3-initiated signaling cascade that operates through a conserved BAH 'reader' module within BAHD1 in mammals.
    DOI:  https://doi.org/10.1093/nar/gkab210
  17. Epigenomes. 2020 Dec. pii: 24. [Epub ahead of print]4(4)
      Differential DNA methylation is characteristic of gene regulatory regions, such as enhancers, which mostly constitute low or intermediate CpG content in their DNA sequence. Consequently, quantification of changes in DNA methylation at these sites is challenging. Given that DNA methylation across most of the mammalian genome is maintained, the use of genome-wide bisulfite sequencing to measure fractional changes in DNA methylation at specific sites is an overexertion which is both expensive and cumbersome. Here, we developed a MethylRAD technique with an improved experimental plan and bioinformatic analysis tool to examine regional DNA methylation changes in embryonic stem cells (ESCs) during differentiation. The transcriptional silencing of pluripotency genes (PpGs) during ESC differentiation is accompanied by PpG enhancer (PpGe) silencing mediated by the demethylation of H3K4me1 by LSD1. Our MethylRAD data show that in the presence of LSD1 inhibitor, a significant fraction of LSD1-bound PpGe fails to gain DNA methylation. We further show that this effect is mostly observed in PpGes with low/intermediate CpG content. Underscoring the sensitivity and accuracy of MethylRAD sequencing, our study demonstrates that this method can detect small changes in DNA methylation in regulatory regions, including those with low/intermediate CpG content, thus asserting its use as a method of choice for diagnostic purposes.
    Keywords:  DNA methylation; MethylRAD; embryonic stem cells; enhancer; histone demethylase LSD1; pluripotency genes
    DOI:  https://doi.org/10.3390/epigenomes4040024
  18. Cancer Cell. 2021 Mar 30. pii: S1535-6108(21)00164-1. [Epub ahead of print]
      Extrachromosomal, circular DNA (ecDNA) is emerging as a prevalent yet less characterized oncogenic alteration in cancer genomes. We leverage ChIA-PET and ChIA-Drop chromatin interaction assays to characterize genome-wide ecDNA-mediated chromatin contacts that impact transcriptional programs in cancers. ecDNAs in glioblastoma patient-derived neurosphere and prostate cancer cell cultures are marked by widespread intra-ecDNA and genome-wide chromosomal interactions. ecDNA-chromatin contact foci are characterized by broad and high-level H3K27ac signals converging predominantly on chromosomal genes of increased expression levels. Prostate cancer cells harboring synthetic ecDNA circles composed of characterized enhancers result in the genome-wide activation of chromosomal gene transcription. Deciphering the chromosomal targets of ecDNAs at single-molecule resolution reveals an association with actively expressed oncogenes spatially clustered within ecDNA-directed interaction networks. Our results suggest that ecDNA can function as mobile transcriptional enhancers to promote tumor progression and manifest a potential synthetic aneuploidy mechanism of transcription control in cancer.
    Keywords:  ChIA-Drop; ChIA-PET; chromatin interactions; ecDNA; mobile enhancers
    DOI:  https://doi.org/10.1016/j.ccell.2021.03.006
  19. Dev Cell. 2021 Apr 05. pii: S1534-5807(21)00211-2. [Epub ahead of print]56(7): 1043-1055.e4
      Dynamic cell identities underlie flexible developmental programs. The stomatal lineage in the Arabidopsis leaf epidermis features asynchronous and indeterminate divisions that can be modulated by environmental cues. The products of the lineage, stomatal guard cells and pavement cells, regulate plant-atmosphere exchanges, and the epidermis as a whole influences overall leaf growth. How flexibility is encoded in development of the stomatal lineage and how cell fates are coordinated in the leaf are open questions. Here, by leveraging single-cell transcriptomics and molecular genetics, we uncovered models of cell differentiation within Arabidopsis leaf tissue. Profiles across leaf tissues identified points of regulatory congruence. In the stomatal lineage, single-cell resolution resolved underlying cell heterogeneity within early stages and provided a fine-grained profile of guard cell differentiation. Through integration of genome-scale datasets and spatiotemporally precise functional manipulations, we also identified an extended role for the transcriptional regulator SPEECHLESS in reinforcing cell fate commitment.
    Keywords:  Arabidopsis; SPEECHLESS; WEE1; developmental flexibility; leaf; scRNA-seq; stomata
    DOI:  https://doi.org/10.1016/j.devcel.2021.03.014
  20. Genome Res. 2021 Apr 09. pii: gr.268722.120. [Epub ahead of print]
      When assessed over a large number of samples, bulk RNA sequencing provides reliable data for gene expression at the tissue level. Single-cell RNA sequencing (scRNA-seq) deepens those analyses by evaluating gene expression at the cellular level. Both data types lend insights into disease etiology. With current technologies, however, scRNA-seq data are known to be noisy. Moreover, constrained by costs, scRNA-seq data are typically generated from a relatively small number of subjects, which limits their utility for some analyses, such as identification of gene expression quantitative trait loci (eQTLs). To address these issues, while maintaining the unique advantages of each data type, we develop a Bayesian method (bMIND) to integrate bulk and scRNA-seq data. With a prior derived from scRNA-seq data, we propose to estimate sample-level cell type-specific (CTS) expression from bulk expression data. The CTS expression enables large-scale sample-level downstream analyses, such as detection of CTS differentially expressed genes (DEGs) and eQTLs. Through simulations, we demonstrate that bMIND improves the accuracy of sample-level CTS expression estimates and power to discover CTS-DEGs when compared to existing methods. To further our understanding of two complex phenotypes, autism spectrum disorder and Alzheimer's disease, we apply bMIND to gene expression data of relevant brain tissue to identify CTS-DEGs. Our results complement findings for CTS-DEGs obtained from snRNA-seq studies, replicating certain DEGs in specific cell types while nominating other novel genes for those cell types. Finally, we calculate CTS-eQTLs for eleven brain regions by analyzing Genotype-Tissue Expression Project data, creating a new resource for biological insights.
    DOI:  https://doi.org/10.1101/gr.268722.120
  21. Cell Death Dis. 2021 Apr 06. 12(4): 352
      Transcription factor AP-2α (TFAP2A) was previously regarded as a critical regulator during embryonic development, and its involvement in carcinogenesis has recently received intensive attention. However, its role in lung adenocarcinoma (LUAD) has not been fully elucidated. Here, we investigated the expression profile, clinical significance, biological function and molecular underpinnings of TFAP2A in LUAD. We showed that TFAP2A is broadly highly expressed in LUAD, which indicated poorer prognosis across multiple independent datasets. We then found that TFAP2A was not indispensable for LUAD proliferation, and exogenous overexpression even caused repression. However, we found that TFAP2A could potently promote LUAD metastasis, possibly by triggering epithelial-mesenchymal transition (EMT), in vitro and in vivo. Furthermore, we demonstrated that TFAP2A could transactivate Pregnancy-specific glycoprotein 9 (PSG9) to enhance transforming growth factor β (TGF-β)-triggered EMT in LUAD. Meanwhile, we discovered that suppressed post-transcriptional silencing of TFAP2A by the miR-16 family partly contributed to TFAP2A upregulation in LUAD. In clinical specimens, we also validated the cancer-regulating effect of the miR-16 family/TFAP2A/PSG9 axis, especially for lymph node metastasis of LUAD. In conclusion, we demonstrated that TFAP2A could pivotally facilitate LUAD progression, possibly through a novel pro-metastasis signaling pathway (miR-16 family/TFAP2A/PSG9/TGF-β).
    DOI:  https://doi.org/10.1038/s41419-021-03606-x
  22. Nat Commun. 2021 Apr 06. 12(1): 2053
      Root development relies on the establishment of meristematic tissues that give rise to distinct cell types that differentiate across defined temporal and spatial gradients. Dissection of the developmental trajectories and the transcriptional networks that underlie them could aid understanding of the function of the root apical meristem in both dicots and monocots. Here, we present a single-cell RNA (scRNA) sequencing and chromatin accessibility survey of rice radicles. By temporal profiling of individual root tip cells we reconstruct continuous developmental trajectories of epidermal cells and ground tissues, and elucidate regulatory networks underlying cell fate determination in these cell lineages. We further identify characteristic processes, transcriptome profiles, and marker genes for these cell types and reveal conserved and divergent root developmental pathways between dicots and monocots. Finally, we demonstrate the potential of the platform for functional genetic studies by using spatiotemporal modeling to identify a rice root meristematic mutant from a cell-specific gene cohort.
    DOI:  https://doi.org/10.1038/s41467-021-22352-4
  23. Nat Genet. 2021 Apr;53(4): 539-550
      Parental epigenomes are established during gametogenesis. While they are largely reset after fertilization, broad domains of Polycomb repressive complex 2 (PRC2)-mediated formation of lysine 27-trimethylated histone H3 (H3K27me3) are inherited from oocytes in mice. How maternal H3K27me3 is established and inherited by embryos remains elusive. Here, we show that PRC1-mediated formation of lysine 119-monoubiquitinated histone H2A (H2AK119ub1) confers maternally heritable H3K27me3. Temporal profiling of H2AK119ub1 dynamics revealed that atypically broad H2AK119ub1 domains are established, along with H3K27me3, during oocyte growth. From the two-cell stage, H2AK119ub1 is progressively deposited at typical Polycomb targets and precedes H3K27me3. Reduction of H2AK119ub1 by depletion of Polycomb group ring finger 1 (PCGF1) and PCGF6, essential components of variant PRC1 (vPRC1), leads to H3K27me3 loss at a subset of genes in oocytes. The gene-selective H3K27me3 deficiency is irreversibly inherited by embryos, causing loss of maternal H3K27me3-dependent imprinting, embryonic sublethality and placental enlargement at term. Collectively, our study unveils preceding dynamics of H2AK119ub1 over H3K27me3 at the maternal-to-zygotic transition, and identifies PCGF1/6-vPRC1 as an essential player in maternal epigenetic inheritance.
    DOI:  https://doi.org/10.1038/s41588-021-00820-3
  24. Plant Cell. 2021 Apr 07. pii: koab103. [Epub ahead of print]
      Leguminous plants produce nodules for nitrogen fixation; however, nodule production incurs an energy cost. Therefore, as an adaptive strategy, leguminous plants halt root nodule development when sufficient amounts of nitrogen nutrients, such as nitrate, are present in the environment. Although legume NODULE INCEPTION (NIN)-LIKE PROTEIN (NLP) transcription factors have recently been identified, understanding how nodulation is controlled by nitrate, a fundamental question for nitrate-mediated transcriptional regulation of symbiotic genes, remains elusive. Here, we show that two Lotus japonicus NLPs, NITRATE UNRESPONSIVE SYMBIOSIS 1 (NRSYM1)/LjNLP4 and NRSYM2/LjNLP1, have overlapping functions in the nitrate-induced control of nodulation and act as master regulators for nitrate-dependent gene expression. We further identify candidate target genes of LjNLP4 by combining transcriptome analysis with a DNA affinity purification (DAP)-seq approach. We then demonstrate that LjNLP4 and LjNIN, a key nodulation-specific regulator and paralogue of LjNLP4, have different DNA-binding specificities. Moreover, LjNLP4-LjNIN dimerization underlies LjNLP4-mediated bifunctional transcriptional regulation. These data provide a basic principle for how nitrate controls nodulation through positive and negative regulation of symbiotic genes.
    DOI:  https://doi.org/10.1093/plcell/koab103
  25. Science. 2021 Apr 09. pii: eabd0875. [Epub ahead of print]372(6538):
      DNA methylation is essential to mammalian development, and dysregulation can cause serious pathological conditions. Key enzymes responsible for deposition and removal of DNA methylation are known, but how they cooperate to regulate the methylation landscape remains a central question. Using a knockin DNA methylation reporter, we performed a genome-wide CRISPR-Cas9 screen in human embryonic stem cells to discover DNA methylation regulators. The top screen hit was an uncharacterized gene, QSER1, which proved to be a key guardian of bivalent promoters and poised enhancers of developmental genes, especially those residing in DNA methylation valleys (or canyons). We further demonstrate genetic and biochemical interactions of QSER1 and TET1, supporting their cooperation to safeguard transcriptional and developmental programs from DNMT3-mediated de novo methylation.
    DOI:  https://doi.org/10.1126/science.abd0875
  26. Nucleic Acids Res. 2021 Apr 09. pii: gkab226. [Epub ahead of print]
      Skeletal muscle is a dynamic tissue the size of which can be remodeled through the concerted actions of various cues. Here, we investigated the skeletal muscle transcriptional program and identified key tissue-specific regulatory genetic elements. Our results show that Myod1 is bound to numerous skeletal muscle enhancers in collaboration with the glucocorticoid receptor (GR) to control gene expression. Remarkably, transcriptional activation controlled by these factors occurs through direct contacts with the promoter region of target genes, via the CpG-bound transcription factor Nrf1, and the formation of Ctcf-anchored chromatin loops, in a myofiber-specific manner. Moreover, we demonstrate that GR negatively controls muscle mass and strength in mice by down-regulating anabolic pathways. Taken together, our data establish Myod1, GR and Nrf1 as key players of muscle-specific enhancer-promoter communication that orchestrate myofiber size regulation.
    DOI:  https://doi.org/10.1093/nar/gkab226
  27. Proc Natl Acad Sci U S A. 2021 Apr 13. pii: e2023070118. [Epub ahead of print]118(15):
      Simultaneous profiling of multiomic modalities within a single cell is a grand challenge for single-cell biology. While there have been impressive technical innovations demonstrating feasibility, such as generating paired measurements of single-cell transcriptome (single-cell RNA sequencing [scRNA-seq]) and chromatin accessibility (single-cell assay for transposase-accessible chromatin using sequencing [scATAC-seq]), widespread application of joint profiling is challenging due to its experimental complexity, noise, and cost. Here, we introduce BABEL, a deep learning method that translates between the transcriptome and chromatin profiles of a single cell. Leveraging an interoperable neural network model, BABEL can predict single-cell expression directly from a cell's scATAC-seq and vice versa after training on relevant data. This makes it possible to computationally synthesize paired multiomic measurements when only one modality is experimentally available. Across several paired single-cell ATAC and gene expression datasets in human and mouse, we validate that BABEL accurately translates between these modalities for individual cells. BABEL also generalizes well to cell types within new biological contexts not seen during training. Starting from scATAC-seq of patient-derived basal cell carcinoma (BCC), BABEL generated single-cell expression that enabled fine-grained classification of complex cell states, despite having never seen BCC data. These predictions are comparable to analyses of experimental BCC scRNA-seq data for diverse cell types related to BABEL's training data. We further show that BABEL can incorporate additional single-cell data modalities, such as protein epitope profiling, thus enabling translation across chromatin, RNA, and protein. BABEL offers a powerful approach for data exploration and hypothesis generation.
    Keywords:  deep learning; gene regulation; multiomics; single-cell analysis
    DOI:  https://doi.org/10.1073/pnas.2023070118