bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2020–07–19
forty-four papers selected by
Connor Rogerson, University of Cambridge, MRC Cancer Unit



  1. Nat Commun. 2020 Jul 13. 11(1): 3491
      Sperm contributes genetic and epigenetic information to the embryo to efficiently support development. However, the mechanism underlying such developmental competence remains elusive. Here, we investigated whether all sperm cells have a common epigenetic configuration that primes the transcriptional program for embryonic development. Using calibrated ChIP-seq, we show that remodelling of histones during spermiogenesis results in the retention of methylated histone H3 at the same genomic location in most sperm cells. This homogeneously methylated fraction of histone H3 in the sperm genome is maintained during early embryonic replication. Such methylated histone fraction resisting post-fertilisation reprogramming marks developmental genes whose expression is perturbed upon experimental reduction of histone methylation. A similar homogeneously methylated histone H3 fraction is detected in human sperm. Altogether, we uncover a conserved mechanism of paternal epigenetic information transmission to the embryo through the homogeneous retention of methylated histone in a sperm cell population.
    DOI:  https://doi.org/10.1038/s41467-020-17238-w
  2. Bioinform Biol Insights. 2020 ;14 1177932220938063
      The differentiation of embryonic stem cells into various lineages is highly dependent on the chromatin state of the genome and patterns of gene expression. To identify lineage-specific enhancers driving the differentiation of progenitors into pancreatic cells, we used a previously described computational framework called Total Functional Score of Enhancer Elements (TFSEE), which integrates multiple genomic assays that probe both transcriptional and epigenomic states. First, we evaluated and compared TFSEE as an enhancer-calling algorithm with enhancers called using GRO-seq-defined enhancer transcripts (method 1) versus enhancers called using histone modification ChIP-seq data (method 2). Second, we used TFSEE to define the enhancer landscape and identify transcription factors (TFs) that maintain the multipotency of a subpopulation of endodermal stem cells during differentiation into pancreatic lineages. Collectively, our results demonstrate that TFSEE is a robust enhancer-calling algorithm that can be used to perform multilayer genomic data integration to uncover cell type-specific TFs that control lineage-specific enhancers.
    Keywords:  Enhancer; epigenome; gene regulation; pancreas; tissue-specific transcription; transcription factor
    DOI:  https://doi.org/10.1177/1177932220938063
  3. Epigenetics Chromatin. 2020 Jul 17. 13(1): 30
      Several thousand sex-differential distal enhancers have been identified in mouse liver; however, their links to sex-biased genes and the impact of any sex-differences in nuclear organization and chromatin interactions are unknown. To address these issues, we first characterized 1847 mouse liver genomic regions showing significant sex differential occupancy by cohesin and CTCF, two key 3D nuclear organizing factors. These sex-differential binding sites were primarily distal to sex-biased genes but rarely generated sex-differential TAD (topologically associating domain) or intra-TAD loop anchors, and were sometimes found in TADs without sex-biased genes. A substantial subset of sex-biased cohesin-non-CTCF binding sites, but not sex-biased cohesin-and-CTCF binding sites, overlapped sex-biased enhancers. Cohesin depletion reduced the expression of male-biased genes with distal, but not proximal, sex-biased enhancers by >10-fold, implicating cohesin in long-range enhancer interactions regulating sex-biased genes. Using circularized chromosome conformation capture-based sequencing (4C-seq), we showed that sex differences in distal sex-biased enhancer-promoter interactions are common. Intra-TAD loops with sex-independent cohesin-and-CTCF anchors conferred sex specificity to chromatin interactions indirectly, by insulating sex-biased enhancer-promoter contacts and by bringing sex-biased genes into closer proximity to sex-biased enhancers. Furthermore, sex-differential chromatin interactions involving sex-biased gene promoters, enhancers, and lncRNAs were associated with sex-biased binding of cohesin and/or CTCF. These studies elucidate how 3D genome organization impacts sex-biased gene expression in a non-reproductive tissue through both direct and indirect effects of cohesin and CTCF looping on distal enhancer interactions with sex-differentially expressed genes.
    Keywords:  DNase hypersensitive sites; Liver sex differences; Nipbl
    DOI:  https://doi.org/10.1186/s13072-020-00350-y
  4. Sci Adv. 2020 May;6(21): eaaz4815
      Self-renewal and differentiation of hematopoietic stem cells (HSCs) are orchestrated by the combinatorial action of transcription factors and epigenetic regulators. Here, we have explored the mechanism by which histone H4 lysine 16 acetyltransferase MOF regulates erythropoiesis. Single-cell RNA sequencing and chromatin immunoprecipitation sequencing uncovered that MOF influences erythroid trajectory by dynamic recruitment to chromatin and its haploinsufficiency causes accumulation of a transient HSC population. A regulatory network consisting of MOF, RUNX1, and GFI1B is critical for erythroid fate commitment. GFI1B acts as a Mof activator which is necessary and sufficient for cell type-specific induction of Mof expression. Plasticity of Mof-depleted HSCs can be rescued by expression of a downstream effector, Gata1, or by rebalancing acetylation via a histone deacetylase inhibitor. Accurate timing and dosage of Mof expression act as a rheostat for the feedforward transcription factor network that safeguards progression along the erythroid fate.
    DOI:  https://doi.org/10.1126/sciadv.aaz4815
  5. Nucleic Acids Res. 2020 Jul 16. pii: gkaa606. [Epub ahead of print]
      Schwann cells are the nerve ensheathing cells of the peripheral nervous system. Absence, loss and malfunction of Schwann cells or their myelin sheaths lead to peripheral neuropathies such as Charcot-Marie-Tooth disease in humans. During Schwann cell development and myelination chromatin is dramatically modified. However, impact and functional relevance of these modifications are poorly understood. Here, we analyzed histone H2B monoubiquitination as one such chromatin modification by conditionally deleting the Rnf40 subunit of the responsible E3 ligase in mice. Rnf40-deficient Schwann cells were arrested immediately before myelination or generated abnormally thin, unstable myelin, resulting in a peripheral neuropathy characterized by hypomyelination and progressive axonal degeneration. By combining sequencing techniques with functional studies we show that H2B monoubiquitination does not influence global gene expression patterns, but instead ensures selective high expression of myelin and lipid biosynthesis genes and proper repression of immaturity genes. This requires the specific recruitment of the Rnf40-containing E3 ligase by Egr2, the central transcriptional regulator of peripheral myelination, to its target genes. Our study identifies histone ubiquitination as essential for Schwann cell myelination and unravels new disease-relevant links between chromatin modifications and transcription factors in the underlying regulatory network.
    DOI:  https://doi.org/10.1093/nar/gkaa606
  6. Sci Rep. 2020 Jul 16. 10(1): 11832
      Transcription factor binding to genomic DNA is generally prevented by nucleosome formation, in which the DNA is tightly wrapped around the histone octamer. In contrast, pioneer transcription factors efficiently bind their target DNA sequences within the nucleosome. OCT4 has been identified as a pioneer transcription factor required for stem cell pluripotency. To study the nucleosome binding by OCT4, we prepared human OCT4 as a recombinant protein, and biochemically analyzed its interactions with the nucleosome containing a natural OCT4 target, the LIN28B distal enhancer DNA sequence, which contains three potential OCT4 target sequences. By a combination of chemical mapping and cryo-electron microscopy single-particle analysis, we mapped the positions of the three target sequences within the nucleosome. A mutational analysis revealed that OCT4 preferentially binds its target DNA sequence located near the entry/exit site of the nucleosome. Crosslinking mass spectrometry consistently showed that OCT4 binds the nucleosome in the proximity of the histone H3 N-terminal region, which is close to the entry/exit site of the nucleosome. We also found that the linker histone H1 competes with OCT4 for the nucleosome binding. These findings provide important information for understanding the molecular mechanism by which OCT4 binds its target DNA in chromatin.
    DOI:  https://doi.org/10.1038/s41598-020-68850-1
  7. Development. 2020 Jul 14. pii: dev.191262. [Epub ahead of print]
      To identify candidate tissue regeneration enhancer elements (TREEs) important for zebrafish fin regeneration, we performed ATAC-seq from bulk tissue or purified fibroblasts of uninjured and regenerating caudal fins. We identified tens of thousands of DNA regions from each sample type with dynamic accessibility during regeneration, and assigned these regions to proximal genes with corresponding expression changes by RNA-seq. To determine whether these profiles reveal bona fide TREEs, we tested the sufficiency and requirements of several sequences in stable transgenic lines and mutant lines with homozygous deletions. These experiments validated new non-coding regulatory sequences near induced and/or essential genes during fin regeneration, including fgf20a, mdka, and cx43, identifying distinct domains of directed expression for each confirmed TREE. Whereas deletion of the previously identified LEN enhancer abolished detectable induction of the nearby leptin b gene during regeneration, deletions of enhancers linked to fgf20a, mdka, and cx43 had no effect or partially reduced gene expression. Our study generates a new resource for dissecting the regulatory mechanisms of appendage generation and reveals a range of requirements for individual TREEs in control of regeneration programs.
    Keywords:  Chromatin; Enhancers; Fin regeneration; Genome-wide profiling; Regeneration; Zebrafish
    DOI:  https://doi.org/10.1242/dev.191262
  8. Proc Natl Acad Sci U S A. 2020 Jul 16. pii: 202009316. [Epub ahead of print]
      In mammals, repressive histone modifications such as trimethylation of histone H3 Lys9 (H3K9me3), frequently coexist with DNA methylation, producing a more stable and silenced chromatin state. However, it remains elusive how these epigenetic modifications crosstalk. Here, through structural and biochemical characterizations, we identified the replication foci targeting sequence (RFTS) domain of maintenance DNA methyltransferase DNMT1, a module known to bind the ubiquitylated H3 (H3Ub), as a specific reader for H3K9me3/H3Ub, with the recognition mode distinct from the typical trimethyl-lysine reader. Disruption of the interaction between RFTS and the H3K9me3Ub affects the localization of DNMT1 in stem cells and profoundly impairs the global DNA methylation and genomic stability. Together, this study reveals a previously unappreciated pathway through which H3K9me3 directly reinforces DNMT1-mediated maintenance DNA methylation.
    Keywords:  DNA methylation; DNMT1; H3K9me3; allosteric regulation
    DOI:  https://doi.org/10.1073/pnas.2009316117
  9. Genome Med. 2020 Jul 15. 12(1): 63
       BACKGROUND: Small cell lung cancer (SCLC) is a more aggressive subtype of lung cancer that often results in rapid tumor growth, early metastasis, and acquired therapeutic resistance. Consequently, such phenotypical characteristics of SCLC set limitations on viable procedural options, making it difficult to develop both screenings and effective treatments. In this study, we examine a novel mechanistic insight in SCLC cells that could potentially provide a more sensitive therapeutic alternative for SCLC patients.
    METHODS: Biochemistry studies, including size exclusion chromatography, mass spectrometry, and western blot analysis, were conducted to determine the protein-protein interaction between additional sex combs-like protein 3 (ASXL3) and bromodomain-containing protein 4 (BRD4). Genomic studies, including chromatin immunoprecipitation sequencing (ChIP-seq), RNA sequencing, and genome-wide analysis, were performed in both human and mouse SCLC cells to determine the dynamic relationship between BRD4/ASXL3/BAP1 epigenetic axis in chromatin binding and its effects on transcriptional activity.
    RESULTS: We report a critical link between BAP1 complex and BRD4, which is bridged by the physical interaction between ASXL3 and BRD4 in an SCLC subtype (SCLC-A), which expresses a high level of ASCL1. We further showed that ASXL3 functions as an adaptor protein, which directly interacts with BRD4's extra-terminal (ET) domain via a novel BRD4 binding motif (BBM), and maintains chromatin occupancy of BRD4 to active enhancers. Genetic depletion of ASXL3 results in a genome-wide reduction of histone H3K27Ac levels and BRD4-dependent gene expression in SCLC. Pharmacologically induced inhibition with BET-specific chemical degrader (dBET6) selectively inhibits cell proliferation of a subtype of SCLC that is characterized with high expression of ASXL3.
    CONCLUSIONS: Collectively, this study provides a mechanistic insight into the oncogenic function of BRD4/ASXL3/BAP1 epigenetic axis at active chromatin enhancers in SCLC-A subtype, as well as a potential new therapeutic option that could become more effective in treating SCLC patients with a biomarker of ASXL3-highly expressed SCLC cells.
    Keywords:  ASXL3; BAP1 complex; BET inhibitors; BRD4; Enhancer activity; SCLC
    DOI:  https://doi.org/10.1186/s13073-020-00760-3
  10. Nucleic Acids Res. 2020 Jul 15. pii: gkaa584. [Epub ahead of print]
      Genome-wide passive DNA demethylation in cleavage-stage mouse embryos is related to the cytoplasmic localization of the maintenance methyltransferase DNMT1. However, recent studies provided evidence of the nuclear localization of DNMT1 and its contribution to the maintenance of methylation levels of imprinted regions and other genomic loci in early embryos. Using the DNA adenine methylase identification method, we identified Dnmt1-binding regions in four- and eight-cell embryos. The unbiased distribution of Dnmt1 peaks in the genic regions (promoters and CpG islands) as well as the absence of a correlation between the Dnmt1 peaks and the expression levels of the peak-associated genes refutes the active participation of Dnmt1 in the transcriptional regulation of genes in the early developmental period. Instead, Dnmt1 was found to associate with genomic retroelements in a greatly biased fashion, particularly with the LINE1 (long interspersed nuclear elements) and ERVK (endogenous retrovirus type K) sequences. Transcriptomic analysis revealed that the transcripts of the Dnmt1-enriched retroelements were overrepresented in Dnmt1 knockdown embryos. Finally, methyl-CpG-binding domain sequencing proved that the Dnmt1-enriched retroelements, which were densely methylated in wild-type embryos, became demethylated in the Dnmt1-depleted embryos. Our results indicate that Dnmt1 is involved in the repression of retroelements through DNA methylation in early mouse development.
    DOI:  https://doi.org/10.1093/nar/gkaa584
  11. Bioinformatics. 2020 Jul 01. 36(Supplement_1): i474-i481
       MOTIVATION: Recently, many chromatin immunoprecipitation sequencing experiments have been carried out for a diverse group of transcription factors (TFs) in many different types of human cells. These experiments manifest large-scale and dynamic changes in regulatory network connectivity (i.e. network 'rewiring'), highlighting the different regulatory programs operating in disparate cellular states. However, due to the dense and noisy nature of current regulatory networks, directly comparing the gains and losses of targets of key TFs across cell states is often not informative. Thus, here, we seek an abstracted, low-dimensional representation to understand the main features of network change.
    RESULTS: We propose a method called TopicNet that applies latent Dirichlet allocation to extract functional topics for a collection of genes regulated by a given TF. We then define a rewiring score to quantify regulatory-network changes in terms of the topic changes for this TF. Using this framework, we can pinpoint particular TFs that change greatly in network connectivity between different cellular states (such as observed in oncogenesis). Also, incorporating gene expression data, we define a topic activity score that measures the degree to which a given topic is active in a particular cellular state. And we show how activity differences can indicate differential survival in various cancers.
    AVAILABILITY AND IMPLEMENTATION: The TopicNet framework and related analysis were implemented using R and all codes are available at https://github.com/gersteinlab/topicnet.
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
    DOI:  https://doi.org/10.1093/bioinformatics/btaa403
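
The rewiring score described in entry 11 compares a TF's topic weights between two cellular states. A minimal sketch of that idea follows, assuming the topic weights have already been obtained (for example from latent Dirichlet allocation) and using the Jensen-Shannon divergence as the distance. The abstract does not say which distance TopicNet uses, so the choice of divergence and every function name here are assumptions made for illustration only.

```python
import math


def normalize(weights):
    """Scale non-negative topic weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("topic weights must have a positive sum")
    return [w / total for w in weights]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def rewiring_score(topics_state_a, topics_state_b):
    """Jensen-Shannon divergence between two topic-weight vectors.

    Hypothetical stand-in for a rewiring score: 0 means the TF has the
    same topic profile in both states, 1 means the profiles do not overlap.
    """
    p = normalize(topics_state_a)
    q = normalize(topics_state_b)
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)
```

With base-2 logarithms the score is bounded between 0 and 1, which makes TFs directly comparable; a TF whose topic profile is unchanged between states scores 0.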
  12. iScience. 2020 Jun 29. pii: S2589-0042(20)30507-1. [Epub ahead of print]23(7): 101320
      Impairments in the differentiation process can lead to skin diseases that can afflict ∼20% of the population. Thus, it is of utmost importance to understand the factors that promote the differentiation process. Here we identify the transcription factor KLF3 as a regulator of epidermal differentiation. Knockdown of KLF3 results in reduced differentiation gene expression and increased cell cycle gene expression. Over half of KLF3's genomic binding sites occur at active enhancers. KLF3 binds to active enhancers proximal to differentiation genes that are dependent upon KLF3 for expression. KLF3's genomic binding sites also highly overlap with those of CBP, a histone acetyltransferase necessary for activating enhancers. Depletion of KLF3 causes reduced CBP localization at enhancers proximal to differentiation gene clusters, which leads to loss of enhancer activation but not priming. Our results suggest that KLF3 is necessary to recruit CBP to activate enhancers and drive epidermal differentiation gene expression.
    Keywords:  Molecular Genetics; Molecular Mechanism of Gene Regulation; Transcriptomics
    DOI:  https://doi.org/10.1016/j.isci.2020.101320
  13. Genome Res. 2020 Jul 06. pii: gr.258228.119. [Epub ahead of print]
      Transcription is tightly regulated by cis-regulatory DNA elements where transcription factors can bind. Thus, identification of transcription factor binding sites (TFBSs) is key to understanding gene expression and whole regulatory networks within a cell. The standard approaches used for TFBS prediction, such as position weight matrices (PWMs) and chromatin immunoprecipitation followed by sequencing (ChIP-seq), are widely used, but have their drawbacks including high false positive rates and limited antibody availability, respectively. Several computational footprinting algorithms have been developed to detect TFBSs by investigating chromatin accessibility patterns, however these also have limitations. We have developed a footprinting method to predict Transcription factor footpRints in Active Chromatin Elements (TRACE) to improve the prediction of TFBS footprints. TRACE incorporates DNase-seq data and PWMs within a multivariate Hidden Markov Model (HMM) to detect footprint-like regions with matching motifs. TRACE is an unsupervised method that accurately annotates binding sites for specific TFs automatically with no requirement for pre-generated candidate binding sites or ChIP-seq training data. Compared to published footprinting algorithms, TRACE has the best overall performance with the distinct advantage of targeting multiple motifs in a single model.
    DOI:  https://doi.org/10.1101/gr.258228.119
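
Entry 13 names position weight matrices (PWMs) as a standard input for TFBS prediction. The sketch below shows only the generic PWM step: converting a count matrix to log-odds scores and scanning a sequence for the best-matching window. It is not the TRACE model, which combines PWMs with DNase-seq data in a hidden Markov model. The count matrix, the uniform background, and all function names are hypothetical, and the sequence is assumed to contain only uppercase A, C, G, and T.

```python
import math

# Hypothetical 4-position count matrix (one dict per motif position).
COUNTS = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
# Assumed uniform background base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def counts_to_pwm(counts, background, pseudocount=1.0):
    """Convert per-position base counts to log2-odds scores."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(column)
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background[base])
            for base in column
        })
    return pwm


def scan(sequence, pwm):
    """Return (start, score) for every window of motif width in the sequence."""
    width = len(pwm)
    return [
        (start, sum(pwm[offset][sequence[start + offset]] for offset in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


def best_hit(sequence, pwm):
    """Return the (start, score) pair with the highest log-odds score."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])
```

Scoring every window independently like this is what produces the high false positive rate the abstract mentions: a good motif match says nothing about whether the site is accessible or occupied.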
  14. Mol Hum Reprod. 2020 Jul 14. pii: gaaa048. [Epub ahead of print]
      Early embryonic development is characterized by drastic changes in chromatin structure that affect the accessibility of the chromatin. In humans, chromosome reorganization and its involvement in the first lineage segregation are poorly characterized owing to the difficulty of obtaining human embryonic material and the limitations of low-input technologies. In this study, we aimed to explore the chromatin remodeling pattern in human preimplantation embryos and gain insight into the epigenetic regulation of inner cell mass (ICM) and trophectoderm (TE) differentiation. We optimized ATAC-seq (an assay for transposase-accessible chromatin using sequencing) to analyze the chromatin accessibility landscape for low DNA input. Sixteen preimplantation human blastocysts frozen on day 6 were used. Our data showed that ATAC peak distributions of the promoter regions (<1 kb) and distal regions versus other regions were significantly different between ICM and TE samples (P < 0.01). We detected that a higher percentage of accessible binding loci were located within 1 kb of the transcription start site in ICM compared to TE (P < 0.01). However, a higher percentage of accessible regions was detected in the distal region of TE compared to ICM (P < 0.01). In addition, eight differential peaks with a false discovery rate <0.05 between ICM and TE were detected. This is the first study to compare the landscape of the accessible chromatin between ICM and TE of human preimplantation embryos, which unveiled chromatin-level epigenetic regulation of cell lineage specification in early embryo development.
    Keywords:  atac seq; cell lineage; chromatin accessibility; human; inner cell mass; pre-implantation embryo; preimplantation embryo; trophectoderm
    DOI:  https://doi.org/10.1093/molehr/gaaa048
  15. Nucleic Acids Res. 2020 Jul 07. pii: gkaa544. [Epub ahead of print]
      Nuclear proteins bind chromatin to execute and regulate genome-templated processes. While studies of individual nucleosome interactions have suggested that an acidic patch on the nucleosome disk may be a common site for recruitment to chromatin, the pervasiveness of acidic patch binding and whether other nucleosome binding hot-spots exist remain unclear. Here, we use nucleosome affinity proteomics with a library of nucleosomes that disrupts all exposed histone surfaces to comprehensively assess how proteins recognize nucleosomes. We find that the acidic patch and two adjacent surfaces are the primary hot-spots for nucleosome disk interactions, whereas nearly half of the nucleosome disk participates only minimally in protein binding. Our screen defines nucleosome surface requirements of nearly 300 nucleosome interacting proteins implicated in diverse nuclear processes including transcription, DNA damage repair, cell cycle regulation and nuclear architecture. Building from our screen, we demonstrate that the Anaphase-Promoting Complex/Cyclosome directly engages the acidic patch, and we elucidate a redundant mechanism of acidic patch binding by nuclear pore protein ELYS. Overall, our interactome screen illuminates a highly competitive nucleosome binding hub and establishes universal principles of nucleosome recognition.
    DOI:  https://doi.org/10.1093/nar/gkaa544
  16. Proc Natl Acad Sci U S A. 2020 Jul 16. pii: 201919379. [Epub ahead of print]
      Genome-wide association studies have identified noncoding variants near TBX3 that are associated with PR interval and QRS duration, suggesting that subtle changes in TBX3 expression affect atrioventricular conduction system function. To explore whether and to what extent the atrioventricular conduction system is affected by Tbx3 dose reduction, we first characterized electrophysiological properties and morphology of heterozygous Tbx3 mutant (Tbx3 +/-) mouse hearts. We found PR interval shortening and prolonged QRS duration, as well as atrioventricular bundle hypoplasia after birth in heterozygous mice. The atrioventricular node size was unaffected. Transcriptomic analysis of atrioventricular nodes isolated by laser capture microdissection revealed hundreds of deregulated genes in Tbx3 +/- mutants. Notably, Tbx3 +/- atrioventricular nodes showed increased expression of working myocardial gene programs (mitochondrial and metabolic processes, muscle contractility) and reduced expression of pacemaker gene programs (neuronal, Wnt signaling, calcium/ion channel activity). By integrating chromatin accessibility profiles (ATAC sequencing) of atrioventricular tissue and other epigenetic data, we identified Tbx3-dependent atrioventricular regulatory DNA elements (REs) on a genome-wide scale. We used transgenic reporter assays to determine the functionality of candidate REs near Ryr2, an up-regulated chamber-enriched gene, and in Cacna1g, a down-regulated conduction system-specific gene. Using genome editing to delete candidate REs, we showed that a strong intronic bipartite RE selectively governs Cacna1g expression in the conduction system in vivo. Our data provide insights into the multifactorial Tbx3-dependent transcriptional network that regulates the structure and function of the cardiac conduction system, which may underlie the differences in PR interval and QRS duration between individuals carrying variants in the TBX3 locus.
    Keywords:  Cacna1g; Ryr2; Tbx3; atrioventricular conduction system; electrical patterning
    DOI:  https://doi.org/10.1073/pnas.1919379117
  17. Development. 2020 Jul 16. pii: dev.190637. [Epub ahead of print]
      Post-translational histone modifications regulate chromatin compaction and gene expression to control many aspects of development. Mutations in genes encoding regulators of H3K4 methylation are causally associated with neurodevelopmental disorders characterized by intellectual disability and deficits in motor functions. However, it remains unclear how H3K4 methylation influences nervous system development and contributes to the aetiology of disease. Here, we show that the catalytic activity of set-2, the C. elegans homolog of the H3K4 methyltransferase KMT2F/G (SETD1A/B) genes, controls embryonic transcription of neuronal genes and is required for establishing proper axon guidance and for neuronal functions related to locomotion and learning. Moreover, we uncover a striking correlation between components of the H3K4 regulatory machinery mutated in neurodevelopmental disorders and the process of axon guidance in C. elegans. Thus, our study supports an epigenetic-based model for the aetiology of neurodevelopmental disorders, based on an aberrant axon guidance process originating from deregulated H3K4 methylation.
    Keywords:  Axon guidance; C. elegans; Epigenetics; H3K4 methylation; Neurodevelopmental disease; Neuronal development
    DOI:  https://doi.org/10.1242/dev.190637
  18. Nat Mach Intell. 2020 Jul;2(7): 376-386
      Identifying the molecular mechanisms that control differential gene expression (DE) is a major goal of basic and disease biology. We develop a systems biology model to predict DE, and mine the biological basis of the factors that influence predicted gene expression, in order to understand how it may be generated. This model, called DEcode, utilizes deep learning to predict DE based on genome-wide binding sites on RNAs and promoters. Ranking predictive factors from the DEcode indicates that clinically relevant expression changes between thousands of individuals can be predicted mainly through the joint action of post-transcriptional RNA-binding factors. We also show the broad potential applications of DEcode to generate biological insights, by predicting DE between tissues, differential transcript-usage, and drivers of aging throughout the human lifespan, of gene coexpression relationships on a genome-wide scale, and of frequently DE genes across diverse conditions. Researchers can freely utilize DEcode to identify influential molecular mechanisms for any human expression data - www.differentialexpression.org.
    DOI:  https://doi.org/10.1038/s42256-020-0201-6
  19. Nucleic Acids Res. 2020 Jul 14. pii: gkaa571. [Epub ahead of print]
      The chromatin remodelers SWI/SNF and RSC function in evicting promoter nucleosomes at highly expressed yeast genes, particularly those activated by transcription factor Gcn4. Ino80 remodeling complex (Ino80C) can establish nucleosome-depleted regions (NDRs) in reconstituted chromatin, and was implicated in removing histone variant H2A.Z from the -1 and +1 nucleosomes flanking NDRs; however, Ino80C's function in transcriptional activation in vivo is not well understood. Analyzing the cohort of Gcn4-induced genes in ino80Δ mutants has uncovered a role for Ino80C on par with SWI/SNF in evicting promoter nucleosomes and transcriptional activation. Compared to SWI/SNF, Ino80C generally functions over a wider region, spanning the -1 and +1 nucleosomes, NDR and proximal genic nucleosomes, at genes highly dependent on its function. Defects in nucleosome eviction in ino80Δ cells are frequently accompanied by reduced promoter occupancies of TBP, and diminished transcription; and Ino80 is enriched at genes requiring its remodeler activity. Importantly, nuclear depletion of Ino80 impairs promoter nucleosome eviction even in a mutant lacking H2A.Z. Thus, Ino80C acts widely in the yeast genome together with RSC and SWI/SNF in evicting promoter nucleosomes and enhancing transcription, all in a manner at least partly independent of H2A.Z editing.
    DOI:  https://doi.org/10.1093/nar/gkaa571
  20. Bioinformatics. 2020 Jul 01. 36(Supplement_1): i84-i92
       MOTIVATION: Genetic variation in regulatory elements can alter transcription factor (TF) binding by mutating a TF binding motif, which in turn may affect the activity of the regulatory elements. However, it is unclear which motifs are prone to impact transcriptional regulation if mutated. Current motif analysis tools either prioritize TFs based on motif enrichment without linking to a function or are limited in their applications due to the assumption of linearity between motifs and their functional effects.
    RESULTS: We present MAGGIE (Motif Alteration Genome-wide to Globally Investigate Elements), a novel method for identifying motifs mediating TF binding and function. By leveraging measurements from diverse genotypes, MAGGIE uses a statistical approach to link mutations of a motif to changes of an epigenomic feature without assuming a linear relationship. We benchmark MAGGIE across various applications using both simulated and biological datasets and demonstrate its improvement in sensitivity and specificity compared with the state-of-the-art motif analysis approaches. We use MAGGIE to gain novel insights into the divergent functions of distinct NF-κB factors in pro-inflammatory macrophages, revealing the association of p65-p50 co-binding with transcriptional activation and the association of p50 binding lacking p65 with transcriptional repression.
    AVAILABILITY AND IMPLEMENTATION: The Python package for MAGGIE is freely available at https://github.com/zeyang-shen/maggie. The accession number for the NF-κB ChIP-seq data generated for this study is Gene Expression Omnibus: GSE144070.
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
    DOI:  https://doi.org/10.1093/bioinformatics/btaa476
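
Entry 20 describes linking motif mutations to changes in an epigenomic feature without assuming linearity. One way to illustrate that kind of test is a nonparametric sign test on paired motif-score differences (the score in the genotype where the feature is present minus the score in the genotype where it is absent). The sketch below is an illustrative stand-in under that assumption; it is not claimed to be the statistic implemented in the MAGGIE package, and the function name is hypothetical.

```python
from math import comb


def sign_test_p_value(score_differences):
    """Exact two-sided sign test on paired motif-score differences.

    Only the sign of each difference is used, so no linear relationship
    between motif score and feature strength is assumed. Zero differences
    are dropped, as is conventional for the sign test.
    """
    positives = sum(1 for d in score_differences if d > 0)
    negatives = sum(1 for d in score_differences if d < 0)
    n = positives + negatives
    if n == 0:
        return 1.0  # no informative pairs
    k = min(positives, negatives)
    # Probability of k or fewer minority-sign pairs under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

A small p-value with mostly positive differences would suggest that stronger motif matches accompany the feature; mostly negative differences would point the other way.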
  21. Nat Commun. 2020 Jul 14. 11(1): 3503
      DNA replication timing is tightly regulated during S-phase. S-phase length is determined by DNA synthesis rate, which depends on the number of active replication forks and their velocity. Here, we show that E2F-dependent transcription, through E2F6, determines the replication capacity of a cell, defined as the maximal amount of DNA a cell can synthesise per unit time during S-phase. Increasing or decreasing E2F-dependent transcription during S-phase increases or decreases replication capacity, and thereby replication rates, thus shortening or lengthening S-phase, respectively. The changes in replication rate occur mainly through changes in fork speed without affecting the number of active forks. An increase in fork speed does not induce replication stress directly, but increases DNA damage over time causing cell cycle arrest. Thus, E2F-dependent transcription determines the DNA replication capacity of a cell, which affects the replication rate, controlling the time it takes to duplicate the genome and complete S-phase.
    DOI:  https://doi.org/10.1038/s41467-020-17146-z
  22. Genome Res. 2020 Jul 06.
      Accurate mapping of transcription start sites (TSSs) is key for understanding transcriptional regulation. However, current protocols for genome-wide TSS profiling are laborious and/or expensive. We present Survey of TRanscription Initiation at Promoter Elements with high-throughput sequencing (STRIPE-seq), a simple, rapid, and cost-effective protocol for sequencing capped RNA 5' ends from as little as 50 ng total RNA. Including depletion of uncapped RNA and reaction cleanups, a STRIPE-seq library can be constructed in about 5 h. We show application of STRIPE-seq to TSS profiling in yeast and human cells and show that it can also be effectively used for quantification of transcript levels and analysis of differential gene expression. In conjunction with our ready-to-use computational workflows, STRIPE-seq is a straightforward, efficient means by which to probe the landscape of transcriptional initiation.
    DOI:  https://doi.org/10.1101/gr.261545.120
  23. Oncogene. 2020 Jul 17.
      Localized prostate cancer develops very slowly in most men, with the androgen receptor (AR) and MYC transcription factors amongst the most well-characterized drivers of prostate tumorigenesis. Canonically, MYC up-regulation in luminal prostate cancer cells functions to oppose the terminally differentiating effects of AR. However, the effects of MYC up-regulation are pleiotropic and inconsistent with a poorly proliferative phenotype. Here we show that increased MYC expression and activity are associated with the down-regulation of MEIS1, a HOX-family transcription factor. Using RNA-seq to profile a series of human prostate cancer specimens laser capture microdissected on the basis of MYC immunohistochemistry, we found that MYC activity and MEIS1 expression were inversely correlated. Knockdown of MYC expression in prostate cancer cells increased the expression of MEIS1 and increased the occupancy of MYC at the MEIS1 locus. Finally, we show in laser capture microdissected human prostate cancer samples and the prostate TCGA cohort that MEIS1 expression is inversely proportional to AR activity as well as HOXB13, a known interacting protein of both AR and MEIS1. Collectively, our data demonstrate that elevated MYC in a subset of primary prostate cancers negatively regulates MEIS1 expression, and that this down-regulation may contribute to MYC-driven development and progression.
    DOI:  https://doi.org/10.1038/s41388-020-01389-7
  24. Nat Genet. 2020 Jul 13.
      Although DNA methylation is a key regulator of gene expression, the comprehensive methylation landscape of metastatic cancer has never been defined. Through whole-genome bisulfite sequencing paired with deep whole-genome and transcriptome sequencing of 100 castration-resistant prostate metastases, we discovered alterations affecting driver genes that were detectable only with integrated whole-genome approaches. Notably, we observed that 22% of tumors exhibited a novel epigenomic subtype associated with hypermethylation and somatic mutations in TET2, DNMT3B, IDH1 and BRAF. We also identified intergenic regions where methylation is associated with RNA expression of the oncogenic driver genes AR, MYC and ERG. Finally, we showed that differential methylation during progression preferentially occurs at somatic mutational hotspots and putative regulatory regions. This study is a large integrated study of whole-genome, whole-methylome and whole-transcriptome sequencing in metastatic cancer that provides a comprehensive overview of the important regulatory role of methylation in metastatic castration-resistant prostate cancer.
    DOI:  https://doi.org/10.1038/s41588-020-0648-8
  25. Genes Dev. 2020 Jul 16.
      YAP1 is a transcriptional coactivator and the principal effector of the Hippo signaling pathway, which is causally implicated in human cancer. Several YAP1 gene fusions have been identified in various human cancers and identifying the essential components of this family of gene fusions has significant therapeutic value. Here, we show that the YAP1 gene fusions YAP1-MAMLD1, YAP1-FAM118B, YAP1-TFE3, and YAP1-SS18 are oncogenic in mice. Using reporter assays, RNA-seq, ChIP-seq, and loss-of-function mutations, we show that all of these YAP1 fusion proteins exert TEAD-dependent YAP activity, while some also exert activity of the C-terminal fusion partner. The YAP activity of the different YAP1 fusions is resistant to negative Hippo pathway regulation due to constitutive nuclear localization and resistance to degradation of the YAP1 fusion proteins. Genetic disruption of the TEAD-binding domain of these oncogenic YAP1 fusions is sufficient to inhibit tumor formation in vivo, while pharmacological inhibition of the YAP1-TEAD interaction inhibits the growth of YAP1 fusion-expressing cell lines in vitro. These results highlight TEAD-dependent YAP activity found in these gene fusions as critical for oncogenesis and implicate these YAP functions as potential therapeutic targets in YAP1 fusion-positive tumors.
    Keywords:  MAMLD1; SS18; TEAD; TFE3; YAP1; angiosarcoma; cancer; ependymoma; gene fusion; verteporfin
    DOI:  https://doi.org/10.1101/gad.338681.120
  26. Clin Epigenetics. 2020 Jul 14. 12(1): 106
       BACKGROUND: H3K27ac histone acetylome changes contribute to the phenotypic response in heart diseases, particularly in end-stage heart failure. However, such epigenetic alterations have not been systematically investigated in remodeled non-failing human hearts. Therefore, valuable insight into cardiac dysfunction in early remodeling is lacking. This study aimed to reveal the acetylation changes of chromatin regions in response to myocardial remodeling and their correlations to transcriptional changes of neighboring genes.
    RESULTS: We detected chromatin regions with differential acetylation activity (DARs; Padj. < 0.05) between remodeled non-failing patient hearts and healthy donor hearts. The acetylation level of the chromatin region correlated with its RNA polymerase II occupancy level and the mRNA expression level of its adjacent gene per sample. Annotated genes from DARs were enriched in disease-related pathways, including fibrosis and cell metabolism regulation. DARs that change in the same direction tend to cluster together, suggesting a reorganized chromatin architecture that facilitates the interactions of regulatory domains in response to myocardial remodeling. We further show the differences between the acetylation level and the mRNA expression level of cell-type-specific markers for cardiomyocytes and 11 non-myocyte cell types. Notably, we identified transcription factor (TF) binding motifs that were enriched in DARs and defined TFs that were predicted to bind to these motifs. We further showed 64 genes coding for these TFs that were differentially expressed in remodeled myocardium when compared with controls.
    CONCLUSIONS: Our study reveals extensive novel insight on myocardial remodeling at the DNA regulatory level. Differences between the acetylation level and the transcriptional level of cell-type-specific markers suggest additional mechanism(s) between acetylome and transcriptome. By integrating these two layers of epigenetic profiles, we further provide promising TF-encoding genes that could serve as master regulators of myocardial remodeling. Combined, our findings highlight the important role of chromatin regulatory signatures in understanding disease etiology.
    Keywords:  Histone acetylation; Myocardial remodeling; Transcription factor; Transcriptome
    DOI:  https://doi.org/10.1186/s13148-020-00895-5
  27. Cell Rep. 2020 Jul 14. pii: S2211-1247(20)30858-5. [Epub ahead of print]32(2): 107877
      Evolutionarily conserved SCAN (named after SRE-ZBP, CTfin51, AW-1, and Number 18 cDNA)-domain-containing zinc finger transcription factors (ZSCAN) have been found in both mouse and human genomes. Zscan4 is transiently expressed during zygotic genome activation (ZGA) in preimplantation embryos and induced pluripotent stem cell (iPSC) reprogramming. However, little is known about the mechanism of Zscan4 underlying these processes of cell fate control. Here, we show that Zscan4f, a representative of ZSCAN proteins, is able to recruit Tet2 through its SCAN domain. The Zscan4f-Tet2 interaction promotes DNA demethylation and regulates the expression of target genes, particularly those encoding glycolytic enzymes and proteasome subunits. Zscan4f regulates metabolic rewiring, enhances proteasome function, and ultimately promotes iPSC generation. These results identify Zscan4f as an important partner of Tet2 in regulating target genes and promoting iPSC generation and suggest a possible and common mechanism shared by SCAN family transcription factors to recruit ten-eleven translocation (TET) DNA dioxygenases to regulate diverse cellular processes, including reprogramming.
    Keywords:  SCAN domain; TET2; ZSCAN4; iPSCs; induced pluripotent stem cells; metabolic rewiring; proteasome function; stem cell potency
    DOI:  https://doi.org/10.1016/j.celrep.2020.107877
  28. Stem Cells. 2020 Jul 11.
      Embryonic stem cell (ESC) renewal and differentiation is regulated by metabolites that serve as co-factors for epigenetic enzymes. Increase of α-ketoglutarate (α-KG), a co-factor for histone and DNA demethylases, triggers multi-lineage differentiation in human ESCs. To gain further insight into how the metabolic fluxes in pluripotent stem cells can be influenced by inactivating mutations in epigenetic enzymes, we generated human ESCs deficient for de novo DNA methyltransferases (DNMT) 3A and 3B. Our data reveal a bidirectional dependence between DNMT3B and α-KG levels: α-KG is significantly upregulated in cells deficient for DNMT3B, while DNMT3B expression is downregulated in human ESCs treated with α-KG. In addition, DNMT3B null human ESCs exhibit a disturbed mitochondrial fission and fusion balance and a switch from glycolysis to oxidative phosphorylation. Taken together, our data reveal a novel link between DNMT3B and the metabolic flux of human ESCs.
    SIGNIFICANCE STATEMENT: The current study reveals a novel link between DNMT3B and the metabolic flux in human ESCs. Loss of DNMT3B disrupts the cells' mitochondrial fusion and fission balance, reduces mitochondrial DNA levels, and elicits a switch from glycolysis to oxidative phosphorylation. The authors further show that loss of DNMT3B leads to an overexpression and hyperactivity of isocitrate dehydrogenases and buildup of α-ketoglutarate, as well as a significant upregulation of transcription factors during early neural differentiation. The observed increase in α-ketoglutarate levels can be reversed by re-expression of DNMT3B, demonstrating that its dysregulation is a direct consequence of DNMT3B-deficiency.
    DOI:  https://doi.org/10.1002/stem.3256
  29. Nature. 2020 Jul 15.
      Bone marrow transplantation therapy relies on the life-long regenerative capacity of haematopoietic stem cells (HSCs). HSCs present a complex variety of regenerative behaviours at the clonal level, but the mechanisms underlying this diversity are still undetermined. Recent advances in single-cell RNA sequencing have revealed transcriptional differences among HSCs, providing a possible explanation for their functional heterogeneity. However, the destructive nature of sequencing assays prevents simultaneous observation of stem cell state and function. To solve this challenge, we implemented expressible lentiviral barcoding, which enabled simultaneous analysis of lineages and transcriptomes from single adult HSCs and their clonal trajectories during long-term bone marrow reconstitution. Analysis of differential gene expression between clones with distinct behaviour revealed an intrinsic molecular signature that characterizes functional long-term repopulating HSCs. Probing this signature through in vivo CRISPR screening, we found the transcription factor TCF15 to be required and sufficient to drive HSC quiescence and long-term self-renewal. In situ, Tcf15 expression labels the most primitive subset of true multipotent HSCs. In conclusion, our work elucidates clone-intrinsic molecular programmes associated with functional stem cell heterogeneity and identifies a mechanism for the maintenance of the self-renewing HSC state.
    DOI:  https://doi.org/10.1038/s41586-020-2503-6
  30. Mol Cancer Res. 2020 Jul 13. pii: molcanres.0311.2019. [Epub ahead of print]
      Triple-negative breast cancer (TNBC) has the worst prognosis of all breast cancers, and lacks effective targeted treatment strategies. Previously, we identified 33 transcription factors highly expressed in TNBC. Here, we focused on six SOX transcription factors (SOX4, 6, 8, 9, 10 and 11) highly expressed in TNBCs. Our siRNA screening assay demonstrated that SOX9 knock-down suppressed TNBC cell growth and invasion in vitro. Thus, we hypothesized that SOX9 is an important regulator of breast cancer survival and metastasis, and demonstrated that knockout of SOX9 reduced breast tumor growth and lung metastasis in vivo. In addition, we found that loss of SOX9 induced profound apoptosis, with only a slight impairment of G1 to S progression within the cell cycle, and that SOX9 directly regulates genes controlling apoptosis. Based on published ChIP-seq data, we demonstrated that SOX9 binds to the promoter of apoptosis-regulating genes (tnfrsf1b, fadd, tnfrsf10a, tnfrsf10b, and ripk1), and represses their expression. SOX9 knock-down upregulates these genes, consistent with the induction of apoptosis. Analysis of available ChIP-seq data showed that SOX9 binds to the promoters of several EMT- and metastasis-regulating genes. Using ChIP assays, we demonstrated that SOX9 directly binds the promoters of genes involved in EMT (vim, cldn1, ctnnb1, and zeb1) and that SOX9 knock-down suppresses the expression of these genes. Implications: Our studies identified the SOX9 protein as a "master regulator" of breast cancer cell survival and metastasis, and provide preclinical rationale to develop SOX9 inhibitors for the treatment of women with metastatic triple-negative breast cancer.
    DOI:  https://doi.org/10.1158/1541-7786.MCR-19-0311
  31. Mol Cell. 2020 Jul 08. pii: S1097-2765(20)30433-0. [Epub ahead of print]
      Steroid receptors activate gene transcription by recruiting coactivators to initiate transcription of their target genes. For most nuclear receptors, the ligand-dependent activation function domain-2 (AF-2) is a primary contributor to the nuclear receptor (NR) transcriptional activity. In contrast to other steroid receptors, such as ERα, the activation function of androgen receptor (AR) is largely dependent on its ligand-independent AF-1 located in its N-terminal domain (NTD). It remains unclear why AR utilizes a different AF domain from other receptors even though NRs share similar domain organizations. Here, we present cryoelectron microscopy (cryo-EM) structures of DNA-bound full-length AR and its complex structure with key coactivators, SRC-3 and p300. AR dimerization follows a unique head-to-head and tail-to-tail manner. Unlike ERα, AR directly contacts a single SRC-3 and p300. The AR NTD is the primary site for coactivator recruitment. The structures provide a basis for understanding assembly of the AR:coactivator complex and its domain contributions for coactivator assembly and transcriptional regulation.
    Keywords:  AR dimerization; N-terminal domain; N/C interaction; SRC-3; androgen receptor; coactivator; complex; cryo-EM; p300; structure
    DOI:  https://doi.org/10.1016/j.molcel.2020.06.031
  32. Nat Commun. 2020 Jul 17. 11(1): 3603
      Members of the PR/SET domain-containing (PRDM) family of zinc finger transcriptional regulators play diverse developmental roles. PRDM10 is a yet uncharacterized family member, and its function in vivo is unknown. Here, we report an essential requirement for PRDM10 in pre-implantation embryos and embryonic stem cells (mESCs), where loss of PRDM10 results in severe cell growth inhibition. Detailed genomic and biochemical analyses reveal that PRDM10 functions as a sequence-specific transcription factor. We identify Eif3b, which encodes a core component of the eukaryotic translation initiation factor 3 (eIF3) complex, as a key downstream target, and demonstrate that growth inhibition in PRDM10-deficient mESCs is in part mediated through EIF3B-dependent effects on global translation. Our work elucidates the molecular function of PRDM10 in maintaining global translation, establishes its essential role in early embryonic development and mESC homeostasis, and offers insights into the functional repertoire of PRDMs as well as the transcriptional mechanisms regulating translation.
    DOI:  https://doi.org/10.1038/s41467-020-17304-3
  33. Development. 2020 Jul 16. pii: dev.190181. [Epub ahead of print]
      Neuronal phenotypes are controlled by terminal selector transcription factors in invertebrates, but only a few examples of such regulators have been provided in vertebrates. We hypothesised that TCF7L2 regulates different stages of postmitotic differentiation in the thalamus, and functions as a thalamic terminal selector. To investigate this hypothesis, we used complete and conditional knockouts of Tcf7l2 in mice. The connectivity and clustering of neurons were disrupted in the thalamo-habenular region in Tcf7l2 -/- embryos. The expression of subregional thalamic and habenular transcription factors was lost and region-specific cell migration and axon guidance genes were downregulated. In mice with a postnatal Tcf7l2 knockout, the induction of genes that confer thalamic terminal electrophysiological features was impaired. Many of these genes proved to be direct targets of TCF7L2. The role of TCF7L2 in terminal selection was functionally confirmed by impaired firing modes in thalamic neurons in the mutant mice. These data corroborate the existence of master regulators in the vertebrate brain that control stage-specific genetic programs and regional subroutines, maintain regional transcriptional network during embryonic development, and induce terminal selection postnatally.
    Keywords:  Brain development; Neuronal identity; TCF7L2; Terminal selector; Thalamus; Transcription factor
    DOI:  https://doi.org/10.1242/dev.190181
  34. Cell Stem Cell. 2020 Jul 14. pii: S1934-5909(20)30285-X. [Epub ahead of print]
      DNA methyltransferase 3A (DNMT3A) is the most commonly mutated gene in clonal hematopoiesis (CH). Somatic DNMT3A mutations arise in hematopoietic stem cells (HSCs) many years before malignancies develop, but difficulties in comparing their impact before malignancy with wild-type cells have limited the understanding of their contributions to transformation. To circumvent this limitation, we derived normal and DNMT3A mutant lymphoblastoid cell lines from a germline mosaic individual in whom these cells co-existed for nearly 6 decades. Mutant cells dominated the blood system, but not other tissues. Deep sequencing revealed similar mutational burdens and signatures in normal and mutant clones, while epigenetic profiling uncovered the focal erosion of DNA methylation at oncogenic regulatory regions in mutant clones. These regions overlapped with those sensitive to DNMT3A loss after DNMT3A ablation in HSCs and in leukemia samples. These results suggest that DNMT3A maintains a conserved DNA methylation pattern, the erosion of which provides a distinct competitive advantage to hematopoietic cells.
    Keywords:  DNMT3A; HSC; cell competition; clonal hematopoiesis; hematopoietic stem cells; mutation burden; mutation signature
    DOI:  https://doi.org/10.1016/j.stem.2020.06.018
  35. Genome Biol. 2020 Jul 13. 21(1): 171
      We present Hierarchical Bayesian Analysis of Differential Expression and ALternative Splicing (HBA-DEALS), which simultaneously characterizes differential expression and splicing in cohorts. HBA-DEALS attains state-of-the-art or better performance for both expression and splicing and allows genes to be characterized as having differential gene expression (DGE), differential alternative splicing (DAST), both, or neither. HBA-DEALS analysis of GTEx data demonstrated sets of genes that show predominant DGE or DAST across multiple tissue types. These sets have pervasive differences with respect to gene structure, function, membership in protein complexes, and promoter architecture.
    Keywords:  Alternative splicing; Differential expression; Transcription
    DOI:  https://doi.org/10.1186/s13059-020-02072-6
  36. J Cell Sci. 2020 Jul 13. pii: jcs.243303. [Epub ahead of print]
      CENP-B binds to CENP-B boxes on centromeric satellite DNAs (alphoid DNA in human). CENP-B maintains kinetochore function through interactions with CENP-A nucleosomes and CENP-C. CENP-B binding to transfected alphoid DNA can induce de novo CENP-A assembly, functional centromere/kinetochore formation and subsequent human artificial chromosome (HAC) formation. On the other hand, CENP-B also facilitates H3K9 tri-methylation on alphoid DNA via Suv39h1 at ectopic alphoid DNA integration sites. Excessive heterochromatin invasion into centromere chromatin suppresses CENP-A assembly. It is unclear how CENP-B controls such different chromatin states. Here, we show that the CENP-B acidic domain recruits histone chaperones and many chromatin modifiers including H3K36 methylase ASH1L, as well as the heterochromatin components, Suv39h1 and HP1s. ASH1L facilitates open chromatin formation competent for CENP-A assembly on alphoid DNA. These results indicate that CENP-B is a nexus for histone modifiers that alternatively promote or suppress CENP-A assembly by mutually exclusive mechanisms. Besides the DNA binding domain, the CENP-B acidic domain also facilitates CENP-A assembly de novo on transfected alphoid DNA. CENP-B therefore balances CENP-A assembly or heterochromatin formation on satellite DNA.
    Keywords:  ASH1L; Acidic rich domain; Alternative epigenetic states; CENP-B; Centromere; HP1
    DOI:  https://doi.org/10.1242/jcs.243303
  37. Bioinformatics. 2020 Jul 01. 36(Supplement_1): i542-i550
       MOTIVATION: Cellular Indexing of Transcriptomes and Epitopes by sequencing (CITE-seq) couples the measurement of surface marker proteins with simultaneous sequencing of mRNA at the single-cell level, which brings accurate cell surface phenotyping to single-cell transcriptomics. Unfortunately, multiplets in CITE-seq datasets create artificial cell types (ACT) and complicate the automation of cell surface phenotyping.
    RESULTS: We propose CITE-sort, an artificial-cell-type aware surface marker clustering method for CITE-seq. CITE-sort is aware of and is robust to multiplet-induced ACT. We benchmarked CITE-sort with real and simulated CITE-seq datasets and compared CITE-sort against canonical clustering methods. We show that CITE-sort produces the best clustering performance across the board. CITE-sort not only accurately identifies real biological cell types (BCT) but also consistently and reliably separates multiplet-induced artificial-cell-type droplet clusters from real BCT droplet clusters. In addition, CITE-sort organizes its clustering process with a binary tree, which facilitates easy interpretation and verification of its clustering result and simplifies cell-type annotation with domain knowledge in CITE-seq.
    AVAILABILITY AND IMPLEMENTATION: http://github.com/QiuyuLian/CITE-sort.
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
    DOI:  https://doi.org/10.1093/bioinformatics/btaa467
  38. Bioinformatics. 2020 Jul 01. 36(Supplement_1): i508-i515
       MOTIVATION: Gaining a comprehensive understanding of the genetics underlying cancer development and progression is a central goal of biomedical research. Its accomplishment promises key mechanistic, diagnostic and therapeutic insights. One major step in this direction is the identification of genes that drive the emergence of tumors upon mutation. Recent advances in the field of computational biology have shown the potential of combining genetic summary statistics that represent the mutational burden in genes with biological networks, such as protein-protein interaction networks, to identify cancer driver genes. Those approaches superimpose the summary statistics on the nodes in the network, followed by an unsupervised propagation of the node scores through the network. However, this unsupervised setting does not leverage any knowledge on well-established cancer genes, a potentially valuable resource to improve the identification of novel cancer drivers.
    RESULTS: We develop a novel node embedding that enables classification of cancer driver genes in a supervised setting. The embedding combines a representation of the mutation score distribution in a node's local neighborhood with network propagation. We leverage the knowledge of well-established cancer driver genes to define a positive class, resulting in a partially labeled dataset, and develop a cross-validation scheme to enable supervised prediction. The proposed node embedding followed by a supervised classification improves the predictive performance compared with baseline methods and yields a set of promising genes that constitute candidates for further biological validation.
    AVAILABILITY AND IMPLEMENTATION: Code available at https://github.com/BorgwardtLab/MoProEmbeddings.
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
    DOI:  https://doi.org/10.1093/bioinformatics/btaa452
  39. Nature. 2020 Jul 15.
      Proteins are manufactured by ribosomes, macromolecular complexes of protein and RNA molecules that are assembled within major nuclear compartments called nucleoli. Existing models suggest that RNA polymerases I and III (Pol I and Pol III) are the only enzymes that directly mediate the expression of the ribosomal RNA (rRNA) components of ribosomes. Here we show, however, that RNA polymerase II (Pol II) inside human nucleoli operates near genes encoding rRNAs to drive their expression. Pol II, assisted by the neurodegeneration-associated enzyme senataxin, generates a shield comprising triplex nucleic acid structures known as R-loops at intergenic spacers flanking nucleolar rRNA genes. The shield prevents Pol I from producing sense intergenic noncoding RNAs (sincRNAs) that can disrupt nucleolar organization and rRNA expression. These disruptive sincRNAs can be unleashed by Pol II inhibition, senataxin loss, Ewing sarcoma or locus-associated R-loop repression through an experimental system involving the proteins RNaseH1, eGFP and dCas9 (which we refer to as 'red laser'). We reveal a nucleolar Pol-II-dependent mechanism that drives ribosome biogenesis, identify disease-associated disruption of nucleoli by noncoding RNAs, and establish locus-targeted R-loop modulation. Our findings revise theories of labour division between the major RNA polymerases, and identify nucleolar Pol II as a major factor in protein synthesis and nuclear organization, with potential implications for health and disease.
    DOI:  https://doi.org/10.1038/s41586-020-2497-0
  40. Nat Plants. 2020 Jul 13.
      The DNA methyltransferases MET1 and CMT3 are known to be responsible for maintenance of DNA methylation at symmetric CG and CHG sites, respectively, in Arabidopsis thaliana. However, it is unknown how the expression of methyltransferase genes is regulated in different cell states and whether change in expression affects DNA methylation at the whole-genome level. Using a reverse genetic screen, we identified TCX5, a tesmin/TSO1-like CXC domain-containing protein, and demonstrated that it is a transcriptional repressor of genes required for maintenance of DNA methylation, which include MET1, CMT3, DDM1, KYP and VIMs. TCX5 functions redundantly with its paralogue TCX6 in repressing the expression of these genes. In the tcx5 tcx6 double mutant, expression of these genes is markedly increased, thereby leading to markedly increased DNA methylation at CHG sites and, to a lesser extent, at CG sites at the whole-genome level. Furthermore, our whole-genome DNA methylation analysis indicated that the CG and CHG methylation level is lower in differentiated quiescent cells than in dividing cells in the wild type but is comparable in the tcx5/6 mutant, suggesting that TCX5/6 are required for maintenance of the difference in DNA methylation between the two cell types. We identified TCX5/6-containing multi-subunit complexes, which are known as DREAM in other eukaryotes, and demonstrated that the Arabidopsis DREAM components function as a whole to preclude DNA hypermethylation. Given that the DREAM complexes are conserved from plants to animals, the preclusion of DNA hypermethylation by DREAM complexes may represent a conserved mechanism in eukaryotes.
    DOI:  https://doi.org/10.1038/s41477-020-0710-7
  41. J Clin Invest. 2020 Jul 14. pii: 124070. [Epub ahead of print]
      Chronic inflammation is deeply involved in various human disorders, such as cancer, neurodegenerative disorders, and metabolic disorders. Induction of epigenetic alterations, especially aberrant DNA methylation, is one of the major mechanisms, but how it is induced is still unclear. Here, we found that expression of TET genes, methylation erasers, was down-regulated in inflamed mouse and human tissues, and that this was caused by up-regulation of TET-targeting miRNAs, such as MIR20A, MIR26B, and MIR29C, likely due to activation of NF-κB signaling, downstream of IL-1β and TNF-α. However, TET knockdown induced only mild aberrant methylation. Nitric oxide (NO), produced by NOS2, enhanced enzymatic activity of DNMTs, methylation writers, and NO exposure induced minimal aberrant methylation. In contrast, a combination of TET knockdown and NO exposure synergistically induced aberrant methylation, involving genomic regions not methylated by either alone. The results showed that a vicious combination of TET repression, due to NF-κB activation, and DNMT activation, due to NO production, is responsible for aberrant methylation induction in human tissues.
    Keywords:  Epigenetics; Oncology
    DOI:  https://doi.org/10.1172/JCI124070
  42. Proc Natl Acad Sci U S A. 2020 Jul 15. pii: 202002449. [Epub ahead of print]
      Early pregnancy loss affects ∼15% of all implantation-confirmed human conceptions. However, evolutionarily conserved molecular mechanisms that regulate self-renewal of trophoblast progenitors and their association with early pregnancy loss are poorly understood. Here, we provide evidence that transcription factor TEAD4 ensures survival of postimplantation mouse and human embryos by controlling self-renewal and stemness of trophoblast progenitors within the placenta primordium. In an early postimplantation mouse embryo, TEAD4 is selectively expressed in trophoblast stem cell-like progenitor cells (TSPCs), and loss of Tead4 in postimplantation mouse TSPCs impairs their self-renewal, leading to embryonic lethality before embryonic day 9.0, a developmental stage equivalent to the first trimester of human gestation. Both TEAD4 and its cofactor, yes-associated protein 1 (YAP1), are specifically expressed in cytotrophoblast (CTB) progenitors of a first-trimester human placenta. We also show that a subset of unexplained recurrent pregnancy losses (idiopathic RPLs) is associated with impaired TEAD4 expression in CTB progenitors. Furthermore, by establishing idiopathic RPL patient-specific human trophoblast stem cells (RPL-TSCs), we show that loss of TEAD4 is associated with defective self-renewal in RPL-TSCs and rescue of TEAD4 expression restores their self-renewal ability. Unbiased genomics studies revealed that TEAD4 directly regulates expression of key cell cycle genes in both mouse and human TSCs and establishes a conserved transcriptional program. Our findings show that TEAD4, an effector of the Hippo signaling pathway, is essential for the establishment of pregnancy in a postimplantation mammalian embryo and indicate that impairment of the Hippo signaling pathway could be a molecular cause for early human pregnancy loss.
    Keywords:  Hippo signaling; TEAD4; placenta; recurrent pregnancy loss; trophoblast progenitor
    DOI:  https://doi.org/10.1073/pnas.2002449117
  43. Sci Adv. 2020 Jul;6(27): eaaz4012
      Expanded CAG/CTG repeats underlie 13 neurological disorders, including myotonic dystrophy type 1 (DM1) and Huntington's disease (HD). Upon expansion, disease loci acquire heterochromatic characteristics, which may provoke changes to chromatin conformation and thereby affect both gene expression and repeat instability. Here, we tested this hypothesis by performing 4C sequencing at the DMPK and HTT loci from DM1 and HD-derived cells. We find that allele sizes ranging from 15 to 1700 repeats displayed similar chromatin interaction profiles. This was true for both loci and for alleles with different DNA methylation levels and CTCF binding. Moreover, the ectopic insertion of an expanded CAG repeat tract did not change the conformation of the surrounding chromatin. We conclude that CAG/CTG repeat expansions are not enough to alter chromatin conformation in cis. Therefore, it is unlikely that changes in chromatin interactions drive repeat instability or changes in gene expression in these disorders.
    DOI:  https://doi.org/10.1126/sciadv.aaz4012
  44. BMC Genomics. 2020 Jul 13. 21(1): 479
      BACKGROUND: Whole-Genome Bisulfite Sequencing (WGBS) is a Next Generation Sequencing (NGS) technique for measuring DNA methylation at base resolution. Continuing drops in sequencing costs are beginning to enable high-throughput surveys of DNA methylation in large samples of individuals and/or single cells. These surveys can easily generate hundreds or even thousands of WGBS datasets in a single study. The efficient pre-processing of these large amounts of data poses major computational challenges and creates unnecessary bottlenecks for downstream analysis and biological interpretation.
      RESULTS: To offer an efficient analysis solution, we present MethylStar, a fast, stable and flexible pre-processing pipeline for WGBS data. MethylStar integrates well-established tools for read trimming, alignment and methylation state calling in a highly parallelized environment, manages computational resources and performs automatic error detection. MethylStar offers easy installation through a dockerized container with all dependencies preloaded and also features a user-friendly interface designed for experts and non-experts alike. Application of MethylStar to WGBS data from human, maize and A. thaliana shows favorable performance in terms of speed and memory requirements compared with existing pipelines.
      CONCLUSIONS: MethylStar is a fast, stable and flexible pipeline for high-throughput pre-processing of bulk or single-cell WGBS data. Its easy installation and user-friendly interface should make it a useful resource for the wider epigenomics community. MethylStar is distributed under the GPL-3.0 license, and its source code is publicly available on GitHub (https://github.com/jlab-code/MethylStar). A Docker image for installation is available from http://jlabdata.org/methylstar.tar.gz.
    Keywords:  DNA methylation; NGS; Pipeline; Single cell; Whole genome bisulfite sequencing
    DOI:  https://doi.org/10.1186/s12864-020-06886-3