bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2020‒05‒17
forty-five papers selected by
Connor Rogerson
University of Cambridge, MRC Cancer Unit

  1. Nat Genet. 2020 May 11.
    Douillet D, Sze CC, Ryan C, Piunti A, Shah AP, Ugarenko M, Marshall SA, Rendleman EJ, Zha D, Helmin KA, Zhao Z, Cao K, Morgan MA, Singer BD, Bartom ET, Smith ER, Shilatifard A.
      The COMPASS protein family catalyzes histone H3 Lys 4 (H3K4) methylation and its members are essential for regulating gene expression. MLL2/COMPASS methylates H3K4 on many developmental genes and bivalent clusters. To understand MLL2-dependent transcriptional regulation, we performed a CRISPR-based screen with an MLL2-dependent gene as a reporter in mouse embryonic stem cells. We found that MLL2 functions in gene expression by protecting developmental genes from repression via repelling PRC2 and DNA methylation machineries. Accordingly, repression in the absence of MLL2 is relieved by inhibition of PRC2 and DNA methyltransferases. Furthermore, DNA demethylation on such loci leads to reactivation of MLL2-dependent genes not only by removing DNA methylation but also by opening up previously CpG methylated regions for PRC2 recruitment, diluting PRC2 at Polycomb-repressed genes. These findings reveal how the context and function of these three epigenetic modifiers of chromatin can orchestrate transcriptional decisions and demonstrate that prevention of active repression by the context of the enzyme and not H3K4 trimethylation underlies transcriptional regulation on MLL2/COMPASS targets.
  2. Nucleic Acids Res. 2020 May 11. pii: gkaa360. [Epub ahead of print]
    Wen Z, Zhang L, Ruan H, Li G.
      The nucleosome is the basic structural unit of chromatin, and its dynamics plays critical roles in the regulation of genome functions. However, how the nucleosome structure is regulated by histone variants in vivo is still largely uncharacterized. Here, by employing Micrococcal nuclease (MNase) digestion of crosslinked chromatin followed by chromatin immunoprecipitation (ChIP) and paired-end sequencing (MNase-X-ChIP-seq), we mapped unwrapping states of nucleosomes containing histone variant H2A.Z in mouse embryonic stem (ES) cells. We found that H2A.Z nucleosomes are more enriched with unwrapping states compared with canonical nucleosomes. Interestingly, +1 H2A.Z nucleosomes with 30-80 bp DNA are correlated with less active genes compared with +1 H2A.Z nucleosomes with 120-140 bp DNA. We confirmed the unwrapping of H2A.Z nucleosomes under native condition by re-ChIP of H2A.Z and H2A after CTCF CUT&RUN in mouse ES cells. Importantly, we found that depletion of H2A.Z results in decreased unwrapping of H3.3 nucleosomes and increased CTCF binding. Taken together, through MNase-X-ChIP-seq, we showed that histone variant H2A.Z regulates nucleosome unwrapping in vivo and that its function in regulating transcription or CTCF binding is correlated with unwrapping states of H2A.Z nucleosomes.
  3. Cell Mol Life Sci. 2020 May 14.
    Montibus B, Cercy J, Bouschet T, Charras A, Maupetit-Méhouas S, Nury D, Gonthier-Guéret C, Chauveau S, Allegre N, Chariau C, Hong CC, Vaillant I, Marques CJ, Court F, Arnaud P.
      The acquisition of cell identity is associated with developmentally regulated changes in the cellular histone methylation signatures. For instance, commitment to neural differentiation relies on the tightly controlled gain or loss of H3K27me3, a hallmark of polycomb-mediated transcriptional gene silencing, at specific gene sets. The KDM6B demethylase, which removes H3K27me3 marks at defined promoters and enhancers, is a key factor in neurogenesis. Therefore, to better understand the epigenetic regulation of neural fate acquisition, it is important to determine how Kdm6b expression is regulated. Here, we investigated the molecular mechanisms involved in the induction of Kdm6b expression upon neural commitment of mouse embryonic stem cells. We found that the increase in Kdm6b expression is linked to a rearrangement between two 3D configurations defined by the promoter contact with two different regions in the Kdm6b locus. This is associated with changes in 5-hydroxymethylcytosine (5-hmC) levels at these two regions, and requires a functional ten-eleven-translocation (TET) 3 protein. Altogether, our data support a model whereby Kdm6b induction upon neural commitment relies on an intronic enhancer the activity of which is defined by its TET3-mediated 5-hmC level. This original observation reveals an unexpected interplay between the 5-hmC and H3K27me3 pathways during neural lineage commitment in mammals. It also raises the question of the extent to which KDM6B-mediated changes in H3K27me3 level account for the TET-mediated effects on gene expression.
    Keywords:  5-Hydroxymethylcytosine; Enhancer; H3K27me3; Kdm6b; Neural stem cells; Neurogenesis; Tet3
  4. Nat Cell Biol. 2020 May 11.
    Adam RC, Yang H, Ge Y, Infarinato NR, Gur-Cohen S, Miao Y, Wang P, Zhao Y, Lu CP, Kim JE, Ko JY, Paik SS, Gronostajski RM, Kim J, Krueger JG, Zheng D, Fuchs E.
      Tissue homeostasis and regeneration rely on resident stem cells (SCs), whose behaviour is regulated through niche-dependent crosstalk. The mechanisms underlying SC identity are still unfolding. Here, using spatiotemporal gene ablation in murine hair follicles, we uncover a critical role for the transcription factors (TFs) nuclear factor IB (NFIB) and IX (NFIX) in maintaining SC identity. Without NFI TFs, SCs lose their hair-regenerating capability, and produce skin bearing striking resemblance to irreversible human alopecia, which also displays reduced NFIs. Through single-cell transcriptomics, ATAC-Seq and ChIP-Seq profiling, we expose a key role for NFIB and NFIX in governing super-enhancer maintenance of the key hair follicle SC-specific TF genes. When NFIB and NFIX are genetically removed, the stemness epigenetic landscape is lost. Super-enhancers driving SC identity are decommissioned, while unwanted lineages are de-repressed ectopically. Together, our findings expose NFIB and NFIX as crucial rheostats of tissue homeostasis, functioning to safeguard the SC epigenome from a breach in lineage confinement that otherwise triggers irreversible tissue degeneration.
  5. Mol Cell. 2020 May 04. pii: S1097-2765(20)30261-6. [Epub ahead of print]
    Kim SA, Zhu J, Yennawar N, Eek P, Tan S.
      LSD1 (lysine specific demethylase; also known as KDM1A), the first histone demethylase discovered, regulates cell-fate determination and is overexpressed in multiple cancers. LSD1 demethylates histone H3 Lys4, an epigenetic mark for active genes, but requires the CoREST repressor to act on nucleosome substrates. To understand how an accessory subunit (CoREST) enables a chromatin enzyme (LSD1) to function on a nucleosome and not just histones, we have determined the crystal structure of the LSD1/CoREST complex bound to a 191-bp nucleosome. We find that the LSD1 catalytic domain binds extranucleosomal DNA and is unexpectedly positioned 100 Å away from the nucleosome core. CoREST makes critical contacts with both histone and DNA components of the nucleosome, explaining its essential function in demethylating nucleosome substrates. Our studies also show that the LSD1(K661A) frequently used as a catalytically inactive mutant in vivo (based on in vitro peptide studies) actually retains substantial H3K4 demethylase activity on nucleosome substrates.
    Keywords:  X-ray crystallography; chromatin biology; epigenetics; gene regulation; histone demethylation; histone modifications; nucleosome binding
  6. Leukemia. 2020 May 12.
    Ghasemi R, Struthers H, Wilson ER, Spencer DH.
      Transcriptional regulation of the HOXA genes is thought to involve CTCF-mediated chromatin loops and the opposing actions of the COMPASS and Polycomb epigenetic complexes. We investigated the role of these mechanisms at the HOXA cluster in AML cells with the common NPM1c mutation, which express both HOXA and HOXB genes. CTCF binding at the HOXA locus is conserved across primary AML samples, regardless of HOXA gene expression, and defines a continuous chromatin domain marked by COMPASS-associated histone H3 trimethylation in NPM1-mutant primary AML samples. Profiling of the three-dimensional chromatin architecture in primary AML samples with the NPM1c mutation identified chromatin loops between the HOXA cluster and loci in the SNX10 and SKAP2 genes, and an intergenic region located 1.4 Mbp upstream of the HOXA locus. Deletion of CTCF binding sites in the NPM1-mutant OCI-AML3 AML cell line reduced multiple long-range interactions, but resulted in CTCF-independent loops with sequences in SKAP2 that were marked by enhancer-associated histone modifications in primary AML samples. HOXA gene expression was maintained in CTCF binding site mutants, indicating that transcriptional activity at the HOXA locus in NPM1-mutant AML cells may be sustained through persistent interactions with SKAP2 enhancers, or by intrinsic factors within the HOXA gene cluster.
  7. Nucleic Acids Res. 2020 May 12. pii: gkaa369. [Epub ahead of print]
    Kim S, Piquerez SJM, Ramirez-Prado JS, Mastorakis E, Veluchamy A, Latrasse D, Manza-Mianza D, Brik-Chaouche R, Huang Y, Rodriguez-Granados NY, Concia L, Blein T, Citerne S, Bendahmane A, Bergounioux C, Crespi M, Mahfouz MM, Raynaud C, Hirt H, Ntoukakis V, Benhamed M.
      The modification of histones by acetyl groups has a key role in the regulation of chromatin structure and transcription. The Arabidopsis thaliana histone acetyltransferase GCN5 regulates histone modifications as part of the Spt-Ada-Gcn5 Acetyltransferase (SAGA) transcriptional coactivator complex. GCN5 was previously shown to acetylate lysine 14 of histone 3 (H3K14ac) in the promoter regions of its target genes even though GCN5 binding did not systematically correlate with gene activation. Here, we explored the mechanism through which GCN5 controls transcription. First, we fine-mapped GCN5 binding sites genome-wide and then used several global methodologies (ATAC-seq, ChIP-seq and RNA-seq) to assess the effect of GCN5 loss-of-function on the expression and epigenetic regulation of its target genes. These analyses provided evidence that GCN5 has a dual role in the regulation of H3K14ac levels at the 5' and 3' ends of its target genes. While the gcn5 mutation led to a genome-wide decrease of H3K14ac in the 5' end of the GCN5 down-regulated targets, it also led to an increase of H3K14ac in the 3' ends of GCN5 up-regulated targets. Furthermore, genome-wide changes in H3K14ac levels in the gcn5 mutant correlated with changes in H3K9ac at both 5' and 3' ends, providing evidence for a molecular link between the depositions of these two histone modifications. To understand the biological relevance of these regulations, we showed that GCN5 participates in the responses to biotic stress by repressing salicylic acid (SA) accumulation and SA-mediated immunity, highlighting the role of this protein in the regulation of the crosstalk between diverse developmental and stress-responsive physiological programs. Hence, our results demonstrate that GCN5, through the modulation of H3K14ac levels on its targets, controls the balance between biotic and abiotic stress responses and is a master regulator of plant-environmental interactions.
  8. Physiol Genomics. 2020 May 11.
    Yin S, Ray G, Kerschner JL, Hao S, Perez A, Drumm M, Browne J, Leir SH, Longworth M, Harris A.
      Organoids are a valuable 3D model to study the differentiated functions of the human intestinal epithelium. They are a particularly powerful tool to measure epithelial transport processes in health and disease. Though biological assays such as organoid swelling and intraluminal pH measurements are well established, their underlying functional genomics are not well characterized. Here we combine genome-wide analysis of open chromatin by ATAC-seq with transcriptome mapping by RNA-seq to define the genomic signature of human intestinal organoids (HIOs). These data provide an important tool for investigating key physiological and biochemical processes in the intestinal epithelium. We next compared the transcriptome and open chromatin profiles of HIOs with equivalent datasets from the Caco2 colorectal carcinoma line, which is an important 2D model of the intestinal epithelium. Our results define common features of the intestinal epithelium in HIO and Caco2 and further illustrate the cancer-associated program of the cell line. Generation of Caco2 cysts enabled interrogation of the molecular divergence of the 2D and 3D cultures. Over-represented motif analysis of open chromatin peaks identified Caudal Type Homeobox 2 (CDX2) as a key activating transcription factor in HIO, but not in monolayer cultures of Caco2. However, the CDX2 motif becomes overrepresented in open chromatin from Caco2 cysts, reinforcing the importance of this factor in intestinal epithelial differentiation and function. Intersection of the HIO and Caco2 transcriptomes further showed functional overlap in pathways of ion transport and tight junction integrity, among others. These data contribute to understanding human intestinal organoid biology.
    Keywords:  Functional genomics; Intestinal organoids; gene expression; open chromatin
  9. Mol Cell. 2020 May 07. pii: S1097-2765(20)30260-4. [Epub ahead of print] 78(3): 506-521.e6
    Zhang X, Jeong M, Huang X, Wang XQ, Wang X, Zhou W, Shamim MS, Gore H, Himadewi P, Liu Y, Bochkov ID, Reyes J, Doty M, Huang YH, Jung H, Heikamp E, Aiden AP, Li W, Su J, Aiden EL, Goodell MA.
      Higher-order chromatin structure and DNA methylation are implicated in multiple developmental processes, but their relationship to cell state is unknown. Here, we find that large (>7.3 kb) DNA methylation nadirs (termed "grand canyons") can form long loops connecting anchor loci that may be dozens of megabases (Mb) apart, as well as inter-chromosomal links. The interacting loci cover a total of ∼3.5 Mb of the human genome. The strongest interactions are associated with repressive marks made by the Polycomb complex and are diminished upon EZH2 inhibitor treatment. The data are suggestive of the formation of these loops by interactions between repressive elements in the loci, forming a genomic subcompartment, rather than by cohesin/CTCF-mediated extrusion. Interestingly, unlike previously characterized subcompartments, these interactions are present only in particular cell types, such as stem and progenitor cells. Our work reveals that H3K27me3-marked large DNA methylation grand canyons represent a set of very-long-range loops associated with cellular identity.
    Keywords:  3D genomics; CpG; DNA methylation; DNA methylation canyon; Polycomb; chromosomal looping; hematopoietic; self-renewal; stem cells
  10. Elife. 2020 May 12. pii: e53885. [Epub ahead of print] 9
    Golfier S, Quail T, Kimura H, Brugués J.
      Loop extrusion by structural maintenance of chromosomes complexes (SMCs) has been proposed as a mechanism to organize chromatin in interphase and metaphase. However, the requirements for chromatin organization in these cell cycle phases are different, and it is unknown whether loop extrusion dynamics and the complexes that extrude DNA also differ. Here, we used Xenopus egg extracts to reconstitute and image loop extrusion of single DNA molecules during the cell cycle. We show that loops form in both metaphase and interphase, but with distinct dynamic properties. Condensin extrudes DNA loops non-symmetrically in metaphase, whereas cohesin extrudes loops symmetrically in interphase. Our data show that loop extrusion is a general mechanism underlying DNA organization, with dynamic and structural properties that are biochemically regulated during the cell cycle.
    Keywords:  cell biology; chromosomes; gene expression; xenopus
  11. Cells. 2020 May 11. pii: E1190. [Epub ahead of print]9(5):
    Lee YJ, Son SH, Lim CS, Kim MY, Lee SW, Lee S, Jeon J, Ha DH, Jung NR, Han SY, Do BR, Na I, Uversky VN, Kim CG.
      Chromatin remodeling, including histone modification, chromatin (un)folding, and nucleosome remodeling, is a significant transcriptional regulation mechanism. By these epigenetic modifications, transcription factors and their regulators are recruited to the promoters of target genes, and thus gene expression is controlled through either transcriptional activation or repression. The Mat1-mediated transcriptional repressor (MMTR)/DNA methyltransferase 1 (DNMT1)-associated protein (Dmap1) is a transcription corepressor involved in chromatin remodeling, cell cycle regulation, DNA double-strand break repair, and tumor suppression. The Tip60-p400 complex proteins, including MMTR/Dmap1, interact with the oncogene Myc in embryonic stem cells (ESCs). These proteins interplay with the stem cell-related proteome networks and regulate gene expressions. However, the detailed mechanisms of their functions are unknown. Here, we show that MMTR/Dmap1, along with other Tip60-p400 complex proteins, bind the promoters of differentiation commitment genes in mouse ESCs. Hence, MMTR/Dmap1 controls gene expression alterations during differentiation. Furthermore, we propose a novel mechanism of MMTR/Dmap1 function in early stage lineage commitment of mouse ESCs by crosstalk with the polycomb group (PcG) proteins. The complex controls histone mark bivalency and transcriptional poising of commitment genes. Taken together, our comprehensive findings will help better understand the MMTR/Dmap1-mediated transcriptional regulation in ESCs and other cell types.
    Keywords:  MMTR/Dmap1; Tip60-p400 complex; bivalency; embryonic stem cells; poised gene; polycomb repressive complexes (PRCs)
  12. Mol Cell. 2020 May 01. pii: S1097-2765(20)30258-6. [Epub ahead of print]
    Bheda P, Aguilar-Gómez D, Becker NB, Becker J, Stavrou E, Kukhtevich I, Höfer T, Maerkl S, Charvin G, Marr C, Kirmizis A, Schneider R.
      Transcriptional memory of gene expression enables adaptation to repeated stimuli across many organisms. However, the regulation and heritability of transcriptional memory in single cells and through divisions remains poorly understood. Here, we combined microfluidics with single-cell live imaging to monitor Saccharomyces cerevisiae galactokinase 1 (GAL1) expression over multiple generations. By applying pedigree analysis, we dissected and quantified the maintenance and inheritance of transcriptional reinduction memory in individual cells through multiple divisions. We systematically screened for loss- and gain-of-memory knockouts to identify memory regulators in thousands of single cells. We identified new loss-of-memory mutants, which affect memory inheritance into progeny. We also unveiled a gain-of-memory mutant, elp6Δ, and suggest that this new phenotype can be mediated through decreased histone occupancy at the GAL1 promoter. Our work uncovers principles of maintenance and inheritance of gene expression states and their regulators at the single-cell level.
    Keywords:  ChIP; Gal1; SGA; epigenetics; inheritance; microfluidics; modeling; pedigree; single cell; transcriptional memory
  13. FEBS Lett. 2020 May 11.
    Marcum RD, Radhakrishnan I.
      The Sin3L/Rpd3L histone deacetylase (HDAC) complex is one of six major HDAC complexes in the nucleus, and its recruitment by promoter-bound transcription factors is an important step in many gene transcription regulatory pathways. Here, we investigate how the Myt1L zinc finger transcription factor, important for neuronal differentiation and the maintenance of neuronal identity, recruits this complex at the molecular level. We show that Myt1L, through a highly conserved segment shared with its paralogs, interacts directly and specifically with the Sin3 PAH1 domain, binding principally to the canonical hydrophobic cleft found in PAH domains. Our findings are relevant not only for other members of the Myt family but also for enhancing our understanding of the rules of protein-protein interactions involving Sin3 PAH domains.
    Keywords:  Histone deacetylases; neuronal differentiation; protein-protein interaction; repressor-corepressor interaction; solution NMR spectroscopy; structure-function analysis; transcriptional repression
  14. Nat Commun. 2020 May 12. 11(1): 2364
    Zorzan I, Pellegrini M, Arboit M, Incarnato D, Maldotti M, Forcato M, Tagliazucchi GM, Carbognin E, Montagner M, Oliviero S, Martello G.
      Human pluripotent stem cells (hPSCs) have the capacity to give rise to all differentiated cells of the adult. TGF-beta is used routinely for expansion of conventional hPSCs as flat epithelial colonies expressing the transcription factors POU5F1/OCT4, NANOG, SOX2. Here we report a global analysis of the transcriptional programme controlled by TGF-beta followed by an unbiased gain-of-function screening in multiple hPSC lines to identify factors mediating TGF-beta activity. We identify a quartet of transcriptional regulators promoting hPSC self-renewal including ZNF398, a human-specific mediator of pluripotency and epithelial character in hPSCs. Mechanistically, ZNF398 binds active promoters and enhancers together with SMAD3 and the histone acetyltransferase EP300, enabling transcription of TGF-beta targets. In the context of somatic cell reprogramming, inhibition of ZNF398 abolishes activation of pluripotency and epithelial genes and colony formation. Our findings have clear implications for the generation of bona fide hPSCs for regenerative medicine.
  15. Cell. 2020 May 05. pii: S0092-8674(20)30481-5. [Epub ahead of print]
    Basu S, Mackowiak SD, Niskanen H, Knezevic D, Asimi V, Grosswendt S, Geertsema H, Ali S, Jerković I, Ewers H, Mundlos S, Meissner A, Ibrahim DM, Hnisz D.
      Expansions of amino acid repeats occur in >20 inherited human disorders, and many occur in intrinsically disordered regions (IDRs) of transcription factors (TFs). Such diseases are associated with protein aggregation, but the contribution of aggregates to pathology has been controversial. Here, we report that alanine repeat expansions in the HOXD13 TF, which cause hereditary synpolydactyly in humans, alter its phase separation capacity and its capacity to co-condense with transcriptional co-activators. HOXD13 repeat expansions perturb the composition of HOXD13-containing condensates in vitro and in vivo and alter the transcriptional program in a cell-specific manner in a mouse model of synpolydactyly. Disease-associated repeat expansions in other TFs (HOXA13, RUNX2, and TBP) were similarly found to alter their phase separation. These results suggest that unblending of transcriptional condensates may underlie human pathologies. We present a molecular classification of TF IDRs, which provides a framework to dissect TF function in diseases associated with transcriptional dysregulation.
    Keywords:  activation domain; condensate; intrinsically disordered region; phase separation; repeat expansion; synpolydactyly; transcription factor; transcriptional condensate
  16. Nat Commun. 2020 May 15. 11(1): 2423
    Marchetto A, Ohmura S, Orth MF, Knott MML, Colombo MV, Arrigoni C, Bardinet V, Saucier D, Wehweck FS, Li J, Stein S, Gerke JS, Baldauf MC, Musa J, Dallmayer M, Romero-Pérez L, Hölting TLB, Amatruda JF, Cossarizza A, Henssen AG, Kirchner T, Moretti M, Cidre-Aranaz F, Sannino G, Grünewald TGP.
      Ewing sarcoma (EwS) is an aggressive childhood cancer likely originating from mesenchymal stem cells or osteo-chondrogenic progenitors. It is characterized by fusion oncoproteins involving EWSR1 and variable members of the ETS-family of transcription factors (in 85% FLI1). EWSR1-FLI1 can induce target genes by using GGAA-microsatellites as enhancers. Here, we show that EWSR1-FLI1 hijacks the developmental transcription factor SOX6 - a physiological driver of proliferation of osteo-chondrogenic progenitors - by binding to an intronic GGAA-microsatellite, which promotes EwS growth in vitro and in vivo. Through integration of transcriptome-profiling, published drug-screening data, and functional in vitro and in vivo experiments including 3D and PDX models, we discover that constitutively high SOX6 expression promotes elevated levels of oxidative stress that create a therapeutic vulnerability toward the oxidative stress-inducing drug Elesclomol. Collectively, our results exemplify how aberrant activation of a developmental transcription factor by a dominant oncogene can promote malignancy, but provide opportunities for targeted therapy.
  17. Bioinformatics. 2020 May 12. pii: btaa492. [Epub ahead of print]
    Shen Z, Zou Q.
      MOTIVATION: Methylation and transcription factors (TFs) are part of the mechanisms regulating gene expression. However, the numerous mechanisms regulating the interactions between methylation and TFs remain unknown. We employ machine learning techniques to discover the characteristics of transcription factors that bind to methylation sites.
    RESULTS: The classical machine learning analysis process focuses on improving the performance of the analysis method. Conversely, we focus on the functional properties of the TF sequences. We obtain the principal properties of TFs, namely, the basic polar and hydrophobic Ile amino acids affecting the interaction between TFs and methylated DNA. The recall of the positive instances is 0.878 when their basic polar value is greater than 0.1743. Both basic polar and hydrophobic Ile amino acids distinguish 74% of TFs bound to methylation sites. Therefore, we infer that basic polar amino acids affect the interactions of TFs with methylation sites. Based on our results, the role of the hydrophobic Ile residue is consistent with that described in previous studies, and the basic polar amino acids may also be a key factor modulating the interactions between TFs and methylation.
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online and∼shenzijie/
    Keywords:  DNA binding domain; hydrophobicity; machine learning; methylation; transcription factor
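      The recall figure quoted in entry 17 can be illustrated with a small sketch. The abstract does not define how the "basic polar value" is computed, so the function below assumes, purely for illustration, that it is the fraction of Lys, Arg and His residues in a sequence. The peptide strings and labels are invented toy data, not results from the paper; only the 0.1743 threshold comes from the abstract.

```python
# Toy illustration of "recall of positive instances above a feature threshold".
# ASSUMPTION: "basic polar value" = fraction of K/R/H residues; the paper may
# define it differently. All sequences and labels below are made up.

def basic_polar_fraction(seq, basic="KRH"):
    """Fraction of residues in `seq` that are basic (K, R or H)."""
    return sum(1 for aa in seq if aa in basic) / len(seq)

def recall_at_threshold(values, labels, threshold):
    """Recall of the rule 'predict positive when value > threshold'."""
    positives = sum(1 for y in labels if y)
    true_pos = sum(1 for v, y in zip(values, labels) if y and v > threshold)
    return true_pos / positives if positives else float("nan")

# Hypothetical peptides; True marks a (pretend) methylation-site binder.
toy_seqs = ["KRKHAAGL", "AKLLGGSS", "RRHKKAAA", "GGSSLLAA"]
toy_labels = [True, True, True, False]

toy_values = [basic_polar_fraction(s) for s in toy_seqs]
toy_recall = recall_at_threshold(toy_values, toy_labels, threshold=0.1743)
print(toy_values, toy_recall)
```

      With these toy inputs two of the three labelled positives exceed the threshold, so the rule recovers two thirds of them; the 0.878 reported in the abstract refers to the authors' own data set.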
  18. Genome Biol. 2020 May 11. 21(1): 114
    Ambrosini G, Vorontsov I, Penzar D, Groux R, Fornes O, Nikolaeva DD, Ballester B, Grau J, Grosse I, Makeev V, Kulakovskiy I, Bucher P.
      BACKGROUND: Positional weight matrix (PWM) is a de facto standard model to describe transcription factor (TF) DNA binding specificities. PWMs inferred from in vivo or in vitro data are stored in many databases and used in a plethora of biological applications. This calls for comprehensive benchmarking of public PWM models with large experimental reference sets.
    RESULTS: Here we report results from all-against-all benchmarking of PWM models for DNA binding sites of human TFs on a large compilation of in vitro (HT-SELEX, PBM) and in vivo (ChIP-seq) binding data. We observe that the best performing PWM for a given TF often belongs to another TF, usually from the same family. Occasionally, binding specificity is correlated with the structural class of the DNA binding domain, indicated by good cross-family performance measures. Benchmarking-based selection of family-representative motifs is more effective than motif clustering-based approaches. Overall, there is good agreement between in vitro and in vivo performance measures. However, for some in vivo experiments, the best performing PWM is assigned to an unrelated TF, indicating a binding mode involving protein-protein cooperativity.
    CONCLUSIONS: In an all-against-all setting, we compute more than 18 million performance measure values for different PWM-experiment combinations and offer these results as a public resource to the research community. The benchmarking protocols are provided via a web interface and as docker images. The methods and results from this study may help others make better use of public TF specificity models, as well as public TF binding data sets.
    Keywords:  Benchmarking; ChIP-seq; HT-SELEX; PBM; PWM; Transcription factor binding sites
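      For readers unfamiliar with the PWM model discussed in entry 18, the following minimal sketch shows how a log-odds PWM is typically built from base counts and used to scan a sequence. It is a generic textbook-style illustration, not the benchmarking protocol of the paper; the count matrix, pseudocount, uniform background and test sequence are all hypothetical choices.

```python
import math

# Hypothetical base counts for a 4-bp motif resembling GATA (one dict per position).
COUNTS = [
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]

def counts_to_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts to log2-odds weights (uniform background assumed)."""
    pwm = []
    for col in counts:
        total = sum(col.values()) + 4 * pseudocount
        pwm.append({b: math.log2(((col[b] + pseudocount) / total) / background)
                    for b in "ACGT"})
    return pwm

def score_window(pwm, window):
    """Sum the position-specific weights of the bases in one window."""
    return sum(col[base] for col, base in zip(pwm, window))

def best_hit(pwm, seq):
    """Return (score, start) of the highest-scoring window on the given strand."""
    width = len(pwm)
    return max((score_window(pwm, seq[i:i + width]), i)
               for i in range(len(seq) - width + 1))

pwm = counts_to_pwm(COUNTS)
print(best_hit(pwm, "CCGATACC"))
```

      A benchmark of the kind described would then compare such scores between bound and unbound sequences for each PWM-experiment pair; the specific performance measures used by the authors are not given in the abstract and are not reproduced here.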
  19. Mol Cell. 2020 May 05. pii: S1097-2765(20)30266-5. [Epub ahead of print]
    Bacon CW, Challa A, Hyder U, Shukla A, Borkar AN, Bayo J, Liu J, Wu SY, Chiang CM, Kutateladze TG, D'Orso I.
      Precise control of the RNA polymerase II (RNA Pol II) cycle, including pausing and pause release, maintains transcriptional homeostasis and organismal functions. Despite previous work to understand individual transcription steps, we reveal a mechanism that integrates RNA Pol II cycle transitions. Surprisingly, KAP1/TRIM28 uses a previously uncharacterized chromatin reader cassette to bind hypo-acetylated histone 4 tails at promoters, guaranteeing continuous progression of RNA Pol II entry to and exit from the pause state. Upon chromatin docking, KAP1 first associates with RNA Pol II and then recruits a pathway-specific transcription factor (SMAD2) in response to cognate ligands, enabling gene-selective CDK9-dependent pause release. This coupling mechanism is exploited by tumor cells to aberrantly sustain transcriptional programs commonly dysregulated in cancer patients. The discovery of a factor integrating transcription steps expands the functional repertoire by which chromatin readers operate and provides mechanistic understanding of transcription regulation, offering alternative therapeutic opportunities to target transcriptional dysregulation.
    Keywords:  CDK9; KAP1; RNA polymerase II; SMAD; TGF-β; TRIM28; cancer; chromatin reader; epigenetics; pausing
  20. Nat Cell Biol. 2020 May 11.
    Borg M, Jacob Y, Susaki D, LeBlanc C, Buendía D, Axelsson E, Kawashima T, Voigt P, Boavida L, Becker J, Higashiyama T, Martienssen R, Berger F.
      Epigenetic marks are reprogrammed in the gametes to reset genomic potential in the next generation. In mammals, paternal chromatin is extensively reprogrammed through the global erasure of DNA methylation and the exchange of histones with protamines (refs. 1,2). Precisely how the paternal epigenome is reprogrammed in flowering plants has remained unclear since DNA is not demethylated and histones are retained in sperm (refs. 3,4). Here, we describe a multi-layered mechanism by which H3K27me3 is globally lost from histone-based sperm chromatin in Arabidopsis. This mechanism involves the silencing of H3K27me3 writers, activity of H3K27me3 erasers and deposition of a sperm-specific histone, H3.10 (ref. 5), which we show is immune to lysine 27 methylation. The loss of H3K27me3 facilitates the transcription of genes essential for spermatogenesis and pre-configures sperm with a chromatin state that forecasts gene expression in the next generation. Thus, plants have evolved a specific mechanism to simultaneously differentiate male gametes and reprogram the paternal epigenome.
  21. Clin Epigenetics. 2020 May 11. 12(1): 64
    Ivanova E, Canovas S, Garcia-Martínez S, Romar R, Lopes JS, Rizos D, Sanchez-Calabuig MJ, Krueger F, Andrews S, Perez-Sanz F, Kelsey G, Coy P.
      Preimplantation embryos experience profound resetting of epigenetic information inherited from the gametes. Genome-wide analysis at single-base resolution has shown similarities but also species differences between human and mouse preimplantation embryos in DNA methylation patterns and reprogramming. Here, we have extended such analysis to two key livestock species, the pig and the cow. We generated genome-wide DNA methylation and whole-transcriptome datasets from gametes to blastocysts in both species. In oocytes from both species, a distinctive bimodal methylation landscape is present, with hypermethylated domains prevalent over hypomethylated domains, similar to human, while in the mouse the proportions are reversed. An oocyte-like pattern of methylation persists in the cleavage stages, albeit with some reduction in methylation level, persisting to blastocysts in cow, while pig blastocysts have a highly hypomethylated landscape. In the pig, there was evidence of transient de novo methylation at the 8-16 cell stages of domains unmethylated in oocytes, revealing a complex dynamic of methylation reprogramming. The methylation datasets were used to identify germline differentially methylated regions (gDMRs) of known imprinted genes and as the basis for detection of novel imprinted loci. Strikingly in the pig, we detected a consistent reduction in gDMR methylation at the 8-16 cell stages, followed by recovery to the blastocyst stage, suggesting an active period of imprint stabilization in preimplantation embryos. Transcriptome analysis revealed absence of expression in oocytes of both species of ZFP57, a key factor in the mouse for gDMR methylation maintenance, but presence of the alternative imprint regulator ZNF445. In conclusion, our study reveals species differences in DNA methylation reprogramming and suggests that porcine or bovine models may be closer to human in key aspects than the mouse model.
    Keywords:  DNA methylation; Embryo; Epigenetic; Imprinting
  22. Cell Stem Cell. 2020 May 10. pii: S1934-5909(20)30148-X. [Epub ahead of print]
    Shen C, Sheng Y, Zhu AC, Robinson S, Jiang X, Dong L, Chen H, Su R, Yin Z, Li W, Deng X, Chen Y, Hu YC, Weng H, Huang H, Prince E, Cogle CR, Sun M, Zhang B, Chen CW, Marcucci G, He C, Qian Z, Chen J.
      N6-methyladenosine (m6A), the most abundant internal modification in mRNA, has been implicated in tumorigenesis. As an m6A demethylase, ALKBH5 has been shown to promote the development of breast cancer and brain tumors. However, in acute myeloid leukemia (AML), ALKBH5 was reported to be frequently deleted, implying a tumor-suppressor role. Here, we show that ALKBH5 deletion is rare in human AML; instead, ALKBH5 is aberrantly overexpressed in AML. Moreover, its increased expression correlates with poor prognosis in AML patients. We demonstrate that ALKBH5 is required for the development and maintenance of AML and self-renewal of leukemia stem/initiating cells (LSCs/LICs) but not essential for normal hematopoiesis. Mechanistically, ALKBH5 exerts tumor-promoting effects in AML by post-transcriptional regulation of its critical targets such as TACC3, a prognosis-associated oncogene in various cancers. Collectively, our findings reveal crucial functions of ALKBH5 in leukemogenesis and LSC/LIC self-renewal/maintenance and highlight the therapeutic potential of targeting the ALKBH5/m6A axis.
    Keywords:  ALKBH5; MYC; P21; TACC3; acute myeloid leukemia; hematopoietic stem cells (HSCs); leukemia stem cells (LSCs/LICs); m(6)A modification; normal hematopoiesis; prognosis
  23. Epigenetics. 2020 May 13. 1-15
    Uh K, Ryu J, Farrell K, Wax N, Lee K.
      The ten-eleven translocation (TET) family (TET1/2/3) initiates conversion of 5-methylcytosine to 5-hydroxymethylcytosine, thereby orchestrating the DNA demethylation process and changes in epigenetic marks during early embryogenesis. In this study, CRISPR/Cas9 technology and a TET-specific inhibitor were applied to elucidate the role of the TET family in regulating pluripotency in preimplantation embryos using porcine embryos as a model. Disruption of TET1 unexpectedly resulted in the upregulation of NANOG and ESRRB transcripts, although there was no change to the level of DNA methylation in the promoter of NANOG. Surprisingly, a threefold increase in the transcript level of TET3 was observed in blastocysts carrying modified TET1, which may explain the upregulation of NANOG and ESRRB. When the activity of TET enzymes was inhibited by treatment with dimethyloxalylglycine (DMOG), a dioxygenase inhibitor, to investigate the role of TET1 while eliminating the potential compensatory activation of TET3, reduced levels of pluripotency genes including NANOG and ESRRB and an increased level of DNA methylation in the NANOG promoter were detected. Blastocysts treated with DMOG also presented a lower inner cell mass/TE ratio, implying the involvement of the TET family in lineage specification in blastocysts. Our results indicate that the TET family modulates proper expression of NANOG, a key pluripotency marker, by controlling its DNA methylation profile in the promoter during embryogenesis. This study suggests that the TET family is a critical component in the pluripotency network of porcine embryos by regulating gene expression involved in pluripotency and early lineage specification.
    Keywords:  DNA methylation; TET; embryo; methylcytosine dioxygenase; pluripotency; porcine
  24. Diabetes. 2020 May 13. pii: db190906. [Epub ahead of print]
    Wang Y, Sun J, Lin Z, Zhang W, Wang S, Wang W, Wang Q, Ning G.
      m6A RNA modification is essential during embryonic development of various organs; however, its role in embryonic and early postnatal islet development remains unknown. Mice in which RNA methyltransferase-like 3/14 (Mettl3/14) were deleted in Ngn3+ endocrine progenitors (Mettl3/14 nKO) developed hyperglycemia and hypo-insulinemia at 2 weeks after birth. We found that Mettl3/14 specifically regulated both functional maturation and mass expansion of neonatal β cells before weaning. Transcriptome and m6A methylome analyses provided m6A-dependent mechanisms in regulating cell identity, insulin secretion and proliferation in neonatal β cells. Importantly, we found that Mettl3/14 were dispensable for β cell differentiation, but directly regulated expression of the essential transcription factor MafA, at least partially via modulating its mRNA stability, and failure to maintain this modification impacted the ability to fulfill β cell functional maturity. In both diabetic db/db mice and type 2 diabetes patients, decreased Mettl3/14 expression in β cells was observed, suggesting its possible role in type 2 diabetes. Our study unraveled the essential role of Mettl3/14 in neonatal β cell development and functional maturation, both of which determined functional β cell mass and glycemic control in adulthood.
  25. Nucleic Acids Res. 2020 May 14. pii: gkaa384. [Epub ahead of print]
    Ni W, Perez AA, Schreiner S, Nicolet CM, Farnham PJ.
      Our study focuses on a family of ubiquitously expressed human C2H2 zinc finger proteins comprised of ZFX, ZFY and ZNF711. Although their protein structure suggests that ZFX, ZFY and ZNF711 are transcriptional regulators, the mechanisms by which they influence transcription have not yet been elucidated. We used CRISPR-mediated deletion to create bi-allelic knockouts of ZFX and/or ZNF711 in female HEK293T cells (which naturally lack ZFY). We found that loss of either ZFX or ZNF711 reduced cell growth and that the double knockout cells have major defects in proliferation. RNA-seq analysis revealed that thousands of genes showed altered expression in the double knockout clones, suggesting that these TFs are critical regulators of the transcriptome. To gain insight into how these TFs regulate transcription, we created mutant ZFX proteins and analyzed them for DNA binding and transactivation capability. We found that zinc fingers 11-13 are necessary and sufficient for DNA binding and, in combination with the N-terminal region, constitute a functional transactivator. Our functional analyses of the ZFX family provide important new insights into transcriptional regulation in human cells by members of the large, but under-studied family of C2H2 zinc finger proteins.
  26. Cell Rep. 2020 May 12. pii: S2211-1247(20)30578-7. [Epub ahead of print]31(6): 107625
    Brunton H, Caligiuri G, Cunningham R, Upstill-Goddard R, Bailey UM, Garner IM, Nourse C, Dreyer S, Jones M, Moran-Jones K, Wright DW, Paulus-Hock V, Nixon C, Thomson G, Jamieson NB, McGregor GA, Evers L, McKay CJ, Gulati A, Brough R, Bajrami I, Pettitt SJ, Dziubinski ML, Barry ST, Grützmann R, Brown R, Curry E, Pajic M, Musgrove EA, Petersen GM, Shanks E, Ashworth A, Crawford HC, Simeone DM, Froeling FEM, Lord CJ, Mukhopadhyay D, Pilarsky C, Grimmond SE, Morton JP, Sansom OJ, Chang DK, Bailey PJ, Biankin AV.
      Pancreatic ductal adenocarcinoma (PDAC) can be divided into transcriptomic subtypes with two broad lineages referred to as classical (pancreatic) and squamous. We find that these two subtypes are driven by distinct metabolic phenotypes. Loss of genes that drive endodermal lineage specification, HNF4A and GATA6, switch metabolic profiles from classical (pancreatic) to predominantly squamous, with glycogen synthase kinase 3 beta (GSK3β) a key regulator of glycolysis. Pharmacological inhibition of GSK3β results in selective sensitivity in the squamous subtype; however, a subset of these squamous patient-derived cell lines (PDCLs) acquires rapid drug tolerance. Using chromatin accessibility maps, we demonstrate that the squamous subtype can be further classified using chromatin accessibility to predict responsiveness and tolerance to GSK3β inhibitors. Our findings demonstrate that distinct patterns of chromatin accessibility can be used to identify patient subgroups that are indistinguishable by gene expression profiles, highlighting the utility of chromatin-based biomarkers for patient selection in the treatment of PDAC.
    Keywords:  GATA6; GSK3B; HNF4A; PDAC subtypes; chromatin landscapes; intronic and distal promoters; metabolic targeting; therapeutic tolerance
  27. Cell Stem Cell. 2020 May 08. pii: S1934-5909(20)30140-5. [Epub ahead of print]
    Wang J, Li Y, Wang P, Han G, Zhang T, Chang J, Yin R, Shan Y, Wen J, Xie X, Feng M, Wang Q, Hu J, Cheng Y, Zhang T, Li Y, Gao Z, Guo C, Wang J, Liang J, Cui M, Gao K, Chai J, Liu W, Cheng H, Li L, Zhou F, Liu L, Luo Y, Li S, Zhang H.
      N6-methyladenosine (m6A) is a commonly present modification of mammalian mRNAs and plays key roles in various cellular processes. m6A modifiers catalyze this reversible modification. However, the underlying mechanisms by which these m6A modifiers are regulated remain elusive. Here we show that expression of the m6A demethylase ALKBH5 is regulated by chromatin state alteration during leukemogenesis of human acute myeloid leukemia (AML), and ALKBH5 is required for maintaining leukemia stem cell (LSC) function but is dispensable for normal hematopoiesis. Mechanistically, KDM4C regulates ALKBH5 expression by increasing chromatin accessibility of the ALKBH5 locus, reducing H3K9me3 levels and promoting recruitment of MYB and Pol II. Moreover, ALKBH5 affects mRNA stability of the receptor tyrosine kinase AXL in an m6A-dependent way. Thus, our findings link chromatin state dynamics with expression regulation of m6A modifiers and uncover a selective and critical role of ALKBH5 in AML that might serve as a therapeutic target for specifically targeting LSCs.
    Keywords:  ALKBH5; N(6)-methyladenosine; acute myeloid leukemia; chromatin accessibility; hematopoiesis; leukemia stem cell
  28. Genome Biol. 2020 May 12. 21(1): 116
    Li B, Li Y, Li K, Zhu L, Yu Q, Cai P, Fang J, Zhang W, Du P, Jiang C, Lin J, Qu K.
      The development of sequencing technologies has promoted the survey of genome-wide chromatin accessibility at single-cell resolution. However, comprehensive analysis of single-cell epigenomic profiles remains a challenge. Here, we introduce an accessibility pattern-based epigenomic clustering (APEC) method, which classifies each cell by groups of accessible regions with synergistic signal patterns termed "accessons". This python-based package greatly improves the accuracy of unsupervised single-cell clustering for many public datasets. It also predicts gene expression, identifies enriched motifs, discovers super-enhancers, and projects pseudotime trajectories. APEC is available at
    Keywords:  Accesson; Cell clustering; Pseudotime trajectory; Regulome; scATAC-seq
  29. EMBO J. 2020 May 12. e103697
    Liu Z, Tardat M, Gill ME, Royo H, Thierry R, Ozonov EA, Peters AH.
      Chromatin integrity is essential for cellular homeostasis. Polycomb group proteins modulate chromatin states and transcriptionally repress developmental genes to maintain cell identity. They also repress repetitive sequences such as major satellites and constitute an alternative state of pericentromeric constitutive heterochromatin at paternal chromosomes (pat-PCH) in mouse pre-implantation embryos. Remarkably, pat-PCH contains the histone H3.3 variant, which is absent from canonical PCH at maternal chromosomes, which is marked by histone H3 lysine 9 trimethylation (H3K9me3), HP1, and ATRX proteins. Here, we show that SUMO2-modified CBX2-containing Polycomb Repressive Complex 1 (PRC1) recruits the H3.3-specific chaperone DAXX to pat-PCH, enabling H3.3 incorporation at these loci. Deficiency of Daxx or PRC1 components Ring1 and Rnf2 abrogates H3.3 incorporation, induces chromatin decompaction and breakage at PCH of exclusively paternal chromosomes, and causes their mis-segregation. Complementation assays show that DAXX-mediated H3.3 deposition is required for chromosome stability in early embryos. DAXX also regulates repression of PRC1 target genes during oogenesis and early embryogenesis. The study identifies a novel critical role for Polycomb in ensuring heterochromatin integrity and chromosome stability in mouse early development.
    Keywords:  PRC1; SUMOylation; chromosome stability; constitutive heterochromatin; histone variant
  30. Nat Commun. 2020 May 13. 11(1): 2379
    Liu T, Mi L, Xiong J, Orchard P, Yu Q, Yu L, Zhao XY, Meng ZX, Parker SCJ, Lin JD, Li S.
      Brown and beige fat share a remarkably similar transcriptional program that supports fuel oxidation and thermogenesis. The chromatin-remodeling machinery that governs genome accessibility and renders adipocytes poised for thermogenic activation remains elusive. Here we show that BAF60a, a subunit of the SWI/SNF chromatin-remodeling complexes, serves an indispensable role in cold-induced thermogenesis in brown fat. BAF60a maintains chromatin accessibility at PPARγ and EBF2 binding sites for key thermogenic genes. Surprisingly, fat-specific BAF60a inactivation triggers more pronounced cold-induced browning of inguinal white adipose tissue that is linked to induction of MC2R, a receptor for the pituitary hormone ACTH. Elevated MC2R expression sensitizes adipocytes and BAF60a-deficient adipose tissue to thermogenic activation in response to ACTH stimulation. These observations reveal an unexpected dichotomous role of BAF60a-mediated chromatin remodeling in transcriptional control of brown and beige gene programs and illustrate a pituitary-adipose signaling axis in the control of thermogenesis.
  31. Sci Rep. 2020 May 14. 10(1): 7960
    Chen X, Gu J, Neuwald AF, Hilakivi-Clarke L, Clarke R, Xuan J.
      Genome-wide transcription factor (TF) binding signal analyses reveal co-localization of TF binding sites based on inferred cis-regulatory modules (CRMs). CRMs play a key role in understanding the cooperation of multiple TFs under specific conditions. However, the functions of CRMs and their effects on nearby gene transcription are highly dynamic and context-specific and therefore are challenging to characterize. BICORN (Bayesian Inference of COoperative Regulatory Network) builds a hierarchical Bayesian model and infers context-specific CRMs based on TF-gene binding events and gene expression data for a particular cell type. BICORN automatically searches for a list of candidate CRMs based on the input TF bindings at regulatory regions associated with genes of interest. Applying Gibbs sampling, BICORN iteratively estimates model parameters of CRMs, TF activities, and corresponding regulation on gene transcription, which it models as a sparse network of functional CRMs regulating target genes. The BICORN package is implemented in R (version 3.4 or later) and is publicly available on the CRAN server at
  32. Cancer Res. 2020 May 14. pii: canres.2415.2019. [Epub ahead of print]
    Hoxha S, Shepard A, Troutman S, Diao H, Doherty JR, Janiszewska M, Witwicki RM, Pipkin ME, Ja WW, Kareta MS, Kissil JL.
      The Hippo pathway regulates cell proliferation and organ size through control of the transcriptional regulators YAP (yes-associated protein) and TAZ. Upon extracellular stimuli such as cell-cell contact, the pathway negatively regulates YAP through cytoplasmic sequestration. Under conditions of low cell density, YAP is nuclear and associates with enhancer regions and gene promoters. YAP is mainly described as a transcriptional activator of genes involved in cell proliferation and survival. Using a genome-wide approach, we show here that, in addition to its known function as a transcriptional activator, YAP functions as a transcriptional repressor by interacting with the multifunctional transcription factor Yin Yang 1 (YY1) and the Polycomb repressive complex 2 (PRC2) member EZH2. YAP co-localized with YY1 and EZH2 on the genome to transcriptionally repress a broad network of genes mediating a host of cellular functions, including repression of the cell-cycle kinase inhibitor p27, whose role is to functionally promote contact inhibition. This work unveils a broad and underappreciated aspect of YAP nuclear function as a transcriptional repressor and highlights how loss of contact inhibition in cancer is mediated in part through YAP repressive function.
  33. Nat Med. 2020 May;26(5): 792-802
    Slyper M, Porter CBM, Ashenberg O, Waldman J, Drokhlyansky E, Wakiro I, Smillie C, Smith-Rosario G, Wu J, Dionne D, Vigneau S, Jané-Valbuena J, Tickle TL, Napolitano S, Su MJ, Patel AG, Karlstrom A, Gritsch S, Nomura M, Waghray A, Gohil SH, Tsankov AM, Jerby-Arnon L, Cohen O, Klughammer J, Rosen Y, Gould J, Nguyen L, Hofree M, Tramontozzi PJ, Li B, Wu CJ, Izar B, Haq R, Hodi FS, Yoon CH, Hata AN, Baker SJ, Suvà ML, Bueno R, Stover EH, Clay MR, Dyer MA, Collins NB, Matulonis UA, Wagle N, Johnson BE, Rotem A, Rozenblatt-Rosen O, Regev A.
      Single-cell genomics is essential to chart tumor ecosystems. Although single-cell RNA-Seq (scRNA-Seq) profiles RNA from cells dissociated from fresh tumors, single-nucleus RNA-Seq (snRNA-Seq) is needed to profile frozen or hard-to-dissociate tumors. Each requires customization to different tissue and tumor types, posing a barrier to adoption. Here, we have developed a systematic toolbox for profiling fresh and frozen clinical tumor samples using scRNA-Seq and snRNA-Seq, respectively. We analyzed 216,490 cells and nuclei from 40 samples across 23 specimens spanning eight tumor types of varying tissue and sample characteristics. We evaluated protocols by cell and nucleus quality, recovery rate and cellular composition. scRNA-Seq and snRNA-Seq from matched samples recovered the same cell types, but at different proportions. Our work provides guidance for studies in a broad range of tumors, including criteria for testing and selecting methods from the toolbox for other tumors, thus paving the way for charting tumor atlases.
  34. Nat Commun. 2020 May 13. 11(1): 2380
    Eder N, Roncaroli F, Dolmart MC, Horswell S, Andreiuolo F, Flynn HR, Lopes AT, Claxton S, Kilday JP, Collinson L, Mao JH, Pietsch T, Thompson B, Snijders AP, Ultanir SK.
      YAP1 gene fusions have been observed in a subset of paediatric ependymomas. Here we show that ectopic expression of active nuclear YAP1 (nlsYAP5SA) in ventricular zone neural progenitor cells using conditionally-induced NEX/NeuroD6-Cre is sufficient to drive brain tumour formation in mice. Neuronal differentiation is inhibited in the hippocampus. Deletion of YAP1's negative regulators, the LATS1 and LATS2 kinases, in the NEX-Cre lineage in double conditional knockout mice also generates similar tumours, which are rescued by deletion of YAP1 and its paralog TAZ. YAP1/TAZ-induced mouse tumours display molecular and ultrastructural characteristics of human ependymoma. RNA sequencing and quantitative proteomics of mouse tumours demonstrate similarities to YAP1-fusion induced supratentorial ependymoma. Finally, we find that transcriptional cofactor HOPX is upregulated in mouse models and in human YAP1-fusion induced ependymoma, supporting their similarity. Our results show that uncontrolled YAP1/TAZ activity in neuronal precursor cells leads to ependymoma-like tumours in mice.
  35. Nucleic Acids Res. 2020 May 11. pii: gkaa349. [Epub ahead of print]
    Erbe R, Kessler MD, Favorov AV, Easwaran H, Gaykalova DA, Fertig EJ.
      While the methods available for single-cell ATAC-seq analysis are well optimized for clustering cell types, the question of how to integrate multiple scATAC-seq data sets and/or sequencing modalities is still open. We present an analysis framework that enables such integration across scATAC-seq data sets by applying the CoGAPS Matrix Factorization algorithm and the projectR transfer learning program to identify common regulatory patterns across scATAC-seq data sets. We additionally integrate our analysis with scRNA-seq data to identify orthogonal evidence for transcriptional regulators predicted by scATAC-seq analysis. Using publicly available scATAC-seq data, we find patterns that accurately characterize cell types both within and across data sets. Furthermore, we demonstrate that these patterns are both consistent with current biological understanding and reflective of novel regulatory biology.
  36. Sci Rep. 2020 May 13. 10(1): 7933
    Oh D, Strattan JS, Hur JK, Bento J, Urban AE, Song G, Cherry JM.
      ChIP-seq is one of the core experimental resources available to understand genome-wide epigenetic interactions and identify the functional elements associated with diseases. The analysis of ChIP-seq data is important but poses a difficult computational challenge, due to the presence of irregular noise and bias on various levels. Although many peak-calling methods have been developed, the current computational tools still require, in some cases, human manual inspection using data visualization. However, the huge volumes of ChIP-seq data make it almost impossible for human researchers to manually uncover all the peaks. Recently developed convolutional neural networks (CNN), which are capable of achieving human-like classification accuracy, can be applied to this challenging problem. In this study, we design a novel supervised learning approach for identifying ChIP-seq peaks using CNNs, and integrate it into a software pipeline called CNN-Peaks. We use data labeled by human researchers who annotate the presence or absence of peaks in some genomic segments, as training data for our model. The trained model is then applied to predict peaks in previously unseen genomic segments from multiple ChIP-seq datasets including benchmark datasets commonly used for validation of peak calling methods. We observe a performance superior to that of previous methods.
  37. Elife. 2020 May 11. pii: e52563. [Epub ahead of print]9
    Sollberger G, Streeck R, Apel F, Caffrey BE, Skoultchi AI, Zychlinsky A.
      Neutrophils are important innate immune cells that tackle invading pathogens with different effector mechanisms. They acquire this antimicrobial potential during their maturation in the bone marrow, where they differentiate from hematopoietic stem cells in a process called granulopoiesis. Mature neutrophils are terminally differentiated and short-lived with a high turnover rate. Here, we show a critical role for linker histone H1 on the differentiation and function of neutrophils using a genome-wide CRISPR/Cas9 screen in the human cell line PLB-985. We systematically disrupted expression of somatic H1 subtypes to show that individual H1 subtypes affect PLB-985 maturation in opposite ways. Loss of H1.2 and H1.4 induced an eosinophil-like transcriptional program, thereby negatively regulating the differentiation into the neutrophil lineage. Importantly, H1 subtypes also affect neutrophil differentiation and the eosinophil-directed bias of murine bone marrow stem cells, demonstrating an unexpected subtype-specific role for H1 in granulopoiesis.
    Keywords:  developmental biology; human; immunology; inflammation; mouse
  38. Biophys J. 2020 May 05. pii: S0006-3495(20)30166-1. [Epub ahead of print]118(9): 2193-2208
    Kumari K, Duenweg B, Padinhateeri R, Prakash JR.
      The three-dimensional (3D) organization of chromatin, on the length scale of a few genes, is crucial in determining the functional state (accessibility and amount of gene expression) of the chromatin. Recent advances in chromosome conformation capture experiments provide partial information on the chromatin organization in a cell population, namely the contact count between any segment pairs, but not on the interaction strength that leads to these contact counts. However, given the contact matrix, determining the complete 3D organization of the whole chromatin polymer is an inverse problem. In this work, a novel inverse Brownian dynamics method based on a coarse-grained bead-spring chain model has been proposed to compute the optimal interaction strengths between different segments of chromatin such that the experimentally measured contact count probability constraints are satisfied. Applying this method to the α-globin gene locus in two different cell types, we predict the 3D organizations corresponding to active and repressed states of chromatin at the locus. We show that the average distance between any two segments of the region has a broad distribution and cannot be computed as a simple inverse relation based on the contact probability alone. The results presented for multiple normalization methods suggest that all measurable quantities may crucially depend on the nature of normalization. We argue that by experimentally measuring predicted quantities, one may infer the appropriate form of normalization.
  39. BMC Bioinformatics. 2020 May 11. 21(1): 184
    Hu Y, Xi X, Yang Q, Zhang X.
      BACKGROUND: With the rapid development of single-cell genomics, technologies for parallel sequencing of the transcriptome and genome in each single cell are being explored in several labs and are becoming available. This brings us the opportunity to uncover associations between genotypes and gene expression phenotypes at the single-cell level by eQTL analysis on single-cell data. A new method is needed for such tasks due to the special characteristics of single-cell sequencing data.
    RESULTS: We developed an R package, SCeQTL, that uses zero-inflated negative binomial regression to do eQTL analysis on single-cell data. It can distinguish two types of gene-expression differences among different genotype groups. It can also be used for finding gene expression variations associated with other grouping factors like cell lineages or cell types.
    CONCLUSIONS: The SCeQTL method is capable of eQTL analysis on single-cell data as well as detecting associations of gene expression with other grouping factors. The R package of the method is available at
    Keywords:  Multi-class differential expression analysis; Single-cell eQTL; Single-cell gene regulation; Zero-inflated negative binomial regression
  40. BMC Bioinformatics. 2020 May 11. 21(1): 181
    Choi J, Chae H.
      BACKGROUND: Recently, DNA methylation has drawn great attention due to its strong correlation with abnormal gene activities and informative representation of the cancer status. As a number of studies focus on DNA methylation signatures in cancer, demand for publicly available methylome datasets has increased. To satisfy this, large-scale projects were launched to discover biological insights into cancer, providing collections of datasets. However, public cancer data, especially for certain cancer types, remains too limited for research use. Several simulation tools for producing epigenetic datasets have been introduced to alleviate the issue; to date, however, generation of datasets for a user-specified cancer type has not been proposed.
    RESULTS: In this paper, we present methCancer-gen, a tool for generating DNA methylome datasets conditioned on cancer type. Employing a conditional variational autoencoder, a neural network-based generative model, it estimates the conditional distribution with latent variables and data, and generates samples for a specified cancer type.
    CONCLUSIONS: To evaluate the simulation performance of methCancer-gen for the user-specified cancer type, our proposed model was compared to a benchmark method; it successfully reproduced cancer type-wise data with high accuracy, helping to alleviate the lack of condition-specific data. methCancer-gen is publicly available at
    Keywords:  Cancer; Conditional variational autoencoder; DNA methylation; Generator; Simulator
  41. Commun Biol. 2020 May 12. 3(1): 235
    Golicz AA, Bhalla PL, Edwards D, Singh MB.
      Genomes of many eukaryotic species have a defined three-dimensional architecture critical for cellular processes. They are partitioned into topologically associated domains (TADs), defined as regions of high chromatin inter-connectivity. While TADs are not a prominent feature of A. thaliana genome organization, they have been reported for other plants including rice, maize, tomato and cotton and for which TAD formation appears to be linked to transcription and chromatin epigenetic status. Here we show that in the rice genome, sequence variation and meiotic recombination rate correlate with the 3D genome structure. TADs display increased SNP and SV density and higher recombination rate compared to inter-TAD regions. We associate the observed differences with the TAD epigenetic landscape, TE composition and an increased incidence of meiotic crossovers.
  42. Plant Cell. 2020 May 14. pii: tpc.00155.2020. [Epub ahead of print]
    Jores T, Tonnies J, Dorrity MW, Cuperus J, Fields S, Queitsch C.
      Genetic engineering of cis-regulatory elements in crop plants is a promising strategy to ensure food security. However, such engineering is currently hindered by our limited knowledge of plant cis-regulatory elements. Here, we adapted STARR-seq, a technology for the high-throughput identification of enhancers, for use in transiently transformed tobacco (Nicotiana benthamiana) leaves. We demonstrate that the optimal placement in the reporter construct of enhancer sequences from a plant virus, pea (Pisum sativum) and wheat (Triticum aestivum) was just upstream of a minimal promoter, and that none of these four known enhancers was active in the 3'-UTR of the reporter gene. The optimized assay sensitively identified small DNA regions containing each of the four enhancers, including two whose activity was stimulated by light. Furthermore, we coupled the assay to saturation mutagenesis to pinpoint functional regions within an enhancer, which we recombined to create synthetic enhancers. Our results describe an approach to define enhancer properties that can be performed in potentially any plant species or tissue transformable by Agrobacterium and that can use regulatory DNA derived from any plant genome.
  43. Cell Rep. 2020 May 12. pii: S2211-1247(20)30584-2. [Epub ahead of print]31(6): 107631
    Ghosh A, Syed SM, Kumar M, Carpenter TJ, Teixeira JM, Houairia N, Negi S, Tanwar PS.
      The mesenchymal to epithelial transition (MET) is thought to be involved in the maintenance, repair, and carcinogenesis of the fallopian tube (oviduct) and uterine epithelium. However, conclusive evidence for the conversion of mesenchymal cells to epithelial cells in these organs is lacking. Using embryonal cell lineage tracing with reporters driven by mesenchymal cell marker genes of the female reproductive tract (AMHR2, CSPG4, and PDGFRβ), we show that these reporters are also expressed by some oviductal and uterine epithelial cells at birth. These mesenchymal reporter-positive epithelial cells are maintained in adult mice across multiple pregnancies, respond to ovarian hormones, and form organoids. However, no labeled epithelial cells are present in any oviductal or uterine epithelia when mesenchymal cell labeling was induced in adult mice. Organoids developed from mice labeled in adulthood were also negative for mesenchymal reporters. Collectively, our work found no definitive evidence of MET in the adult fallopian tube and uterine epithelium.
    Keywords:  FRT; endometrial cancer; endometrium; fertility; fertilization; ovarian cancer; regeneration; repair
  44. Cell Rep. 2020 May 12. pii: S2211-1247(20)30582-9. [Epub ahead of print]31(6): 107629
    Yildirim O, Izgu EC, Damle M, Chalei V, Ji F, Sadreyev RI, Szostak JW, Kingston RE.
      Many proteins that are needed for progression through S-phase are produced from transcripts that peak in the S-phase, linking temporal expression of those proteins to the time that they are required in cell cycle. Here, we explore the potential roles of long non-coding RNAs in cell cycle progression. We use a sensitive click-chemistry approach to isolate nascent RNAs in a human cell line, and we identify more than 900 long non-coding RNAs (lncRNAs) whose synthesis peaks during the S-phase. More than 200 of these are long intergenic non-coding RNAs (lincRNAs) with S-phase-specific expression. We characterize three of these lincRNAs by knockdown and find that all three lincRNAs are required for appropriate S-phase progression. We infer that non-coding RNAs are key regulatory effectors during the cell cycle, acting on distinct regulatory networks, and herein, we provide a large catalog of candidate cell-cycle regulatory RNAs.
    Keywords:  S-phase; cell cycle; click chemistry; lincRNA; metabolic labeling; nascent RNA; non-coding RNA