bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2021‒11‒14
twenty-nine papers selected by
Connor Rogerson
University of Cambridge, MRC Cancer Unit


  1. Nucleic Acids Res. 2021 Nov 10. pii: gkab1039. [Epub ahead of print]
      The human genome contains ∼2000 transcriptional regulatory proteins, including ∼1600 DNA-binding transcription factors (TFs) recognizing characteristic sequence motifs to exert regulatory effects on gene expression. The binding specificities of these factors have been profiled both in vitro, using techniques such as HT-SELEX, and in vivo, using techniques including ChIP-seq. We previously developed Factorbook, a TF-centric database of annotations, motifs, and integrative analyses based on ChIP-seq data from Phase II of the ENCODE Project. Here we present an update to Factorbook which significantly expands the breadth of cell type and TF coverage. The update includes an expanded motif catalog derived from thousands of ENCODE Phase II and III ChIP-seq experiments and HT-SELEX experiments; this motif catalog is integrated with the ENCODE registry of candidate cis-regulatory elements to annotate a comprehensive collection of genome-wide candidate TF binding sites. The database also offers novel tools for applying the motif models within machine learning frameworks and using these models for integrative analysis, including annotation of variants and disease and trait heritability. Factorbook is publicly available at www.factorbook.org; we will continue to expand the resource as ENCODE Phase IV data are released.
    DOI:  https://doi.org/10.1093/nar/gkab1039
  2. Proc Natl Acad Sci U S A. 2021 Nov 16. pii: e2113579118. [Epub ahead of print]118(46):
      Using a tamoxifen-inducible time-course ChIP-sequencing (ChIP-seq) approach, we show that the ubiquitous transcription factor SP1 has different binding dynamics at its target sites in the human genome. SP1 very rapidly reaches maximal binding levels at some sites, but binding kinetics at other sites is biphasic, with rapid half-maximal binding followed by a considerably slower increase to maximal binding. While ∼70% of SP1 binding sites are located at promoter regions, loci with slow SP1 binding kinetics are enriched in enhancer and Polycomb-repressed regions. Unexpectedly, SP1 sites with fast binding kinetics tend to have higher quality and more copies of the SP1 sequence motif. Different cobinding factors associate near SP1 binding sites depending on their binding kinetics and on their location at promoters or enhancers. For example, NFY and FOS are preferentially associated near promoter-bound SP1 sites with fast binding kinetics, whereas DNA motifs of ETS and homeodomain proteins are preferentially observed at sites with slow binding kinetics. At promoters but not enhancers, proteins involved in sumoylation and PML bodies associate more strongly with slow SP1 binding sites than with the fast binding sites. The speed of SP1 binding is not associated with nucleosome occupancy, and it is not necessarily coupled to higher transcriptional activity. These results with SP1 are in contrast to those of human TBP, indicating that there is no common mechanism affecting transcription factor binding kinetics. The biphasic kinetics at some SP1 target sites suggest the existence of distinct chromatin states at these loci in different cells within the overall population.
    Keywords:  DNA binding kinetics; DNA binding protein; chromatin; gene regulation; transcription factor
    DOI:  https://doi.org/10.1073/pnas.2113579118
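The biphasic kinetics described in this abstract (rapid half-maximal binding followed by a much slower rise to maximum) can be pictured with a two-component model. The following is only an illustrative sketch, not the authors' analysis: it assumes that a fraction of loci (or cells) binds with a fast first-order rate and the remainder with a slow one. The function names and all parameter values are hypothetical.

```python
import math


def biphasic_occupancy(t, fast_fraction, k_fast, k_slow):
    """Fraction of maximal binding at time t for a two-component model.

    Assumption: a fast population (fast_fraction) and a slow population
    (1 - fast_fraction) each approach saturation with first-order kinetics.
    """
    fast = fast_fraction * (1.0 - math.exp(-k_fast * t))
    slow = (1.0 - fast_fraction) * (1.0 - math.exp(-k_slow * t))
    return fast + slow


def time_to_fraction(target, fast_fraction, k_fast, k_slow, t_max=1e4, tol=1e-9):
    """Bisection search for the time at which occupancy reaches `target`.

    Occupancy increases monotonically with t, so bisection is sufficient.
    """
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if biphasic_occupancy(mid, fast_fraction, k_fast, k_slow) < target:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical parameters: 60% fast sites, rates in arbitrary time units.
t_half = time_to_fraction(0.5, fast_fraction=0.6, k_fast=1.0, k_slow=0.01)
t_90 = time_to_fraction(0.9, fast_fraction=0.6, k_fast=1.0, k_slow=0.01)
print(f"t(50%) = {t_half:.2f}, t(90%) = {t_90:.2f}")
```

With a single population (fast_fraction = 1.0) the ratio t(90%)/t(50%) is fixed at ln 10 / ln 2, about 3.3; a much larger ratio is one simple signature of a second, slower component.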
  3. Nat Aging. 2021 Aug;1(8): 684-697
      A repressive chromatin state featuring trimethylated lysine 36 on histone H3 (H3K36me3) and DNA methylation suppresses cryptic transcription in embryonic stem cells. Cryptic transcription is elevated with age in yeast and nematodes, and reducing it extends yeast lifespan, though whether this occurs in mammals is unknown. We show that cryptic transcription is elevated in aged mammalian stem cells, including murine hematopoietic stem cells (mHSCs) and neural stem cells (NSCs) and human mesenchymal stem cells (hMSCs). Precise mapping allowed quantification of age-associated cryptic transcription in hMSCs aged in vitro. Regions with significant age-associated cryptic transcription have a unique chromatin signature: decreased H3K36me3 and increased H3K4me1, H3K4me3, and H3K27ac with age. Genomic regions undergoing such changes resemble known promoter sequences and are bound by TBP even in young cells. Hence, the more permissive chromatin state at intragenic cryptic promoters likely underlies increased cryptic transcription in aged mammalian stem cells.
    DOI:  https://doi.org/10.1038/s43587-021-00091-x
  4. Nat Commun. 2021 Nov 12. 12(1): 6549
      Understanding gene expression will require understanding where regulatory factors bind genomic DNA. The frequently used sequence-based motifs of protein-DNA binding are not predictive, since a genome contains many more binding sites than are actually bound and transcription factors of the same family share similar DNA-binding motifs. Traditionally, these motifs only depict sequence but neglect DNA shape. Since shape may contribute non-linearly and combinatorially to binding, machine learning approaches ought to be able to better predict transcription factor binding. Here we show that a random forest machine learning approach, which incorporates the 3D-shape of DNA, enhances binding prediction for all 216 tested Arabidopsis thaliana transcription factors and improves the resolution of differential binding by transcription factor family members which share the same binding motif. We observed that DNA shape features were individually weighted for each transcription factor, even if they shared the same binding sequence.
    DOI:  https://doi.org/10.1038/s41467-021-26819-2
  5. Cancer Discov. 2021 Nov 12. pii: candisc.0385.2021. [Epub ahead of print]
      Gene expression is regulated by promoters and enhancers marked by histone H3-lysine-27 acetylation (H3K27ac), which is established by the paralogous histone acetyltransferases (HATs), EP300 and CBP. These enzymes display overlapping regulatory roles in untransformed cells, but less characterized roles in cancer cells. We demonstrate that the majority of high-risk pediatric neuroblastoma (NB) depend on EP300, whereas CBP has a limited role. EP300 controls enhancer acetylation by interacting with TFAP2β, a transcription factor member of the lineage-defining transcriptional core regulatory circuitry (CRC) in NB. To disrupt EP300, we developed a proteolysis-targeted-chimaera (PROTAC) compound termed "JQAD1" that selectively targets EP300 for degradation. JQAD1 treatment causes loss of H3K27ac at CRC enhancers and rapid neuroblastoma apoptosis, with limited toxicity to untransformed cells where CBP may compensate. Further, JQAD1 activity is critically determined by cereblon (CRBN) expression across neuroblastoma cells.
    DOI:  https://doi.org/10.1158/2159-8290.CD-21-0385
  6. Nat Commun. 2021 Nov 12. 12(1): 6566
      As sequencing depth of chromatin studies continually grows deeper for sensitive profiling of regulatory elements or chromatin spatial structures, aligning and preprocessing of these sequencing data have become the bottleneck for analysis. Here we present Chromap, an ultrafast method for aligning and preprocessing high throughput chromatin profiles. Chromap is comparable to BWA-MEM and Bowtie2 in alignment accuracy and is over 10 times faster than traditional workflows on bulk ChIP-seq/Hi-C profiles and than 10x Genomics' CellRanger v2.0.0 pipeline on single-cell ATAC-seq profiles.
    DOI:  https://doi.org/10.1038/s41467-021-26865-w
  7. Nat Commun. 2021 Nov 09. 12(1): 6469
      Subunit switches in the BAF chromatin remodeler are essential during development. ARID1B and its paralog ARID1A encode for mutually exclusive BAF subunits. De novo ARID1B haploinsufficient mutations cause neurodevelopmental disorders, including Coffin-Siris syndrome, which is characterized by neurological and craniofacial features. Here, we leveraged ARID1B+/- Coffin-Siris patient-derived iPSCs and modeled cranial neural crest cell (CNCC) formation. We discovered that ARID1B is active only during the first stage of this process, coinciding with neuroectoderm specification, where it is part of a lineage-specific BAF configuration (ARID1B-BAF). ARID1B-BAF regulates exit from pluripotency and lineage commitment by attenuating thousands of enhancers and genes of the NANOG and SOX2 networks. In iPSCs, these enhancers are maintained active by ARID1A-containing BAF. At the onset of differentiation, cells transition from ARID1A- to ARID1B-BAF, eliciting attenuation of the NANOG/SOX2 networks and triggering pluripotency exit. Coffin-Siris patient cells fail to perform the ARID1A/ARID1B switch, and maintain ARID1A-BAF at the pluripotency enhancers throughout all stages of CNCC formation. This leads to persistent NANOG/SOX2 activity which impairs CNCC formation. Despite showing the typical neural crest signature (TFAP2A/SOX9-positive), ARID1B-haploinsufficient CNCCs are also aberrantly NANOG-positive. These findings suggest a connection between ARID1B mutations, neuroectoderm specification and a pathogenic mechanism for Coffin-Siris syndrome.
    DOI:  https://doi.org/10.1038/s41467-021-26810-x
  8. Genome Biol. 2021 Nov 08. 22(1): 309
      BACKGROUND: Topologically associating domains (TADs) are important building blocks of three-dimensional genome architectures. The formation of TADs has been shown to depend on cohesin in a loop-extrusion mechanism. Recently, advances in an image-based spatial genomics technique known as chromatin tracing led to the discovery of cohesin-independent TAD-like structures, also known as single-cell domains, which are highly variant self-interacting chromatin domains with boundaries that occasionally overlap with TAD boundaries but tend to differ among single cells and among single chromosome copies. Recent computational modeling studies suggest that epigenetic interactions may underlie the formation of the single-cell domains.
    RESULTS: Here we use chromatin tracing to visualize in female human cells the fine-scale chromatin folding of inactive and active X chromosomes, which are known to have distinct global epigenetic landscapes and distinct population-averaged TAD profiles, with inactive X chromosomes largely devoid of TADs and cohesin. We show that both inactive and active X chromosomes possess highly variant single-cell domains across the same genomic region despite the fact that only active X chromosomes show clear TAD structures at the population level. These X chromosome single-cell domains exist in distinct cell lines. Perturbations of major epigenetic components and transcription mostly do not affect the frequency or strength of the single-cell domains. Increased chromatin compaction of inactive X chromosomes occurs at a length scale above that of the single-cell domains.
    CONCLUSIONS: In sum, this study suggests that single-cell domains are genome architecture building blocks independent of the tested major epigenetic components.
    Keywords:  3D genomics; Chromatin compaction; Chromatin folding; Chromatin tracing; Image-based spatial genomics; Multiplexed sequential fluorescence in situ hybridization (FISH); Single-cell domain; TAD-like structure; Topologically associating domain (TAD); X chromosome; X inactivation
    DOI:  https://doi.org/10.1186/s13059-021-02523-8
  9. J Biol Chem. 2021 Nov 08. pii: S0021-9258(21)01195-9. [Epub ahead of print] 101389
      Sox2 (SRY-box 2) is a transcription factor with critical roles in maintaining embryonic and adult stem cell functions and in tumorigenesis. However, how Sox2 exerts its transcriptional function remains unclear. Here we used an in vitro protein-protein interaction assay to discover transcriptional regulators for embryonic stem cell core transcription factors (Oct4, Sox2, Klf4 and c-Myc) and identified members of the steroid receptor coactivators (SRCs) as Sox2-specific interacting proteins. The SRC family coactivators have broad roles in transcriptional regulation, but it is unknown whether they also serve as Sox2 coactivators. We demonstrated that these proteins facilitate Sox2 transcriptional activity and act synergistically with p300. Furthermore, we uncovered an acetylation-enhanced interaction between Sox2 and SRC-2/3, but not SRC-1, demonstrating it is Sox2 acetylation that promotes the interaction. We identified putative Sox2 acetylation sites required for acetylation-enhanced interaction between Sox2 and SRC-3, and demonstrated that acetylation on these sites contributes to Sox2 transcriptional activity and recruitment of SRC-3. We showed that activation domains 1 (AD1) and 2 (AD2) of SRC-3 both display a preferential binding to acetylated Sox2. Finally, functional analyses in mouse embryonic stem (ES) cells demonstrated that knockdown of SRC-2/3 but not SRC-1 in mouse ES cells significantly down-regulates the transcriptional activities of various Sox2 target genes and impairs ES cell stemness. Taken together, we identify specific SRC family proteins as novel Sox2 coactivators and uncover the role of Sox2 acetylation in promoting coactivator recruitment and Sox2 transcriptional function.
    Keywords:  SRC-1; SRC-2; SRC-3; Sox2 acetylation; Steroid receptor coactivators; p300; pluripotency; transcription
    DOI:  https://doi.org/10.1016/j.jbc.2021.101389
  10. Cell Rep. 2021 Nov 09. pii: S2211-1247(21)01446-7. [Epub ahead of print]37(6): 109967
      Stem and progenitor cells have the capacity to balance self-renewal and differentiation. Hematopoietic myeloid progenitors replenish more than 25 billion terminally differentiated neutrophils every day under homeostatic conditions and can increase this output in response to stress or infection. At what point along the spectrum of maturation do progenitors lose capacity for self-renewal and become irreversibly committed to differentiation? Using a system of conditional myeloid development that can be toggled between self-renewal and differentiation, we interrogate determinants of this "point of no return" in differentiation commitment. Irreversible commitment is due primarily to loss of open regulatory site access and disruption of a positive feedback transcription factor activation loop. Restoration of the transcription factor feedback loop extends the window of cell plasticity and alters the point of no return. These findings demonstrate how the chromatin state enforces and perpetuates cell fate and identify potential avenues for manipulating cell identity.
    Keywords:  Hoxa9; acute myeloid leukemia; cell fate; chromatin; development; differentiation; epigenetics; hematopoiesis; lineage commitment
    DOI:  https://doi.org/10.1016/j.celrep.2021.109967
  11. Cell Rep. 2021 Nov 09. pii: S2211-1247(21)01429-7. [Epub ahead of print]37(6): 109952
      Gene regulation often results from the action of multiple transcription factors (TFs) acting at a promoter, obscuring the individual regulatory effect of each TF on RNA polymerase (RNAP). Here we measure the fundamental regulatory interactions of TFs in E. coli by designing synthetic target genes that isolate individual TFs' regulatory effects. Using a thermodynamic model, each TF's regulatory interactions are decoupled from TF occupancy and interpreted as acting through (de)stabilization of RNAP and (de)acceleration of transcription initiation. We find that the contribution of each mechanism depends on TF identity and binding location; regulation immediately downstream of the promoter is insensitive to TF identity, but the same TFs regulate by distinct mechanisms upstream of the promoter. These two mechanisms are uncoupled and can act coherently, to reinforce the observed regulatory role (activation/repression), or incoherently, wherein the TF regulates two distinct steps with opposing effects.
    Keywords:  gene regulation; quantitative modeling; synthetic biology; systems biology; transcription factor
    DOI:  https://doi.org/10.1016/j.celrep.2021.109952
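To make the two mechanisms in this abstract concrete, here is a minimal sketch of a generic four-state thermodynamic promoter model (empty, RNAP bound, TF bound, both bound). It is an assumption-laden illustration, not necessarily the exact model used in the paper: `beta` scales RNAP stability when the TF is present ((de)stabilization), `alpha` scales the initiation rate from the doubly bound state ((de)acceleration), and `tf` and `p` are dimensionless statistical weights (concentration divided by dissociation constant). All names and values are hypothetical.

```python
def fold_change(tf, p, alpha, beta):
    """Expression with TF divided by expression without TF.

    States and weights: empty (1), RNAP only (p), TF only (tf),
    RNAP + TF (beta * p * tf). Transcription proceeds at rate r from the
    RNAP-only state and alpha * r from the RNAP + TF state; r cancels
    in the ratio.
    """
    with_tf = (p + alpha * beta * p * tf) / (1.0 + p + tf + beta * p * tf)
    without_tf = p / (1.0 + p)
    return with_tf / without_tf


# Hypothetical weights chosen only for illustration.
p = 0.1   # weak promoter
tf = 1.0  # TF present at its dissociation constant

print("coherent activator  :", round(fold_change(tf, p, alpha=2.0, beta=2.0), 3))
print("incoherent regulator:", round(fold_change(tf, p, alpha=0.25, beta=2.0), 3))
print("neutral TF          :", round(fold_change(tf, p, alpha=1.0, beta=1.0), 3))
```

In this sketch a TF that stabilizes RNAP (beta > 1) but slows initiation (alpha < 1) can still be a net repressor, which is one way to read the "incoherent" case described above.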
  12. Nat Commun. 2021 Nov 12. 12(1): 6581
      The mammalian SWI/SNF nucleosome remodeler is essential for spermatogenesis. Here, we identify a role for ARID2, a PBAF (Polybromo - Brg1 Associated Factor)-specific subunit, in meiotic division. Arid2cKO spermatocytes arrest at metaphase-I and are deficient in spindle assembly, kinetochore-associated Polo-like kinase1 (PLK1), and centromeric targeting of Histone H3 threonine3 phosphorylation (H3T3P) and Histone H2A threonine120 phosphorylation (H2AT120P). By determining ARID2 and BRG1 genomic associations, we show that PBAF localizes to centromeres and promoters of genes known to govern spindle assembly and nuclear division in spermatocytes. Consistent with gene ontology of target genes, we also identify a role for ARID2 in centrosome stability. Additionally, misexpression of genes such as Aurkc and Ppp1cc (Pp1γ), known to govern chromosome segregation, potentially compromises the function of the chromosome passenger complex (CPC) and deposition of H3T3P, respectively. Our data support a model wherein PBAF activates genes essential for meiotic cell division.
    DOI:  https://doi.org/10.1038/s41467-021-26828-1
  13. Dev Cell. 2021 Nov 08. pii: S1534-5807(21)00811-X. [Epub ahead of print]56(21): 2995-3005.e4
      Genomic imprinting and X chromosome inactivation (XCI) require epigenetic mechanisms to encode allele-specific expression, but how these specific tasks are accomplished at single loci or across chromosomal scales remains incompletely understood. Here, we systematically disrupt essential epigenetic pathways within polymorphic embryos in order to examine canonical and non-canonical genomic imprinting as well as XCI. We find that DNA methylation and Polycomb group repressors are indispensable for autosomal imprinting, albeit at distinct gene sets. Moreover, the extraembryonic ectoderm relies on a broader spectrum of imprinting mechanisms, including non-canonical targeting of maternal endogenous retrovirus (ERV)-driven promoters by the H3K9 methyltransferase G9a. We further identify Polycomb-dependent and -independent gene clusters on the imprinted X chromosome, which appear to reflect distinct domains of Xist-mediated suppression. From our data, we assemble a comprehensive inventory of the epigenetic pathways that maintain parent-specific imprinting in eutherian mammals, including an expanded view of the placental lineage.
    Keywords:  DNA methylation; X chromosome; epigenetic regulators; imprinting; placenta; scRNA-seq
    DOI:  https://doi.org/10.1016/j.devcel.2021.10.010
  14. NAR Genom Bioinform. 2021 Dec;3(4): lqab100
      Cellular reprogramming is a promising technology to develop disease models and cell-based therapies. Identification of the key regulators defining the cell type specificity is pivotal to devising reprogramming cocktails for successful cell conversion but remains a great challenge. Here, we present a systems biology approach called Taiji-reprogram to efficiently uncover transcription factor (TF) combinations for conversion between 154 diverse cell types or tissues. This method integrates the transcriptomic and epigenomic data to construct cell-type specific genetic networks and assess the global importance of TFs in the network. Comparative analysis across cell types revealed TFs that are specifically important in a particular cell type and often tightly associated with cell-type specific functions. A systematic search of TFs with differential importance in the source and target cell types uncovered TF combinations for desired cell conversion. We have shown that Taiji-reprogram outperformed the existing methods to better recover the TFs in the experimentally validated reprogramming cocktails. This work not only provides a comprehensive catalog of TFs defining cell specialization but also suggests TF combinations for direct cell conversion.
    DOI:  https://doi.org/10.1093/nargab/lqab100
  15. Genome Biol. 2021 Nov 08. 22(1): 308
      BACKGROUND: Enhancers are non-coding regions of the genome that control the activity of target genes. Recent efforts to identify active enhancers experimentally and in silico have proven effective. While these tools can predict the locations of enhancers with a high degree of accuracy, the mechanisms underpinning the activity of enhancers are often unclear.
    RESULTS: Using machine learning (ML) and a rule-based explainable artificial intelligence (XAI) model, we demonstrate that we can predict the location of known enhancers in Drosophila with a high degree of accuracy. Most importantly, we use the rules of the XAI model to provide insight into the underlying combinatorial histone modifications code of enhancers. In addition, we identified a large set of putative enhancers that display the same epigenetic signature as enhancers identified experimentally. These putative enhancers are enriched in nascent transcription, divergent transcription and have 3D contacts with promoters of transcribed genes. However, they display only intermediary enrichment of mediator and cohesin complexes compared to previously characterised active enhancers. We also found that 10-15% of the predicted enhancers display similar characteristics to super enhancers observed in other species.
    CONCLUSIONS: Here, we applied an explainable AI model to predict enhancers with high accuracy. Most importantly, we identified that different combinations of epigenetic marks characterise different groups of enhancers. Finally, we discovered a large set of putative enhancers which display similar characteristics with previously characterised active enhancers.
    Keywords:  Drosophila; Enhancers; Explainable Artificial Intelligence; Gene regulation; Histone modifications
    DOI:  https://doi.org/10.1186/s13059-021-02532-7
  16. Nat Commun. 2021 Nov 11. 12(1): 6535
      Super-enhancers (SEs) govern macrophage polarization and function. However, the mechanism underlying the signal-dependent remodeling of latent SEs in macrophages remains largely undefined. Here we show that the epigenetic reader ZMYND8 forms liquid compartments with NF-κB/p65 to silence latent SEs and restrict macrophage-mediated inflammation. Mechanistically, the fusion of ZMYND8 and p65 liquid condensates is reinforced by signal-induced acetylation of p65. Then acetylated p65 guides the ZMYND8 redistribution onto latent SEs generated de novo in polarized macrophages and consequently recruits LSD1 to decommission latent SEs. The liquidity characteristic of ZMYND8 is critical for its regulatory effect since mutations coagulating ZMYND8 into solid compartments disable the translocation of ZMYND8 and its suppressive function. Thereby, ZMYND8 serves as a molecular rheostat to switch off latent SEs and control the magnitude of the immune response. Meanwhile, we propose a phase separation model by which the latent SEs are fine-tuned in a spatiotemporal manner.
    DOI:  https://doi.org/10.1038/s41467-021-26864-x
  17. Genome Biol. 2021 Nov 11. 22(1): 311
      BACKGROUND: Recent single-cell transcriptomic studies report that IDH-mutant gliomas share a common hierarchy of cellular phenotypes, independent of genetic subtype. However, the genetic differences between IDH-mutant glioma subtypes are prognostic, predictive of response to chemotherapy, and correlate with distinct tumor microenvironments.
    RESULTS: To reconcile these findings, we profile 22 human IDH-mutant gliomas using scATAC-seq and scRNA-seq. We determine the cell-type-specific differences in transcription factor expression and associated regulatory grammars between IDH-mutant glioma subtypes. We find that while IDH-mutant gliomas do share a common distribution of cell types, there are significant differences in the expression and targeting of transcription factors that regulate glial identity and cytokine elaboration. We knock out the chromatin remodeler ATRX, which suffers loss-of-function alterations in most IDH-mutant astrocytomas, in an IDH-mutant immunocompetent intracranial murine model. We find that both human ATRX-mutant gliomas and murine ATRX-knockout gliomas are more heavily infiltrated by immunosuppressive monocytic-lineage cells derived from circulation than ATRX-intact gliomas, in an IDH-mutant background. ATRX knockout in murine glioma recapitulates gene expression and open chromatin signatures that are specific to human ATRX-mutant astrocytomas, including drivers of astrocytic lineage and immune-cell chemotaxis. Through single-cell cleavage under targets and tagmentation assays and meta-analysis of public data, we show that ATRX loss leads to a global depletion in CCCTC-binding factor association with DNA, gene dysregulation along associated chromatin loops, and protection from therapy-induced senescence.
    CONCLUSIONS: These studies explain how IDH-mutant gliomas from different subtypes maintain distinct phenotypes and tumor microenvironments despite a common lineage hierarchy.
    DOI:  https://doi.org/10.1186/s13059-021-02535-4
  18. Elife. 2021 Nov 08. pii: e67952. [Epub ahead of print]10
      The post-translational modification of histones by the small ubiquitin-like modifier (SUMO) protein has been associated with gene regulation, centromeric localization and double-strand break repair in eukaryotes. Although sumoylation of histone H4 was specifically associated with gene repression, this could not be proven due to the challenge of site-specifically sumoylating H4 in cells. Biochemical crosstalk between SUMO and other histone modifications, such as H4 acetylation and H3 methylation, that are associated with active genes also remains unclear. We addressed these challenges in mechanistic studies using an H4 chemically modified at Lys12 by SUMO-3 (H4K12su) and incorporated into mononucleosomes and chromatinized plasmids for functional studies. Mononucleosome-based assays revealed that H4K12su inhibits transcription-activating H4 tail acetylation by the histone acetyltransferase p300, as well as transcription-associated H3K4 methylation by the extended catalytic module of the Set1/COMPASS histone methyltransferase complex. Activator- and p300-dependent in vitro transcription assays with chromatinized plasmids revealed that H4K12su inhibits both H4 tail acetylation and RNA polymerase II-mediated transcription. Finally, cell-based assays with a SUMO-H4 fusion that mimics H4 tail sumoylation confirmed the negative crosstalk between histone sumoylation and acetylation/methylation. Thus, our studies establish the key role for histone sumoylation in gene silencing and its negative biochemical crosstalk with active transcription-associated marks in human cells.
    Keywords:  E. coli; S. cerevisiae; biochemistry; chemical biology; chromosomes; gene expression; human
    DOI:  https://doi.org/10.7554/eLife.67952
  19. Nat Commun. 2021 Nov 09. 12(1): 6462
      Polymorphic integrations of endogenous retroviruses (ERVs) have been previously detected in mouse and human genomes. While most are inert, a subset can influence the activity of the host genes. However, the molecular mechanism underlying how such elements affect the epigenome and transcriptome and their roles in driving intra-specific variation remain unclear. Here, by utilizing wildtype murine embryonic stem cells (mESCs) derived from distinct genetic backgrounds, we discover a polymorphic MMERGLN (GLN) element capable of regulating H3K27ac enrichment and transcription of neighboring loci. We demonstrate that this polymorphic element can enhance the neighboring Klhdc4 gene expression in cis, which alters the activity of downstream stress response genes. These results suggest that the polymorphic ERV-derived cis-regulatory element contributes to differential phenotypes from stimuli between mouse strains. Moreover, we identify thousands of potential polymorphic ERVs in mESCs, a subset of which show an association between proviral activity and nearby chromatin states and transcription. Overall, our findings elucidate the mechanism of how polymorphic ERVs can shape the epigenome and transcriptional networks that give rise to phenotypic divergence between individuals.
    DOI:  https://doi.org/10.1038/s41467-021-26630-z
  20. Cell Rep. 2021 Nov 09. pii: S2211-1247(21)01461-3. [Epub ahead of print]37(6): 109982
      Early blastomeres of mouse preimplantation embryos exhibit bi-potential cell fate, capable of generating both embryonic and extra-embryonic lineages in blastocysts. Here we identify three major two-cell-stage (2C)-specific endogenous retroviruses (ERVs) as the molecular hallmark of this bi-potential plasticity. Using the long terminal repeats (LTRs) of all three 2C-specific ERVs, we identify Krüppel-like factor 5 (Klf5) as their major upstream regulator. Klf5 is essential for bi-potential cell fate; a single Klf5-overexpressing embryonic stem cell (ESC) generates terminally differentiated embryonic and extra-embryonic lineages in chimeric embryos, and Klf5 directly induces inner cell mass (ICM) and trophectoderm (TE) specification genes. Intriguingly, Klf5 and Klf4 act redundantly during ICM specification, whereas Klf5 deficiency alone impairs TE specification. Klf5 is regulated by multiple 2C-specific transcription factors, particularly Dux, and the Dux/Klf5 axis is evolutionarily conserved. The 2C-specific transcription program converges on Klf5 to establish bi-potential cell fate, enabling a cell state with dual activation of ICM and TE genes.
    Keywords:  ICM; Klf4; Klf5; MERVL; ORR1A0; ORR1A1; TE; preimplantation development
    DOI:  https://doi.org/10.1016/j.celrep.2021.109982
  21. Nat Protoc. 2021 Nov 12.
      Precise control of gene expression requires the coordinated action of multiple factors at cis-regulatory elements. We recently developed single-molecule footprinting to simultaneously resolve the occupancy of multiple proteins including transcription factors, RNA polymerase II and nucleosomes on single DNA molecules genome-wide. The technique combines the use of cytosine methyltransferases to footprint the genome with bisulfite sequencing to resolve transcription factor binding patterns at cis-regulatory elements. DNA footprinting is performed by incubating permeabilized nuclei with recombinant methyltransferases. Upon DNA extraction, whole-genome or targeted bisulfite libraries are prepared and loaded on Illumina sequencers. The protocol can be completed in 4-5 d in any laboratory with access to high-throughput sequencing. Analysis can be performed in 2 d using a dedicated R package and requires access to a high-performance computing system. Our method can be used to analyze how transcription factors cooperate and antagonize to regulate transcription.
    DOI:  https://doi.org/10.1038/s41596-021-00630-1
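The footprinting logic in this abstract (methylated cytosines mark accessible DNA, unmethylated stretches mark protein-protected DNA) can be sketched as a per-molecule classifier. This is a simplified illustration and not the dedicated R package mentioned above: it assumes each sequenced molecule has already been reduced to 0/1 methylation calls in three bins (upstream, TF site, downstream), and the function names, the 0.5 threshold, and the three-state scheme are hypothetical choices made for this example.

```python
from collections import Counter


def bin_is_accessible(calls, threshold=0.5):
    """Return True/False for a bin of 0/1 methylation calls, or None if the bin is empty."""
    if not calls:
        return None
    return sum(calls) / len(calls) >= threshold


def classify_molecule(upstream, site, downstream, threshold=0.5):
    """Assign one molecule to a state from its three bins (1 = methylated = accessible)."""
    states = [bin_is_accessible(b, threshold) for b in (upstream, site, downstream)]
    if None in states:
        return "unassigned"  # not enough informative cytosines
    pattern = tuple(states)
    if pattern == (True, False, True):
        return "tf_bound"    # short footprint flanked by accessible DNA
    if pattern == (True, True, True):
        return "accessible"  # no footprint
    if pattern == (False, False, False):
        return "nucleosome"  # protection extends across all bins
    return "unassigned"


def state_frequencies(molecules):
    """Fraction of molecules in each state; `molecules` is a list of (upstream, site, downstream)."""
    counts = Counter(classify_molecule(*m) for m in molecules)
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}


# Toy input: four molecules covering one hypothetical binding site.
molecules = [
    ([1, 1], [0], [1]),
    ([1], [1, 1], [1]),
    ([0, 0], [0], [0]),
    ([1], [0], [1]),
]
print(state_frequencies(molecules))
```

Counting states per molecule, rather than averaging methylation per cytosine, is what allows occupancy by different factors to be compared on the same DNA molecules.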
  22. Nucleic Acids Res. 2021 Nov 11. pii: gkab1032. [Epub ahead of print]
      Previous studies on enhancers and their target genes were largely based on bulk samples that represent 'average' regulatory activities from a large population of millions of cells, masking the heterogeneity and important effects from the sub-populations. In recent years, single-cell sequencing technology has enabled the profiling of open chromatin accessibility at the single-cell level (scATAC-seq), which can be used to annotate the enhancers and promoters in specific cell types. A comprehensive resource is highly desirable for exploring how the enhancers regulate the target genes at the single-cell level. Hence, we designed a single-cell database scEnhancer (http://enhanceratlas.net/scenhancer/), covering 14 527 776 enhancers and 63 658 600 enhancer-gene interactions from 1 196 906 single cells across 775 tissue/cell types in three species. An unsupervised learning method was employed to sort and combine tens or hundreds of single cells in each tissue/cell type to obtain the consensus enhancers. In addition, we utilized a cis-regulatory network algorithm to identify the enhancer-gene connections. Finally, we provided a user-friendly platform with seven useful modules to search, visualize, and browse the enhancers/genes. This database will facilitate the research community towards a functional analysis of enhancers at the single-cell level.
    DOI:  https://doi.org/10.1093/nar/gkab1032
  23. iScience. 2021 Nov 19. 24(11): 103218
      There has been extensive research in predictive modeling of genome-scale metabolic reaction networks. Living systems involve complex stochastic processes arising from interactions among different biomolecules. For more accurate and robust prediction of target metabolic behavior under different conditions, not only metabolic reactions but also the genetic regulatory relationships involving transcription factors (TFs) affecting these metabolic reactions should be modeled. We have developed a modeling and simulation pipeline enabling the analysis of Transcription Regulation Integrated with Metabolic Regulation: TRIMER. TRIMER utilizes a Bayesian network (BN) inferred from transcriptomes to model the transcription factor regulatory network. TRIMER then infers the probabilities of the gene states relevant to the metabolism of interest, and predicts the metabolic fluxes and their changes that result from the deletion of one or more transcription factors at the genome scale. We demonstrate TRIMER's applicability to both simulated and experimental data and provide performance comparison with other existing approaches.
    Keywords:  Bioinformatics; Metabolomics; Transcriptomics
    DOI:  https://doi.org/10.1016/j.isci.2021.103218
  24. Nucleic Acids Res. 2021 Nov 09. pii: gkab996. [Epub ahead of print]
      ReMap (https://remap.univ-amu.fr) aims to provide manually curated, high-quality catalogs of regulatory regions resulting from a large-scale integrative analysis of DNA-binding experiments in Human, Mouse, Fly and Arabidopsis thaliana for hundreds of transcription factors and regulators. In this 2022 update, we have uniformly processed >11 000 DNA-binding sequencing datasets from public sources across four species. The updated Human regulatory atlas includes 8103 datasets covering a total of 1210 transcriptional regulators (TRs) with a catalog of 182 million (M) peaks, while the updated Arabidopsis atlas reaches 4.8M peaks, 423 TRs across 694 datasets. Also, this ReMap release is enriched by two new regulatory catalogs for Mus musculus and Drosophila melanogaster. First, the Mouse regulatory catalog consists of 123M peaks across 648 TRs as a result of the integration and validation of 5503 ChIP-seq datasets. Second, the Drosophila melanogaster catalog contains 16.6M peaks across 550 TRs from the integration of 1205 datasets. The four regulatory catalogs are browsable through track hubs at UCSC, Ensembl and NCBI genome browsers. Finally, ReMap 2022 comes with a new Cis Regulatory Module identification method, improved quality controls, faster search results, and better user experience with an interactive tour and video tutorials on browsing and filtering ReMap catalogs.
    DOI:  https://doi.org/10.1093/nar/gkab996
  25. Proc Natl Acad Sci U S A. 2021 Nov 16. pii: e2111450118. [Epub ahead of print]118(46):
      Signal processing is critical to a myriad of biological phenomena (natural and engineered) that involve gene regulation. Biological signal processing can be achieved by way of allosteric transcription factors. In canonical regulatory systems (e.g., the lactose repressor), an INPUT signal results in the induction of a given transcription factor and objectively switches gene expression from an OFF state to an ON state. In such biological systems, reverting gene expression to the OFF state requires the aggressive dilution of the input signal, which can take 1 or more days to achieve in a typical biotic system. In this study, we present a class of engineered allosteric transcription factors capable of processing two-signal INPUTS, such that a sequence of INPUTS can rapidly transition gene expression between alternating OFF and ON states. Here, we present two fundamental biological signal processing filters, BANDPASS and BANDSTOP, that are regulated by D-fucose and isopropyl-β-D-1-thiogalactopyranoside. BANDPASS signal processing filters facilitate OFF-ON-OFF gene regulation, whereas BANDSTOP filters facilitate the antithetical gene regulation, ON-OFF-ON. Engineered signal processing filters can be directed to seven orthogonal promoters via adaptive modular DNA binding design. This collection of signal processing filters can be used in collaboration with our established transcriptional programming structure. Kinetic studies show that our collection of signal processing filters can switch between states of gene expression within a few minutes with minimal metabolic burden, representing a paradigm shift in general gene regulation.
    Keywords:  BANDPASS; BANDSTOP; protein engineering; synthetic biology; transcription factors
    DOI:  https://doi.org/10.1073/pnas.2111450118
  26. iScience. 2021 Nov 19. 24(11): 103234
      Genetic studies of autism have revealed causal roles for chromatin remodeling gene mutations. Chromodomain helicase DNA binding protein 8 (CHD8) encodes a chromatin remodeler with significant de novo mutation rates in sporadic autism. However, relationships between CHD8 genomic function and autism-relevant biology remain poorly elucidated. Published studies utilizing ChIP-seq to map CHD8 protein-DNA interactions have high variability, consistent with technical challenges and limitations associated with this method. Thus, complementary approaches are needed to establish CHD8 genomic targets and regulatory functions in developing brain. We used in utero CHD8 Targeted DamID followed by sequencing (TaDa-seq) to characterize CHD8 binding in embryonic mouse cortex. CHD8 TaDa-seq reproduced interaction patterns observed from ChIP-seq and further highlighted CHD8 distal interactions associated with neuronal loci. This study establishes TaDa-seq as a useful alternative for mapping protein-DNA interactions in vivo and provides insights into the regulatory targets of CHD8 and autism-relevant pathophysiology associated with CHD8 mutations.
    Keywords:  Biotechnology; Genomic analysis; Genomics; Molecular neuroscience
    DOI:  https://doi.org/10.1016/j.isci.2021.103234
  27. Nucleic Acids Res. 2021 Nov 08. pii: gkab1015. [Epub ahead of print]
      Telomere shortening can cause detrimental diseases and contribute to aging. It occurs due to the end replication problem in cells lacking telomerase. Furthermore, recent studies revealed that telomere shortening can be attributed to difficulties of the semi-conservative DNA replication machinery to replicate the bulk of telomeric DNA repeats. To investigate telomere replication in a comprehensive manner, we develop QTIP-iPOND (Quantitative Telomeric chromatin Isolation Protocol followed by isolation of Proteins On Nascent DNA), which enables purification of proteins that associate with telomeres specifically during replication. In addition to the core replisome, we identify a large number of proteins that specifically associate with telomere replication forks. Depletion of several of these proteins induces telomere fragility, validating their importance for telomere replication. We also find that at telomere replication forks the single-strand telomere-binding protein POT1 is depleted, whereas histone H1 is enriched. Our work reveals the dynamic changes of the telomeric proteome during replication, providing a valuable resource of telomere replication proteins. To our knowledge, this is the first study that examines the replisome at a specific region of the genome.
    DOI:  https://doi.org/10.1093/nar/gkab1015
  28. Cell Rep. 2021 Nov 09. pii: S2211-1247(21)01459-5. [Epub ahead of print]37(6): 109980
      Plants exhibit high regenerative capacity, which is controlled by various genetic factors. Here, we report that ARABIDOPSIS TRITHORAX-RELATED 2 (ATXR2) controls de novo shoot organogenesis by regulating auxin-cytokinin interaction. The auxin-inducible ATXR2 Trithorax Group (TrxG) protein temporally interacts with the cytokinin-responsive type-B ARABIDOPSIS RESPONSE REGULATOR 1 (ARR1) at early stages of shoot regeneration. The ATXR2-ARR1 complex binds to and deposits the H3K36me3 mark in the promoters of a subset of type-A ARR genes, ARR5 and ARR7, thus activating their expression. Consequently, the ATXR2/ARR1-type-A ARR module transiently represses cytokinin signaling and thereby de novo shoot regeneration. The atxr2-1 mutant calli exhibit enhanced shoot regeneration with low expression of ARR5 and ARR7, which ultimately upregulates WUSCHEL (WUS) expression. Thus, ATXR2 regulates cytokinin signaling and prevents premature WUS activation to ensure proper cell fate transition, and the auxin-cytokinin interaction underlies the initial specification of shoot meristem in callus.
    Keywords:  Auxin-cytokinin interaction; callus; de novo shoot organogenesis; histone modification; plant regeneration
    DOI:  https://doi.org/10.1016/j.celrep.2021.109980
  29. Cell Rep. 2021 Nov 09. pii: S2211-1247(21)01447-9. [Epub ahead of print]37(6): 109968
      N6-methyladenosine (m6A) RNA modification is a fundamental determinant of mRNA metabolism, but its role in innate immunity-driven non-alcoholic fatty liver disease (NAFLD) and obesity is not known. Here, we show that myeloid lineage-restricted deletion of the m6A "writer" protein Methyltransferase Like 3 (METTL3) prevents age-related and diet-induced development of NAFLD and obesity in mice with improved inflammatory and metabolic phenotypes. Mechanistically, loss of METTL3 results in the differential expression of multiple mRNA transcripts marked with m6A, with a notable increase of DNA Damage Inducible Transcript 4 (DDIT4) mRNA level. In METTL3-deficient macrophages, there is a significant downregulation of mammalian target of rapamycin (mTOR) and nuclear factor κB (NF-κB) pathway activity in response to cellular stress and cytokine stimulation, which can be restored by knockdown of DDIT4. Taken together, our findings identify the contribution of METTL3-mediated m6A modification of Ddit4 mRNA to macrophage metabolic reprogramming in NAFLD and obesity.
    Keywords:  DDIT4; NAFLD; liver inflammation; mRNA m(6)A methylation; myeloid cells; non-alcoholic fatty liver disease
    DOI:  https://doi.org/10.1016/j.celrep.2021.109968