bims-crepig Biomed News
on Chromatin regulation and epigenetics in cell fate and cancer
Issue of 2023–10–22
25 papers selected by
Connor Rogerson, University of Cambridge



  1. J Mol Biol. 2023 Oct 17. pii: S0022-2836(23)00426-6. [Epub ahead of print] 168315
      Enhancers activate their cognate promoters over huge distances but how enhancer/promoter interactions become established is not completely understood. There is strong evidence that cohesin-mediated loop extrusion is involved but this does not appear to be a universal mechanism. Here, we identify an element within the mouse immunoglobulin lambda (Igλ) light chain locus, HSCλ1, that has characteristics of active regulatory elements but lacks intrinsic enhancer or promoter activity. Remarkably, knock-out of the YY1 binding site from HSCλ1 reduces Igλ transcription significantly and disrupts enhancer/promoter interactions, even though these elements are >10 kb from HSCλ1. Genome-wide analyses of mouse embryonic stem cells identified 2671 similar YY1-bound, putative genome organizing elements that lie within CTCF/cohesin loop boundaries but that lack intrinsic enhancer activity. We suggest that such elements play a fundamental role in locus folding and in facilitating enhancer/promoter interactions.
    Keywords:  Enhancer; Gene Activation; Locus folding; Promoter; YY1 binding
    DOI:  https://doi.org/10.1016/j.jmb.2023.168315
  2. BMC Genomics. 2023 Oct 19. 24(1): 623
       BACKGROUND: Establishment of DNA methylation (DNAme) patterns is essential for balanced multi-lineage cellular differentiation, but exactly how these patterns drive cellular phenotypes is unclear. While > 80% of CpG sites are stably methylated, tens of thousands of discrete CpG loci form hypomethylated regions (HMRs). Because they lack DNAme, HMRs are considered transcriptionally permissive, but not all HMRs actively regulate genes. Unlike promoter HMRs, a subset of non-coding HMRs is cell type-specific and enriched for tissue-specific gene regulatory functions. Our data further argue not only that HMR establishment is an important step in enforcing cell identity, but also that cross-cell type and spatial HMR patterns are functionally informative of gene regulation.
    RESULTS: To understand the significance of non-coding HMRs, we systematically dissected HMR patterns across diverse human cell types and developmental timepoints, including embryonic, fetal, and adult tissues. Unsupervised clustering of 126,104 distinct HMRs revealed that levels of HMR specificity reflect a developmental hierarchy supported by enrichment of stage-specific transcription factors and gene ontologies. Using a pseudo-time course of development from embryonic stem cells to adult stem and mature hematopoietic cells, we find that most HMRs observed in differentiated cells (~ 60%) are established at early developmental stages and accumulate as development progresses. HMRs that arise during differentiation frequently (~ 35%) establish near existing HMRs (≤ 6 kb away), leading to the formation of HMR clusters associated with stronger enhancer activity. Using SNP-based partitioned heritability from GWAS summary statistics across diverse traits and clinical lab values, we discovered that genetic contribution to trait heritability is enriched within HMRs. Moreover, the contribution of heritability to cell-relevant traits increases with both increasing HMR specificity and HMR clustering, supporting the role of distinct HMR subsets in regulating normal cell function.
    CONCLUSIONS: Our results demonstrate that the entire HMR repertoire within a cell-type, rather than just the cell type-specific HMRs, stores information that is key to understanding and predicting cellular phenotypes. Ultimately, these data provide novel insights into how DNA hypo-methylation provides genetically distinct historical records of a cell's journey through development, highlighting HMRs as functionally distinct from other epigenomic annotations.
    Keywords:  Cell history; DNA hypomethylation; Enhancer methylation; Epigenetics; Partitioned heritability
    DOI:  https://doi.org/10.1186/s12864-023-09622-9
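The clustering of HMRs that lie within 6 kb of one another (entry 2) can be illustrated with a minimal sketch. It assumes HMRs are given as (start, end) base-pair intervals on a single chromosome; the function name, the toy coordinates, and the choice to measure the gap from the furthest end of the current cluster are illustrative assumptions, not the authors' pipeline.

```python
def cluster_hmrs(hmrs, max_gap=6000):
    """Group (start, end) intervals whose gap to the current cluster is <= max_gap bp.

    Hypothetical helper: the 6 kb default mirrors the distance quoted in the
    abstract, but the grouping rule itself is an assumption.
    """
    clusters = []
    cluster_end = None  # furthest end coordinate seen in the current cluster
    for start, end in sorted(hmrs):
        if clusters and start - cluster_end <= max_gap:
            clusters[-1].append((start, end))
            cluster_end = max(cluster_end, end)
        else:
            clusters.append([(start, end)])
            cluster_end = end
    return clusters


# Toy coordinates (invented for illustration only).
example_hmrs = [(1000, 1500), (5000, 5600), (20000, 20400), (30000, 30100), (35000, 35500)]
example_clusters = cluster_hmrs(example_hmrs)
for members in example_clusters:
    print(len(members), members)
```

Sorting first keeps the pass linear after the sort, and singletons are returned as one-member clusters so that clustered and isolated HMRs can be compared downstream.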
  3. Cell Rep. 2023 Oct 17. pii: S2211-1247(23)01271-8. [Epub ahead of print]42(10): 113259
      CCCTC-binding factor (CTCF), a ubiquitously expressed architectural protein, has emerged as a key regulator of cell identity gene transcription. However, the precise molecular mechanism underlying specialized functions of CTCF remains elusive. Here, we investigate the mechanism through integrative analyses of primary hepatocytes, myocytes, and B cells from mouse and human. We demonstrate that CTCF cooperates with lineage-specific pioneer transcription factors (TFs), including MyoD, FOXA, and PU.1, to control cell identity at 1D and 3D levels. At the 1D level, pioneer TFs facilitate lineage-specific CTCF occupancy via opening chromatin. At the 3D level, CTCF and pioneer TFs form regulatory hubs to govern the expression of cell identity genes. This mechanism is validated using MyoD-null mice, CTCF knockout mice, and CRISPR editing during myogenic differentiation. Collectively, these findings uncover a general mechanism whereby CTCF acts as a cell identity cofactor to control cell identity genes via orchestrating regulatory hubs with pioneer TFs.
    Keywords:  3D genome; CP: Molecular biology; CP: Stem cell research; CRISPR editing; CTCF; MyoD; cell identity; pioneer transcription factor; regulatory hub
    DOI:  https://doi.org/10.1016/j.celrep.2023.113259
  4. Nat Commun. 2023 Oct 16. 14(1): 6519
      The interphase genome is dynamically organized in the nucleus and decorated with chromatin-associated RNA (caRNA). It remains unclear whether the genome architecture modulates the spatial distribution of caRNA and vice versa. Here, we generate a resource of genome-wide RNA-DNA and DNA-DNA contact maps in human cells. These maps reveal the chromosomal domains demarcated by locally transcribed RNA, hereafter termed RNA-defined chromosomal domains. Further, the spreading of caRNA is constrained by the boundaries of topologically associating domains (TADs), demonstrating the role of the 3D genome structure in modulating the spatial distribution of RNA. Conversely, stopping transcription or acute depletion of RNA induces thousands of chromatin loops genome-wide. Activation or suppression of the transcription of specific genes suppresses or creates chromatin loops straddling these genes. Deletion of a specific caRNA-producing genomic sequence promotes chromatin loops that straddle the interchromosomal target sequences of this caRNA. These data suggest a feedback loop where the 3D genome modulates the spatial distribution of RNA, which in turn affects the dynamic 3D genome organization.
    DOI:  https://doi.org/10.1038/s41467-023-42274-7
  5. Nat Commun. 2023 Oct 18. 14(1): 6570
      Cooperativity and antagonism between transcription factors (TFs) can drastically modify their binding to regulatory DNA elements. While mapping these relationships between TFs is important for understanding their context-specific functions, existing approaches either rely on DNA binding motif predictions, interrogate one TF at a time, or study individual TFs in parallel. Here, we introduce paired yeast one-hybrid (pY1H) assays to detect cooperativity and antagonism across hundreds of TF-pairs at DNA regions of interest. We provide evidence that a wide variety of TFs are subject to modulation by other TFs in a DNA region-specific manner. We also demonstrate that TF-TF relationships are often affected by alternative isoform usage and identify cooperativity and antagonism between human TFs and viral proteins from human papillomaviruses, Epstein-Barr virus, and other viruses. Altogether, pY1H assays provide a broadly applicable framework to study how different functional relationships affect protein occupancy at regulatory DNA regions.
    DOI:  https://doi.org/10.1038/s41467-023-42445-6
  6. Cell Syst. 2023 Oct 18. pii: S2405-4712(23)00270-3. [Epub ahead of print]14(10): 906-922.e6
      Long non-coding RNAs (lncRNAs) are involved in gene expression regulation in cis. Although enriched in the cell chromatin fraction, to what degree this defines their regulatory potential remains unclear. Furthermore, the factors underlying lncRNA chromatin tethering, as well as the molecular basis of efficient lncRNA chromatin dissociation and its impact on enhancer activity and target gene expression, remain to be resolved. Here, we developed chrTT-seq, which combines the pulse-chase metabolic labeling of nascent RNA with chromatin fractionation and transient transcriptome sequencing to follow nascent RNA transcripts from their transcription on chromatin to release and allows the quantification of dissociation dynamics. By incorporating genomic, transcriptomic, and epigenetic metrics, as well as RNA-binding protein propensities, in machine learning models, we identify features that define transcript groups of different chromatin dissociation dynamics. Notably, lncRNAs transcribed from enhancers display reduced chromatin retention, suggesting that, in addition to splicing, their chromatin dissociation may shape enhancer activity.
    Keywords:  RNA processing; RNA-binding protein interactions; chromatin dissociation dynamics; co-transcriptional splicing; enhancer; enhancer-associated lncRNAs; lncRNAs; machine learning; nascent RNA transcription; predictive models
    DOI:  https://doi.org/10.1016/j.cels.2023.09.005
  7. EMBO J. 2023 Oct 18. e113798
      Based on studies of animals and yeasts, methylation of histone H3 lysine 4 (H3K4me1/2/3, for mono-, di-, and tri-methylation, respectively) is regarded as the key epigenetic modification of transcriptionally active genes. In plants, however, H3K4me2 correlates negatively with transcription, and the regulatory mechanisms of this counterintuitive H3K4me2 distribution in plants remain largely unexplored. A previous genetic screen for factors regulating plant regeneration identified Arabidopsis LYSINE-SPECIFIC DEMETHYLASE 1-LIKE 3 (LDL3), which is a major H3K4me2 demethylase. Here, we show that LDL3-mediated H3K4me2 demethylation depends on the transcription elongation factor Paf1C and phosphorylation of the C-terminal domain (CTD) of RNA polymerase II (RNAPII). In addition, LDL3 binds to phosphorylated RNAPII. These results suggest that LDL3 is recruited to transcribed genes by binding to elongating RNAPII and demethylates H3K4me2 cotranscriptionally. Importantly, the negative correlation between H3K4me2 and transcription is significantly attenuated in the ldl3 mutant, demonstrating the genome-wide impacts of the transcription-driven LDL3 pathway to control H3K4me2 in plants. Our findings implicate H3K4me2 demethylation in plants as chromatin records of transcriptional activity, which ensures robust gene control.
    Keywords:  RNA polymerase II; histone demethylase; histone methylation; transcription
    DOI:  https://doi.org/10.15252/embj.2023113798
  8. Cell Rep. 2023 Oct 18. pii: S2211-1247(23)01312-8. [Epub ahead of print]42(10): 113300
      All vertebrate genomes encode three large histone H2A variants that have an additional metabolite-binding globular macrodomain module, macroH2A. MacroH2A variants impact heterochromatin organization and transcription regulation and establish a barrier for cellular reprogramming. However, the mechanisms of how macroH2A is incorporated into chromatin and the identity of any chaperones required for histone deposition remain elusive. Here, we develop a split-GFP-based assay for chromatin incorporation and use it to conduct a genome-wide mutagenesis screen in haploid human cells to identify proteins that regulate macroH2A dynamics. We show that the histone chaperone ANP32B is a regulator of macroH2A deposition. ANP32B associates with macroH2A in cells and in vitro binds to histones with low nanomolar affinity. In vitro nucleosome assembly assays show that ANP32B stimulates deposition of macroH2A-H2B and not of H2A-H2B onto tetrasomes. In cells, depletion of ANP32B strongly affects global macroH2A chromatin incorporation, revealing ANP32B as a macroH2A histone chaperone.
    Keywords:  ANP32B; CP: Molecular biology; MacroH2A; chromatin; histone; histone chaperone
    DOI:  https://doi.org/10.1016/j.celrep.2023.113300
  9. PLoS Comput Biol. 2023 Oct 20. 19(10): e1011568
      Histone ChIP-seq is one of the primary methods for charting the cellular epigenomic landscape, the components of which play a critical regulatory role in gene expression. Analyzing the activity of regulatory elements across datasets and cell types can be challenging due to shifting peak positions and normalization artifacts resulting from, for example, differing read depths, ChIP efficiencies, and target sizes. Moreover, broad regions of enrichment seen in repressive histone marks often evade detection by commonly used peak callers. Here, we present a simple and versatile method for identifying enriched regions in ChIP-seq data that relies on estimating a gamma distribution fit to non-overlapping 5 kb genomic bins to establish a global background. We use this distribution to assign a probability of being signal (PBS) between zero and one to each 5 kb bin. This approach, while lower in resolution than typical peak-calling methods, provides a straightforward way to identify enriched regions and compare enrichments among multiple datasets, by transforming the data to values that are universally normalized and can be readily visualized and integrated with downstream analysis methods. We demonstrate applications of PBS for both broad and narrow histone marks, and provide several illustrations of biological insights which can be gleaned by integrating PBS scores with downstream data types.
    DOI:  https://doi.org/10.1371/journal.pcbi.1011568
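The idea in entry 9 of scoring fixed-width bins against a gamma-distributed background can be sketched roughly as follows. This is not the published PBS estimator, whose exact definition the abstract does not give. As a stand-in, the sketch fits a gamma distribution by the method of moments to the lower 90% of bin counts (an assumed background fraction) and scores each bin with the background cumulative distribution, which also lies between zero and one. All names and the toy counts are hypothetical.

```python
import math
from statistics import mean, pvariance


def gamma_cdf(x, shape, scale):
    """Regularized lower incomplete gamma function, evaluated by its power series.

    Adequate for the small shape values and counts of this toy example;
    a numerical library would be preferable for real data.
    """
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-12 * total and n < 10000:
        term *= z / (shape + n)
        total += term
        n += 1
    log_prefix = shape * math.log(z) - z - math.lgamma(shape)
    return min(1.0, total * math.exp(log_prefix))


def background_cdf_score(bin_counts, background_quantile=0.9):
    """Score each bin in [0, 1] against a gamma background (assumed stand-in for PBS)."""
    ranked = sorted(bin_counts)
    cutoff = max(2, int(len(ranked) * background_quantile))
    background = ranked[:cutoff]  # assumption: the lowest bins are mostly background
    mu = mean(background)
    var = pvariance(background)
    shape = mu * mu / var  # method-of-moments gamma parameters
    scale = var / mu
    return [gamma_cdf(c, shape, scale) for c in bin_counts]


# Toy read counts for ten consecutive 5 kb bins (invented); the last bin is enriched.
counts = [3, 4, 5, 4, 6, 5, 3, 4, 5, 40]
scores = background_cdf_score(counts)
print([round(s, 3) for s in scores])
```

Because every dataset is mapped onto the same zero-to-one scale, scores from different experiments can be compared bin by bin, which is the normalization benefit the abstract highlights.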
  10. Nat Commun. 2023 Oct 18. 14(1): 6594
      The cell type-specific expression of key transcription factors is central to development and disease. Brachyury/T/TBXT is a major transcription factor for gastrulation, tailbud patterning, and notochord formation; however, how its expression is controlled in the mammalian notochord has remained elusive. Here, we identify the complement of notochord-specific enhancers in the mammalian Brachyury/T/TBXT gene. Using transgenic assays in zebrafish, axolotl, and mouse, we discover three conserved Brachyury-controlling notochord enhancers, T3, C, and I, in human, mouse, and marsupial genomes. Acting as Brachyury-responsive, auto-regulatory shadow enhancers, in cis deletion of all three enhancers in mouse abolishes Brachyury/T/Tbxt expression selectively in the notochord, causing specific trunk and neural tube defects without gastrulation or tailbud defects. The three Brachyury-driving notochord enhancers are conserved beyond mammals in the brachyury/tbxtb loci of fishes, dating their origin to the last common ancestor of jawed vertebrates. Our data define the vertebrate enhancers for Brachyury/T/TBXTB notochord expression through an auto-regulatory mechanism that conveys robustness and adaptability as an ancient basis for axis development.
    DOI:  https://doi.org/10.1038/s41467-023-42151-3
  11. Stem Cell Reports. 2023 Oct 11. pii: S2213-6711(23)00373-9. [Epub ahead of print]
      Congenital heart disease often arises from perturbations of transcription factors (TFs) that guide cardiac development. ISLET1 (ISL1) is a TF that influences early cardiac cell fate, as well as differentiation of other cell types including motor neuron progenitors (MNPs) and pancreatic islet cells. While lineage specificity of ISL1 function is likely achieved through combinatorial interactions, its essential cardiac interacting partners are unknown. By assaying ISL1 genomic occupancy in human induced pluripotent stem cell-derived cardiac progenitors (CPs) or MNPs and leveraging the deep learning approach BPNet, we identified motifs of other TFs that predicted ISL1 occupancy in each lineage, with NKX2.5 and GATA motifs being most closely associated to ISL1 in CPs. Experimentally, nearly two-thirds of ISL1-bound loci were co-occupied by NKX2.5 and/or GATA4. Removal of NKX2.5 from CPs led to widespread ISL1 redistribution, and overexpression of NKX2.5 in MNPs led to ISL1 occupancy of CP-specific loci. These results reveal how ISL1 guides lineage choices through a combinatorial code that dictates genomic occupancy and transcription.
    Keywords:  ISL1; NKX2.5; cardiac development; cardiac progenitor; cell specification; combinatorial code; transcription factor motifs; transcription factors; transcriptional regulation
    DOI:  https://doi.org/10.1016/j.stemcr.2023.09.014
  12. Nucleic Acids Res. 2023 Oct 18. pii: gkad894. [Epub ahead of print]
      MethMotif (https://methmotif.org) is a publicly available database that provides a comprehensive repository of transcription factor (TF)-binding profiles, enriched with DNA methylation patterns. In this release, we have enhanced the platform, expanding our initial collection to over 700 position weight matrices (PWM), all of which include DNA methylation profiles. One of the key advancements in this release is the segregation of TF-binding motifs based on their cofactors and DNA methylation status. We have previously demonstrated that gene ontology (GO) enriched terms associated with TF target genes may differ based on their association with alternative cofactors and DNA methylation status. MethMotif provides precomputed GO annotations for each human TF of interest, as well as for TF-co-TF complexes, enabling a comprehensive analysis of TF functions in the context of their co-factors. Additionally, MethMotif has been updated to encompass data for two new species, Mus musculus and Arabidopsis thaliana, widening its applicability to a broader community. MethMotif stands out as the first and only TF-binding motifs database to incorporate context-specific PWM coupled with epigenetic information, thereby enlightening context-specific TF functions. This enhancement allows the community to explore and gain deeper insights into the regulatory mechanisms governing transcriptional processes.
    DOI:  https://doi.org/10.1093/nar/gkad894
  13. Nucleic Acids Res. 2023 Oct 16. pii: gkad841. [Epub ahead of print]
      Gene regulation plays a critical role in the cellular processes that underlie human health and disease. The regulatory relationship between transcription factors (TFs), key regulators of gene expression, and their target genes, the so called TF regulons, can be coupled with computational algorithms to estimate the activity of TFs. However, to interpret these findings accurately, regulons of high reliability and coverage are needed. In this study, we present and evaluate a collection of regulons created using the CollecTRI meta-resource containing signed TF-gene interactions for 1186 TFs. In this context, we introduce a workflow to integrate information from multiple resources and assign the sign of regulation to TF-gene interactions that could be applied to other comprehensive knowledge bases. We find that the signed CollecTRI-derived regulons outperform other public collections of regulatory interactions in accurately inferring changes in TF activities in perturbation experiments. Furthermore, we showcase the value of the regulons by examining TF activity profiles in three different cancer types and exploring TF activities at the level of single-cells. Overall, the CollecTRI-derived TF regulons enable the accurate and comprehensive estimation of TF activities and thereby help to interpret transcriptomics data.
    DOI:  https://doi.org/10.1093/nar/gkad841
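To make the notion of estimating TF activity from a signed regulon (entry 13) concrete, here is a deliberately simple sketch. It assumes a regulon is a mapping from target gene to +1 (activation) or -1 (repression) and that each gene has a differential-expression statistic. The signed mean used here is a toy scoring rule chosen for clarity, not the method benchmarked in the paper, and the gene names are placeholders.

```python
def tf_activity(regulon, gene_stats):
    """Signed mean of target-gene statistics; returns None if no target was measured.

    regulon: dict mapping target gene -> +1 or -1 (mode of regulation)
    gene_stats: dict mapping gene -> statistic (for example a t-value or log fold change)
    """
    shared = [gene for gene in regulon if gene in gene_stats]
    if not shared:
        return None
    return sum(regulon[gene] * gene_stats[gene] for gene in shared) / len(shared)


# Placeholder regulon and statistics (invented for illustration).
toy_regulon = {"GENE_A": 1, "GENE_B": 1, "GENE_C": -1}
toy_stats = {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": -3.0, "GENE_D": 5.0}
activity = tf_activity(toy_regulon, toy_stats)
print(activity)
```

The sign matters: a repressed target that goes down counts as evidence for higher TF activity, which is why signed interactions carry more information than unsigned ones.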
  14. Nucleic Acids Res. 2023 Oct 19. pii: gkad889. [Epub ahead of print]
      Development of multicellular animals requires epigenetic repression by Polycomb group proteins. The latter assemble in multi-subunit complexes, of which two kinds, Polycomb Repressive Complex 1 (PRC1) and Polycomb Repressive Complex 2 (PRC2), act together to repress key developmental genes. How PRC1 and PRC2 recognize specific genes remains an open question. Here we report the identification of several hundreds of DNA elements that tether canonical PRC1 to human developmental genes. We use the term tether to describe a process leading to a prominent presence of canonical PRC1 at certain genomic sites, although the complex is unlikely to interact with DNA directly. Detailed analysis indicates that sequence features associated with PRC1 tethering differ from those that favour PRC2 binding. Throughout the genome, the two kinds of sequence features mix in different proportions to yield a gamut of DNA elements that range from those tethering predominantly PRC1 or PRC2 to ones capable of tethering both complexes. The emerging picture is similar to the paradigmatic targeting of Polycomb complexes by Polycomb Response Elements (PREs) of Drosophila but providing for greater plasticity.
    DOI:  https://doi.org/10.1093/nar/gkad889
  15. Cell Rep. 2023 Oct 18. pii: S2211-1247(23)01285-8. [Epub ahead of print]42(10): 113273
      RNA N6-methyladenosine (m6A) modification is implicated in cancer progression, yet its role in regulating long noncoding RNAs during cancer progression remains unclear. Here, we report that the m6A demethylase fat mass and obesity-associated protein (FTO) stabilizes long intergenic noncoding RNA for kinase activation (LINK-A) to promote cell proliferation and chemoresistance in esophageal squamous cell carcinoma (ESCC). Mechanistically, LINK-A promotes the interaction between minichromosome maintenance complex component 3 (MCM3) and cyclin-dependent kinase 1 (CDK1), increasing MCM3 phosphorylation. This phosphorylation facilitates the loading of the MCM complex onto chromatin, which promotes cell-cycle progression and subsequent cell proliferation. Moreover, LINK-A disrupts the interaction between MCM3 and hypoxia-inducible factor 1α (HIF-1α), abrogating MCM3-mediated HIF-1α transcriptional repression and promoting glycolysis and chemoresistance. These results elucidate the mechanism by which FTO-stabilized LINK-A plays oncogenic roles and identify the FTO/LINK-A/MCM3/HIF-1α axis as a promising therapeutic target for ESCC.
    Keywords:  CDK1; CP: Cancer; FTO; HIF-1α; LINK-A; MCM3; cell cycle progression; chemoresistance; esophageal squamous cell carcinoma; glycolysis; m(6)A
    DOI:  https://doi.org/10.1016/j.celrep.2023.113273
  16. Nucleic Acids Res. 2023 Oct 18. pii: gkad853. [Epub ahead of print]
      The cistrome consists of all cis-acting regulatory elements recognized by transcription factors (TFs). However, only a portion of the cistrome is active for TF binding in a specific tissue. Resolving the active cistrome in plants remains challenging. In this study, we report the assay sequential extraction assisted-active TF identification (sea-ATI), a low-input method that profiles the DNA sequences recognized by TFs in a target tissue. We applied sea-ATI to seven plant tissues to survey their active cistrome and generated 41 motif models, including 15 new models that represent previously unidentified cis-regulatory vocabularies. ATAC-seq and RNA-seq analyses confirmed the functionality of the cis-elements from the new models, in that they are actively bound in vivo, located near the transcription start site, and influence chromatin accessibility and transcription. Furthermore, comparing dimeric WRKY CREs between sea-ATI and DAP-seq libraries revealed that thermodynamics and genetic drifts cooperatively shaped their evolution. Notably, sea-ATI can identify not only positive but also negative regulatory cis-elements, thereby providing unique insights into the functional non-coding genome of plants.
    DOI:  https://doi.org/10.1093/nar/gkad853
  17. Genome Biol. 2023 Oct 16. 24(1): 232
       BACKGROUND: The evolution of genomic regulatory regions plays a critical role in shaping the diversity of life. While this process is primarily sequence-dependent, the enormous complexity of biological systems complicates the understanding of the factors underlying regulation and its evolution. Here, we apply deep neural networks as a tool to investigate the sequence determinants underlying chromatin accessibility in different species and tissues of Drosophila.
    RESULTS: We train hybrid convolution-attention neural networks to accurately predict ATAC-seq peaks using only local DNA sequences as input. We show that our models generalize well across substantially evolutionarily diverged species of insects, implying that the sequence determinants of accessibility are highly conserved. Using our model to examine species-specific gains in accessibility, we find evidence suggesting that these regions may be ancestrally poised for evolution. Using in silico mutagenesis, we show that accessibility can be accurately predicted from short subsequences in each example. However, in silico knock-out of these sequences does not qualitatively impair classification, implying that accessibility is mutationally robust. Subsequently, we show that accessibility is predicted to be robust to large-scale random mutation even in the absence of selection. Conversely, simulations under strong selection demonstrate that accessibility can be extremely malleable despite its robustness. Finally, we identify motifs predictive of accessibility, recovering both novel and previously known motifs.
    CONCLUSIONS: These results demonstrate the conservation of the sequence determinants of accessibility and the general robustness of chromatin accessibility, as well as the power of deep neural networks to explore fundamental questions in regulatory genomics and evolution.
    Keywords:  ATAC-seq; Chromatin accessibility; Deep learning; Drosophila; Drosophila melanogaster; Drosophila simulans; Drosophila yakuba; In silico mutagenesis; Robustness
    DOI:  https://doi.org/10.1186/s13059-023-03079-5
  18. Cell Rep. 2023 Oct 12. pii: S2211-1247(23)01237-8. [Epub ahead of print]42(10): 113225
      An increasing number of studies have shown the key role that RNA polymerase II (RNA Pol II) elongation plays in gene regulation. We systematically examine how various enhancers, promoters, and gene body composition influence the RNA Pol II elongation rate through a single-cell-resolution live imaging assay. By using reporter constructs containing 5' MS2 and 3' PP7 repeating stem loops, we quantify the rate of RNA Pol II elongation in live Drosophila embryos. We find that promoters and exonic gene lengths have no effect on elongation rate, while enhancers and the presence of long introns may significantly change how quickly RNA Pol II moves across a gene. Furthermore, we observe in multiple constructs that the RNA Pol II elongation rate accelerates after the transcriptional onset of nuclear cycle 14 in Drosophila embryos. Our study provides a single-cell view of various mechanisms that affect the dynamic RNA Pol II elongation rate, ultimately affecting the rate of mRNA production.
    Keywords:  CP: Molecular biology; Drosophila; MS2; PP7; elongation rate; enhancers; live imaging; transcription
    DOI:  https://doi.org/10.1016/j.celrep.2023.113225
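One common way to turn a dual-reporter trace like the 5' MS2 / 3' PP7 design in entry 18 into an elongation rate is to divide the distance between the two stem-loop cassettes by the delay between their signal onsets. The sketch below assumes that approach; the abstract does not spell out the authors' calculation, and the threshold-based onset call, the function names, and the toy traces are all illustrative.

```python
def onset_time(times, signal, threshold):
    """Return the first time point at which the signal reaches the threshold, else None."""
    for t, s in zip(times, signal):
        if s >= threshold:
            return t
    return None


def elongation_rate_kb_per_min(times, ms2, pp7, distance_kb, threshold):
    """Estimate the rate as cassette separation divided by the MS2-to-PP7 onset delay."""
    t_five_prime = onset_time(times, ms2, threshold)
    t_three_prime = onset_time(times, pp7, threshold)
    if t_five_prime is None or t_three_prime is None or t_three_prime <= t_five_prime:
        return None  # no usable delay could be measured
    return distance_kb / (t_three_prime - t_five_prime)


# Toy fluorescence traces sampled every 0.5 min (invented values).
toy_times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
toy_ms2 = [0, 0, 5, 9, 10, 10, 10]
toy_pp7 = [0, 0, 0, 0, 0, 6, 9]
toy_rate = elongation_rate_kb_per_min(toy_times, toy_ms2, toy_pp7, distance_kb=3.0, threshold=4)
print(toy_rate)
```

With these toy numbers the onsets are 1.0 and 2.5 min, so a 3 kb separation gives 2.0 kb/min. Real traces would need background subtraction and a more robust onset call than a single threshold.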
  19. Genome Res. 2023 Oct 18. pii: gr.278205.123. [Epub ahead of print]
      Transcription factors (TFs) are trans-acting proteins that bind cis-regulatory elements (CREs) in DNA to control gene expression. Here, we analyzed the genomic localization profiles of 529 sequence-specific TFs and 151 cofactors and chromatin regulators in the human cancer cell line HepG2, for a total of 680 broadly-termed DNA-Associated Proteins (DAPs). We used this deep collection to model each TF's impact on gene expression, and identified a cohort of 26 candidate transcriptional repressors. We examine High Occupancy Target (HOT) sites in the context of three-dimensional genome organization and show biased motif placement in distal-promoter connections involving HOT sites. We also found a substantial number of closed chromatin regions with multiple DAPs bound and explored their properties, finding that a MAFF/MAFK TF pair correlates with transcriptional repression. Altogether, these analyses provide novel insights into the regulatory logic of the human cell line HepG2 genome and demonstrate the usefulness of large genomic analyses for elucidation of individual TF functions.
    DOI:  https://doi.org/10.1101/gr.278205.123
  20. Cancer Res. 2023 Oct 19.
      Enhancers are non-coding regulatory DNA regions that modulate the transcription of target genes, often over large distances along the genomic sequence. Enhancer alterations have been associated with various pathological conditions, including cancer. However, the identification and characterization of somatic mutations in non-coding regulatory regions with a functional effect on tumorigenesis and prognosis remain a major challenge. Here we present a strategy for detecting and characterizing enhancer mutations in a genome-wide analysis of patient cohorts, across three lung cancer subtypes. Lung tissue-specific enhancers were defined by integrating experimental data and public epigenomic profiles, and the genome-wide enhancer-target gene regulatory network of lung cells was constructed by integrating chromatin 3D architecture data. Lung cancers possessed a similar mutation burden at tissue-specific enhancers and exons but with differences in their mutation signatures. Functionally relevant alterations were prioritized based on the pathway-level integration of the effect of a mutation and the frequency of mutations on individual enhancers. The genes enriched for mutated enhancers converged on the regulation of key biological processes and pathways relevant to tumor biology. Recurrent mutations in individual enhancers also impacted the expression of target genes with potential relevance for patient prognosis. Together, these findings show that non-coding regulatory mutations have a potential relevance for cancer pathogenesis and can be exploited for patient classification.
    DOI:  https://doi.org/10.1158/0008-5472.CAN-23-1129
  21. Cell. 2023 Oct 12. pii: S0092-8674(23)01040-1. [Epub ahead of print]
      Cellular lineage histories and their molecular states encode fundamental principles of tissue development and homeostasis. Current lineage-recording mouse models have insufficient barcode diversity and single-cell lineage coverage for profiling tissues composed of millions of cells. Here, we developed DARLIN, an inducible Cas9 barcoding mouse line that utilizes terminal deoxynucleotidyl transferase (TdT) and 30 CRISPR target sites. DARLIN is inducible, generates massive lineage barcodes across tissues, and enables the detection of edited barcodes in ∼70% of profiled single cells. Using DARLIN, we examined fate bias within developing hematopoietic stem cells (HSCs) and revealed unique features of HSC migration. Additionally, we established a protocol for joint transcriptomic and epigenomic single-cell measurements with DARLIN and found that cellular clonal memory is associated with genome-wide DNA methylation rather than gene expression or chromatin accessibility. DARLIN will enable the high-resolution study of lineage relationships and their molecular signatures in diverse tissues and physiological contexts.
    Keywords:  DNA methylation; hematopoiesis; lineage tracing; multiomics; single cell
    DOI:  https://doi.org/10.1016/j.cell.2023.09.019
  22. Nucleic Acids Res. 2023 Oct 18. pii: gkad846. [Epub ahead of print]
      Parental histone recycling is vital for maintaining chromatin-based epigenetic information during replication, yet its underlying mechanisms remain unclear. Here, we uncover an unexpected role of histone chaperone FACT and its N-terminus of the Spt16 subunit during parental histone recycling and transfer in budding yeast. Depletion of Spt16 and mutations at its middle domain that impair histone binding compromise parental histone recycling on both the leading and lagging strands of DNA replication forks. Intriguingly, deletion of the Spt16-N domain impairs parental histone recycling, with a more pronounced defect observed on the lagging strand. Mechanistically, the Spt16-N domain interacts with the replicative helicase MCM2-7 and facilitates the formation of a ternary complex involving FACT, histone H3/H4 and Mcm2 histone binding domain, critical for the recycling and transfer of parental histones to lagging strands. Lack of the Spt16-N domain weakens the FACT-MCM interaction and reduces parental histone recycling. We propose that the Spt16-N domain acts as a protein-protein interaction module, enabling FACT to function as a shuttle chaperone in collaboration with Mcm2 and potentially other replisome components for efficient local parental histone recycling and inheritance.
    DOI:  https://doi.org/10.1093/nar/gkad846
  23. BMC Bioinformatics. 2023 Oct 20. 24(1): 395
       BACKGROUND: Transcription factors (TFs) play a crucial role in the regulation of gene transcription; alterations of their activity and of their binding to DNA are strongly involved in the onset and development of cancer and other diseases. For proper biomedical investigation, it is hence essential to correctly trace TF-dense DNA areas, which have multiple bindings of distinct factors, and to select DNA high occupancy target (HOT) zones, which show the highest accumulation of such bindings. Indeed, systematic and replicable analysis of HOT zones in a large variety of cells and tissues would allow further understanding of their characteristics and could clarify their functional role.
    RESULTS: Here, we propose, thoroughly explain and discuss a full computational procedure to study in-depth DNA dense areas of transcription factor accumulation and identify HOT zones. This methodology, developed as a computationally efficient parametric algorithm implemented in an R/Bioconductor package, uses a systematic approach with two alternative methods to examine transcription factor bindings and provide comparative and fully-reproducible assessments. It offers different resolutions by introducing three distinct types of accumulation, which can analyze DNA from single-base to region-oriented levels, and a moving window, which can estimate the influence of the neighborhood for each DNA base under exam.
    CONCLUSIONS: We quantitatively assessed the full procedure by using our implemented software package, named TFHAZ, in two example applications of biological interest, proving its full reliability and relevance.
    Keywords:  Accumulation computation; DNA high occupancy target zones; Neighborhood-accounting moving window; Transcription factor binding accumulation
    DOI:  https://doi.org/10.1186/s12859-023-05528-1
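      The accumulation-and-window idea described in this abstract can be illustrated with a minimal Python sketch. It is not the TFHAZ implementation or its API (TFHAZ is an R/Bioconductor package). The function names, the half-open region convention, the `half_width` parameter, and the threshold are all assumptions made for illustration. The sketch counts, for each base, how many binding regions overlap a symmetric window around it, then reports runs of bases whose count reaches a chosen threshold.

```python
from typing import List, Tuple

# Hypothetical convention: a binding region is a half-open interval [start, end)
# on a single chromosome of known length.
Region = Tuple[int, int]


def base_accumulation(regions: List[Region], length: int, half_width: int = 0) -> List[int]:
    """For every base i, count regions overlapping [i - half_width, i + half_width]."""
    # Difference array: a region [start, end) influences bases in
    # [start - half_width, end + half_width), clipped to the chromosome.
    diff = [0] * (length + 1)
    for start, end in regions:
        lo = max(0, start - half_width)
        hi = min(length, end + half_width)
        if lo < hi:
            diff[lo] += 1
            diff[hi] -= 1
    counts, running = [], 0
    for i in range(length):
        running += diff[i]
        counts.append(running)
    return counts


def dense_zones(counts: List[int], threshold: int) -> List[Region]:
    """Return maximal half-open runs of bases whose count is >= threshold."""
    zones, zone_start = [], None
    for i, c in enumerate(counts):
        if c >= threshold and zone_start is None:
            zone_start = i
        elif c < threshold and zone_start is not None:
            zones.append((zone_start, i))
            zone_start = None
    if zone_start is not None:
        zones.append((zone_start, len(counts)))
    return zones


if __name__ == "__main__":
    toy_regions = [(2, 6), (4, 8), (5, 7)]  # made-up binding regions
    acc = base_accumulation(toy_regions, length=10)
    print(acc)
    print(dense_zones(acc, threshold=2))
```

      With `half_width=0` the count is a plain per-base overlap; a larger `half_width` lets nearby bindings contribute to each base, which is one simple way to model neighborhood influence. How the published package defines its accumulation types and HOT thresholds is not specified in the abstract and is not reproduced here.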
  24. Dev Cell. 2023 Oct 16. pii: S1534-5807(23)00492-6. [Epub ahead of print]
      Transcriptional enhancers direct precise gene expression patterns during development and harbor the majority of variants associated with phenotypic diversity, evolutionary adaptations, and disease. Pinpointing which enhancer variants contribute to changes in gene expression and phenotypes is a major challenge. Here, we find that suboptimal or low-affinity binding sites are necessary for precise gene expression during heart development. Single-nucleotide variants (SNVs) can optimize the affinity of ETS binding sites, causing gain-of-function (GOF) gene expression, cell migration defects, and phenotypes as severe as extra beating hearts in the marine chordate Ciona robusta. In human induced pluripotent stem cell (iPSC)-derived cardiomyocytes, a SNV within a human GATA4 enhancer increases ETS binding affinity and causes GOF enhancer activity. The prevalence of suboptimal-affinity sites within enhancers creates a vulnerability whereby affinity-optimizing SNVs can lead to GOF gene expression, changes in cellular identity, and organismal-level phenotypes that could contribute to the evolution of novel traits or diseases.
    Keywords:  GATA4; affinity-optimizing SNVs; causal enhancer variants; enhanceropathies; enhancers; heart development; low affinity; suboptimal affinity; suboptimization
    DOI:  https://doi.org/10.1016/j.devcel.2023.09.005
  25. Cell Rep. 2023 Oct 12. pii: S2211-1247(23)01257-3. [Epub ahead of print] 42(10): 113245
      Many tumors recapitulate the developmental and differentiation program of their tissue of origin, a basis for tumor cell heterogeneity. Although stem-cell-like tumor cells are well studied, the roles of tumor cells undergoing differentiation remain to be elucidated. We employ Drosophila genetics to demonstrate that the differentiation program of intestinal stem cells is crucial for enabling intestinal tumors to invade and induce non-tumor-autonomous phenotypes. The differentiation program that generates absorptive cells is aberrantly recapitulated in the intestinal tumors generated by activation of the Yap1 ortholog Yorkie. Inhibiting it allows stem-cell-like tumor cells to grow but suppresses invasiveness and reshapes various phenotypes associated with cachexia-like wasting by altering the expression of tumor-derived factors. Our study provides insight into how a native differentiation program determines a tumor's capacity to induce advanced cancer phenotypes and suggests that manipulating the differentiation programs co-opted in tumors might alleviate complications of cancer, including cachexia.
    Keywords:  CP: Cancer; CP: Cell biology; Notch signaling; cachexia; cell differentiation; dissemination; focal adhesion; intestine; protrusions; stem cells; tumor cell heterogeneity
    DOI:  https://doi.org/10.1016/j.celrep.2023.113245