bims-gerecp Biomed News
on Gene regulatory networks of epithelial cell plasticity
Issue of 2024–10–06
24 papers selected by
Xiao Qin, University of Oxford



  1. Genes Dev. 2024 Oct 03.
      Massively parallel reporter assays (MPRAs) are powerful tools for quantifying the impacts of sequence variation on gene expression. Reading out molecular phenotypes with sequencing enables interrogating the impact of sequence variation beyond genome scale. Machine learning models integrate and codify information learned from MPRAs and enable generalization by predicting sequences outside the training data set. Models can provide a quantitative understanding of cis-regulatory codes controlling gene expression, enable variant stratification, and guide the design of synthetic regulatory elements for applications from synthetic biology to mRNA and gene therapy. This review focuses on cis-regulatory MPRAs, particularly those that interrogate cotranscriptional and post-transcriptional processes: alternative splicing, cleavage and polyadenylation, translation, and mRNA decay.
    Keywords:  gene regulation; machine learning; massively parallel reporter assays
    DOI:  https://doi.org/10.1101/gad.351800.124
  2. STAR Protoc. 2024 Oct 01. pii: S2666-1667(24)00521-5. [Epub ahead of print]5(4): 103356
      The snapshot nature of single-cell transcriptomics presents a challenge for studying the dynamics of gene expression. Metabolic labeling, where nascent RNA is labeled with 4-thiouridine (4sU), captures temporal information at the single-cell level, providing greater insight into expression dynamics. Here, we present an optimized, automation-friendly protocol for the metabolic labeling of RNA alongside single-cell RNA sequencing through combinatorial indexing. We describe steps for 4sU labeling, cell fixation and chemical treatment, and automated two-level combinatorial indexing. For complete details on the use and execution of this protocol, please refer to Maizels et al.
    Keywords:  Cell culture; Developmental biology; Gene Expression; Molecular Biology; Molecular/Chemical Probes; RNAseq; Single Cell
    DOI:  https://doi.org/10.1016/j.xpro.2024.103356
  3. bioRxiv. 2024 Sep 19. pii: 2024.09.13.612892. [Epub ahead of print]
      The phenotypic and functional states of a cell are modulated by a complex interactive molecular hierarchy of multiple omics layers, involving the genome, epigenome, transcriptome, proteome, and metabolome. Spatial omics approaches have enabled the capture of information from different molecular layers directly in the tissue context. However, current technologies are limited to mapping one or two modalities at the same time, providing an incomplete representation of cellular identity. Such data are inadequate to fully understand complex biological systems and their underlying regulatory mechanisms. Here we present spatial-Mux-seq, a multi-modal spatial technology that allows simultaneous profiling of five different modalities, including genome-wide profiles of two histone modifications and open chromatin, whole transcriptome, and a panel of proteins at tissue scale and cellular level in a spatially resolved manner. We applied this technology to generate multi-modal tissue maps in mouse embryos and mouse brains, which discriminated more cell types and states than unimodal data. We investigated the spatiotemporal relationship between histone modifications, chromatin accessibility, and gene and protein expression in neuron differentiation, revealing the relationship between tissue organization, function, and gene regulatory networks. We were able to identify a radial glia spatial niche and revealed a spatially changing gradient of epigenetic signals in this region. Moreover, we revealed a previously unappreciated involvement of repressive histone marks in the mouse hippocampus. Collectively, the spatial multi-omics approach heralds a new era for characterizing tissue and cellular heterogeneity that single modality studies alone could not reveal.
    DOI:  https://doi.org/10.1101/2024.09.13.612892
  4. Nat Genet. 2024 Oct 03.
      Many enhancers control gene expression by assembling regulatory factor clusters, also referred to as condensates. This process is vital for facilitating enhancer communication and establishing cellular identity. However, how DNA sequence and transcription factor (TF) binding instruct the formation of high regulatory factor environments remains poorly understood. Here we developed a new approach leveraging enhancer-centric chromatin accessibility quantitative trait loci (caQTLs) to nominate regulatory factor clusters genome-wide. By analyzing TF-binding signatures within the context of caQTLs and comparing episomal versus endogenous enhancer activities, we discovered a class of regulators, 'context-only' TFs, that amplify the activity of cell type-specific caQTL-binding TFs, that is, 'context-initiator' TFs. Similar to super-enhancers, enhancers enriched for context-only TF-binding sites display high coactivator binding and sensitivity to bromodomain-inhibiting molecules. We further show that binding sites for context-only and context-initiator TFs underlie enhancer coordination, providing a mechanistic rationale for how a loose TF syntax confers regulatory specificity.
    DOI:  https://doi.org/10.1038/s41588-024-01892-7
  5. bioRxiv. 2024 Sep 16. pii: 2024.09.12.612590. [Epub ahead of print]
      Inferring gene regulatory networks from gene expression data is an important and challenging problem in the biology community. We propose OTVelo, a methodology that takes time-stamped single-cell gene expression data as input and predicts gene regulation across two time points. It is known that the rate of change of gene expression, which we will refer to as gene velocity, provides crucial information that enhances such inference; however, this information is not always available due to the limitations in sequencing depth. Our algorithm overcomes this limitation by estimating gene velocities using optimal transport. We then infer gene regulation using time-lagged correlation and Granger causality via regularized linear regression. Instead of providing an aggregated network across all time points, our method uncovers the underlying dynamical mechanism across time points. We validate our algorithm on 13 simulated datasets with both synthetic and curated networks and demonstrate its efficacy on 4 experimental data sets.
    Author summary: Understanding how genes interact to regulate cellular functions is crucial for advancing our knowledge of biology and disease. We present OTVelo, a method that uses single-cell gene expression data collected at different time points to infer gene regulatory networks. OTVelo offers a dynamic view of how gene interactions change over time, providing deeper insights into cellular processes. Unlike traditional methods, OTVelo captures temporal information through ancestor-descendant transitions without assuming a specific underlying regulatory model. We validate our approach using both simulated and real-world data, demonstrating its effectiveness in revealing complex gene regulation patterns. This method could lead to new discoveries in understanding biological systems and developing disease treatments.
    DOI:  https://doi.org/10.1101/2024.09.12.612590
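      The time-lagged correlation step mentioned in entry 5 can be illustrated with a minimal sketch. This is not OTVelo's implementation: it skips the optimal-transport velocity estimation and the regularized Granger regression entirely, and it assumes the input is simply a small matrix of per-time-point mean expression values. The function name `lagged_correlation` and the toy series are hypothetical.

```python
import numpy as np

def lagged_correlation(series, lag=1):
    """Correlate each gene at time t with every gene at time t + lag.

    series: array of shape (T, G), one row per time point, one column per gene
            (assumed here to be per-time-point averages; an illustrative choice).
    Returns a (G, G) matrix C where C[i, j] is the Pearson correlation between
    gene i at time t and gene j at time t + lag.
    """
    series = np.asarray(series, dtype=float)
    n_times = series.shape[0]
    if not 1 <= lag <= n_times - 2:
        raise ValueError("lag must leave at least two paired time points")
    early = series[:-lag]   # values at time t
    late = series[lag:]     # values at time t + lag
    early_c = early - early.mean(axis=0)
    late_c = late - late.mean(axis=0)
    cov = early_c.T @ late_c
    norm = np.outer(np.linalg.norm(early_c, axis=0),
                    np.linalg.norm(late_c, axis=0))
    # Constant genes have zero norm; report zero correlation for them.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(norm > 0, cov / norm, 0.0)

# Toy series (hypothetical): gene B repeats gene A one time step later.
gene_a = np.array([0, 1, 0, -1, 0, 1, 0, -1, 0], dtype=float)
gene_b = np.concatenate([[0.0], gene_a[:-1]])
corr = lagged_correlation(np.stack([gene_a, gene_b], axis=1), lag=1)
print(corr)
```

      In this toy case the A-to-B entry `corr[0, 1]` is larger than the reverse entry `corr[1, 0]`, which is the asymmetry a lagged analysis relies on to suggest a direction of regulation.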
  6. Cancer Heterog Plast. 2024; pii: 0005. [Epub ahead of print]1(1)
      Most human cancers are heterogeneous, consisting of cancer cells at different epigenetic and transcriptional states and with distinct phenotypes, functions, and drug sensitivities. This inherent cancer cell heterogeneity contributes to tumor resistance to clinical treatment, especially molecularly targeted therapies such as tyrosine kinase inhibitors (TKIs) and androgen receptor signaling inhibitors (ARSIs). Therapeutic interventions, in turn, induce lineage plasticity (also called lineage infidelity) in cancer cells that also drives therapy resistance. In this Perspective, we focus our discussions on cancer cell lineage plasticity manifested as treatment-induced switching of epithelial cancer cells to basal/stem-like, mesenchymal, and neural lineages. We employ prostate cancer (PCa) as the prime example to highlight ARSI-induced lineage plasticity during and towards development of castration-resistant PCa (CRPC). We further discuss how the tumor microenvironment (TME) influences therapy-induced lineage plasticity. Finally, we offer an updated summary on the regulators and mechanisms driving cancer cell lineage infidelity, which should be therapeutically targeted to extend the therapeutic window and improve patients' survival.
    Keywords:  androgen receptor; cancer cell heterogeneity; cancer stem cells; castration-resistant prostate cancer; lineage plasticity; prostate cancer; stemness; therapy resistance
    DOI:  https://doi.org/10.47248/chp2401010005
  7. bioRxiv. 2024 Sep 23. pii: 2024.09.19.613754. [Epub ahead of print]
      Understanding how regulatory DNA elements shape gene expression across individual cells is a fundamental challenge in genomics. Joint RNA-seq and epigenomic profiling provides opportunities to build unifying models of gene regulation capturing sequence determinants across steps of gene expression. However, current models, developed primarily for bulk omics data, fail to capture the cellular heterogeneity and dynamic processes revealed by single-cell multi-modal technologies. Here, we introduce scooby, the first model to predict scRNA-seq coverage and scATAC-seq insertion profiles along the genome from sequence at single-cell resolution. For this, we leverage the pre-trained multi-omics profile predictor Borzoi as a foundation model, equip it with a cell-specific decoder, and fine-tune its sequence embeddings. Specifically, we condition the decoder on the cell position in a precomputed single-cell embedding resulting in strong generalization capability. Applied to a hematopoiesis dataset, scooby recapitulates cell-specific expression levels of held-out genes and cells, and identifies regulators and their putative target genes through in silico motif deletion. Moreover, accurate variant effect prediction with scooby allows for breaking down bulk eQTL effects into single-cell effects and delineating their impact on chromatin accessibility and gene expression. We anticipate scooby to aid unraveling the complexities of gene regulation at the resolution of individual cells.
    DOI:  https://doi.org/10.1101/2024.09.19.613754
  8. Cell Syst. 2024 Sep 28. pii: S2405-4712(24)00267-9. [Epub ahead of print]
      To facilitate single-cell multi-omics analysis and improve reproducibility, we present single-cell pipeline for end-to-end data integration (SPEEDI), a fully automated end-to-end framework for batch inference, data integration, and cell-type labeling. SPEEDI introduces data-driven batch inference and transforms the often heterogeneous data matrices obtained from different samples into a uniformly annotated and integrated dataset. Without requiring user input, it automatically selects parameters and executes pre-processing, sample integration, and cell-type mapping. It can also perform downstream analyses of differential signals between treatment conditions and gene functional modules. SPEEDI's data-driven batch-inference method works with widely used integration and cell-typing tools. By developing data-driven batch inference, providing full end-to-end automation, and eliminating parameter selection, SPEEDI improves reproducibility and lowers the barrier to obtaining biological insight from these valuable single-cell datasets. The SPEEDI interactive web application can be accessed at https://speedi.princeton.edu/. A record of this paper's transparent peer review process is included in the supplemental information.
    Keywords:  batch identification; cell-type mapping; information theory; integration; scATAC-seq; scRNA-seq; single-cell genomics
    DOI:  https://doi.org/10.1016/j.cels.2024.09.003
  9. Front Genet. 2024; 15: 1425456
      Multi-omics data integration refers to the process of combining and analyzing data from different omic experimental sources, such as genomics, transcriptomics, methylation assays, and microRNA sequencing, among others. Such data integration approaches have the potential to provide a more comprehensive functional understanding of biological systems and have numerous applications in areas such as disease diagnosis, prognosis and therapy. However, quantitative integration of multi-omic data is a complex task that requires the use of highly specialized methods and approaches. Here, we discuss a number of data integration methods that have been developed with multi-omics data in view, including statistical methods, machine learning approaches, and network-based approaches. We also discuss the challenges and limitations of such methods and provide examples of their applications in the literature. Overall, this review aims to provide an overview of the current state of the field and highlight potential directions for future research.
    Keywords:  LASSO; cancer biology; data integration; multi-omics; regulatory models; statistical and probabilistic modelling
    DOI:  https://doi.org/10.3389/fgene.2024.1425456
  10. Methods Mol Biol. 2025; 2850: 297-306
      Prokaryotes use CRISPR-Cas systems to interfere with viruses and other mobile genetic elements. CRISPR arrays comprise repeated DNA elements and spacer sequences that can be engineered for custom target sites. These arrays are transcribed into precursor CRISPR RNAs (pre-crRNAs) that undergo maturation steps to form individual CRISPR RNAs (crRNAs). Each crRNA contains a single spacer that identifies the target cleavage site for a large variety of Cas protein effectors. Precise manipulation of spacer sequences within CRISPR arrays is crucial for advancing the functionality of CRISPR-based technologies. Here, we describe a protocol for the design and creation of a minimal, plasmid-based CRISPR array to enable the expression of specific, synthetic crRNAs. Plasmids contain entry spacer sequences with two type IIS restriction sites and Golden Gate cloning enables the efficient exchange of these spacer sequences. Factors that influence the compatibility of the CRISPR arrays with native or recombinant Cas proteins are discussed.
    Keywords:  CRISPR; Genome editing; Golden Gate cloning; Restriction enzymes; Spacer; crRNA
    DOI:  https://doi.org/10.1007/978-1-0716-4220-7_16
  11. Cancer Cell. 2024 Sep 20. pii: S1535-6108(24)00349-0. [Epub ahead of print]
      Microscopic examination of cells in their tissue context has been the driving force behind diagnostic histopathology over the past two centuries. Recently, the rise of advanced molecular biomarkers identified through single cell profiling has increased our understanding of cellular heterogeneity in cancer but has yet to significantly impact clinical care. Spatial technologies integrating molecular profiling with microenvironmental features are poised to bridge this translational gap by providing critical in situ context for understanding cellular interactions and organization. Here, we review how spatial tools have been used to study tumor ecosystems and their clinical applications. We detail findings in cell-cell interactions, microenvironment composition, and tissue remodeling for immune evasion and therapeutic resistance. Additionally, we highlight the emerging role of multi-omic spatial profiling for characterizing clinically relevant features including perineural invasion, tertiary lymphoid structures, and the tumor-stroma interface. Finally, we explore strategies for clinical integration and their augmentation of therapeutic and diagnostic approaches.
    DOI:  https://doi.org/10.1016/j.ccell.2024.09.001
  12. Nat Rev Cancer. 2024 Oct 01.
    Precancer Think Tank Team
      The term 'precancer' typically refers to an early stage of neoplastic development that is distinguishable from normal tissue owing to molecular and phenotypic alterations, resulting in abnormal cells that are at least partially self-sustaining and function outside of normal cellular cues that constrain cell proliferation and survival. Although such cells are often histologically distinct from both the corresponding normal and invasive cancer cells of the same tissue origin, defining precancer remains a challenge for both the research and clinical communities. Once sufficient molecular and phenotypic changes have occurred in the precancer, the tissue is identified as a 'cancer' by a histopathologist. While even diagnosing cancer can at times be challenging, the determination of invasive cancer is generally less ambiguous and suggests a high likelihood of and potential for metastatic disease. The 'hallmarks of cancer' set out the fundamental organizing principles of malignant transformation but exactly how many of these hallmarks and in what configuration they define precancer has not been clearly and consistently determined. In this Expert Recommendation, we provide a starting point for a conceptual framework for defining precancer, which is based on molecular, pathological, clinical and epidemiological criteria, with the goal of advancing our understanding of the initial changes that occur and opportunities to intervene at the earliest possible time point.
    DOI:  https://doi.org/10.1038/s41568-024-00744-0
  13. Cancer Discov. 2024 Oct 04. 14(10): 1774-1778
      People diagnosed with cancer and their formal and informal caregivers are increasingly faced with a deluge of complex information, thanks to rapid advancements in the type and volume of diagnostic, prognostic, and treatment data. This commentary discusses the opportunities and challenges that society faces as we integrate large volumes of data into regular cancer care.
    DOI:  https://doi.org/10.1158/2159-8290.CD-24-1130
  14. Cell Rep Methods. 2024 Sep 25. pii: S2667-2375(24)00244-3. [Epub ahead of print] 100866
      The tumor microenvironment (TME) is increasingly appreciated to play a decisive role in cancer development and response to therapy in all solid tumors. Hypoxia, acidosis, high interstitial pressure, nutrient-poor conditions, and high cellular heterogeneity of the TME arise from interactions between cancer cells and their environment. These properties, in turn, play key roles in the aggressiveness and therapy resistance of the disease, through complex reciprocal interactions between the cancer cell genotype and phenotype, and the physicochemical and cellular environment. Understanding this complexity requires the combination of sophisticated cancer models and high-resolution analysis tools. Models must allow both control and analysis of cellular and acellular TME properties, and analyses must be able to capture the complexity at high depth and spatial resolution. Here, we review the advantages and limitations of key models and methods in order to guide further TME research and outline future challenges.
    Keywords:  CP: Biotechnology; CP: Cancer biology; cancer; heterogeneity; metabolism; microfluidics; organoids; tumor microenvironment; tumor models
    DOI:  https://doi.org/10.1016/j.crmeth.2024.100866
  15. Cancer Res. 2024 Oct 04.
      Neuroendocrine cells have been implicated in therapeutic resistance and worse overall survival in many cancer types. Mucinous colorectal cancer (mCRC) is uniquely enriched for enteroendocrine cells (EECs), the neuroendocrine cell of the normal colon epithelium, as compared to non-mCRC. Therefore, targeting EEC differentiation may have clinical value in mCRC. Here, single cell multi-omics uncovered epigenetic alterations that accompany EEC differentiation, identified STAT3 as a regulator of EEC specification, and discovered a rare cancer-specific cell type with enteric neuron-like characteristics. Furthermore, LSD1 and CoREST2 mediated STAT3 demethylation and enhanced STAT3 chromatin binding. Knockdown of CoREST2 in an orthotopic xenograft mouse model resulted in decreased primary tumor growth and lung metastases. Collectively, these results provide rationale for developing LSD1 inhibitors that target the interaction between LSD1 and STAT3 or CoREST2, which may improve clinical outcomes for patients with mCRC.
    DOI:  https://doi.org/10.1158/0008-5472.CAN-24-0788
  16. Nature. 2024 Oct 02.
      Ageing impairs the ability of neural stem cells (NSCs) to transition from quiescence to proliferation in the adult mammalian brain. Functional decline of NSCs results in the decreased production of new neurons and defective regeneration following injury during ageing. Several genetic interventions have been found to ameliorate old brain function, but systematic functional testing of genes in old NSCs, and more generally in old cells, has not been done. Here we develop in vitro and in vivo high-throughput CRISPR-Cas9 screening platforms to systematically uncover gene knockouts that boost NSC activation in old mice. Our genome-wide screens in primary cultures of young and old NSCs uncovered more than 300 gene knockouts that specifically restore the activation of old NSCs. The top gene knockouts are involved in cilium organization and glucose import. We also establish a scalable CRISPR-Cas9 screening platform in vivo, which identified 24 gene knockouts that boost NSC activation and the production of new neurons in old brains. Notably, the knockout of Slc2a4, which encodes the GLUT4 glucose transporter, is a top intervention that improves the function of old NSCs. Glucose uptake increases in NSCs during ageing, and transient glucose starvation restores the ability of old NSCs to activate. Thus, an increase in glucose uptake may contribute to the decline in NSC activation with age. Our work provides scalable platforms to systematically identify genetic interventions that boost the function of old NSCs, including in vivo, with important implications for countering regenerative decline during ageing.
    DOI:  https://doi.org/10.1038/s41586-024-07972-2
  17. Med. 2024 Sep 19. pii: S2666-6340(24)00343-X. [Epub ahead of print]
      Organoids are three-dimensional (3D) cultures, normally derived from stem cells, that replicate the complex structure and function of human tissues. They offer a physiologically relevant model to address important questions in cancer research. The generation of patient-derived organoids (PDOs) from various human cancers allows for deeper insights into tumor heterogeneity and spatial organization. Additionally, interrogating non-tumor stromal cells increases their relevance for studying the tumor microenvironment, thereby enhancing the value of PDOs in personalized medicine. PDOs mark a significant advancement in cancer research and patient care, signifying a shift toward more innovative and patient-centric approaches. This review covers aspects of PDO cultures to address the modeling of the tumor microenvironment, including extracellular matrices, air-liquid interface and microfluidic cultures, and organ-on-chip. Specifically, the role of PDOs as preclinical models in gene editing, molecular profiling, drug testing, and biomarker discovery and their potential for guiding personalized treatment in clinical practice are discussed.
    Keywords:  cancer treatment; drug screening; organ-on-chip; patient-derived organoids; precision medicine; tumor microenvironment
    DOI:  https://doi.org/10.1016/j.medj.2024.08.010
  18. Nat Rev Mol Cell Biol. 2024 Sep 30.
      Integrin receptors are the main molecular link between cells and the extracellular matrix (ECM) and also mediate cell-cell interactions. Integrin-ECM binding triggers the formation of heterogeneous multi-protein assemblies termed integrin adhesion complexes (IACs) that enable integrins to transform extracellular cues into intracellular signals that affect many cellular processes, especially cell motility. Cell migration is essential for diverse physiological and pathological processes and is dysregulated in cancer to favour cell invasion and metastasis. Here, we discuss recent findings on the role of integrins in cell migration with a focus on cancer cell dissemination. We review how integrins regulate the spatial distribution and dynamics of different IACs, covering classical focal adhesions, emerging adhesion types and adhesion regulation. We discuss the diverse roles integrins have during cancer progression from cell migration across varied ECM landscapes to breaching barriers such as the basement membrane, and eventual colonization of distant organs.
    DOI:  https://doi.org/10.1038/s41580-024-00777-1
  19. Sci Rep. 2024 Oct 01. 14(1): 22769
      Genotypic and phenotypic diversity, which generates heterogeneity during disease evolution, is common in cancer. The identification of features specific to each patient and tumor is central to the development of precision medicine and preclinical studies for cancer treatment. However, the complexity of the disease due to inter- and intratumor heterogeneity increases the difficulty of effective analysis. Here, we introduce a sequential deep learning model, preprocessing to organize the complexity due to heterogeneity, which contrasts with general approaches that apply a single model directly. We characterized morphological heterogeneity using microscopy images of patient-derived organoids (PDOs) and identified gene subsets relevant to distinguishing differences among original tumors. PDOs, which reflect the features of their origins, can be reproduced in large quantities and varieties, contributing to increasing the variation by enhancing their common characteristics, in contrast to those from different origins. This resulted in increased efficiency in the extraction of organoid morphological features sharing the same origin. Linking these tumor-specific morphological features to PDO gene expression data enables the extraction of genes strongly correlated with intertumor differences. The relevance of the selected genes was assessed, and the results suggest potential applications in preclinical studies and personalized clinical care.
    DOI:  https://doi.org/10.1038/s41598-024-73725-w
  20. NAR Genom Bioinform. 2024 Sep;6(4): lqae138
      Understanding cancer mechanisms, defining subtypes, predicting prognosis and assessing therapy efficacy are crucial aspects of cancer research. Gene-expression signatures derived from bulk gene expression data have played a significant role in these endeavors over the past decade. However, recent advancements in high-resolution transcriptomic technologies, such as single-cell RNA sequencing and spatial transcriptomics, have revealed the complex cellular heterogeneity within tumors, necessitating the development of computational tools to characterize tumor mass heterogeneity accurately. Thus we implemented signifinder, a novel R Bioconductor package designed to streamline the collection and use of cancer transcriptional signatures across bulk, single-cell, and spatial transcriptomics data. Leveraging publicly available signatures curated by signifinder, users can assess a wide range of tumor characteristics, including hallmark processes, therapy responses, and tumor microenvironment peculiarities. Through three case studies, we demonstrate the utility of transcriptional signatures in bulk, single-cell, and spatial transcriptomic data analyses, providing insights into cell-resolution transcriptional signatures in oncology. Signifinder represents a significant advancement in cancer transcriptomic data analysis, offering a comprehensive framework for interpreting high-resolution data and addressing tumor complexity.
    DOI:  https://doi.org/10.1093/nargab/lqae138
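      As a rough illustration of what scoring a transcriptional signature involves (entry 20), the sketch below averages per-gene z-scores over the signature genes for each sample. It is a generic scoring scheme chosen for illustration only, not signifinder's API or any of its curated signatures; the function name `signature_scores` and the gene names are hypothetical.

```python
import numpy as np

def signature_scores(expr, genes, signature):
    """Score each sample as the mean z-score of the signature genes.

    expr: array of shape (n_genes, n_samples) with expression values.
    genes: gene names, one per row of expr.
    signature: gene names making up the signature; names missing from
               `genes` are ignored (an illustrative choice).
    """
    expr = np.asarray(expr, dtype=float)
    row_of = {gene: i for i, gene in enumerate(genes)}
    rows = [row_of[g] for g in signature if g in row_of]
    if not rows:
        raise ValueError("no signature gene is present in the expression matrix")
    sub = expr[rows]
    mean = sub.mean(axis=1, keepdims=True)
    std = sub.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # constant genes contribute a z-score of zero
    return ((sub - mean) / std).mean(axis=0)

# Hypothetical toy data: three genes measured in three samples.
genes = ["G1", "G2", "G3"]
expr = [[1, 2, 3],
        [2, 4, 6],
        [5, 5, 5]]
scores = signature_scores(expr, genes, ["G1", "G2", "GX"])
print(scores)
```

      Standardizing each gene before averaging keeps highly expressed genes from dominating the score; other aggregation schemes (weighted sums, rank-based scores) are equally plausible and would change the numbers.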