Nucleic Acids Res. 2025 Nov 26. pii: gkaf1382. [Epub ahead of print]53(22):
Transcription factor (TF)-DNA binding specificity, shaped by both sequence and epigenetic modifications, is central to gene regulation. Universal protein-binding microarrays (uPBMs), based on compact de Bruijn sequence designs, have emerged as powerful tools to characterize the specificity of hundreds of TFs. However, conventional uPBMs are limited to direct measurement of short ($\le$8 bp) motifs composed of the four canonical bases and cannot resolve the effects of extended sequence context or base modifications. To address these limitations, we developed two enhanced platforms: Ex-uPBM, based on extended higher-order de Bruijn sequences, and Mod-uPBM, based on de Bruijn sequences that incorporate modified bases. Applying Ex-uPBM to known TFs allowed direct measurements for motifs up to 10 bp long and exposed specificity to flanking regions, which is unattainable with standard uPBMs. By applying Mod-uPBM, we measured the effect of 5-methylcytosine (5mC) in all possible contexts and summarized it in a full energetic position weight matrix (PWM). This PWM not only reproduced known TF binding specificity but also revealed context-specific energetic effects of 5mC across the full consensus motif at single-nucleotide resolution. Together, our platforms provide a robust and scalable strategy for TF binding quantification that better captures the sequence and modification complexity of genomic DNA.
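
As an illustration of the de Bruijn idea that uPBM designs rely on, the following minimal sketch builds a sequence in which every k-mer over a given alphabet occurs exactly once. It uses the standard Lyndon-word construction and is not the authors' probe design: the function names, the toy values of k, and the use of the letter `M` to stand for 5mC are assumptions made for this example, and practical array designs involve further considerations (for example reverse complements and splitting into probes) that are ignored here.

```python
def de_bruijn(alphabet: str, k: int) -> str:
    """Return a cyclic de Bruijn sequence of order k over `alphabet`.

    Built by concatenating Lyndon words whose length divides k
    (the classic recursive construction).
    """
    n = len(alphabet)
    a = [0] * (n * k)   # working buffer of symbol indices
    seq = []            # indices of the output sequence

    def db(t: int, p: int) -> None:
        if t > k:
            # Emit the current prefix only if its period divides k.
            if k % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in seq)


def linearize(cyclic: str, k: int) -> str:
    """Append the first k-1 symbols so every k-mer appears in a linear string."""
    return cyclic + cyclic[:k - 1]


def kmers(s: str, k: int) -> list:
    """List all overlapping k-mers of s, in order."""
    return [s[i:i + k] for i in range(len(s) - k + 1)]


# Canonical bases, toy order k = 3: 4**3 = 64 distinct 3-mers.
canonical = linearize(de_bruijn("ACGT", 3), 3)

# Hypothetical extended alphabet in which "M" denotes 5mC (an assumption
# for this sketch): 5**2 = 25 distinct 2-mers.
modified = linearize(de_bruijn("ACGTM", 2), 2)

print(len(canonical), len(set(kmers(canonical, 3))))
print(len(modified), len(set(kmers(modified, 2))))
```

The sequence length grows as the alphabet size raised to the power k, which suggests why extending the directly measured motif length or enlarging the alphabet with a modified base calls for a dedicated design.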