RNA. 2025 Dec 02. pii: rna.080625.125. [Epub ahead of print]
Eukaryotic translation initiation is critically regulated by 5' UTR features, including uORFs, Kozak sequences, and secondary structures, that modulate ribosome dynamics. Although canonical mRNAs dominate protein synthesis, ribosome profiling and peptidomics reveal that ribosomes actively engage putative non-coding RNAs (ncRNAs) and translate enigmatic short ORFs (sORFs). We systematically analyzed 5' UTR architectures across canonical mRNAs, ribosome-associated ncRNAs (translationally active), and non-translated ncRNAs using curated human datasets. mRNAs exhibited optimal translational features (short 5' UTRs, few uORFs), whereas translated ncRNAs showed intermediate features and non-translated ncRNAs the weakest. Notably, mRNAs with long 5' UTRs maintained high translational efficiency through conserved regulatory elements. plusCE, our newly developed random forest model that integrates these features, surpassed existing methods in predicting translation efficiency, suggesting the potential relevance of these features to translation mechanisms and providing guidance for rational 5' UTR design to modulate translation. Although some ncRNAs are frequently bound by ribosomes, they show no evidence of stable translation, consistent with their lack of coding-related evolutionary signatures. Our analysis suggests that ribosome-bound ncRNAs may not reflect adaptive evolution toward coding function, but rather represent a reservoir of untranslated transcripts that engage the translation machinery through permissive sequence features. Together, these results demonstrate that ribosome engagement is primarily shaped by 5' UTR sequence features, highlighting the importance of regulatory grammar in translation control and complementing current models of ncRNA evolution.
Keywords: 5' untranslated regions; non-coding RNAs; translational efficiency prediction; translational features
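
As an illustration of the kind of 5' UTR features the abstract refers to (UTR length and uORF count), the following is a minimal sketch, not the authors' pipeline. The function name `utr5_features` and the uORF definition used here (an upstream ATG with an in-frame stop codon inside the 5' UTR) are assumptions made for this example only; the abstract does not state how plusCE computes its features.

```python
# Hypothetical helper for illustration; not taken from the plusCE model.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def utr5_features(utr5: str) -> dict:
    """Return simple sequence features of a 5' UTR.

    Assumed definitions (for this sketch only):
      - uATG: any ATG found in the 5' UTR, in any frame
      - uORF: a uATG followed by an in-frame stop codon within the 5' UTR
    """
    # Accept RNA or DNA alphabets by normalizing U to T.
    utr = utr5.upper().replace("U", "T")
    uatg_count = 0
    uorf_count = 0
    for i in range(len(utr) - 2):
        if utr[i:i + 3] != "ATG":
            continue
        uatg_count += 1
        # Scan codon by codon, in frame with this ATG, for a stop codon.
        for j in range(i + 3, len(utr) - 2, 3):
            if utr[j:j + 3] in STOP_CODONS:
                uorf_count += 1
                break
    return {"length": len(utr), "uATG": uatg_count, "uORF": uorf_count}


if __name__ == "__main__":
    # Toy sequence: one uATG closed by UAA, one uATG with no in-frame stop.
    print(utr5_features("GGAUGAAAUAAGGAUGCC"))
```

Features of this kind could then be assembled into a table, one row per transcript, and passed to a random forest regressor; which features and which implementation the authors actually used is not specified in the abstract.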