Abstract
Background: We investigate the flow of genetic information from DNA to RNA to protein as described by the Central Dogma in molecular biology, to determine the impact of intermediate genomic levels on plant protein expression.
Results: We perform genomic profiling of rosette leaves in two Arabidopsis accessions, Col-0 and Can-0, and assemble their genomes using long reads and chromatin interaction data. We measure gene and protein expression in biological replicates grown in a controlled environment, and also measure CpG methylation, ribosome-associated transcript levels, and tRNA abundance. Each omic level is highly reproducible between biological replicates and between accessions despite their ~1% sequence divergence; the single best predictor of any level in one accession is the corresponding level in the other. Within each accession, gene codon frequencies accurately model both mRNA and protein expression. The effects of a codon on mRNA and protein expression are highly correlated but independent of genome-wide codon frequencies and of tRNA levels, which instead match genome-wide amino acid frequencies. Ribosome-associated transcripts closely track mRNA levels.
Conclusions: DNA codon frequencies and mRNA expression levels are the main predictors of protein abundance. In the absence of environmental perturbation, neither gene-body methylation, tRNA abundance, nor ribosome-associated transcript levels add appreciable information. The impact of constitutive gene-body methylation is mostly explained by gene codon composition. tRNA abundance tracks overall amino acid demand. However, genetic differences between accessions associate with differential gene-body methylation by inflating differential expression variation. Our data show that the dogma holds only if both sequence and abundance information in mRNA are considered.
| Original language | English |
|---|---|
| Article number | 319 |
| Number of pages | 40 |
| Journal | Genome Biology |
| Volume | 26 |
| Issue number | 1 |
| Early online date | 29 Sept 2025 |
| DOIs | |
| Publication status | E-pub ahead of print - 29 Sept 2025 |
Keywords
- Gene expression
- Data-independent acquisition
- Chromatin interaction
- RNAseq
- Mim-tRNAseq
- Long reads
- Ribosome-associated expression
- Gene-body methylation
- Genome assembly
- Central Dogma
- Protein expression
- Protein Biosynthesis
- RNA, Transfer
- DNA Methylation
- Arabidopsis Proteins
- Codon
- Gene Expression Regulation, Plant
- Transcription, Genetic
- Arabidopsis
- Genome, Plant
- RNA, Messenger
- Arabidopsis Proteins/genetics
- Arabidopsis/genetics
- RNA, Messenger/genetics
- RNA, Transfer/genetics