Abstract
During a survey of two-component system genes, a list of neighboring histidine kinase and response regulator genes, encoded on the same strand, was compiled from over 200 fully sequenced bacteria. It was observed that many gene pairs overlapped, and although such overlaps can potentially occur in two phases (relative reading frames), one phase predominated for overlaps of seven or more nucleotides. Preference for a particular phase cannot be explained by arguments of sequence restraint (mutations in one gene differentially affect an overlapping gene, depending on phase). We have therefore investigated a potential explanation of the observed phase bias. For phase +1 gene overlaps, simulated point mutations in the overlapping region result in more severe changes to the downstream gene product than to the upstream gene product; vice versa in phase +2. Additionally, codon usage frequencies in nonoverlapping regions are more similar to those at the end of the upstream gene than the beginning of the downstream gene in overlaps. Taking both observations together, we propose that new gene overlaps generally arise by N-terminal extension of a downstream gene, creating a novel sequence at the start of the downstream gene. Sequence changes in this newly coding sequence will alter the sequences of both the new and the original coding sequence (the C-terminal region of the upstream gene). However, these changes will be less detrimental to the original coding sequence if the two genes overlap in phase +1, leading to selective retention during evolution of phase +1 overlaps relative to phase +2 overlaps.
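As an illustration only, the kind of point-mutation simulation described in the abstract could be sketched as follows. This is a minimal sketch and not the authors' method: the function names (`translate`, `mutation_effects`), the example sequence, and the simple "changed or unchanged" scoring are all assumptions made here, since the abstract does not say how severity was measured. The sketch uses the standard genetic code and reads one stretch of DNA in two reading frames, counting how many single-base substitutions alter each translation.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, offset):
    """Translate seq in the frame starting at offset, ignoring partial codons."""
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(offset, len(seq) - 2, 3)
    )


def mutation_effects(seq, offset_a, offset_b):
    """Try every single-base substitution and count changed translations.

    Hypothetical scoring: a mutation counts as a change if the translated
    peptide differs at all. Bases outside a frame's complete codons never
    change that frame's translation.
    """
    ref_a = translate(seq, offset_a)
    ref_b = translate(seq, offset_b)
    counts = {"total": 0, "a_changed": 0, "b_changed": 0}
    for pos, old in enumerate(seq):
        for new in BASES:
            if new == old:
                continue
            mutant = seq[:pos] + new + seq[pos + 1:]
            counts["total"] += 1
            counts["a_changed"] += translate(mutant, offset_a) != ref_a
            counts["b_changed"] += translate(mutant, offset_b) != ref_b
    return counts


# Made-up overlap region read by an upstream gene (offset 0)
# and a downstream gene shifted by one base (offset 1).
example = mutation_effects("ATGAAACCCGGG", 0, 1)
```

A fuller analysis would weight each change by its severity (for example, stop codons versus conservative substitutions) and would repeat the count for both relative frames across many real overlaps.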
Original language | English |
---|---|
Pages (from-to) | 457-462 |
Number of pages | 6 |
Journal | Journal of Molecular Evolution |
Volume | 64 |
Issue number | 4 |
Early online date | 19 Mar 2007 |
Publication status | Published - Apr 2007 |
Keywords
- Two-component systems
- response regulator
- histidine kinase
- gene overlaps
- reading frame
- codon usage