The combinatorics of overlapping genes
Abstract: Overlapping genes exist in all domains of life and are much more abundant than expected at their first discovery in the late 1970s. Assuming that the reference gene is read in frame +0, an overlapping gene can be encoded in two reading frames in the sense strand, denoted by +1 and +2, and in three reading frames in the opposite strand, denoted by -0, -1 and -2. This motivated numerous researchers to study the constraints induced by the genetic code on the various overlapping frames, mostly based on information theory. Our focus in this paper is on the constraints induced on two overlapping genes in terms of amino acids, as well as polypeptides. We show that simple linear constraints bind the amino acid composition of two proteins encoded by overlapping genes. Novel constraints are revealed when polypeptides are considered, and not just single amino acids. For example, in double-coding sequences with an overlapping reading frame -2, each Tyrosine (denoted as Tyr or Y) in the overlapping frame overlaps a Tyrosine in the reference frame +0 (and reciprocally), whereas specific words (e.g. YY) never occur. We thus distinguish between null constraints (YY = 0 in frame -2) and non-null constraints (Y in frame +0 <=> Y in frame -2). Our equivalence-based constraints are symmetrical and thus enable the characterization of the joint composition of overlapping proteins. We describe several formal frameworks and a graph algorithm to characterize and compute these constraints. These results yield support for understanding the mechanisms and evolution of overlapping genes, and for developing novel overlapping gene detection methods.
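The Tyrosine example in the abstract can be checked by brute force. The sketch below is illustrative only and is not the paper's graph algorithm. It assumes the standard genetic code, and it assumes that frame -2 denotes the antisense codon whose reverse complement spans the last base of one reference codon and the first two bases of the next; this convention was chosen here because it reproduces the stated constraints, and the paper's exact definition may be phrased differently. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def overlapping_aa(c1, c2):
    """Amino acid read on the opposite strand across reference codons c1, c2.

    Assumed frame -2 convention: the antisense codon covers the third base
    of c1 and the first two bases of c2 on the sense strand.
    """
    return CODON_TABLE[reverse_complement(c1[2] + c2[:2])]


def check_tyrosine_constraints():
    """Enumerate all stop-free pairs of adjacent reference codons.

    Returns (equivalence_holds, yy_possible):
      equivalence_holds: Y in the overlapping codon <=> Y in reference codon c2
      yy_possible: some stop-free double-coding pair has YY in the reference
    """
    sense = [c for c, aa in CODON_TABLE.items() if aa != "*"]
    equivalence_holds = True
    yy_possible = False
    for c1, c2 in product(sense, repeat=2):
        over = overlapping_aa(c1, c2)
        if over == "*":
            continue  # a stop in the overlapping frame: not double-coding
        if (CODON_TABLE[c2] == "Y") != (over == "Y"):
            equivalence_holds = False
        if CODON_TABLE[c1] == "Y" and CODON_TABLE[c2] == "Y":
            yy_possible = True
    return equivalence_holds, yy_possible


if __name__ == "__main__":
    print(check_tyrosine_constraints())
```

Under these assumptions the enumeration illustrates both kinds of constraint: the non-null one (a Tyrosine in one frame pairs with a Tyrosine in the other, because TAY read in reverse complement yields TA followed by one base, which is either Tyr or a stop) and the null one (a reference YY forces a stop codon TAA or TAG on the opposite strand).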
Cites work
Cited in
- Positions of silent mutations in overlapping genes from different DNA strands
- scientific article; zbMATH DE number 1305428
- Two proteins for the price of one: the design of maximally compressed coding sequences
- The Combinatorics of Sequencing the Corn Genome
- A theorem on the genetic code
- Overlapping genes coded in the 3'-to-5'-direction in mitochondrial genes and 3'-to-5' polymerization of non-complementary RNA by an 'invertase'
- Overlapping genes in vertebrate genomes
- Overlapping genetic codes for overlapping frameshifted genes in Testudines, and Lepidochelys olivacea as special case
MaRDI item Q2013690