New Genetic Information
Before reading this, you should get familiar with basic genetics and the vocabulary connected to it.
Here and there on the AiG web-page (it is actually a popular view among creationists) it is stated that there is no known process by which new genetic information can be formed. If this were true, evolution would be a lie. But this is of course false.
The following is a short review of the most important parts of a paper in Annual Review of Genetics.
The full-text paper can be seen here. (Rather technical. Look here if you need some basics in genetics.) A list of papers can be found at the bottom of this page.
Several processes can lead to new genes. It should be noted that often more than one of the processes is involved. E.g. duplication of a gene results in one copy of the gene being free to acquire a new function by whatever means.
This paper refers to investigations into de novo genes.
Duplication involves both duplication of DNA and so-called retro-duplication via RNA.
Retro-duplication is the reverse transcription of RNA into DNA, followed by insertion of the DNA sequence into the genome.
Retro-duplicated genes can be recognized by their lack of introns and the presence
Duplication can involve
anything from short sequences of DNA to entire genomes in the process of polyploidization.
After polyploidization most genes have been shown to lose their function, but up to 30 % have been shown to retain their function or acquire a new function through some of the processes mentioned below.
An example is hemoglobin. In mammals there is an extra copy of this gene, which is expressed in the fetus. The hemoglobin in the fetus has a higher affinity for oxygen. This makes the transfer of oxygen from the mother's blood to the blood of the fetus possible.
Changing existing genes
The simplest process is point
mutations in existing genes, resulting in new functions.
Another possibility is exon shuffling, where exons from one gene are combined with exons from one or more other genes. Loss of exons can also lead to new functions.
Frame-shift mutations normally affect only one exon, and can therefore result in slightly altered proteins with a new function.
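The difference between a point mutation and a frame-shift can be illustrated with a small Python sketch. The toy sequence below is invented for illustration and is not taken from any real gene; the codon table is the standard genetic code. A point mutation changes at most one amino acid, while a one-base insertion changes every codon downstream of it.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA codon by codon, stopping at the first STOP codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":  # STOP codon: translation ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequence (assumption, for illustration only).
original = "ATGGCTGAAGGTCTGAAATAA"

# Point mutation: the fifth base changes from C to A (codon GCT becomes GAT).
point_mutant = original[:4] + "A" + original[5:]

# Frame-shift: one extra base inserted after the second codon.
frameshift = original[:6] + "C" + original[6:]

print(translate(original))      # MAEGLK
print(translate(point_mutant))  # MDEGLK
print(translate(frameshift))    # MARRSEI
```

In the frame-shifted toy sequence the original STOP codon is no longer read in frame, so translation would continue until the next in-frame STOP further downstream.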
Alternative splicing of mRNA from an existing gene results in new amino acid sequences, which can also have a new function.
A formerly non-transcribed DNA sequence can mutate into a transcribed sequence that is translated into a protein, on rare occasions a functional protein.
Especially in bacteria, horizontal gene transfer is of great importance in the evolution of the genome.
One way to create a de novo gene is for a lncRNA (long non-coding RNA) to be translated. A lncRNA does not necessarily have a function. It might be transcriptional junk that just happens to have the right signals to be translated.
If any transcribed, non-translated DNA sequence mutates and starts being translated, the result will normally be a protein much shorter than the average of 300 amino acids.
An open reading frame (ORF) is a sequence of DNA between a START codon (ATG, methionine) and a STOP codon (TAA, TAG or TGA). In a random DNA sequence there will on average be about 21 codons from a START to a STOP. Since 3 of the 64 possible codons are STOPs, the length follows a geometric distribution with mean 64/3 = 21.3, and the probability of reading more than n codons without meeting a STOP is (61/64)^n.
About 1% will be longer than 96 codons.
About 1:1000 will be longer than 144 codons.
About 1:10,000 will be longer than 192 codons.
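The lengths of random reading frames can be checked with a short Python sketch. It rests on a simplifying assumption: all four bases are equally frequent and independent, so each codon after the START is a STOP with probability 3/64. Under that model the chance of reading more than n codons without meeting a STOP is (61/64)^n. The function names are made up for this illustration.

```python
import math
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}
P_STOP = len(STOP_CODONS) / 64  # 3 of the 64 possible codons are STOPs


def prob_longer_than(n_codons):
    """Probability that the first n_codons random codons contain no STOP."""
    return (1 - P_STOP) ** n_codons


def codons_for_tail(prob):
    """Smallest n for which prob_longer_than(n) is at most prob."""
    return math.ceil(math.log(prob) / math.log(1 - P_STOP))


def simulate_mean_length(trials=20000, seed=1):
    """Mean number of random codons drawn up to and including the first STOP."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        length = 1
        while "".join(rng.choice("ACGT") for _ in range(3)) not in STOP_CODONS:
            length += 1
        total += length
    return total / trials


for tail in (0.01, 0.001, 0.0001):
    print(tail, codons_for_tail(tail))  # 96, 144 and 192 codons respectively

print(simulate_mean_length())  # should come out close to 64/3 = 21.3
```

Real genomes are not random in this sense (GC-rich sequences, for instance, contain fewer STOP codons), so the figures are only a rough guide.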
Each codon (except the STOP) corresponds to one amino acid in the resulting protein.
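Finding open reading frames as defined above (START codon, then codons up to the first in-frame STOP) can be sketched in Python as follows. This is a minimal illustration on one strand only, with a made-up toy sequence; real gene-finding tools also scan the reverse strand and apply length cut-offs.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna):
    """Return (start_index, codons_before_stop) for each ORF on this strand.

    Every ATG is treated as a possible START. The ORF ends at the first
    in-frame STOP codon; an ATG with no in-frame STOP downstream is skipped.
    """
    orfs = []
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                # Number of codons from the START up to, not including, the STOP.
                orfs.append((start, (pos - start) // 3))
                break
    return orfs


# Hypothetical toy sequence (assumption, for illustration only).
toy = "CCATGGCTTGAAATGAAACCCGGGTAGTT"
print(find_orfs(toy))  # [(2, 2), (12, 4)]
```

The second number in each pair is also the length of the resulting protein in amino acids, since every codon before the STOP contributes one.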
This paper confirms that de-novo genes are much shorter than the average (old) gene.
Look at the bottom of the page for more references on de novo gene formation, and on random amino acid sequences with function (Keefe and Szostak, 2001).
How could such sequences be recognized? How do we know how to interpret translated genes with a function in one organism, which are homologous to non-translated or non-functional DNA-sequences in other species?
It could be one of two: either a functional gene lost its function and stopped being translated, or a non-functional DNA sequence gained function. How do we know which is which?
Cao et al. (2015) provide us with an example.
A non-coding sequence (let's call it Z) is found in monkeys and apes on an autosomal (non-sex) chromosome.
In addition, a coding homolog (let's call it Z') is found on the ape Y chromosome. A non-coding homolog, closely resembling the autosomal copy, is found on the Y chromosome of some monkeys.
The most parsimonious explanation is that Z was originally located on the autosome and was then copied to the Y chromosome. Later (perhaps after the ape/monkey split) it underwent mutations that resulted in Z', a translated gene coding for a functional protein.
A pseudogene, a gene that has lost its function due to mutation, can give rise to a functional, though untranslated, RNA molecule. Such RNAs often have a gene-regulating function.
New regulation of existing genes
If duplication results in a gene in a new sequence environment, it can be placed near a promoter that guides the expression of the gene in a new way, resulting in a different function, e.g. in another organ or at a different time in development.
Transposable elements (TEs)
TEs can lead to duplication of genes or can be incorporated into existing genes, thereby changing their function. If an insertion results in one or more exons from one gene being inserted into another gene, a new protein will result.
The conclusion is that there are a number of known, observable processes by which new genetic information can be added to the genome.
Papers on De Novo gene formation
1: Cao et al. (2015) De Novo Origin of VCY2 from Autosome to Y-Transposed Amplicon. PLoS ONE 10(3): e0119651
2: Cui et al. (2015) Young genes out of the male: An insight from evolutionary age analysis of the pollen transcriptome. Molecular Plant 8: 935-945
3: Chen et al. (2015) Emergence, Retention and Selection: A Trilogy of Origination for Functional De Novo Proteins from Ancestral LncRNAs in Primates. PLoS Genet. 11(7): e1005391
4: Frietze and Leatherman (2014) Examining the Process of de Novo Gene Birth: An Educational Primer on “Integration of New Genes into Cellular Networks, and Their Structural Maturation”. Genetics 196: 593-599
5: Light et al. (2014) Orphan and new gene origination, a structural and evolutionary perspective. Curr. Op. Struct. Biol. 26: 73-83
6: Zhang and Long (2014) New genes contribute to genetic and phenotypic novelties in human evolution. Curr. Op. Genet. Develop. 29: 90-96
7: Xie et al. (2013) Hominoid-Specific De Novo Protein-Coding Genes Originating from Long Non-Coding RNAs. PLoS Genet. 8(9): e1002942
8: Keefe and Szostak (2001) Functional Protein from a Random-Sequence Library. Nature 410: 715-718
Comments on the papers
1: Cao et al.
The gene VCY2, located on the Y chromosome, has a non-coding homolog on the autosomes (non-sex chromosomes) in all the great apes, including humans. It is shown that the autosomal homolog is transcribed but cannot be translated, and therefore cannot encode a protein.
Non-coding DNA sequences homologous to VCY2 have been found in several monkeys.
The simplest explanation is that the DNA sequence was transposed to the Y chromosome and later experienced one or more mutations, resulting in the sequence being translated.
2: Cui et al.
Young genes in rice and Arabidopsis have been identified. These genes were on average shorter than older genes. Such genes tend to be preferentially expressed in male reproductive cells. Transcription of intergenic sequences is also more intense in these cells, which would facilitate the formation of new genes when such transcribed sequences mutate and become translated.
3: Chen et al.
A group of 64 new genes unique to humans was identified. The genes originated from long non-coding RNAs (lncRNAs) that are present in monkeys. The GC-rich sequence and short length of the genes, relative to the average old gene, mean that they are less exposed to mutations resulting in STOP codons, and therefore have a better chance of surviving for long periods.
The ratio of synonymous to non-synonymous mutations in the lncRNAs indicates that these are not translated into functional proteins. The 64 genes in question, on the other hand, are under purifying selection, indicating function.
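The idea behind comparing synonymous and non-synonymous changes can be sketched in Python. A synonymous change alters a codon without altering the amino acid; a non-synonymous change alters the amino acid. Under purifying selection the non-synonymous changes are removed more often, so they become rare relative to the synonymous ones. The sketch below only counts differing codons between two aligned toy sequences, both invented for illustration. It is not the method used by Chen et al., and a proper analysis also normalizes by the number of possible synonymous and non-synonymous sites.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def count_codon_changes(seq_a, seq_b):
    """Count differing codons as (synonymous, non_synonymous).

    Both sequences must be aligned, in frame and of equal length.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    synonymous = 0
    non_synonymous = 0
    for pos in range(0, len(seq_a), 3):
        codon_a = seq_a[pos:pos + 3]
        codon_b = seq_b[pos:pos + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1      # same amino acid, different codon
        else:
            non_synonymous += 1  # amino acid has changed
    return synonymous, non_synonymous


# Hypothetical toy sequences (assumption, for illustration only).
ancestral = "ATGGCTGAAGGTCTG"  # M A E G L
derived = "ATGGCCGATGGACTG"    # M A D G L
print(count_codon_changes(ancestral, derived))  # (2, 1)
```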
4: Frietze and Leatherman
The paper goes through the practice of identifying de novo genes.
5: Light et al.
The many fully sequenced organisms leave the impression that lineage-specific genes are common.
8: Keefe and Szostak
The paper reports that, from a random set of amino acid sequences, a few were found to bind to ATP. Subsequent random mutation of the sequences in question resulted in high specificity.
The study shows that functional proteins can arise from random sequences.
More papers (no comments, only short references; 2013-2016 only)
Yeast 33: 43-53
Cell Host and Microbe 20: 189-201 (doi.org/10.1016/j.chom.2016.06.007)
Bioinformatics and Biology Insights 10: 121-131 (doi: 10.4137/BBi.s39950.)
BMC Bioinformatics 17:226 (DOI 10.1186/s12859-016-1102-x)
BMC Genomics 17:133 (DOI 10.1186/s12864-016-2456-1)
Nature Reviews/Genetics 17: 567-579
Biochemical and Biophysical Research Communications 466: 400-405 (doi.org/10.1016/j.bbrc.2015.09.038)
Genome Biol. Evol. 8(6):1812–1823 (doi:10.1093/gbe/evw113)
BMC Evolutionary Biology (2015) 15:283 (DOI 10.1186/s12862-015-0558-z)
Genome Biol. Evol. 8(7):2190–2202 (doi:10.1093/gbe/evw164)
Trends in Genetics, Vol. 31: 215-219
PLoS ONE 10(3): e0119651 (doi:10.1371/journal.pone.0119651)
Molecular Plant 8, 935–945 (DOI: http://dx.doi.org/10.1016/j.molp.2014.12.008)
PLoS Genetics 11(7): e1005391 (doi:10.1371/journal.pgen.1005391)
Genetics 195: 1407–1417 (doi: 10.1534/genetics.113.160895)
Science 343: 769-772
Current Opinion in Structural Biology 26: 73–83
Current Opinion in Genetics & Development 29: 90–96
Mol. Biol. Evol. 32(1):216–228 (doi:10.1093/molbev/msu299)
EMBO reports Vol 15: 460-461
Mol. Biol. Evol. (doi:10.1093/molbev/mss179)
Annual Review of Genetics (doi: 10.1146/annurev-genet-111212-133301)
Nature Reviews/Genetics 14: 645-660
PLoS Genetics 8: e1002942 (doi:10.1371/journal.pgen.1002942)
PLoS One 7: e48650 (doi:10.1371/journal.pone.0048650)
PLoS Genet 8(3): e1002589 (doi:10.1371/journal.pgen.1002589)
PLoS Genet 9(10): e1003860 (doi:10.1371/journal.pgen.1003860)
Science 340: 1211-1214 (doi: 10.1126/science.1234393)