The evolution of de novo genes

De novo genes are new genes that arise spontaneously from junk DNA [De novo gene birth]. The frequency of de novo gene creation is important for an understanding of evolution. If it's a frequent event, then species with a large amount of junk DNA might have a selective advantage over species with less junk DNA, especially in a changing environment.

Last week I read a short Nature article on de novo genes [Levy, 2019] and I think the subject deserves more attention. Most new genes in a species appear to arise by gene duplication and subsequent divergence. De novo genes, by contrast, are unrelated to genes in any other clade, so we can assume that they are created from junk DNA that accidentally becomes associated with a promoter, causing the DNA to be transcribed. A new gene is formed if the RNA acquires a function. If the transcript contains an open reading frame then it may be translated to produce a polypeptide, and if the polypeptide performs a new function then the resulting de novo gene is a new protein-coding gene.

The important question is whether the evolution of de novo genes is a common event or a rare event.

Noncoding genes

Noncoding genes¹ are genes that produce a functional RNA that isn't translated. The human genome contains several thousand well-established genes in this category and there is widespread speculation that we have thousands of others that have arisen recently in the human lineage. Thus, the prevailing view is that such genes arise very frequently.

These presumed de novo genes produce a variety of RNAs but many of them are referred to as lncRNAs. However, in spite of the prevailing belief, there is very little evidence that most of the postulated noncoding genes are real genes that produce functional RNAs. The fact that they produce RNAs isn't in doubt: what's in doubt is whether these RNAs are junk or not [How many lncRNAs are functional?]. Personally, I don't think there are very many de novo noncoding genes but it's still an open question.

We can't assume that the formation of de novo genes is a frequent event based on the data for noncoding genes. Conversely, we can't assume that it's a rare event until we have more data, although I think the evidence points in that direction.

Protein-coding genes

The recent review in Nature focuses exclusively on the formation of de novo protein-coding genes. The author, Adam Levy, gives some confirmed examples of such genes in fish and fruit flies and he speculates that there may be many other examples in rice, mice, and primates (including humans).

The idea that de novo genes could arise spontaneously has been around for a very long time and there have been suggestive examples in the scientific literature dating back to the 1980s but it has only been in the 21st century that really good examples have been demonstrated. There still aren't many confirmed cases. As in the case of noncoding genes, the number of speculative unproven examples is far higher. Nevertheless, Levy suggests that it's time to think about the implications ...
"De novo genes are even prompting a rethink of some portions of evolutionary theory. Conventional wisdom was that new genes tended to arise when existing ones are accidentally duplicated, blended with others or broken up, but some researchers now think that de novo genes could be quite common: some studies suggest at least one-tenth of genes could be made in this way; others estimate that more genes could emerge de novo than from gene duplication."

It's easy to establish whether a potential new protein-coding gene produces a protein because all you have to do is identify the protein in some cell. If you can't detect the protein by looking in a wide variety of cells and tissues, then it's possible that the RNA is never translated. The absence of a protein has tentatively eliminated hundreds of potential protein-coding genes [Origin of de novo genes in humans].

Just because you can detect a protein made from a putative de novo gene doesn't mean that the protein is actually functional. It could just be a spurious polypeptide. (Most of the putative de novo genes only encode a short polypeptide of less than 100 amino acid residues.)
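
To make the idea of an open reading frame concrete, here's a minimal sketch of how one might scan a transcript for ORFs and note how short they are. The function name `find_orfs`, the `min_codons` parameter, and the toy sequence are all hypothetical and chosen only for illustration. The scan looks at the three forward frames only and ignores splicing, the reverse strand, and alternative start codons, so it is not how any of the cited studies identified their candidates.

```python
# Hypothetical helper for illustration only: a naive forward-strand ORF scan.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=1):
    """Return (start, end, n_codons) for each ATG-to-stop ORF in the three forward frames.

    start and end are 0-based positions (end is exclusive and includes the stop codon).
    n_codons counts the codons from ATG up to, but not including, the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG of the ORF currently being read
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None  # look for the next ATG in this frame
    return orfs


# Toy sequence (made up): one tiny ORF, ATG AAA TTT followed by a TAG stop.
toy_orfs = find_orfs("CCATGAAATTTTAGGG")
print(toy_orfs)

# Filter for ORFs of at least 100 codons; the toy sequence has none.
long_orfs = find_orfs("CCATGAAATTTTAGGG", min_codons=100)
print(long_orfs)
```

Finding an ORF this way says nothing about function. It only shows that a stretch of sequence could, in principle, be translated.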

The other problem is that in order to be a truly de novo gene there must not be any homologs in other species. Demonstrating this is more difficult than it seems because of the lack of highly accurate genome sequences. A rigorous and critical analysis of putative mouse de novo genes has resulted in a substantial reduction of the total possibilities so that now there appear to be only 139 candidates (Casola, 2018). This means that the rate of formation of de novo genes in the rodent lineage (mouse vs rat) is about 12 per million years. This is an upper limit; the real rate is almost certainly lower.
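
The arithmetic behind a rate like this is simple division: the number of candidate genes divided by the time since the two lineages split. The divergence time isn't given here, so the 12 million years used in the sketch below is an assumption. It's a commonly quoted figure for mouse and rat and it reproduces the rate quoted above, but older estimates of the split would give a proportionally lower rate. The function name `de_novo_rate` is also just an illustrative label.

```python
def de_novo_rate(candidate_genes, divergence_myr):
    """Candidate de novo genes divided by elapsed time, in genes per million years."""
    if divergence_myr <= 0:
        raise ValueError("divergence time must be positive")
    return candidate_genes / divergence_myr


MOUSE_CANDIDATES = 139         # candidate count quoted in the text (Casola, 2018)
ASSUMED_DIVERGENCE_MYR = 12.0  # ASSUMPTION: mouse-rat divergence time, not stated in the text

rate = de_novo_rate(MOUSE_CANDIDATES, ASSUMED_DIVERGENCE_MYR)
print(f"{rate:.1f} candidate genes per million years")
```

Because every one of the 139 candidates is counted as a real gene, the result is an upper limit. Each candidate that turns out to be spurious lowers the rate.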

The rate in primate lineages is not known with confidence, but a recent paper suggests that it is about 2 per million years (Guerzoni and McLysaght, 2016) and this is consistent with other rigorous analyses of de novo protein-coding genes in humans. The rates in primates and rodents are significant but they are at least an order of magnitude lower than the rates of new gene formation from duplication.

It is commonly believed that formation of new species and diversification within a clade is associated with the evolution of new genes. This assumption is unnecessary since both speciation and diversification can be achieved by simply modifying the expression of existing genes, without the necessity of creating new genes; the field of evo-devo is devoted to proving this fact. However, if you believe that new genes are necessary then the only way to evolve genes with entirely new functions is by de novo formation from junk DNA.² That leads to the suggestion that the presence of large amounts of junk DNA in a species/clade gives it a selective advantage over other species with smaller genomes.

This argument is suspicious because it sounds teleological and it invokes species-level selection. It also has to deal with the fact that the rate of de novo gene formation is probably too low to account for any substantial amount of diversification over the time frames that are required for speciation.


1. It's awkward to define this group in negative terms (noncoding) but I haven't come up with a better term. Any suggestions are welcome.

2. Strictly speaking, this is not true since it's possible to create new genes from the opposite strand of existing genes. In fact, a good many of the putative de novo genes in rodents and primates fall into this category. I'm skeptical of those putative genes since there are very few proven examples of bona fide overlapping genes in eukaryotes.

Casola, C. (2018) From De Novo to “De Nono”: The Majority of Novel Protein-Coding Genes Identified with Phylostratigraphy Are Old Genes or Recent Duplicates. Genome Biology and Evolution, 10:2906-2918. [doi: 10.1093/gbe/evy231]

Guerzoni, D., and McLysaght, A. (2016) De novo genes arise at a slow but steady rate along the primate lineage and have been subject to incomplete lineage sorting. Genome Biology and Evolution, 8:1222-1232. [doi: 10.1093/gbe/evw074]

Levy, A. (2019) How evolution builds genes from scratch. Nature 574: 314-316. (Online title is: "Genes from the Junkyard.") [Nature]


This post first appeared on Sandwalk; please read the original post: here
