Tags: gene junk dna

March 13th 2018

Kostas Kampourakis is a specialist in science education at the University of Geneva, Geneva (Switzerland). Most of his book is an argument against genetic determinism in the style of Richard Lewontin. You should read this book if you are interested in that argument. The best way to describe the main thesis is to quote from the last chapter.

Here is the take-home message of this book: Genes were initially conceived as immaterial factors with heuristic values for research, but along the way they acquired a parallel identity as DNA segments. The two identities never converged completely, and therefore the best we can do so far is to think of genes as DNA segments that encode functional products. There are neither 'genes for' characters nor 'genes for' diseases. Genes do nothing on their own, but are important resources for our self-regulated organism. If we insist in asking what genes do, we can accept that they are implicated in the development of characters and disease, and that they account for variation in characters in particular populations. Beyond that, we should remember that genes are part of an interactive genome that we have just begun to understand, the study of which has various limitations. Genes are not our essences, they do not determine who we are, and they are not the explanation of who we are and what we do. Therefore we are not the prisoners of any genetic fate. This is what the present book has aimed to explain.

If you are interested in real facts about genes and the history of gene definitions, then you will be sorely disappointed because the author has fallen for the ENCODE hype. Similarly, if you want to know about genomes and junk DNA then don't read this book. The author takes his cues from Junk DNA by Nessa Carey and The Deeper Genome by John Parrington.

Genomes and junk are the topics that interest me so let's look at some other excerpts from the book, keeping in mind that the main part of the book is about genetic determinism and the large-scale phenotypic effects of genes and alleles.

The concept of a "gene" was poorly defined in the first part of the twentieth century. That fuzzy definition is still common today. It imagines a gene as a nebulous entity responsible for some visible trait. It's the way most people still think of a gene and it's the way students are often taught when they study genetics. Kostas Kampourakis does a pretty good job of describing the history of this idea up until 1953.

The next stage is something he calls the "molecularization" of genes. That's the transformation from a gene as the subject of genetics to the idea that a gene is the subject of biochemistry and molecular biology. This is an important shift and the author is justified in emphasizing the transformation.

From this point on, the book gets pretty confusing. The part I like is that the author doesn't get bogged down in the old-fashioned idea that genes only encode proteins. From fairly early on in the book he recognizes that a gene can specify either a protein or a functional RNA.¹ So far, so good.

The problems begin when he starts describing all the things that make a precise definition of a gene so difficult. Rather than treat these as exceptions that can be accommodated by a good working definition [What Is a Gene?], he focuses on the problems ...

Regulatory sequences, discontinuous genes, overlapping genes, trans-splicing, RNA editing, among other things, have made impossible the structural individualization of genes on DNA. Looking more closely into the phenomena presented in this chapter might make one argue that the RNA transcript should be considered as the "true" gene. ... The important conclusion from all these phenomena is that DNA does not contain distinct segments corresponding to the genes it is supposed to contain, or, in other words, that genes cannot be structurally individuated. These phenomena can therefore put the existence of genes into doubt. Do genes really exist? Perhaps they are a heuristic tool for research but nevertheless a human invention that we are still trying to force into existence.

Kampourakis has created a problem for himself by failing to point out that there are functional DNA sequences that don't count as genes using the molecular definition (regulatory sequences, centromeres, origins of replication) but do count as "genes" in the classic genetic sense since mutations in these sequences can produce an effect on the organism. His description would be much clearer if he had made this distinction.

In addition, he got confused by reading the ENCODE papers and falling for their paradigm shaft about the nature of genes [What is a gene, post-ENCODE?].

Now let's look at how the author deals with junk DNA. It's the subject of Chapter 11: "Genomes Are More than the Sum of Genes." That's an interesting title. It's correct, of course, especially if you take into account essential DNA sequences that aren't genes. However, it's a bit late in the book to be bringing up this topic. Here's what he says on page 210.

Is 98 percent of our DNA meaningless, as in the [example] above? Is it really "junk," perhaps the relic of our evolutionary history during which DNA sequences were simply accumulated? The answer is no, and in this chapter I explain why. The relevant knowledge has been emerging during the recent years, and we have come to know that much of what we used to call "junk" DNA seems to have important functions, particularly in the regulation of the expression of genes.

As I pointed out above, Kampourakis should have addressed this point early on when he was discussing how to define a gene. He left readers with the impression that the only important genome sequences were genes. He brings up the old canard that protein-coding regions are the only ones that count and all the rest was thought to be junk. Now he proposes to refute this strawman by explaining what he should have made clear 100 pages earlier.²

In fairness, he notes that the strawman view has been challenged in the past.

However, it should be noted that although the details have emerged recently, several researchers had been long aware that "junk" DNA was not entirely useless and that some DNA that does not code for proteins has important roles (Palazzo & Gregory, 2014).

I find it interesting that he quotes a four-year-old paper from my colleagues where they explain the real history of the problem. The details have not emerged recently as Kampourakis claims. We've known about important non-coding DNA for 50 years!

So, what is this recent data that calls into question the existence of junk DNA? You can probably guess the answer. Kampourakis recognizes that the genes for transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs) had been identified long ago. But then he says,

By that time [late 1960s], it had already become clear that nontranslated or noncoding RNA molecules, such as rRNA and tRNA, have an important role in gene expression. But as the ENCODE project showed, there are other functional sequences outside protein-coding genes, which encode certain noncoding RNA molecules. This led to the expanded definition of genes presented in Chapter 4, which includes the genes for noncoding RNA as well. Except for tRNA and rRNA, these genes encode other types of RNA molecules, such as small nuclear RNAs (snoRNAs) that are involved in RNA editing and micro RNAs (miRNAs) that have important regulatory functions. Although the details are still under study, the emerging evidence suggest there are a lot more genes encoding regulatory RNAs than proteins in the human genome (Morris & Mattick, 2014).

There are several things wrong with those sentences. For one thing, the passage totally misrepresents the history of the field. Noncoding RNAs such as snRNAs, miRNAs, and others were well known for many decades before ENCODE was started. Also, the definition of a gene as a DNA sequence that specifies a functional RNA was common in textbooks long before ENCODE. The ENCODE results did not prompt a serious revision of the definition of a gene in spite of the claims of ENCODE researchers. Finally, it is not true that there are more genes for regulatory RNAs than for proteins. (There are about 20,000 protein-coding genes.) The final results are not in but it's very unlikely that there are 20,000 different genes for noncoding RNAs. And even if that statement turns out to be true, those genes wouldn't represent a significant fraction of the genome.

It's clear that Kampourakis is solidly in the ENCODE camp and it's clear that he does not understand the Palazzo & Gregory paper and does not understand the evidence for junk DNA [Five Things You Should Know if You Want to Participate in the Junk DNA Debate].

Some beating of dead horses may be ethical, where here and there they display unexpected twitches that look like life.

Zuckerkandl and Pauling (1965)

Sandwalk readers are probably annoyed at me for beating a dead horse, but here's the problem. It's been more than ten years since the initial ENCODE results were published and more than five years since the main results were published in 2012 (along with the massive publicity campaign). Criticisms of the ENCODE hype have been widely available in the scientific literature and elsewhere since 2007. Many experts in evolutionary biology have explained the evidence for junk DNA and pointed out the limitations of the ENCODE conclusions.

All of this information is available to anyone who studies the problem. All knowledgeable scientists recognize that the case for junk DNA is very strong. Kampourakis addresses some of this criticism (notably the lack of conservation of presumed functional RNAs) but he ignores most of the other criticisms. Why? Why do so many authors perpetuate the ENCODE hype in the face of so much evidence that it's wrong? Is it because the publicity campaign organized by ENCODE researchers, with the help of Nature and Science, was so effective that it continues to overwhelm any attempt to correct the record? That's not a very good excuse for someone who is supposed to do the research before publishing a book on the subject of genes and genomes.

1. He's not very consistent. There are times in the second half of the book when he talks about genes as sequences that encode proteins.

2. Keep in mind that we include introns when we define a gene as a sequence that's transcribed. Thus, protein-coding genes (including their introns) make up 25% of our genome and known noncoding genes account for another 5%. Together, genes occupy 30% of our genome, a fact that should be mentioned in a book about genes.

Palazzo, A.F. and Gregory, T.R. (2014) The Case for Junk DNA. PLOS Genetics, 10:e1004351. [doi: 10.1371/journal.pgen.1004351]

This post first appeared on Sandwalk; please read the original post: here