June 26th 2017

Proponents of massive alternative splicing argue that most human genes produce many different protein isoforms. According to these scientists, this means that humans can make about 100,000 different proteins from only ~20,000 protein-coding genes. They tend to believe humans are considerably more complex than other animals even though we have about the same number of genes. They think alternative splicing accounts for this complexity [see The Deflated Ego Problem].

Opponents (I am one) argue that most splice variants are due to splicing errors and most of those predicted protein isoforms don't exist. (We also argue that the differences between humans and other animals can be adequately explained by differential regulation of 20,000 protein-coding genes.) The controversy can only be resolved when proponents of massive alternative splicing provide evidence to support their claim that there are 100,000 functional proteins.

Some scientists are attempting to test the hypothesis by looking for the predicted proteins. One way to do that is to look for them using mass spectrometry. Recently (February 2017), Tress et al. reviewed and reanalyzed eight large-scale experiments and databases. Here's the abstract of their paper ...

Alternative splicing is commonly believed to be a major source of cellular protein diversity. However, although many thousands of alternatively spliced transcripts are routinely detected in RNA-seq studies, reliable large-scale mass spectrometry-based proteomics analyses identify only a small fraction of annotated alternative isoforms. The clearest finding from proteomics experiments is that most human genes have a single main protein isoform, while those alternative isoforms that are identified tend to be the most biologically plausible: those with the most cross-species conservation and those that do not compromise functional domains. Indeed, most alternative exons do not seem to be under selective pressure, suggesting that a large majority of predicted alternative transcripts may not even be translated into proteins.

There are lots of problems with these mass spec experiments. For example, they hardly ever detect all of the peptides of most genes even when the proteins are present in high concentrations. In addition to these well-known false negatives, there are some interesting false positives in the data. The authors are aware of these limitations and they are described and discussed in the paper.

Nevertheless, it is remarkable that tens of thousands of predicted protein variants are not detected in these experiments. The authors conclude ...

Alternative splicing is well documented at the transcript level, and microarray and RNA-seq experiments routinely detect evidence for many thousands of splice variants. However, large-scale proteomics experiments identify few alternative isoforms. The gap between the numbers of alternative variants detected in large-scale transcriptomics experiments and proteomics analyses is real and is difficult to explain away as a purely technical phenomenon. While alternative splicing clearly does contribute to the cellular proteome, the proteomics evidence indicates that it is not as widespread a phenomenon as suggested by transcript data. In particular, the popular view that alternative splicing can somehow compensate for the perceived lack of complexity in the human proteome is manifestly wrong. [my emphasis LAM]

Note: I strongly object to using "alternative splicing" as a synonym for "detection of large numbers of splice variants." I think the term "alternative splicing" should be restricted to genuine examples of real alternative splicing that generate different functional proteins. We should be careful to make this distinction clear in our writing.

The authors review other data on splice variants noting that they are not conserved and they are usually present at low concentrations. Coupling that data with the lack of detection of protein variants they say ...

The results from large-scale proteomics experiments are in line with evidence from cross-species conservation, human population variation studies, and investigations into the relative effect of gene expression and alternative splicing. Gene expression levels, not alternative splicing, seem to be the key to tissue specificity. While a small number of alternative isoforms are conserved across species, have strong tissue dependence, and are translated in detectable quantities, most have variable tissue specificities and appear to be evolving neutrally. This suggests that most annotated alternative variants are unlikely to have a functional cellular role as proteins. [my emphasis, LAM]

My colleague at the University of Toronto, Ben Blencowe, is a strong supporter of alternative splicing and its role in creating multiple isoforms of most proteins. He wrote a letter to Trends in Biochemical Sciences in which he challenges their results and conclusions. I'll discuss that letter in my next post. The authors of the paper respond to that letter.

If you have questions about the Tress et al. paper, this is a good place to ask them since at least one of the authors (Frederico Abascal) reads Sandwalk. I've repeatedly asked proponents of alternative splicing to address the issues I raise here but they have consistently declined to engage in debate. I don't know why they are so reluctant to defend their views.

Finally, allow me to make an important point about massive alternative splicing. This is the view that most human protein-coding genes (~90%) are alternatively spliced to produce multiple functional protein isoforms. This view is just speculation. It is not supported by solid evidence.

The facts are:

Almost all intron-containing genes produce splice variants of various sorts. Most databases show a strong correlation between the size of a gene and the number of variants that have been detected. Most genes have ten or more different variants in the various databases.
Splicing is an error-prone process. The error rate of splicing ranges from 0.01% to 1%. Modern techniques are quite capable of detecting the products of splicing errors and entering their sequences into the databases.
There are some excellent examples of true alternative splicing where the various protein isoforms have been detected and their functions elucidated. There are 35 examples listed in Tress et al. (2017). There may be several hundred examples if you include those with weaker evidence. There are about 20,000 protein-coding genes in the human genome.
Most variant splice sites are not conserved. The same gene in related species can produce a very different pattern of splice variants.
After decades of searching, the vast majority of predicted protein isoforms have not been detected.
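Here's a rough back-of-the-envelope sketch of why error-prone splicing alone could fill the databases with variants. It takes the error rates quoted above (0.01% to 1%) and asks what fraction of transcripts from a gene would carry at least one splicing error. The assumptions are mine and are not from Tress et al.: I treat the error rate as applying per intron, I treat each intron as independent, and the intron counts are hypothetical values chosen for illustration.

```python
def fraction_with_error(error_rate: float, n_introns: int) -> float:
    """Fraction of transcripts expected to contain at least one mis-spliced intron.

    Assumption (illustrative only): every intron is spliced independently
    and with the same per-intron error rate.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if n_introns < 0:
        raise ValueError("n_introns must be non-negative")
    # Probability that all introns are spliced correctly is (1 - p)^n,
    # so the complement is the chance of at least one error.
    return 1.0 - (1.0 - error_rate) ** n_introns


# Error rates from the post: 0.01% and 1%, written as fractions.
ERROR_RATES = (0.0001, 0.01)

# Hypothetical intron counts, chosen only to show the trend with gene size.
INTRON_COUNTS = (1, 5, 10, 50)

for rate in ERROR_RATES:
    for n in INTRON_COUNTS:
        frac = fraction_with_error(rate, n)
        print(f"error rate {rate:.2%}, {n:>2} introns: "
              f"{frac:.3%} of transcripts carry an error")
```

Under these assumptions a gene with ten introns would have somewhere between about 0.1% and nearly 10% of its transcripts mis-spliced, and the fraction rises with the number of introns. That is at least consistent with the correlation between gene size and the number of variants in the databases, although a simple model like this cannot prove that any particular variant is an error.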

There are some interesting questions that have not been addressed.

Most of the predicted protein isoforms postulated by massive alternative splicing proponents make no sense. For example, why would there be multiple isoforms of the standard metabolic enzymes? For proteins involved in large complexes (e.g. RNA polymerase) why would there be multiple isoforms that completely alter the structures of the polypeptides?
Why is it necessary to "explain" human complexity by postulating massive alternative splicing? What's wrong with the standard explanation from developmental biology?
Is the evolution of massive alternative splicing in a single species, like humans, compatible with our understanding of evolution? How about the presumed expansion in clades such as mammals? Is that more compatible? Is natural selection really so powerful?
Why do proponents of massive alternative splicing consistently ignore the possibility that splice variants could be just splicing errors? Why do reviewers of their papers allow them to ignore the main scientific criticism of their views? This is not how science is supposed to work.

Tress, M.L., Abascal, F., and Valencia, A. (2017) Alternative splicing may not be the key to proteome complexity. Trends Biochem. Sci. 42:98-110. [doi: 10.1016/j.tibs.2016.08.008]

This post first appeared on Sandwalk.
