Abstract
Organic molecules preserved in ancient rocks can function as ‘biomarkers’, providing a unique window into the evolution of life. While biomarkers demonstrate intriguing patterns through the Neoproterozoic, it can be difficult to constrain particular biomarkers to specific organisms. The goal of the present paper is to demonstrate the utility of biomarkers when we focus less on which organisms produce them, and more on how their underlying genetic pathways evolved. Using this approach, it becomes clear that there are discrepancies between the biomarker, fossil, and molecular records. However, these discrepancies probably represent long time periods between the diversification of eukaryotic groups through the Neoproterozoic and their eventual rise to ecological significance. This ‘long fuse’ hypothesis contrasts with the adaptive radiations often associated with the development of complex life.
Introduction
Genetics offers an exciting vantage to study evolution, but geology takes primacy as a direct window into the past [1]. By its nature, the fossil record will always be woefully incomplete, but new technologies continually push the boundaries of what can be recovered; clumped isotopes reveal the metabolic rates of extinct organisms [2,3], fossilized melanosomes demonstrate the colors of preserved skin and feathers [4,5], and proteins are being recovered from increasingly old specimens [6,7]. One important contribution to this paleontological renaissance comes from geochemistry, where the molecular compounds produced by long-extinct organisms can be extracted from rocks and analyzed using mass spectrometry. These organic compounds — collectively referred to as ‘molecular fossils’ or ‘biomarkers’ — provide insights into the organisms that existed in the deep past, even when traditional fossils fail to preserve [8].
In this review, I will focus on a particular class of biomarkers called steranes. There are many other interesting biomarkers; 2-methylhopanes, for example, are produced by cyanobacteria and have been used to infer microbial community dynamics across major Earth history events [9,10]. However, steranes are particularly relevant to the Neoproterozoic, as they show dynamic fluctuations during this period that are thought to chart the evolution of complex life. It has been suggested that steranes offer a single character for evolutionary analysis and are therefore insufficient to identify specific lifeforms in the past [11,12]; I strongly disagree with this logic. Many scientists have spent decades elucidating the genetic underpinnings of biomarker biosynthesis [13]. Thanks to this effort, a biomarker is not merely a single chemical compound; it also speaks to the many genes required to make it. By combining genetic and geochemical data, we have an opportunity to fully leverage the information inherent in biomarkers.
Understanding sterane biomarkers
Interpreting sterane biomarkers requires an understanding of their molecular structure. Steranes are the degraded (in geological terms, diagenetic) products of sterols, which are a class of lipid molecules. All sterols share a basic carbon skeleton structure, which includes a tetracyclic ring nucleus and a side chain (Figure 1). The sterol shown in Figure 1A is cholesterol (the sterol that is probably most familiar to readers), but various organisms modify the side chain and/or nucleus to make hundreds of different variants. Over deep time, preserved sterols tend to lose the alcohol (OH) functional group attached to the third carbon. When this occurs, the sterol is referred to as a sterane. Other than the loss of this functional group and most double bonds, steranes are highly resistant to diagenesis when incorporated into macromolecular structures such as petroleums and bitumens [8]. In addition to cholestane (the sterane derivative of cholesterol), common steranes preserved in the geologic record include ergostane (a derivative of the 28-carbon ergosterol) and stigmastane (a derivative of the 29-carbon stigmasterol/sitosterol). A variety of common and rare steranes are shown in Figure 1B.
Figure 1. (A) Carbon skeleton structure of cholesterol. (B) The structure of six geologically important steranes. The number of carbons in each sterane is indicated to the right. (C) The relative abundance of these six steranes through geological time. Note that the Neoproterozoic and Phanerozoic are not shown to scale. Data for (C) adapted from ref. [20].
Many sterols are produced by specific lineages of life; in biological terms, such sterols are phylogenetically informative. Complex sterols, such as ergosterol and stigmasterol, are restricted to eukaryotic lifeforms [14,15]. Thus, the presence of steranes in the fossil record indicates that eukaryotic life had evolved. Additionally, certain sterols are more common in some eukaryotic groups than others. In sweeping generalities (that will be nuanced later in the present paper), ergostanes are often considered a biomarker for fungi and algae, 29-carbon steranes like stigmastane indicate plants and green algae, and 24-isopropylcholestane is a biomarker for prehistoric sea sponges. These patterns help scientists ‘read’ the biomarker record and determine which organisms were dominant at different points in the past.
The trajectory of steranes through the Neoproterozoic
The biomarker record in the Neoproterozoic demonstrates dramatic fluctuations and is summarized in Figure 1C. The oldest steranes were once thought to come from ∼2.7 billion-year-old rocks from the Pilbara Craton in Australia, but these have since been rejected as contamination [16,17]. Most of the published pre-Ediacaran biomarker record might also be compromised by younger contaminants, and resolving this is an important area of active research [18,19]. Today, the oldest accepted steranes occur ∼820 million years ago (Mya) [20]. For ∼100 million years, sterane concentrations remain low relative to bacterial hopanes [21]. Cholestanes are the dominant steranes found in these rocks, although trace amounts of an unusual sterane called cryostane have also been recovered [20,22]. The Cryogenian period (720–635 Mya) represents a time of increasing sterane diversity and abundance. Between the Sturtian and Marinoan glaciation events, steranes increase two to three orders of magnitude relative to hopanes [20]. Ergostane and stigmastane reach approximately modern levels, and rare steranes, such as 24-isopropylcholestane and 24-n-propylcholestane, first appear [20,23].
Taken at face value, this record could signify the evolution and expansion of marine eukaryotic life. The presence of cholestanes ∼820 Mya could represent the evolution of unicellular eukaryotes, and perhaps multicellular red algae (rhodophytes), which predominantly synthesize cholesterol today [21]. The expansion of stigmastanes would mark the development of green algae (chlorophytes), while ergostanes signify the presence of fungi and/or additional marine eukaryotes (stramenopiles/alveolates/rhizarians). 24-isopropylcholestane would mark the evolution of sea sponges and thus the earliest animals [24]. This offers a narrative of eukaryotic diversification starting in the interglacial period of the Cryogenian, which continued through the Neoproterozoic until the Cambrian explosion of modern animal groups [25,26].
As compelling as this story is, it is inconsistent with other lines of evidence. For example, eukaryotes appear to have evolved well before ∼820 Mya. Molecular clocks, which compare genetic differences between living organisms to infer the timing of their last common ancestor [27], place the origin of living eukaryotes between 950 and 1870 Mya [28–30]. Fossils also suggest an older history; Bangiomorpha pubescens is a well-accepted red alga from ∼1047 Mya [31], while other, more controversial red algae could be closer to 1600 Mya [32]. Sponge biomarkers have the opposite problem; unequivocal sponge fossils do not appear until the Cambrian, almost 100 million years after the origin of sponge biomarkers. These discrepancies have led to some doubt about our interpretations of the biomarker record.
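The core arithmetic behind a molecular clock can be sketched in a few lines. This is a deliberately simplified illustration that assumes a single constant substitution rate (a 'strict' clock); the function name and the numbers in the example are hypothetical and are not taken from any of the studies cited above, which rely on considerably more elaborate models.

```python
def divergence_time_mya(pairwise_distance: float, rate_per_site_per_myr: float) -> float:
    """Estimate the time (in Mya) since two lineages shared a common ancestor.

    Assumes a strict clock: substitutions accumulate at the same constant
    rate along both branches, so the pairwise distance d equals 2 * r * t.
    """
    if pairwise_distance < 0:
        raise ValueError("pairwise distance cannot be negative")
    if rate_per_site_per_myr <= 0:
        raise ValueError("substitution rate must be positive")
    return pairwise_distance / (2.0 * rate_per_site_per_myr)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the calculation.
    distance = 0.6   # substitutions per site between two living species
    rate = 0.0003    # substitutions per site per million years
    print(f"Estimated divergence: {divergence_time_mya(distance, rate):.0f} Mya")
```

Because the estimate scales inversely with the assumed rate, uncertainty in the rate translates directly into uncertainty in the date, which is one reason published clock estimates are reported as ranges.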
The above statements describing the sources of various biomarkers are based on studies of living organisms, but a more rigorous way to interpret biomarkers is by thinking about the evolution of the genes required for their biosynthesis. A recurring theme throughout the present paper is that genes have a distinct evolutionary history from species. I have illustrated this point in Figure 2. When one species splits into two, each carries a copy of an ancestral gene. In one lineage that gene might eventually be removed by natural selection, while in the other lineage a gene duplication event occurs, meaning it now has two copies. Additionally, horizontal gene transfer allows genes to be swapped between distantly related organisms (this process appears to have been critical for the early evolution of sterol biosynthesis [33]). The important takeaway is that evolutionary events at the genetic level can occur at different times than the events that lead to different species. These patterns of gene loss, gene duplication, and horizontal gene transfer dictate the types of sterols that different lineages can produce. For the rest of the present paper, I will provide two case studies that show how a gene-centered view of sterol evolution helps clarify the patterns seen in the sterol record.
Figure 2. The history of two genes is illustrated in this scenario; the first is indicated with a solid line and the second with a dotted line. Images adapted from ref. [34].
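The scenario described above, in which one descendant lineage loses a gene while the other duplicates it, can be expressed as a toy bookkeeping exercise. The sketch below is only an illustration of the idea that gene-level events are tracked separately from speciation; the event labels and the function name are hypothetical, and no real phylogenetic software works this simply.

```python
from typing import Iterable

# Hypothetical event labels used only for this illustration.
DUPLICATION = "duplication"
LOSS = "loss"
TRANSFER_IN = "horizontal_transfer_in"


def copy_number(ancestral_copies: int, events: Iterable[str]) -> int:
    """Return the number of gene copies in a lineage after a series of events."""
    copies = ancestral_copies
    for event in events:
        if event in (DUPLICATION, TRANSFER_IN):
            copies += 1  # both processes add one copy to the genome
        elif event == LOSS:
            if copies == 0:
                raise ValueError("cannot lose a gene that is not present")
            copies -= 1
        else:
            raise ValueError(f"unknown event: {event}")
    return copies


if __name__ == "__main__":
    # After speciation, both lineages start with one copy of the ancestral gene.
    lineage_a = copy_number(1, [LOSS])         # the gene is lost
    lineage_b = copy_number(1, [DUPLICATION])  # the gene is duplicated
    print(lineage_a, lineage_b)
```

The two lineages share a single speciation event yet end with different gene complements, which is why the timing of a duplication has to be inferred from the gene's own history rather than from the species tree alone.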
Sponge biomarkers and the rise of animal life
A plethora of Neoproterozoic sponge fossils have been described, but all remain controversial. I am inclined to agree with several recent reviews that suggest there are no unambiguous sponge fossils prior to the Cambrian [35,36]. This is, however, in complete disagreement with the molecular data, which consistently support a Neoproterozoic origin for sponge diversification [37–42]. In this regard, the presence of putative sponge biomarkers ∼650 Mya is compelling, as it brings the geological and genetic records into general congruence. It is possible that the Precambrian sponge fossil record has been lost due to significant taphonomic bias caused by sparse biomineralization in early sponges and/or poor preservation conditions [43]. It is worth noting that anoxic deposits from the Late Permian are associated with a decrease in the size and abundance of siliceous sponge spicules [44], and that oxygen levels would have been even lower in the Neoproterozoic. However, I sympathize with paleontologists who express skepticism regarding the extensive missing fossil record that biomarkers imply, particularly given the effort scientists have made in trying to find Precambrian sponges. Is it possible that convincing sponge fossils have not been found in the Neoproterozoic because they do not exist? My goal is to walk readers through the sponge biomarker hypothesis and give my perspective on the idea's strengths and limitations.
The sponge biomarker hypothesis [24,45] is an attempt to explain the relative abundance of 24-isopropylcholestanes in certain Neoproterozoic rocks. The 30-carbon (C30) precursor sterol is exceedingly rare in nature, but is found in two relevant lineages, the demosponges (a subset of living sea sponges) and pelagophyte algae. Some sea sponges produce 24-isopropylcholesterol as the major component of their lipid biomass [46–48], while pelagophytes produce trace amounts of the sterol during the biosynthesis of a different compound, 24-n-propylcholesterol (Figure 1B) [49–54]. Based on these observations, the ratio in rocks of 24-isopropylcholestanes to 24-n-propylcholestanes (24ipc/24npc) should be indicative of their source: low 24ipc/24npc ratios provide a biomarker for algae, while high ratios indicate sponges. Of course, all of these ideas are based on the production of sterols in living organisms; how can we be confident that pelagophyte algae did not produce larger quantities of 24-isopropylcholesterol in the deep past [12]?
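The logic of the 24ipc/24npc ratio can be written out as a short calculation. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the cutoff separating 'low' from 'high' ratios is left as a required argument because no specific threshold is given here.

```python
def ipc_npc_ratio(ipc_abundance: float, npc_abundance: float) -> float:
    """Return the ratio of 24-isopropylcholestane to 24-n-propylcholestane."""
    if ipc_abundance < 0:
        raise ValueError("24ipc abundance cannot be negative")
    if npc_abundance <= 0:
        raise ValueError("24npc abundance must be positive to form a ratio")
    return ipc_abundance / npc_abundance


def suggested_source(ratio: float, threshold: float) -> str:
    """Classify a sample; the threshold is an assumption supplied by the caller."""
    return "sponge-like" if ratio > threshold else "algal-like"


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary but matching units.
    ratio = ipc_npc_ratio(3.0, 2.0)
    print(ratio, suggested_source(ratio, threshold=1.0))
```

Keeping the threshold explicit makes it obvious that the classification is only as good as the cutoff chosen, and that the cutoff itself rests on sterol production in living organisms.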
This question can be addressed at the genetic level. Scientists have determined that the gene sterol 24-C-methyltransferase (or smt) is required to add methyl groups (carbon) to the 24th carbon in the sterol skeleton (Figure 1) [55]. In most cases, the number of smt genes an organism has dictates the upper limit to the number of methyl groups that can be added to carbon 24 [56]. Pelagophyte algae and sponges both appear to produce C30 sterols through independent gene duplications of the smt gene. So, understanding when these gene duplication events occurred can elucidate when the two lineages evolved the ability to produce C30 sterols.
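The relationship between smt copy number and sterol size can be summarized with a simple rule of thumb. The sketch below assumes that each smt gene contributes at most one additional carbon to a 27-carbon cholesterol skeleton; the function name is hypothetical, and the rule only holds 'in most cases', since an enzyme that methylates more than once would break it.

```python
CHOLESTEROL_CARBONS = 27  # cholesterol, the unmethylated starting point, has 27 carbons


def max_sterol_carbons(n_smt_genes: int) -> int:
    """Upper limit on sterol carbon number, assuming one methyl group per smt gene."""
    if n_smt_genes < 0:
        raise ValueError("gene copy number cannot be negative")
    return CHOLESTEROL_CARBONS + n_smt_genes


if __name__ == "__main__":
    for copies in range(4):
        print(f"{copies} smt gene(s): up to C{max_sterol_carbons(copies)}")
```

Under this simplification, an organism needs three smt copies to reach a C30 sterol, which is why dating the duplications that produced those copies constrains when C30 biosynthesis became possible.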
Figure 3 shows a molecular clock of the smt gene (adapted from ref. [56]). In other words, this analysis tracks the pattern of speciation events and gene duplication events that led to the diversity of smt genes that exist across eukaryotes today. Every star in this figure marks a gene duplication event; the stars relevant to sponge and algal duplications are highlighted in yellow. The bars around each star represent 95% confidence windows. As the bars demonstrate, there is significant uncertainty regarding the timing of these gene duplication events. However, the results strongly reject one hypothesis for the sterane's source; the gene duplication event in the algal lineage did not occur until hundreds of millions of years after the Neoproterozoic. This work provides compelling evidence that neither pelagophyte algae nor their direct ancestors had the genes necessary to produce C30 sterols in the Neoproterozoic.
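Testing whether a dated duplication is compatible with a biomarker's age amounts to asking whether two time intervals overlap. The following sketch illustrates that comparison using the Cryogenian boundaries given earlier (720–635 Mya); the two confidence windows in the example are placeholder values invented for illustration and are not read from Figure 3.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A span of geological time, expressed in millions of years ago (Mya)."""
    older_mya: float
    younger_mya: float


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two intervals share any point in time."""
    for interval in (a, b):
        if interval.older_mya < interval.younger_mya:
            raise ValueError("older_mya must be at least as large as younger_mya")
    return a.older_mya >= b.younger_mya and b.older_mya >= a.younger_mya


CRYOGENIAN = Interval(older_mya=720, younger_mya=635)

if __name__ == "__main__":
    # Placeholder 95% confidence windows, NOT values from the published analysis.
    hypothetical_early_duplication = Interval(older_mya=800, younger_mya=600)
    hypothetical_late_duplication = Interval(older_mya=300, younger_mya=100)
    print(overlaps(hypothetical_early_duplication, CRYOGENIAN))
    print(overlaps(hypothetical_late_duplication, CRYOGENIAN))
```

A lineage whose entire confidence window falls after the biomarker's age can be excluded as a source even when the window is wide, which is the form of argument applied to the algal duplication.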
These results refute the hypothesis that pelagophyte algae could be responsible for Neoproterozoic biomarkers, but several additional arguments against the sponge biomarker hypothesis have been proposed. One claim is that this biomarker disappears in the Cambrian which, in seeming contradiction, is precisely when diverse sponge fossils finally show up. This argument is not entirely correct. It is not true that 24-isopropylcholestanes disappear in the Cambrian; instead, the 24ipc/24npc ratio becomes heavily biased toward 24-n-propylcholestanes [45]. This is consistent with increased algal outputs relative to sponges in the Cambrian. Secondly, most sponges do not produce 24-isopropylcholesterol, and while the sterol is present in all major classes of demosponges, only a handful of species produce it as a significant component of their sterol repertoire [24]. It is therefore quite possible that the drop in 24-isopropylcholestanes after the Cambrian represents the replacement of a dominant, primitive sponge lineage by various diverse forms found in the Cambrian, many of which traded 24-isopropylcholesterol for other sterols.
A second argument asks how we can know that another group of organisms (either unsampled or extinct) did not converge on the ability to synthesize 24-isopropylcholesterol. This is a more serious problem and, given the nature of science, can never be fully ruled out. However, I do not think this problem is intractable. Sterol pathways have not changed much in eukaryotes over evolutionary time. Key enzymes in the sterol biosynthesis pathway are deeply conserved [13,57], and major clades of eukaryotes tend to produce similar sets of sterols [58]. No demonstrably extinct sterols have been identified in the fossil record [59]. Sponges are unique among living organisms in the diversity of unusual sterols they produce. At the genetic level, they appear to do this through a two-step process. Firstly, modification of the sponge smt gene allows for the promiscuous methylation of sterols, meaning a single SMT enzyme can generate both C28 and C29 sterols [56,60]. Secondly, one or more gene duplication events resulted in additional sponge smt genes, which allow for the production of rare C30 sterols [56]. This hypothesis has only been proposed over the last few years and needs to be further tested with additional genetic and functional experiments. Most importantly, more genetic data need to be collected from sponge species that produce C30 sterols as their primary lipids. However, the more we come to understand about the genetic history of sponge smt genes, the more complex their evolution appears to be, and the less likely convergence appears.
As a final point for those interested in molecular clocks, the inability to fully rule out alternative sterane sources — combined with uncertainty about when precisely demosponges evolved the ability to produce this compound [61] — is why sponge biomarkers should not be used for the calibration of molecular clocks. Fortunately, the inclusion of sponge biomarkers as a molecular clock calibration is not necessary to get a Neoproterozoic origin for sponges, and the two should be treated as independent lines of evidence for sponge-grade animals existing in the Cryogenian. I find this convergence of data compelling evidence for a Neoproterozoic origin of animal life, and enigmatic fossils from this period should be considered in light of these data.
Algal biomarkers: a Cryogenian diversification?
For this second case study, I would like to turn back to the major pattern of Neoproterozoic steranes: the diversification of ergostanes and stigmastanes following the Sturtian glaciation. Hoshino et al. [23] emphasize that this pattern is consistent with the genetic record as described in ref. [56]. They note that molecular clock data suggest that green algal smt genes diverged in the Late Cryogenian, and that ‘stigmasteroid biosynthesis emerged in an ancestral green algae and subsequently led to the rise of this group to ecological dominance’. I have reproduced the relevant image in Figure 4. To be clear, I think Hoshino et al. provide important new data on the biomarker record, and present an intriguing hypothesis about the rise of complex life. However, interpreting their particular claim in the light of the gene/species evolution dichotomy reveals that the genetic data are more problematic than Hoshino et al. suggest.
What Hoshino et al. focus on is a speciation event, specifically when the green alga Ostreococcus tauri separated from higher plants. However, stigmastanes are far more ubiquitous than 24-isopropylcholestanes. Many other eukaryotes, besides green algae and plants, synthesize C29 sterols, including certain fungi, ichthyosporeans, amoebas, sea sponges, eustigmatophytes, and kinetoplastids [56,57,60,62,63]. In most of these organisms, the ability to synthesize C29 sterols is associated with a second copy of the smt gene [56]. So, from a genetics perspective, the question is not when green algae like Ostreococcus evolved, but when eukaryotes evolved a second smt gene.
Figure 4 suggests that the ‘algal’ gene duplication event is actually quite old and extends all the way back to the origin of the Bikonta (i.e. the last common ancestor of stramenopile algae and green algae). The molecular data therefore suggest that the ability to make C29 sterols predates stigmastanes in the rock record by hundreds of millions of years. It also shows that C29 biosynthesis is an ancient trait in eukaryotes. It is therefore unlikely that the evolution of stigmasteroid biosynthesis played a causal role in the success of green algae in the Neoproterozoic, as this ability already existed in its ancestors for hundreds of millions of years. The basic premise of Hoshino et al. — that the rise of stigmastanes in the Neoproterozoic coincides with the rise of green algae — is still consistent with the molecular data, but a gene-centered understanding of biomarkers reveals as many questions as answers.
Conclusion
At first glance, a gene-centered interpretation of the biomarker record suggests significant discrepancies between the geochemical, paleontological, and molecular datasets. In truth, I think they are telling us a consistent story. I suspect the likeliest explanation for these discrepancies is a ‘long fuse’ hypothesis, meaning there is often a lag time between when organisms first evolve, and when they achieve sufficient ecological presence to leave a fossil record. Molecular clocks suggest red and green algae evolved in the Paleo/Mesoproterozoic [64,65], consistent with the presence of Bangiomorpha in the fossil record. Presumably, the sulfidic oceans and global glaciation events that dominated the Meso- and Neoproterozoic kept algal populations small, as indicated by undetectable-to-low sterane levels [20,25,66]. Only after the Sturtian glaciation event and the increased oxygenation of ocean waters did algae come to dominate the ocean's photic zone and subsequently leave a biomarker record. Similarly, molecular clocks consistently demonstrate that animals evolved in the Cryogenian [38,39,67]. 24-Isopropylcholestanes suggest that sponge-grade animals were present at this time, but it is unclear how common they were, or whether they produced the structural spicules that are diagnostic of sponge fossils [37]. Putative animal fossils through the Neoproterozoic suggest that most forms were uncommon, small, unmineralized, and generally unlikely to leave a fossil record [68–71]. It was only after the Cambrian that biomineralization evolved and animal diversity radiated. If this ‘long fuse’ hypothesis is correct, then I predict that the earliest members of eukaryote groups will also be some of the hardest to find. These observations point to the importance of combining genetic, fossil, and geochemical tools to explore this critical phase of Earth's history.
Summary
Organic compounds preserved in rocks can provide biomarkers for prehistoric organisms.
Sterane biomarkers show a complicated pattern through the Neoproterozoic that could represent the evolution of complex life.
Steranes are difficult to interpret on their own, but studying the genes responsible for their biosynthesis reveals what organism(s) could have produced them.
Combining biomarker, fossil, and genetic data suggests a lag time between the origin of several major groups (such as algae and animals) and when they become ecologically important enough to leave fossil records.
Competing Interests
The Authors declare that there are no competing interests associated with the manuscript.
Abbreviations: 24ipc/24npc, 24-isopropylcholestanes/24-n-propylcholestanes; C30, 30-carbon; Mya, million years ago; smt, sterol 24-C-methyltransferase
- Received February 6, 2018.
- Revision received April 30, 2018.
- Accepted May 23, 2018.
- © 2018 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society and the Royal Society of Biology