The Information Content of Molecular Fossils
Many different organisms can produce the same molecule. We use information theory and biosynthetic network analysis to quantify how much a molecular fossil can actually tell you about its source.
You find an organic compound preserved in a billion-year-old rock. What made it? The honest answer is often “we’re not sure,” because multiple organisms, pathways, and precursors can produce the same molecule, and sometimes different biological molecules are altered after deposition until they look the same. Geochemists have traditionally handled this by making qualitative judgments about biomarker “specificity,” but such judgments are hard to make rigorous.
We take a quantitative approach. Using information theory, metabolic network analysis, and other tools, we measure how much a given molecular fossil actually constrains its biological source. Some compounds turn out to be highly diagnostic. Others are ambiguous no matter how well preserved they are.
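One way to picture “how much a molecular fossil constrains its source” is as a drop in Shannon entropy over candidate sources once the molecule is observed. The sketch below is only an illustration under stated assumptions, not the method used in this work: the source labels, the uniform prior, and the production probabilities are all hypothetical toy values, and the function names are made up for the example.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def posterior(prior, likelihood):
    """Bayes update: P(source | molecule) from P(source) and P(molecule | source)."""
    joint = {s: prior[s] * likelihood.get(s, 0.0) for s in prior}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("no candidate source can produce this molecule")
    return {s: j / total for s, j in joint.items()}

def information_gain(prior, likelihood):
    """Bits of uncertainty about the source removed by observing the molecule."""
    post = posterior(prior, likelihood)
    return entropy(prior.values()) - entropy(post.values())

# Hypothetical toy setup: four equally likely candidate sources.
prior = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

# Hypothetical P(molecule | source) for three kinds of compound.
diagnostic = {"A": 1.0}                                  # only A makes it
partial = {"A": 1.0, "B": 1.0}                           # A and B both make it
ambiguous = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}     # everyone makes it

gain_diagnostic = information_gain(prior, diagnostic)
gain_partial = information_gain(prior, partial)
gain_ambiguous = information_gain(prior, ambiguous)
```

In this toy case the diagnostic compound removes all 2 bits of uncertainty, the partially specific one removes 1 bit, and the ubiquitous one removes none, however well preserved it is. With a non-uniform prior, a single observation can also increase uncertainty, so the gain is not guaranteed to be positive in general.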
Retrobiosynthetic analysis
We trace the chemistry backward: starting from a preserved product and working through the network of possible biosynthetic routes that could have made it. This tells us how much the structure of biochemistry itself constrains source attribution, and where the real diagnostic power lies.
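The backward trace can be sketched as a walk over a graph whose edges point from each compound to its possible precursors. This is a minimal illustration, assuming a hand-written toy network: the compound names (`fossil`, `lipid_X`, and so on) and the `backward_routes` helper are hypothetical placeholders, not real biochemistry or the analysis code used here.

```python
def backward_routes(product, precursors_of, _path=None):
    """List every route from `product` back to a compound with no listed precursor.

    `precursors_of` maps a compound to the compounds it can be made from.
    Precursors already on the current path are skipped, so a cyclic
    network cannot cause infinite recursion.
    """
    path = (_path or []) + [product]
    precursors = [p for p in precursors_of.get(product, ()) if p not in path]
    if not precursors:
        return [path]  # reached a starting compound (or closed a cycle)
    routes = []
    for precursor in precursors:
        routes.extend(backward_routes(precursor, precursors_of, path))
    return routes

# Hypothetical toy network: one preserved product, two possible parent
# lipids, and two upstream intermediates.
toy_network = {
    "fossil": ["lipid_X", "lipid_Y"],
    "lipid_X": ["intermediate_1"],
    "lipid_Y": ["intermediate_1", "intermediate_2"],
}

routes = backward_routes("fossil", toy_network)
for route in routes:
    print(" <- ".join(route))
```

Here the single preserved product has three distinct routes leading to it. The number and overlap of such routes give a crude sense of how strongly the network alone narrows down the origin of a compound.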
What molecular fossils can’t tell you
The compounds that would be most useful for identifying their source aren’t necessarily the ones that get preserved. Luckily, some of the molecules that are preserved turn out to be highly specific. The limits on what we can learn from the molecular fossil record aren’t just about diagenesis. They’re also about the structure of biochemistry: how many organisms make how many molecules through how many pathways.