How life began and evolved on Earth is a question that has fascinated humans for a long time, and modern scientists have made great advances in finding some answers. Now, our recent study aims to offer new insights into the origin of life on Earth.
Around 375 million years ago, our fish-like ancestors breathed through gills. Over 600 million years ago, the common ancestor of all animals emerged – the microscopic urmetazoan. Billions of years before all of that happened, however, the common ancestor of all living organisms, the last universal common ancestor (Luca), must have existed.
Scientists have spent decades trying to identify Luca, and have arrived at different ideas about what it was like. Another point of contention is Luca’s age. The earliest fossil evidence we have for life is around 3.4 billion years old. Some studies push Luca’s age back close to the birth of Earth, 4.5 billion years ago. Others think this is impossible because of the time it would take to establish the genetic code and DNA replication machinery.
Luca was not the first form of life; it was the organism from which all organisms living today have descended. In fact, scientists think living organisms may have existed long before Luca. Understanding what Luca was like, and when it lived, is important for helping us figure out how life has evolved on Earth.
In our recent study, published in Nature Ecology & Evolution, we used a combination of scientific methods to reconstruct Luca’s genome and show how the genes we found might have allowed Luca to live. This project was the result of several years of work by an international team of collaborators.
The nature of Luca
To reconstruct Luca’s genome, we needed a sample of genomes (all the genetic information in an organism) from across different groups of bacteria and archaea (single-celled organisms distinct from bacteria) so that we could be sure our sample represented the diversity of modern life. We excluded eukaryotes (such as plants, animals and fungi) because scientists think they evolved much later on, from a union of archaea and bacteria. We had a set of 700 genomes (350 archaea and 350 bacteria), already curated for a 2022 study some of us were involved in.
We sorted the genes in these genomes into different families to understand their purpose in modern organisms. We used a database for this, called KEGG, that helps scientists figure out organisms’ metabolic pathways (how they sustain life).
Next, we used these families to infer phylogenetic trees (or phylogenies, somewhat like a family tree) to understand the relationship between different species and see how they evolved over time. We also built a separate set of 57 genes that are common to all the 700 organisms in our study and that are probably in almost all life. These types of genes have not changed much over the last few billion years.
We used these 57 genes to build a species tree, which shows the evolutionary relationships between the different organisms. We could then combine our KEGG gene trees with the species tree by modelling rates of gene duplication, transfer and loss. This also allowed us to calculate the likelihood of different gene families being present in Luca.
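To give a feel for what such likelihoods can be used for, here is a minimal sketch in Python. It is not the method from our study: the family names, the probabilities and the 0.5 cut-off are all made up for illustration. It only shows how a list of per-family presence probabilities could be turned into a shortlist of likely genes and a rough expected number of gene families.

```python
def summarise_presence(probabilities, threshold=0.5):
    """Summarise per-family probabilities of presence in an ancestor.

    probabilities: dict mapping a gene family label to a probability (0 to 1).
    threshold: hypothetical cut-off above which a family counts as "likely".
    Returns (sorted list of likely families, expected number of families).
    """
    # Families whose probability of presence meets the cut-off.
    likely = sorted(name for name, p in probabilities.items() if p >= threshold)
    # The expected number of families present is the sum of the probabilities.
    expected_count = sum(probabilities.values())
    return likely, expected_count


# Made-up example values, not results from the study.
example = {"family_A": 0.95, "family_B": 0.60, "family_C": 0.10}
likely_families, expected_families = summarise_presence(example)
print(likely_families, round(expected_families, 2))
```

Summing the probabilities, rather than counting only the shortlisted families, keeps the contribution of genes that are individually uncertain.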
Reconstructing Luca’s genome allowed us to estimate its metabolism, as if it were alive today. We picture Luca as a fairly complex organism, similar to modern bacteria and archaea, with a small genome. However, we did not find evidence for photosynthesis (which some bacteria use) or nitrogen fixation, a chemical process some modern bacteria and archaea use to stay alive.
How old was Luca?
We also tried a new method to estimate Luca’s age, combining genes that we think duplicated before Luca with information from fossils.
Normally, to infer evolutionary timelines, we would obtain a phylogeny of our species of interest with homologous genes, which trace back to a common ancestor.
Then, we would find a group of species that are distantly related (an outgroup) to our species of interest to establish the root of the phylogeny.
The “branches” that connect the species in a phylogeny hold information about the rate at which genetic changes (mutations) happened and the time at which species diverged. We can use fossil or geological evidence to inform the molecular clock about potential minimum ages at which speciation events took place.
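The basic arithmetic behind a molecular clock can be sketched in a few lines of Python. This is a deliberately simplified "strict" clock that assumes mutations accumulate at one constant rate and treats a fossil age as an exact date rather than a minimum; real analyses typically relax both assumptions. All numbers below are hypothetical.

```python
def calibrate_rate(distance, split_age_myr):
    """Estimate a rate (changes per site per million years) from one dated split.

    distance: genetic distance between two species (changes per site).
    split_age_myr: age of their split in millions of years, e.g. from a fossil.
    """
    # Changes accumulate along both lineages after a split, hence the factor 2.
    return distance / (2.0 * split_age_myr)


def divergence_time(distance, rate):
    """Estimate when two species diverged (million years ago) under a constant rate."""
    return distance / (2.0 * rate)


# Hypothetical calibration: two species differ by 0.30 changes per site
# and a fossil dates their split to 500 million years ago.
rate = calibrate_rate(0.30, 500.0)

# Hypothetical query: another pair of species differs by 0.90 changes per site.
age = divergence_time(0.90, rate)
print(f"Estimated divergence: about {age:.0f} million years ago")
```

With these made-up values, a pair of species three times as different as the calibration pair is estimated to have diverged three times as long ago.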
With Luca, however, we have two problems. There is no outgroup to the origin of life and there are not many fossils or much geological evidence from the early Earth that we can use to calibrate the molecular clock.
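To overcome these restrictions, we used paralogous genes that scientists had already traced to Luca. Paralogous genes are related to each other through gene duplication: a gene is copied within a genome, and when a species later splits into two, each descendant inherits its own set of the duplicated copies. Because these duplications happened before Luca, each copy can act as an outgroup for the other, and the same fossil evidence can be used to calibrate both.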
We estimate that Luca roamed the Earth around 4.2 billion years ago. If our time estimate is close to the truth, things such as the genetic code, protein translation, and life itself must have evolved rapidly, almost right after the Earth was formed.
Our reconstruction of Luca is not the first, and it certainly will not be the last. More and more organisms are being discovered and sequenced each year, computers are getting more powerful, and evolutionary models are continuously improving. Therefore, our understanding of Luca may change when more data and more powerful techniques are available.
For instance, we should consider that there were probably many other organisms living at the time of Luca which are no longer represented by any organisms today. If any of Luca’s early descendants did not make it to the modern day, and their genes did not survive, then we will never be able to map these gene families back to Luca, which means our reconstruction of Luca may be incomplete.
Despite these technical limitations, our study offers a new way to understand Luca. But there is still much more work to be done to better understand how life has evolved since the formation of our planet.