What Limits a Genome’s Size?

Words by
Niko McCarty

This is the first article in our new column. These are pieces that are too short to run in our quarterly Issues. Niko McCarty kicks off the series with a short essay on the biophysical limits that keep genomes from growing too long.

***

In the forests of New Caledonia, a remote South Pacific island, there live spindly ferns that extend their tendrils skyward like miniature beanstalks. Botanists recently trekked across this island to study the ferns, Tmesipteris oblanceolata. And what they found, remarkably, is the world’s largest genome; each fern cell contains 160.45 billion bases of DNA which, if unfurled, would extend for more than 100 meters—taller than the Statue of Liberty.

By way of comparison, with 3.1 billion bases of DNA, the human genome measures just 2 meters in length when stretched end-to-end. Even though humans have a relatively small genome compared to the New Caledonian fork fern, the way our DNA condenses into a cell’s nucleus is similarly impressive; it is, in fact, “geometrically equivalent to packing 40 kilometers (24 miles) of extremely fine thread into a tennis ball,” writes Bruce Alberts in Molecular Biology of the Cell.
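
These lengths can be checked with a quick back-of-the-envelope sketch. It assumes a rise of about 0.34 nanometers per base pair (the usual figure for B-form DNA) and two copies of the genome per cell. Neither number appears in this essay, and the function name is only illustrative.

```python
# Rough end-to-end length of the DNA in a single cell.
# Assumptions (not from the essay): 0.34 nm of length per base pair,
# and two copies of the genome in each cell.
NM_PER_BASE_PAIR = 0.34
COPIES_PER_CELL = 2


def dna_length_m(genome_bp: float, copies: int = COPIES_PER_CELL) -> float:
    """Return the stretched-out DNA length per cell, in meters."""
    return genome_bp * copies * NM_PER_BASE_PAIR * 1e-9


human_m = dna_length_m(3.1e9)     # roughly 2 meters
fern_m = dna_length_m(160.45e9)   # a little over 100 meters
print(f"human: {human_m:.1f} m, fern: {fern_m:.0f} m")
```

Under these assumptions, the human figure lands near 2 meters and the fern figure just above 100 meters, in line with the numbers quoted above.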

So how can a genome 50 times longer—equivalent to 2,000 kilometers of thread, or roughly the distance between Cairo and Rome—do the same? The New Caledonian fork fern has clearly found a solution to this problem. While I initially suspected that the volume of a cell's nucleus would set the upper limit on genome size, I was surprised to learn that energy and diffusion are likelier bottlenecks.

Fork ferns, pictured here in New Caledonia, are small, green plants with yellow pods. Credit: Pol Fernandez

First, it’s helpful to understand how the fern’s genome was measured to begin with. It wasn’t done by sequencing because this particular genome is filled with millions of repetitive regions that make sequencing difficult. Instead, the scientists collected three fern specimens, chopped them up, and extracted nuclei from the cells. The fern nuclei were then mixed with nuclei extracted from an onion—which served as a control—and stained with propidium iodide, a fluorescent molecule that locks into the grooves of DNA strands.

Next came the fun part. The researchers measured the fluorescence of each nucleus using a device called a flow cytometer, and found that samples from the New Caledonian fork fern were about 9 times brighter than those from an onion. Since the onion genome is 16 billion bases in length, and the dye’s fluorescence is directly proportional to the amount of DNA it “locks onto,” this experiment suggested the fern’s genome must be roughly 9 times longer than an onion’s, or about 144 billion bases. From here, this value was gradually refined by staining nuclei from other plants and comparing the measurements.1
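
The proportionality argument boils down to a single ratio. Here is a minimal sketch, assuming the dye signal scales linearly with DNA content; the function name and the normalized signal values (9.0 for the fern, 1.0 for the onion) are illustrative and not taken from the study.

```python
# Estimate a genome size from fluorescence measured against a reference
# standard, assuming signal is directly proportional to DNA content.
def estimate_genome_bp(sample_signal: float,
                       reference_signal: float,
                       reference_genome_bp: float) -> float:
    """Scale the reference genome size by the fluorescence ratio."""
    return reference_genome_bp * (sample_signal / reference_signal)


ONION_GENOME_BP = 16e9  # onion genome size quoted in the essay

# Fern nuclei were about 9 times brighter than onion nuclei.
fern_estimate_bp = estimate_genome_bp(9.0, 1.0, ONION_GENOME_BP)
print(f"first-pass estimate: {fern_estimate_bp / 1e9:.0f} billion bases")
```

This gives only the first-pass figure of 144 billion bases; the published value came from further comparisons against other plant standards.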

Packing 100 meters of DNA into a cell nucleus that is orders of magnitude smaller than a grain of sand sounds, on its face, impossible. So it was with some incredulity that I sat down, pulled out a scratch piece of paper, and jotted down some estimates.

Scientists trek through Grande Terre, part of the archipelago known as New Caledonia, in search of ferns. Credit: Oriane Hidalgo

Per the book Cell Biology by the Numbers, each base pair of DNA occupies 1 nm³ of space. Multiplying this value by the fern’s genome size suggests that its DNA occupies a volume of at least 160 μm³, about 400,000 times smaller than a drop of rain.
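
The unit conversion is easy to get wrong, so here is a short sketch of it. It uses the 1 nm³ per base pair figure quoted above and counts a single copy of the genome; the function name is illustrative.

```python
# Convert a genome size into the raw volume occupied by its DNA.
NM3_PER_BASE_PAIR = 1.0   # volume per base pair, as quoted in the essay
NM3_PER_UM3 = 1e9         # one cubic micron is (1,000 nm)^3


def dna_volume_um3(genome_bp: float) -> float:
    """Return the DNA volume in cubic microns for one genome copy."""
    return genome_bp * NM3_PER_BASE_PAIR / NM3_PER_UM3


fern_dna_um3 = dna_volume_um3(160.45e9)
print(f"fern DNA volume: {fern_dna_um3:.0f} cubic microns")
```

For a genome of 160.45 billion base pairs, this works out to roughly 160 cubic microns.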

Eukaryotic cells, including neurons, myocytes and, yes, fern cells, have large nuclei in which to store DNA. HeLa cells, for example, a type of cell line commonly used to study cancer, have nuclei with a measured volume of about 370 μm³. Plant cell nuclei are often much larger. The Aztec Lily, a brilliant red flower native to Guatemala, has cell nuclei with volumes exceeding 5,000 μm³, among the largest nuclei ever found. While I haven’t seen any measurements or photographs of cells isolated from the New Caledonian fork fern that would enable me to estimate the size of its nuclei, I can make ballpark estimates based on measurements of other plants.

Prior studies have measured the diameters of nuclei in tobacco plants and potatoes, with values ranging from 10 microns to 20 microns. Assuming that the nucleus is roughly spherical, the internal volumes of such nuclei would range from about 500 to 4,000 μm³. All this to say that even a massive genome can fit comfortably into a cell nucleus (much to my surprise!).2
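
The sphere estimate can be sketched as follows. The code treats the measured dimension as a diameter, which is an assumption, and the 10 and 20 micron inputs are the diameters for which the formula returns roughly 500 and 4,000 cubic microns.

```python
import math


# Volume of a spherical nucleus from its diameter.
# Assumption: the nucleus is a perfect sphere and the measured
# dimension is its diameter, not its radius.
def sphere_volume_um3(diameter_um: float) -> float:
    """Return the volume in cubic microns of a sphere of given diameter."""
    radius_um = diameter_um / 2
    return (4 / 3) * math.pi * radius_um ** 3


small_nucleus_um3 = sphere_volume_um3(10)   # about 524 cubic microns
large_nucleus_um3 = sphere_volume_um3(20)   # about 4,189 cubic microns
print(f"{small_nucleus_um3:.0f} to {large_nucleus_um3:.0f} cubic microns")
```

Because volume grows with the cube of the diameter, doubling the diameter multiplies the available space by eight.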

Furthermore, scattered evidence suggests a positive correlation between a genome’s size and the nucleus volume. Various duckweed species, for example, have genomes that span anywhere from 160 million to 1.8 billion base pairs. And as their genomes get larger, their nuclei grow proportionally.

The real question to consider, then, is not how DNA fits into the nucleus, but rather: What sets the limit on genome sizes at all? And if the nucleus scales with a genome’s length, why can’t organisms just expand their genomes forever? The bottlenecks that seem likeliest to set an upper limit on genome sizes are energy and diffusion.

It takes a great deal of energy for cells to make, copy, and maintain a large genome. Genomes are able to pack tightly inside the cell nucleus because DNA wraps around histones, positively charged proteins that condense and hold the strands in place. About 147 base pairs of DNA wrap around 8 histone proteins to form a structural unit called a nucleosome. Tangles and knots are still inevitable, so enzymes called topoisomerases are tasked with cutting knotted DNA and stitching the pieces back together. A genome as large as the New Caledonian fork fern’s requires billions of histones to stay compact and untangled.3
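
A rough count shows where "billions of histones" comes from. This is a sketch for one copy of the genome, with 8 histones per wrapped unit. The spacing between units is an assumption, so the code tries a range of 150 to 200 base pairs rather than a single value, and the function name is illustrative.

```python
# Rough number of histone proteins needed to package one genome copy.
HISTONES_PER_UNIT = 8  # histones in each wrapped unit of DNA


def histone_count(genome_bp: float, bp_per_unit: float) -> float:
    """Return the approximate number of histones for a genome."""
    return genome_bp / bp_per_unit * HISTONES_PER_UNIT


FERN_GENOME_BP = 160.45e9

# Assumed spacing range: one unit every 150 to 200 base pairs.
most = histone_count(FERN_GENOME_BP, 150)
fewest = histone_count(FERN_GENOME_BP, 200)
print(f"{fewest / 1e9:.1f} to {most / 1e9:.1f} billion histones")
```

Anywhere in that range, a single copy of the fern genome needs several billion histones.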

And therein lies the first bottleneck. A single fern has billions of cells, each with billions of nucleotides and billions of histones that must be made by collecting and shaping atoms from the environment. Each atom is forced into position by burning energy molecules, such as ATP. Each nucleotide of DNA requires 50 ATP to build.4 Larger genomes are also more susceptible to random mutations and double-strand breaks that require fixing. Correcting a single break in DNA requires the burning of more than 10,000 ATP molecules. The energy required to maintain a genome, in other words, scales with its length and eventually reaches a biophysical limit.
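
To put rough numbers on that scaling, here is a minimal sketch using the two costs quoted above: 50 ATP per nucleotide and 10,000 ATP per repaired break. It assumes both strands must be built (two nucleotides per base pair) and counts one genome copy only, ignoring histones and everything else the cell has to make.

```python
# Rough ATP cost of synthesizing one copy of a genome.
ATP_PER_NUCLEOTIDE = 50          # quoted in the essay
ATP_PER_BREAK_REPAIR = 10_000    # lower bound quoted in the essay
NUCLEOTIDES_PER_BASE_PAIR = 2    # assumption: both strands are counted


def synthesis_cost_atp(genome_bp: float) -> float:
    """Return the ATP needed to build the nucleotides of one genome copy."""
    return genome_bp * NUCLEOTIDES_PER_BASE_PAIR * ATP_PER_NUCLEOTIDE


fern_cost_atp = synthesis_cost_atp(160.45e9)
human_cost_atp = synthesis_cost_atp(3.1e9)

# How many nucleotides could be built for the price of one repair?
nucleotides_per_repair = ATP_PER_BREAK_REPAIR / ATP_PER_NUCLEOTIDE

print(f"fern: {fern_cost_atp:.1e} ATP, human: {human_cost_atp:.1e} ATP")
print(f"one repair costs as much as {nucleotides_per_repair:.0f} nucleotides")
```

The cost is linear in genome length, so a genome about 50 times longer costs about 50 times more to copy in every one of the organism's cells.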

Electron micrographs of DNA wrapped around histones to form chromatin. Credit: B. Hamkalo, UC Irvine.

The second bottleneck is diffusion. The larger a cell’s genome, the more statistically unlikely it is that the enzymes responsible for decoding, or “reading,” the instructions stored in DNA will find the correct sequence. If an enzyme is searching for a sequence like ‘ATTTC,’ but the genome contains thousands of similar sequences, then the enzyme will waste time binding to the wrong places. And because a large genome correlates with a large nucleus, enzymes presumably must also travel farther to find their binding site. All this “wasted time” adds up and can be the difference between life and death.5
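
The search problem can be illustrated with a simple counting model. This sketch assumes a purely random genome in which each of the four bases is equally likely at every position. Real genomes, especially highly repetitive ones, depart from that, so the numbers are order-of-magnitude guides only, and the function name is illustrative.

```python
# Expected number of chance occurrences of a short motif on one strand.
# Assumption: bases are independent and equally likely (a random genome).
def expected_chance_matches(genome_bp: float, motif_length: int) -> float:
    """Return the expected count of exact matches to a fixed motif."""
    return genome_bp / 4 ** motif_length


MOTIF = "ATTTC"  # example sequence from the essay

human_matches = expected_chance_matches(3.1e9, len(MOTIF))
fern_matches = expected_chance_matches(160.45e9, len(MOTIF))

print(f"human genome: about {human_matches:.1e} chance matches")
print(f"fern genome:  about {fern_matches:.1e} chance matches")
```

The number of decoy sites grows in direct proportion to genome length, while the number of correct sites does not. Longer motifs cut the decoys by a factor of four per added base.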

Due to this cellular inefficiency, it is exceptionally rare for organisms to have large genomes. Out of all the organisms studied to date for which genome data are available, just 0.09 percent of them have a genome larger than 100 billion bases. The New Caledonian fork fern has managed to survive with its large genome, despite the energy costs, because no selection pressures are forcing it to downsize. The fern lives in a “relatively stable environment with little competition,” the authors of the recent study told Science. In other words, its large genome is neither a help nor a hindrance.

The New Caledonian fork fern has hundreds of chromosomes that have accumulated over thousands of years. Its genome swelled in size due to “whole genome multiplication” events, which happen because of errors in meiosis, the cellular process in which cells divide and cut their chromosome numbers in half to prepare for sexual reproduction. The ferns accumulated all this DNA through random mistakes and, not needing to expend energy to rid themselves of the useless bits, have kept their massive genomes intact.

Still, many other plants and ferns have much smaller genomes, and a large genome is not a prerequisite for an organism’s complexity.6 Genlisea tuberosa, native to Brazil, has the smallest genome of any flowering plant at 61 million bases—and it’s a carnivore that kills protozoa and absorbs their nutrients through its leaves!

By beating the genome size record, the New Caledonian fork fern not only helps us appreciate “endless forms most beautiful,” as Charles Darwin wrote in the closing passage of On the Origin of Species in 1859, but also helps us inch closer to answering fundamental questions about the delicate tradeoffs that shape genomes and, therefore, life itself.

***

Niko McCarty is a founder of Asimov Press.

Thanks to Tom Ellis, Howard Salis, Max Haase, David Savage and others on Twitter/X for helpful comments.

***

Footnotes

  1. A single-celled protist, Polychaos dubia, was once thought to have the largest genome. In the 1960s, biochemists clocked its genome at a whopping 670 billion bases. But those studies were criticized in 2014 when researchers claimed that the earlier methods used “a rough biochemical approach…now considered to be unreliable for accurate genome size determinations.” Earlier methods used whole cells, not just isolated nuclei, and likely picked up DNA from the mitochondria and wider environment. Amoeba proteus, once thought to have a genome size of 300 billion bases, was revealed to have a genome of just 38 billion bases (a full order of magnitude smaller) upon closer inspection.
  2. The same cannot be said for viruses. The HK97 virus, for example, infects E. coli cells and has 40,000 base pairs of DNA, all of which must fit within a viral shell that measures 60 nanometers across, about 2,000 times thinner than a standard sheet of paper. The HK97 genome occupies about 50 percent of the available space inside the capsid. Even though the capsid is only “half filled,” it has an internal pressure five times greater than the pressure within a car tire, based on molecular simulations performed on a supercomputer. This pressure is partly caused by intense electrostatic forces; DNA is negatively charged and therefore repels itself more and more as molecules pack into a tight space.
  3. Histones do not take up much volume, so it’s not necessary to revise my earlier estimates about one base pair of DNA occupying 1 cubic nanometer of space. In a yeast cell, DNA and chromatin together occupy just 1-2 percent of the total nuclear volume, according to Max Haase. I’m assuming that a similar logic holds true in plants.
  4. For a detailed analysis of the energetic costs involved in building a gene, see Lynch & Marinov PNAS (2015).
  5. Tom Ellis, a professor at Imperial College London and advisor to Asimov Press, says this another way: If a genome gets too large, and therefore its nucleus grows accordingly, then at some point the nucleus will become “too big to function.”
  6. An oak tree’s genome, stretched end-to-end, would measure just 66 centimeters!