Shedding light on the human genome

They said it was their family curse: a rare congenital deformity called syndactyly, in which the thumb and index finger are fused together on one or both hands. Ten members of the extended clan were affected, and with each new birth, they told Dr Stefan Mundlos of the Max Planck Institute for Molecular Genetics, Germany, the first question was always: “How are the baby’s hands? Are they normal?” The family, under promise of anonymity, is taking part in a study by Stefan and his colleagues of the origin and development of limb malformations. And while the researchers cannot yet offer a way to prevent syndactyly, or to entirely correct it through surgery, Stefan has sought to replace the notion of a family curse with “a rational answer for their condition,” he said.

The scientists have traced the family’s limb anomaly to a novel class of genetic defects unlike any seen before, a finding with profound implications for understanding a raft of heretofore mysterious diseases. The mutations affect a newly discovered design feature of the DNA molecule called topologically associating domains, or TADs. It turns out that the vast informational expanse of the genome is divvied up into a series of manageable, parochial and law-abiding neighbourhoods with strict nucleic partitions between them — each one a TAD.

Folding protocol
By studying TADs, researchers hope to better fathom the deep structure of the human genome, in real time and three dimensions, and to determine how a quivering, mucilaginous string of some three billion chemical subunits, which would measure more than six feet long if stretched out, can nonetheless be coiled and compressed down to four ten-thousandths of an inch, the width of a cell nucleus.

“DNA is a superlong molecule packed into a very small space, and it’s clear that it’s not packed randomly,” Stefan said. “It follows a very intricate and controlled packing mechanism, and TADs are a major part of the folding protocol.” For much of the past 50 years, genetic research has focused on DNA as a kind of computer code, a sequence of genetic “letters” that inscribe instructions for piecing together amino acids into proteins, which in turn do the work of keeping us alive.

Most of the genetic diseases deciphered to date have been linked to mishaps in one or another protein recipe. Scanning the DNA of patients with Duchenne muscular dystrophy, for example, scientists have identified telltale glitches in the gene that encodes dystrophin, a protein critical to muscle stability. At the root of Huntington’s disease, similarly, are short, repeated stretches of DNA that corrupt the code for huntingtin, a brain protein. The mutant product that results soon shatters into neurotoxic shards.

Yet, researchers soon realised there was much more to the genome than the protein codes it enfolded. “We were caught up in the idea of genetic information being linear and one-dimensional,” said Job Dekker, a biologist at the University of Massachusetts Medical School, USA. For one thing, as the sequencing of the complete human genome revealed, the portions devoted to specifying the components of hemoglobin, collagen, pepsin and other proteins account for just a tiny fraction of the whole, maybe three per cent of human DNA’s three billion chemical bases. And there was the restless physicality of the genome, the way it arranged itself during cell division into 23 spindly pairs of chromosomes that could be stained and studied under a microscope, and then somehow, when cell replication was through, merged back together into a baffling, ever-wriggling ball of chromatin — DNA wrapped in a protective packaging of histone proteins.

Through chromosome conformation studies and related research, scientists have discovered the genome is organised into about 2,000 such jurisdictions. As with city neighbourhoods, TADs come in a range of sizes, from tiny walkable zones a few dozen DNA subunits long to sprawling districts of tens of thousands of bases, where you’re better off taking the subway. TAD borders serve as folding instructions for DNA.

Different domains
TAD boundaries also dictate the rules of genetic engagement. Scientists have long known that protein codes are controlled by an assortment of genetic switches and enhancers — noncoding sequences designed to flick protein production on, pump it into high gear and muzzle it back down again. The new research indicates that switches and enhancers act only on those genes, those protein codes, stationed within their own precincts. “Genes and regulatory elements are like people,” Job said. “They care about and communicate with those in their own domain, and they ignore everything else.”

What exactly do these boundaries consist of? Scientists are not entirely sure, but preliminary results indicate that the boundaries are DNA sequences that attract the attention of sticky, roughly circular proteins called cohesin and CTCF, which adhere thickly to the boundary sequences like insulating tape. Between those boundary points, those clusters of insulating proteins, the chromatin strand can loop up and over like the ribbon in a birthday bow, allowing genetic elements distributed along the ribbon to touch and interact with one another. But the insulating proteins constrain the movement of each chromatin ribbon, said Richard A Young of the Whitehead Institute for Biomedical Research, USA, and keep it from getting entangled with neighbouring loops — and the genes and regulatory elements located thereon.

The best evidence for the importance of TADs is to see what happens when they break down. Researchers have lately linked a number of disorders to a loss of boundaries between genomic domains, including cancers of the colon, esophagus, brain and blood. In such cases, scientists have failed to find mutations in any of the protein-coding sequences commonly associated with the malignancies, but instead identified DNA damage that appeared to shuffle around or eliminate TAD boundaries. As a result, enhancers from neighbouring estates suddenly had access to genes they were not meant to activate.

Reporting in the journal Science, Richard and his colleagues described a case of leukemia in which a binding site for insulator proteins had been altered not far from a gene called TAL1, which if improperly activated is known to cause leukemia. Now that researchers know what to look for, he said, TAD disruptions may prove to be a common cause of cancer. The same may be true of developmental disorders — like syndactyly.