A study by an international research consortium promises to reshape our understanding of how the human genome functions. The findings challenge the traditional view of our genetic blueprint as a tidy collection of independent genes, pointing instead to a complex network in which genes, along with regulatory elements and other types of DNA sequences that do not code for proteins, interact in overlapping ways not yet fully understood.
The ENCyclopedia Of DNA Elements (ENCODE) consortium, which is organised by the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health (NIH), has reported the results of its exhaustive, four-year effort to build a parts list of all biologically functional elements in 1 per cent of the human genome. Carried out by 35 groups from 80 organisations around the world, the research served as a pilot to test the feasibility of a full-scale initiative to produce a comprehensive catalogue of all components of the human genome crucial for biological function.
The completion of the Human Genome Project in April 2003 was a major achievement, but the sequencing of the genome marked just the first step toward the goal of using such information to diagnose, treat and prevent disease.
In recent years, researchers have made major strides in using DNA sequence data to identify genes, which are traditionally defined as the parts of the genome that code for proteins. The protein-coding component of these genes makes up just a small fraction of the human genome – 1.5 per cent to 2 per cent. Evidence exists that other parts of the genome also have important functions.
However, until now, most studies have concentrated on functional elements associated with specific genes and have not provided insights about functional elements throughout the genome. The ENCODE project represents the first systematic effort to determine where all types of functional elements are located and how they are organised.
In the pilot phase, ENCODE researchers devised and tested high-throughput approaches for identifying functional elements in the genome. Those elements included genes that code for proteins; genes that do not code for proteins; regulatory elements that control the transcription of genes; and elements that maintain the structure of chromosomes and mediate the dynamics of their replication.
The collaborative study focused on 44 targets, which together cover about 1 per cent of the human genome sequence, or about 30 million DNA base pairs. The targets were strategically selected to provide a representative cross section of the entire human genome.
“Our results reveal important principles about the organisation of functional elements in the human genome, providing new perspectives on everything from DNA transcription to mammalian evolution. In particular, we gained significant insight into DNA sequences that do not encode proteins, which we knew very little about before,” said Ewan Birney, Ph.D., head of genome annotation at the European Molecular Biology Laboratory’s European Bioinformatics Institute (EBI) in Hinxton, England.
The ENCODE consortium’s major findings include the discovery that the majority of DNA in the human genome is transcribed into functional molecules, called RNA, and that these transcripts extensively overlap one another. This broad pattern of transcription challenges the long-standing view that the human genome consists of a relatively small set of discrete genes, along with a vast amount of so-called junk DNA that is not biologically active.
The new data indicate that the genome contains very few unused sequences and is, in fact, a complex, interwoven network. In this network, genes are just one of many types of DNA sequences that have a functional impact. “Our perspective of transcription and genes may have to evolve,” the researchers state, noting that the network model of the genome “poses some interesting mechanistic questions” that have yet to be answered.
Other surprises in the ENCODE data have major implications for our understanding of the evolution of genomes, particularly mammalian genomes. Until recently, researchers had thought that most of the DNA sequences important for biological function would be in areas of the genome most subject to evolutionary constraint – that is, most likely to be conserved as species evolve. However, the ENCODE effort found that about half of the functional elements in the human genome do not appear to have been obviously constrained during evolution, at least when examined by current methods used by computational biologists.
According to ENCODE researchers, this lack of evolutionary constraint may indicate that many species’ genomes contain a pool of functional elements, including RNA transcripts, that provide no specific benefits in terms of survival or reproduction. As this pool turns over during evolutionary time, researchers speculate it may serve as a “warehouse for natural selection” by acting as a source of functional elements unique to each species and of elements that perform similar functions among species despite having sequences that appear dissimilar.
“This impressive effort has uncovered many exciting surprises and blazed the way for future efforts to explore the functional landscape of the entire human genome,” said NHGRI Director Francis S. Collins, M.D., Ph.D. “Because of the hard work and keen insights of the ENCODE consortium, the scientific community will need to rethink some long-held views about what genes are and what they do, as well as how the genome’s functional elements have evolved. This could have significant implications for efforts to identify the DNA sequences involved in many human diseases.”