Barcelona, September 5, 2012
FAST FORWARD FOR BIOMEDICAL RESEARCH:
Scientists describe the map for the human genome
- An international team of researchers reveal that much of what has been called “junk DNA” in the human genome is actually useful and important.
- More than 30 papers published at the same time in the renowned journals Nature, Genome Research and Genome Biology.
- 20 researchers from the Centre for Genomic Regulation (CRG) have participated in this project and played a significant role in it.
Today, an international team of researchers reveal that much of what has been called ‘junk DNA’ in the human genome is actually a massive control panel with millions of switches regulating the activity of our genes. Without these switches, genes would not work – and mutations in these regions might lead to human disease. Discovered by hundreds of scientists working on the ENCODE Project, the new information is so comprehensive and complex that it has given rise to a new publishing model in which electronic documents and datasets are interconnected.
Just as the Human Genome Project revolutionised biomedical research, ENCODE will drive new understanding and open new avenues for biomedical science. Led by the National Human Genome Research Institute (NHGRI) in the US and the EMBL-European Bioinformatics Institute (EMBL-EBI) in the UK, ENCODE now presents a detailed map of genome function that identifies 4 million gene ‘switches’. This essential reference will help researchers pinpoint very specific areas of research for human disease. The findings are published in 30 connected, open-access papers appearing in three science journals: Nature, Genome Biology and Genome Research.
“Our genome is simply alive with switches: millions of places that determine whether a gene is switched on or off,” says Ewan Birney of EMBL-EBI, lead analysis coordinator for ENCODE. “The Human Genome Project showed that only 2% of the genome contains genes, the instructions to make proteins. With ENCODE, we can see that around 80% of the genome is actively doing something. We found that a much bigger part of the genome – a surprising amount, in fact – is involved in controlling when and where proteins are produced, than in simply manufacturing the building blocks.”
These findings give us the knowledge we need to look beyond the linear structure of the genome to how the whole network is connected. It is important to know not just where certain genes are located, but which sequences control them. Because of the complex, three-dimensional shape of our genome, those controls are sometimes far from the gene they regulate and loop around to make contact. Were it not for ENCODE, we might never have looked in those regions. This is a major step toward understanding the wiring diagram of a human being. ENCODE helps us look deeply into the regulatory circuit that tells us how all of the parts come together to make a complex being.
ENCODE combined the efforts of 442 scientists in 32 labs in the UK, US, Spain, Singapore, Japan and Switzerland. They generated and analysed over 15 terabytes (15 trillion bytes) of raw data – all of which is now publicly available. The study used around 300 years’ worth of computer time studying 147 tissue types to determine what turns specific genes on and off, and how that ‘switch’ differs between cell types.
The articles published today “represent a new way to enable researchers to navigate through the data,” said Magdalena Skipper, senior editor at Nature, which produced the freely available publishing platform on the Internet. All of the published ENCODE content, in all three journals, is connected digitally through topical ‘threads’, so that readers can follow their area of interest between papers and all the way down to the original data.
The CRG contribution
Twenty of the 442 ENCODE scientists are from the Centre for Genomic Regulation (CRG) in Barcelona (although some of them are now at the CNAG or other institutes).
Roderic Guigó, coordinator of the CRG Bioinformatics and Genomics Programme and Professor at the Universitat Pompeu Fabra, has led the RNA analysis group within the ENCODE project. CRG scientists participate in two of the manuscripts published in Nature (and are the leading authors of one of them), in four of those published in Genome Research (and are the leading authors of three of them), and in two of those published in Genome Biology. Researchers associated with the CRG have also designed the cover of the special issue of Genome Research, in the style of the Catalan artist Joan Miró. Within Spain, two more scientists, from the CNIO, have also participated in the investigation. The work has also been supported by the Spanish Instituto Nacional de Bioinformática.
The CRG researchers have participated in the analysis of the transcriptional activity of the genome, as part of the subprojects led by Tom Gingeras from Cold Spring Harbor Laboratory (US) and by Tim Hubbard from the Wellcome Trust Sanger Institute (UK). The unfolding of the instructions encoded in the genome is triggered by the transcription of DNA to RNA. Before ENCODE, it was assumed that most of the transcriptional activity of the genome was directed towards the synthesis of messenger RNA molecules, which are subsequently translated into proteins. However, during the last decade new technologies have been developed that make it possible to monitor the transcriptional activity of the genome with unprecedented resolution. Using these technologies, ENCODE researchers have uncovered a wealth of transcriptional activity in the human genome that is not directed towards the production of proteins. “RNA molecules are very abundant and much more diverse in sequence, structure and function than previously thought. RNA biology will become increasingly central, both to basic research and to technical applications in biology, and in medicine in particular”, says Roderic Guigó.
Participating in the ENCODE project has been both challenging and rewarding for the CRG researchers. “The ENCODE project has set new standards for scientific cooperation”, says Roderic Guigó. “We have been working very closely with scientists all over the globe. In addition to us, scientists from California, the East Coast, the UK, Switzerland, Singapore and Japan participate in the weekly ENCODE RNA teleconferences. It was 6 AM for the California scientists, but midnight for the Japanese.” Sarah Djebali, who has taken on the logistics of the coordination at the CRG, agrees: “The logistics are challenging, but the discussions among scientists all over the world, the planning of the experiments, the analysis of the results, and above all the openness and the willingness to share, are all very rewarding”. One of the largest challenges at the CRG has been to cope with the large amount of data generated by the project. “The CRG has served as a hub for the RNA data, and this has often challenged the capacity of the CRG’s informatics infrastructure”, says Julien Lagarde, who has been responsible for the informatics of the project at the CRG.
The ENCODE project represents only the first step in the long and complex task of deciphering the meaning of the genome sequence. “This is really the challenge of 21st-century biology. As researchers we feel privileged to contribute to this project”, states Roderic Guigó. He adds, “our participation in this project owes in part to insightful policies for promoting scientific research. Indeed, the possibility for scientists in our country to participate in such scientifically relevant international projects depends critically on strong and decisive support for scientific research”.
For further information:
Laia Cendrós, Press Office, Centre for Genomic Regulation (CRG).
Tel. +34 93 316 02 37 - +34 607611798.