Enhancer and Promoter Atlases
Consortium annotates the human genome with cell type-specific information about transcription start sites, active enhancers, and their expression throughout the body.
Researchers at Japan’s RIKEN institute, in collaboration with scientists worldwide, have produced two atlases of genetic regulatory elements throughout the human genome, as reported in a pair of papers published today (March 26) in Nature. The first paper presents an atlas of transcription start sites, where RNA polymerase begins to transcribe DNA into RNA; the second maps active enhancers, non-promoter stretches of DNA that upregulate the transcription of certain genes. Sixteen additional papers related to this work, which reports results from the fifth edition of the Functional Annotation of the Mammalian Genome (FANTOM) project, are also being published today in other journals, including Blood and BMC Genomics.
“Both papers are very significant,” said biochemist Wei Wang from the University of California, San Diego, who was not involved in the work. “This will be a very valuable resource for the community.”
“We made an encyclopedia of the definition of the normal cell: 185,000 promoters, 44,000 enhancers, and the majority of them are tissue-specific,” said the RIKEN Omics Science Center’s Yoshihide Hayashizaki, who led the promoter annotation project.
“This is a very broad survey of transcriptional activity in diverse cell types, [making it] a very valuable resource, and currently, quite unique,” said Zhiping Weng from the University of Massachusetts Medical School, who was not involved in the work. Weng noted that the only comparable resource is the Genotype-Tissue Expression Program (GTEx), which, compared with FANTOM, is “not nearly as comprehensive at this point,” she said. “Right now, this is the most comprehensive, extensive collection of transcription data available, especially in primary cell types. I find that to be very significant. I think a lot of people are going to find the data to be highly useful.”
FANTOM is one of several projects that aim to annotate the human genome and to determine how the expression of its genes can produce a variety of cell types. Members of the Encyclopedia of DNA Elements (ENCODE) project, in which some RIKEN researchers took part, used chromatin immunoprecipitation analyses and mapped DNase hypersensitive sites, among other things, to determine where transcription factors bind DNA and where chromatin is “open,” and therefore vulnerable to cleavage by DNase. The ENCODE team used many cell lines and examined only a few cell types, whereas the FANTOM group studied myriad primary cell and tissue types, as well as cell lines.
“I see FANTOM and ENCODE being very complementary, because FANTOM mainly generates transcription data, and ENCODE generates a much wider diversity, much more types of data. But FANTOM has a huge representation in the cell type dimension, while ENCODE is primarily focused on cell lines and only a few types of primary cells and tissue types,” said Weng, who was part of ENCODE. “You can imagine two very big projects—they are very extensive in different dimensions.”
To create these atlases, the FANTOM researchers used cap analysis of gene expression (CAGE) to sequence the beginnings of RNA transcripts. By mapping CAGE tags onto the human genome, the RIKEN-led team identified the promoter regions upstream of the transcription start sites. The researchers used CAGE to identify promoters in human primary cells, as well as in tissue samples and immortalized cell lines. They found that many genes have multiple transcription start sites and that transcription begins at different locations in different cell types.
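As a rough illustration of how mapped CAGE tags can point to transcription start sites, the sketch below groups the 5' end positions of tags on one strand into clusters. It is a toy example, not the FANTOM pipeline: the function name `cluster_tag_starts`, the 20-base-pair `max_gap` threshold, and the sample positions are all assumptions made for illustration.

```python
from collections import Counter


def cluster_tag_starts(starts, max_gap=20):
    """Group 5' end positions (one strand) into clusters.

    Positions no more than `max_gap` bp from the previous position join
    the same cluster. `max_gap` is an illustrative value, not a published one.
    """
    clusters = []
    for pos in sorted(starts):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])

    summary = []
    for members in clusters:
        counts = Counter(members)
        # Peak = position with the most tags; ties go to the smaller coordinate.
        peak = max(counts, key=lambda p: (counts[p], -p))
        summary.append(
            {"start": members[0], "end": members[-1], "tags": len(members), "peak": peak}
        )
    return summary


# Hypothetical 5' end coordinates of tags mapped near a single gene.
tag_starts = [100, 101, 101, 103, 250, 251, 251, 251, 900]
clusters = cluster_tag_starts(tag_starts)
for c in clusters:
    print(c)
```

In this made-up input, the tags fall into three separate groups, which mirrors the finding that a single gene can have more than one transcription start site.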
Using CAGE, the team also identified the RNA sequences transcribed from enhancers. Other groups had previously shown that some enhancers are transcribed bidirectionally: from the center outward in both directions, and from both DNA strands. The FANTOM team found evidence to suggest that bidirectional transcripts are signatures of active enhancers. About 75 percent of the enhancers detected by CAGE drove expression of a reporter gene in HeLa cells, a higher success rate than that of the untranscribed enhancers previously identified through ENCODE.
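One simple way to express the idea of bidirectional transcription in numbers is a balance score between tags on the two strands around a candidate site. The sketch below is a hedged illustration only: the function names, the 0.8 cutoff, and the minimum tag count are assumptions chosen for the example and should not be read as the consortium's exact criteria.

```python
def directionality(forward_tags, reverse_tags):
    """Return (F - R) / (F + R), or None when there are no tags.

    A score near 0 means balanced transcription from both strands;
    a score near +1 or -1 means transcription mostly in one direction.
    """
    total = forward_tags + reverse_tags
    if total == 0:
        return None
    return (forward_tags - reverse_tags) / total


def looks_bidirectional(forward_tags, reverse_tags, max_abs_score=0.8, min_tags=2):
    """Flag a site as bidirectional under illustrative thresholds (assumptions)."""
    score = directionality(forward_tags, reverse_tags)
    if score is None or forward_tags + reverse_tags < min_tags:
        return False
    return abs(score) <= max_abs_score


# Hypothetical tag counts on the forward and reverse strands at three sites.
print(looks_bidirectional(6, 4))   # balanced signal
print(looks_bidirectional(20, 0))  # one-directional, more promoter-like
print(looks_bidirectional(0, 0))   # no signal at all
```

The `min_tags` guard reflects the concern raised by Wang later in the article: because enhancer RNAs are scarce, sites with very few tags cannot be classified reliably either way.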
Wang said he was most interested in the enhancer atlas. “This was the very first time people have done this enhancer RNA [analysis] on such a large scale,” he said.
“Over so many cell types, this number [of active enhancers] is kind of at the low end,” Wang noted, “particularly if you compare with ENCODE and other annotations. . . . I think the reason is related to the low abundance of eRNAs [enhancer RNAs].”
Other groups had also found that enhancers produce RNA in low amounts. “My only concern is they probably missed a lot of active enhancers,” said Wang. “My understanding is they have both [a] high true-positive rate and also false-negative rate. So whatever they identified, I believe those are real, active enhancers, but they may also miss many active enhancers because [of] this low abundance of enhancer RNA.”
Hayashizaki said that knowledge of the enhancer and promoter usage that defines different cell types raises the possibility of one day turning one cell type into another. It could also aid in predicting whether a particular cancer is going to metastasize, he added.
The FANTOM Consortium and the RIKEN PMI [Preventive Medicine and Diagnosis Innovation Program] and CLST [Center for Life Science Technologies] (DGT [Division of Genomic Technologies]), “A promoter-level mammalian expression atlas,” Nature, doi:10.1038/nature13182, 2014.
R. Andersson, et al., “An atlas of active enhancers across human cell types and tissues,” Nature, doi:10.1038/nature12787, 2014.