Cell Type-Specific Chromatin Signatures Underline Regulatory DNA Elements in Human Induced Pluripotent Stem Cells and Somatic Cells
Rationale: Regulatory DNA elements in the human genome play important roles in determining the transcriptional abundance and spatiotemporal gene expression during embryonic heart development and somatic cell reprogramming. It is not well known how chromatin marks in regulatory DNA elements are modulated to establish cell type–specific gene expression in the human heart.
Objective: We aimed to decipher the cell type–specific epigenetic signatures in regulatory DNA elements and how they modulate heart-specific gene expression.
Methods and Results: We profiled genome-wide transcriptional activity and a variety of epigenetic marks in the regulatory DNA elements using massive RNA-seq (n=12) and ChIP-seq (chromatin immunoprecipitation combined with high-throughput sequencing; n=84) in human endothelial cells (CD31+CD144+), cardiac progenitor cells (Sca-1+), fibroblasts (DDR2+), and their respective induced pluripotent stem cells. We uncovered 2 classes of regulatory DNA elements: class I was identified with ubiquitous enhancer (H3K4me1) and promoter (H3K4me3) marks in all cell types, whereas class II was enriched with H3K4me1 and H3K4me3 in a cell type–specific manner. Both class I and class II regulatory elements exhibited stimulatory roles in nearby gene expression in a given cell type. However, class I promoters displayed more dominant regulatory effects on transcriptional abundance regardless of distal enhancers. Transcription factor network analysis indicated that human induced pluripotent stem cells and somatic cells from the heart selected their preferential regulatory elements to maintain cell type–specific gene expression. In addition, we validated the function of these enhancer elements in transgenic mouse embryos and human cells and identified a few enhancers that could possibly regulate the cardiac-specific gene expression.
Conclusions: Given that a large number of genetic variants associated with human diseases are located in regulatory DNA elements, our study provides valuable resources for deciphering the epigenetic modulation of regulatory DNA elements that fine-tune spatiotemporal gene expression in human cardiac development and diseases.
Human pluripotent stem cells (PSCs) share the dual hallmarks of self-renewal and ability to generate all cell types in the body, and thereby they hold great promise in disease modeling, drug development, and regenerative medicine.1–3 Human induced pluripotent stem cells (iPSCs) are directly derived from somatic cells by transient overexpression of 4 transcription factors (OCT4/SOX2/C-MYC/KLF4 [POU domain, class 5, transcription factor 1/transcription factor SOX-2/Myc proto-oncogene protein/Kruppel-like factor 4]),4 and as such, they are free of ethical issues associated with the use of human oocytes for nuclear reprogramming and therapeutic cloning.5,6 For genetically inherited cardiovascular diseases, patient-specific iPSCs have emerged as powerful tools to generate human cardiac cells for modeling disease progression and novel drug discovery. Although disease-causing mutations are frequently seen in protein-coding regions across the genome, noncoding sequences, including regulatory DNA elements, have been demonstrated to affect the susceptibility to many cardiovascular diseases.7 However, evaluating regulatory DNA elements in cardiovascular pathogenesis has been difficult because of the lack of a catalog of cardiac cell type–specific regulatory DNA elements, particularly for enhancers and promoters.
In mammals, cell type identity is defined and maintained by specific gene expression programs. Cell type–specific gene expression is primarily driven by the proximal and distal regulatory DNA elements, including promoters and enhancers. Promoters are short DNA sequences proximal to the transcription start sites (TSSs) bound by general transcriptional machinery. Enhancers are usually distal to TSSs and contain short DNA motifs that can be recognized by lineage-determining transcription factors.8 A large number of putative enhancers have been identified and comprise at least 12% of the human genome.9 Cell type–specific enhancers are usually associated with histone modification and higher-order chromatin structure, which can be used to predict putative enhancers in a given cell type.10 Genome-wide ChIP-seq (chromatin immunoprecipitation combined with high-throughput sequencing) studies reveal that tissue-specific enhancers are enriched with several chromatin marks.11,12 In human embryonic stem cells (ESCs), active enhancers are mostly associated with p300, H3K4me1, and H3K27ac, whereas poised enhancers are enriched in H3K27me3 with the depletion of H3K27ac.13,14
Somatic cell reprogramming is accompanied by resetting of cell type–specific transcriptional programs from the differentiated cell state to the pluripotent state. Cellular differentiation of patient-specific iPSCs to cardiac lineages is associated with extensive epigenetic reprogramming, which includes massive reorganization of DNA and histone modifications at regulatory DNA elements.15,16 Recent studies have illustrated the dynamic and coordinated epigenetic modulation of regulatory DNA elements during cardiac lineage differentiation.17–19 Although thousands of tissue-specific enhancers have been identified in the heart using ChIP-seq prediction,20 they are not cell type–specific. Conversely, it is still unknown how genome-wide reorganization of chromatin modifications in regulatory DNA elements is established when somatic cells from the heart are reprogrammed into pluripotent cells, which may also inform cardiac lineage dedifferentiation. Here, we performed massive RNA-seq (n=12) and ChIP-seq (n=84) to profile genome-wide transcriptional activity, as well as promoter and enhancer chromatin marks (H3K4me1, H3K4me3, H3K27ac, and H3K27me3) in human iPSCs and their parental cells (fibroblasts, endothelial cells [ECs], and cardiac progenitor cells [CPCs]) from the same individuals. We identified 2 classes of cell type–specific regulatory DNA elements in human iPSCs and somatic cells and functionally validated the putative enhancer elements in transgenic mouse embryos and human cells.
A detailed description of the experimental procedures is provided in the Online Data Supplement.
Resetting Cell Type–Specific Transcriptional Program by Somatic Cell Reprogramming
To remove potential effects of genetic composition, we generated multiple human iPSC lines from isogenic somatic cells derived from the same fetal heart: fibroblasts, ECs, and Sca-1+ CPCs (Figure 1A).21 The resulting iPSCs were denoted as FiPSCs (fibroblast-derived iPSCs), EiPSCs (endothelial cell-derived iPSCs), and CiPSCs (cardiac progenitor cell-derived iPSCs), respectively. These iPSCs highly expressed OCT4 and NANOG (homeobox protein NANOG; Online Figure IA), with the majority of cells in the colonies being OCT4+NANOG+ (Online Figure IB through ID). We confirmed the identity of somatic cells by flow cytometry using cell surface markers: CD31/CD144 for ECs, Sca-1 for CPCs, and DDR2 for fibroblasts (Online Figure IE through II). Next, we performed high-throughput RNA-seq to profile the transcriptional changes in these somatic cells and their respective iPSCs. The reprogramming process reshaped the transcriptomes of somatic cells to the pluripotent state, regardless of the parental transcriptional signatures. The transcriptional differences between somatic cells and iPSCs were apparent, with 6151 differentially expressed genes identified (Figure 1B). We further divided these differentially expressed genes into 5 cell type–specific clusters (clusters A through E): 87% (5353 genes, clusters A and B) of differentially expressed genes distinguished iPSCs from somatic cells, and the remainder comprised 279 EC-specific genes (cluster C), 205 fibroblast signature genes (cluster D), and 314 CPC/fibroblast-specific genes (cluster E). We also checked the cell type–specific signature gene expression, discovering that POU5F1 (cluster A) was uniquely expressed in human iPSCs (Figure 1D), CDH5 (cluster B) in somatic cells, VWF (cluster C) in ECs, S100A4 (cluster D) in fibroblasts, and GDF6 (cluster E) in fibroblasts and CPCs (Online Figure IIA through IID).
Gene ontology analysis showed that these differentially expressed genes were mostly associated with blood vessel morphogenesis, cardiovascular development, and focal adhesion, highlighting the fundamental transcriptional differences between iPSCs and somatic cells (Figure 1E).
In general, gene expression variation is far greater among different tissues (and derived primary cells) than within the same tissue across different genetic makeups.22 Within iPSCs, we found that the transcriptional variance was mostly attributable to genetic makeup. The principal component analysis plot of global gene expression showed that iPSCs were clearly separated by individual genetic background (Figure 1C). The inter-iPSC transcriptional variation was much smaller than the variation between iPSCs and somatic cells (Online Figure IIE). These results were consistent with previous studies and reiterated the influence of genetic composition on the gene expression of human iPSCs.23 Collectively, these results indicate that cell type–specific transcriptomes of somatic cells from the heart are reshaped to the unique gene expression pattern in iPSCs, the transcriptional variation of which is mostly driven by genetic makeup rather than by the cell type of origin.
Identification of 2 Classes of Cell Type–Specific Enhancers in iPSCs and Somatic Cells
To identify prospective enhancers, we next performed ChIP-seq experiments (n=84) using antibodies against several histone marks (H3K4me1, H3K4me3, H3K27ac, and H3K27me3), a cofactor (p300), and a component of the transcriptional machinery (RNA polymerase II [Pol II]). Overall, these chromatin marks and cofactors showed a genome-wide cell type–specific distribution, and iPSCs were clearly separated from their parental somatic cells in the t-SNE plot (Online Figure III). H3K27ac and H3K4me1 have been widely used to identify active (H3K4me1+/H3K27ac+) and poised (H3K4me1+/H3K27ac−) enhancers.13,24 Because we had a variety of conditions (6 cell types) with multiple sets of chromatin marks, we first used H3K27ac to predict all potential enhancers outside of ±3 kb regions of annotated TSSs. In total, we identified 46,261 potential enhancer elements using significantly enriched H3K27ac peaks in at least 1 of our 12 samples. We further divided these putative enhancers into 2 categories based on the pattern of H3K4me1 enrichment.25 Class I enhancers were enriched with H3K4me1 in all cell types, whereas class II enhancers exhibited cell type–specific H3K4me1 enrichment. Class I enhancers (2700) accounted for 5.8% of the total, whereas class II enhancers (43,561) made up the remaining 94.2% (Online Table I). These putative enhancers were active (H3K4me1+/H3K27ac+) in at least 1 cell type and were poised or silenced in other cell types.
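The classification rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Peak` record, the helper names `is_distal` and `classify_enhancer`, the cell type labels, and the toy coordinates are all hypothetical. Only the ±3 kb TSS exclusion window, the class I/class II rule, and the reported counts come from the text. Treating every non-ubiquitous H3K4me1 pattern as class II is also an assumption.

```python
from dataclasses import dataclass

TSS_WINDOW = 3_000  # +/-3 kb exclusion window around annotated TSSs (from the text)

# The six cell types profiled in the study; the string labels are hypothetical.
CELL_TYPES = frozenset({"fibroblast", "EC", "CPC", "FiPSC", "EiPSC", "CiPSC"})


@dataclass(frozen=True)
class Peak:
    """A hypothetical H3K27ac peak record with per-cell-type H3K4me1 calls."""
    chrom: str
    start: int
    end: int
    h3k4me1_cell_types: frozenset


def is_distal(peak: Peak, tss_by_chrom: dict) -> bool:
    """Return True if the peak lies outside +/-3 kb of every TSS on its chromosome."""
    for tss in tss_by_chrom.get(peak.chrom, ()):
        if peak.start <= tss + TSS_WINDOW and peak.end >= tss - TSS_WINDOW:
            return False
    return True


def classify_enhancer(peak: Peak) -> str:
    """Class I: H3K4me1 in all cell types; class II: any other pattern (assumption)."""
    return "class I" if peak.h3k4me1_cell_types >= CELL_TYPES else "class II"


# Toy example with made-up coordinates.
tss = {"chr1": [10_000]}
ubiquitous = Peak("chr1", 50_000, 51_000, CELL_TYPES)
ec_only = Peak("chr1", 80_000, 81_000, frozenset({"EC"}))
proximal = Peak("chr1", 12_000, 12_500, CELL_TYPES)  # inside the TSS window

distal_peaks = [p for p in (ubiquitous, ec_only, proximal) if is_distal(p, tss)]
labels = [classify_enhancer(p) for p in distal_peaks]

# Consistency check on the counts reported in the text.
class_i_fraction = 2700 / (2700 + 43561)
```

In this toy input the proximal peak is discarded before classification, and the reported class sizes reproduce the stated 5.8% share for class I.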
Ubiquitous H3K4me1 Enhancers Are Mostly Active in Human iPSCs
Class I enhancers showed cell type–specific distribution of H3K27ac and ubiquitous enrichment of H3K4me1 in both somatic cells and iPSCs (Figure 2A and 2B). Because H3K27ac is enriched in active enhancers, most class I enhancers displayed high activity in iPSCs but were selectively active in some somatic cell types (Figure 2A). We also examined the p300 and Pol II distribution on the same genomic loci enriched by H3K27ac. H3K27ac enrichment was positively correlated with the binding of cofactor p300 and the component of transcriptional machinery Pol II (Figure 2C and 2D), indicating synergized chromatin modifications for active gene transcription. Furthermore, we observed positively correlated H3K4me3 and negatively correlated H3K27me3 enrichment across these genomic regions shared with H3K27ac (Online Figure IVA and IVB). We performed correlation analysis and found that H3K27ac was positively correlated with H3K4me1, p300, and Pol II, but was negatively correlated with H3K27me3 in class I enhancers (Online Figure V). There were 2 conditions for class I enhancers: active enhancers with both H3K27ac and H3K4me1 enrichment (H3K27ac+/H3K4me1+) in 1 cell type versus poised enhancers with only H3K4me1 enrichment (H3K27ac−/H3K4me1+) in all cell types (Figure 2E). Class I enhancers were mostly located in the gene bodies (>60%) and intergenic regions (Online Figure IVD). In contrast, a substantial number of class II enhancers were located in the gene deserts (Online Figure IVE).
Class I enhancers were further grouped into 7 clusters (1–7, from top to bottom, Figure 2A) depending on the dynamic profiles of H3K27ac enrichment in somatic cells and iPSCs. Enhancers in clusters 1 and 2 were mostly active in human iPSCs (80% of class I enhancers; Online Table I), as shown by the enrichment of active enhancer mark H3K27ac, cofactor p300, and RNA Pol II. Class I enhancers were typically enriched with a high density of H3K4me1 and cell type–specific distribution of H3K27ac across a large genomic region (Figure 2F). These enhancers could be active in 1 cell type but poised in another, with most of them being active in human iPSCs but poised in somatic cells (Figure 2A). We then interrogated the nearby gene expression profiles of these clusters. We found that the average transcription levels of these genes were affected by class I enhancers (Figure 2G; Online Figure IVC). The gene expression level was well correlated with the enrichment of H3K27ac; higher levels of H3K27ac enrichment corresponded to higher levels of cell type–specific gene expression. We then looked into the significantly enriched gene ontology terms of the nearby genes in these clusters. We found that cluster 1 genes were associated with signal transduction, cell communication, and endocytosis, whereas cluster 4 genes were related to blood vessel development and EC function (Online Figure VIA and VIB). Furthermore, motif enrichment analysis of these clusters in class I enhancers revealed that they were possibly bound by lineage-determining transcription factors, such as ETV1 (ETS translocation variant), ETV2, and ERG (transcriptional regulator ERG; Online Figure VIC and VID). Together, these results indicate that class I enhancers are mostly active in iPSCs and possibly modulate the establishment of iPSC-specific gene expression during somatic cell reprogramming.
Cell Type–Specific H3K4me1 Enhancers Reflect Cell Type–Specific Gene Expression
Class II enhancers were the major part (94.2%) of cell type–specific enhancers identified between human iPSCs and their parental somatic cells. These enhancers showed cell type–specific enrichment of H3K27ac and H3K4me1, positively correlating with the enrichment of cofactor p300 and RNA Pol II (Figure 3A through 3D). Additionally, H3K4me3 displayed a similar cell type–specific distribution pattern, whereas the repressive mark H3K27me3 was not significantly enriched in a cell type–specific manner (Online Figure VIIA and VIIB). Next we grouped class II enhancers into 7 clusters based on their activation patterns in human iPSCs and somatic cells. Class II enhancers were highly cell type–specific with iPSC-specific enhancers (active in iPSCs) accounting for only 28.2%, in contrast to the fact that 79.8% of class I enhancers were active in iPSCs (Online Table I). Class II enhancers were either active (H3K4me1+/H3K27ac+) or silenced (H3K4me1−/H3K27ac−), not counting the poised enhancers that were prevalent in class I enhancers (Figure 2E). Most of the class II enhancers were located in gene bodies and intergenic regions. However, ≈10% of class II enhancers were situated in gene deserts devoid of protein-coding genes (Online Figure IVE), indicating the possible functional differences between class I and class II enhancers.
We then investigated the effects of class II enhancers on the cell type–specific gene expression by examining the expression levels of the nearby genes. Active class II enhancers were separated from silent enhancers by H3K27ac enrichment, although H3K4me1 could cover a broader genomic locus (Figure 3E). We observed consistently higher gene expression activities in active class II enhancers than those in silent class II enhancers across all clusters (Figure 3F; Online Figure VIIC), suggesting the former had greater functional activity in regulating gene expression.
We next looked into the functions of genes that were putatively regulated by class II enhancers. Interestingly, the genes regulated by class II enhancers were different from those regulated by class I enhancers. For example, nearby genes targeted by cluster 1 (iPSC specific) enhancers were mostly associated with chromatin modification, antiapoptosis, and organelle organization (Online Figure VIID), whereas flanking genes affected by cluster 5 (EC specific) enhancers were related to chemokine production, inflammatory response, and immune system process (Online Figure VIIE). These results indicate that compared with gene functions of class I enhancers that are mostly associated with cell type identity, class II enhancers seem to regulate the biological function of specific cell types. We next examined the transcription factor motifs enriched in class II enhancers. Interestingly, the top enriched transcription factor (TF) motifs in cluster 1 were PSC transcription factors (OCT4, SOX2, and NANOG), whereas those in cluster 4 were relevant to EC lineage determination (Figure 3G and 3H), which were distinct from the motifs enriched in class I enhancers (Online Figure VIC and VID). Additionally, distinct gene ontology terms were associated with the genes that were potentially regulated by class I and class II enhancers, respectively (Online Figure VIII). Taken together, these results suggest that class II enhancers reflect cell type–specific expression by regulating cell identity–determining TFs and modulate different biological functions compared with class I enhancers.
Ubiquitous H3K4me3 Promoters Are Prevalent in Human iPSCs and Somatic Cells
Because promoters are usually marked by H3K4me3 and located adjacent to the TSSs,26 we next probed the epigenetic signatures of promoters using H3K4me3, H3K27ac (active), H3K27me3 (repressive), and RNA Pol II (transcription; Figure 4A through 4D). We also interrogated the histone mark H3K4me1 and the cofactor p300, but did not find a significant enrichment in the promoter regions (Online Figure IXA and IXB). To exclude any potential enhancers, we only looked into regions within ±3 kb of TSSs. We identified 5230 promoter regions with differential enrichment of H3K27ac activity between human iPSCs and their parental cells (fibroblasts, ECs, and CPCs). We further divided them into 2 distinct groups according to the distribution of general promoter mark H3K4me3: class I promoters with ubiquitous H3K4me3 distribution versus class II promoters with cell type–specific H3K4me3 enrichment (Figure 4E). Promoters with both H3K4me3 and H3K27ac were considered active, promoters with H3K4me3 but without H3K27ac were poised, and promoters with neither H3K4me3 nor H3K27ac were inactive in a given cell type. Surprisingly, ≈75% of these promoters (3925) were class I promoters with ubiquitous H3K4me3 enrichment and cell type–specific distribution of H3K27ac (Figure 4A through 4D; Online Table II). In contrast, the repressive mark H3K27me3 was negatively correlated with the active mark H3K27ac (Figure 4C). The genes driven by active class I promoters (H3K27ac+) showed a higher transcriptional activity than those with low H3K27ac enrichment in any given cell type (Figure 4F and 4G; Online Figure IXC). Although only a small percentage (5.8%) of enhancers were class I enhancers with ubiquitous H3K4me1 enrichment, class I promoters were much more prevalent (75%) and constituted the majority of promoters driving strong cell type–specific gene expression.
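The promoter definitions above amount to a small decision rule, sketched here in Python. The function names `promoter_state` and `promoter_class`, the boolean inputs, and the "unassigned" label are hypothetical; the text does not define a state for H3K27ac without H3K4me3, so that combination is left unassigned rather than guessed.

```python
def promoter_state(h3k4me3: bool, h3k27ac: bool) -> str:
    """Promoter state in one cell type, following the definitions in the text."""
    if h3k4me3 and h3k27ac:
        return "active"
    if h3k4me3:
        return "poised"
    if not h3k27ac:
        return "inactive"
    return "unassigned"  # H3K4me3-/H3K27ac+ is not defined in the text


def promoter_class(h3k4me3_by_cell_type: dict) -> str:
    """Class I: H3K4me3 in every cell type; class II: cell type-specific H3K4me3."""
    if not h3k4me3_by_cell_type:
        raise ValueError("at least one cell type is required")
    return "class I" if all(h3k4me3_by_cell_type.values()) else "class II"


# Toy example with made-up enrichment calls.
ubiquitous = {"iPSC": True, "EC": True, "CPC": True, "fibroblast": True}
ec_specific = {"iPSC": False, "EC": True, "CPC": False, "fibroblast": False}
classes = [promoter_class(ubiquitous), promoter_class(ec_specific)]

# Consistency check on the counts reported in the text.
class_i_share = 3925 / (3925 + 1305)
```

The reported counts are internally consistent: 3925 class I and 1305 class II promoters sum to 5230, and class I accounts for roughly 75% of them.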
Class II Promoters With Cell Type–Specific H3K4me3 Enrichment Are Weaker in Driving Gene Expression
About a quarter of cell type–specific promoters (1305) were enriched with cell type–specific H3K4me3 and H3K27ac and termed class II promoters (Figure 4E). Class II promoters were marked with cell type–specific H3K27ac, H3K4me3, and RNA Pol II but negatively correlated with the repressive mark H3K27me3 (Figure 5A through 5D). This cell type–specific enrichment pattern was also observed for histone mark H3K4me1 and cofactor p300 (Online Figure XA and XB). The genes associated with class II promoters displayed dramatic cell type–specific expression patterns: active promoters (H3K4me3+/H3K27ac+) drove higher gene expression than inactive promoters (H3K4me3−/H3K27ac−) in any given cell type (Figure 5E and 5F). Class II promoters were further divided into 7 cell type–specific clusters (Online Table II). For example, cluster 1 promoters were active in iPSCs, whereas cluster 2 promoters were active in somatic cells. The cell type–specific gene expression regulated by these clusters was correlated with chromatin mark (H3K27ac) enrichment (Figure 5E and 5F; Online Figure XC). However, the average levels of gene expression driven by class II promoters were much lower than those driven by class I promoters (Figure 4F and 4G), suggesting a stronger promoter activity with consistent H3K4me3 presence in all cell types. We then surveyed the potential TF motifs enriched in class I and class II promoters. Compared with enhancers, the TF motif enrichment scores for promoters were much lower, although the motif of the stem cell factor POU5F1 was enriched in cluster 1 of class I promoters (Figure 5G). TF motifs enriched in class II promoters were distinct from those in class I promoters (Figure 5H), suggesting that different biological functions are modulated by these 2 types of promoters. In addition, the biological functions of genes regulated by class I and class II promoters were clearly separated.
Class I promoters were mostly associated with cellular development and gene expression regulation, whereas class II promoters regulated genes relevant to cellular and molecular functions and metabolic processes (Online Figure XI). In summary, we identified 2 classes of cell type–specific promoters with distinct gene regulatory functions that were primed by histone chromatin marks (H3K4me3 and H3K27ac).
Cell Type–Specific Gene Expression Regulated by Promoters and Enhancers
Cell type–specific gene expression is regulated by distal enhancers and driven by proximal promoters.8 To illustrate the combinatorial influence of promoters and enhancers on cell type–specific transcriptional activity, we analyzed the common genes regulated by both class I and II promoters and enhancers activated in a given cell type. These genes were divided into 4 groups: class I enhancers/class I promoters (E1_P1, 497 genes), class I enhancers/class II promoters (E1_P2, 162 genes), class II enhancers/class I promoters (E2_P1, 2245 genes), and class II enhancers/class II promoters (E2_P2, 882 genes; Online Figure XIIA). These overlapping genes were determined by the genomic locations near the regulatory DNA elements, so overlaps within promoters and enhancers were also observed (Figure 6A). We next examined the expression of these common genes regulated by the combination of promoters and enhancers. Regardless of the presence of enhancers, the transcriptional activity driven by class I promoters was much stronger than that driven by class II promoters, although the average transcript levels differed among individual cell types (Online Figure XIIB and XIIC). In particular, gene expression was predominantly affected by the activity of promoters, with class I promoters showing a higher gene expression than class II promoters in any combination with enhancers in both human iPSCs and somatic cells (Figure 6B). Accordingly, the gene ontology analysis also showed higher enrichment scores (lower P values) associated with common genes mediated by class I promoters than with those mediated by class II promoters, independent of class I or class II enhancer activity (Online Figure XIID). Finally, we constructed gene regulatory networks associated with cell type–specific transcription factors, regulatory DNA elements, and gene expression (mRNA transcripts).
For iPSCs, all classes of regulatory elements (promoters and enhancers) were potentially targeted by stem cell factors NANOG, OCT4, STAT3 (signal transducer and activator of transcription 3), and SOX2 (Figure 6C). However, in ECs, class II enhancers preferentially interacted with endothelial TFs, such as ETV2, NR2F2 (COUP transcription factor 2), and GATA2 (endothelial transcription factor GATA-2; Figure 6D), suggesting that human iPSCs and somatic cells from the heart exhibit distinct preferences in selecting regulatory DNA elements to maintain their cell type–specific transcriptional program. Taken together, these results demonstrate that promoters determine the transcriptional activity and highlight the role of enhancers on the cell type–specific transcriptional activation.
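The four-way grouping of genes by enhancer and promoter class, and the comparison of their average expression, can be sketched as follows. This is an illustrative sketch only: the helper names `group_label` and `mean_expression_by_group`, the tuple layout, and the toy expression values are hypothetical, and the plain arithmetic mean stands in for whatever summary statistic the authors used. The group labels and group sizes are taken from the text.

```python
from collections import defaultdict
from statistics import mean

# Group sizes as reported in the text.
REPORTED_GROUP_SIZES = {"E1_P1": 497, "E1_P2": 162, "E2_P1": 2245, "E2_P2": 882}


def group_label(enhancer_class: int, promoter_class: int) -> str:
    """Build a label such as 'E2_P1' from enhancer and promoter class numbers."""
    if enhancer_class not in (1, 2) or promoter_class not in (1, 2):
        raise ValueError("classes must be 1 or 2")
    return f"E{enhancer_class}_P{promoter_class}"


def mean_expression_by_group(genes) -> dict:
    """genes: iterable of (name, enhancer_class, promoter_class, expression)."""
    grouped = defaultdict(list)
    for _name, enhancer_class, promoter_class, expression in genes:
        grouped[group_label(enhancer_class, promoter_class)].append(expression)
    return {label: mean(values) for label, values in grouped.items()}


# Toy example with made-up gene names and expression values.
toy_genes = [
    ("geneA", 1, 1, 10.0),
    ("geneB", 1, 1, 20.0),
    ("geneC", 2, 2, 4.0),
]
toy_means = mean_expression_by_group(toy_genes)
total_common_genes = sum(REPORTED_GROUP_SIZES.values())
```

With real data, comparing the P1 groups against the P2 groups within each enhancer class would be one way to examine the promoter-dominant effect described above.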
Functional Validation of Putative Regulatory DNA Elements
To functionally validate the cell type–specific regulatory DNA elements, we first performed data mining to locate the identified enhancers in the Vista Enhancer Browser (https://enhancer.lbl.gov). We found that many of these cell type–specific human enhancers could modulate the tissue-specific gene expression in transgenic mouse embryos (Figure 7A), highlighting the evolutionary conservation of these regulatory elements between human and mouse.27 We retrieved several human enhancer elements that could drive the tissue-specific expression of the reporter, particularly in the heart, blood vessel, and somite of mouse transgenic embryos (Figure 7B through 7D). To further confirm the activity of these enhancer elements in human cells, we made enhancer reporter constructs with a basal promoter driving a firefly luciferase reporter (Figure 7E). We transfected multiple types of human cells, including iPSCs, iPSC-derived cardiomyocytes, ECs (fetal aorta), and fibroblasts (fetal heart) to test the cell type–specific activation of these enhancer elements. As predicted, the basal construct pGL3-promoter lacking any enhancer elements did not show cell type–specific reporter activity (Figure 7F). In contrast, the vector including an SV40 enhancer drove higher reporter expression in HEK293T cells than in any other cell type (Figure 7G), indicating the cell type–specific activation of enhancer elements. Using the human enhancer reporter vectors for transfection, we observed a cell type–specific enhancement of reporter luciferase activity, with most of these enhancers highly active in cardiomyocytes and ECs (Figure 7H). These results were consistent with the tissue-specific gene expression in the transgenic embryos because these enhancers could modulate the reporter genes specifically in the heart (cardiomyocytes) and blood vessels (ECs; Figure 7A through 7D).
To further illustrate the target genes that are possibly modulated by these cell type–specific enhancers, we surveyed the expression of genes adjacent to these enhancer elements. HS2205 was a 4.8 kb enhancer element residing in the GATA4 locus. GATA4 was highly expressed in cardiomyocytes compared with other cell types (Figure 7I), coinciding with a high level of H3K27ac enrichment in this region (Figure 7J). Simultaneously, HS2205 could exogenously drive the heart-specific mRNA expression in transgenic mouse embryos (Figure 7D), indicating that GATA4 is likely regulated by this enhancer. In addition, we found that HS1887 could potentially regulate heart (cardiomyocyte)-specific expression of TEAD3 and HS2027 would possibly modulate the expression of TANC1 in cardiomyocytes and ECs (Online Figure XIIIA and XIIIB). This prediction was further consolidated with heart-specific reporter gene expression driven by these enhancer elements (Online Figure XIIIC and XIIID). The activation of histone mark H3K27ac was also enriched in the element HS2027 in a cell type–specific manner, which was positively correlated with the cell type–specific gene expression in somatic cells (Online Figure XIIIE). In summary, we validated the cardiac-specific enhancer elements in human cells and transgenic mouse embryos and identified the target genes that could be potentially regulated by these enhancers in the heart.
In this study, we identified 2 classes of cell type–specific enhancers and promoters based on chromatin histone marks (H3K27ac, H3K4me1, and H3K4me3) enrichment in human iPSCs and somatic cells (Figure 6E). We found that ubiquitous H3K4me1 enhancers (class I) were mostly active in human iPSCs, whereas cell type–specific H3K4me1 enhancers (class II) reflected cell type–specific gene expression. Likewise, we discovered 2 types of promoters with ubiquitous (class I) H3K4me3 and cell type–specific (class II) H3K4me3 enrichment in multiple cell types. Moreover, we validated the function of these human enhancer elements in both transgenic mouse embryos and human cells, and identified many enhancers that could potentially modulate cardiac cell-specific gene expression. We conclude that promoters determine the transcriptional activity, whereas enhancers confer cell type–specific gene expression in a particular cell type. Collectively, our data may prove valuable for future efforts to understand the epigenetic chromatin remodeling of regulatory DNA elements in cardiac development and heart diseases.
Previous studies identified poised enhancers marked with H3K4me1 and H3K27me3 but depleted of H3K27ac as underlying developmental enhancers in human ESCs.13 Although later studies profiled chromatin mark dynamics during human ESC differentiation, they did not investigate the cell type–specific epigenetic features of regulatory DNA elements (promoters and enhancers) between human PSCs and tissue-derived primary cells.28,29 During the process of somatic cell reprogramming, DNA regulatory elements must be epigenetically remodeled to establish stem cell signatures associated with transcription factor binding redistribution.30 Our study uncovers 2 classes of cell type–specific DNA regulatory elements in human iPSCs and somatic cells. Although other studies have focused on cellular differentiation, particularly ESC/iPSCs and their differentiated progeny, here we used tissue-derived somatic cells to benchmark the regulatory DNA elements for 2 reasons. First, iPSC-derived differentiated cells are usually immature and more like fetal-stage cells. Second, differentiated stem cell progeny display global epigenetic profiles closer to their parental iPSCs than tissue-derived primary cells.31 The epigenetic difference between human iPSCs and somatic cells will be informative for improving in vitro cardiac lineage differentiation. In addition, transcriptional variation among iPSCs derived from different cell types is mostly contributed by genetic compositions among individuals,32 and cell type–specific gene expression is completely remodeled to iPSC-specific transcriptional profiles. Therefore, our genomic data pave the way to understanding how cell type–specific transcriptional program is modulated by the interactions between regulatory DNA elements, chromatin marks, and transcription factors during somatic cell reprogramming and cardiac lineage differentiation.
The reciprocal interactions between promoters and enhancers determine the spatiotemporal gene expression during embryonic development. Promoters typically ensure the accurate transcriptional initiation of a gene, whereas enhancers are primarily responsible for the precise regulation of gene expression in a spatial and temporal manner.33 In this study, we demonstrate the combinatorial effects of promoters and enhancers on cell type–specific gene expression. For genes that are presumptively regulated by both promoters and enhancers, promoters tend to control the quantity of mRNA transcripts, whereas enhancers execute cell type–specific gene expression, although RNA Pol II can bind both of these regulatory regions and initiate transcription. The long-distance interaction of promoters and enhancers mediated by the mediator and cohesin complex may account for their functional control of gene expression in a cell type–specific manner. Recent studies on higher-order chromatin organization in human ESCs and differentiated cells also suggest that enhancers are actively involved in the looping interactions with genes and promoters.34
Functional validation of human regulatory DNA elements is crucial for understanding the roles of regulatory elements during embryonic development and disease pathology. Recent genome-wide association studies have identified thousands of human DNA variants associated with complex diseases, the majority of which lie in noncoding DNA elements.35 However, the molecular mechanisms of disease-associated loci are rarely elucidated because of the lack of systematic annotation of functional noncoding elements. Epigenomic annotation of cardiac-specific regulatory DNA elements has facilitated the understanding of how previously identified noncoding DNA variants contribute to the pathogenesis of cardiovascular diseases.36 In this respect, our study identified several enhancer elements that could regulate the expression of cardiac-specific genes (such as GATA4) associated with congenital heart disease.37 This is important because future interrogation of such disease-associated genetic variants in the regulatory DNA elements may generate novel insights into personalized diagnosis and treatment of cardiovascular diseases.
In summary, we have identified 2 classes of cell type–specific enhancers and promoters in human iPSCs and somatic cells. Class I and class II regulatory DNA elements exhibit distinct regulatory roles in cell type–specific gene expression in a given cell type. Our study provides an invaluable resource for understanding how cell type–specific gene expression is maintained and modulated by regulatory DNA elements, as well as how cell identity is epigenetically preserved by chromatin modifications in human PSCs and cardiac cells.19 Given that a large number of genetic variants associated with human diseases are located in regulatory DNA elements, our data will also shed light on potential genetic and epigenetic interventions to correct abnormal gene expression in a given cell type under disease conditions.
Acknowledgments
We thank Larry Bowen, Blake Wu, and Angela Zhang for critical editing of this article. We would like to thank Drs Joanna Wysocka and Tomek Swigut for their suggestions on ChIP-seq experiments and data analysis. We thank the Stanford Center for Genomics and Personalized Medicine for assistance with high-throughput DNA sequencing (supported by the NIH grant S10OD020141).
Sources of Funding
This study was supported by the National Institutes of Health (NIH) grants R01 HL128170, R01 HL123968, R01 HL113006, R01 HL130020, R01 HL126527 (J.C. Wu), P01 GM099130 (M.P. Snyder), R24 HL117756 (J.C. Wu, M.P. Snyder), American Heart Association Merit Award (J.C. Wu), and California Institute for Regenerative Medicine grant GC1R-06673-A (M.P. Snyder). M.T. Zhao was partially supported by a Research Award from the Lucile Packard Foundation for Children’s Health, Stanford NIH-NCATS-CTSA UL1 TR001085, and Child Health Research Institute of Stanford University.
In September 2017, the average time from submission to first decision for all original research papers submitted to Circulation Research was 13 days.
This manuscript was sent to Mark Sussman, Consulting Editor, for review by expert referees, editorial decision, and final disposition.
The online-only Data Supplement is available with this article at http://circres.ahajournals.org/lookup/suppl/doi:10.1161/CIRCRESAHA.117.311367/-/DC1.
Nonstandard Abbreviations and Acronyms
- CPC: cardiac progenitor cell
- EC: endothelial cell
- ESC: embryonic stem cell
- iPSC: induced pluripotent stem cell
- Pol II: RNA polymerase II
- PSC: pluripotent stem cell
- TSS: transcription start site
- Received May 18, 2017.
- Revision received September 27, 2017.
- Accepted October 12, 2017.
- © 2017 American Heart Association, Inc.
- Paige SL, Thomas S, Stoick-Cooper CL, Wang H, Maves L, Sandstrom R, Pabon L, Reinecke H, Pratt G, Keller G, Moon RT, Stamatoyannopoulos J, Murry CE
- Hu S, Zhao MT, Jahanbani F, Shao NY, Lee WH, Chen H, Snyder MP, Wu JC
- Melé M, Ferreira PG, Reverter F, et al
- Choi J, Lee S, Mallard W, Clement K, Tagliazucchi GM, Lim H, Choi IY, Ferrari F, Tsankov AM, Pop R, Lee G, Rinn JL, Meissner A, Park PJ, Hochedlinger K
- Heintzman ND, Stuart RK, Hon G, Fu Y, Ching CW, Hawkins RD, Barrera LO, Van Calcar S, Qu C, Ching KA, Wang W, Weng Z, Green RD, Crawford GE, Ren B
- Chronis C, Fiziev P, Papp B, Butz S, Bonora G, Sabri S, Ernst J, Plath K
- Maurano MT, Humbert R, Rynes E, et al
- Gupta RM, Hadaya J, Trehan A, et al
- Garg V, Kathiriya IS, Barnes R, Schluterman MK, King IN, Butler CA, Rothrock CR, Eapen RS, Hirayama-Yamada K, Joo K, Matsuoka R, Cohen JC, Srivastava D
Novelty and Significance
What Is Known?
The majority of the human genome consists of noncoding elements; protein-coding genes account for <2%.
Genome-wide association studies reveal that more than three quarters of disease-associated single nucleotide polymorphisms are located in regulatory DNA elements.
Regulatory DNA elements, such as promoters and enhancers, play a pivotal role in modulating the spatial and temporal gene expression during cardiac development.
What New Information Does This Article Contribute?
State-of-the-art next-generation sequencing technology (RNA-seq and ChIP-seq [chromatin immunoprecipitation combined with high-throughput sequencing]) was used to profile transcriptional and epigenetic changes in regulatory DNA elements of human induced pluripotent stem cells and somatic cells.
Cell type–specific gene expression and epigenetic signatures in regulatory elements are dramatically remodeled during cellular reprogramming.
Cardiac-specific enhancer elements were experimentally validated in transgenic mouse embryos and human cells.
Regulatory DNA elements (promoters and enhancers) mediate cell type–specific gene expression during cardiac development. However, cell type–specific regulatory DNA elements have been largely unknown in the human heart. Here, we decipher the epigenetic signatures of regulatory DNA elements in cardiac progenitor cells, endothelial cells, and fibroblasts derived from the human fetal heart. We reveal 2 classes of regulatory DNA elements according to the enrichment of epigenetic marks (H3K4me1 and H3K4me3): class I elements are enriched with these marks in all cell types, whereas class II elements are labeled in a cell type–specific manner. Class I promoters exhibit stronger transcriptional regulation of the nearby genes, regardless of the presence of distal enhancers. We validate the functions of human enhancer elements in mouse embryos and human cells and identify enhancer elements that could mediate cardiac-specific gene expression.
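The class I versus class II distinction can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study. It assumes that peak calling has already reduced each regulatory element to the set of cell types in which the mark (H3K4me1 or H3K4me3) is enriched. The cell type labels, the element names, and the functions `classify_element` and `classify_all` are hypothetical. Treating any element marked in some but not all cell types as class II is a simplifying assumption.

```python
from typing import Dict, Iterable, Set

# Hypothetical labels for the profiled cell types (assumption, for illustration only).
CELL_TYPES = ("iPSC", "CPC", "EC", "fibroblast")


def classify_element(marked_in: Iterable[str],
                     cell_types: Iterable[str] = CELL_TYPES) -> str:
    """Assign one regulatory element to a class from its mark presence.

    marked_in: cell types in which the mark is enriched at this element.
    Returns "class I" if the mark is present in every cell type,
    "class II" if it is present in some but not all cell types,
    and "unmarked" if it is present in none.
    """
    all_types: Set[str] = set(cell_types)
    marked = set(marked_in) & all_types  # ignore labels outside the panel
    if not marked:
        return "unmarked"
    if marked == all_types:
        return "class I"
    return "class II"


def classify_all(peaks: Dict[str, Iterable[str]]) -> Dict[str, str]:
    """Classify every element in a {element name: marked cell types} mapping."""
    return {name: classify_element(cells) for name, cells in peaks.items()}


if __name__ == "__main__":
    # Toy input with made-up element names, not data from this study.
    example_peaks = {
        "promoter_A": {"iPSC", "CPC", "EC", "fibroblast"},
        "enhancer_B": {"EC"},
        "enhancer_C": set(),
    }
    for name, label in classify_all(example_peaks).items():
        print(name, label)
```

In practice, the presence calls would come from thresholded ChIP-seq signal at each element, and the choice of threshold would affect how many elements fall into each class.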