Wrzodek, Clemens and Büchel, Finja and Hinselmann, Georg and Eichner, Johannes and Mittag, Florian and Zell, Andreas

Linking the epigenome to the genome: Correlation of different features to DNA methylation of CpG islands

PLoS ONE, vol. 7, no. 4 (April 2012), Public Library of Science, e35327


Abstract

DNA methylation of CpG islands plays a crucial role in the regulation of gene expression. More than half of all human promoters contain CpG islands with a tissue-specific methylation pattern in differentiated cells. Still today, the whole process of how DNA methyltransferases determine which region should be methylated is not completely revealed. There are many hypotheses of which genomic features are correlated to the epigenome that have not yet been evaluated. Furthermore, many explorative approaches of measuring DNA methylation are limited to a subset of the genome and thus, cannot be employed, e.g., for genome-wide biomarker prediction methods. In this study, we evaluated the correlation of genetic, epigenetic and hypothesis-driven features to DNA methylation of CpG islands. To this end, various binary classifiers were trained and evaluated by cross-validation on a dataset comprising DNA methylation data for 190 CpG islands in HEPG2, HEK293, fibroblasts and leukocytes. We achieved an accuracy of up to 91% with an MCC of 0.8 using ten-fold cross-validation and ten repetitions. With these models, we extended the existing dataset to the whole genome and thus, predicted the methylation landscape for the given cell types. The method used for these predictions is also validated on another external whole-genome dataset. Our results reveal features correlated to DNA methylation and confirm or disprove various hypotheses of DNA methylation related features. This study confirms correlations between DNA methylation and histone modifications, DNA structure, DNA sequence, genomic attributes and CpG island properties. Furthermore, the method has been validated on a genome-wide dataset from the ENCODE consortium. The developed software, as well as the predicted datasets and a web-service to compare methylation states of CpG islands are available at http://www.cogsys.cs.uni-tuebingen.de/software/dna-methylation/.
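The abstract reports classifier quality as an accuracy together with a Matthews correlation coefficient (MCC). As a minimal sketch of how these two metrics follow from a binary confusion matrix, the function names and the example counts below are illustrative assumptions, not values or code from the paper:

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correctly classified examples."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal sum is zero (a common convention),
    because the coefficient is undefined in that case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts (NOT from the paper): a balanced test set of 100
# CpG islands, 45 true positives, 45 true negatives, 5 errors per class.
acc = accuracy(tp=45, tn=45, fp=5, fn=5)
coef = mcc(tp=45, tn=45, fp=5, fn=5)
print(f"accuracy={acc:.2f}, MCC={coef:.2f}")
```

Unlike accuracy, the MCC stays informative when the two classes are imbalanced, which is one reason to report the two together.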


Downloads and Links

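doi: 10.1371/journal.pone.0035327
pdf: http://www.ra.cs.uni-tuebingen.de/mitarb/wrzodek/publications/Linking%20the%20Epigenome%20to%20the%20Genome%20-%20Correlation%20of%20CpG%20islands.pdf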


BibTeX

@article{Wrzodek2012_1,
  author = {Wrzodek, Clemens and B{\"u}chel, Finja and Hinselmann, Georg and Eichner,
	Johannes and Mittag, Florian and Zell, Andreas},
  title = {{Linking the epigenome to the genome: Correlation of different features
	to DNA methylation of CpG islands}},
  journal = {PLoS ONE},
  publisher = {Public Library of Science},
  year = {2012},
  volume = {7},
  pages = {e35327},
  number = {4},
  month = apr,
  abstract = {DNA methylation of CpG islands plays a crucial role in the regulation
	of gene expression. More than half of all human promoters contain
	CpG islands with a tissue-specific methylation pattern in differentiated
	cells. Still today, the whole process of how DNA methyltransferases
	determine which region should be methylated is not completely revealed.
	There are many hypotheses of which genomic features are correlated
	to the epigenome that have not yet been evaluated. Furthermore, many
	explorative approaches of measuring DNA methylation are limited to
	a subset of the genome and thus, cannot be employed, e.g., for genome-wide
	biomarker prediction methods. In this study, we evaluated the correlation
	of genetic, epigenetic and hypothesis-driven features to DNA methylation
	of CpG islands. To this end, various binary classifiers were trained
	and evaluated by cross-validation on a dataset comprising DNA methylation
	data for 190 CpG islands in HEPG2, HEK293, fibroblasts and leukocytes.
	We achieved an accuracy of up to 91\% with an MCC of 0.8 using ten-fold
	cross-validation and ten repetitions. With these models, we extended
	the existing dataset to the whole genome and thus, predicted the
	methylation landscape for the given cell types. The method used for
	these predictions is also validated on another external whole-genome
	dataset. Our results reveal features correlated to DNA methylation
	and confirm or disprove various hypotheses of DNA methylation related
	features. This study confirms correlations between DNA methylation
	and histone modifications, DNA structure, DNA sequence, genomic attributes
	and CpG island properties. Furthermore, the method has been validated
	on a genome-wide dataset from the ENCODE consortium. The developed
	software, as well as the predicted datasets and a web-service to
	compare methylation states of CpG islands are available at http://www.cogsys.cs.uni-tuebingen.de/software/dna-methylation/.},
  doi = {10.1371/journal.pone.0035327},
  pdf = {http://www.ra.cs.uni-tuebingen.de/mitarb/wrzodek/publications/Linking%20the%20Epigenome%20to%20the%20Genome%20-%20Correlation%20of%20CpG%20islands.pdf},
  url = {http://dx.doi.org/10.1371/journal.pone.0035327}
}