Wrzodek, Clemens and Büchel, Finja and Ruff, Manuel and Dräger, Andreas and Zell, Andreas

Precise generation of systems biology models from KEGG pathways.

BMC Systems Biology, vol. 7, no. 1 (2013), p. 15


Abstract

Background: The KEGG PATHWAY database provides a plethora of pathways for a diversity of organisms. All pathway components are directly linked to other KEGG databases, such as KEGG COMPOUND or KEGG REACTION. Therefore, the pathways can be extended with an enormous amount of information and provide a foundation for initial structural modeling approaches. As a drawback, KGML-formatted KEGG pathways are primarily designed for visualization purposes and often omit important details for the sake of a clear arrangement of their entries. Thus, a direct conversion into systems biology models would produce incomplete and erroneous models.

Results: Here, we present a precise method for processing and converting KEGG pathways into initial metabolic and signaling models encoded in the standardized community pathway formats SBML (Levels 2 and 3) and BioPAX (Levels 2 and 3). This method involves correcting invalid or incomplete KGML content, creating complete and valid stoichiometric reactions, translating relations to signaling models and augmenting the pathway content with various information, such as cross-references to Entrez Gene, OMIM, UniProt, ChEBI, and many more. Finally, we compare several existing conversion tools for KEGG pathways and show that the conversion from KEGG to BioPAX does not involve a loss of information, whilst lossless translations to SBML can only be performed using SBML Level 3, including its recently proposed qualitative models and groups extension packages.

Conclusions: Building correct BioPAX and SBML signaling models from the KEGG database is a unique characteristic of the proposed method. Further, there is no other approach that is able to appropriately construct metabolic models from KEGG pathways, including correct reactions with stoichiometry. The resulting initial models, which contain valid and comprehensive SBML or BioPAX code and a multitude of cross-references, lay the foundation to facilitate further modeling steps.


Downloads and Links

DOI: 10.1186/1752-0509-7-15
PDF: http://www.biomedcentral.com/content/pdf/1752-0509-7-15.pdf


BibTeX

@article{Wrzodek2013a,
  author = {Wrzodek, Clemens and B{\"u}chel, Finja and Ruff, Manuel and Dr{\"a}ger, Andreas and Zell, Andreas},
  title = {Precise generation of systems biology models from {KEGG} pathways},
  journal = {BMC Systems Biology},
  year = {2013},
  volume = {7},
  pages = {15},
  number = {1},
  month = jan,
  abstract = {Background: The KEGG PATHWAY database provides a plethora of pathways
	for a diversity of organisms. All pathway components are directly
	linked to other KEGG databases, such as KEGG COMPOUND or KEGG REACTION.
	Therefore, the pathways can be extended with an enormous amount of
	information and provide a foundation for initial structural modeling
	approaches. As a drawback, KGML-formatted KEGG pathways are primarily
	designed for visualization purposes and often omit important details
	for the sake of a clear arrangement of their entries. Thus, a direct
	conversion into systems biology models would produce incomplete and
	erroneous models.

	Results: Here, we present a precise method for processing and converting
	KEGG pathways into initial metabolic and signaling models encoded in the
	standardized community pathway formats SBML (Levels 2 and 3) and BioPAX
	(Levels 2 and 3). This method involves correcting invalid or incomplete KGML
	content, creating complete and valid stoichiometric reactions, translating
	relations to signaling models and augmenting the pathway content with
	various information, such as cross-references to Entrez Gene, OMIM, UniProt,
	ChEBI, and many more. Finally, we compare several existing conversion tools
	for KEGG pathways and show that the conversion from KEGG to BioPAX
	does not involve a loss of information, whilst lossless translations
	to SBML can only be performed using SBML Level 3, including its recently
	proposed qualitative models and groups extension packages.

	Conclusions: Building correct BioPAX and SBML signaling models from the KEGG
	database is a unique characteristic of the proposed method. Further, there
	is no other approach that is able to appropriately construct metabolic
	models from KEGG pathways, including correct reactions with stoichiometry.
	The resulting initial models, which contain valid and comprehensive
	SBML or BioPAX code and a multitude of cross-references, lay the
	foundation to facilitate further modeling steps.},
  doi = {10.1186/1752-0509-7-15},
  issn = {1752-0509},
  keywords = {KEGG, KGML, SBML, BioPAX, modeling, systems biology, qualitative modeling,
	quantitative modeling, converter, comparison},
  pdf = {http://www.biomedcentral.com/content/pdf/1752-0509-7-15.pdf},
  url = {http://www.biomedcentral.com/1752-0509/7/15}
}