Fröhlich, Holger and Wegner, Jörg K. and Sieker, Florian and Zell, Andreas

Kernel Functions for Attributed Molecular Graphs - A New Similarity Based Approach To ADME Prediction in Classification and Regression

QSAR & Combinatorial Science vol. 25 (2005), no. 4, Wiley, pp. 317-326


Abstract

Kernel methods, like the well-known Support Vector Machine (SVM), have received growing attention in recent years for designing QSAR models that have a high predictive strength. One of the key concepts of SVMs is the usage of a so-called kernel function, which can be thought of as a special similarity measure. In this paper we consider kernels for molecular structures, which are based on a graph representation of chemical compounds. The similarity score is calculated by computing an optimal assignment of the atoms from one molecule to those of another one, including information on specific chemical properties, membership in a substructure (e.g., aromatic ring, carbonyl group, etc.) and neighborhood for each atom. We show that by using this kernel we can achieve a generalization performance comparable to a classical model with a few descriptors, which are a priori known to be relevant for the problem, and significantly better results than with and without performing an automatic descriptor selection. For this purpose we investigate ADME classification and regression datasets for predicting bioavailability (Yoshida), Human Intestinal Absorption (HIA), Blood-Brain-Barrier (BBB) penetration and a dataset consisting of four different inhibitor classes (SOL). We further explore the effect of combining our kernel with a problem-dependent descriptor set. We also demonstrate the usefulness of an extension of our method to a reduced graph representation of molecules, in which certain structural features, such as rings, donors or acceptors, are represented as a single node in the molecular graph.
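
The optimal assignment idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Everything named here is an assumption made for illustration: the per-atom feature vectors, the Gaussian atom similarity `atom_similarity`, its `gamma` parameter, and the normalization. The sketch ignores atom neighborhoods and substructure membership, and it searches all assignments by brute force, so it is only practical for toy inputs. Each atom of the smaller molecule is matched to a distinct atom of the larger one so that the summed atom similarities are maximal.

```python
from itertools import permutations
from math import exp


def atom_similarity(a, b, gamma=0.5):
    """Hypothetical atom kernel: Gaussian RBF on per-atom feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return exp(-gamma * sq_dist)


def optimal_assignment_score(mol_a, mol_b):
    """Best total similarity over all one-to-one assignments of the atoms
    of the smaller molecule to atoms of the larger one (brute force)."""
    small, large = sorted((mol_a, mol_b), key=len)
    if not small:
        return 0.0
    return max(
        sum(atom_similarity(s, large[j]) for s, j in zip(small, perm))
        for perm in permutations(range(len(large)), len(small))
    )


def normalized_score(mol_a, mol_b):
    """Scale the score so that a molecule compared with itself scores 1.
    Assumes both molecules contain at least one atom."""
    denom = (optimal_assignment_score(mol_a, mol_a)
             * optimal_assignment_score(mol_b, mol_b)) ** 0.5
    return optimal_assignment_score(mol_a, mol_b) / denom


# Toy molecules: each atom is (atomic number, aromatic flag); assumed encoding.
mol_a = [(6.0, 1.0), (8.0, 0.0)]
mol_b = [(8.0, 0.0), (6.0, 1.0), (7.0, 0.0)]
print(optimal_assignment_score(mol_a, mol_b))
print(normalized_score(mol_a, mol_b))
```

For realistic molecule sizes the exhaustive search would have to be replaced by a polynomial-time assignment solver such as the Hungarian method.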


BibTeX

@article{2005_27,
  author = {Fr\"ohlich, Holger and Wegner, J\"org K. and Sieker, Florian and
	Zell, Andreas},
  title = {Kernel Functions for Attributed Molecular Graphs - A New Similarity
	Based Approach To {ADME} Prediction in Classification and Regression},
  journal = {QSAR \& Combinatorial Science},
  year = {2005},
  volume = {25},
  pages = {317--326},
  number = {4},
  abstract = {Kernel methods, like the well-known Support Vector Machine (SVM),
	have received growing attention in recent years for designing QSAR
	models that have a high predictive strength. One of the key concepts
	of SVMs is the usage of a so-called kernel function, which can be
	thought of as a special similarity measure. In this paper we consider
	kernels for molecular structures, which are based on a graph representation
	of chemical compounds. The similarity score is calculated by computing
	an optimal assignment of the atoms from one molecule to those of
	another one, including information on specific chemical properties,
	membership in a substructure (e.g., aromatic ring, carbonyl group,
	etc.) and neighborhood for each atom. We show that by using this
	kernel we can achieve a generalization performance comparable to
	a classical model with a few descriptors, which are a priori known
	to be relevant for the problem, and significantly better results
	than with and without performing an automatic descriptor selection.
	For this purpose we investigate ADME classification and regression
	datasets for predicting bioavailability (Yoshida), Human Intestinal
	Absorption (HIA), Blood-Brain-Barrier (BBB) penetration and a dataset
	consisting of four different inhibitor classes (SOL). We further
	explore the effect of combining our kernel with a problem-dependent
	descriptor set. We also demonstrate the usefulness of an extension
	of our method to a reduced graph representation of molecules, in
	which certain structural features, such as rings, donors or acceptors,
	are represented as a single node in the molecular graph.},
  doi = {10.1002/qsar.200510135},
  issn = {1611-020X},
  publisher = {Wiley},
  url = {http://www.cogsys.cs.uni-tuebingen.de/publikationen/2005/froehlich05QSAR&CombSci.pdf}
}