Mining and Visualizing Large Anticancer Drug Discovery Databases
Leming M. Shi, Jae K. Lee, Yi Fan, Timothy G. Myers, Mark Waltham,
Darren T. Andrews, Uwe Scherf, Kenneth D. Paull, and John N. Weinstein.
J Chem Inf Comput Sci 2000
To find more effective anticancer drugs,
the U.S. National Cancer Institute (NCI) screens a large number of compounds in vitro
against 60 human cancer cell lines from different organs of origin. About 70,000
compounds have been tested in the program since 1990, and each tested compound can be
characterized by a vector (i.e., a "fingerprint") of 60 anticancer activity, or
-log(GI50), values. GI50 is the concentration required to
inhibit cell growth by 50% compared with untreated controls. Although cell growth
inhibitory activity for a single cell line is not very informative, activity patterns
across the 60 cell lines can provide incisive information on the mechanisms of action
of screened compounds and also on molecular targets and modulators of activity within the
cancer cells. Various statistical and artificial intelligence methods, including
principal component analysis, hierarchical cluster analysis, stepwise linear regression,
multidimensional scaling, neural network modeling, and genetic function approximation,
among others, can be used to analyze this large activity database. Mining the
database can provide useful information: (a) for the development of anticancer drugs; (b)
for a better understanding of the molecular pharmacology of cancer; and (c) for
improvement of the drug discovery process.
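
As a minimal sketch of the fingerprint idea described above, the following Python snippet converts GI50 concentrations to -log(GI50) activity values and compares activity patterns between compounds. Everything in it is an assumption made for illustration: the compound names and GI50 values are invented, only 4 cell lines are used instead of 60, base-10 logarithms of molar concentrations are assumed, and Pearson correlation is just one simple way to compare patterns, not one of the methods listed in the abstract.

```python
import math

def fingerprint(gi50_molar):
    """Convert GI50 concentrations (assumed mol/L) to -log10(GI50) values."""
    return [-math.log10(c) for c in gi50_molar]

def pearson(x, y):
    """Pearson correlation between two equal-length activity vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Invented GI50 values (mol/L) for three hypothetical compounds across
# four hypothetical cell lines; the real screen uses 60 cell lines.
gi50 = {
    "compound_A": [1e-6, 1e-7, 1e-5, 1e-6],
    "compound_B": [2e-6, 2e-7, 2e-5, 2e-6],  # same pattern as A, half as potent
    "compound_C": [1e-5, 1e-5, 1e-7, 1e-6],  # a different pattern
}

fingerprints = {name: fingerprint(values) for name, values in gi50.items()}

# Similar patterns correlate strongly even when absolute potency differs.
r_ab = pearson(fingerprints["compound_A"], fingerprints["compound_B"])
r_ac = pearson(fingerprints["compound_A"], fingerprints["compound_C"])
print(f"A vs B: {r_ab:.3f}")
print(f"A vs C: {r_ac:.3f}")
```

In this toy data, compounds A and B differ in potency on every cell line yet share the same pattern, which illustrates why the pattern across cell lines can be more informative than the activity against any single one.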