Volume 2, Special Issue GCB'01

In Silico Biology 2, 0033 (2002); ©2002, Bioinformation Systems e.V.  


Prediction and uncertainty in the analysis of gene expression profiles

Rainer Spang¹, Harry Zuzan¹, Mike West¹, Joseph Nevins², Carrie Blanchette³, Jeffrey R. Marks³

¹Institute of Statistics and Decision Sciences, Duke University, Durham, NC, USA
²Department of Genetics, Howard Hughes Medical Institute, Duke University Medical Center, Durham, NC, USA
³Department of Experimental Surgery, Duke University Medical Center, Durham, NC, USA


Edited by E. Wingender; received October 22, 2001; revised and accepted January 7, 2002; published April 17, 2002


Abstract

We have developed a complete statistical model for the analysis of tumor-specific gene expression profiles. The approach gives investigators a global overview of large-scale gene expression data, indicating aspects of the data that relate to tumor phenotype while also summarizing the uncertainties inherent in the classification of tumor types. We demonstrate the method in a gene expression profiling study of 27 human breast cancers, aimed at defining molecular characteristics of tumors that reflect estrogen receptor status. In addition to good predictive performance in classifying the expression profiles, the model uncovers conflicts in the data regarding the classification of some of the tumors, highlighting them as critical cases for which additional investigation is appropriate.

Key words: Computational diagnostics, gene expression analysis, expression profiles, micro array, gene chip, breast cancer, estrogen receptor status, Bayesian statistics, Bayesian regularization, binary regression, probit model, G-prior, singular value decomposition, predictive diagnosis, prognosis, tumor classification, uncertainty, factor regression, ridge regression, machine learning