#00-11

PRINCIPAL COMPONENTS REGRESSION WITH DATA-CHOSEN COMPONENTS AND RELATED METHODS

by

J. T. Gene Hwang

Cornell University

Dan Nettleton

Iowa State University

ABSTRACT

Multiple regression with correlated predictor variables is relevant to a broad range of problems in the physical, chemical, and engineering sciences. Chemometricians, in particular, have made heavy use of principal components regression and related procedures for predicting a response variable from a large number of highly correlated predictors. In this paper, we develop a general theory that guides the choice of principal components that yield very good estimates of regression coefficients. Our numerical results suggest that the theory can also be used to improve partial least squares regression estimators and regression estimators based on rotated principal components. Our methods also provide insight into the subspace of the predictor matrix that best explains the response.

KEY WORDS: Biased regression; Eigenvalues; Mean squared error; Multicollinearity; Partial least squares; Varimax rotation.