#99-17

 

AGGREGATING REGRESSION PROCEDURES

FOR A BETTER PERFORMANCE

by

Yuhong Yang

Iowa State University

 

ABSTRACT

Methods have been proposed to linearly combine candidate regression procedures to improve estimation accuracy. Applications of these methods have been very successful in many examples, pointing to the great potential of combining procedures. A fundamental question regarding combining procedures is: What is the potential gain, and how much does one need to pay for it?

A partial answer to this question was obtained by Juditsky and Nemirovski (1996) for the case when a large number of procedures are to be combined. We attempt to give a more general solution. Under an $\ell_1$ constraint on the linear coefficients, we show that for pursuing the best linear combination over $n^{\tau}$ procedures, in terms of rate of convergence under the squared $L_2$ loss, one can pay a price of order $O(\log n / n^{1-\tau})$ when $0 < \tau < 1/2$ and a price of order $O((\log n / n)^{1/2})$ when $1/2 \le \tau < \infty$. These rates cannot be improved, or essentially improved, in a uniform sense. This result suggests that one should be cautious in pursuing the best linear combination, because one may end up paying a high price for nothing when linear combination in fact does not help. We show that, with care in aggregation, the final procedure can automatically avoid paying the high price in such a case and then behaves as well as the best candidate procedure in terms of rate of convergence.
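To give a feel for the two regimes, the orders stated above can be evaluated numerically. The following is only an illustrative sketch: the function name `aggregation_price` is hypothetical, all constants hidden in the O(.) notation are dropped, and the natural logarithm is assumed.

```python
import math


def aggregation_price(n: int, tau: float) -> float:
    """Order of the price for combining n**tau procedures (constants dropped).

    Hypothetical helper for illustration only; it evaluates the two rates
    quoted in the abstract and is not an estimator or a bound with constants.
    """
    if n < 2:
        raise ValueError("n must be at least 2 so that log(n) > 0")
    if tau <= 0:
        raise ValueError("tau must be positive")
    if tau < 0.5:
        # Regime 0 < tau < 1/2: price of order log(n) / n**(1 - tau)
        return math.log(n) / n ** (1.0 - tau)
    # Regime 1/2 <= tau < infinity: price of order (log(n) / n)**(1/2)
    return math.sqrt(math.log(n) / n)


if __name__ == "__main__":
    n = 1000
    for tau in (0.1, 0.25, 0.4, 0.5, 1.0, 2.0):
        print(f"tau={tau:<4}  price order ~ {aggregation_price(n, tau):.4f}")
```

In this sketch the value depends on the exponent only in the first regime; once the number of candidates reaches order n^{1/2}, the stated order no longer changes with it.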

Keywords and phrases: Aggregating procedures, adaptive estimation, linear combining, nonparametric regression.