Sankhya: The Indian Journal of Statistics

2005, Volume 67, Pt. 1, 46--73

Prediction, Model selection and Random Dimension Penalties


Eitan Greenshtein, Department of Statistics, Haifa University, Israel

SUMMARY. Let $Z^1,\ldots,Z^n$ be i.i.d. vectors, each consisting of a response and a few explanatory variables. Suppose we have $K$ collections of predictors, i.e., collections of functions of the explanatory variables, that predict the response variable. Given the ``empirically best'' predictor within each of the collections, we suggest a criterion to select a predictor from those $K$ candidates based on minimax regret; we also show how to find an asymptotically optimal selection procedure under this criterion. We then show how the conventional setting of model selection is related to the above. Conventionally, the term `model' refers to a collection of distributions, while its analog in our setting is a collection of predictors. The assumptions about the possible distributions of $Z^i$ (the model) are non-parametric, while the collections of the predictors are assumed to be `conveniently' parametrized.
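
The setting in the summary can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: squared-error loss, a single explanatory variable in $[0,1)$, collections of piecewise-constant predictors indexed by their number of bins, and the hypothetical names `empirically_best`, `select` and `penalty_per_dim`. In particular, the fixed dimension penalty used in the selection step is only a placeholder and is not the minimax-regret criterion the paper proposes.

```python
import random

def empirically_best(xs, ys, n_bins):
    """Empirical-risk minimizer (squared error) among predictors that are
    constant on each of n_bins equal-width bins of [0, 1)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    # The bin mean minimizes squared error within a bin; empty bins predict 0.
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    risk = sum(
        (y - means[min(int(x * n_bins), n_bins - 1)]) ** 2
        for x, y in zip(xs, ys)
    ) / len(xs)
    return means, risk

def select(xs, ys, bin_counts, penalty_per_dim):
    """Choose among the K empirically best candidates by empirical risk plus
    a fixed per-dimension penalty (placeholder rule, not the paper's)."""
    candidates = [empirically_best(xs, ys, k) for k in bin_counts]
    scores = [risk + penalty_per_dim * k
              for (_, risk), k in zip(candidates, bin_counts)]
    best = min(range(len(scores)), key=scores.__getitem__)
    return bin_counts[best], candidates, scores

# Simulated i.i.d. data: the response is a two-level step function plus noise.
random.seed(0)
xs = [random.random() for _ in range(400)]
ys = [(1.0 if x < 0.5 else 3.0) + random.gauss(0, 0.1) for x in xs]

# K = 4 nested collections with 1, 2, 4 and 8 bins.
chosen, candidates, scores = select(xs, ys, [1, 2, 4, 8], penalty_per_dim=0.01)
```

Because the collections are nested, the empirical risk of the candidates cannot increase with the number of bins, so the empirical risk alone always favors the largest collection; this is why some selection criterion beyond empirical risk is needed.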

AMS (1991) subject classification. 62C20, 62C99.

Key words and phrases. Model selection, prediction, minimax regret.
