Sankhya: The Indian Journal of Statistics

2002, Volume 64, Series A, Pt. 2, 256–267

SELECTION OF VARIABLES FOR DISCRIMINANT ANALYSIS IN A HIGH-DIMENSIONAL CASE

By

YASUNORI FUJIKOSHI, Hiroshima University, Japan

SUMMARY. One way of proceeding with variable selection in discriminant analysis is to formulate it as a problem of selecting the best model from a family of variable selection models. The variable selection models considered are the ones based on no additional information, due to Rao (1948, 1973). We consider a selection criterion based on an approximately unbiased estimator (AIC, Akaike (1970)) for the expected log-predictive likelihood or equivalently the expected Kullback-Leibler information for a candidate model. Such a method has been proposed by Fujikoshi (1985), based on the asymptotic theory under the usual large sample framework. The purpose of the present paper is to explore the approach under a high-dimensional framework when the dimension is comparable to the sample size. For two-groups discriminant analysis, a modified criterion is proposed.

AMS (1991) subject classification. Primary 62H12; secondary 62E30.

Key words and phrases. AIC, bias correction, discriminant analysis, high dimensional case, modified criterion, selection of variables, two-groups case.
