Sankhya: The Indian Journal of Statistics

2007, Volume 69, Pt. 4, 842–869

Inferences in Contaminated Regression and Density Models

Hongying Dai, Columbus State University, Columbus, USA
Richard Charnigo, University of Kentucky, Lexington, USA

SUMMARY. A contaminated regression model allows a second regression regime to describe a subpopulation to which a known primary regression regime is inapplicable. In this paper, we study the asymptotic and finite-sample performance of two tests for contamination, namely a modified likelihood ratio test and an empirical D-test. We show that each test statistic has a limiting (central) chi-square distribution under the null hypothesis of no contamination and a limiting noncentral chi-square distribution under contiguous local alternatives. Analogous results are derived for contaminated density models. Monte Carlo experiments assess type I and type II error rates for finite samples from contaminated normal densities, contaminated linear regression models, and contaminated Poisson regression models. A case study illustrates an application involving microarray data.
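To make the notion of a contaminated density concrete, the following minimal sketch simulates a contaminated normal sample and evaluates its log-likelihood. It is an illustration only and is not taken from the paper: the known primary component N(0, 1), the contaminating component N(mu, 1), the function names, and the values gamma = 0.2 and mu = 3.0 are all assumptions chosen for the example. It does not implement the modified likelihood ratio test or the D-test.

```python
import math
import random


def sample_contaminated_normal(n, gamma, mu, rng):
    """Draw n values from (1 - gamma) * N(0, 1) + gamma * N(mu, 1).

    gamma is the contamination proportion (assumed notation);
    N(0, 1) plays the role of the known primary component.
    """
    return [rng.gauss(mu if rng.random() < gamma else 0.0, 1.0)
            for _ in range(n)]


def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def log_likelihood(xs, gamma, mu):
    """Log-likelihood of the contaminated normal density at (gamma, mu)."""
    return sum(
        math.log((1.0 - gamma) * normal_pdf(x, 0.0) + gamma * normal_pdf(x, mu))
        for x in xs
    )


rng = random.Random(0)  # fixed seed so the run is reproducible
xs = sample_contaminated_normal(20000, 0.2, 3.0, rng)

# The sample mean should be near gamma * mu when contamination is present.
print("sample mean:", sum(xs) / len(xs))

# Log-likelihood gain of the contaminated fit over the null (gamma = 0).
print("log-likelihood gain:",
      log_likelihood(xs, 0.2, 3.0) - log_likelihood(xs, 0.0, 3.0))
```

Setting gamma = 0 recovers the primary density alone, which corresponds to the null hypothesis of no contamination described in the summary.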

AMS (2000) subject classification. Primary 62F03, 62F05, 62F10, 62F12.

Key words and phrases. Mixture model, mixture regression model, D-test, modified likelihood ratio test, modified maximum likelihood estimator.
