Sankhya: The Indian Journal of Statistics

2002, Volume 64, Series A, Pt. 2, 429--452

ON SOME MATHEMATICS FOR VISUALIZING HIGH DIMENSIONAL DATA

By

EDWARD J. WEGMAN, George Mason University, Fairfax, VA, USA and JEFFREY L. SOLKA, Naval Surface Warfare Center, Dahlgren, VA, USA

SUMMARY. The analysis of high-dimensional data offers a great challenge to the analyst because human intuition about the geometry of high dimensions fails. We have found that a combination of three basic techniques proves to be extraordinarily effective for visualizing large, high-dimensional data sets. Two important methods for visualizing high-dimensional data involve the parallel coordinate system and the grand tour. The third method is a technique which we have dubbed saturation brushing. The parallel coordinate system involves methods in high-dimensional Euclidean geometry, projective geometry, and graph theory, while the grand tour involves high-dimensional space-filling curves, differential geometry, and fractal geometry. This paper describes a synthesis of these techniques into an approach that helps build the intuition of the analyst. The emphasis in this paper is on the underlying mathematics.

AMS (1991) subject classification. Primary 62-07, 62-09; secondary 62H30.

Key words and phrases: Parallel coordinates, grand tour, saturation brushing.
