NotesFAQContact Us
Search Tips
Peer reviewed Peer reviewed
Direct linkDirect link
ERIC Number: EJ789051
Record Type: Journal
Publication Date: 2008-Jan
Pages: 21
Abstractor: Author
Reference Count: 47
ISSN: ISSN-0027-3171
Cautionary Remarks on the Use of Clusterwise Regression
Brusco, Michael J.; Cradit, J. Dennis; Steinley, Douglas; Fox, Gavin L.
Multivariate Behavioral Research, v43 n1 p29-49 Jan 2008
Clusterwise linear regression is a multivariate statistical procedure that attempts to cluster objects with the objective of minimizing the sum of the error sums of squares for the within-cluster regression models. In this article, we show that the minimization of this criterion makes no effort to distinguish the error explained by the within-cluster regression models from the error explained by the clustering process. In some cases, most of the variation in the response variable is explained by clustering the objects, with little additional benefit provided by the within-cluster regression models. Accordingly, there is tremendous potential for overfitting with clusterwise regression, which is demonstrated with numerical examples and simulation experiments. To guard against the misuse of clusterwise regression, we recommend a benchmarking procedure that compares the results for the observed empirical data with those obtained across a set of random permutations of the response measures. We also demonstrate the potential for overfitting via an empirical application related to the prediction of reflective judgment using high school and college performance measures. (Contains 1 figure and 3 tables.)
Lawrence Erlbaum. Available from: Taylor & Francis, Ltd. 325 Chestnut Street Suite 800, Philadelphia, PA 19106. Tel: 800-354-1420; Fax: 215-625-2940; Web site:
Publication Type: Journal Articles; Reports - Evaluative
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A