Peer reviewed
ERIC Number: EJ685079
Record Type: Journal
Publication Date: 2004-Dec
Pages: 13
Abstractor: Author
ISBN: N/A
ISSN: 1082-989X
EISSN: N/A
Clustering Binary Data in the Presence of Masking Variables
Brusco, Michael J.
Psychological Methods, v9 n4 p510-523 Dec 2004
A number of important applications require the clustering of binary data sets. Traditional nonhierarchical cluster analysis techniques, such as the popular K-means algorithm, can often be successfully applied to these data sets. However, the presence of masking variables in a data set can impede the ability of the K-means algorithm to recover the true cluster structure. The author presents a heuristic procedure that selects an appropriate subset from among the set of all candidate clustering variables. Specifically, this procedure attempts to select only those variables that contribute to the definition of true cluster structure while eliminating variables that can hide (or mask) that true structure. Experimental testing of the proposed variable-selection procedure reveals that it is extremely successful at accomplishing this goal.
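The masking effect described in the abstract can be illustrated with a small simulation. The sketch below is not the author's variable-selection heuristic, which the abstract does not specify. It only compares K-means recovery on all variables against recovery on the true clustering variables alone. Every name and parameter in it (`make_data`, `flip`, `n_mask`, the farthest-first initialization, the use of the Rand index as the recovery measure) is an assumption made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance; on 0/1 vectors this equals the Hamming distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, max_iter=100):
    """Lloyd's K-means with a deterministic farthest-first initialization."""
    centers = [list(rows[0])]
    while len(centers) < k:
        # Next center: the row farthest from its nearest existing center.
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(r, centers[j]))
                      for r in rows]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def rand_index(a, b):
    """Fraction of object pairs on which two partitions agree (1.0 = identical)."""
    n = len(a)
    agree = sum((a[i] == a[j]) == (b[i] == b[j])
                for i in range(n) for j in range(i + 1, n))
    return agree / (n * (n - 1) / 2)


def make_data(n_per_cluster=30, n_true=6, n_mask=12, flip=0.1, seed=0):
    """Two binary clusters: n_true informative columns, then n_mask coin-flip columns."""
    rng = random.Random(seed)
    half = n_true // 2
    prototypes = [[1] * half + [0] * (n_true - half),
                  [0] * half + [1] * (n_true - half)]
    rows, truth = [], []
    for label, proto in enumerate(prototypes):
        for _ in range(n_per_cluster):
            # Informative bits follow the prototype, flipped with probability `flip`.
            signal = [bit ^ (rng.random() < flip) for bit in proto]
            # Masking bits carry no cluster information at all.
            noise = [rng.randint(0, 1) for _ in range(n_mask)]
            rows.append(signal + noise)
            truth.append(label)
    return rows, truth


if __name__ == "__main__":
    rows, truth = make_data()
    with_masking = rand_index(truth, kmeans(rows, 2))
    true_vars_only = rand_index(truth, kmeans([r[:6] for r in rows], 2))
    print("Rand index, all variables:      ", with_masking)
    print("Rand index, true variables only:", true_vars_only)
```

Raising `n_mask` or `flip` in this sketch makes the clustering task harder. A variable-selection procedure of the kind the abstract describes would aim to find the informative columns without being told which ones they are.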
American Psychological Association, 750 First Street, NE, Washington, DC 20002-4242. Tel: 800-374-2721 (Toll Free); Tel: 202-336-5510; TDD/TTY: 202-336-6123; Fax: 202-336-5502; e-mail: journals@apa.org
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A