ERIC Number: ED442866
Record Type: Non-Journal
Publication Date: 2000-Apr
Pages: 20
Abstractor: N/A
Reference Count: N/A
Comparison of Similarity Measures in Cluster Analysis with Binary Data.
Finch, Holmes; Huynh, Huynh
One set of approaches to the problem of clustering with dichotomous data in cluster analysis (CA) was studied. The techniques developed for clustering with binary data involve calculating distances between observations based on the variables and then applying one of the standard CA algorithms to these distances. One of the groups of distances that are designed for binary data is known collectively as matching coefficients. There are several incarnations of matching coefficients, but all take as their main goal the measurement of response similarity between any two observations. Thus, distance and similarity come to express the same concept with respect to the observations. This study examined four measures of association that are common to four previous studies. Using Monte Carlo simulation, cluster analysis was conducted using the four distance measures. Under the conditions of this study, the four measures performed very much the same in terms of correctly classifying individuals into two clusters based on dichotomous variables. Another interesting result is that clustering solutions were virtually identical for samples of size 240 and 1,000. (Contains 6 tables, 6 figures, and 12 references.) (SLD)
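The two-step procedure the abstract describes (compute a matching-coefficient distance between every pair of observations, then pass those distances to a standard clustering algorithm) can be illustrated with a minimal sketch. The abstract does not name the four measures or the algorithm used, so the simple matching and Jaccard coefficients, the average-linkage rule, and the toy `responses` data below are all assumptions chosen for illustration, not the authors' design.

```python
from itertools import combinations


def simple_matching(a, b):
    """Proportion of variables on which two binary vectors agree (0-0 or 1-1)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def jaccard(a, b):
    """Joint 1s divided by variables where at least one vector is 1."""
    both = sum(x == 1 and y == 1 for x, y in zip(a, b))
    either = sum(x == 1 or y == 1 for x, y in zip(a, b))
    return both / either if either else 1.0


def average_linkage(data, similarity, n_clusters=2):
    """Agglomerative clustering using distance = 1 - similarity.

    Assumed algorithm for illustration; the abstract only says
    'one of the standard CA algorithms'.
    """
    clusters = [[i] for i in range(len(data))]

    def mean_distance(c1, c2):
        # Average pairwise distance between members of two clusters.
        total = sum(1.0 - similarity(data[i], data[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the two clusters that are closest on average.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: mean_distance(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical data: six respondents, six dichotomous items.
responses = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]

clusters_smc = average_linkage(responses, simple_matching)
clusters_jac = average_linkage(responses, jaccard)
print(clusters_smc)
print(clusters_jac)
```

On this toy data both coefficients recover the same two groups. That agreement is a property of the made-up example and should not be read as a replication of the study's Monte Carlo results.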
Publication Type: Reports - Evaluative; Speeches/Meeting Papers
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A