ERIC Number: ED072837
Record Type: RIE
Publication Date: 1972-Oct
Reference Count: N/A
Document Retrieval Experiments Using Cluster Analysis.
Minker, Jack; And Others
This paper describes the effect of using weighted index terms in a document retrieval system and evaluates retrieval performance when queries are expanded with terms that occur in the same clusters as the query terms. Three data collections, each indexed by several methods, are used to develop explicit results; two of the collections were studied and reported on in previous work. The study extends previous work at the University of Maryland. The effect of weighting index terms is analyzed in the document collection, in the queries, and in the formation of clusters. Eight cases are investigated, covering weighted and unweighted index terms. The best results are obtained when weighted index terms are used in forming clusters, in queries, and in documents. In this case, when clustered terms are added to queries, the results on the new collection show a significant improvement in retrieval performance relative to the unmodified data base. This improvement contrasts with the previous study, which found a degradation in performance or, at best, an insignificant improvement. The study supports the conclusion that weighted index terms provide better retrieval performance. (Author/NH)
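The query expansion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the record does not give their clustering method, weighting scheme, or matching function. The function names, the toy data, the 0.5 weight given to added terms, and the use of cosine similarity for ranking are all assumptions made for illustration.

```python
from math import sqrt


def expand_query(query, clusters, added_weight=0.5):
    """Add every term that shares a cluster with a query term.

    `added_weight` is a hypothetical weight for the added terms;
    the source does not say how added terms were weighted.
    """
    expanded = dict(query)
    for cluster in clusters:
        if any(term in query for term in cluster):
            for term in cluster:
                # Original query terms keep their own weights.
                expanded.setdefault(term, added_weight)
    return expanded


def cosine(a, b):
    """Cosine similarity between two weighted term vectors (assumed measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, documents):
    """Return document ids ordered from best to worst match."""
    return sorted(documents, key=lambda d: cosine(query, documents[d]), reverse=True)


# Toy data, invented for illustration only.
documents = {
    "d1": {"cluster": 2.0, "retrieval": 1.0},
    "d2": {"grouping": 3.0},
    "d3": {"weather": 1.0},
}
clusters = [{"cluster", "grouping"}]
query = {"cluster": 1.0}

expanded = expand_query(query, clusters)
ranking = rank(expanded, documents)
```

In this toy example the unexpanded query cannot match `d2` at all, because they share no terms. After expansion, `d2` receives a nonzero score through the clustered term `grouping`.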
Descriptors: Classification, Cluster Analysis, Data Processing, Information Retrieval, Relevance (Information Retrieval), Search Strategies, Subject Index Terms
National Technical Information Service, Springfield, Va. 22151 (N-73-13195, MF $.95, HC $4.25)
Publication Type: N/A
Education Level: N/A
Sponsor: National Aeronautics and Space Administration, Washington, DC.; National Bureau of Standards (DOC), Washington, DC.; National Science Foundation, Washington, DC.
Authoring Institution: Maryland Univ., College Park. Computer Science Center.
Note: (16 References)