ERIC Number: ED051864
Record Type: RIE
Publication Date: 1970-Apr
Reference Count: 0
The National Physical Laboratory Experiments in Statistical Word Associations and Their Use in Document Indexing and Retrieval.
Vaswani, P. K. T.; Cameron, J. B.
The experiments involved 11,571 abstracts (with titles), 1,000 key-word stems and 93 search requests. Measures of word association are derived in several ways from the number of documents in which two given words co-occur, and measures of similarity from the number of words associated with both. Word clusters with different degrees of overlap are derived from the resulting networks of word connections for use as document descriptors. All are employed in retrieval and their performance is analyzed. Two new measures, sensitivity and coverage, reflect the variation in a strategy's performance from request to request. The best strategy depends on the user's requirements. For a single strategy, key-words are simplest, but the quantities of output are erratic and may usefully be controlled according to word associations. If two strategies can be used, key-words alone may be followed by associations, yielding 30% more relevant documents for a similar quantity of output. The corresponding use of clusters is marginally better but unlikely to justify its extra cost. (Author)
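Illustrative sketch (not from the report): the abstract says association measures are derived from the numbers of documents in which two words co-occur, and similarity measures from the numbers of words associated with both, but it does not give the formulas. The Python below is one possible reading under stated assumptions: the Dice coefficient stands in for the association measure, similarity is taken as a raw count of shared associates above a threshold, and the function names, the threshold value and the toy document set are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, per word and per unordered word pair, the documents containing them."""
    doc_freq = Counter()
    pair_freq = Counter()
    for words in docs:
        words = set(words)
        doc_freq.update(words)
        # Sorting gives each unordered pair a single canonical key.
        pair_freq.update(combinations(sorted(words), 2))
    return doc_freq, pair_freq


def association(a, b, doc_freq, pair_freq):
    """Dice coefficient: an ASSUMED measure, not necessarily one used in the report."""
    pair = tuple(sorted((a, b)))
    total = doc_freq[a] + doc_freq[b]
    return 2 * pair_freq[pair] / total if total else 0.0


def associates(word, doc_freq, pair_freq, threshold):
    """Words whose association with `word` meets the (hypothetical) threshold."""
    return {
        other
        for other in doc_freq
        if other != word
        and association(word, other, doc_freq, pair_freq) >= threshold
    }


def shared_associates(a, b, doc_freq, pair_freq, threshold):
    """Similarity taken as the number of words associated with both a and b."""
    return len(
        associates(a, doc_freq, pair_freq, threshold)
        & associates(b, doc_freq, pair_freq, threshold)
    )


# Toy data: each document is the set of key-word stems found in its abstract.
docs = [
    {"index", "retriev", "document"},
    {"index", "retriev"},
    {"cluster", "word", "retriev"},
    {"cluster", "word"},
]
doc_freq, pair_freq = cooccurrence_counts(docs)
print(association("index", "retriev", doc_freq, pair_freq))
print(shared_associates("index", "retriev", doc_freq, pair_freq, threshold=0.5))
```

Under these assumptions, a network of word connections could be formed by linking every pair whose association meets the threshold; the overlapping clusters mentioned in the abstract would then be derived from that network by whatever procedure the report defines, which this sketch does not attempt.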
Publication Type: N/A
Education Level: N/A
Authoring Institution: National Physical Lab., Teddington (England).