ERIC Number: ED028810
Record Type: Non-Journal
Publication Date: 1968-Nov
Reference Count: N/A
The Effectiveness of Weights and Links in Automatic Indexing. Project MEDICO Second Progress Report.
Artandi, Susan; Wolf, Edward H.
This report describes work concerned with the statistical evaluation of the output of the MEDICO automatic indexing procedure. The statistical tests were designed to examine the validity of the assumptions which formed the bases of the indexing algorithms, with primary emphasis on the algorithm development for the computation of weights and links. Some of the findings of the evaluation were: (1) of the weights assigned by the MEDICO and manual check procedures, 98% were either in agreement or differed by a weight of 1, indicating that the effectiveness of the method of weighting could be improved by allowing only two weights in the system instead of the three weights actually used; (2) when the definition of a link was changed from co-occurrence within a sentence to co-occurrence between two punctuation marks, the percentage of relevant links increased from 72% to 84%; and (3) a comparison of the index terms generated from full text with those generated from the reduced text of abstracts or summaries showed that the proportion of terms indexed from reduced text is greatest for those terms which had higher weights in the full text analysis. An appendix includes the statistical tests used, the output of the full text and reduced text programs, and a bibliography of the articles used in the statistical test. (Author/JW)
Publication Type: N/A
Education Level: N/A
Authoring Institution: National Library of Medicine (DHEW), Bethesda, MD.
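
Illustrative note: the two link definitions compared in finding (2) of the abstract can be sketched as follows. This is only a minimal sketch and not the MEDICO procedure itself, which this record does not describe. The function name `links`, the boundary patterns, the sample text, and the term list are all hypothetical, chosen only to show how narrowing the co-occurrence window from a sentence to the span between two punctuation marks yields fewer, more tightly related term pairs.

```python
import re
from itertools import combinations

# Hypothetical boundary patterns (assumptions, not taken from the report).
SENTENCE = r"[.!?]"        # a link window is a whole sentence
PUNCTUATION = r"[.!?,;:]"  # a link window is the span between any two punctuation marks


def links(text, terms, boundary):
    """Return the set of unordered term pairs that co-occur within one segment."""
    found = set()
    for segment in re.split(boundary, text.lower()):
        # Collect the index terms that appear as whole words in this segment.
        present = sorted(
            t for t in terms
            if re.search(r"\b" + re.escape(t) + r"\b", segment)
        )
        # Every pair of terms sharing the segment counts as one link.
        found.update(combinations(present, 2))
    return found


# Invented sample data, for illustration only.
text = "Aspirin reduces fever, but penicillin treats infection. Fever may recur."
terms = {"aspirin", "fever", "penicillin", "infection"}

sentence_links = links(text, terms, SENTENCE)
punct_links = links(text, terms, PUNCTUATION)
```

In this invented sample, the sentence-level window links every pair of terms in the first sentence, while the punctuation-level window keeps only the pairs that share a clause. The narrower definition always produces a subset of the sentence-level links, which is consistent with, though not a demonstration of, the higher share of relevant links reported in the abstract.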