ERIC Number: ED261675
Record Type: Non-Journal
Publication Date: 1985-Jun
Reference Count: N/A
A Study of Free-Index Phrases. Final Report.
Katzer, Jeffrey; And Others
This research project was motivated by results of earlier work (Katzer et al., 1982) on the overlap among document representations. In the earlier study, one representation used in the INSPEC database performed unexpectedly well in comparison with other commonly used representations, such as controlled vocabulary or free-text terms from the title/abstract of the document. That representation--free-index phrases--is mainly composed of free-text phrases selected by an indexer from the title/abstract. The objectives of the current research project were (1) to discover why the free-index phrases performed as well as they did, and (2) to attempt to produce surrogate free-index phrases automatically from the title/abstract. The free-index phrases in samples of INSPEC title/abstracts were examined, and the results of the previous study were reconsidered in light of the current project. The project began with all of the noun phrases in the title/abstract. From these, several methods were used to select surrogate free-index phrases. Each method was compared statistically and empirically against the actual free-index phrases, and in no case did the surrogates perform as well. No clear-cut cause for the performance of the phrases was found. One viable possibility has to do with those relatively few free-index phrases that do not derive directly from the title/abstract of the document, but are added by indexers at INSPEC, who take most of them from the controlled vocabulary. Numerous tables, 29 references, and five appendices are included. (Author/THC)
Publication Type: Reports - Research; Tests/Questionnaires
Education Level: N/A
Sponsor: National Science Foundation, Washington, DC. Div. of Information Science and Technology.
Authoring Institution: Syracuse Univ., NY. School of Information Studies.